I am a Computer Science student focused on machine learning from first principles, with a specific research interest in reasoning capabilities, mathematical logic, and transformer interpretability.
My approach is dual-track: I build architectures from scratch to understand the math, and I leverage state-of-the-art open-weight models to push the boundaries of reasoning.
Focus: Reasoning Alignment & Mathematical Logic
A specialized exploration of reasoning behaviors, built on Qwen2.5-Math-1.5B-Instruct.
- Foundation: Leverages the Qwen2.5-Math architecture to study how small-to-mid-sized models handle complex logical chains.
- Methodology: Focuses on optimizing Chain-of-Thought (CoT) performance and step-by-step verification.
- Goal: To understand the limits of reasoning in sub-2B-parameter models and improve logical consistency on mathematical tasks.
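The step-by-step verification idea can be sketched in miniature: a checker that re-evaluates each arithmetic step of a generated chain of thought. This is an illustrative toy, not Qwen tooling — `evaluate` and `verify_steps` are hypothetical helper names, and a real verifier would parse the model's actual output format.

```python
import ast
import operator

# Operators allowed in a step expression (a deliberately tiny, safe subset).
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def evaluate(node):
    """Safely evaluate a parsed arithmetic expression (no eval())."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -evaluate(node.operand)
    raise ValueError("unsupported expression")

def verify_steps(steps):
    """Check each 'expression = result' line of a chain-of-thought trace."""
    results = []
    for step in steps:
        lhs, rhs = step.split("=")
        computed = evaluate(ast.parse(lhs, mode="eval"))
        results.append(abs(computed - float(rhs)) < 1e-9)
    return results
```

Running `verify_steps(["2 * 3 = 6", "6 + 5 = 11"])` flags exactly which steps in a trace are arithmetically sound, which is the kind of signal step-level verification feeds back into CoT evaluation.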
Focus: Architecture & Training Dynamics
A ground-up implementation of the GPT-2 architecture (124M parameters) to master the engineering behind LLMs.
- Implemented Causal Self-Attention, Multi-Head Attention, and MLP blocks manually.
- Wrote custom training loops and data loading pipelines without high-level trainer abstractions.
- Why: To move beyond `import transformers` and understand the raw tensor operations that drive language generation.
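As a minimal sketch of the core mechanism, here is single-head causal self-attention in plain Python. The actual implementation uses PyTorch tensors, learned projections, and multiple heads; here the weight matrices are passed in purely for illustration, so the masking logic is visible without framework machinery.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def causal_self_attention(x, wq, wk, wv):
    """Single-head causal self-attention over a list of token vectors.

    x: list of T vectors of size d; wq/wk/wv: d x d projection matrices.
    Each position attends only to itself and earlier positions.
    """
    def matvec(w, v):
        return [sum(w[i][j] * v[j] for j in range(len(v))) for i in range(len(w))]

    q = [matvec(wq, t) for t in x]
    k = [matvec(wk, t) for t in x]
    v = [matvec(wv, t) for t in x]
    d = len(x[0])
    out = []
    for t in range(len(x)):
        # Causal mask: scores are computed only over positions 0..t.
        scores = [sum(q[t][i] * k[s][i] for i in range(d)) / math.sqrt(d)
                  for s in range(t + 1)]
        attn = softmax(scores)
        out.append([sum(attn[s] * v[s][i] for s in range(t + 1))
                    for i in range(d)])
    return out
```

The causal property is easy to test: changing a later token must leave every earlier output unchanged, which is exactly what lets a decoder-only model train on all positions in parallel.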
Focus: The Mathematical Foundation
A deep dive into the calculus behind neural networks, based on Andrej Karpathy's curriculum.
- Built Micrograd (backpropagation engine) from scratch.
- Explored loss surfaces, vanishing gradients, and activation function mechanics.
- Bridged the gap between raw Python/Math and PyTorch tensors.
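The heart of such an engine fits in a few dozen lines. A minimal sketch in the spirit of micrograd: scalar `Value` nodes record the compute graph, and `backward()` runs reverse-mode autodiff over a topological sort.

```python
class Value:
    """Minimal scalar autograd node, micrograd-style."""

    def __init__(self, data, children=()):
        self.data = data
        self.grad = 0.0
        self._children = children
        self._backward = lambda: None

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            # d(a+b)/da = d(a+b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            # d(a*b)/da = b, d(a*b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topological order ensures each node's grad is complete before use.
        topo, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for c in v._children:
                    build(c)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()
```

For `z = x*y + x` with `x = 3, y = 4`, calling `z.backward()` yields `x.grad = y + 1 = 5` and `y.grad = x = 3` — the chain rule, made concrete.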
I believe strong ML engineers must also be strong systems engineers. I build low-level systems to ensure I understand the infrastructure that serves my models.
- HTTP Server (Python) — Handled raw sockets and request parsing to understand the transport layer.
- BitTorrent Client — Implemented peer-to-peer protocols to master distributed data transfer.
- Redis-like Server — Built in-memory data structures to understand efficient state management.
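To give a flavor of the protocol work, here is a sketch of a bencode decoder — the parsing core of any BitTorrent client, since `.torrent` files and tracker responses are bencoded. `bdecode` is an illustrative name; a real client layers tracker requests and the peer-wire protocol on top of this.

```python
def bdecode(data, i=0):
    """Decode one bencoded value from bytes, returning (value, next_index).

    Bencoding is BitTorrent's wire format: integers i42e, strings 4:spam,
    lists l...e, and dictionaries d...e (keys are byte strings).
    """
    c = data[i:i + 1]
    if c == b"i":                        # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":                        # list: l<items>e
        items, i = [], i + 1
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":                        # dict: d<key><value>...e
        d, i = {}, i + 1
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            val, i = bdecode(data, i)
            d[key] = val
        return d, i + 1
    colon = data.index(b":", i)          # string: <length>:<bytes>
    length = int(data[i:colon])
    start = colon + 1
    return data[start:start + length], start + length
```

Returning the next index alongside each value is what makes the recursion compose: nested lists and dicts decode by repeatedly calling `bdecode` at the cursor until the closing `e`.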
- Neuro-AI Alignment: Exploring new paradigms to align model architectures with biological and neurological processes (e.g., biologically plausible learning rules beyond backprop).
- Embodied AI & Robotics: Applying reasoning models to physical control systems to bridge the gap between high-level logic and real-world actuation.
- Reasoning Priors: How to induce stronger logic and Chain-of-Thought in smaller parameter spaces.
I am actively looking to collaborate on reasoning-centric ML or robotics research.
[LinkedIn](https://www.linkedin.com/in/jisha-rajala/) • jisharajala@gmail.com
Thanks for stopping by ✨


