I am a Computer Science student focused on machine learning from first principles, with a specific research interest in reasoning capabilities, mathematical logic, and transformer interpretability.
My approach is dual-track: I build architectures from scratch to understand the math, and I leverage state-of-the-art open-weight models to push the boundaries of reasoning.
Focus: Reasoning Alignment & Mathematical Logic
A specialized exploration of reasoning behaviors, built on Qwen2.5-Math-1.5B-Instruct.
- Foundation: Leverages the Qwen2.5-Math architecture to study how small-to-mid-sized models handle complex logical chains.
- Methodology: Focuses on optimizing Chain-of-Thought (CoT) performance and step-by-step verification.
- Goal: To understand the limits of reasoning in sub-2B-parameter models and improve logical consistency on mathematical tasks.
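The step-by-step verification idea can be sketched in miniature: a checker that re-evaluates each arithmetic step of a generated chain of thought. This is an illustrative toy, not Qwen tooling — `evaluate` and `verify_steps` are hypothetical helper names, and a real verifier would parse the model's actual output format.

```python
import ast
import operator

# Operators allowed in a step expression (a deliberately tiny, safe subset).
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def evaluate(node):
    """Safely evaluate a parsed arithmetic expression (no eval())."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -evaluate(node.operand)
    raise ValueError("unsupported expression")

def verify_steps(steps):
    """Check each 'expression = result' line of a chain-of-thought trace."""
    results = []
    for step in steps:
        lhs, rhs = step.split("=")
        computed = evaluate(ast.parse(lhs, mode="eval"))
        results.append(abs(computed - float(rhs)) < 1e-9)
    return results
```

Running `verify_steps(["2 * 3 = 6", "6 + 5 = 11"])` flags exactly which steps in a trace are arithmetically sound, which is the kind of signal step-level verification feeds back into CoT evaluation.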
Focus: Architecture & Training Dynamics
A ground-up implementation of the GPT-2 architecture (124M parameters) to master the engineering behind LLMs.
- Implemented Causal Self-Attention, Multi-Head Attention, and MLP blocks manually.
- Wrote custom training loops and data loading pipelines without high-level trainer abstractions.
- Why: To move beyond `import transformers` and understand the raw tensor operations that drive language generation.
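As a minimal sketch of the core mechanism, here is single-head causal self-attention in plain Python. The actual implementation uses PyTorch tensors, learned projections, and multiple heads; here the weight matrices are passed in purely for illustration, so the masking logic is visible without framework machinery.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def causal_self_attention(x, wq, wk, wv):
    """Single-head causal self-attention over a list of token vectors.

    x: list of T vectors of size d; wq/wk/wv: d x d projection matrices.
    Each position attends only to itself and earlier positions.
    """
    def matvec(w, v):
        return [sum(w[i][j] * v[j] for j in range(len(v))) for i in range(len(w))]

    q = [matvec(wq, t) for t in x]
    k = [matvec(wk, t) for t in x]
    v = [matvec(wv, t) for t in x]
    d = len(x[0])
    out = []
    for t in range(len(x)):
        # Causal mask: scores are computed only over positions 0..t.
        scores = [sum(q[t][i] * k[s][i] for i in range(d)) / math.sqrt(d)
                  for s in range(t + 1)]
        attn = softmax(scores)
        out.append([sum(attn[s] * v[s][i] for s in range(t + 1))
                    for i in range(d)])
    return out
```

The causal property is easy to test: changing a later token must leave every earlier output unchanged, which is exactly what lets a decoder-only model train on all positions in parallel.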
Focus: The Mathematical Foundation
A deep dive into the calculus behind neural networks, based on Andrej Karpathy's curriculum.
- Built Micrograd (backpropagation engine) from scratch.
- Explored loss surfaces, vanishing gradients, and activation function mechanics.
- Bridged the gap between raw Python/Math and PyTorch tensors.
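The heart of such an engine fits in a few dozen lines. A minimal sketch in the spirit of micrograd: scalar `Value` nodes record the compute graph, and `backward()` runs reverse-mode autodiff over a topological sort.

```python
class Value:
    """Minimal scalar autograd node, micrograd-style."""

    def __init__(self, data, children=()):
        self.data = data
        self.grad = 0.0
        self._children = children
        self._backward = lambda: None

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            # d(a+b)/da = d(a+b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            # d(a*b)/da = b, d(a*b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topological order ensures each node's grad is complete before use.
        topo, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for c in v._children:
                    build(c)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()
```

For `z = x*y + x` with `x = 3, y = 4`, calling `z.backward()` yields `x.grad = y + 1 = 5` and `y.grad = x = 3` — the chain rule, made concrete.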
I believe strong ML engineers must also be strong systems engineers. I build low-level systems to ensure I understand the infrastructure that serves my models.
- HTTP Server (Python) — Handled raw sockets and request parsing to understand the transport layer.
- BitTorrent Client — Implemented peer-to-peer protocols to master distributed data transfer.
- Redis-like Server — Built in-memory data structures to understand efficient state management.
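To give a flavor of the protocol work, here is a sketch of a bencode decoder — the parsing core of any BitTorrent client, since `.torrent` files and tracker responses are bencoded. `bdecode` is an illustrative name; a real client layers tracker requests and the peer-wire protocol on top of this.

```python
def bdecode(data, i=0):
    """Decode one bencoded value from bytes, returning (value, next_index).

    Bencoding is BitTorrent's wire format: integers i42e, strings 4:spam,
    lists l...e, and dictionaries d...e (keys are byte strings).
    """
    c = data[i:i + 1]
    if c == b"i":                        # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":                        # list: l<items>e
        items, i = [], i + 1
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":                        # dict: d<key><value>...e
        d, i = {}, i + 1
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            val, i = bdecode(data, i)
            d[key] = val
        return d, i + 1
    colon = data.index(b":", i)          # string: <length>:<bytes>
    length = int(data[i:colon])
    start = colon + 1
    return data[start:start + length], start + length
```

Returning the next index alongside each value is what makes the recursion compose: nested lists and dicts decode by repeatedly calling `bdecode` at the cursor until the closing `e`.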
- Neuro-AI Alignment: Exploring new paradigms to align model architectures with biological and neurological processes (e.g., biologically plausible learning rules beyond backprop).
- Embodied AI & Robotics: Applying reasoning models to physical control systems to bridge the gap between high-level logic and real-world actuation.
- Reasoning Priors: How to induce stronger logic and Chain-of-Thought in smaller parameter spaces.
I am actively looking to collaborate on reasoning-centric ML or robotics research.
[LinkedIn](https://www.linkedin.com/in/jisha-rajala/) • jisharajala@gmail.com
Thanks for stopping by ✨


