Hi 👋 I’m Jisha

I am a Computer Science student focused on machine learning from first principles, with a specific research interest in reasoning capabilities, mathematical logic, and transformer interpretability.

My approach is dual-track: I build architectures from scratch to understand the math, and I leverage state-of-the-art open weights to push the boundaries of reasoning.


🧠 Core Machine Learning & Reasoning

Focus: Reasoning Alignment & Mathematical Logic

A specialized exploration of reasoning behaviors, building upon Qwen2.5-Math-1.5B-Instruct.

  • Foundation: Leverages the Qwen2.5-Math architecture to study how small-to-mid-sized models handle complex logical chains.
  • Methodology: Focuses on optimizing Chain-of-Thought (CoT) performance and step-by-step verification.
  • Goal: To understand the limits of reasoning in <2B parameter models and improve logical consistency in mathematical tasks.
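One widely used technique in this space is self-consistency: sample several Chain-of-Thought traces and majority-vote on the final answer. A toy, model-free sketch of that voting step (the chains here are hand-written stand-ins for sampled generations):

```python
from collections import Counter

def majority_vote(chains):
    """Pick the most common final answer across sampled CoT chains.

    Each chain is a list of reasoning steps; by convention in this sketch,
    the last step is the final answer (a simplifying assumption).
    """
    answers = [chain[-1] for chain in chains]
    best, _ = Counter(answers).most_common(1)[0]
    return best

# Toy example: three sampled chains for "What is 12 * 7?"
chains = [
    ["12 * 7 = 12 * 7", "= 84", "84"],
    ["12 * 7 = 10*7 + 2*7", "= 70 + 14", "84"],
    ["12 * 7 = 12 * 6 + 12", "= 72 + 12", "84"],
]
print(majority_vote(chains))  # → 84
```

In practice the chains come from temperature-sampled generations of the model, and step-by-step verification can further filter chains before the vote.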

Focus: Architecture & Training Dynamics

A ground-up implementation of the GPT-2 architecture (124M parameters) to master the engineering behind LLMs.

  • Implemented Causal Self-Attention, Multi-Head Attention, and MLP blocks manually.
  • Wrote custom training loops and data loading pipelines without high-level trainer abstractions.
  • Why: To move beyond "import transformers" and understand the raw tensor operations that drive language generation.
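The core of those attention blocks reduces to a handful of tensor operations. A minimal single-head NumPy sketch of causal self-attention (real implementations add multiple heads, an output projection, and dropout; the weight shapes here are illustrative):

```python
import numpy as np

def causal_self_attention(x, W_q, W_k, W_v):
    """Single-head causal self-attention over a (T, d) sequence."""
    T, d = x.shape
    q, k, v = x @ W_q, x @ W_k, x @ W_v             # (T, d) projections
    scores = q @ k.T / np.sqrt(d)                   # (T, T) scaled dot products
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores[mask] = -np.inf                          # block attention to future tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v                              # (T, d) weighted values

rng = np.random.default_rng(0)
T, d = 4, 8
x = rng.standard_normal((T, d))
W = [rng.standard_normal((d, d)) for _ in range(3)]
out = causal_self_attention(x, *W)
print(out.shape)  # (4, 8)
```

The causal mask is what makes this autoregressive: position 0 can only attend to itself, so its output is exactly its own value vector.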

Focus: The Mathematical Foundation

A deep dive into the calculus behind neural networks, based on Andrej Karpathy’s curriculum.

  • Built Micrograd (backpropagation engine) from scratch.
  • Explored loss surfaces, vanishing gradients, and activation function mechanics.
  • Bridged the gap between raw Python/Math and PyTorch tensors.
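The essence of a micrograd-style engine fits in a few dozen lines: each operation records its parents and local derivatives, and `backward()` applies the chain rule in reverse topological order. A condensed sketch in that spirit (not the actual micrograd API):

```python
class Value:
    """A scalar with reverse-mode autodiff, in the spirit of micrograd."""

    def __init__(self, data, _parents=(), _local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = _parents
        self._local_grads = _local_grads  # d(self)/d(parent) per parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other),
                     (other.data, self.data))

    def backward(self):
        # Topologically order the graph, then chain-rule in reverse.
        order, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    build(p)
                order.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(order):
            for p, g in zip(v._parents, v._local_grads):
                p.grad += g * v.grad

a, b = Value(2.0), Value(3.0)
c = a * b + a          # dc/da = b + 1 = 4, dc/db = a = 2
c.backward()
print(a.grad, b.grad)  # 4.0 2.0
```

PyTorch's autograd does the same bookkeeping over tensors instead of scalars, which is exactly the bridge this track explores.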

🛠️ Systems Engineering Foundation

I believe strong ML engineers must also be strong systems engineers. I build low-level systems to ensure I understand the infrastructure that serves my models.

  • HTTP Server (Python) — Handled raw sockets and request parsing to understand the transport layer.
  • BitTorrent Client — Implemented peer-to-peer protocols to master distributed data transfer.
  • Redis-like Server — Built in-memory data structures to understand efficient state management.
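To give a flavor of the protocol-level work: BitTorrent's wire format begins with bencoding (integers `i42e`, strings `4:spam`, lists `l…e`, dicts `d…e`). A minimal decoder sketch, written for this README rather than taken from the client's code:

```python
def bdecode(data, i=0):
    """Decode one bencoded value from bytes, returning (value, next_index)."""
    c = data[i:i+1]
    if c == b"i":                      # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i+1:end]), end + 1
    if c == b"l":                      # list: l<items>e
        i, items = i + 1, []
        while data[i:i+1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":                      # dict: d<key><value>...e
        i, out = i + 1, {}
        while data[i:i+1] != b"e":
            key, i = bdecode(data, i)
            out[key], i = bdecode(data, i)
        return out, i + 1
    colon = data.index(b":", i)        # string: <length>:<bytes>
    n = int(data[i:colon])
    start = colon + 1
    return data[start:start + n], start + n

value, _ = bdecode(b"d3:foo3:bar5:countli1ei2eee")
print(value)  # {b'foo': b'bar', b'count': [1, 2]}
```

The same recursive-descent pattern shows up in the Redis project too: RESP framing is just another length-prefixed, type-tagged byte protocol.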

🎯 Research Interests

  • Neuro-AI Alignment: Exploring new paradigms to align model architectures with biological and neurological processes (e.g., biologically plausible learning rules beyond backprop).
  • Embodied AI & Robotics: Applying reasoning models to physical control systems to bridge the gap between high-level logic and real-world actuation.
  • Reasoning Priors: How to induce stronger logic and Chain-of-Thought in smaller parameter spaces.

📫 Connect

I am actively looking to collaborate on reasoning-centric ML or robotics research.

[LinkedIn](https://www.linkedin.com/in/jisha-rajala/) • jisharajala@gmail.com

Thanks for stopping by ✨

Pinned repositories

  1. nano-reason (Python) — Test-Time Compute for Mathematical Reasoning using Monte Carlo Tree Search
  2. codecrafters-bittorrent-python (Python)
  3. gpt2-from-scratch (Python)
  4. jrajala6-codecrafters-redis-python (Python) — Building my own Redis server