Hi, I'm

Thomas Vu

Developer tools, ML research, and problem solving

Second-year CS student at the University of Calgary. I build developer tools, conduct NLP research, and like solving problems in my free time.

About

I'm a second-year Computer Science student at the University of Calgary, maintaining a 3.7 GPA and recognized on the Dean's List for academic excellence.

My interests range from building developer tools and transpilers to research in machine learning and NLP. I enjoy solving complex problems and shipping products that people can actually use.

Languages

Python, C++, TypeScript, Java, SQL, Bash

Frameworks

React, Next.js, FastAPI, LangGraph, PyTorch

Tools

Docker, AWS, Git, spaCy, Transformers

Experience

Undergraduate Researcher

Oct 2025 — Present

University of Calgary · Machine Learning & NLP

  • Built NLP pipelines to assess text complexity in German literature using Transformers, GPT, and spaCy
  • Used PCA and t-SNE to visualize feature space and explore clustering with KNN
  • Built a self-supervised contrastive model reaching 88% pairwise accuracy on ~50k sentences

Undergraduate Researcher

May 2025 — Aug 2025

Vision Research Lab, University of Calgary · Software Engineering

  • Built an automated PPT generator from documents with an integrated chatbot; shipped full web frontend and backend
  • Dockerized services and built serverless Python apps on AWS Lambda; optimized ffmpeg pipelines from ~30 min to ≤1 min
  • Implemented LangGraph-based chatbot controller with React/TypeScript frontend

Research

Software Engineering Research

SmartSuite

Vision Research Lab, University of Calgary

A suite of AI-driven office tools deployed on AWS. Features intelligent document processing, presentation generation, and conversational AI with LangGraph-based agents.

  • Document-to-presentation pipeline with multi-step agentic workflow
  • Microservices on AWS Lambda: TTS, image generation, RAG retrieval
  • LanceDB vector search for semantic document retrieval
Python, AWS Lambda, AWS CDK, DynamoDB, S3, Docker, LangGraph, LanceDB, OpenAI, React, TypeScript

NLP Research

Text Complexity Scoring

University of Calgary

Self-supervised methods for scoring the complexity of German sentences, with ranking models trained without manual labels.

  • 38 linguistic features: lexical, syntactic, POS ratios, GPT-2 perplexity
  • Contrastive learning with feature-direction priors and curriculum training
  • Achieved 88% pairwise accuracy with 0.91 Spearman correlation
Python, PyTorch, Transformers, spaCy, BERT, GPT-2, scikit-learn, NumPy
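
For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of how pairwise accuracy and Spearman correlation are commonly computed. The definitions, function names, and toy data are assumptions made for illustration; the page does not say how the project evaluates its models, and this is not the project's code.

```python
from itertools import combinations


def pairwise_accuracy(scores, reference):
    """Fraction of item pairs that `scores` orders the same way as `reference`."""
    agree = total = 0
    for i, j in combinations(range(len(scores)), 2):
        if reference[i] == reference[j]:
            continue  # no defined order for this pair, so skip it
        total += 1
        if (scores[i] - scores[j]) * (reference[i] - reference[j]) > 0:
            agree += 1
    return agree / total if total else 0.0


def ranks(values):
    """1-based ranks; assumes no ties to keep the sketch short."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation for untied data: 1 - 6*sum(d^2) / (n*(n^2-1))."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical toy data: model scores vs. a reference complexity ordering.
model_scores = [0.1, 0.4, 0.35, 0.8]
reference_levels = [1, 2, 3, 4]
print(pairwise_accuracy(model_scores, reference_levels))
print(spearman(model_scores, reference_levels))
```

Pairwise accuracy only asks whether each pair is ordered correctly, while Spearman correlation also penalizes how far each item lands from its reference rank, which is why the two are often reported together.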

Projects

Py2Cpp

Python to C++ transpiler for competitive programming. Parses Python into an AST, infers types, and generates equivalent C++ code.

  • AST-based parsing with type inference for variable declarations
  • Translates common Python patterns: loops, list ops, built-in functions
  • Web interface with Monaco Editor, plus CLI support
Python, AST, FastAPI, React, TypeScript
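
The parse, infer, generate flow described above can be sketched with Python's standard `ast` module. This is a toy under stated assumptions, not Py2Cpp itself: the names `CPP_TYPES`, `infer_type`, `emit_expr`, and `transpile` are hypothetical, only simple assignments with `+`, `-`, and `*` are handled, and non-literal right-hand sides fall back to `auto` instead of doing real type inference.

```python
import ast

# Assumed mapping from Python literal types to C++ types (illustrative only).
CPP_TYPES = {int: "long long", float: "double", str: "std::string", bool: "bool"}
# Only operators whose meaning matches between the two languages are included.
OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*"}


def infer_type(node):
    """Infer a C++ type from a literal; anything else falls back to `auto`."""
    if isinstance(node, ast.Constant) and type(node.value) in CPP_TYPES:
        return CPP_TYPES[type(node.value)]
    return "auto"


def emit_expr(node):
    """Generate a C++ expression string for a small subset of Python."""
    if isinstance(node, ast.Constant) and type(node.value) in CPP_TYPES:
        value = node.value
        if isinstance(value, bool):
            return "true" if value else "false"
        if isinstance(value, str):
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            return '"' + escaped + '"'
        return repr(value)
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        left, right = emit_expr(node.left), emit_expr(node.right)
        return f"({left} {OPS[type(node.op)]} {right})"
    raise NotImplementedError(type(node).__name__)


def transpile(source):
    """Translate a sequence of simple assignments into C++ statements."""
    declared = set()
    lines = []
    for stmt in ast.parse(source).body:
        is_simple_assign = (
            isinstance(stmt, ast.Assign)
            and len(stmt.targets) == 1
            and isinstance(stmt.targets[0], ast.Name)
        )
        if not is_simple_assign:
            raise NotImplementedError(type(stmt).__name__)
        name = stmt.targets[0].id
        rhs = emit_expr(stmt.value)
        if name in declared:
            lines.append(f"{name} = {rhs};")  # reassignment: no new declaration
        else:
            declared.add(name)
            lines.append(f"{infer_type(stmt.value)} {name} = {rhs};")
    return "\n".join(lines)


print(transpile("x = 3\ny = x + 2\nx = x * y"))
```

Tracking which names are already declared is what lets the first assignment become a typed declaration and later ones plain assignments. Division is left out on purpose, since Python's `/` and C++'s `/` disagree on integers.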

Algebra Solver

Symbolic algebra system that solves systems of linear equations. Built from scratch in C++ with a web frontend.

  • Custom lexer and parser building expression trees
  • Simplification, substitution, and variable isolation
  • C++ core exposed to Python via pybind11
C++, CMake, pybind11, FastAPI, React
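
To illustrate the lexer-and-parser idea above, here is a minimal recursive-descent sketch that turns an expression string into a tree. It is written in Python for brevity, whereas the project's core is C++; the node classes, the grammar, and every name below are assumptions, and unary minus and equations are omitted.

```python
import re
from dataclasses import dataclass

# One token per match: a number, an identifier, or any other non-space character.
TOKEN_RE = re.compile(r"\s*(?:(\d+(?:\.\d+)?)|([A-Za-z_]\w*)|(\S))")


@dataclass
class Num:
    value: float


@dataclass
class Var:
    name: str


@dataclass
class BinOp:
    op: str
    left: object
    right: object


def tokenize(text):
    """Split the input into (kind, value) tokens."""
    tokens = []
    for number, name, symbol in TOKEN_RE.findall(text):
        if number:
            tokens.append(("num", float(number)))
        elif name:
            tokens.append(("var", name))
        elif symbol in "+-*/()":
            tokens.append(("op", symbol))
        else:
            raise SyntaxError(f"unexpected character {symbol!r}")
    return tokens


class Parser:
    """Grammar: expr := term (('+'|'-') term)*; term := atom (('*'|'/') atom)*."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("end", None)

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        node = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            op = self.take()[1]
            node = BinOp(op, node, self.term())  # left-associative
        return node

    def term(self):
        node = self.atom()
        while self.peek() in (("op", "*"), ("op", "/")):
            op = self.take()[1]
            node = BinOp(op, node, self.atom())
        return node

    def atom(self):
        kind, value = self.take()
        if kind == "num":
            return Num(value)
        if kind == "var":
            return Var(value)
        if (kind, value) == ("op", "("):
            node = self.expr()
            if self.take() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek()[0] != "end":
        raise SyntaxError("trailing input")
    return tree


print(parse("2*x + 3"))
```

Giving `term` its own level below `expr` is what makes `*` and `/` bind tighter than `+` and `-`; simplification and variable isolation would then be tree-to-tree rewrites over these nodes.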

Awards

Calgary Collegiate Programming Contest 2025

Division 2 — Third Place

March 2025

Calgary First Year Contest 2025

Winner

February 2025

Dean's List

Academic Excellence

2024-2025

Get in Touch

I'm always open to discussing new opportunities, collaborations, or just chatting about interesting problems.