j-clawson/README.md

Hello! I'm James Clawson, a third-year UCLA student from rural New Jersey

LinkedIn Email me!

❓ About Me

🎓 Major: Data Theory | Minors: Data Science Engineering and Korean Language
🌱 Hometown: Califon, New Jersey (Population: 900)
🏫 Where am I now?: Los Angeles, California (Population: 3.9 million)
😄 Hobbies: Listening to music, reading and writing poetry, hiking, exploring art museums
🤔 Coursework: Linear Algebra, Mathematical Statistics, Optimization, Machine Learning, Data Mining, Real Analysis, Korean


💻 Technical Skills

Programming Languages: Python, R, SQL, Java, C++, Bash
Libraries/Frameworks: BeautifulSoup, CatBoost, Matplotlib, Pandas, PyTorch, Seaborn, Sentence-Transformers, scikit-learn, sqlite3, XGBoost
Tools: ChromaDB, Git, Jupyter Notebooks, Microsoft Office Suite (Excel, PowerPoint, etc.), Microsoft SQL Server, Tableau, VS Code


🔭 Relevant Experience

Data Science Intern at Yale University

  • Selected as one of 25 undergraduates from a global pool of over 850 applicants for Yale's inaugural Big Data Summer Immersion (BDSY), engaging with Yale faculty on logistic regression, neural networks, advanced SQL querying, and the applications of generative AI models
  • Designed and led the presentation of an original research poster to an audience of ~100 Yale faculty and researchers at the Symposium on Big Data, Human Health, and Statistics (more details on this in the Projects section below!)
  • Selected from a competitive university-wide applicant pool to collaborate with peers on deep learning and natural language processing projects
  • The most recent project, a Car Manual LLM, uses sentence-transformers, the Chroma vector database, and a retrieval-augmented generation (RAG) pipeline (more information in the Projects section below!)

Statistical Researcher at the Lohmueller Lab

  • Implementing statistical modeling techniques via Python and Bash scripts to visualize genetic variation across the Channel Island and Gray Fox populations
  • Utilizing Hoffman2, UCLA's high-performance computing (HPC) cluster
  • Serving as a peer mentor for low-income, underrepresented undergraduates in the AAP community
  • Started as the sole Math 115A (Linear Algebra) peer tutor during the Spring 2025 quarter and now teaching Statistics 100A (Introduction to Probability)

⚑ Projects

Presented an original research poster analyzing ~2.5 million pneumococcal cases, finding no wealth-based disparities in the decline of child mortality, using time-series analysis, Bayesian hierarchical modeling, and unsupervised machine learning (Apriori algorithm)

  • Collaborators: Antonio Bolea (Yale University) and Kevin Truong (University of California, Berkeley)
  • Notable R Libraries: ggplot2, JAGS, arules, arulesViz

Car Manual Large Language Model - DataRes Research Team

Developed Drive and Diagnose (DAD), a scalable multimodal NLP and RAG pipeline capable of processing any car manual PDF or dashboard warning light to help drivers identify and address vehicle issues

  • Collaborators: Aryan Gupta, Christian Chen, and Parnika Chaturvedi
  • Technologies: ChromaDB, OpenAI CLIP, OpenAI GPT-4 Vision, Python, TypeScript
  • Notable Python Libraries: BeautifulSoup, PyPDF2, Sentence-Transformers

Pinned

  1. bdsy-pbh-peru (Public)

    Forked from to-ke/bdsy-pbh-peru

    Presented at the Symposium on Big Data, Human Health, and Statistics at Kline Tower on July 24, 2025

    R

  2. car-manual-llm (Public)

    Forked from chenchristian/CarManual-LLM

    Spring 2025 DataRes Research; Presented during the concluding "Demo Day"

    Python