GauchoCourse
Inspiration
The complex process of course selection at UCSB extends beyond merely meeting degree requirements. We always check ratemyprofessor.com, Reddit, and college-provided course websites to find which professors will give us the best chance of succeeding in the courses we intend to take. We wanted to create a RAG-based, AI-powered solution that could analyze comprehensive course data and provide personalized recommendations based on each student's unique academic profile and learning style, while cross-referencing relevant data to guard against hallucinations. The ability to analyze student transcripts adds another layer of personalization, allowing the system to understand each student's academic journey and make truly tailored suggestions.
What it does
GauchoCourse is an intelligent chatbot that leverages RAG (Retrieval-Augmented Generation) technology to provide personalized course guidance. By combining official UCSB course data, RateMyProfessor reviews, and historical grade distributions made available through FERPA, our AI assistant helps students make data-driven decisions about their course selections. When a student shares their transcript, the system intelligently chunks this information into meaningful segments, analyzing course patterns, grade trends, and academic strengths to provide even more personalized recommendations. The system analyzes patterns in grade distributions to identify courses where students with similar academic profiles historically perform well, considers professor teaching styles from RateMyProfessor reviews, and integrates this with official course information to provide comprehensive, personalized recommendations.
How we built it
We developed a sophisticated data pipeline that combines multiple sources of UCSB academic information with intelligent transcript processing capabilities. Our Python scraping script collects and processes data from three key sources: official UCSB course catalogs and descriptions, RateMyProfessor reviews that capture qualitative insights about teaching styles and course experiences, and historical grade distribution data obtained through FERPA that provides quantitative metrics of student success. The transcript analysis system employs advanced text chunking algorithms that break down academic records into meaningful segments, considering factors like course sequences, prerequisite chains, and grade patterns. This data is vectorized and stored in Pinecone's vector database, enabling efficient similarity search and real-time retrieval. The frontend uses React with Tailwind CSS to create an intuitive chat interface, while our Python FastAPI backend handles the RAG implementation, combining retrieved contextual data with large language model capabilities to generate informed, personalized responses.
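As a rough illustration of the transcript chunking step, the sketch below groups already-parsed transcript rows by term and emits one text chunk per term with metadata attached. It assumes the transcript has been parsed into rows beforehand; the `TranscriptRow` type, the `chunk_by_term` function, and the chunk dictionary layout are hypothetical names chosen for this example, and the embedding and Pinecone upload steps are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscriptRow:
    term: str    # e.g. "Fall 2023"
    course: str  # e.g. "CMPSC 16"
    grade: str   # e.g. "A-"


def chunk_by_term(rows: list[TranscriptRow]) -> list[dict]:
    """Group transcript rows into one chunk per term, preserving first-seen term order."""
    by_term: dict[str, list[TranscriptRow]] = {}
    for row in rows:
        by_term.setdefault(row.term, []).append(row)

    chunks = []
    for term, term_rows in by_term.items():
        lines = [f"{r.course}: {r.grade}" for r in term_rows]
        chunks.append({
            # Stable id so re-uploading the same transcript overwrites old chunks.
            "id": f"transcript-{term.replace(' ', '-').lower()}",
            "text": f"Term {term}\n" + "\n".join(lines),
            "metadata": {"term": term, "courses": [r.course for r in term_rows]},
        })
    return chunks
```

Chunking per term keeps courses taken together in the same segment, which is one simple way to trade off granularity against context. Other groupings, such as by department or prerequisite chain, would follow the same pattern.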
Challenges we ran into
Integrating diverse data sources presented significant challenges in data normalization and matching. Professor names often appeared differently across systems, and course codes needed careful standardization. The transcript processing system required sophisticated algorithms to handle various transcript formats and extract meaningful patterns. We had to ensure our grade distribution analysis accounted for various factors like term variations and department differences. Additionally, implementing effective RAG required careful tuning of our retrieval system to provide relevant context while maintaining conversational fluency. The chunking algorithm needed to balance granularity with context preservation to ensure meaningful analysis of transcript data. CodeBuff helped a lot with debugging minor issues and gave our productivity a real boost!
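The name and course-code matching problem can be sketched roughly as follows. The input formats handled here ("Smith, John", "SMITH J", "John Smith", "cmpsc16", "CMPSC-16") are assumed examples of the variation we saw, and both function names are hypothetical. Matching on last name plus first initial is a simplification that can collide for instructors who share both.

```python
import re

# One-word department code, optional hyphen or spaces, then a number with an optional letter suffix.
_COURSE_RE = re.compile(r"\s*([A-Za-z]+)\s*-?\s*(\d+[A-Za-z]*)\s*")


def normalize_course_code(raw: str) -> str:
    """Return a canonical 'DEPT NUMBER' form, e.g. 'cmpsc16' -> 'CMPSC 16'."""
    match = _COURSE_RE.fullmatch(raw)
    if match is None:
        raise ValueError(f"unrecognized course code: {raw!r}")
    dept, number = match.groups()
    return f"{dept.upper()} {number.upper()}"


def professor_key(raw: str) -> tuple[str, str]:
    """Return (last name, first initial) in lowercase as a cross-source join key."""
    name = raw.strip()
    if "," in name:
        # "Last, First" format.
        last, first = (part.strip() for part in name.split(",", 1))
    else:
        parts = name.split()
        if len(parts) < 2:
            raise ValueError(f"cannot split name: {raw!r}")
        if len(parts[-1].rstrip(".")) == 1:
            # "LAST F" format: trailing token is a single initial.
            last, first = " ".join(parts[:-1]), parts[-1]
        else:
            # "First [Middle] Last" format.
            first, last = parts[0], parts[-1]
    if not last or not first:
        raise ValueError(f"cannot split name: {raw!r}")
    return (last.lower(), first[0].lower())
```

Records from different sources can then be joined on the pair `(normalize_course_code(code), professor_key(name))`.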
Accomplishments that we're proud of
We successfully created a comprehensive academic data ecosystem that combines official records, peer reviews, historical performance data, and intelligent transcript analysis. Our system effectively processes this information to provide nuanced course recommendations that consider both qualitative and quantitative factors. The transcript chunking system successfully identifies meaningful patterns in academic histories, enabling truly personalized recommendations. The RAG implementation achieves high relevance in its responses while maintaining natural conversation flow, effectively incorporating insights from processed transcript data.
What we learned
This project deepened our understanding of RAG architectures and their practical implementation. We gained expertise in data scraping and normalization across multiple sources, vector database operations, and the intricacies of building context-aware AI systems. Developing the transcript analysis system taught us about effective text chunking strategies and pattern recognition in academic data. Additionally, we learned how to effectively combine different types of academic data to create meaningful insights for students.
What's next for GauchoCourse
We plan to enhance our system with more sophisticated analysis of grade distribution patterns, professor teaching styles, and transcript data. Future updates will include predictive modeling to better match student learning preferences with professor teaching methods, incorporating historical performance data from processed transcripts. We aim to develop more advanced transcript analysis features that can identify subtle patterns in academic performance and provide more targeted recommendations. We're also exploring features to help students find study groups based on shared courses and learning styles, potentially using transcript data to match students with complementary academic strengths.
Built With
- anthropicapi
- flask
- github
- javascript
- langchain
- localllmembedding
- lucide-react
- next.js
- node.js
- pinecone
- python
- react
- shadcn-ui
- tailwindcss
- typescript
- vercel
- webscraping



