Inspiration

More than 2.6M students take AP courses each year (Forbes). AP courses are widely considered to be highly challenging due to the significant amount of information and concepts included in them. In our experience taking many AP courses, targeted tutoring and easily digestible information make all the difference between succeeding and struggling.

However, many students do not have access to such tutoring. Our AP World History teachers weren't able to grade more than two LEQs (long essay questions) with proper feedback, because grading LEQs for ~150 students takes so long. Additionally, AP classes cover so much material that the content tends to lose its structure, and it is difficult to learn without proper structure.

What it does

Chat with your AP courses: We currently offer 6 AP courses that you can chat with. Our models draw on tens of thousands of indexed pages from textbooks and curated websites, so you get a citation and a deep explanation for every question. Our math model seamlessly accesses Wolfram Alpha, meaning even the toughest Calculus problems are easy for Archibald. Struggling with an integral? Ask Archibald. Don't want to make a SPICE chart for the Incan civilization? Ask Archibald. Can't understand a CSA concept? Archibald can explain it simply.

Get Grades on your LEQs and SAQs instantly: Most teachers simply don't have the time to grade their hundreds of students' LEQs and SAQs with comprehensive feedback and suggestions. This disadvantages many students when it comes to the written portion of the History APs. We provide:

  1. Constructive line-by-line feedback with kudos and improvement suggestions
  2. An unbelievably accurate LEQ/SAQ number grade (seriously, you have to see it to believe it)
  3. Which rubric points you earned or lost, with an explanation of why

Definitions: Hover over terms that the model responds with to see a definition. We have over 15 thousand AP terms with definitions scraped.

Automatic MCQ generator: For our AP Biology and AP Computer Science A courses, we have an MCQ generator: you tell it what topic you need an MCQ on, and it will generate a question with four choices. After you choose one, it informs you if you were correct or not, and gives a complete explanation for the correct answer.
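
To make the flow above concrete, here is a minimal sketch of how a generated question and its grading step could be modeled. The type names, field names, and the sample question are all hypothetical illustrations, not our actual schema:

```typescript
// Hypothetical shape of a generated question; field names are illustrative.
interface MultipleChoiceQuestion {
  topic: string;
  prompt: string;
  choices: [string, string, string, string]; // exactly four choices
  correctIndex: 0 | 1 | 2 | 3;
  explanation: string;
}

interface GradedAnswer {
  correct: boolean;
  feedback: string;
}

// Compare the student's pick to the answer key and always include the explanation.
function gradeAnswer(q: MultipleChoiceQuestion, chosenIndex: number): GradedAnswer {
  const correct = chosenIndex === q.correctIndex;
  const verdict = correct
    ? "Correct!"
    : `Not quite. The answer is "${q.choices[q.correctIndex]}".`;
  return { correct, feedback: `${verdict} ${q.explanation}` };
}

// Made-up sample question, used only to exercise the sketch.
const sample: MultipleChoiceQuestion = {
  topic: "AP Biology: cellular respiration",
  prompt: "Where in a eukaryotic cell does the Krebs cycle take place?",
  choices: ["Cytoplasm", "Mitochondrial matrix", "Nucleus", "Golgi apparatus"],
  correctIndex: 1,
  explanation: "The Krebs (citric acid) cycle runs in the mitochondrial matrix.",
};
```

Typing `choices` as a four-element tuple lets the compiler reject a question with the wrong number of options before it ever reaches the UI.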

How we built it

Front-end:

  1. React + Tailwind CSS + Framer Motion for some animation. Some components are from the Aceternity library, but most are completely custom.
  2. GIGAMD, our custom, completely hand-written Markdown parser & renderer built specially for this hackathon.

Back-end: Next.js + Supabase Postgres & Auth.

AI:

  1. Google Cloud Vertex AI for access to the Claude 3.5 Sonnet LLM
  2. Langchain.js (our custom patched version including a bug fix and major feature addition)
  3. Supabase PGVector as our vector store for RAG
  4. Wolfram Alpha API for our AP Calculus models

Scraper:

  1. We wrote custom web scrapers to collect information on/about AP courses
  2. Our custom scrapers have scraped thousands of pages worth of information from the internet
  3. Additionally, we have over 15 thousand AP terms with definitions scraped

Challenges we ran into

  • Budget: we wanted to make this for $0. This was not easy
  • Email Server: We send users a confirmation e-mail when they sign up. Finding an SMTP email server that would allow us to do this for free was difficult
  • Because of our "budget constraints", we were forced to use Google Cloud Vertex AI, since it provided a free trial with $150 in cloud credits. However, the LLM framework we chose, Langchain.js, did not support non-Gemini models (aka the good models) hosted on Vertex AI. This turned out to be a highly requested feature, and we ended up patching Langchain to support Claude 3.5 Sonnet on Vertex AI. The community has since evolved our patch into a pull request.
  • We also ended up writing a patch to a known but unsolved Langchain bug that we encountered. A pull request will be created after this hackathon.
  • Our MCQ generation model is still a bit rough. It was rushed, so its CSA problem generator lacks markdown formatting (we ran into unforeseen issues at the last minute implementing this). However, it still works quite well, and is very fun to play with.
  • GIGAMD, our custom markdown parser and renderer, was born out of frustration with the inflexibility of existing markdown renderers. We implemented it as a one-pass recursive descent parser, and only later learned that markdown parsers are usually two-pass (block structure first, then inline content) :( This resulted in a very difficult time, and we ended up having to make some concessions when it comes to formatting lists (so 95% of responses from our LLM are perfectly formatted, but lists nested in bullet points... don't render properly)
  • Streaming: You would be surprised how difficult it is to stream the output from the AI model to the front-end in real time. It led to all sorts of nasty, hard-to-reproduce bugs.
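
As an illustration of why streaming is fiddly, here is a minimal sketch of one common pitfall: network chunks do not respect message boundaries, so a line of model output can arrive split across two chunks. This is not necessarily one of the bugs we hit, and `StreamLineBuffer` is a hypothetical helper, not code from the project.

```typescript
// Buffers raw chunks and only releases complete, newline-terminated lines.
class StreamLineBuffer {
  private pending = "";

  // Accept one network chunk; return every line that this chunk completed.
  push(chunk: string): string[] {
    this.pending += chunk;
    const parts = this.pending.split("\n");
    // The last element is "" if the chunk ended on a newline, else a partial line.
    this.pending = parts.pop() ?? "";
    return parts;
  }

  // Call once when the stream closes to release any trailing partial line.
  flush(): string[] {
    const rest = this.pending;
    this.pending = "";
    return rest.length > 0 ? [rest] : [];
  }
}
```

Feeding it `"Hel"`, then `"lo\nWor"`, then `"ld\n"` releases `"Hello"` after the second chunk and `"World"` after the third. Rendering each chunk directly would instead show broken fragments, and which fragments appear depends on network timing, which is part of what makes this class of bug hard to reproduce.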

Accomplishments that we're proud of

  • Building a complete prototype that works extremely well
  • Beautiful UI with animations
  • Indexing tens of thousands of pages from textbooks through our RAG pipeline
  • Scraping thousands of web pages
  • Custom markdown parser
  • Langchain bug & feature patches

What we learned

  • Some of our team members had never worked with Next.js before
  • Most team members were unfamiliar with RAG pipelines and how these models are supposed to work
  • We learned not to use Langchain again. It is very popular and widely used in the industry, but it is over-abstracted and inflexible. There are abstractions on top of abstractions, which led us to spend more time debugging and patching Langchain than debugging our own models (metaphorically)
  • Parse markdown in two passes, not one
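
For readers curious what "two passes" means in practice, here is a heavily simplified sketch. It is not GIGAMD. It handles only headings, bullet items, paragraphs, bold, and underscore emphasis, it does not escape HTML, and the two-spaces-per-nesting-level rule is an assumption of the example. The first pass decides block structure line by line, and the second pass parses inline formatting inside each block's text.

```typescript
type Block =
  | { kind: "heading"; level: number; text: string }
  | { kind: "listItem"; depth: number; text: string }
  | { kind: "paragraph"; text: string };

// Pass 1: classify each line into a block, leaving inline text untouched.
function parseBlocks(source: string): Block[] {
  const blocks: Block[] = [];
  for (const line of source.split("\n")) {
    if (line.trim() === "") continue; // blank lines only separate blocks
    const heading = /^(#{1,6})\s+(.*)$/.exec(line);
    const item = /^(\s*)[-*]\s+(.*)$/.exec(line);
    if (heading) {
      blocks.push({ kind: "heading", level: heading[1].length, text: heading[2] });
    } else if (item) {
      // Assumption: two spaces of indentation per nesting level.
      blocks.push({ kind: "listItem", depth: Math.floor(item[1].length / 2), text: item[2] });
    } else {
      blocks.push({ kind: "paragraph", text: line.trim() });
    }
  }
  return blocks;
}

// Pass 2: inline formatting, applied to text whose block role is already known.
function parseInline(text: string): string {
  return text
    .replace(/\*\*(.+?)\*\*/g, "<strong>$1</strong>")
    .replace(/_(.+?)_/g, "<em>$1</em>");
}

// Run both passes and emit one HTML fragment per block.
function render(source: string): string[] {
  return parseBlocks(source).map((block) => {
    const inner = parseInline(block.text);
    switch (block.kind) {
      case "heading":
        return `<h${block.level}>${inner}</h${block.level}>`;
      case "listItem":
        return `<li data-depth="${block.depth}">${inner}</li>`;
      case "paragraph":
        return `<p>${inner}</p>`;
    }
  });
}
```

Because nesting depth is settled in the first pass, before any inline parsing happens, a nested bullet is just a `listItem` with a larger `depth`. That separation is what a single interleaved pass makes hard to get right.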

What's next for Archibald.ai

We are going to continue to work on this project!
