Inspiration
While in middle school, did you ever wonder why you were studying the concepts being taught? Guess what? You are not alone.
Not everyone, teachers included, can study all the textbooks across grades and subjects and recall how a specific concept being taught will be used down the line.
In India, due to globalization, everybody wants to send their kids to English-medium schools. But the problem is that not many teachers in Indian public schools know English. To tackle this issue, we at Project Abhyas (www.tinyurl.com/Project-Abhyas) are creating lesson readings. These are audio recordings of textbook chapters in English, aided by vernacular explanations. (Our site has more details about lesson readings and other solutions.)
While writing scripts for the pilot run, our writers wanted to know how a specific concept taught in a lower grade is used in other subjects and subsequent grades. From there came the idea to build a tool that finds similar topics across textbooks, helping not only our scriptwriters but also teachers and students understand the bigger picture.
And that's how Abhyas Edu Context came into being. We are building a search engine for education to put concepts in context.
How we built it
We parsed textbooks across grades and extracted keywords from each sentence using BERT-based deep learning models. These keywords are stemmed and stored in TigerGraph along with contextual information such as the sentence, page number, grade, and subject.
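The indexing step can be pictured with a minimal sketch. It is not our production code: a plain dictionary stands in for TigerGraph, a stop-word filter stands in for the BERT-based keyword extractor, and a crude suffix stripper stands in for a real stemmer such as the ones in NLTK. The names `Sentence`, `STOP_WORDS`, `stem`, `extract_keywords`, and `build_index` are hypothetical and exist only for this illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical stop-word list; a stand-in for the BERT-based keyword extractor.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "to", "and", "by", "used"}


@dataclass(frozen=True)
class Sentence:
    """A sentence vertex with the contextual attributes stored next to it."""
    text: str
    page: int
    grade: int
    subject: str


def stem(word):
    """Crude suffix stripper; a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def extract_keywords(text):
    """Lowercase, strip punctuation, drop stop words, and stem what is left."""
    words = (w.strip(".,;:!?()").lower() for w in text.split())
    return {stem(w) for w in words if w and w not in STOP_WORDS}


def build_index(sentences):
    """Map each stemmed keyword to the sentences it appears in.

    In a graph database this corresponds to keyword vertices joined by
    edges to sentence vertices.
    """
    index = defaultdict(set)
    for sentence in sentences:
        for keyword in extract_keywords(sentence.text):
            index[keyword].add(sentence)
    return index


corpus = [
    Sentence("A fraction represents part of a whole.", 12, 5, "Mathematics"),
    Sentence("Fractions are used to express concentration of a solution.", 40, 7, "Science"),
]
index = build_index(corpus)
# The keyword "fraction" now links a grade 5 and a grade 7 sentence.
print(sorted((s.grade, s.subject) for s in index["fraction"]))
```

Because both "fraction" and "Fractions" reduce to the same stem, the two sentences end up attached to one keyword, which is what lets a concept be traced across grades and subjects.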
When the user enters a query, we extract its keywords and then query TigerGraph for sentences connected to these keywords. We then convert the resulting sentences into embeddings using deep learning models, compute their similarity to the query, and show the topics with the best context match.
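The query flow described above can be sketched in the same spirit. This is an illustration under assumptions, not our actual implementation: a dictionary lookup stands in for the TigerGraph query, and a bag-of-words vector stands in for the BERT-based sentence embedding. The names `embed`, `cosine`, `search`, and `keyword_index` are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a BERT-based sentence embedding."""
    words = (w.strip(".,;:!?()").lower() for w in text.split())
    return Counter(w for w in words if w)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, keyword_index, top_k=3):
    """Collect sentences linked to any query keyword, then rank by similarity."""
    query_vec = embed(query)
    candidates = set()
    for keyword in query_vec:
        # Stand-in for the graph query: sentences connected to this keyword.
        candidates.update(keyword_index.get(keyword, ()))
    ranked = sorted(candidates, key=lambda s: cosine(query_vec, embed(s)), reverse=True)
    return ranked[:top_k]


# Hypothetical keyword-to-sentence index standing in for the graph.
keyword_index = {
    "fraction": [
        "A fraction represents part of a whole.",
        "A fraction of sunlight is absorbed by plants.",
    ],
    "whole": ["A fraction represents part of a whole."],
}
results = search("what is a fraction of a whole", keyword_index)
for sentence in results:
    print(sentence)
```

The two-stage design keeps the expensive similarity computation small: the keyword lookup narrows the corpus to a handful of candidates, and only those are embedded and compared with the query.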
Challenges we ran into
Learning to think in terms of a multi-dimensional graph, as opposed to the two-dimensional tables of traditional databases, took some time.
Accomplishments that we're proud of
We are proud of learning graph analytics and using it to create a submission that is actually relevant to our non-profit.
What we learned
The list is long: Graph DBs, TigerGraph, Graph Studio, BERT-based embeddings, similarity matching, graph-based search engines.
What's next for Abhyas Edu Context
We are building this as a solution for our non-profit, Project Abhyas (www.tinyurl.com/Project-Abhyas). The solution can be extended to include n-grams and similarities at the paragraph, section, and chapter levels by extracting key components at the respective levels and creating embeddings.
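As a rough idea of what the n-gram extension could look like, here is a minimal sketch that produces word n-grams from a tokenized sentence. The function name `ngrams` and the sample phrase are hypothetical, and how such n-grams would be stored or matched is still open.

```python
def ngrams(tokens, n):
    """Return all contiguous word n-grams of length n, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "law of conservation of energy".split()
bigrams = ngrams(tokens, 2)
print(bigrams)
```

Multi-word keys such as "conservation of" would let the search match phrases that single stemmed keywords cannot tell apart.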
With further additions to data processing and schema, we can build a general-purpose dashboard/API-based platform to allow anyone to upload documents and provide context-aware search. A public, domain-specific search library for common use cases (board/stream specific textbook search) can be pre-loaded for usage at scale. This transcends the education domain and has applications across industries.
Built With
- bert
- graph
- nltk
- python
- tigergraph