Inspiration

Educational inequality remains one of the most persistent social challenges. Students, especially those from marginalized communities, often lack access to personalized academic support and timely feedback, which widens achievement gaps over time. Traditional learning platforms frequently rely on one-size-fits-all teaching methods, making it difficult to identify individual learning gaps. To combat this, we developed a model that makes it easier for instructors to identify struggling students and intervene early, backed by data-driven insight into where each student needs support.

What it does

By analyzing students’ performance data (correctness in answering questions), our Deep Knowledge Tracing (DKT) model identifies learning gaps and tracks individual progress over time. This enables teachers to deliver targeted, data-driven interventions tailored to each student’s needs, helping them strengthen understanding, catch up to peers, and improve overall outcomes. By providing continuous insight into student development—especially for marginalized groups—the model supports equitable access to digital learning tools and contributes to reducing achievement gaps at both the classroom and school levels.
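Concretely, a DKT model consumes a time-ordered sequence of (question, correct?) interactions per student and, at each step, predicts the probability of answering each question correctly. A minimal sketch of the standard one-hot input encoding from the DKT literature (illustrative, not necessarily our exact pipeline):

```python
import numpy as np

def encode_interactions(interactions, num_questions):
    """Standard DKT input encoding: each (question_id, correct) pair
    becomes a one-hot vector of length 2*Q, where slots [0, Q) mean
    "answered q incorrectly" and slots [Q, 2Q) mean "answered q correctly"."""
    X = np.zeros((len(interactions), 2 * num_questions))
    for t, (q, correct) in enumerate(interactions):
        X[t, q + (num_questions if correct else 0)] = 1.0
    return X

# Example: three interactions over a 4-question item bank.
X = encode_interactions([(0, 1), (2, 0), (0, 0)], num_questions=4)
print(X.shape)  # (3, 8)
```

Each encoded sequence is what the recurrent model reads step by step to update its estimate of the student's knowledge state.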

How we built it

We preprocessed the XES3G5M dataset to work with an LSTM model that takes in sequences of student answers to learn what a student does or does not know. For features, we vectorized individual questions, incorporated tags describing each question, and included other metrics such as time taken, prior student performance, and question type (MCQ vs. FRQ).
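The per-timestep input can be sketched as a concatenation of these feature groups. The function below is illustrative (the names, dimensions, and normalization are assumptions, not our exact schema):

```python
import numpy as np

def build_step_features(question_id, tag_ids, time_taken_s, prior_accuracy,
                        is_mcq, num_questions, num_tags):
    """Concatenate a one-hot question id, a multi-hot tag vector, and
    scalar metrics into one feature vector for a single LSTM timestep."""
    q_onehot = np.zeros(num_questions)
    q_onehot[question_id] = 1.0
    tag_multihot = np.zeros(num_tags)
    tag_multihot[list(tag_ids)] = 1.0
    extras = np.array([
        time_taken_s / 60.0,        # crude normalization: seconds -> minutes
        prior_accuracy,             # running fraction of prior answers correct
        1.0 if is_mcq else 0.0,     # question type flag (MCQ vs. FRQ)
    ])
    return np.concatenate([q_onehot, tag_multihot, extras])

x = build_step_features(question_id=7, tag_ids=[1, 3], time_taken_s=45,
                        prior_accuracy=0.6, is_mcq=True,
                        num_questions=100, num_tags=20)
print(x.shape)  # (123,)
```

Stacking these vectors over a student's answer history yields the sequence fed to the LSTM.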

Challenges we ran into

The XES3G5M dataset is very large and contains a significant amount of data that wasn’t relevant to our model, so trimming and preprocessing it was challenging. Since we were working with this data on our own laptops rather than in the cloud, each preprocessing attempt was time-consuming and computationally intensive.
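Streaming the file row by row, rather than loading it whole, kept memory use manageable on our laptops. A sketch with the stdlib `csv` module (the column names are placeholders, not XES3G5M's actual schema):

```python
import csv

def filter_columns(src_path, dst_path, keep=("uid", "questions", "responses")):
    """Stream a large CSV and write out only the columns the model needs,
    so the full file never has to fit in memory at once."""
    with open(src_path, newline="") as src, \
         open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=list(keep))
        writer.writeheader()
        for row in reader:
            writer.writerow({k: row[k] for k in keep})
```

The same pass can also drop rows that fail a validity check, which avoids a second full scan of the file.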

Accomplishments that we're proud of

Our model achieves an AUC of 0.82. Through several iterations of feature selection, hyperparameter tuning, and model training, we surpassed 0.80 AUC, a threshold generally considered strong performance for knowledge tracing.
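For reference, AUC here is the probability that the model ranks a randomly chosen correct answer above a randomly chosen incorrect one. A minimal pairwise implementation of the metric (in practice one would use a library routine such as scikit-learn's `roc_auc_score`):

```python
def auc(labels, scores):
    """Pairwise AUC: the fraction of (positive, negative) pairs where the
    positive example receives the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 0.5 corresponds to random ranking, so 0.82 means the model orders correct and incorrect responses substantially better than chance.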

What we learned

We learned a lot about working with large datasets and about building the analytics needed to guide model tuning, especially since preprocessing the data and training the model took a lot of time.

What's next for Lumina

In the future, we hope to integrate our model into a Learning Management System (LMS) and evaluate its utility in real-world settings.
