dothanhtam91/EduTag---Rice-Datathon-2026

How to Run This Project on Kaggle

This project is fully reproducible and designed to run directly in the Kaggle Notebook environment.

Step 1: Open the Notebook

Access the public Kaggle notebook here:

https://www.kaggle.com/code/thanhtamdo91/openstax-2026-rice-datathon-mapping-standards-i


Step 2: Fork the Notebook

  1. Click “Copy & Edit” (top right)
  2. This will create your own editable copy in your Kaggle workspace

Step 3: Enable GPU

  1. In the notebook, click Settings
  2. Set:
    • Accelerator: GPU
    • GPU Type: T4 ×2 (recommended)
  3. Save settings
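
To confirm that the accelerator setting took effect, a quick check can be run in the first cell. This is a minimal sketch, not part of the original notebook: it assumes the `nvidia-smi` command-line tool is on the PATH when a GPU is attached, and the helper name `count_gpus` is hypothetical.

```python
import shutil
import subprocess


def count_gpus():
    """Return the number of GPUs listed by `nvidia-smi -L`, or 0 if none are visible."""
    # Without the NVIDIA tool on PATH, treat the session as CPU-only.
    if shutil.which("nvidia-smi") is None:
        return 0
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.SubprocessError):
        return 0
    if result.returncode != 0:
        return 0
    # `nvidia-smi -L` prints one line per device, each starting with "GPU ".
    return sum(1 for line in result.stdout.splitlines() if line.startswith("GPU "))


print(f"Visible GPUs: {count_gpus()}")
```

With the T4 ×2 option selected, the count should be 2; a count of 0 suggests the session is still running on CPU.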

Step 4: Add Gemini API Key (Optional but Recommended)

This project uses the Gemini API for data augmentation.

  1. Go to Kaggle → Settings → Secrets
  2. Add a new secret:
    • Name: GEMINI_API_KEY
    • Value: your Gemini API key
  3. The notebook automatically loads it using kaggle_secrets

If the API key is not provided, the notebook will still run using the base dataset.
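
The optional-key behavior described above can be sketched as follows. This is an illustration under assumptions, not the notebook's actual code: the helper name `load_gemini_key` and the environment-variable fallback are hypothetical, while `UserSecretsClient.get_secret` is the call provided by the `kaggle_secrets` module inside Kaggle.

```python
import os


def load_gemini_key(secret_name="GEMINI_API_KEY"):
    """Return the API key, or None when no key is configured."""
    try:
        # kaggle_secrets is only available inside the Kaggle environment.
        from kaggle_secrets import UserSecretsClient

        return UserSecretsClient().get_secret(secret_name)
    except Exception:
        # Assumed fallback: outside Kaggle, or when the secret is not attached,
        # look for an environment variable with the same name.
        return os.environ.get(secret_name)


api_key = load_gemini_key()
if api_key is None:
    print("No Gemini key found; skipping augmentation and using the base dataset.")
```

Returning None rather than raising keeps the rest of the pipeline runnable when the key is absent, which matches the fallback to the base dataset.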


Step 5: Run All Cells

  1. Click Run All
  2. The notebook will:
    • Parse and flatten the OpenStax JSON data
    • Perform EDA
    • Generate embeddings and build a FAISS index
    • Run semantic retrieval + re-ranking
    • Output evaluation metrics and visualizations
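
The retrieval step in the list above amounts to a nearest-neighbor search over embedding vectors. The sketch below is not the notebook's code and does not use FAISS. It is a dependency-free, brute-force cosine-similarity search, which gives the same ranking an exact inner-product index would produce over normalized vectors. The function names and the toy 2-dimensional vectors are hypothetical; which FAISS index type the notebook uses is not stated here.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so the dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k(query, corpus, k=3):
    """Return (index, score) pairs for the k corpus vectors most similar to the query."""
    q = normalize(query)
    scored = []
    for idx, vec in enumerate(corpus):
        d = normalize(vec)
        scored.append((sum(a * b for a, b in zip(q, d)), idx))
    # Highest score first; ties broken by the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(idx, score) for score, idx in scored[:k]]


# Toy stand-ins for standard embeddings and one query embedding.
standard_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
results = top_k([0.9, 0.1], standard_vectors, k=2)
print(results)
```

A re-ranking stage would then re-score only these top-k candidates with a more expensive model, so the cheap vector search narrows the candidate set first.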

Expected Runtime

  • GPU Enabled: ~10–15 minutes
  • CPU Only: Not recommended (the embedding and FAISS indexing steps are slow without a GPU)

Outputs

  • Evaluation metrics (Hit@K, F1, Hamming Loss)
  • Retrieval performance analysis
  • Error distribution across standard codes
  • Augmented training artifacts (optional)
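
For readers unfamiliar with the metrics listed above, here is a minimal sketch of two of them using common definitions. The notebook's exact definitions may differ, and the function names and toy labels are hypothetical. Hit@K is taken as the fraction of items with at least one true standard among the top K predictions; Hamming loss is the fraction of label slots predicted incorrectly.

```python
def hit_at_k(ranked, relevant, k):
    """Fraction of items whose top-k ranked labels contain at least one true label."""
    hits = sum(1 for preds, truth in zip(ranked, relevant) if set(preds[:k]) & set(truth))
    return hits / len(ranked)


def hamming_loss(y_true, y_pred, labels):
    """Fraction of (item, label) slots where prediction and truth disagree."""
    wrong = sum(len(set(t) ^ set(p)) for t, p in zip(y_true, y_pred))
    return wrong / (len(y_true) * len(labels))


# Toy data: four possible standard codes, two items.
labels = ["A", "B", "C", "D"]
ranked = [["A", "C", "B"], ["D", "B", "A"]]  # ranked predictions per item
relevant = [["B"], ["C"]]                    # true labels per item

print(hit_at_k(ranked, relevant, k=1))
print(hit_at_k(ranked, relevant, k=3))

# Treat the top-1 prediction as the predicted label set for Hamming loss.
top1 = [preds[:1] for preds in ranked]
print(hamming_loss(relevant, top1, labels))
```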

Notes

  • No local setup is required
  • No additional dependencies need to be installed
  • All datasets are loaded directly from Kaggle inputs
