Inspiration
History is often locked behind dense textbook paragraphs and dry timelines. We noticed that younger students disconnect from cultural heritage because it feels static.
With Hack Days Ankara focusing on the power of the Google Gemini API, we asked ourselves: What if history could talk back?
We were inspired to build Local-Lore Storyteller, an interactive, AI-driven educational platform that transforms physical landmarks, ancient artifacts, and global history into immersive, gamified adventures told through unexpected character personas.
How We Built It
We built a lightweight, ultra-responsive web application designed for maximum engagement.
- Frontend Architecture: Built entirely using Streamlit to deliver a modern, dark-themed UI that updates dynamically without heavy page reloads.
- Core Engine: Integrated the new Google GenAI SDK using the `gemini-2.5-flash` model.
- Structured JSON Extraction: For the Adventure Mode, we used Gemini's Structured Outputs (JSON mode) to constrain the model to a predictable data structure containing a clean story narrative, multiple-choice options, and a correct answer index in a single response.
- Multimodal Vision: For the Image Scanner Mode, we used Gemini's native multimodal capabilities to analyze uploaded images (loaded as `PIL.Image` objects) alongside user chat prompts.
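The structured-output step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the schema field names (`story`, `options`, `correct_index`), the prompt wording, and the helper names `parse_adventure` and `generate_adventure` are all assumptions made for the example. It also assumes the `google-genai` package is installed and an API key is available in the environment.

```python
import json

# Hypothetical schema for one Adventure Mode turn; field names are illustrative.
ADVENTURE_SCHEMA = {
    "type": "OBJECT",
    "properties": {
        "story": {"type": "STRING"},
        "options": {"type": "ARRAY", "items": {"type": "STRING"}},
        "correct_index": {"type": "INTEGER"},
    },
    "required": ["story", "options", "correct_index"],
}


def parse_adventure(raw: str) -> dict:
    """Parse the model's JSON text and check the fields the quiz UI relies on."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("story"), str):
        raise ValueError("missing story text")
    options = data.get("options")
    if not isinstance(options, list) or not options:
        raise ValueError("missing options")
    index = data.get("correct_index")
    if not isinstance(index, int) or not 0 <= index < len(options):
        raise ValueError("correct_index out of range")
    return data


def generate_adventure(landmark: str, persona: str) -> dict:
    """Request one adventure turn as JSON (needs google-genai and an API key)."""
    # Imported lazily so parse_adventure stays usable without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client()  # picks up the API key from the environment
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=(
            f"As {persona}, tell a short story about {landmark} "
            "and ask one multiple-choice question about it."
        ),
        config=types.GenerateContentConfig(
            response_mime_type="application/json",
            response_schema=ADVENTURE_SCHEMA,
        ),
    )
    return parse_adventure(response.text)
```

Validating the parsed object before handing it to the UI is a design choice of this sketch: even with JSON mode enabled, an index outside the options list would otherwise surface later as a confusing quiz bug.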
Challenges We Overcame
- State Management in Streamlit: Because Streamlit reruns the script on every user interaction, our interactive quiz kept resetting. I solved this by storing the generated historical data in `st.session_state` so that it persists across reruns, ensuring a smooth user experience when submitting quiz answers.
- JSON Validation under Pressure: Initially, the model wrapped its JSON in markdown code fences, which caused parsing errors in Python. Setting `response_mime_type="application/json"` directly in the API configuration fixed this, and the parsing crashes stopped.
- Context Retention in Vision Chat: Ensuring that follow-up text questions remembered the contents of the uploaded image was a challenge. I resolved this by passing the image object back to the multimodal model on every turn, alongside the growing user text history.
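The first and third fixes above can be sketched as two small helpers. This is an illustrative sketch under assumptions: the helper names `get_or_create` and `build_vision_contents` are hypothetical, and the chat history is simplified to a flat list of strings. The state helper works on any dict-like object, which is how `st.session_state` behaves for membership tests and item assignment.

```python
from collections.abc import MutableMapping
from typing import Any, Callable


def get_or_create(state: MutableMapping, key: str, factory: Callable[[], Any]) -> Any:
    """Return state[key], calling factory() only the first time the key is seen.

    With Streamlit, pass st.session_state as `state` so the value survives
    the script rerun triggered by each user interaction.
    """
    if key not in state:
        state[key] = factory()
    return state[key]


def build_vision_contents(image: Any, history: list[str], question: str) -> list:
    """Build the request contents for one vision-chat turn.

    The image is placed first on every turn so that follow-up questions
    still have it in context, followed by earlier messages and the new question.
    """
    return [image, *history, question]
```

In a Streamlit script this might be used as `quiz = get_or_create(st.session_state, "quiz", make_quiz)`, where `make_quiz` is a hypothetical function that calls the model, and the list from `build_vision_contents` would be passed as the `contents` argument of the model call.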
What We Learned
I learned that the `gemini-2.5-flash` model is highly versatile. It handles complex, multi-variable context switches (landmark, persona, and language) seamlessly. I also discovered that an intuitive user loop matters far more than a massive codebase when working under a 4-hour sprint deadline.
What's Next for Local-Lore Storyteller
I plan to introduce a Text-to-Speech (TTS) audio engine to read the stories aloud in their generated character voices, making the platform accessible to visually impaired students. I also plan to integrate micro-maps so users can view the physical coordinates of the landmarks they are exploring.