Inspiration
Every time I've had an idea for a video or other content, I've struggled with how to start or what to say. I soon realized many of my friends faced the exact same challenge: they could use AI to generate images, voices, or even videos, but they were stuck on the fundamental question of what to actually write or say. This common struggle inspired me to create Vocal Script AI, a tool that transforms raw ideas into structured scripts and professional voiceovers, making content creation easier for everyone.
What it does
Vocal Script AI is a tool that transforms a user’s typed or spoken input into a well-structured, ready-to-use script. It also offers an optional voiceover feature, so users can listen to the script, which is especially helpful for those who prefer audio over reading.
How we built it
Frontend
-HTML, CSS, Vanilla JS: For a responsive and intuitive user interface.
-Rendering library (EJS): To efficiently display dynamic content.
Backend
-Node.js, Express.js: A powerful and flexible runtime environment and web application framework.
-TypeScript: For type-safe and maintainable backend development.
AI Integration
-Hugging Face: Used for text generation and text-to-speech.
-OpenAI: Integrated to support and enhance specific AI models via Hugging Face.
Key Libraries
-dotenv: Securely manages API keys and configuration settings.
-kokoro-js: Powers our text-to-speech engine.
-onnxruntime-node: A crucial dependency for kokoro-js.
-nodemon, tsx: Dev-time tools for running backend code (e.g., via npm run dev).
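As an illustration of the dotenv point above, here is a minimal sketch of how a backend might validate its settings at startup. The variable names HF_TOKEN and PORT, the default port, and the loadConfig helper are all assumptions made for this example, not the project's actual configuration.

```typescript
// Hypothetical shape of the settings the backend needs.
interface AppConfig {
  hfToken: string; // API token for the AI provider (assumed name)
  port: number;    // port the Express server listens on (assumed name)
}

type Env = Record<string, string | undefined>;

// Reads and validates settings from an environment-like object.
// Failing fast here avoids confusing errors later, on the first AI request.
function loadConfig(env: Env): AppConfig {
  const hfToken = env["HF_TOKEN"];
  if (!hfToken) {
    throw new Error("Missing required environment variable: HF_TOKEN");
  }

  // Fall back to 3000 (an arbitrary default) when PORT is not set.
  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`Invalid PORT value: ${env["PORT"]}`);
  }

  return { hfToken, port };
}
```

In a Node.js backend this would typically be called as loadConfig(process.env) after dotenv has loaded the .env file (for example via import "dotenv/config"). Taking the environment as a parameter keeps the function easy to test.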
Challenges we ran into
-It was my first time integrating AI models in the backend.
-It was also my first experience using the MediaRecorder API.
-Learning to work inside the Adobe Express Code Playground environment.
-Finding free, reliable AI models that work well with speech and text.
Accomplishments that we're proud of
-I'm proud that I was able to successfully complete the project and bring the full idea to life.
-The add-on now converts voice into text, then turns that text into a well-structured script, and finally generates a realistic voiceover from that script.
-All the core AI features and goals of the project were achieved.
-I also managed to connect everything end-to-end, from input to final output, using my own backend and AI integrations.
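The middle step of that flow, turning the transcribed text into a script, can be sketched as a small prompt builder. This is only an illustration: the role/content message shape follows the common OpenAI-style chat format, while the buildScriptMessages helper, the prompt wording, and the word limit are assumptions rather than the add-on's real code.

```typescript
type ChatRole = "system" | "user" | "assistant";

interface ChatMessage {
  role: ChatRole;
  content: string;
}

// Hypothetical helper: wraps a raw idea (typed, or transcribed from speech)
// in chat messages that ask a text-generation model for a structured script.
function buildScriptMessages(rawIdea: string, maxWords: number = 150): ChatMessage[] {
  // Collapse the stray whitespace and line breaks that transcripts often contain.
  const idea = rawIdea.trim().replace(/\s+/g, " ");
  if (idea.length === 0) {
    throw new Error("The idea text is empty.");
  }

  return [
    {
      role: "system",
      content:
        "You are a scriptwriter. Turn the user's idea into a structured script " +
        "with a hook, a body, and a closing line. " +
        `Keep it under ${maxWords} words so it suits a short voiceover.`,
    },
    { role: "user", content: idea },
  ];
}
```

The returned array would then be sent to whichever chat-capable model the backend uses, and the model's reply passed on to the text-to-speech step.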
What we learned
-How to work with AI APIs like Hugging Face and integrate them with a backend.
-Basics of using the Adobe Express Add-on SDK and Code Playground.
-How to record and process voice input using the MediaRecorder API.
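One practical detail when recording with MediaRecorder is that browsers support different audio formats. The sketch below shows one way to pick a format before recording starts. The candidate list and the pickMimeType helper are assumptions for illustration. The support check is passed in as a function so the logic can also run outside a browser.

```typescript
// Candidate formats in order of preference (an assumed list; adjust as needed).
const CANDIDATE_MIME_TYPES: string[] = [
  "audio/webm;codecs=opus",
  "audio/webm",
  "audio/mp4",
];

// Returns the first candidate the environment reports as supported,
// or undefined when none is, in which case the browser default can be used.
function pickMimeType(
  isSupported: (mimeType: string) => boolean,
  candidates: string[] = CANDIDATE_MIME_TYPES
): string | undefined {
  return candidates.find((mimeType) => isSupported(mimeType));
}
```

In the browser, the check would be MediaRecorder.isTypeSupported, for example pickMimeType((t) => MediaRecorder.isTypeSupported(t)), and the result can be passed as the mimeType option when constructing the recorder.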
What's next for Vocal Script AI
-Migrate to paid AI models for improved performance and reliability.
-Refactor the codebase to be more modular, maintainable, and follow clean architecture and SOLID principles.
-Add tests.
-Enhance the user interface to match Adobe Express design standards.
-Implement persistent storage for user data and history.
-Prepare for a full launch on the Adobe Express Add-on Marketplace.
Built With
- adobe-express-code-playground
- css
- dotenv
- ejs
- express.js
- github
- html
- huggingface-interface-api
- javascript
- kokoro-js
- node.js
- nodemon
- onnxruntime-node
- openai
- postman
- tsx
- typescript
- vscode