Inspiration

Every time I've had an idea for a video or other content, I've struggled with how to start and what to say. I soon realized many of my friends faced the same challenge: they could use AI to generate images, voices, or even videos, but they were stuck on the fundamental question of what to actually write or say. That shared struggle inspired me to create Vocal Script AI, a tool that turns raw ideas into structured scripts and professional voiceovers, making content creation easier for everyone.

What it does

Vocal Script AI is a tool that transforms a user’s typed or spoken input into a well-structured, ready-to-use script. It also offers an optional voiceover feature so users can listen to the script, which is especially helpful for those who prefer audio over reading.

How we built it

Frontend

-HTML, CSS, Vanilla JS: For a responsive and intuitive user interface.

-Rendering Library: To efficiently display dynamic content.

Backend

-Node.js, Express.js: A powerful and flexible runtime environment and web application framework.

-TypeScript: For type-safe and maintainable backend development.

AI Integration

-Hugging Face: Used for text generation and text-to-speech.

-OpenAI: Integrated to support and enhance specific AI models via Hugging Face.

Key Libraries

-dotenv: Securely manages API keys and configuration settings.

-kokoro-js: Powers our text-to-speech engine.

-onnxruntime-node: A crucial dependency for kokoro-js.

-nodemon, tsx: Dev-time tools for running backend code (e.g., via npm run dev).
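
As an illustration of the configuration step that dotenv supports, here is a minimal sketch of validating settings once they are loaded. The variable names HF_TOKEN and PORT, the AppConfig shape, and the loadConfig helper are assumptions made for this example, not the project's actual code.

```typescript
// Hypothetical config shape; HF_TOKEN and PORT are illustrative names only.
interface AppConfig {
  hfToken: string;
  port: number;
}

type Env = Record<string, string | undefined>;

// Build a typed config from an env-like object and fail fast on bad input.
// In a real backend, `import "dotenv/config"` would run first so that the
// values from .env are already present on process.env.
function loadConfig(env: Env): AppConfig {
  const hfToken = env.HF_TOKEN?.trim();
  if (!hfToken) {
    throw new Error("Missing required environment variable: HF_TOKEN");
  }

  // Fall back to a default port when none is configured.
  const port = Number(env.PORT ?? "3000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }

  return { hfToken, port };
}
```

On the server this would be called once at startup, for example as loadConfig(process.env), so a missing key stops the app immediately instead of failing on the first AI request.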

Challenges we ran into

-It was my first time integrating AI models in the backend.

-First experience using the MediaRecorder API.

-Learning to work inside the Adobe Express Code Playground environment.

-Finding free, reliable AI models that work well with speech and text.

Accomplishments that we're proud of

-I'm proud that I was able to successfully complete the project and bring the full idea to life.

-The add-on now converts voice into text, then turns that text into a well-structured script, and finally generates a realistic voiceover from that script.

-All the core AI models and goals of the project were achieved.

-I also managed to connect everything end-to-end — from input to final output — using my own backend and AI integrations.
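
The end-to-end flow described above can be sketched roughly as follows. This is only an outline under assumptions: the Stages interface, the UserInput type, and runPipeline are hypothetical names, and each stage is passed in as a function so the sketch does not claim which model or library call sits behind it.

```typescript
// Each stage is injected so the flow can be read and tested without
// committing to a particular model. All names here are hypothetical.
interface Stages {
  transcribe(audio: Uint8Array): Promise<string>;
  writeScript(idea: string): Promise<string>;
  synthesize(script: string): Promise<Uint8Array>;
}

type UserInput =
  | { kind: "text"; text: string }
  | { kind: "voice"; audio: Uint8Array };

interface PipelineResult {
  idea: string;
  script: string;
  voiceover?: Uint8Array; // only present when a voiceover was requested
}

async function runPipeline(
  input: UserInput,
  stages: Stages,
  withVoiceover: boolean,
): Promise<PipelineResult> {
  // Step 1: spoken input is transcribed; typed input is used as-is.
  const raw =
    input.kind === "voice" ? await stages.transcribe(input.audio) : input.text;
  const idea = raw.trim();
  if (idea.length === 0) {
    throw new Error("No idea text to turn into a script");
  }

  // Step 2: turn the raw idea into a structured script.
  const script = await stages.writeScript(idea);

  // Step 3 (optional): generate the voiceover from the finished script.
  if (!withVoiceover) return { idea, script };
  return { idea, script, voiceover: await stages.synthesize(script) };
}
```

Keeping the stages behind an interface also makes it easier to swap a free model for a paid one later without touching the flow itself.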

What we learned

-How to work with AI APIs like Hugging Face and integrate them with a backend.

-The basics of using the Adobe Express Add-on SDK and Code Playground.

-How to record and process voice input using the MediaRecorder API.
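
For readers new to the MediaRecorder API, here is a minimal browser-side sketch of recording a short clip. The helper names pickMimeType and recordClip, the candidate MIME types, and the fixed-duration approach are assumptions for illustration; the add-on's actual recording code may differ.

```typescript
// Pick the first container/codec the browser reports it can record.
// The support check is injected so this helper is testable outside a browser.
function pickMimeType(
  candidates: string[],
  isSupported: (type: string) => boolean,
): string | undefined {
  return candidates.find(isSupported);
}

// Record from the microphone for `durationMs` and resolve with the audio Blob.
// Browser-only: relies on navigator.mediaDevices and MediaRecorder.
async function recordClip(durationMs: number): Promise<Blob> {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const mimeType = pickMimeType(
    ["audio/webm;codecs=opus", "audio/webm", "audio/mp4"],
    (type) => MediaRecorder.isTypeSupported(type),
  );
  const recorder = new MediaRecorder(stream, mimeType ? { mimeType } : undefined);
  const chunks: Blob[] = [];

  // Release the microphone once recording ends, whether it succeeded or not.
  const releaseMic = () => stream.getTracks().forEach((track) => track.stop());

  return new Promise<Blob>((resolve, reject) => {
    recorder.ondataavailable = (event) => {
      if (event.data.size > 0) chunks.push(event.data);
    };
    recorder.onerror = () => {
      releaseMic();
      reject(new Error("Recording failed"));
    };
    recorder.onstop = () => {
      releaseMic();
      resolve(new Blob(chunks, { type: recorder.mimeType }));
    };

    recorder.start();
    setTimeout(() => {
      if (recorder.state !== "inactive") recorder.stop();
    }, durationMs);
  });
}
```

The resulting Blob could then be sent to the backend for transcription, for example in a FormData upload.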

What's next for Vocal Script AI

-Migrate to paid AI models for improved performance and reliability.

-Refactor the codebase to be more modular, maintainable, and follow clean architecture and SOLID principles.

-Add tests.

-Enhance the user interface to match Adobe Express design standards.

-Implement persistent storage for user data and history.

-Prepare for a full launch on the Adobe Express Add-on Marketplace.
