Website: https://anubhav-gupta-software.github.io/voiceagents/
VoiceAgents is a dual-agent accessibility project:
- chromium-voice-agent/ for voice-first web navigation in Chromium
- lmmsagent/ for voice/text control of LMMS through an in-app AgentControl plugin
The practical goal is simple: reduce the operational burden of complex software by turning spoken intent into safe, actionable steps.
Both projects use a layered command strategy:
- deterministic commands for speed and reliability on known actions
- fuzzy normalization for common speech-to-text errors and phrasing variation
- LLM fallback only when needed, with guardrails, so unrelated speech does not trigger destructive actions
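The layered strategy above can be sketched as a simple dispatcher. This is an illustrative sketch, not the project's actual API: the command names, the action strings, and the fuzzy-match threshold are assumptions.

```python
# Sketch of a layered command dispatcher: deterministic lookup first,
# fuzzy normalization second, gated fallback last. All names here are
# illustrative assumptions.
import difflib

# Layer 1: deterministic commands — exact matches run immediately.
COMMANDS = {
    "open new tab": "tab.open",
    "scroll down": "page.scroll_down",
    "close tab": "tab.close",
}

FUZZY_THRESHOLD = 0.8  # assumed cutoff for accepting a near match

def dispatch(utterance: str):
    text = utterance.lower().strip()

    # Layer 1: deterministic lookup — fast and predictable.
    if text in COMMANDS:
        return COMMANDS[text]

    # Layer 2: fuzzy normalization for transcription noise
    # ("scroll don" -> "scroll down").
    close = difflib.get_close_matches(text, COMMANDS, n=1, cutoff=FUZZY_THRESHOLD)
    if close:
        return COMMANDS[close[0]]

    # Layer 3: an LLM fallback would go here, gated so that unrelated or
    # low-confidence speech is refused rather than executed.
    return None  # refuse: no safe interpretation found

print(dispatch("open new tab"))   # -> "tab.open"
print(dispatch("scroll don"))     # -> "page.scroll_down"
print(dispatch("order a pizza"))  # -> None (refused)
```

The refusal path is the safety gate: anything that fails both deterministic and fuzzy matching is treated as out of scope unless a higher-confidence interpretation exists.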
This design is intentional:
- deterministic paths keep common commands fast and predictable
- fallback intelligence improves real-world usability when transcription is imperfect
- safety gates preserve trust by refusing unrelated or low-confidence commands
These agents are built to support users who may face barriers with mouse-heavy, menu-dense software, including:
- people with motor/physical disabilities who benefit from reduced fine-pointer demands
- people with learning disabilities or cognitive load sensitivity who benefit from intent-level commands
- beginners who know what they want to do but not where to click
The objective is not to replace UI knowledge; it is to lower entry cost, reduce fatigue, and make advanced tools more reachable.
Web workflows are full of repetitive mechanics: tab switching, scrolling, opening tools, confirming dialogs, and navigating deep page layouts.
chromium-voice-agent/ targets these mechanics directly and allows users to operate the browser by intent rather than pointer precision.
For users with disabilities, this is especially valuable because it:
- reduces repetitive cursor travel and click strain
- shortens multi-step UI paths into one spoken action
- keeps interaction in a single modality when context switching is costly
Digital Audio Workstations are powerful but highly complex. LMMS has many windows, tracks, editors, and plugin workflows that can overwhelm first-time users.
lmmsagent/ focuses on that exact problem:
- opening and focusing the right tool windows
- creating tracks and patterns with direct commands
- importing files and controlling common slicer workflows
- normalizing noisy spoken commands into executable LMMS actions
For beginners, this turns DAW navigation from “discover hidden UI pathways” into “state musical intent and iterate.”
For accessibility users, it reduces the interaction complexity of dense production interfaces.
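Normalizing noisy spoken commands into executable actions can be sketched with pattern matching. The phrase patterns and action names below are hypothetical; the project's real command map lives in lmmsagent/docs/.

```python
# Hypothetical sketch of normalizing a spoken phrase into an LMMS-style
# action. Patterns and action names are illustrative assumptions.
import re

# Spoken phrasing varies; map several phrasings to one canonical action.
PATTERNS = [
    # "open the piano roll", and misrecognitions like "piano role"
    (re.compile(r"(?:open|show)\s+(?:the\s+)?piano\s*roll?"),
     ("open_window", "piano_roll")),
    # "create a bass track", "add drum track", ...
    (re.compile(r"(?:add|create|make)\s+(?:a\s+)?(\w+)\s+track"),
     ("create_track", None)),
]

def normalize(utterance: str):
    text = utterance.lower().strip()
    for pattern, (action, fixed_arg) in PATTERNS:
        m = pattern.search(text)
        if m:
            arg = fixed_arg if fixed_arg is not None else m.group(1)
            return (action, arg)
    return None  # unrecognized: safer to refuse than to guess

print(normalize("please open the piano role"))  # -> ("open_window", "piano_roll")
print(normalize("make a bass track"))           # -> ("create_track", "bass")
```

The key idea is that normalization absorbs speech-to-text variation before anything touches the application, so downstream execution only ever sees canonical actions.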
Browser automation prototype for voice-driven web control.
Key files:
- chromium-voice-agent/manifest.json
- chromium-voice-agent/background.js
- chromium-voice-agent/speech.js
- chromium-voice-agent/popup.html
- chromium-voice-agent/popup.js
LMMS automation project for controlling LMMS through a local plugin boundary.
Key directories:
- lmmsagent/integrations/lmms/AgentControl/ - LMMS plugin source
- lmmsagent/integrations/lmms/patches/ - minimal LMMS host patch set
- lmmsagent/lmms-text-agent/ - local text command client
- lmmsagent/lmms-voice-agent/ - local voice bridge
- lmmsagent/shared/ - shared LMMS socket client and command normalization
- lmmsagent/scripts/ - install and build scripts for an external LMMS checkout
- lmmsagent/docs/ - architecture, command map, and demo notes
- lmmsagent/demo/ - smoke-test commands
- use chromium-voice-agent/ for browser-side voice accessibility and automation experiments
- use lmmsagent/ for accessible LMMS control, beginner onboarding, and workflow acceleration