VoCal — Photo nutrition with voice context

Photo and voice become one meal.

The video shows the spread of ingredients. VoCal combines what the camera sees with what you say is hidden, mixed in, or customized.

Carrots

Cucumbers

Broccoli

Greens

Protein

+ Rice Underneath

+ Lentils Underneath

VoCal result

Chicken rice bowl

91% confidence

742 kcal

38g protein

+22g vs photo

Photo-only logs miss hidden layers. Voice context turns the same image into a meal you can actually trust.

Why VoCal

4-5x faster than the dominant alternative.

Manual logging takes 2-3 minutes. CalAI-style photo logging takes about 90 seconds. VoCal captures the meal in roughly 20 seconds, and voice adds the hidden details photos miss.

20s

VoCal log time

4.5x

4-5x speedup over CalAI-style photo logging (about 90s vs 20s)

Average meal logging time

Seconds per complete log

Lower is better

Manual entry: 2-3 min

CalAI: ~90s

VoCal: ~20s

7.5x faster than manual

Compared with a 150-second manual midpoint.
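
The multipliers above follow from simple division of the quoted log times. A minimal sketch of that arithmetic, using only the figures stated on this page; the constant and function names are illustrative, not part of VoCal:

```python
# Log times in seconds, taken from the figures quoted above.
MANUAL_RANGE_S = (120, 180)   # "2-3 minutes" of manual entry
PHOTO_ONLY_S = 90             # CalAI-style photo logging
VOCAL_S = 20                  # VoCal photo + voice


def speedup(baseline_s: float, new_s: float) -> float:
    """Return how many times shorter new_s is than baseline_s."""
    return baseline_s / new_s


manual_midpoint_s = sum(MANUAL_RANGE_S) / 2       # midpoint of 2-3 minutes
vs_photo = speedup(PHOTO_ONLY_S, VOCAL_S)         # 90 / 20
vs_manual = speedup(manual_midpoint_s, VOCAL_S)   # 150 / 20

print(f"vs photo-only logging: {vs_photo:.1f}x")
print(f"vs manual midpoint:    {vs_manual:.1f}x")
```

With these inputs the script prints 4.5x against photo-only logging and 7.5x against the manual midpoint.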

More accurate context

Voice captures buried, mixed, off-frame, and customized ingredients.

Every feature earns its place in the log.

Any food, any description.

Talk like a normal person. VoCal turns messy voice notes into structured food, portions, modifiers, and confidence signals.

voice context parsed

hidden protein, portion clue, higher confidence

Voice note

"Half a Cava bowl, extra chicken, dressing on the side."

Base: Rice bowl, 1/2

Protein: Chicken +60g

Modifier: Dressing separate
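
One way to picture the structured result of that voice note is as a small record with one entry per parsed role. This is a hypothetical sketch only: the class and field names (`ParsedItem`, `ParsedVoiceNote`, `role`, `note`) are assumptions made for illustration, and it hard-codes the example values instead of showing how VoCal actually parses speech.

```python
from dataclasses import dataclass, field


@dataclass
class ParsedItem:
    """One structured piece of a voice note (hypothetical shape)."""
    role: str        # "base", "protein", or "modifier"
    food: str
    note: str = ""   # portion or customization, kept as free text


@dataclass
class ParsedVoiceNote:
    """A transcript plus the items extracted from it (hypothetical shape)."""
    transcript: str
    items: list[ParsedItem] = field(default_factory=list)

    def by_role(self, role: str) -> list[ParsedItem]:
        """Return every item tagged with the given role."""
        return [item for item in self.items if item.role == role]


# Values copied from the example card; nothing here is computed from audio.
note = ParsedVoiceNote(
    transcript="Half a Cava bowl, extra chicken, dressing on the side.",
    items=[
        ParsedItem("base", "Rice bowl", "1/2"),
        ParsedItem("protein", "Chicken", "+60g"),
        ParsedItem("modifier", "Dressing", "separate"),
    ],
)

for item in note.items:
    print(f"{item.role.capitalize()}: {item.food} ({item.note})")
```

Keeping portions and modifiers as separate items, instead of one free-text string, is what would let a later step adjust a single field (for example the protein amount) without re-reading the whole note.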

Restaurant lookup, matched automatically.

Name the place or dish and VoCal can search restaurant foods, compare likely menu items, and anchor the estimate to a real meal.

chipotle chicken bowl, no cheese

Best match

Chicken burrito bowl

640 kcal, 44g protein, menu fit

Follow-up questions when it matters.

If the camera and voice still leave ambiguity, VoCal asks one or two targeted questions before locking the log.

VoCal asks

Was the chicken grilled or fried, and did you use all of the sauce?

You answer

Grilled chicken, about half the sauce, and rice underneath.

sauce adjusted, prep method set, 91% confidence

A progress tab that shows movement.

Weight, protein consistency, calories, and check-ins sit together so the user sees trend, not just today's log.

Weight trend

181.4 -> 178.9

-2.5 lb

Body composition, treated as a trend.

Progress photos can estimate body-fat direction over time, while the interface keeps it framed as guidance, not a diagnosis.

scan

Estimate range

17-19%

trend confidence rising

Wearables feed the next step.

Apple Watch, Whoop, Apple Health, and other fitness trackers help VoCal adjust coaching to recovery and output.

Apple Watch: Log by voice

Whoop: Recovery synced

Health data: Steps + sleep

Three steps

From plate to proof.

The photo gives VoCal a starting point. Your voice fills in what the lens cannot know, then the app turns both into a cleaner nutrition record.

01
Snap the plate
VoCal identifies what is visible on the surface: rice, greens, sauce, drink, package, or restaurant dish.
02
Add the missing context
Say what is underneath, mixed in, customized, or from a restaurant so the estimate is not limited to the photo.
03
Answer any smart follow-up
If something is uncertain, VoCal asks about the exact thing that changes the estimate: sauce amount, prep method, hidden base, or portion size.

Pricing

Honest, and quietly priced.

Taste

Free

Photo + voice logging, daily macros, 7-day history.

Start free

VoCal Pro

Most loved

$4.99/month

Unlimited history, nutrition coach, progress imaging, wrist app, full export.

Go Pro

Clinic

Custom

Practitioner dashboards, client sharing, HIPAA terms.

Talk to us

Questions

The things people ask.

A photo can only see the surface. VoCal uses voice to add hidden ingredients, restaurant details, cooking methods, and rough portions, then reconciles that context with the image.

Yes. Capture works fully offline and syncs when you reconnect. The Apple Watch flow is voice-first, so you can add context mid-walk without your phone.

Audio is transcribed on-device and discarded immediately; only the structured meal is stored. Nothing is sold or used for advertising.

Stop guessing from a flat photo. Start adding context.

Free on iOS and Android. Snap the meal, say what the camera missed, and keep your progress moving.

Download VoCal

Watch the 40s demo

Scan for beta TestFlight invite