Cadence — Daily Spoken Language Practice with an AI Tutor
A daily speaking-practice app that puts learners face to face with an AI language tutor. Speak the phrase, type it, or hear a native speaker say it first, then get sub-second feedback on pronunciation, accent, and rhythm.
Project Overview
A language-learning publisher that had been shipping textbook-style courses wanted a daily speaking-practice product. Reading and listening were already solved by their existing catalogue. Speaking out loud, with quick honest feedback, was the gap.
Cadence is built around one tight loop. The AI tutor presents a target phrase in the learner's chosen language. The learner answers it out loud (or types if they cannot speak right now). The AI scores pronunciation, accent, and rhythm in under a second and shows them what landed, syllable by syllable, against a native speaker recording.
The whole experience is daily-habit shaped. Learners come back for the streak first, and the mastery follows. Adaptive progress tracks which sounds and rhythms each learner has internalised and which still need work, then surfaces those in tomorrow's session.
Currently shipping with Spanish, French, and Portuguese. The voice engine and rubric are language-agnostic, so adding the next language is a configuration change rather than a rebuild.
What's Inside
Three languages, one engine
Spanish, French, and Portuguese on day one. Adding the next language is configuration, not a rebuild.
Daily target phrase
A short dialogue or phrase tied to the learner's current level, surfaced fresh each day.
Speak, type, or listen first
Voice (streaming Whisper) is the primary input. Chat fallback for noisy contexts. Native audio playback for confidence.
Native speaker audio
Each phrase carries a clean recording by a native speaker (one speaker per language) plus a slow-down view with syllable highlights.
Sub-second pronunciation eval
A custom waveform comparison engine plus an LLM critic against a per-language pronunciation rubric. Returns in under a second.
Side-by-side voice playback
The learner's voice plays back next to the native speaker, with stacked waveforms so the gaps are easy to see and hear.
Per-syllable performance report
Each session closes with a syllable-by-syllable readout: what landed, what to revisit, surfaced in tomorrow's session.
Adaptive progress and streaks
Daily streak counter on the home dashboard. Adaptive engine routes the next session toward the sounds and rhythms still on the edge.
What the Client Provided
- A target phrase bank per language (CEFR A1 to B2), authored by their in-house language team
- Native-speaker reference recordings for every phrase (one speaker per language, studio-grade)
- Per-language pronunciation rubric (phonemes, accent patterns, common L1 interference)
- Brand guidelines and a small set of marketing keywords (streak, daily, real conversation)
- Access to 30 beta learners for the first iteration round
Design and Build Process
Pronunciation Rubric and Phrase Bank
Worked with the client's in-house language team to translate their per-language pronunciation rubric into a structured prompt the AI can score against. Loaded the target phrase bank (CEFR A1 to B2) into a per-language store with native-speaker reference recordings attached.
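The per-language store can be pictured as one small configuration record per language. The sketch below is illustrative only: the `LanguagePack` and `Phrase` types, the field names, the file paths, and the sample phrases are all assumptions, not the project's actual schema.

```python
from dataclasses import dataclass, field

CEFR_LEVELS = ("A1", "A2", "B1", "B2")  # range covered by the client's phrase bank


@dataclass(frozen=True)
class Phrase:
    phrase_id: str
    text: str
    cefr_level: str
    reference_audio: str  # hypothetical path to the native-speaker recording

    def __post_init__(self):
        # Reject levels outside the authored A1 to B2 range.
        if self.cefr_level not in CEFR_LEVELS:
            raise ValueError(f"unsupported CEFR level: {self.cefr_level}")


@dataclass
class LanguagePack:
    code: str          # e.g. "es", "fr", "pt"
    rubric_path: str   # hypothetical location of the structured rubric prompt
    phrases: list = field(default_factory=list)

    def at_level(self, level):
        """Return the phrases authored for one CEFR level."""
        return [p for p in self.phrases if p.cefr_level == level]


REGISTRY = {}


def register(pack):
    """Adding a language means registering one more pack; engine code is untouched."""
    REGISTRY[pack.code] = pack


register(LanguagePack(
    code="es",
    rubric_path="rubrics/es.md",
    phrases=[
        Phrase("es-001", "Buenos días", "A1", "audio/es/es-001.wav"),
        Phrase("es-002", "¿Dónde está la estación?", "A2", "audio/es/es-002.wav"),
    ],
))
```

Keeping the rubric, phrases, and recordings behind one record per language is what would let a new language arrive as data rather than as new engine code.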
Voice Capture and Streaming Eval
Streaming Whisper for the learner mic so transcription starts before the learner finishes the sentence. Audio is also captured raw for the waveform comparison pass that runs in parallel.
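A minimal sketch of that parallel split, assuming the mic delivers audio as byte chunks. It makes no real Whisper call: `transcribe_stream` is a hypothetical stand-in for the streaming transcriber, and every function name here is invented for illustration.

```python
import asyncio


async def fan_out(chunks, queues):
    """Push every mic chunk to each consumer queue, then signal end of stream."""
    for chunk in chunks:
        for q in queues:
            await q.put(chunk)
    for q in queues:
        await q.put(None)  # sentinel: the learner stopped speaking


async def transcribe_stream(q):
    """Stand-in for the streaming transcriber (hypothetical, no real Whisper call)."""
    partials = []
    while (chunk := await q.get()) is not None:
        # A real client would emit partial text here; we only record chunk sizes.
        partials.append(f"<{len(chunk)} bytes>")
    return partials


async def capture_raw(q):
    """Buffer the untouched audio for the waveform comparison pass."""
    buf = bytearray()
    while (chunk := await q.get()) is not None:
        buf.extend(chunk)
    return bytes(buf)


async def run_session(chunks):
    """Run the producer and both consumers concurrently on the same chunk stream."""
    q_text, q_raw = asyncio.Queue(), asyncio.Queue()
    _, partials, raw = await asyncio.gather(
        fan_out(chunks, [q_text, q_raw]),
        transcribe_stream(q_text),
        capture_raw(q_raw),
    )
    return partials, raw


partials, raw = asyncio.run(run_session([b"\x00\x01", b"\x02\x03\x04"]))
```

Both consumers see each chunk as it arrives, so transcription can begin mid-sentence while the raw buffer fills for the comparison pass.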
Waveform Comparison Engine
Custom comparison engine that aligns the learner's waveform against the native reference recording, then surfaces per-syllable timing, pitch, and rhythm deltas. The visual side-by-side view comes straight from this.
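The internals of the custom engine are not described here, so the sketch below swaps in dynamic time warping (DTW), a standard alignment technique, to show how per-syllable timing deltas could fall out of an alignment. The one-dimensional feature frames, the syllable boundaries, and both function names are assumptions made for illustration.

```python
def dtw_path(ref, learner):
    """Align two 1-D feature sequences; return (total cost, warping path)."""
    n, m = len(ref), len(learner)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(ref[i - 1] - learner[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end to recover which frames were matched.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


def syllable_durations(path, ref_syllables):
    """Count the learner frames mapped onto each reference syllable.

    ref_syllables holds (start, end) reference frame indices, end exclusive.
    """
    out = []
    for start, end in ref_syllables:
        learner_frames = {j for i, j in path if start <= i < end}
        out.append(len(learner_frames))
    return out


# Toy frames: the learner holds the first syllable one frame longer.
ref = [1, 1, 5, 5]
learner = [1, 1, 1, 5, 5]
total, path = dtw_path(ref, learner)
durations = syllable_durations(path, [(0, 2), (2, 4)])
timing_deltas = [d - (end - start) for d, (start, end) in zip(durations, [(0, 2), (2, 4)])]
```

Here `durations` comes out as `[3, 2]` against reference lengths of two frames each, so `timing_deltas` is `[1, 0]`: the first syllable ran one frame long and the second matched. Pitch and rhythm deltas would need richer per-frame features than this toy contour.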
LLM Critic Against the Rubric
Claude scores the response against the rubric and writes the short feedback bullets the learner sees. The waveform deltas feed the prompt so the critic is grounded in the actual acoustic data, not an LLM guess.
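A rough sketch of how measured deltas might be folded into the critic prompt. It only builds the prompt text and makes no model call; the function name, the rubric keys, the delta field names, and the wording of the instructions are all hypothetical.

```python
import json


def build_critic_prompt(language, phrase, transcript, rubric, deltas):
    """Assemble the text an LLM critic would score (all field names are illustrative)."""
    evidence = json.dumps(deltas, ensure_ascii=False, indent=2)
    criteria = "\n".join(f"- {name}: {desc}" for name, desc in rubric.items())
    return (
        f"You are scoring spoken {language} pronunciation.\n"
        f"Target phrase: {phrase}\n"
        f"Learner transcript: {transcript}\n\n"
        f"Rubric:\n{criteria}\n\n"
        f"Measured per-syllable deltas (learner minus native reference):\n{evidence}\n\n"
        "Base every comment on the measured deltas above. "
        "Reply with at most three short feedback bullets."
    )


prompt = build_critic_prompt(
    language="Spanish",
    phrase="Buenos días",
    transcript="buenos dias",
    rubric={"rhythm": "syllable timing close to the reference"},
    deltas=[{"syllable": "bue", "timing_ms": 120, "pitch_hz": -4}],
)
```

Passing the deltas as structured data keeps the critic's comments tied to what was measured rather than to what the model imagines the learner sounded like.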
Daily Loop, Streak, Adaptive Routing
Designed the daily session shell, the streak counter, the syllable-detail report, and the adaptive logic that surfaces the next session's phrases based on what the learner missed today.
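The streak and routing logic could look something like the sketch below. The rule that a streak survives until the end of the following day, and the scoring of phrases by a simple sum of miss counts, are assumptions made for illustration, as are the data shapes and function names.

```python
from datetime import date, timedelta


def current_streak(practice_days, today):
    """Count consecutive practice days ending today (or yesterday, if today is not done yet)."""
    days = set(practice_days)
    cursor = today if today in days else today - timedelta(days=1)
    streak = 0
    while cursor in days:
        streak += 1
        cursor -= timedelta(days=1)
    return streak


def next_session(phrases, miss_counts, size=3):
    """Rank phrases by how many of today's missed sounds they exercise."""
    def weight(phrase):
        return sum(miss_counts.get(sound, 0) for sound in phrase["sounds"])

    ranked = sorted(phrases, key=weight, reverse=True)
    return [p["id"] for p in ranked[:size]]


practice = [date(2024, 5, 1), date(2024, 5, 2), date(2024, 5, 3)]
streak = current_streak(practice, date(2024, 5, 4))

phrase_bank = [
    {"id": "a", "sounds": ["rr", "ñ"]},
    {"id": "b", "sounds": ["j"]},
    {"id": "c", "sounds": ["rr"]},
]
tomorrow = next_session(phrase_bank, {"rr": 2, "ñ": 1}, size=2)
```

With these inputs the streak is 3 (the learner has not practised yet on 4 May, so yesterday's run still counts) and `tomorrow` is `["a", "c"]`, the two phrases that exercise the most-missed sounds.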
Beta and Tuning
30 beta learners across the three languages. Tuned the rubric on real failure cases, cut three confusing UI patterns, then opened it to the wider catalogue audience.
Tools and Stack
Deliverables
- Live daily-practice web app shipping with Spanish, French, and Portuguese
- Pronunciation rubric translated into a structured prompt, per language
- Waveform comparison engine with per-syllable timing, pitch, and rhythm deltas
- Per-syllable performance report at session close
- Daily streak shell, adaptive next-session router, and progress dashboard
- Figma design system, dashboard, and per-language phrase card patterns
- Runbook for adding the next language (configuration, not rebuild)
Results and Impact
Daily active use is the metric that matters here, and it held. The beta cohort averaged 4.2 sessions per week without prompting, compared to fewer than one per week on the publisher's prior reading-and-listening product.
The syllable-detail report is the piece learners screenshot most. The team has since made it a hero element on the marketing site, because it captures the sub-second feedback story better than any description.
Adding the next language took 9 days end-to-end (rubric ingest + phrase bank + native recordings + light per-language UI tuning), against the 8-week original build. The configuration-not-rebuild path held up.