Ali Mrani Alaoui · Portfolio
CADENCE · AI LANGUAGE TUTOR · CASE STUDY

Cadence — Daily Spoken Language Practice with an AI Tutor

A daily speaking-practice app that puts learners face to face with an AI language tutor. Speak the phrase, type it, or hear a native speaker first, then get sub-second feedback on pronunciation, accent, and rhythm.

AI Language Tutor · Voice + Chat Input · Pronunciation Coaching

Input: Speak, type, or listen first
AI Evaluation: Sub-1s pronunciation feedback
Habit: Daily streaks, adaptive progress

Learners do a daily speaking session with an AI language tutor. Pick Spanish, French, or Portuguese, then keep the streak going.
The AI presents a target phrase or short dialogue to practice in the chosen language.
The learner responds by speaking the phrase out loud. A live waveform shows what the tutor is hearing.
If a learner cannot say it out loud right now, they can type the answer instead. Same evaluation flow.
Or they tap to hear a native speaker say the phrase first, with the audio waveform visualised.
Or they slow it down to pick out each syllable. The UI highlights syllable by syllable.
The AI evaluates pronunciation, accent, and rhythm in under a second. Feedback chips appear next to the learner.
It plays the learner's voice back side by side with the native speaker's, with stacked waveforms so the gap is easy to see.
Syllable by syllable, the learner sees what landed and what to revisit. The report is the coaching, not just a score.
Adaptive progress and daily streaks track the learning journey across weeks. Habit first, mastery second.

Project Overview

A language-learning publisher that had been shipping textbook-style courses wanted a daily speaking-practice product. Reading and listening were already covered by their existing catalogue. Speaking out loud, with quick, honest feedback, was the gap.

Cadence is built around one tight loop. The AI tutor presents a target phrase in the learner's chosen language. The learner answers it out loud (or types if they cannot speak right now). The AI scores pronunciation, accent, and rhythm in under a second and shows them what landed, syllable by syllable, against a native speaker recording.

The whole experience is shaped around a daily habit. Learners come back for the streak first; mastery follows. Adaptive progress tracks which sounds and rhythms each learner has internalised and which still need work, then surfaces the weak ones in tomorrow's session.

Currently shipping with Spanish, French, and Portuguese. The voice engine and rubric are language-agnostic, so adding the next language is a configuration change rather than a rebuild.
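
The "configuration change" claim above can be pictured as a small registry where each language is one entry pointing at its rubric, phrase bank, and reference audio. This is a minimal sketch, not the shipped code: `LanguageConfig`, `make_config`, `add_language`, the `content/` directory layout, and the Italian example are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class LanguageConfig:
    """Everything the (assumed) engine needs to know about one language."""
    code: str                 # short language code, e.g. "es"
    display_name: str         # label shown in the language picker
    rubric_path: str          # per-language pronunciation rubric
    phrase_bank_path: str     # CEFR-levelled target phrases
    reference_audio_dir: str  # native-speaker recordings


def make_config(code: str, display_name: str, root: str = "content") -> LanguageConfig:
    # Hypothetical convention: one folder per language under `root`.
    base = f"{root}/{code}"
    return LanguageConfig(
        code=code,
        display_name=display_name,
        rubric_path=f"{base}/rubric.json",
        phrase_bank_path=f"{base}/phrases.json",
        reference_audio_dir=f"{base}/audio",
    )


LANGUAGES: Dict[str, LanguageConfig] = {
    cfg.code: cfg
    for cfg in (
        make_config("es", "Spanish"),
        make_config("fr", "French"),
        make_config("pt", "Portuguese"),
    )
}


def add_language(registry: Dict[str, LanguageConfig], code: str, display_name: str) -> Dict[str, LanguageConfig]:
    """Return a new registry with one more language; the engine code is untouched."""
    if code in registry:
        raise ValueError(f"language {code!r} is already configured")
    updated = dict(registry)
    updated[code] = make_config(code, display_name)
    return updated


# Illustrative only: Italian is not named in the case study.
with_italian = add_language(LANGUAGES, "it", "Italian")
```

Under this assumption, the engineering work for a new language is one registry entry; the remaining effort is content (rubric, phrases, recordings).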

What's Inside

Three languages, one engine

Spanish, French, and Portuguese on day one. Adding the next language is configuration, not a rebuild.

Daily target phrase

A short dialogue or phrase tied to the learner's current level, surfaced fresh each day.

Speak, type, or listen first

Voice (streaming Whisper) is the primary input. Chat fallback for noisy contexts. Native audio playback for confidence.

Native speaker audio

Each phrase carries a clean recording of a native speaker (one per language) plus a slow-down view with syllable highlights.

Sub-second pronunciation eval

A custom waveform comparison engine plus an LLM critic that scores against a per-language pronunciation rubric. Returns in under a second.

Side-by-side voice playback

The learner's voice plays back next to the native speaker, with stacked waveforms so the gaps are easy to see and hear.

Per-syllable performance report

Each session closes with a syllable-by-syllable readout of what landed and what to revisit. The items to revisit resurface in tomorrow's session.

Adaptive progress and streaks

Daily streak counter on the home dashboard. The adaptive engine routes the next session toward the sounds and rhythms that still need work.

What the Client Provided

  • A target phrase bank per language (CEFR A1 to B2), authored by their in-house language team
  • Native-speaker reference recordings for every phrase (one per language, studio-grade)
  • Per-language pronunciation rubric (phonemes, accent patterns, common L1 interference)
  • Brand guidelines and a small set of marketing keywords (streak, daily, real conversation)
  • Access to 30 beta learners for the first iteration round

Design and Build Process

01

Pronunciation Rubric and Phrase Bank

Worked with the client's in-house language team to translate their per-language pronunciation rubric into a structured prompt the AI can score against. Loaded the target phrase bank (CEFR A1 to B2) into a per-language store with native-speaker reference recordings attached.
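
To make "a structured prompt the AI can score against" concrete, here is a minimal sketch of how rubric entries might be held as data and rendered into prompt text. The `RubricItem` fields, the `render_rubric` function, the 0 to 3 scoring scale, and the Spanish example entry are assumptions for illustration; the client's actual rubric and prompt wording are not shown in this case study.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RubricItem:
    """One scoring criterion (hypothetical shape)."""
    target: str                       # phoneme or accent pattern being scored
    guidance: str                     # what a good production sounds like
    l1_interference: List[str] = field(default_factory=list)  # typical learner errors


def render_rubric(language: str, items: List[RubricItem]) -> str:
    """Render rubric entries as numbered prompt text the critic can refer to."""
    lines = [
        f"Pronunciation rubric for {language}.",
        # The 0-3 scale is an assumption, not the client's scale.
        "Score each criterion from 0 to 3 and name the syllable it applies to.",
    ]
    for number, item in enumerate(items, start=1):
        lines.append(f"{number}. {item.target}: {item.guidance}")
        for error in item.l1_interference:
            lines.append(f"   - common L1 interference: {error}")
    return "\n".join(lines)


# Illustrative entry only, not taken from the client's rubric.
SPANISH_EXAMPLE = [
    RubricItem(
        target="trilled /r/",
        guidance="tongue-tip trill, as in 'perro'",
        l1_interference=["English speakers often substitute an approximant r"],
    )
]

prompt_text = render_rubric("Spanish", SPANISH_EXAMPLE)
```

Keeping the rubric as data rather than free text means the same renderer can serve every language, which fits the per-language store described above.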

02

Voice Capture and Streaming Eval

Streaming Whisper for the learner mic so transcription starts before the learner finishes the sentence. Audio is also captured raw for the waveform comparison pass that runs in parallel.
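
The two passes described above (transcription while the learner is still speaking, plus raw capture for the waveform comparison) can be sketched as one microphone stream fanned out to two consumers. This is a sketch using only Python's `asyncio`; `mic_chunks`, `fan_out`, `transcribe`, `capture_raw`, and `run_session` are hypothetical names, and `transcribe` is a placeholder that does not call Whisper.

```python
import asyncio
from typing import AsyncIterator, List, Optional, Tuple


async def mic_chunks(chunks: List[bytes]) -> AsyncIterator[bytes]:
    """Stand-in for the microphone: yields audio chunks as they 'arrive'."""
    for chunk in chunks:
        await asyncio.sleep(0)  # let the consumers run between chunks
        yield chunk


async def fan_out(source: AsyncIterator[bytes], queues: List[asyncio.Queue]) -> None:
    """Copy every chunk to every queue, then send None as an end marker."""
    async for chunk in source:
        for queue in queues:
            await queue.put(chunk)
    for queue in queues:
        await queue.put(None)


async def transcribe(queue: asyncio.Queue) -> str:
    """Placeholder for streaming transcription; a real system would call a
    speech-to-text service here instead of describing chunk sizes."""
    parts: List[str] = []
    while True:
        chunk: Optional[bytes] = await queue.get()
        if chunk is None:
            break
        parts.append(f"<{len(chunk)} bytes>")
    return " ".join(parts)


async def capture_raw(queue: asyncio.Queue) -> bytes:
    """Accumulate the untouched audio for the waveform comparison pass."""
    buffer = bytearray()
    while True:
        chunk: Optional[bytes] = await queue.get()
        if chunk is None:
            break
        buffer.extend(chunk)
    return bytes(buffer)


async def run_session(chunks: List[bytes]) -> Tuple[str, bytes]:
    text_queue: asyncio.Queue = asyncio.Queue()
    raw_queue: asyncio.Queue = asyncio.Queue()
    # All three tasks run concurrently; neither consumer waits for the
    # learner to finish before it starts working.
    _, transcript, raw = await asyncio.gather(
        fan_out(mic_chunks(chunks), [text_queue, raw_queue]),
        transcribe(text_queue),
        capture_raw(raw_queue),
    )
    return transcript, raw


transcript, raw_audio = asyncio.run(run_session([b"ab", b"cde"]))
```

The point of the fan-out is that the raw audio never passes through the transcription path, so the comparison engine sees exactly what the microphone produced.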

03

Waveform Comparison Engine

Custom comparison engine that aligns the learner's waveform against the native reference recording, then surfaces per-syllable timing, pitch, and rhythm deltas. The visual side-by-side view comes straight from this.
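
The case study does not say which alignment method the engine uses. One common choice for aligning two utterances spoken at different speeds is dynamic time warping (DTW), sketched here over a single per-frame feature such as pitch. The function names, the frame-level inputs, and the syllable spans are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def dtw(ref: List[float], learner: List[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Classic DTW over two non-empty sequences.

    Returns the total alignment cost and the warping path as
    (reference_frame, learner_frame) index pairs.
    """
    n, m = len(ref), len(learner)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            distance = abs(ref[i - 1] - learner[j - 1])
            cost[i][j] = distance + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Walk back from the end, always taking the cheapest predecessor.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


def syllable_timing_deltas(
    path: List[Tuple[int, int]],
    syllables: Dict[str, Tuple[int, int]],
) -> Dict[str, int]:
    """For each syllable (a [start, end) frame span in the reference), report
    how many more or fewer frames the learner spent on it."""
    deltas: Dict[str, int] = {}
    for label, (start, end) in syllables.items():
        learner_frames = {j for i, j in path if start <= i < end}
        deltas[label] = len(learner_frames) - (end - start)
    return deltas


# Toy pitch contours: the learner holds the first syllable one frame longer.
reference = [1.0, 1.0, 5.0, 5.0]
learner = [1.0, 1.0, 1.0, 5.0, 5.0]
total_cost, warp_path = dtw(reference, learner)
timing = syllable_timing_deltas(warp_path, {"ho": (0, 2), "la": (2, 4)})
```

A positive delta means the learner lingered on that syllable relative to the reference; pitch and rhythm deltas could be read off the same warping path.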

04

LLM Critic Against the Rubric

Claude scores the response against the rubric and writes the short feedback bullets the learner sees. The waveform deltas feed the prompt so the critic is grounded in the actual acoustic data, not an LLM guess.
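
Grounding the critic in acoustic data comes down to putting the measured deltas into the request next to the rubric. The sketch below only assembles the text; it makes no API call, and the function name, field names, delta units, and instruction wording are assumptions rather than the production prompt.

```python
import json
from typing import Dict, List


def build_critic_request(
    rubric_text: str,
    phrase: str,
    transcript: str,
    deltas: List[Dict[str, object]],
) -> Dict[str, str]:
    """Assemble system and user text for the critic (hypothetical wording)."""
    # Serialise the measurements verbatim so the model reads numbers, not a summary.
    evidence = json.dumps(deltas, ensure_ascii=False, indent=2)
    system = (
        "You are a pronunciation critic. Score only against the rubric below "
        "and base every comment on the measured deltas provided.\n\n" + rubric_text
    )
    user = (
        f"Target phrase: {phrase}\n"
        f"Learner transcript: {transcript}\n"
        "Measured per-syllable deltas (learner minus native reference):\n"
        f"{evidence}\n"
        "Return at most three short feedback bullets."
    )
    return {"system": system, "user": user}


# Illustrative values only.
request = build_critic_request(
    rubric_text="1. trilled /r/: tongue-tip trill",
    phrase="perro",
    transcript="pero",
    deltas=[{"syllable": "rro", "timing_ms": -40, "pitch_hz": 3.5}],
)
```

Because the deltas arrive as structured JSON, the critic can be asked to cite a specific syllable and measurement in each bullet, which keeps the feedback checkable.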

05

Daily Loop, Streak, Adaptive Routing

Designed the daily session shell, the streak counter, the syllable-detail report, and the adaptive logic that surfaces the next session's phrases based on what the learner missed today.
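
The streak counter and the adaptive routing can be sketched in a few lines each. This is an illustration under assumptions, not the shipped logic: the streak here counts consecutive practice days ending today, and the router simply ranks phrases by how many of today's missed sounds they contain. All names and the scoring rule are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, List, Set


def current_streak(practice_days: Set[date], today: date) -> int:
    """Count consecutive practice days ending on `today` (0 if today is missing)."""
    streak = 0
    day = today
    while day in practice_days:
        streak += 1
        day -= timedelta(days=1)
    return streak


def next_session(
    phrase_sounds: Dict[str, List[str]],
    missed_today: Dict[str, int],
    size: int = 3,
) -> List[str]:
    """Pick the phrases that exercise the most-missed sounds.

    phrase_sounds maps a phrase id to the sounds it contains;
    missed_today maps a sound to how often the learner missed it today.
    """
    def score(phrase_id: str) -> int:
        return sum(missed_today.get(sound, 0) for sound in phrase_sounds[phrase_id])

    # Highest score first; ties broken by phrase id so the result is stable.
    ranked = sorted(phrase_sounds, key=lambda phrase_id: (-score(phrase_id), phrase_id))
    return ranked[:size]


# Illustrative data only.
days = {date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 4), date(2024, 1, 5)}
streak = current_streak(days, today=date(2024, 1, 5))
tomorrow = next_session(
    {"p1": ["rr", "e"], "p2": ["a"], "p3": ["rr", "u"]},
    missed_today={"rr": 2, "u": 1},
    size=2,
)
```

A production router would likely also weigh CEFR level and how recently a phrase was seen; this sketch leaves those out to keep the core idea visible.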

06

Beta and Tuning

30 beta learners across the three languages. Tuned the rubric on real failure cases, cut three confusing UI patterns, then opened it to the wider catalogue audience.

Tools and Stack

AI Engine: Anthropic Claude for the rubric-based pronunciation critic and the short feedback bullets.
AI Voice: OpenAI Whisper for streaming learner speech. ElevenLabs for the native speaker reference voice on phrases the client could not pre-record.
Audio: Custom waveform comparison engine for per-syllable timing, pitch, and rhythm deltas against the native reference.
Custom code: Custom-built daily session shell, streak counter, syllable-detail report, and adaptive phrase router.
Design: Figma for the design system, dashboard, and per-language phrase card patterns.

Deliverables

  • Live daily-practice web app shipping with Spanish, French, and Portuguese
  • Pronunciation rubric translated into a structured prompt, per language
  • Waveform comparison engine with per-syllable timing, pitch, and rhythm deltas
  • Per-syllable performance report at session close
  • Daily streak shell, adaptive next-session router, and progress dashboard
  • Figma design system, dashboard, and per-language phrase card patterns
  • Runbook for adding the next language (configuration, not rebuild)

Results and Impact

Daily active use is the metric that matters here, and it held. The beta cohort averaged 4.2 sessions per week without prompting, compared with under 1 per week on the publisher's prior reading-and-listening product.

The syllable-detail report is the piece learners screenshot most. The team has since made it a hero element on the marketing site, because it captures the sub-second feedback story better than any description.

Adding the next language took 9 days end-to-end (rubric ingest + phrase bank + native recordings + light per-language UI tuning), against the 8-week original build. The configuration-not-rebuild path held up.