Question 1

What does an AI-powered language lab actually do that a traditional one doesn't?

Accepted Answer

A traditional language lab plays a model speaker through a headset and asks the student to repeat. Nobody hears the repetition; nobody scores it. Our lab listens. Whisper ASR transcribes every student utterance, and our scoring engine measures pronunciation phoneme by phoneme, along with fluency (speaking rate, pauses, fillers), intonation and stress, then gives the student targeted feedback in seconds. The teacher sees a class dashboard showing who is struggling with which sound, instead of 40 students wearing headsets in silence.
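To make the fluency measures concrete, here is a minimal sketch of how speaking rate, pauses and fillers could be derived from word-level timestamps such as an ASR system can emit. The `Word` type, the filler list and the 0.5-second pause threshold are illustrative assumptions, not the product's actual scoring rules.

```python
from dataclasses import dataclass

# Hypothetical filler list; a real deployment would tune this per language.
FILLERS = {"um", "uh", "er", "erm", "hmm"}


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the recording
    end: float


def fluency_metrics(words, pause_threshold=0.5):
    """Return speaking rate (words per minute), pause count and filler count."""
    if not words:
        return {"words_per_minute": 0.0, "pauses": 0, "fillers": 0}
    duration = words[-1].end - words[0].start
    # A pause is a silent gap between consecutive words at or above the threshold.
    pauses = sum(
        1 for prev, cur in zip(words, words[1:])
        if cur.start - prev.end >= pause_threshold
    )
    fillers = sum(1 for w in words if w.text.lower().strip(".,!?") in FILLERS)
    spoken = len(words) - fillers  # fillers do not count towards speaking rate
    wpm = spoken / duration * 60 if duration > 0 else 0.0
    return {"words_per_minute": round(wpm, 1), "pauses": pauses, "fillers": fillers}


sample = [
    Word("I", 0.0, 0.2), Word("um", 0.3, 0.6), Word("went", 1.4, 1.7),
    Word("to", 1.8, 1.9), Word("school", 2.0, 2.5),
]
metrics = fluency_metrics(sample)
print(metrics)
```

In this sample the 0.8-second gap after "um" is the only pause, and "um" is the only filler.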

Question 2

Which AI models power speech scoring?

Accepted Answer

Whisper handles ASR: it is OpenAI's open-source transcription model, and it runs on the lab's own GPU box, so no audio leaves your premises if you self-host. A phoneme-aligned scoring layer compares student speech against accent-specific reference models (American / British / Indian English) using forced alignment and CTC log-probabilities. Fluency analysis uses librosa for acoustic feature extraction. The conversation partner runs on OpenAI GPT-4 with a controlled prompt template that keeps the AI on-curriculum.
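As an illustration of how forced alignment and log-probabilities can combine into a per-phoneme score, the sketch below averages the log-probability of the expected phoneme over its aligned frames and maps the result to a 0-100 scale. The input shapes, the phoneme labels and the mapping are assumptions made for the example; the lab's real scoring formula is not described here.

```python
import math


def phoneme_scores(segments, frame_logprobs):
    """Score each aligned phoneme on a 0-100 scale.

    segments: list of (phoneme, start_frame, end_frame_exclusive) from forced alignment.
    frame_logprobs: list of dicts; frame_logprobs[t][phoneme] is the model's
        log-probability of that phoneme at frame t.
    """
    scores = []
    for phoneme, start, end in segments:
        frames = frame_logprobs[start:end]
        if not frames:
            continue  # skip empty alignments rather than divide by zero
        mean_lp = sum(f[phoneme] for f in frames) / len(frames)
        # exp(mean log-prob) is the geometric mean of the frame probabilities.
        scores.append((phoneme, round(100 * math.exp(mean_lp))))
    return scores


# Toy example for the word "three", with made-up frame probabilities.
frames = [
    {"TH": math.log(0.25)}, {"TH": math.log(1.0)},   # weak onset of /θ/
    {"R": math.log(0.9)}, {"R": math.log(0.9)},
    {"IY": math.log(0.8)},
]
alignment = [("TH", 0, 2), ("R", 2, 4), ("IY", 4, 5)]
result = phoneme_scores(alignment, frames)
print(result)
```

Using the geometric mean means one badly articulated frame pulls the phoneme score down sharply, which suits feedback aimed at a single weak sound.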

Question 3

Is it offline / on-premises?

Accepted Answer

Yes — the lab is designed for the typical school server room with patchy internet. The Python AI services container runs Whisper and the scoring engine on a single GPU box (NVIDIA T4 or RTX 4060 is enough for ~40 concurrent students). Postgres, Redis and the NestJS API run on the same server. Only the optional GPT-4 conversation partner and Google TTS need internet — both can be swapped for on-prem alternatives.

Question 4

How do students get their content?

Accepted Answer

Three ways. (1) Teacher-authored lessons through the content management module. (2) Adaptive learning paths: the engine generates the next exercise based on the student's last 10 attempts, targeting their weakest phonemes and grammar errors. (3) AI content generation: given a topic and grade level, the system produces a reading passage, comprehension questions, a vocabulary list and discussion prompts. Everything is reviewable before it goes to students.
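The adaptive step in (2) could look something like the following sketch, which averages per-phoneme scores over the most recent 10 attempts and returns the weakest ones as targets for the next exercise. The attempt format and the `weakest_phonemes` helper are hypothetical; only the 10-attempt window comes from the description above.

```python
from collections import defaultdict


def weakest_phonemes(attempts, window=10, top_n=2):
    """Return the top_n lowest-scoring phonemes over the last `window` attempts.

    attempts: chronological list of dicts mapping phoneme -> score (0-100).
    """
    scores_by_phoneme = defaultdict(list)
    for attempt in attempts[-window:]:  # only the most recent attempts count
        for phoneme, score in attempt.items():
            scores_by_phoneme[phoneme].append(score)
    averages = {p: sum(s) / len(s) for p, s in scores_by_phoneme.items()}
    # Sort by average score, breaking ties alphabetically for stable output.
    return sorted(averages, key=lambda p: (averages[p], p))[:top_n]


history = [
    {"TH": 40, "R": 80},
    {"TH": 50, "V": 60},
    {"R": 90, "V": 70},
]
targets = weakest_phonemes(history)
print(targets)
```

Here /TH/ averages 45 and /V/ averages 65, so those two would drive the next exercise while /R/ (85) is left alone.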

Question 5

Does it handle Indian English accents?

Accepted Answer

Yes — accent is a configurable parameter on every scoring call (American, British, Indian English). Indian English models are trained on Indian speakers so a student saying "theatre" the Indian way isn't marked down for not sounding American. Teachers and institutions choose the target accent per course or per student.
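A small sketch of how the per-course and per-student accent choice might be resolved before a scoring call. The `Accent` enum and `resolve_accent` function are hypothetical names, and the precedence rule (a student-level choice overrides the course default) is an assumption rather than documented behaviour.

```python
from enum import Enum
from typing import Optional


class Accent(Enum):
    AMERICAN = "american"
    BRITISH = "british"
    INDIAN = "indian"


def resolve_accent(course_accent: Accent,
                   student_accent: Optional[Accent] = None) -> Accent:
    """Assumed precedence: a per-student setting overrides the course default."""
    return student_accent if student_accent is not None else course_accent


course_default = resolve_accent(Accent.BRITISH)
student_override = resolve_accent(Accent.BRITISH, Accent.INDIAN)
print(course_default.value, student_override.value)
```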

Question 6

What about writing and grammar?

Accepted Answer

The grammar service detects errors in student writing (agreement, tense, prepositions, articles) and explains each one in plain English with a suggested fix. Reading comprehension and listening modules generate passages and audio with comprehension questions. Together with quizzes, assignments and certificates, this rounds out the four-skills coverage (speaking, listening, reading, writing).

Question 7

Is it multi-tenant for a chain of schools or a university group?

Accepted Answer

Yes — multi-tenancy is core. The hierarchy is Tenant (institution) → Department → Batch → Student. Each tenant has isolated data, branding, payment plans and reporting. A university group running 12 colleges sees consolidated metrics across the group, while each college has its own admin, teachers and student rolls. RBAC scopes every action — a teacher in College A never sees students in College B.
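The scoping rule can be sketched as a simple check over the Tenant → Department → Batch hierarchy. The `Scope` type and `can_view` function below are illustrative assumptions, not the product's actual RBAC code: a user's scope names a tenant and, optionally, a narrower department or batch, and anything outside that subtree is invisible.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scope:
    tenant: str
    department: Optional[str] = None  # None means "all departments in the tenant"
    batch: Optional[str] = None       # None means "all batches in the department"


def can_view(user: Scope, student: Scope) -> bool:
    """True only if the student sits inside the user's subtree of the hierarchy."""
    if user.tenant != student.tenant:
        return False  # tenants are fully isolated from each other
    if user.department is not None and user.department != student.department:
        return False
    if user.batch is not None and user.batch != student.batch:
        return False
    return True


teacher_a = Scope("college-a", "english", "batch-1")
admin_a = Scope("college-a")
student_a = Scope("college-a", "english", "batch-1")
student_b = Scope("college-b", "english", "batch-1")

print(can_view(teacher_a, student_a), can_view(teacher_a, student_b))
```

A tenant-level admin (`admin_a`) sees every student in College A, while the teacher's narrower scope limits them to a single batch.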

Question 8

What hardware do students need?

Accepted Answer

Any laptop or desktop with a headset and microphone. The browser can be Chrome, Edge or Firefox, with no plugin to install. Headsets in the ₹400–₹800 range work fine; gaming headsets or wired USB-C headsets give the best ASR accuracy. The lab streams audio to the API over WebSocket, so 256 kbps per student is plenty of bandwidth, well within what a school LAN provides.
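For a sense of scale, 256 kbps happens to be exactly the bitrate of uncompressed 16 kHz, 16-bit mono PCM, and 16 kHz is the sample rate Whisper works with. Whether the lab actually streams audio in that format is an assumption; the arithmetic below only shows what that assumption implies for a 40-student class.

```python
SAMPLE_RATE_HZ = 16_000  # assumption: audio is sent at Whisper's 16 kHz input rate
BITS_PER_SAMPLE = 16     # assumption: 16-bit PCM
CHANNELS = 1             # assumption: mono microphone audio
STUDENTS = 40            # class size used throughout this FAQ

per_student_kbps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS / 1000
class_mbps = per_student_kbps * STUDENTS / 1000

print(f"{per_student_kbps:.0f} kbps per student, {class_mbps:.2f} Mbps per class")
```

Under these assumptions a full class needs roughly 10 Mbps of upstream capacity to the server, a small fraction of a 100 Mbps or gigabit LAN.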

Question 9

How does the teacher run a class?

Accepted Answer

The whiteboard and live session module turns the class into a virtual lab. The teacher pushes an exercise to the whole batch, watches a live dashboard of who's on which question, gets red flags when a student keeps failing the same phoneme, and broadcasts the model audio when needed. Sessions are recorded so a student who missed one can replay it; assessments at the end are auto-graded and feed into the gradebook.

Question 10

Are there assessments and certificates?

Accepted Answer

Yes: formative quizzes after each lesson, summative assessments at the end of each module, and certificates at course completion with QR-verifiable serial numbers. Assessment results feed into the analytics module, broken down by student, batch, skill and phoneme. Parents can see their child's progress through a dedicated parent portal.

Question 11

How long does deployment take?

Accepted Answer

Single school (100–500 students, one server): 2–3 weeks including hardware install, content seeding and teacher training. Mid-size university (5,000–25,000 students, multi-department): 6–10 weeks. Country-wide deployment for a chain of institutions: scoped per phase. We bundle on-site setup, GPU hardware procurement and three-day teacher training in the standard SOW.

The language lab that actually listens.

The lab that scores every utterance — not the one that just plays MP3s.

Phoneme-level pronunciation

Adaptive, not linear

On-prem ready

Four layers. 24 modules. One data model.

Every /θ/, every pause, every filler — measured.

AI conversation partner

AI content generation

Forty students. One dashboard. Real-time.

Tenant → Department → Batch. Modelled from day one.

Single school

University group

Government skill mission

Plays nicely with the ed-tech ecosystem.

Three-tier monorepo. One GPU box per school.

Every shape of language learner.

Six phases. Single school in 2–3 weeks.

Discovery

Setup

Content seed

Train

Go-live

Hypercare

Real SLAs, calibrated to a 40-student classroom.

Two academic stories — placeholders till the real ones land.

Common questions

30-minute demo with your accent, your grade level.
