Multilingual Voice Input

Speak in 99 languages

Automatic detection for 99 voice-input languages on every platform.

Works on macOS, Windows, Linux, iOS, and Android

The short answer: AICHE transcribes voice input in 99 languages on every platform. Press your hotkey (⌃+⌥+R on Mac, Ctrl+Alt+R on Windows/Linux), speak in any supported language, and AICHE detects the language and inserts clean text at your cursor in 2-3 seconds. No language selection, no settings to configure.

Traditional dictation software forces you to select a language before recording. If you work in multiple languages daily, this constant switching becomes a workflow killer.

How It Works

  1. Press your recording hotkey (⌃+⌥+R on Mac, Ctrl+Alt+R on Windows/Linux) and speak in any supported language.
  2. AICHE detects the language during transcription. You never choose one manually.
  3. AICHE transcribes in 2-3 seconds, regardless of language. 15 minutes of audio comes back in roughly 3 seconds.
  4. For multilingual recordings (switching languages mid-sentence), the engine handles the transition automatically without configuration.
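
To put the timing in step 3 in perspective, the stated figures can be turned into a speed-up over real time. This is only a back-of-the-envelope sketch: the function name `realtime_factor` is made up for illustration, and the inputs are the approximate numbers quoted above, not measured benchmarks.

```python
def realtime_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Seconds of audio transcribed per second of processing time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# 15 minutes of audio returned in roughly 3 seconds (figures from step 3).
print(realtime_factor(15 * 60, 3))  # 300.0
```

In other words, the quoted numbers correspond to roughly 300 times faster than real time for a 15-minute recording.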

Heads-up: transcription accuracy varies by language. Major languages (English, Spanish, French, German, Chinese, Japanese) achieve 95%+ accuracy, while less common languages benefit from Custom Vocabulary entries for names, brands, and technical terms.

The pro-tip: combine auto-detection with the Translate to English enhancement to speak in your native language and get English output instantly. Perfect for international teams where English is the common language but not everyone's first language.

How Automatic Detection Works

Traditional dictation software requires you to choose a language before recording. If you regularly switch between English and Spanish, or English and Japanese, you're changing settings before every recording. With AICHE, the AI identifies the language during transcription - you just speak.

Detection happens at the sentence level, not the word level. This means multilingual sentences work naturally. If you say a sentence in English with a Spanish phrase embedded ("We need to fix the problema with the database"), the AI handles the code-switch and transcribes both languages correctly.
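
A toy illustration of why sentence-level labeling tolerates an embedded foreign word. This is not AICHE's engine, which works on audio with a trained model; the stopword lists and the names `label_sentence` and `label_by_sentence` are hypothetical, chosen only to show the idea of voting across a whole sentence instead of deciding word by word.

```python
import re
from collections import Counter

# Tiny, hypothetical stopword lists. A real detector uses a trained model.
STOPWORDS = {
    "en": {"we", "need", "to", "the", "with", "fix", "and", "is"},
    "es": {"necesitamos", "el", "la", "con", "de", "y", "es", "los"},
}


def label_sentence(sentence: str) -> str:
    """Label one sentence by majority vote over its known words."""
    words = re.findall(r"[^\W\d_]+", sentence.lower())
    votes = Counter()
    for word in words:
        for lang, vocab in STOPWORDS.items():
            if word in vocab:
                votes[lang] += 1
    if not votes:
        return "unknown"
    return votes.most_common(1)[0][0]


def label_by_sentence(text: str) -> list[tuple[str, str]]:
    """Split text into sentences and label each one independently."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [(label_sentence(s), s) for s in sentences]


text = (
    "We need to fix the problema with the database. "
    "Necesitamos arreglar el problema con la base de datos."
)
print([lang for lang, _ in label_by_sentence(text)])  # ['en', 'es']
```

The single word "problema" does not flip the first sentence to Spanish, because the label comes from the sentence as a whole.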

The detection is also fast enough that it doesn't add processing time. Whether you speak English, Mandarin, or Arabic, transcription completes in the same 2-3 second window.

Voice Input vs UI Localization

These are two different things, and the distinction matters.

Voice input: 99 languages, every platform. The transcription engine accepts speech in all 99 languages on macOS, Windows, Linux, iPhone, iPad, Apple Watch, Android, the Chrome extension, the Obsidian plugin, and the REST API. A Russian-speaking developer on Linux gets Russian transcription. A Japanese writer on Windows gets Japanese transcription. Voice input does not care which OS you are on or which language your menus are in.

UI localization: 28 languages on mobile. The app's menus, settings, and buttons are translated on iPhone, iPad, Apple Watch, and Android - including right-to-left layouts for Arabic and Hebrew, with 450+ strings localized per language. If you want a Spanish menu, you'll find it on mobile.

The practical version: pick AICHE based on whether you need to dictate in your language. The interface language is a separate question that only matters on mobile.

Accuracy by Language Tier

Not all languages are transcribed with equal accuracy. The difference comes from training data - languages with more digital text and audio data produce better models.

Tier 1: 95%+ Accuracy

English, Spanish, French, German, Italian, Portuguese, Chinese (Mandarin), Japanese, Korean, Dutch, Russian, Polish, Hindi, Arabic. These languages have extensive training data and produce highly reliable transcriptions even in noisy environments.

Tier 2: 90-95% Accuracy

Turkish, Vietnamese, Thai, Czech, Romanian, Hungarian, Swedish, Danish, Norwegian, Finnish, Greek, Indonesian, Malay, Ukrainian, Hebrew. Accurate for most content; technical jargon and uncommon proper nouns benefit from adding entries to your Custom Vocabulary.

Tier 3: 85-90% Accuracy

Less commonly digitized languages where training data is more limited. Results are usable but benefit significantly from Custom Vocabulary entries with sounds-like phonetic hints. Names, technical terms, and fast speech are the most common error sources.

For any language, speaking clearly at a natural pace (not artificially slow) and using a quality microphone produce the best results.
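
The tiers above can be read as a simple lookup. The sketch below copies the Tier 1 and Tier 2 lists from this page and treats every other language as Tier 3; that fallback is an assumption of the example (it does not check whether a language is supported at all), and the name `accuracy_tier` is hypothetical.

```python
TIER_1 = {
    "English", "Spanish", "French", "German", "Italian", "Portuguese",
    "Chinese (Mandarin)", "Japanese", "Korean", "Dutch", "Russian",
    "Polish", "Hindi", "Arabic",
}
TIER_2 = {
    "Turkish", "Vietnamese", "Thai", "Czech", "Romanian", "Hungarian",
    "Swedish", "Danish", "Norwegian", "Finnish", "Greek", "Indonesian",
    "Malay", "Ukrainian", "Hebrew",
}


def accuracy_tier(language: str) -> tuple[int, str]:
    """Return (tier, stated accuracy range) for a language name."""
    if language in TIER_1:
        return 1, "95%+"
    if language in TIER_2:
        return 2, "90-95%"
    # Assumption: anything not listed in Tier 1 or Tier 2 falls into Tier 3.
    return 3, "85-90%"


for name in ("Hindi", "Hebrew", "Welsh"):
    tier, accuracy = accuracy_tier(name)
    print(f"{name}: Tier {tier} ({accuracy})")
```

A Tier 2 or Tier 3 result is the cue to add Custom Vocabulary entries for names and jargon before a long dictation session.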

Multilingual Workflows

Code-Switching in Conversation

Bilingual speakers naturally switch between languages mid-conversation. AICHE handles this without configuration - start a sentence in English, switch to Hindi for a technical explanation, finish in English. The transcription captures both languages accurately.

Team Communication Across Languages

A distributed team can use AICHE with Translate to English enabled so everyone speaks their native language but all output is in English. The German developer speaks German, the Brazilian developer speaks Portuguese, and both produce English text for the shared Slack channel.

Documenting in Multiple Languages

If you maintain documentation in multiple languages, AICHE handles each recording in whatever language you speak without requiring settings changes between recordings.

Supported Languages

AICHE transcribes voice input across 99 languages:

A-C: Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Basque, Belarusian, Bengali, Bosnian, Bulgarian, Catalan, Cebuano, Chinese (Simplified/Traditional), Corsican, Croatian, Czech

D-G: Danish, Dhivehi, Dutch, English, Esperanto, Estonian, Filipino (Tagalog), Finnish, French, Frisian, Galician, Georgian, German, Greek, Gujarati

H-K: Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hmong, Hungarian, Icelandic, Igbo, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Krio, Kurdish, Kyrgyz

L-O: Lao, Latin, Latvian, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Meiteilon (Manipuri), Mongolian, Myanmar (Burmese), Nepali, Norwegian, Nyanja (Chichewa), Odia (Oriya)

P-S: Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Samoan, Scots Gaelic, Serbian, Sesotho, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish

T-Z: Tajik, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Uyghur, Uzbek, Vietnamese, Welsh, Xhosa, Yiddish, Yoruba, Zulu

What You Get

  • 99 transcription languages. Same engine on desktop, mobile, Chrome, Obsidian, and the REST API.
  • Automatic detection. No language picker, no settings switching between recordings. Code-switching mid-sentence is handled.
  • Auto-translation to English. Speak in your native language, ship clean English. AI cleanup polishes the output.
  • AI cleanup in every language. Filler words removed, punctuation and paragraph breaks added.
  • Custom vocabulary. 50 entries per user for names, brands, and jargon. Especially useful for Tier 2 and Tier 3 languages.
  • Mobile UI in 28 languages. Menus and settings on iPhone, iPad, Apple Watch, and Android, with 450+ strings per language and right-to-left layouts for Arabic and Hebrew.
  • Zero-retention audio. Audio is streamed to Groq, processed, and discarded within 1 second after processing. Cloud sync is opt-in and AES-256-GCM encrypted with an Argon2id-derived key. Modern TLS in transit, with TLS 1.3 where the OS supports it.
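
One way to picture the Custom vocabulary item in the list above is as a capped set of terms with optional sounds-like hints. This is a minimal sketch, not AICHE's data model: the class names, the `sounds_like` field, and the choice to treat terms case-insensitively are all assumptions; only the 50-entry limit and the idea of phonetic hints come from this page.

```python
from dataclasses import dataclass, field

MAX_ENTRIES = 50  # per-user limit stated on this page


@dataclass(frozen=True)
class VocabularyEntry:
    term: str
    sounds_like: str = ""  # optional phonetic hint (hypothetical field name)


@dataclass
class CustomVocabulary:
    entries: dict[str, VocabularyEntry] = field(default_factory=dict)

    def add(self, term: str, sounds_like: str = "") -> None:
        """Add or update a term, refusing new terms once the cap is reached."""
        key = term.casefold()  # assumption: terms are case-insensitive
        if key not in self.entries and len(self.entries) >= MAX_ENTRIES:
            raise ValueError(f"limit of {MAX_ENTRIES} entries reached")
        self.entries[key] = VocabularyEntry(term, sounds_like)


vocab = CustomVocabulary()
vocab.add("Kubernetes", sounds_like="koo-ber-net-eez")
vocab.add("AICHE")
print(len(vocab.entries))  # 2
```

Updating an existing term does not count against the cap in this sketch, so hints can be refined without freeing a slot first.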

Personal is $4.99/mo billed monthly or $3.99/mo billed annually. Pro is $9.99/mo billed monthly or $8.33/mo billed annually. Plans start at $3.99/mo with a 7-day free trial, no credit card required. See pricing.
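
For a quick comparison of the two billing options, the per-month prices above can be converted into yearly savings. The sketch assumes the annual plan is charged as 12 times the quoted per-month rate; the actual invoice may round differently, so treat the output as an estimate. The names `PLANS` and `yearly_savings` are made up for this example.

```python
from decimal import Decimal

# (monthly price, per-month price on annual billing), as quoted above
PLANS = {
    "Personal": (Decimal("4.99"), Decimal("3.99")),
    "Pro": (Decimal("9.99"), Decimal("8.33")),
}


def yearly_savings(monthly: Decimal, annual_per_month: Decimal) -> Decimal:
    """Difference over 12 months between monthly and annual billing."""
    return (monthly - annual_per_month) * 12


for name, (monthly, annual_per_month) in PLANS.items():
    print(f"{name}: about ${yearly_savings(monthly, annual_per_month)} saved per year")
```

Under that assumption, annual billing saves about $12.00 per year on Personal and about $19.92 per year on Pro.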

Result: you run a meeting with team members speaking Spanish, French, and English, and record three separate notes in three languages without touching settings. AICHE transcribes all three correctly in under 10 seconds total.

Do this now: press your hotkey and speak a sentence in any language you know. Watch AICHE detect and transcribe it automatically without configuration.

Tags

productivity, workflow, collaboration