Short version: every voice typing app is some combination of features its team decided to ship and design choices its team decided not to make. The features that look most magical, the ones the marketing pages lead with, usually require the app to know more about you than you might assume from the demo video. This article walks through four of those features, what each one mechanically requires from your device, and the choices we made about which of them to build. The goal isn't to tell you which app to use. The goal is to give you the questions worth asking of whatever app you end up with.
Voice typing apps are easy to install and hard to audit
You install a voice typing app, you press a button, you talk, text appears. The interaction is so simple that the part underneath, the part where the app decides what to read from your computer to do its job, mostly stays invisible.
Most modern voice typing apps in 2026 advertise some combination of these features:
- They adapt to the app you're working in. Different cleanup for Slack than for Gmail. Tuned for your IDE versus your email.
- They paste intelligently. Aware of the context you're pasting into, adjusting tone or formatting as needed.
- They start listening when you start talking. Hands-free, no button required.
- They learn over time. They get smarter as you use them. They remember your preferences.
Each of those features is real, useful, and shipping in voice apps you can install today. Each of them also has a mechanical requirement that isn't always on the marketing page. Below is what each one needs from your computer to work, and what we chose to build (or not build) on our side.
How we think about privacy at AICHE
Before the worked examples, the frame we apply to product decisions:
Reduce the leak surface. Every feature an app ships has some implication for what it has to know about you. We don't decide what's right or wrong in the abstract; we look at each feature and ask whether the value it gives the user is worth the data it touches. When the answer is "the feature is great but it requires us to listen to or look at things we don't want to handle," we either find a different way to ship the feature or we don't ship it.
Broad permissions create a broad surface to worry about. We try to keep ours narrow. That's a design discipline, not a slogan; the practical effect is that some features other apps advertise aren't on our roadmap because they'd push our permissions beyond what we want to ask for.
Stay honest about what we do. Audio leaves your machine for our named cloud transcription provider (Groq), gets processed in seconds, and gets discarded. Transcripts live on your machine by default; cloud sync of your notes is opt-in and end-to-end encrypted with a key you set, so we cannot read your synced content on our servers. We don't claim local-only because we aren't local-only. We don't claim zero data movement because that would also be false. We claim a specific, scoped data path and document it.
That stance is what the next four sections apply.
Adapting to the app you're in
Several voice typing apps advertise that they work better in specific applications: cleaner output in Slack than in Gmail, code-aware behavior inside an IDE, tone adjustment depending on whether you're writing a calendar invite or a Jira ticket. Wispr Flow's product page, for example, describes how "Flow learns your words, phrases, and tone, and keeps them consistent across every app and device." Aqua Voice's marketing leads with "Your screen is its dictionary" and describes the app as understanding "what's on your screen, from code syntax to everyday text."
To make a voice typing tool adapt to the app you're in, the tool has to know which app is active, and usually has to read some of what's on the screen to tune itself further. That means asking your operating system for ongoing awareness of which app is in front and often some access to what's on screen, for as long as the app is running.
We chose not to build per-app contextual adaptation, precisely because of that ongoing access. Our recognition pipeline (filler-word removal, paragraph normalization, AI cleanup, custom vocabulary enforcement) runs the same way regardless of which app has focus. We don't read which app you're in. We don't read your window titles. We don't ask the OS for ongoing window-monitoring permissions. The cleanup is good enough across contexts to justify giving up the per-app tuning, and the data surface stays narrow.
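As a rough illustration of what "runs the same way regardless of which app has focus" means in code, here is a minimal sketch of a cleanup step whose only input is the transcript text. It is not AICHE's actual pipeline: the function name `clean_transcript`, the filler list, and the normalization rules are all assumptions made for this example.

```python
import re

# Hypothetical filler list, for illustration only.
FILLERS = ("um", "uh", "erm")

# Match a filler as a whole word, plus an optional trailing comma and spaces.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(FILLERS) + r")\b,?\s*", re.IGNORECASE
)


def clean_transcript(raw: str) -> str:
    """Clean a transcript using nothing but the text itself.

    Note what is absent from the signature: no app name, no window
    title, no screen contents. The output cannot vary by focused app
    because the function never receives that information.
    """
    text = _FILLER_RE.sub("", raw)
    # Collapse runs of whitespace left behind by the removals.
    text = re.sub(r"\s+", " ", text).strip()
    # Capitalize the first letter as a stand-in for fuller normalization.
    return text[:1].upper() + text[1:]


if __name__ == "__main__":
    print(clean_transcript("um, so the uh deploy is  done"))
```

A per-app design would need an extra parameter here (the active app, or text read from the screen), and that parameter is exactly the data the app would have to collect from the operating system.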
The honest tradeoff: AICHE's output is consistent across apps; it isn't tuned to know that you're in Slack vs Gmail vs a terminal. Some users prefer the per-app tuning; for those users, the apps that build it are real options. The question is whether the per-app polish is worth handing over which-app-am-I-in awareness as a permanent capability of the app.
Pasting smart
A related feature: "smart paste" or context-aware pasting. The app detects where you're pasting and adjusts the output, sometimes pulling context from the clipboard or the document around the paste location to inform what gets typed.
To do that mechanically, an app needs to read your clipboard whenever it's running, and often needs to read the surrounding content of whatever app the paste lands in.
We chose not to monitor your clipboard. AICHE inserts cleaned text at your cursor through your operating system's standard insertion path (Smart Insert on desktop). We don't read what's already in your clipboard. We don't read what's around your paste destination. We don't keep a rolling memory of recent clipboard content. The text we insert is the text you just dictated, in the form our pipeline produced.
The honest tradeoff: AICHE doesn't adapt its output based on what you're pasting near. If you copy a Jira ticket title and then dictate, AICHE won't read the Jira ticket and tune its output. Some users find that adaptive behavior valuable. Some users would rather their voice typing app not have ongoing clipboard access.
Start talking, we're listening (wake-on-voice and hands-free)
Several voice apps offer hands-free or wake-on-voice modes: you don't press a button to start recording; the app starts capturing when you start talking. OtterPilot is a meeting-focused version of this idea, automatically joining and capturing calls. Wispr Flow and a few others have moved toward more push-button-free interactions in their recent releases (Wispr Flow hands-free docs).
To make hands-free work, an app needs the microphone open whenever the app is running, and needs to be continuously analyzing the audio stream to decide when speech starts. The mic isn't necessarily recording-to-disk in that design, but it needs to be listening to know when to start.
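To make the "continuously analyzing" requirement concrete, here is a minimal sketch of speech-onset detection using a simple energy threshold. It is a deliberate simplification: shipping apps typically use trained voice-activity models, and the names `rms` and `speech_onsets`, the threshold, and the frame sizes are all assumptions for this example. The point it illustrates holds for any detector, though: every frame has to be inspected, including the silent ones.

```python
import math
from typing import Iterable, Iterator, List


def rms(frame: List[float]) -> float:
    """Root-mean-square energy of one audio frame (samples in [-1, 1])."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def speech_onsets(
    frames: Iterable[List[float]],
    threshold: float = 0.1,   # hypothetical energy cutoff
    min_frames: int = 3,      # consecutive loud frames required
) -> Iterator[int]:
    """Yield the index of the first frame of each detected speech run.

    The loop reads every frame the microphone produces. There is no way
    to notice that speech has started without listening before it starts.
    """
    run = 0
    for i, frame in enumerate(frames):
        if rms(frame) >= threshold:
            run += 1
            if run == min_frames:
                yield i - min_frames + 1
        else:
            run = 0


if __name__ == "__main__":
    silence = [0.0] * 160
    loud = [0.5] * 160
    stream = [silence] * 5 + [loud] * 4 + [silence] * 2
    print(list(speech_onsets(stream)))
```

A push-button design has no equivalent of this loop running in the background: audio capture begins only after the key press, so nothing is analyzed before you ask.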
AICHE's default mode is push-button: you press ⌃+⌥+R on Mac or Ctrl+Alt+R on Windows / Linux to start recording, you press the same combo to stop. On mobile, you tap the mic or use the home-screen widget. The mic is not open until you ask for it to be open.
Voice Code (Pro), our continuous-listening mode, exists as a separate, opt-in feature designed for piping voice into AI coding agents (Claude Code, Codex, Cursor, Antigravity). It is off by default. When you turn it on, a visible floating bar shows on screen at all times while it runs; the bar has a mute control. While Voice Code is active, the mic is open continuously and whatever you say is typed into your active cursor. That is the feature. It's used by a minority of our users today and is documented exactly as it works rather than dressed up. We mention it here because pretending we don't have a continuous-listening mode at all would be inaccurate; we'd rather describe it accurately and let you decide if you want it on.
The honest tradeoff: AICHE's default workflow has one more button press than "just start talking" apps. We think the visibility is worth it. Some users prefer hands-free start; for those users, the apps that build it are real options. The question is whether you want the mic available whenever the app is running, or only when you've asked for it.
Learning over time
A lot of voice apps advertise getting smarter with use. They learn your style. They remember corrections you made. They build a personal model of how you write.
The mechanical version of this varies by app. Some products learn by updating a shared dictionary applied to all users. Some learn by recording each correction you make and adding that pair (what you said, what you fixed it to) to a per-user model that stays on their servers. Some pass everything you write through a personalization layer that builds a profile of your phrasing and vocabulary preferences.
We took a narrower approach. AICHE's only per-user "learning" surface is your custom vocabulary: a list of 50 entries you maintain yourself (names, brands, acronyms, code identifiers, internal jargon) that our recognition pipeline applies to every transcript. You add entries deliberately. The vocabulary syncs across your devices via the same end-to-end encrypted sync as your transcripts, keyed by your passphrase. We do not silently record your corrections. We do not build a behavioral model of how you write. We do not train on your audio. The pipeline that polished your transcript yesterday is the same pipeline that polishes it today.
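A user-maintained vocabulary of this kind can be sketched as a small, explicit data structure. The code below is an assumption-heavy illustration, not AICHE's implementation: the class name `CustomVocabulary`, the case-insensitive spelling replacement, and the error behavior at the cap are all invented for the example, and the real pipeline may apply entries differently (for instance, during recognition rather than after it). Only the 50-entry limit comes from the text above.

```python
import re
from typing import Dict

MAX_ENTRIES = 50  # the cap described in the text above


class CustomVocabulary:
    """A list of preferred spellings that only changes when the user edits it."""

    def __init__(self) -> None:
        # Lowercased form -> the spelling the user asked for.
        self._entries: Dict[str, str] = {}

    def add(self, term: str) -> None:
        key = term.lower()
        if key not in self._entries and len(self._entries) >= MAX_ENTRIES:
            raise ValueError(f"vocabulary is limited to {MAX_ENTRIES} entries")
        self._entries[key] = term

    def apply(self, transcript: str) -> str:
        """Rewrite case-insensitive matches to the user's preferred spelling."""
        if not self._entries:
            return transcript
        # Longest entries first so multi-word terms win over their substrings.
        keys = sorted(self._entries, key=len, reverse=True)
        pattern = re.compile(
            r"(?<!\w)(?:" + "|".join(re.escape(k) for k in keys) + r")(?!\w)",
            re.IGNORECASE,
        )
        return pattern.sub(
            lambda m: self._entries[m.group(0).lower()], transcript
        )


if __name__ == "__main__":
    vocab = CustomVocabulary()
    vocab.add("AICHE")
    vocab.add("Groq")
    print(vocab.apply("aiche sends audio to groq"))
```

Nothing in this structure observes what you type or correct afterward. Its state changes only through `add`, which is the difference between a list you maintain and a model that learns from you.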
The honest tradeoff: AICHE doesn't get more personalized to you the longer you use it. Some products do, and some users want that. If you'd rather an app build an evolving picture of you over time, the products that advertise this are real options. The question is whether the personalization is worth your corrections being kept somewhere as a record of what you typed and how you fixed it.
Permissions, briefly
We're not going to walk you through screenshots of every app's permission dialogs. That's not the right comparison shape, and most users don't audit it that way anyway.
The principle: broad permissions create a broad surface to worry about. We try to keep ours narrow. AICHE doesn't request ongoing access to your active window, your clipboard, your microphone when not recording, your keystrokes, or your accessibility tree. We ask for the microphone when you press the hotkey (or when you turn on Voice Code and accept that the mic stays open until you mute or close it). We ask for the file-system access we need to save your transcripts and offline queue. That's about it.
When you install any voice typing app, the dialog your operating system shows you is worth reading. If an app asks for permissions broader than what its core feature seems to need, that's the moment to ask why.
Where AICHE doesn't win on privacy (and what does)
Honesty section. There are categories of voice tool that beat us on specific privacy dimensions, and we'd rather name them than pretend we win every axis.
Fully local processing. AICHE streams audio to a named cloud provider (Groq), which processes it in seconds. The audio leaves your machine, even though it doesn't stay anywhere outside the brief processing window. If "audio strictly local, no cloud round-trip ever" is your bar, the tools that meet it are the local-Whisper category: MacWhisper on Mac, VoiceInk on Mac, Speech Note on Linux (Flatpak), or a hand-rolled whisper.cpp setup on any platform. They give up the polish pipeline that turns Whisper output into finished text in 3 seconds, but they hold the line on local-only audio.
One-time purchase, no cloud account. Some users want to buy software once and have no relationship with a vendor's cloud at all. MacWhisper offers a one-time license on Gumroad (€59, ~$69 USD) for Mac users who want this. AICHE is subscription-only and there is no "buy outright" option.
If either of those is your hard constraint, we're saying so on our own page, and pointing you at the alternatives that meet it. That's the same discipline we apply to feature comparisons: we publish the categories where we don't win.
Questions worth asking about any voice typing app
Not accusations. Questions. Useful before you install anything.
- Does it listen when I haven't pressed record? If yes, is the listening mode opt-in and visible, or default-on and invisible?
- Does it read my clipboard? When? Only after I paste, or whenever it's running?
- Does it know which app I'm in? If yes, is that information used only for tuning, or is it sent anywhere?
- Does it read my window titles? When? In what contexts?
- Does it record corrections I make and keep them anywhere? Locally on my machine, or on their servers? For how long?
- What audio does it keep, and for how long? How is "deleted" defined: removed from the server, or moved to cold storage?
- What happens if I cancel my subscription? Are my transcripts portable? Are they deleted?
- What third-party SDKs does the app ship with? For analytics, advertising, and crash reporting, what data goes where?
A voice typing app's privacy posture is a combination of features it built, features it chose not to build, and features it built but documented honestly. None of that is visible from a marketing page. The questions above push you toward what the app actually does, not what it markets.
What AICHE's answers look like
For the same eight questions, applied to us:
- Listening when you haven't pressed record: Default mode is push-button only. Voice Code (Pro) is opt-in, off by default, shows a visible floating bar while running, and has a mute control.
- Clipboard: Not read.
- Which app you're in: Not read. Cleanup pipeline runs the same way regardless of focused app.
- Window titles: Not read.
- Corrections: Custom vocabulary is the only per-user learning surface; you add entries deliberately. We do not silently record corrections or build a behavioral model.
- Audio: Streamed to Groq for processing in seconds, then discarded. No persistent audio storage on our servers.
- Cancellation: Transcripts live on your device by default; you keep them. If you turned on cloud sync of your notes (opt-in, end-to-end encrypted with your key), the synced copies are deleted from our servers when you cancel.
- Third-party SDKs: Desktop apps ship with no analytics SDK. Mobile apps ship Firebase for ad attribution only, named in our privacy policy. No fingerprinting, no session replay, no behavioral telemetry beyond the named Firebase scope.
If any of those answers don't match what you want from a voice typing app, the alternatives in this category exist and we're not pretending otherwise.
Try AICHE
7-day free trial, no credit card. Personal $3.99/mo on annual ($4.99/mo monthly). Pro $8.33/mo on annual ($9.99/mo monthly). Available on Mac, Windows, Linux, iPhone, iPad, Apple Watch, Android, Chrome, Obsidian, and via REST API.