Yes. MIT licensed open source. Groq and Gemini both have free tiers that are enough for daily use. Ollama is completely free with no limits. There is no pro plan, no premium tier, no credit card needed.

Can I use it without internet?

Yes. Install Ollama, download a Whisper model (small is 480MB), and Rota works 100% offline. No API keys, no accounts, no internet. Your voice data never leaves your machine.

Is Rota AI a free alternative to Wispr Flow?

Yes, Rota AI is a free, open source alternative to Wispr Flow. It offers similar features like AI-powered text cleanup, context-aware dictation, and works across multiple apps. Unlike Wispr Flow, Rota AI is completely free with no subscriptions, offers offline mode, has zero telemetry, and the full source code is available under MIT license.

Rota AI has zero telemetry. The desktop app never phones home. Your voice data goes only to the transcription service you choose (Groq, Gemini, or local Ollama). API keys are encrypted at rest using your OS keychain.

Does Rota AI work with VS Code?

Yes. Rota detects VS Code and preserves camelCase, snake_case, and code syntax. Your code comments come out clean without extra punctuation. It also works in terminals and IDEs.

What are the system requirements?

Windows 10/11, macOS 13+, or Linux (Ubuntu 20.04+, Fedora 36+, Arch). 4GB RAM minimum, 8GB recommended. For local GPU transcription: NVIDIA GPU with 4GB+ VRAM. CPU-only works on any modern quad-core.

How to Customize Rota AI Settings for Your Workflow

TL;DR: Rota AI has a lot of settings. You only need to touch a few. Hotkey, VAD threshold, transcription backend. That is 90% of the optimization. Everything else is fine at defaults. Here is exactly what I changed and why.

I spent way too long poking through every single setting when I first installed Rota AI. Honestly, most of it does not need your attention. But the stuff that does? Huge quality of life upgrades.

Let me save you the trial and error.

The Settings That Actually Matter

Hotkey Change (Do This First)

The default hotkey is F9. I changed mine to F7.

Why? F9 is too far up the function row. Pinky stretch. Gets old fast when you are dictating all day. F7 sits closer to home row, and my hand barely moves.

This is the first thing I change in any transcription app. Goes for Wispr Flow, Superwhisper, all of them.

How to change it: Settings → Hotkey → click the field → press your new combo. Done. Takes five seconds honestly.

VAD Threshold (The Sweet Spot)

VAD is voice activity detection. Basically, how loud you need to be before Rota AI starts actually transcribing.

The slider goes from 0.0 to 1.0. Default is usually somewhere around 0.5.

My go-to range is 0.3 to 0.7 depending on the environment:

VAD Level	What It Does	When to Use It
0.3	Picks up quiet speech and ambient noise	Quiet room, close mic
0.5	Balanced	Default, safest choice
0.7	Ignores everything below a firm voice	Noisy office, coffee shops

If getting random garbage from room noise, bump it up. If Rota AI is cutting off the start of your words, bump it down.

I run at 0.3 at home with a decent mic. 0.6 at the library where people are weirdly loud for a library.

your mileage may vary on exact values. But the range is the range.

Transcription Backend

This one is more impactful than the hotkey or VAD combined, tbh.

Rota AI lets you pick your transcription backend. Not all backends are equal. Here is the breakdown I actually use:

Groq. Fastest option by a mile. Runs on Groq edge hardware. Accuracy is solid for most English dictation. The speed alone makes this my default.
Gemini. More accurate than Groq on edge cases. Technical terms, accented speech. Slightly slower but barely noticeable. Switched to Gemini exclusively lately for the accuracy boost.
Ollama. Fully local. Zero external API calls if needed. Accuracy depends entirely on which model you load. whisper-large-v3 is great but needs serious RAM.

My current setup: Gemini for daily driving. Groq as backup when API limits hit. Ollama for flights and offline work.

How to switch: Settings → Transcription → Backend → pick from dropdown → restart the app.

Fr, this single choice matters more than any other setting. The backend is where the magic happens.

Settings to Leave Alone

Some settings look important but causes problems when you touch them. I learned this the hard way.

Advanced Audio Settings

Leave sample rate, buffer size, and channel config at defaults. Rota AI detects your hardware right. Messing with these creates latency or dropouts that are hard to debug.

Network Timeout

Default works. Changing it higher just means longer waits before error messages. Changing it lower means failed transcriptions on slow connections. Where is the win.

Logging Level

Unless you are filing a bug report and a dev asks you for logs, keep it at default. Debug logging eats disk space fast and slows things down.

Not saying these settings do not exist for a reason. Just saying normal users should not need to touch them.

Personal Dictionary

Whisper models are good but they still butcher certain words. Technical terms especially.

I added about a dozen words to my personal dictionary:

Course-specific terminology from my classes
Proper nouns Whisper kept "correcting" to common words
Abbreviations I say a lot

How it works: Settings → Personal Dictionary → Add new entry → type the word as you want it → save.

Game changer for anyone in a specialized field. Medicine, law, engineering… whatever. If domain specific words come up a lot, add them.

Mine took about two minutes to set up and fixed probably 80% of my recurring corrections.

Snippets

Rota AI has this trigger prefix system. Some text that, when spoken as a snippet, expands out to a longer block of text.

Think of it like text replacement but voice activated.

I have about 15 snippets set up. Email templates, code boilerplate, common phrases. The trigger prefix ensures they only fire when I want them to (so transit dictation does not accidentally expand something random).

My setup:

Trigger prefix: ;;
Snippets: email signoffs, common responses, code blocks, addresses

How to set up: Settings → Snippets → Add New → set trigger text → set expansion → done.

If finding yourself dictating the same blocks of text over and over, snippets are the answer. Takes an hour to set up a good library and saves hundreds of hours.

Appearance

Dark mode, light mode, or follow system. Pick what looks good to you.

I run dark mode because I stare at screens all day and my eyes would file a complaint otherwise.

Settings → Appearance → Theme → Dark. That is about the whole journey there.

Not going to pretend this is a deep customization. But hey, it is nice to have and the battery savings on OLED screens are real.

How to Reset to Defaults

Actively afraid you messed something up? Reset it.

Settings → scroll to bottom → Reset All Settings → confirm.

Back to factory state. Good as new. Do not be afraid to experiment if you can always go back.

I actually reset once when audio settings broke in a way I could not trace. Five seconds later everything was fine.

FAQ

Can I customize the hotkey to any key combo? Most combos work. Some system reserved shortcuts (like Ctrl+Alt+Del on Windows) won't be captured. Stick to single function keys or Alt/Ctrl combos and you're golden.

Will changing the transcription backend delete my settings? No. Backend is separate. Hotkey, VCD, snippets, dictionary, all stay intact when switching backends.

Is local transcription enough with Ollama? Depends on your hardware. A GPU with 8GB VRAM runs whisper-large-v3 decently. Without a GPU, expect slower transcription and some patience. tbh for a premium experience, cloud backends are worth it.

How many snippets can I add? Never hit a limit personally. Dozens work fine. I have 15 but people in the community run 50 plus. No issues reported.

Do I need to restart the app after changing settings? Backend changes require a restart. Everything else applies on live. The app will tell you if a restart is needed.

What is the best VAD threshold for dictating in noise? 0.6 to 0.7. Higher threshold ignores more ambient sound. You need to project your voice a bit more but the transcription is cleaner. I use 0.65 at coffee shops and it catches the start of every sentence.

Bottom line. Pick a comfortable hotkey, find your VAD sweet spot for your environments, pick a backend, add your recurring words to the dictionary, set up snippets.

That is the whole customize rota ai settings journey. Everything else is polish.

Now go make Rota AI yours.

How to Customize Rota AI Settings for Your Workflow

How to Customize Rota AI Settings for Your Workflow

The Settings That Actually Matter

Hotkey Change (Do This First)

VAD Threshold (The Sweet Spot)

Transcription Backend

Settings to Leave Alone

Advanced Audio Settings

Network Timeout

Logging Level

Personal Dictionary

Snippets

Appearance

How to Reset to Defaults

FAQ

Related articles

Otter.ai vs Rota AI: Free & Private Voice Dictation Alternative

Rota AI & OpenAI Whisper: The Complete Dictation Apps Guide

I Tried Voice Coding with Cursor Using Rota AI. Here Is What Happened.