
Not a chat app. An agent kernel.

Free 7-day trial · No account required · Perpetual license

The architecture

Every interaction
dispatches an agent.

Other apps wrap a chat box around an API. QARK gives you an agent runtime — composable, controllable, yours.

Bring your own keys

Every provider. Every model.
Zero lock-in.

From frontier models to local inference — switch mid-conversation, compare outputs, control the spend.

Anthropic (Claude) · OpenAI (GPT / o-series) · Google (Gemini) · Groq (ultra-fast inference) · DeepSeek (reasoning) · Ollama (local models) · LM Studio (local models) · OpenRouter (any model) · Together AI (open source) · xAI (Grok) · Perplexity (search) · Cohere (embeddings) · Voyage AI (embeddings) · Jina AI (reranking)

Everything you need

100+ reasons to never open
another AI app.

Every feature. Day one. No tiers, no waitlist.
Explore all features in detail →

Runtime

Agents

Composable agents that call other agents as tools — nested execution with full tracking. Each gets its own system prompt, model, temperature, and context strategy. Build a Research Agent, Writing Agent, and a PM Agent that dispatches both.
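The composition described above — agents dispatching other agents as tools, with every nested call tracked — can be sketched in a few lines. This is an illustrative model only, not QARK's actual runtime; all names (`Agent`, `run`, the trace format) are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    # Each agent carries its own prompt, model, and temperature,
    # mirroring the per-agent configuration described above.
    name: str
    system_prompt: str
    model: str
    temperature: float = 0.7
    tools: list = field(default_factory=list)  # sub-agents callable as tools

    def run(self, task, trace=None, depth=0):
        # Record this dispatch so nested execution is fully tracked.
        trace = trace if trace is not None else []
        trace.append((depth, self.name, task))
        # A real runtime would call the configured model here; this sketch
        # just fans the task out to sub-agents and collects their results.
        sub_results = [t.run(task, trace, depth + 1) for t in self.tools]
        return {"agent": self.name, "sub_results": sub_results, "trace": trace}

research = Agent("Research Agent", "Find sources.", "claude-sonnet")
writing = Agent("Writing Agent", "Draft prose.", "gpt-4o")
pm = Agent("PM Agent", "Coordinate.", "gemini-pro", tools=[research, writing])

out = pm.run("Write a market brief")
```

The trace records the PM Agent at depth 0 and both sub-agents at depth 1 — the "nested execution with full tracking" from the card, in miniature.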

Runtime

Sparks

Cmd+Alt+Space. A floating overlay from any app. Pick a Flow, transform text, paste back.

Models

Every Major Provider

Anthropic, OpenAI, Google, Groq, DeepSeek, Ollama, OpenRouter, Together, xAI, Perplexity, and more — every frontier model in one app. Switch providers mid-conversation without losing context.

Runtime

RAG

Vector search, hybrid retrieval, and reranking. Every answer cites its source.
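Hybrid retrieval typically blends a dense (vector-similarity) score with a sparse (keyword-overlap) score. A minimal sketch of that blending, with made-up documents and a weighting `alpha` that is purely illustrative:

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def keyword_score(query, doc):
    # Fraction of query terms that appear in the document.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def hybrid_rank(query, query_vec, docs, alpha=0.6):
    """Blend dense and sparse scores; return documents best-first."""
    scored = [
        (alpha * cosine(query_vec, vec) + (1 - alpha) * keyword_score(query, text), text)
        for text, vec in docs
    ]
    return [text for _, text in sorted(scored, reverse=True)]

docs = [
    ("cats are mammals", [1.0, 0.0]),
    ("rust compiles fast", [0.0, 1.0]),
]
ranked = hybrid_rank("why do rust builds compile fast", [0.1, 0.9], docs)
```

A production pipeline would add a reranking pass (e.g. a cross-encoder) over the top-k results; this sketch stops at the first-stage blend.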

Runtime

Flows

Pre-configured LLM pipelines: system prompt → model → post-processing → output. "Summarize in 3 bullets" → clipboard. "Review code" → diff view. Chain with Sparks for instant text transforms.
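The Flow shape above — system prompt, then model, then post-processing, then output — is easy to picture as function composition. A hedged sketch (the `make_flow` name and the echo model are inventions for illustration, not QARK's API):

```python
def make_flow(system_prompt, model_call, post_process):
    """Illustrative Flow: system prompt -> model -> post-processing -> output."""
    def flow(text):
        raw = model_call(system_prompt, text)
        return post_process(raw)
    return flow

# Stand-in for a real provider call; any (system_prompt, text) -> str works.
def echo_model(system_prompt, text):
    return f"{system_prompt}: {text}"

# A real "Summarize in 3 bullets" -> clipboard flow would plug a clipboard
# writer in as post-processing; here we just strip whitespace.
summarize = make_flow("Summarize in 3 bullets", echo_model, str.strip)
result = summarize("  long article text  ")
```

Chaining with Sparks then amounts to binding a hotkey to one of these composed functions.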

Runtime

Conversations

Smart groups with custom icons and colors, auto-grouping, auto-naming via local LLM. Fork from any message, branch navigation, split view with resizable panes. Full-text search, batch ops, archive, export to Markdown/HTML/PDF.

Tools

Web Search

Search the web from any conversation — via Tavily, Brave, Exa, Jina, Perplexity Sonar, Valyu, Parallel, or your own local browser for completely free, zero-API searches. Native provider grounding through Anthropic, OpenAI, and Gemini is built in as well. Results are parsed, cited, and fed into context.

Tools

Image & Video Generation

Generate images with DALL·E, Gemini Imagen, and xAI Aurora — and videos with Sora, Gemini, and xAI — inline in any conversation. Preview, iterate on prompts, upscale, download, or share directly. Multiple providers let you balance style, quality, and cost.

Coming Soon
Tools

Speech Support

Voice in, voice out. macOS native or OpenAI Whisper.

Tools

Unix Commands

Execute shell commands from within conversations — safe, sandboxed, with full output captured in context. Run scripts, query databases, process files, and pipe results directly into your AI workflow.
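Capturing a command's full output for context looks roughly like the following. This is a conceptual sketch, not QARK's sandbox: a real implementation would also restrict filesystem and network access, which `subprocess` alone does not do.

```python
import subprocess

def run_command(cmd, timeout=10):
    """Run a shell command and package its output for an AI context window."""
    result = subprocess.run(
        cmd, shell=True, capture_output=True, text=True, timeout=timeout
    )
    # Everything the command produced is captured, so the model sees
    # exit status, stdout, and stderr together.
    return {
        "command": cmd,
        "exit_code": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
    }

out = run_command("echo hello")
```

The returned dict is what gets "piped directly into your AI workflow": a structured record rather than raw terminal scrollback.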

Tools

Web Fetch

Fetch and parse any URL — articles, docs, API responses. Uses Jina Reader, Tavily Extract, Parallel, Valyu, Ollama, or your own local browser to extract content for free with no API key. Cleaned and added to context automatically.

Tools

File Support

Attach PDFs, DOCX, images, spreadsheets, and code files — everything is parsed, chunked, and indexed for instant retrieval. Drag and drop or paste from clipboard.

Models

Any Model You Want

Frontier to open-source. Switch models mid-conversation.

Privacy

100% Local

Every conversation, every setting, every attachment — stored locally in a SQLite database on your Mac. No cloud sync, no telemetry, no tracking. API calls go direct from your machine to the provider. QARK never sees your data.

Privacy

Bring Your Own Keys

Connect directly to each provider with your own API keys. Pay only for what you use — no markup, no middleman, no subscription. Full cost transparency across every conversation.

Models

Local Models

Run Ollama and LM Studio models completely offline on your hardware. Full privacy, zero cost, no API key needed. Perfect for air-gapped environments and sensitive work.

Privacy

Budget Control

Set spending limits per provider, per day, or per month. Real-time cost tracking per message shows exactly what each response costs. Automatic alerts when you approach limits. Full cost breakdown across all providers in one dashboard.
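Per-message cost tracking against a provider cap reduces to simple arithmetic over token counts and per-million-token prices. A sketch with invented class names and example prices (real prices vary by model and provider):

```python
class Budget:
    """Illustrative per-provider spend limiter."""
    def __init__(self, limits):
        self.limits = limits                      # provider -> cap in dollars
        self.spent = {p: 0.0 for p in limits}

    def record(self, provider, input_tokens, output_tokens, in_price, out_price):
        # Prices are dollars per 1M tokens, as most providers quote them.
        cost = input_tokens * in_price / 1e6 + output_tokens * out_price / 1e6
        self.spent[provider] += cost
        return cost

    def remaining(self, provider):
        return self.limits[provider] - self.spent[provider]

    def near_limit(self, provider, threshold=0.9):
        # Trigger an alert once spend crosses 90% of the cap.
        return self.spent[provider] >= threshold * self.limits[provider]

b = Budget({"anthropic": 10.0})
cost = b.record("anthropic", 1000, 500, in_price=3.0, out_price=15.0)
```

With example prices of $3/M input and $15/M output, a 1,000-in / 500-out message costs $0.0105 — the kind of per-message figure the dashboard surfaces.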

Privacy

Encrypted Keys

All API keys encrypted on disk with a randomly generated encryption key unique to your installation. Keys are never stored in plaintext and never sent to our servers.

Power

MCP Integration

Connect external tools, databases, and services via the Model Context Protocol. Extend QARK's capabilities with any MCP-compatible server — from GitHub to Slack to your own custom tools. The agent runtime becomes infinitely extensible.

Power

Embedded LLM

A local language model runs on-device for auto-naming conversations, smart tagging, and intelligent organization — zero API calls, zero latency, zero cost.

Power

Keyboard-First

Command palette, split view, tabs, multi-pane layouts, and deep keyboard navigation. Every action has a shortcut — new conversation, switch model, toggle sidebar, jump to search. Vim-style motions for power users. Designed for people who never touch the mouse.

Power

Context Strategies

Six context-management strategies — from simple truncation to summarization and hybrid approaches. Per-conversation control.
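The simplest of these strategies, truncation, keeps the system prompt plus the most recent turns that fit a budget. A minimal sketch (character counts stand in for token counts, which a real implementation would use):

```python
def truncate_context(messages, max_chars):
    """Keep the system message plus the newest turns that fit the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    kept = []
    used = sum(len(m["content"]) for m in system)
    for m in reversed(rest):                    # walk newest-first
        if used + len(m["content"]) > max_chars:
            break
        kept.append(m)
        used += len(m["content"])
    return system + list(reversed(kept))        # restore chronological order

msgs = [
    {"role": "system", "content": "be brief"},
    {"role": "user", "content": "a" * 50},
    {"role": "assistant", "content": "b" * 50},
    {"role": "user", "content": "c" * 20},
]
trimmed = truncate_context(msgs, max_chars=80)
```

A summarization strategy would instead compress the dropped turns into a synthetic message; a hybrid does both, summarizing old turns and keeping recent ones verbatim.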

Coming Soon
Runtime

Skills

Teach QARK reusable abilities — modular capabilities that snap in, stack up, and level up your agents. Custom tools that persist across sessions and share across workflows.

Coming Soon
Runtime

Memory

Conversations end. Context shouldn't. Persistent memory gives your agents recall across sessions, projects, and weeks — they remember what matters so you don't have to repeat yourself.

Coming Soon
Platform

iOS App

QARK in your pocket. Same agents, same privacy, same workflows — synced through iCloud. Your cloud, not ours.

Explore all features
Pricing

Pay once, own it forever.

No subscriptions. No credit packs. Use your own API keys — save hundreds per year compared to ChatGPT Plus or Claude Pro.

Launch Sale — 50% off

Standard

Everything you need to replace every AI app on your desktop. Full power, no compromises.

$29 (was $59)
one-time purchase
Start Free Trial

Everything included:

7-day free trial
1 year of updates
All 100+ features
Every provider & model
Unlimited conversations
Local SQLite storage
Limited — 500 licenses

Lifetime

Lock in lifetime updates forever. No renewals, no surprises.

$149
one-time, forever
Get Lifetime Access

Everything in Standard, plus:

Lifetime updates forever
Priority support
Never pay again

Standard plan, year 2+: optional $19/year renewal for updates only (the app keeps working forever either way).

FAQ

Frequently asked questions

Everything you need to know about Qark.

What is Qark?

Qark is an AI agent kernel — not a chat wrapper. Every interaction dispatches an agent with its own model, tools, system prompt, and context strategy. Agents can call other agents as tools, creating composable pipelines and nested workflows. Built with Rust and Tauri for native performance.

How is Qark different from subscription chat apps?

Subscription chat apps lock you into one provider and charge $20/month whether you use it or not. Qark connects to 13+ providers with your own API keys — you pay only for tokens you actually use. Plus you get agents, RAG, Flows, Sparks, MCP, and tools that no chat app offers.

Is Qark only for developers?

Not at all. Start a conversation, pick a model, and chat — it works like any AI app. The advanced features (agents, flows, RAG, MCP) are there when you want them, but never in the way. Lawyers, writers, scientists, and students use Qark alongside developers.

What platforms does Qark run on?

macOS today, with Windows and Linux support launching within 30 days. An iOS companion app is also in development for on-the-go access with the same agents and privacy.

How is Qark different from other bring-your-own-key apps?

Most BYOK apps are chat wrappers with a model picker. Qark is an agent runtime — every interaction dispatches a configurable agent that can call other agents as tools, creating nested workflows. Add RAG, Flows, the Sparks overlay, MCP integration, built-in web search, image/video generation, and Unix commands — it's a full AI operating environment, not a pretty textarea.

Can I chat with my documents?

Yes — and it's not basic file attachment. Qark has a full RAG pipeline: vector search, hybrid retrieval, cross-encoder reranking, HyDE generation, and code-aware chunking. Smart routing picks the right strategy per document. Drop in PDFs, DOCX, spreadsheets, code files, or entire folders. Every answer cites its source.

Can I use Qark from other apps?

Yes. Sparks (Cmd+Alt+Space) is a floating overlay that works from any app — highlight text, summon Sparks, run a Flow, and the result goes straight to your clipboard. No copy-pasting between windows. Flows are reusable LLM pipelines you define once and trigger instantly.

Does Qark integrate with external tools?

Deeply. MCP (Model Context Protocol) lets you plug in Notion, Google Sheets, GitHub, Slack, databases, or any custom tool — agents discover and use them automatically. On top of that, Qark ships with 9 built-in tools including web search, web fetch, image and video generation, and sandboxed Unix commands.

What does my purchase include?

Your purchase includes all updates and new features for 12 months. After that, the app continues to work forever — you just won't receive new features unless you optionally renew for $19/year.

Is there a free trial?

Yes! Every plan includes a 7-day free trial with full access to all features. No credit card required to start.

How does the Lifetime plan work?

For $149, you get lifetime updates forever — no renewal needed. This is limited to 500 licenses, then it's retired permanently.

What does it cost to run?

You pay once for Qark itself. AI usage is separate — you use your own API keys and pay providers directly for tokens. Most casual users spend $2–5/month. Power users might spend more, but you're always in control with per-provider budget limits and real-time cost tracking per message.

Does Qark ever see my data?

Never. Qark has no server, no cloud, no account system. API calls go directly from your machine to the provider. Your API keys are encrypted on disk with a randomly generated encryption key unique to your installation. Conversations are stored in a local SQLite database you fully own. We couldn't read your data even if we wanted to.

Does Qark work offline?

Yes. Run local models via Ollama or LM Studio and Qark works with zero internet connection — no API calls, no tokens, no cost. The embedded LLM handles conversation naming and tagging locally too. Perfect for air-gapped environments, sensitive documents, or just saving money.

Do AI providers train on my data?

That depends on the provider's API terms, not Qark. Most providers (OpenAI, Anthropic, Google) do not train on API data by default. Since Qark uses API keys directly, you get the same data protection as any API user. You can also use local models for complete privacy — nothing leaves your machine.

Does Qark collect any telemetry?

Zero. No telemetry, no analytics, no crash reporting, no update checks you didn't ask for. Qark doesn't even have a user account system — there's nothing to phone home to. You can verify this yourself: the app makes no network requests except the ones you trigger to AI providers.

Have more questions?

Ready to try the AI agent kernel?

Download Qark free. 7-day trial, no credit card required.