Our Story & Philosophy

About Avelyn: Privacy-First Local LLM Writing Assistant

Avelyn was built to bridge the gap between powerful generative AI capabilities and absolute data sovereignty. We believe your thoughts belong to you.

The Privacy Crisis in Modern Writing Utilities

Almost every mainstream writing assistant runs on centralized cloud models. Every email draft, code snippet, sensitive project note, or password you highlight is sent across the internet, where it may be stored in external databases and used to train future models. In corporate environments, research laboratories, and personal contexts, this data pipeline can be an unacceptable liability.

As organizations enforce strict policies against data leakage, professionals are forced to choose between efficiency and security. Avelyn is engineered as a zero-telemetry alternative. It acts as a dedicated **local LLM writing assistant** that operates system-wide, processing your information entirely within a local sandbox on your machine.

Moreover, the bandwidth costs, subscription limits, and availability bottlenecks of cloud APIs interrupt smooth, native editing workflows. Security departments often block cloud endpoints outright, whereas a sandboxed local tool that handles data solely in system RAM is much easier to align with enterprise security policies.

Under the Hood: The Hybrid Multi-Provider Architecture

Avelyn integrates with native macOS accessibility APIs to intercept highlighted text only when a user triggers the global shortcut (Ctrl+Shift+E). Once activated, the app captures the selection, loads it into local memory, and presents a minimalist command palette right at the cursor position.

Unlike cloud-only tools that send every draft to a single remote service, Avelyn is multi-provider. It routes text to your local Ollama engine by default, but can also route to Avelyn Cloud (OpenRouter) or custom OpenAI-compatible endpoints. The app's Auto Provider Mode classifies each task to decide when to run locally for privacy and when to use cloud power, and it falls back to the next endpoint in priority order if a server goes down.
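
The routing idea can be sketched in a few lines of Python. This is a minimal illustration and not Avelyn's actual implementation: the task labels, the provider names ("ollama", "cloud", "custom"), and the functions `provider_order` and `run_with_fallback` are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical task labels that should always stay on the local machine.
PRIVATE_TASKS = {"rewrite", "proofread", "summarize"}


@dataclass
class Provider:
    name: str                   # hypothetical names: "ollama", "cloud", "custom"
    send: Callable[[str], str]  # takes the selected text, returns the model output


def provider_order(task: str, contains_sensitive: bool) -> list[str]:
    """Return provider names in the order they should be tried."""
    if contains_sensitive or task in PRIVATE_TASKS:
        return ["ollama"]                 # privacy first: local only
    return ["cloud", "custom", "ollama"]  # otherwise cloud first, local as last resort


def run_with_fallback(text: str, order: list[str],
                      providers: dict[str, Provider]) -> tuple[str, str]:
    """Try each provider in order; return (provider name, output) from the first success."""
    errors = []
    for name in order:
        provider = providers.get(name)
        if provider is None:
            continue                      # provider not configured, skip it
        try:
            return name, provider.send(text)
        except ConnectionError as exc:    # server down: move on to the next endpoint
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

In this sketch a sensitive selection is never offered to a remote provider at all, so a failure of the local engine surfaces as an error instead of silently sending the text to the cloud.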

Once inference completes, Avelyn replaces the selection in place. The entire workflow is transparent and customizable, and raw API keys are masked with password-style bullets in the settings screen.

Apple Silicon Native Hardware Acceleration

Running LLMs locally has historically been slow and resource-heavy. Avelyn addresses this with builds optimized for the unified memory architecture of Apple Silicon (M1, M2, M3, and M4 chips). It uses the Apple Neural Engine and GPU cores for prompt processing and token streaming, with sub-second latency.

By maintaining dynamic model warming in the background, Avelyn ensures that highlighting a block and asking for a rewrite executes in under 1.5 seconds. For details on benchmarking configurations and model specifications, check out the Avelyn AI technical guide.
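
One plausible way to keep a model warm with a local Ollama engine is to send a generate request that names the model but carries no prompt, which loads the model into memory, and to set `keep_alive` so it stays resident between rewrites. The sketch below assumes Ollama's default address `http://localhost:11434`; the helper names and the `"llama3"` model tag are illustrative, and Avelyn's own warming logic may work differently.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address


def build_warmup_request(model: str, keep_alive: str = "10m") -> urllib.request.Request:
    """Build a prompt-less generate request; Ollama loads the model without generating text."""
    payload = {"model": model, "keep_alive": keep_alive}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def warm_model(model: str, keep_alive: str = "10m", timeout: float = 30.0) -> bool:
    """Send the warm-up request; return False if the local server is unreachable or errors."""
    try:
        with urllib.request.urlopen(build_warmup_request(model, keep_alive),
                                    timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers connection refused, timeouts, and HTTP errors
        return False
```

Calling `warm_model("llama3")` at launch, and again shortly before the keep-alive window expires, would keep the first rewrite from paying the model load cost.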

Strategic Comparisons: How We Stack Up

Compared to cloud alternatives, Avelyn offers a completely different paradigm. Traditional systems operate as browser destinations, requiring constant copy-pasting, monthly subscriptions, and fixed rate limits. Avelyn gives you a system-wide macOS overlay that runs locally for free, supports flexible custom API connections, provides smart fallbacks, and lets you cancel a generation instantly.
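
Cancellation of a streamed generation is commonly built around a shared flag that the streaming loop checks between tokens. The following is a minimal sketch of that pattern under stated assumptions: `stream_until_cancelled` is a hypothetical helper, and the token source stands in for whatever streaming response the provider returns.

```python
import threading
from typing import Iterable, Iterator


def stream_until_cancelled(tokens: Iterable[str],
                           cancel: threading.Event) -> Iterator[str]:
    """Yield tokens from the stream until the cancel flag is set."""
    for token in tokens:
        if cancel.is_set():
            break  # stop before emitting anything further
        yield token


# Usage: the UI thread calls cancel.set() when the user aborts.
cancel = threading.Event()
received = []
for index, token in enumerate(stream_until_cancelled(["Dear", " team", ",", " hi"], cancel)):
    received.append(token)
    if index == 1:
        cancel.set()  # simulate the user pressing the cancel control
```

A real client would also close the underlying connection on cancel so the model stops generating instead of only being ignored.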

To see detailed head-to-head comparisons against mainstream solutions, explore our dedicated research pages.

Advanced Local Sandbox Architecture Details

Operating inside a local sandbox requires compliance with macOS App Sandbox security rules. The assistant maintains absolute boundary isolation. It does not write temporary files containing highlighted text selections to disk, avoiding data leaks. The temporary text buffer is loaded purely in system RAM. Once the rewrite instruction executes and the output is pasted back, the system memory block is immediately zeroed out.
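
The zero-after-use pattern described above can be illustrated with a mutable byte buffer that is overwritten as soon as the work is done. This is only a sketch of the idea in Python, with a hypothetical `transient_buffer` helper. The original Python string passed in is immutable and is not wiped here, so a native implementation would need to control the buffer from the moment of capture.

```python
from contextlib import contextmanager


@contextmanager
def transient_buffer(text: str):
    """Hold text in a mutable in-memory buffer and overwrite it with zeros on exit."""
    buf = bytearray(text.encode("utf-8"))
    try:
        yield buf
    finally:
        buf[:] = bytes(len(buf))  # overwrite every byte in place with 0x00


# Usage: the buffer is readable inside the block and zeroed afterwards,
# even if the processing step raises an exception.
with transient_buffer("draft: Q3 figures") as selection:
    processed = selection.decode("utf-8").upper()
```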

This standard of sandboxing prevents other applications from intercepting the clipboard data or reading the text buffer during LLM processing. This matches the security standard required by enterprises handling classified documentation or proprietary source code.

A Community-Driven Open Source Core

Our repository is publicly hosted, encouraging peer review, contributions, and community audits. Developers can review the accessibility hooks, shortcut bindings, and local API wrappers to verify that no network requests are sent when the local provider is in use.

In addition to security, an open ecosystem fosters model diversity. As developers build specialized coding helpers or creative writing weights, they can plug them directly into the assistant. This community approach ensures the product remains adaptable to future breakthroughs in language modeling.

Zero Cloud Logging

In local mode, your text never touches cloud gateways, external servers, or proxy log databases.

Silicon Acceleration

Optimized for Apple M-series chips, ensuring sub-second inference speeds.

Private Sandbox

Runs locally on your device within a secured macOS sandbox, protecting code and documentation.

Open Weights Models

Supports Gemma 3, Llama 3, Mistral, and custom fine-tuned weights via Ollama.

Experience Local Offline Writing Assistance

Take part in our private beta program and run your macOS writing workflows with complete digital sovereignty.

Apply for Beta Access
