About Avelyn: Privacy-First Local LLM Writing Assistant
Avelyn was built to bridge the gap between powerful generative AI capabilities and absolute data sovereignty. We believe your thoughts belong to you.
The Privacy Crisis in Modern Writing Utilities
Almost every mainstream writing assistant runs on centralized cloud models. Every email draft, code snippet, sensitive project note, or password you highlight is sent across the internet, where it may be stored in external databases and used to train future models. In corporate environments, research laboratories, and personal contexts, this data pipeline is an unacceptable liability.
As organizations enforce strict policies against data leakage, professionals are forced to choose between efficiency and security. Avelyn is engineered as a zero-telemetry alternative. It acts as a dedicated local LLM writing assistant that operates system-wide, processing your information entirely within a local sandbox on your machine.
Moreover, the bandwidth costs, subscription limits, and availability bottlenecks of cloud APIs interrupt an otherwise smooth, native editing workflow. Security departments often block cloud endpoints outright, whereas a sandboxed local tool that handles data solely in system RAM is far easier to reconcile with enterprise security policies.
Under the Hood: The Local Sandboxed Architecture
Avelyn integrates with native macOS accessibility APIs to intercept highlighted text only when a user triggers the global shortcut (Ctrl+Shift+E). Once activated, the app captures the selection, loads it into local memory, and presents a minimalist command palette right at the cursor position.
Unlike cloud-based tools that stream files to third-party endpoints, Avelyn routes the text directly to a local inference engine running on the host. It communicates over local ports with models hosted on your own system, so no information is sent to external servers. For the detailed mechanics, read our answers to common questions in the What is Avelyn FAQ guide.
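The local routing described above can be sketched in a few lines. This is a minimal illustration, not Avelyn's actual implementation: it assumes an Ollama server listening on its default local port (11434), uses `llama3` as a placeholder model name, and the helper names `build_rewrite_request` and `rewrite` are hypothetical. The guard simply refuses any endpoint that is not a loopback address.

```python
import json
import urllib.request
from urllib.parse import urlparse

# Assumption: an Ollama server on its default local port.
OLLAMA_URL = "http://127.0.0.1:11434/api/generate"
LOOPBACK_HOSTS = {"127.0.0.1", "localhost", "::1"}


def build_rewrite_request(selection: str, instruction: str,
                          model: str = "llama3",  # placeholder model name
                          url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build (but do not send) a rewrite request aimed at a loopback address."""
    host = urlparse(url).hostname
    if host not in LOOPBACK_HOSTS:
        raise ValueError(f"refusing non-local inference endpoint: {host!r}")
    payload = {
        "model": model,
        "prompt": f"{instruction}\n\n{selection}",
        "stream": False,  # ask for one JSON reply instead of a token stream
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def rewrite(selection: str, instruction: str) -> str:
    """Send the request to the local server and return the generated text."""
    req = build_rewrite_request(selection, instruction)
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())["response"]
```

Calling `rewrite("teh quick brown fox", "Fix the typos in this text:")` would only succeed with a local Ollama server running and the named model pulled. Building the request separately from sending it keeps the loopback check easy to audit.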
Once inference completes, Avelyn replaces the selection in place. The entire workflow is transparent and fast, and it relies on no cloud processing.
Apple Silicon Native Hardware Acceleration
Running LLMs locally has historically been slow and resource-heavy. Avelyn addresses this by compiling specifically for the unified memory architecture of Apple Silicon (M1, M2, M3, and M4 chips). It uses the Apple Neural Engine and GPU cores to perform prompt processing and token streaming with sub-second latency.
By keeping the model warm in the background, Avelyn ensures that highlighting a block and asking for a rewrite completes in under 1.5 seconds. For benchmarking configurations and model specifications, see the Avelyn AI technical guide.
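How Avelyn warms its model is not documented here, so the following is only one plausible approach, assuming an Ollama backend. Ollama loads a model into memory when it receives a generate request with no prompt, and its `keep_alive` field controls how long the model stays resident afterwards. The helper name `build_warmup_payload` and the ten-minute default are illustrative assumptions.

```python
import json


def build_warmup_payload(model: str, keep_alive: str = "10m") -> bytes:
    """Return a JSON body that asks Ollama to load `model` without generating.

    Omitting "prompt" makes the server load the model only; "keep_alive"
    tells it how long to keep the weights in memory after the request.
    """
    payload = {"model": model, "keep_alive": keep_alive}
    return json.dumps(payload).encode("utf-8")
```

Posting this body to the local `/api/generate` endpoint shortly before the user is likely to trigger a rewrite would avoid paying the model load time on the first request.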
Strategic Comparisons: How We Stack Up
Compared with cloud alternatives, Avelyn follows a different paradigm. Traditional systems operate as destinations, requiring constant copy-pasting and a subscription. Avelyn runs 100% offline, for free, with no caps on how much text you process.
To see detailed head-to-head comparisons against mainstream solutions, explore our dedicated research pages.
Advanced Local Sandbox Architecture Details
Operating inside a local sandbox requires compliance with macOS App Sandbox security rules. The assistant maintains absolute boundary isolation. It does not write temporary files containing highlighted text selections to disk, avoiding data leaks. The temporary text buffer is loaded purely in system RAM. Once the rewrite instruction executes and the output is pasted back, the system memory block is immediately zeroed out.
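The RAM-only buffer with zeroing on release can be illustrated with a small context manager. This is a conceptual sketch, not Avelyn's code: the name `ephemeral_buffer` is hypothetical, and Python cannot guarantee that no other copies of the text exist (the original `str` and intermediate objects are managed by the interpreter). A native implementation would wipe its own memory directly.

```python
from contextlib import contextmanager


@contextmanager
def ephemeral_buffer(text: str):
    """Hold text in a mutable in-memory buffer and zero it when done."""
    buf = bytearray(text.encode("utf-8"))  # mutable, so it can be overwritten
    try:
        yield buf
    finally:
        # Overwrite every byte in place, even if the body raised an error.
        for i in range(len(buf)):
            buf[i] = 0
```

Used as `with ephemeral_buffer(selection) as buf: ...`, the buffer contents are available only inside the block. Nothing is written to disk, and the bytes are cleared as soon as the block exits.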
This level of sandboxing is designed to prevent other applications from intercepting clipboard data or reading the text buffer during LLM processing. It targets the security standard required by enterprises handling classified documentation or proprietary source code.
A Community-Driven Open Source Core
Our repository is publicly hosted, encouraging peer review, contributions, and community audits. Developers can review the accessibility hooks, shortcut bindings, and local API wrappers to verify that no network requests are sent.
In addition to security, an open ecosystem fosters model diversity. As developers build specialized coding helpers or creative-writing model weights, they can plug them directly into the assistant. This community approach keeps the product adaptable to future breakthroughs in language modeling.
Zero Cloud Logging
Your text never touches cloud gateways, external servers, or proxy log databases.
Silicon Acceleration
Optimized for Apple M-series chips, ensuring sub-second inference speeds.
Private Sandbox
Runs locally on your device within a secured macOS sandbox, protecting code and documentation.
Open Weights Models
Supports Gemma 3, Llama 3, Mistral, and custom fine-tuned weights via Ollama.
Experience Local Offline Writing Assistance
Take part in our private beta program and run your macOS writing workflows with complete digital sovereignty.
Apply for Beta Access