DecisionsAI - Desktop AI Assistant Offline by Default

Name: DecisionsAI
Rating: 5 (1 review)
Author: Tensology

DecisionsAI is a free, open source desktop application: a local AI assistant that works like Siri for your computer. It runs natively on Windows, macOS, and Linux and lets you control your computer with natural language. It is privacy-first and runs offline by default.

Desktop Native, Open Source, Privacy First

A globe floating above your windows, always listening. Select text, say "Read This" - it reads it back. Rewrite anything. Control everything. All by voice. Windows, macOS, Linux. Offline by default.

What You Can Say to the Agent

AI Assistant Commands

Read This - Select the text on your screen and then say 'Read This' - the agent will read the selected text to you.

Rework This / Rework from Clipboard - Select text and say 'Rework This' to improve and paste it, or copy to clipboard and say 'Rework from Clipboard' to improve the clipboard content.

Summarize This / Summarize from Clipboard - Select text and say 'Summarize This' to get a summary pasted, or copy to clipboard and say 'Summarize from Clipboard' to summarize clipboard content.

Explain This / Explain from Clipboard - Select text and say 'Explain This' to get an explanation, or copy to clipboard and say 'Explain from Clipboard' to explain the clipboard content.

Computer Control Commands

Open / Focus - Say 'Open [app name]' or 'Focus on [app name]' - the agent opens or switches to that application.

New Tab / Close - Say 'New Tab' to create a browser tab, or 'Close' to close the current window.

Copy / Paste / Cut - Say 'Copy', 'Paste', or 'Cut' - the agent performs the clipboard action.

Select All - Say 'Select All' - the agent selects all text in the current field or document.

Undo / Redo - Say 'Undo' to reverse the last action, or 'Redo' to reapply the action you just undid.

Mouse and Navigation Commands

Mouse Up/Down/Left/Right - Say 'Mouse Up', 'Mouse Down', 'Mouse Left', or 'Mouse Right' - the agent moves the cursor in that direction.

Click / Double Click - Say 'Click' for a left click, or 'Double Click' for a double-click action.

Scroll Up / Down - Say 'Scroll Up' or 'Scroll Down' - the agent scrolls the page in that direction.

Move Mouse Center - Say 'Move Mouse Center' - the agent moves your cursor to the center of the screen.

Right Click - Say 'Right Click' - the agent performs a right-click at the current cursor position.
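The directional mouse commands above come down to simple coordinate arithmetic. The following is a minimal sketch, not DecisionsAI's actual code: the step size `STEP` and the helper `next_position` are hypothetical, and the sketch assumes screen coordinates where y grows downward.

```python
from typing import Tuple

STEP = 50  # pixels moved per command; an assumed value, not taken from DecisionsAI

# Offsets per spoken command. Screen y grows downward, so "up" is negative y.
OFFSETS = {
    "mouse up": (0, -STEP),
    "mouse down": (0, STEP),
    "mouse left": (-STEP, 0),
    "mouse right": (STEP, 0),
}


def next_position(
    command: str, pos: Tuple[int, int], screen: Tuple[int, int]
) -> Tuple[int, int]:
    """Return the cursor position after a command, clamped to the screen."""
    width, height = screen
    if command == "move mouse center":
        return width // 2, height // 2
    dx, dy = OFFSETS[command]
    x = min(max(pos[0] + dx, 0), width - 1)
    y = min(max(pos[1] + dy, 0), height - 1)
    return x, y
```

With PyAutoGUI installed, the current position and screen size could be read with `pyautogui.position()` and `pyautogui.size()`, and the result passed to `pyautogui.moveTo(x, y)`.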

Dictation and Transcription Commands

Dictate - Say 'Dictate' - everything you say next will be typed into the current field until you say 'Enter This'.

Transcribe - Say 'Transcribe' or 'Listen' - your speech is stored to clipboard until you say 'Enter This' or 'Stop Listening'.

Listen - Say 'Listen' or 'Listen to' - the agent starts capturing your speech to the clipboard.

Stop Listening - Say 'Stop Listening' or 'Stop' - the agent stops capturing your speech.
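The 'Dictate' ... 'Enter This' behaviour described above can be read as a small two-state machine. The sketch below illustrates that idea only; the class `DictationSession` is hypothetical, and a real implementation would type into the focused field rather than collect text in a list.

```python
from typing import List


class DictationSession:
    """Collects spoken phrases between 'Dictate' and 'Enter This'."""

    START = "dictate"
    STOP = "enter this"

    def __init__(self) -> None:
        self.active = False
        self.typed: List[str] = []

    def feed(self, phrase: str) -> None:
        """Process one recognized phrase."""
        text = " ".join(phrase.lower().split())
        if not self.active:
            # Outside dictation, only the start phrase has an effect here.
            if text == self.START:
                self.active = True
            return
        if text == self.STOP:
            self.active = False
            return
        # Stand-in for typing the phrase into the current field.
        self.typed.append(phrase.strip())
```

Feeding the phrases "Dictate", "Dear team,", "Enter This" in that order leaves `typed` holding only "Dear team," and the session inactive again.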

Media Control Commands

Play / Pause / Stop - Say 'Play', 'Pause', or 'Stop' - the agent controls media playback on your system.

Next / Previous Track - Say 'Next Track' or 'Previous Track' - the agent skips to the next or previous song.

Volume Up / Down - Say 'Volume Up' or 'Volume Down' - the agent adjusts your system volume.

Mute - Say 'Mute' - the agent mutes your system audio.

System Commands

Hide / Show Globe - Say 'Hide Globe' to hide the interface, or 'Show Globe' to bring it back.

New Chat - Say 'New Chat' or 'Start Over' - the agent begins a fresh conversation.

Refresh / Reload - Say 'Refresh' or 'Reload' - the agent refreshes the current page or application.

Exit - Say 'Exit' - the agent closes the DecisionsAI application.
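The command lists above suggest a lookup from spoken phrase to action, plus a few commands that carry an argument such as an app name. The following is a minimal sketch under that assumption; `COMMANDS`, `match_command`, `parse_app_command`, and the action identifiers are all hypothetical and are not taken from the DecisionsAI codebase.

```python
from typing import Optional, Tuple

# Hypothetical mapping from spoken phrases to action identifiers.
# The phrases come from the lists above; the action names are illustrative.
COMMANDS = {
    "read this": "read_selection",
    "rework this": "rework_selection",
    "summarize this": "summarize_selection",
    "explain this": "explain_selection",
    "new chat": "new_chat",
    "start over": "new_chat",
    "refresh": "refresh",
    "reload": "refresh",
    "hide globe": "hide_globe",
    "show globe": "show_globe",
}


def normalize(utterance: str) -> str:
    """Lowercase, trim, and collapse whitespace so matching is forgiving."""
    return " ".join(utterance.lower().split())


def match_command(utterance: str) -> Optional[str]:
    """Return the action for an exact phrase match, or None."""
    return COMMANDS.get(normalize(utterance))


def parse_app_command(utterance: str) -> Optional[Tuple[str, str]]:
    """Parse 'Open [app name]' or 'Focus on [app name]' into (action, app)."""
    text = normalize(utterance)
    for prefix, action in (("focus on ", "focus_app"), ("open ", "open_app")):
        if text.startswith(prefix) and len(text) > len(prefix):
            return action, text[len(prefix):]
    return None
```

Synonyms such as 'New Chat' and 'Start Over' map to the same action, so adding an alias is a one-line change.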

The Voice Pipeline

Every voice command flows through a sophisticated real-time pipeline built on Pipecat - the industry-leading AI voice framework. Your words are captured, understood, and executed in milliseconds, all while staying completely local on your machine.

The entire process happens in four seamless stages, each optimized for speed and privacy. From the moment you speak to the moment your computer responds, everything runs offline by default - giving you sub-100ms latency without ever touching the cloud.

Speech Recognition

Powered by Vosk for ultra-low latency or Whisper.cpp for maximum accuracy. Both run completely offline with intelligent noise cancellation and voice activity detection.

AI Understanding

Local models via Ollama (default llama3.2:3b) or optionally OpenAI for advanced reasoning. Understands context, intent, and executes complex multi-step commands.
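For readers who want to see what talking to a local Ollama model looks like, here is a minimal sketch that calls Ollama's HTTP API directly. How DecisionsAI itself talks to Ollama is not stated here, so treat this as an illustration; the helper names `build_request` and `generate` are hypothetical. It assumes an Ollama server is running on its default port with the `llama3.2:3b` model already pulled.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(prompt: str, model: str = "llama3.2:3b") -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "llama3.2:3b") -> str:
    """Send the prompt to Ollama and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=60) as resp:
        return json.loads(resp.read())["response"]
```

A call such as `generate("Summarize this: ...")` stays on the local machine, since the request only goes to localhost.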

Voice Synthesis

Fast local voices with Kokoro TTS or premium options via ElevenLabs. Natural, responsive audio that feels human.

Real-Time Execution

The Pipecat framework orchestrates everything with sub-100ms latency. Seamless streaming from microphone to action, with intelligent buffering and response management.
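The four stages above can be pictured as functions chained one after another. The sketch below shows only that data flow with stand-in stubs. It is not Pipecat's API, every function name is hypothetical, and the stubs do no real audio or model work. The execution step is placed before synthesis so the spoken reply can report the result, which is an assumption about ordering.

```python
from typing import Any, Callable, Iterable


def recognize(audio: bytes) -> str:
    """Stand-in for speech recognition: pretend the audio is already text."""
    return audio.decode("utf-8")


def understand(text: str) -> dict:
    """Stand-in for the language model: turn text into an intent."""
    return {"intent": text.strip().lower().replace(" ", "_")}


def execute(intent: dict) -> str:
    """Stand-in for running the action on the computer."""
    return f"done: {intent['intent']}"


def synthesize(reply: str) -> bytes:
    """Stand-in for text-to-speech: return 'audio' bytes."""
    return reply.encode("utf-8")


def run_pipeline(data: Any, stages: Iterable[Callable[[Any], Any]]) -> Any:
    """Pass the data through each stage in order."""
    for stage in stages:
        data = stage(data)
    return data
```

A real-time framework streams small frames through such stages continuously instead of waiting for each stage to finish, which is where the latency savings come from.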

The entire pipeline is architected for offline-first operation. Every component - from speech recognition to language understanding to voice synthesis - is designed to work independently of internet connectivity. Your data never leaves your machine unless you explicitly opt into cloud enhancements.

Built with Industry-Leading Open Source: DecisionsAI leverages cutting-edge technologies - Pipecat for real-time voice orchestration, Pydantic for robust data validation, and PyQt6 for a modern, responsive interface. All running locally, all completely transparent.

Desktop Application with True Computer Control

This native desktop application gives you true computer control. As a desktop app running on Windows, macOS, or Linux, DecisionsAI uses PyAutoGUI for GUI automation and OpenInterpreter for Python code execution. Automate complex workflows, manipulate desktop applications, and even write and run code - all through natural voice.

GUI Automation with PyAutoGUI

Control any application on your computer through voice. PyAutoGUI integration enables clicking, typing, window management, and screen interaction - automate repetitive tasks with simple voice commands.

Python Code Execution

OpenInterpreter integration lets you write, execute, and debug Python code through voice. Ask to analyze data, create scripts, or manipulate files - the assistant writes and runs code in real-time.

Real-Time Voice Pipeline

Pipecat framework delivers sub-100ms latency from speech to action. Stream voice commands continuously, interrupt naturally, and get instant responses - all processed locally with zero cloud dependency.

Advanced Speech Recognition

Choose between Vosk for ultra-low latency or Whisper.cpp for maximum accuracy. Both run completely offline with intelligent noise cancellation, voice activity detection (VAD), and multi-language support.
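Voice activity detection decides which audio frames contain speech. The sketch below shows the simplest form of the idea, an energy threshold on each frame. It is an illustration only: the detectors used in practice are more sophisticated, and the `threshold` value and function names here are assumptions.

```python
import math
from typing import List, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one frame of samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def speech_flags(
    frames: Sequence[Sequence[float]], threshold: float = 0.1
) -> List[bool]:
    """Mark each frame as speech (True) when its energy reaches the threshold."""
    return [rms(frame) >= threshold for frame in frames]
```

Raising the threshold makes the detector ignore more background noise at the cost of clipping quiet speech.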

Hybrid Architecture

Default to 100% offline operation with local models, or selectively enable cloud APIs for enhanced capabilities. Mix and match - use Ollama locally, OpenAI for complex reasoning, ElevenLabs for premium voices.
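One way to picture the mix-and-match setup is a small configuration object that records which provider handles each stage and reports whether anything would leave the machine. This is a sketch using standard-library dataclasses; the class `PipelineConfig`, its field names, and the choice of Kokoro as the default voice are assumptions, and the application's real settings format may differ.

```python
from dataclasses import dataclass
from typing import List

# Provider names taken from this page, grouped by where they run.
LOCAL = {"stt": {"vosk", "whisper.cpp"}, "llm": {"ollama"}, "tts": {"kokoro"}}
CLOUD = {"stt": {"assemblyai"}, "llm": {"openai"}, "tts": {"elevenlabs"}}


@dataclass(frozen=True)
class PipelineConfig:
    stt: str = "vosk"      # speech recognition
    llm: str = "ollama"    # language model
    tts: str = "kokoro"    # voice synthesis (assumed default)

    def __post_init__(self) -> None:
        # Reject provider names this sketch does not know about.
        for kind in ("stt", "llm", "tts"):
            name = getattr(self, kind)
            if name not in LOCAL[kind] | CLOUD[kind]:
                raise ValueError(f"unknown {kind} provider: {name}")

    def cloud_services(self) -> List[str]:
        """Names of the configured providers that need the internet."""
        chosen = (("stt", self.stt), ("llm", self.llm), ("tts", self.tts))
        return sorted(name for kind, name in chosen if name in CLOUD[kind])

    @property
    def offline(self) -> bool:
        """True when every stage runs locally."""
        return not self.cloud_services()
```

The default `PipelineConfig()` is fully offline, while `PipelineConfig(llm="openai")` reports one cloud service.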

Intelligent Voice Synthesis

Kokoro TTS provides fast, natural local voices with multiple character options. Upgrade to ElevenLabs for voice cloning, emotion control, and ultra-realistic speech patterns - all configurable per conversation.

Processing Options

Choose from a variety of processing options for voice and language tasks, with the flexibility to run locally or in the cloud.

Voice Models

Speech Recognition - Vosk is the default for lightweight offline processing. Choose Whisper.cpp for high-accuracy local transcription, or AssemblyAI's SLAM-1 for cloud-based real-time streaming and advanced audio intelligence. Features intelligent VAD threshold detection and echo cancellation.

ElevenLabs - Premium voice synthesis with access to ElevenLabs' latest models including v2 and v3. Features voice cloning, emotion control, and ultra-realistic speech patterns with your API key.

Kokoro TTS - High-performance local text-to-speech with multiple voice options. Optimized for low-latency response and offline operation, with configurable speech parameters and voice styles.

Language Models

OpenAI Models - Access GPT-4 Turbo, GPT-4, and GPT-3.5 Turbo through your OpenAI API key. Perfect for complex reasoning, creative tasks, and general conversation with industry-leading performance.

Customizable Ollama Models - Default to llama3.2:3b for general tasks, with support for any Ollama model. Specialize with separate models for conversation (e.g., Mistral) and logical reasoning (e.g., CodeLlama). Full control over model parameters and context windows.

LangChain Integration - Advanced RAG with local folder indexing, persistent chat history, and conversation review UI. Supports document chunking, embedding generation, and semantic search with configurable parameters.
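The chunk, embed, and search steps mentioned above can be illustrated without any framework. In the sketch below a bag-of-words count stands in for a real embedding model and cosine similarity stands in for the vector search. This is not LangChain's API; the function names and the default chunk sizes are assumptions.

```python
import math
import re
from collections import Counter
from typing import List


def chunk(text: str, size: int = 40, overlap: int = 10) -> List[str]:
    """Split text into overlapping windows of `size` words (size > overlap)."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. A real system would use a model here."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(query: str, chunks: List[str], k: int = 1) -> List[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

The overlap between neighbouring chunks keeps a sentence that straddles a boundary retrievable from at least one chunk.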

Built on Industry-Leading Technologies

DecisionsAI is built with the best open source tools and frameworks. Each component is carefully selected for performance, privacy, and flexibility.

Pipecat: Real-Time AI Voice Framework - Real-time voice AI pipeline orchestration framework with frame-based streaming.

Vosk: Low Latency ASR Toolkit - Ultra-low latency offline speech recognition.

Whisper.cpp: Open-Source ASR Toolkit - High-accuracy offline speech recognition.

Ollama: AI Model Deployment - Local language model inference supporting various models including Llama, Gemma, and more.

OpenAI: API LLMs - Cloud-based language models including GPT-4, GPT-3.5, and more.

Kokoro: Text-to-Speech - High-quality offline text-to-speech with natural voice synthesis.

ElevenLabs: Text-to-Speech - Cloud-based text-to-speech with high-quality voice synthesis.

OpenInterpreter: Python Interpreter - Execute Python code through natural language commands.

PyAutoGUI: GUI Automation - Control any application on your computer through voice commands.

Pydantic: Data Validation - Robust data validation and settings management.

PyQt6: GUI Framework - Modern, responsive desktop application interface.

All technologies are open source or provide generous free tiers. DecisionsAI respects the licenses and contributions of each project, and we're grateful to the open source community for making this possible.

Download and Installation

DecisionsAI is available for free download from GitHub. This desktop application works on Windows, macOS, and Linux. Simply clone the repository, install dependencies, and start using your offline AI assistant.

System Requirements: Operating system: Windows, macOS, or Linux. RAM: 8GB minimum (16GB recommended for optimal performance). Python: 3.8 or higher. System dependencies: PortAudio and FFmpeg.

Thanks to Pipecat's optimized architecture, DecisionsAI can run efficiently on systems with 8GB RAM, though 16GB is recommended for the best experience with larger language models. The application includes cross-platform support with platform-specific optimizations for clipboard operations, keyboard shortcuts, and system paths.
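Before installing, the requirements above can be checked with a few lines of standard-library Python. This is a rough sketch, and the helper `check_requirements` is hypothetical. Library lookup behaves differently across platforms, so a missing PortAudio result may be a false negative, and the RAM check is left out because it needs a third-party package.

```python
import ctypes.util
import shutil
import sys
from typing import Dict, Tuple


def check_requirements(min_python: Tuple[int, int] = (3, 8)) -> Dict[str, bool]:
    """Report which of the listed requirements appear to be satisfied."""
    return {
        # Python 3.8 or higher.
        "python": sys.version_info >= min_python,
        # FFmpeg is looked up as an executable on PATH.
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # PortAudio is looked up as a shared library.
        "portaudio": ctypes.util.find_library("portaudio") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```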

Features

Voice-controlled AI assistant (English only).

Task automation and computer control.

Natural language processing.

Text-to-speech capabilities.

Customizable actions and commands.

Multi-model AI support (currently Ollama only).

Real-time voice AI pipeline powered by the Pipecat framework.

Open Source and Privacy

DecisionsAI is completely open source and available on GitHub. The entire codebase is transparent, allowing you to verify privacy and security. All processing happens locally on your machine by default. Your voice data, commands, and conversations never leave your computer unless you explicitly opt into cloud services.

This project is under active development. Current focus areas include improving voice recognition accuracy, enhancing offline capabilities, adding support for additional AI models, and improving dictation and transcription features.

Contact and Support

Contact support@tensology.com for any enquiries. Made with love by tensology.com. DecisionsAI is a desktop application - a native offline AI assistant for Windows, macOS, and Linux. This open source desktop app lets you control your computer with natural voice commands - completely offline, completely private. Download from GitHub and install on your desktop.