OpenGravity brings the intelligence of Copilot and Cursor to VS Code — powered entirely by your own hardware. No data leaves your machine. Ever.
A polished, zero-clutter interface docked directly inside VS Code.
Clean conversational interface with syntax-highlighted code blocks and one-click apply.
Automatically includes open files, cursor position, and selected text as context on every request.
Plan mode produces a structured implementation plan before touching any code.
Apply all code changes to disk with a single click — reliable for files of any size.
Built for serious development work — not demos.
Reads files, searches code, writes changes, and executes multi-step tasks without hand-holding. Up to 40 tool-call steps per request.
Zero data ever leaves your machine. No telemetry, no API keys, no cloud inference. Works fully offline on your own hardware.
Tokens stream to the screen as they're generated. See the agent's thinking in real time — no more waiting for a full response.
Search the web via Brave Search or self-hosted SearXNG, then fetch pages for full context. Docs, specs, GitHub files — all fair game.
Automatically writes PLAN.md at the end of every task. On the next session the agent reads it first — no context lost between conversations.
Ghost-text completions as you type using your local model. Supports Fill-in-the-Middle (FIM) for Qwen, DeepSeek-Coder, CodeLlama, and StarCoder.
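As a sketch of what happens under the hood (assuming a llama.cpp server on its default port), the same FIM capability can be exercised by hand against the server's `/infill` endpoint:

```sh
# Hypothetical manual FIM request; OpenGravity issues these automatically
# as you type. The server fills the gap between prefix and suffix.
curl http://localhost:8080/infill -d '{
  "input_prefix": "def add(a, b):\n    return ",
  "input_suffix": "\n",
  "n_predict": 16
}'
```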
The active editor gets priority in the context budget. VS Code diagnostics, git branch, recent commits, and uncommitted diffs are injected automatically.
Chat history survives VS Code restarts. When conversations grow large, older turns are automatically summarized to stay within context limits.
Run llama.cpp on a powerful desktop or server and connect from any laptop. Up to 128K context (configurable) over the network with a single URL change.
No accounts, no API keys, no cloud setup required.
Launch llama.cpp, Ollama, or LM Studio with any GGUF or compatible model. A coding-optimized model like Qwen2.5-Coder is recommended.
Download the .vsix and install it via VS Code's Extensions panel. Point OpenGravity at your server URL in settings — that's it.
Open the panel, type your task, and let the agent work. It reads your code, makes changes, and explains everything as it goes.
OpenGravity works with any local inference backend. Switch providers in settings with no restart required.
Native llama.cpp support — local or remote. Recommended for maximum performance. Context window is fully configurable (128K+ with Flash Attention).
Full native Ollama API support. Easiest setup for local development — install Ollama, pull a model, and go.
OpenAI-compatible API from LM Studio. Great for users who prefer a GUI model manager with an easy model library.
Any local server implementing the OpenAI chat completions API — vLLM, text-generation-webui, TabbyAPI, and more.
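Not sure whether a server qualifies? A quick smoke test against the standard chat completions route settles it (URL, port, and model name below are placeholders — substitute your own):

```sh
# Any server that answers this request should work as a provider.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-coder:7b",
    "messages": [{"role": "user", "content": "Reply with OK"}]
  }'
```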
Every tool runs locally inside VS Code — sandboxed to your workspace.
| Tool | Description | Default |
|---|---|---|
| list_files | List files in the workspace or a subdirectory | Always On |
| read_file | Read file content, optionally between line bounds | Always On |
| search_in_files | Plain-text search across workspace files with glob filtering | Always On |
| write_file | Create or overwrite a file — handles any size reliably | Always On |
| replace_in_file | Surgical find-and-replace — first match or all occurrences | Always On |
| apply_unified_diff | Apply a unified diff patch across one or more files in one step | Always On |
| fetch_url | Fetch any http/https URL as plain text — docs, specs, GitHub raw files | On |
| web_search | Search the web via Brave Search API or self-hosted SearXNG | Opt-in |
| run_terminal_command | Run a shell command in the workspace root — confirmation required | Opt-in |
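As a concrete illustration of the input apply_unified_diff expects, here is a minimal standard unified diff (the file path and contents are hypothetical):

```diff
--- a/src/utils.ts
+++ b/src/utils.ts
@@ -1,3 +1,3 @@
 export function greet(name: string) {
-  return "Hello " + name;
+  return `Hello, ${name}!`;
 }
```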
No account, no API key, no cloud setup. Just a local model and VS Code.
Launch Ollama, llama.cpp, or LM Studio with a coding-optimized model.
Download OpenGravity-Latest.vsix, then open VS Code → Extensions (Ctrl+Shift+X) → ··· menu → Install from VSIX…
Click the ⚙ icon in the OpenGravity panel to open settings and point it at your server.
Run OpenGravity: Test Connection from the Command Palette to confirm everything is working.
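Prefer the terminal? VS Code's CLI can install the extension directly, assuming the `code` command is on your PATH:

```sh
# Equivalent to Extensions → ··· menu → Install from VSIX…
code --install-extension OpenGravity-Latest.vsix
```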
```sh
# Pull a coding model
ollama pull qwen2.5-coder:7b
# It starts automatically

# Then in OpenGravity settings:
#   provider  = ollama
#   ollamaUrl = http://localhost:11434
#   model     = qwen2.5-coder:7b
```
```sh
# On your GPU machine:
llama-server -m model.gguf \
  --host 0.0.0.0 --port 8080 \
  -c 131072 -ngl 999 -fa 1

# In OpenGravity settings:
#   provider    = llamacpp
#   llamacppUrl = http://192.168.1.x:8080
```
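Before pointing OpenGravity at the remote server, it's worth confirming it is reachable from the laptop. The llama.cpp server exposes a `/health` endpoint for exactly this (replace the IP with your machine's):

```sh
# Should return a small JSON status once the model has loaded.
curl http://192.168.1.x:8080/health
```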
For the best experience, dock OpenGravity on the right so the Explorer stays on the left.
Cancel your $20/month subscription. OpenGravity gives you the same intelligence — privately, locally, and for free.