Overview

Live state of the evolving OpenClaw pod.

Active Pods

—

online

Suggestions Pending

—

awaiting review

Improvement suggestions generated by the analysis engine, awaiting your review.

Pod Health

—

not scanned

Last pod health scan result. Click to open the Health tab and run a scan.

Recovery

Recovery →

Host —

CPU

—

Memory

—

Disk

—

Load

—

Uptime

—

since boot

Pods

Make Your Pod Better

Loading recommendations…

Analytics

Performance, cost, and quality metrics across the pod.

Sessions

Cost

Classifier

Suggestions

Turns

Productive vs Maintenance Sessions

Resolution Rate Trend

Percentage of sessions where the user's goal was successfully completed.

Daily Cost by Bot (30d)

Model Tier Usage

Billing Mode

Spend Alert

Overall Accuracy

—

session classification accuracy

Accuracy Trend (weekly)

Recent Audit Results

Suggestion Funnel

Suggestions by Coach

Rejection Rate by Coach

Daily Turns — MAX vs API Key

By Source

By Model

By Channel

Raw Turns Drill-Down

Plugins

Per-bot external capabilities, grouped by surface. Messaging, LLM Providers, Tools, and Infrastructure are sections within this page. See Help.

Loading…

Plugins

Credentials

Embeddings

MCP Servers

Hooks

Activity

API Keys

One row per provider. Key values are masked (first 8 + last 4 chars). Writes go directly to auth-profiles.json.
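The masking rule above (first 8 + last 4 characters visible) can be sketched as follows. This is a minimal illustration, not the admin UI's actual code: the function name `mask_key`, the `…` filler, and the choice to hide short keys entirely are assumptions.

```python
def mask_key(key: str, head: int = 8, tail: int = 4, fill: str = "…") -> str:
    """Return `key` with only its first `head` and last `tail` characters visible."""
    # Assumption: a key too short to mask without revealing all of it
    # is hidden completely rather than partially shown.
    if len(key) <= head + tail:
        return fill
    return f"{key[:head]}{fill}{key[-tail:]}"


if __name__ == "__main__":
    # Made-up key value, for illustration only.
    print(mask_key("sk-ant-api03-abcdefghijklmnop"))
```

Hiding short keys outright avoids the case where head and tail overlap and the "masked" value leaks the whole secret.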

Loading…

Looking for usage analytics? See the Usage page for turns, cost, and cost/turn breakdowns by model, channel, and source.

Embedding Providers

Used for memory_search (semantic recall over a bot's notes). Embedding models are a different family from chat models, so the provider menu is shorter and shaped differently. The config resolves to a fallback chain, so OpenClaw fails over without evolve in the loop. Edit the chain order in AI Optimization.

Select a bot above to load embedding-provider config.

MCP Servers

Per-bot MCP server config from openclaw.json → mcp.servers. Phase A is read-only; install/remove/update lands with the proposal pipeline in Phase B. Spec: docs/spec-mcp-administration-2026-05-10.md.

Loading…

Plugins

Per-bot OpenClaw plugin entries from openclaw.json → plugins. Phase A is read-only; enable/disable + allow-list amendments land via proposals in Phase B. Spec: docs/spec-plugin-inventory-2026-05-10.md.

Loading…

Recent activity

Recent operator-clicked changes on this bot — Enable / Disable / Install / Remove / Configure across all surfaces. Each entry is an applied (or refused) proposal in the audit log. LLM-suggested proposals live on the Recommendations page.

Loading…

Hooks

Two hook surfaces per bot: webhook ingress (openclaw.json → hooks, external HTTP triggers) and plugin typed hooks (plugins.entries.<id>.hooks, per-plugin allowConversationAccess / allowPromptInjection). Phase A is read-only; mutation via proposals lands in Phase B. Spec: docs/spec-hook-governance-2026-05-10.md.

Loading…

Apps

Installed apps, gallery, forge jobs, and reliability — all in one place.

Loading…

Installed

Gallery

Forge Jobs

Reliability

Show archived

App Gallery

Browse, install, and manage gallery apps across your bots.

Loading gallery…

Forge Jobs

Track and approve forge runs for App Gallery installations and improvements.

How it works

The forge turns an app idea into a running app. Every app starts with a manifest — its spec, file list, and tests — that comes from a Gallery template, an Evo chat, or an RSI proposal. Claude drafts the code, critiques its own work over a few rounds, runs the app's tests, then either you approve it or — when the request was already a deliberate conversation — it ships automatically. The manifest is updated at the end so it always reflects what's on disk.

📋

Manifest

spec, files, tests

🛠️

Build

Claude drafts the code

🔍

Critique

2–3 self-review rounds

🧪

Test

runs the app's tests

🚀

Ship & Sync

approve, apply, refresh manifest

Manifest sources

gallery_template, evo_chat, rsi_proposal, spec_wizard, improvement_rerun

App	Bot	Type	Progress	Status	Started
Loading…

App Reliability

Per-app test-pass rates across the pod. Run the app-test-scheduler to collect data.

Loading…

Applications → Apps

This page has moved to Apps → Installed.

App Gallery → Apps

This page has moved to Apps → Gallery.

Forge Jobs → Apps

This page has moved to Apps → Forge Jobs.

Skills

Capability primitives installed on each bot: the building blocks your applications use.

Skills are what a bot CAN do: send a Slack message, read Gmail, search the web. Applications orchestrate skills toward a goal. OpenClaw plugins are proprietary skills; MCP servers are portable, agentskills.io-standard skills.

Loading…

Add a skill

Across pod

Loading…

Recommendations

Better Engine watches the pod and surfaces recommendations. You decide which ones land.

Loading…

How it works

A coach notices a problem and queues a suggestion for you. You approve, reject, snooze, or dismiss. Approved changes apply — automatically for config edits, or by you for hand-rolled instructions. Anything with a measurable claim gets a 7-day check-in to confirm it actually helped.

🔍

Spot

a coach notices

👤

You decide

approve / reject / snooze

⚙️

Apply

auto for config; you for instructions

📊

Check-in

after 7 days, when measurable

Active coaches

Loading…

Inbox

In Process

History

Coaches

Observations

Suggestion Queue

Dimension Urgency Audience Coach include snoozed

Loading…

Suggestions you've accepted that need offline follow-through. Refine to iterate, Mark complete when done.

Investigation and WorkflowInstruction suggestions don't have an automated action; they describe work for you to do. Once you've handled it, click Mark complete to close the suggestion. Use Refine to ask the bot to rewrite the suggestion based on your feedback.

Loading…

Suggestions that moved past the queue. Check-in confirms or refutes the measurable claim 7 days after apply.

Applied = waiting for check-in. Confirmed = the claimed metric improved as predicted. Refuted = it didn't, and the revert plan ran. Dismissed = the operator killed it before apply.

Status Bot Coach Since

Loading…

Optimizers, guardians, and meta-guardians that propose and review. Track record scales by verified outcomes.

Each coach has a charter (immutable identity), a status (active/paused/paused for review), and a track record (lifetime wins/losses) computed from its verified outcomes. Guardians run at duty (track record 1.0); optimizers compete for attention via track record × dimension weight × urgency.
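The attention formula above (track record × dimension weight × urgency) can be sketched as below. The `Coach` class, the `kind` labels, and the idea that guardians use the same product with their track record pinned at 1.0 are assumptions made for illustration; the text only states the product for optimizers.

```python
from dataclasses import dataclass


@dataclass
class Coach:
    name: str
    kind: str            # hypothetical labels: "guardian" or "optimizer"
    track_record: float  # assumed to lie in [0, 1]


def attention_score(coach: Coach, dimension_weight: float, urgency: float) -> float:
    """Score = track record x dimension weight x urgency."""
    # Assumption: a guardian's track record is treated as a fixed 1.0,
    # regardless of the value stored on the object.
    record = 1.0 if coach.kind == "guardian" else coach.track_record
    return record * dimension_weight * urgency


if __name__ == "__main__":
    coaches = [Coach("cost", "optimizer", 0.5), Coach("safety", "guardian", 0.2)]
    ranked = sorted(coaches, key=lambda c: attention_score(c, 2.0, 3.0), reverse=True)
    print([c.name for c in ranked])
```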

Loading…

The (noun × verb × mood) tuples the extraction layer pulls from each turn. Filter to spot mis-classification or confirm a behaviour cluster.

Each turn contributes one tuple to the observation stream. Coaches read aggregates over these tuples to decide whether to propose changes.

Bot Since (days) Noun Verb Mood

Select a bot…

Recovery → Maintenance

This page has moved to Maintenance → Recovery.

Settings

Pod modules, network settings, and identity configuration.

Loading…

Modules

Pod Config

Enable, disable, and tune each Evolve component.

Network settings, primary bot configuration, and identity claims.

Network

Bot

Identity

Alerts

Telegram channel that receives spend-cap warnings (spend_alert.py), classifier audit reports (classifier_audit.py), and forge job alerts. Stored at network.json → alerts.

Channel

Notification channel. Currently only "telegram" is implemented.

Chat ID

Telegram chat ID: the destination for alert messages. Get it by messaging @userinfobot on Telegram.

Primary Bot

The bot whose auth-profiles.json + tier_assignments are used by engine background LLM calls (analyzer, scanner, security warden, help bot, spec extractor). Also the default target for MCP context routing. Does NOT move daemon ownership, cron schedules, or report-delivery responsibilities; those follow each bot's "role" field and require a redeploy via `sudo evolve-admin deploy` to change. To shift those, see the bot setup CLI.

Primary Bot

Pick a bot registered in network.json. Changing this redirects engine LLM calls to the new bot's keys + tier choices on the next request.

Tier Resolution

What tier0/1/2/3 resolve to right now for engine background LLM calls (analyzer, scanner, security warden, help bot, spec extractor). Priority: primary bot's tier_assignments → pod-wide network.json override → hardcoded DEFAULT_TIERS in analyzer/models.py. Change the primary bot's per-tier models on the AI Optimization page; change which bot is "primary" on this tab.
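The three-level priority described above can be pictured as a layered lookup. The sketch below is an assumption-laden illustration: `resolve_tier`, the source labels, and the placeholder model names are hypothetical, and the real DEFAULT_TIERS in analyzer/models.py is not shown in this page.

```python
from typing import Mapping, Optional, Tuple

# Placeholder values only; the real table lives in analyzer/models.py.
DEFAULT_TIERS = {
    "tier0": "example-model-0",
    "tier1": "example-model-1",
    "tier2": "example-model-2",
    "tier3": "example-model-3",
}


def resolve_tier(
    tier: str,
    primary_bot_assignments: Optional[Mapping[str, str]],
    network_override: Optional[Mapping[str, str]],
    defaults: Mapping[str, str] = DEFAULT_TIERS,
) -> Tuple[str, str]:
    """Return (model, source) using the first layer that defines the tier."""
    layers = (
        ("primary_bot", primary_bot_assignments or {}),
        ("network_override", network_override or {}),
        ("default", defaults),
    )
    for source, mapping in layers:
        model = mapping.get(tier)
        if model:  # missing or empty entries fall through to the next layer
            return model, source
    raise KeyError(f"unknown tier: {tier}")


if __name__ == "__main__":
    print(resolve_tier("tier1", {"tier1": "bot-model"}, {"tier1": "net-model"}))
```

Returning the source alongside the model makes it easy to show where a resolved value came from.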

Loading…

App Testing

Periodic re-runs of per-app behavioral tests after forge, executed by the app-test-scheduler LaunchDaemon. Per-app manifests may override the cadence. Spec: docs/spec-app-testing-2026-05-07.md.

Default cadence

off: run once at forge, never re-run.
on_change: re-run when app code/manifest changes.
light (default): on_change + weekly.
strict: on_change + daily.
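The four cadence values above can be read as a small decision rule. The following sketch assumes a hypothetical `should_rerun` helper that is told whether the app changed and when it last ran; the actual scheduler's inputs and data model are not described here.

```python
from datetime import datetime, timedelta

# Periodic component of each cadence; "off" and "on_change" have none.
PERIODIC_INTERVAL = {
    "light": timedelta(weeks=1),
    "strict": timedelta(days=1),
}
VALID_CADENCES = {"off", "on_change", "light", "strict"}


def should_rerun(cadence: str, changed: bool, last_run: datetime, now: datetime) -> bool:
    """Decide whether an app's tests are due under the given cadence."""
    if cadence not in VALID_CADENCES:
        raise ValueError(f"unknown cadence: {cadence}")
    if cadence == "off":
        return False  # run once at forge, never re-run
    if changed:
        return True   # every non-off cadence includes on_change
    interval = PERIODIC_INTERVAL.get(cadence)
    return interval is not None and now - last_run >= interval


if __name__ == "__main__":
    start = datetime(2026, 5, 1)
    print(should_rerun("light", False, start, start + timedelta(days=8)))
```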

Scheduler enabled

Master switch for the periodic test scheduler. When off the LaunchDaemon still wakes hourly but the tick is a no-op.

Max runs per tick

Hard cap on test runs per scheduler tick. Apps over the cap are deferred to the next tick (most-overdue first). 0 disables the cap. Default: 10.
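The cap behaviour above (most-overdue first, 0 disables the cap) might look like the sketch below. The function name `pick_runs` and the input shape (a mapping from app name to seconds overdue) are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def pick_runs(overdue_seconds: Dict[str, float], max_runs: int = 10) -> Tuple[List[str], List[str]]:
    """Split due apps into (run this tick, deferred to the next tick)."""
    # Most-overdue apps come first.
    ordered = sorted(overdue_seconds, key=lambda app: overdue_seconds[app], reverse=True)
    if max_runs == 0:
        return ordered, []  # 0 disables the cap
    return ordered[:max_runs], ordered[max_runs:]


if __name__ == "__main__":
    run_now, deferred = pick_runs({"notes": 10, "inbox": 50, "tasks": 30}, max_runs=2)
    print(run_now, deferred)
```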

Select a bot…

Display Name

What this bot identifies as in chat (Telegram, Slack, etc.) and across the admin UI. Wraps upstream `openclaw agents set-identity`; the bot's internal id, workspace, and macOS user stay untouched. Empty means the UI falls back to the bot id.

Select a bot above…

Model Config

Resolved primary model and fallbacks for this bot, read from its openclaw.json agents.defaults.model.

Select a bot above…

Compaction Settings

How this bot compacts its conversation context as it grows. Read from openclaw.json agents.defaults.compaction.

Select a bot above…

Tier Classification Keywords (read-only)

Vocabulary the session classifier (TierClassifier.ts / LLMTierClassifier.ts) uses to label sessions productive vs. maintenance. Maintenance-classified sessions get downgraded to tier3 (Haiku) for the next assistant turn. Today, calibration is pod-wide and there is no calibration writer in the codebase, so these are the hardcoded base lists. Per-bot calibration is a planned follow-up.

Select a bot above…

Slack Policy

Slack channel/user policy for this bot. Policy lives in shared_dir/bots/<bot>/slack-policy.json and is rendered to openclaw.json by the writer. Run `evolve-admin slack-doctor <bot>` on the host for the same state in a terminal.

Select a bot above…

Full openclaw.json (sanitized, read-only)

The bot's complete openclaw.json with secrets masked. Manage via this admin UI or, on the host, with: sudo -u {bot} openclaw config set <key> <value>.

Select a bot above…

Direct host edits: sudo -u {bot} openclaw config set <key> <value>.

Pod Admin

Pod-wide: any user listed here is treated as an admin on the matching channel for every multi-user bot. Get an external_id from the user's chat profile (Slack user IDs start with U; Telegram IDs are numeric).

Channel

External ID

Pod user (optional)

Per-bot Primary User

Multi-user bots only. The primary user is the bot's owner — improvement proposals route to them by default, and per-user quality signals partition around them.

Loading…

Maintenance

Bot gateway status, cron jobs, infrastructure, and logs.

Status is probed live via direct HTTP + process check. Cron data comes from openclaw cron list --json via OC CLI.

Loading…

Status

Cron Jobs

Infra Jobs

OC Version

Gateway Logs

Admin Server

Setup

Claude Access

System

Pod Health

Recovery

Auto-refresh

Loading…

Status probed live via HTTP (port /evolve/status) with process-check fallback.

Aggregated cron jobs from all bots via openclaw cron list --json

Auto-refresh (60s)

Loading…

Evolve/OpenClaw launchd jobs installed in /Library/LaunchDaemons/

Loading…

Installed OpenClaw version vs latest on npm registry

Loading…

Auto-refresh (10s)

 Loading…

evolve-admin Flask server running as a macOS launchd service

Service Status

Loading…

First-time setup?

Use the Setup Wizard to install the persistent tunnel on your laptop and configure browser shortcuts.

Diagnostic Snapshots

Capture system info, configuration, and recent error logs into a JSON snapshot stored in ~/.evolve/reports/. To open a GitHub issue with the snapshot attached, use the floating 💬 Send feedback button instead.

Note (optional)

Recent Snapshots

(loading…)

1 · Server

2 · SSH Keys

3 · Configure

4 · Install

5 · URL Shortcut

6 · Verify

Step 1: Persistent Server

The admin server should run as a launchd service so it starts automatically at login and restarts if it crashes. No terminal session required.

Step 2: SSH Key Setup

For the tunnel to reconnect automatically (without a password prompt), you need passwordless SSH from your laptop to this machine.

Run these commands on your laptop:

# Generate a key if you don't have one
ssh-keygen -t ed25519 -C "evolve-tunnel"
# Copy it to the Mac Mini
ssh-copy-id -i ~/.ssh/id_ed25519 pod_admin_user@mini

If you already have SSH key auth working, you can skip this step.

Step 3: Tunnel Configuration

Configure the SSH tunnel parameters. These will be baked into the setup script you download.

Mac Mini SSH hostname or alias

SSH username on Mac Mini

Admin server port (on Mac Mini)

Local port (on your laptop)

SSH private key path (on laptop)

Step 4: Install Tunnel on Laptop

Download the setup script and run it on your laptop. Double-click the downloaded file in Finder — it will open Terminal and complete automatically.

📦 evolve-tunnel-setup.command

Installs autossh + launchd agent. Run once. Tunnel persists across reboots.

🔗 evolve-tunnel-connect.command

One-shot connect script. Save to your Desktop as a manual reconnect backup.

After running the setup script, the tunnel will start immediately and reconnect automatically whenever your laptop reboots or the connection drops.

Step 5: Browser URL Shortcut

Instead of typing http://localhost:5050, set up a keyword shortcut so you just type evolve in the address bar.

Chrome / Arc

Open Settings ⌘, → Search engine → Manage search engines
Click Add next to Site Search
Name: Evolve Admin; Keyword: evolve; URL: http://localhost:5050
Save. Now type evolve in the address bar and press Tab → Enter.

Firefox

Navigate to http://localhost:5050 and bookmark it ⌘D
Right-click the bookmark → Properties
Set the Keyword field to: evolve
Now type evolve in the address bar and press Enter.

Safari

Safari doesn't support keyword bookmarks.
Navigate to http://localhost:5050
File → Add to Dock… creates a clickable web app icon in your Dock.

Step 6: Verify

Let's confirm everything is working.

Click the button below to test the connection.

MCP Bridge

Loading…

Bot Context Access

Loading…

Claude Desktop Config

Paste this into ~/Library/Application Support/Claude/claude_desktop_config.json on each machine running Claude Desktop, then restart Claude Desktop.

Click to copy

CLAUDE.md: ~/CLAUDE.md — instructs Claude to call get_context at session start.

Recent Activity

Writes only

Loading…

Disk Usage

Loading…

Evolve Version & Sync Status

Loading…

Upgrade Evolve Pod

Pulls latest code from git, rebuilds the plugin, and redeploys to all bots. Your configuration and data are preserved.

Skip plugin rebuild

Dry run (preview only)

⚠ Danger Zone

Removes all Evolve launchd jobs, bot workspace files, and optionally shared data (metrics, proposals, capabilities).

Never scanned

Click Scan Now to check pod health.

Pause everything

Disables every bot gateway across the pod. Bot data stays exactly where it is — only the running message handlers stop. Reversible. Typically completes in well under 30 seconds.

Roll a bot back to a known-good day

Every bot's config is checkpointed nightly via git backup. Pick a bot, choose a date, and we'll revert its openclaw.json to that commit and restart its gateway. The current state is snapshotted first — so the rollback itself can be reversed.

Bot

Rollback to

Recent pod-state commits

The last 7 days of code commits to this pod's deploy checkout. Rolling back here resets the entire pod codebase — configs, gallery manifests, and code — to that snapshot. The admin-UI restarts automatically after rollback. Destructive and pod-wide.

Loading…

Per-bot rollback history

Past per-bot config rollbacks. Each one can be reversed — that just performs another rollback, restoring the pre-rollback config.

Loading…

Pod-state rollback log

History of pod-state (codebase) rollbacks performed from this UI.

Loading…

Pause-all audit log

Who hit pause, when, and what happened on each gateway.

Loading…

Reports

Operator-configured digests, notification subscriptions, and the proposal watchlist.

Loading…

Subscriptions

Alerts

Watchlist

Messages

Configure

Recent Messages

include suppressed

What the dispatcher actually sent (or tried to send) to the Evolve bot's chat thread. Same messages your phone receives.

Loading…

Digest Delivery

When the daily Pod Report digest goes out, and how loud the empty-state runs.

Loading…

Notification Subscriptions

Pick which event types reach the Evolve bot's chat thread and how often. Defaults are calibrated for typical pods; the operator can mute, throttle, or batch any event type. Source-level on/off lives in Settings → Pod Config; this page is for everyday notification preferences. The Pod Report's content is gated separately by Alerts → Configure (sensitivity thresholds).

Loading…

Firing

History

Configure

Firing Alerts (needing action)

Loading…

show info-tier signals

Alert State Change History

Loading…

Sensitivity

Tunes pod_report's anomaly detectors — the magnitude floors that decide whether a condition becomes an Alert (and shows up in the daily Pod Report digest). The first row sets the pod default; each bot row overrides it. Tighter values = more findings. Leave a cell blank to inherit the pod default.
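The inheritance rule above (a blank bot cell falls back to the pod default) can be sketched as a one-step lookup. The name `effective_floor` and the idea that cells arrive as strings from a form are assumptions for illustration.

```python
from typing import Optional


def effective_floor(pod_default: float, bot_cell: Optional[str]) -> float:
    """Return the bot's override if one is set, otherwise the pod default."""
    # A missing or blank cell means "inherit the pod default".
    if bot_cell is None or bot_cell.strip() == "":
        return pod_default
    return float(bot_cell)


if __name__ == "__main__":
    print(effective_floor(0.2, ""), effective_floor(0.2, "0.05"))
```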

Loading…

Tracking

Candidates the synthesizer is watching. The signal is worth noting but doesn't yet warrant operator action.

Loading…

Awaiting synthesis

Substrate-level candidates — the same condition fired on three or more bots. These suggest a pod-wide fix (a default change in evolve itself) and wait for the LLM synthesizer to draft the right action.
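The "three or more bots" rule above amounts to counting distinct bots per condition. A minimal sketch, assuming firings arrive as hypothetical (bot, condition) pairs; the synthesizer's real data model is not shown here.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def substrate_candidates(firings: Iterable[Tuple[str, str]], min_bots: int = 3) -> List[str]:
    """Return conditions that fired on at least `min_bots` distinct bots."""
    bots_by_condition = defaultdict(set)
    for bot, condition in firings:
        # A set means repeated firings on the same bot count once.
        bots_by_condition[condition].add(bot)
    return sorted(c for c, bots in bots_by_condition.items() if len(bots) >= min_bots)


if __name__ == "__main__":
    sample = [("a", "cache_invalidated"), ("b", "cache_invalidated"), ("c", "cache_invalidated")]
    print(substrate_candidates(sample))
```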

Loading…

Recently dropped (last 7d)

Candidates the gate filtered out. Use this view to tune magnitude floors and repetition windows in proposal_synthesizer/config.py — if something here looks like it should have surfaced, the threshold for that variant is too high.

Loading…

Alerts → Maintenance

Alerts have moved to Maintenance → Alerts.

Security

Audit findings and security state per bot.

Loading…

Audit

Advanced view ▾

Security Audit

Loading…

Sessions → Usage

This page has moved to Usage → Sessions.

Usage

AI API usage by model, source, and channel — with billing breakdown.

Spend

Sessions

Today

7d

30d

Custom…

Usage Summary

Loading…

Activity Composition (by trigger)

Loading…

Stack by:

Activity Timeline — Turns by Model

Activity Timeline — Turns by Trigger

Day detail

Click a day on either timeline above to see the top sessions for that day.

By Model

No data

By Channel

No data

By Source

No data

By User (top 10)

No data

Context Health

Loading…

Anthropic Admin Cost Report cross-check

Pulls the org-level cost report from Anthropic's Admin API so the operator can verify the locally-derived cost ledger against the authoritative numbers. Requires an admin org API key — drop {"api_key": "sk-ant-admin-..."} at /Users/Shared/evolve/anthropic-admin-key.json (mode 600, evolve-owned). Phase A is on-demand fetch only; Phase B adds daily ingest + a divergence signal when the two sources disagree.

Click Fetch from Anthropic to load the cost report.

7d

30d

90d

When this bot is active

Cost-event count by hour of day (UTC) × day of week. Reveals each bot's natural rhythm: a steady fingerprint that shifts when something changes.
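The hour × weekday grid above is a simple bucketed count. The sketch below assumes timezone-aware event timestamps and a hypothetical `activity_grid` helper; weekdays follow Python's convention (Monday = 0).

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Iterable


def activity_grid(timestamps: Iterable[datetime]) -> Counter:
    """Count events per (weekday, hour) cell, with both taken in UTC."""
    grid = Counter()
    for ts in timestamps:
        # Timestamps must be timezone-aware; naive values would be
        # interpreted as local time by astimezone().
        utc = ts.astimezone(timezone.utc)
        grid[(utc.weekday(), utc.hour)] += 1
    return grid


if __name__ == "__main__":
    events = [datetime(2026, 5, 11, 14, 30, tzinfo=timezone.utc)]
    print(activity_grid(events))
```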

Sessions per day

Distinct sessions counted per day. A session is "active on a day" if any of its cost events land on that date. Useful for spotting growth, abandonment, or burst days.
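The "active on a day" definition above can be sketched as a distinct count per date. The input shape, (session_id, date) pairs with one pair per cost event, is an assumption for illustration.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple


def sessions_per_day(events: Iterable[Tuple[str, date]]) -> Dict[date, int]:
    """Count distinct sessions with at least one cost event on each date."""
    active = defaultdict(set)
    for session_id, day in events:
        active[day].add(session_id)
    return {day: len(ids) for day, ids in sorted(active.items())}


if __name__ == "__main__":
    sample = [("s1", date(2026, 5, 10)), ("s1", date(2026, 5, 11)), ("s2", date(2026, 5, 11))]
    print(sessions_per_day(sample))
```

A session that spans midnight counts once on each date, which matches the definition given in the text.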

Inter-turn gap distribution

Seconds between consecutive user turns within a session. The cache TTL must outlive the gap to reap cacheRead savings. If most gaps are above your TTL, you're paying cacheWrite over and over. Median and p95 shown below.
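The gap statistics above can be computed from a session's turn timestamps. This sketch assumes timestamps in seconds, sorted ascending, and uses a nearest-rank p95; the dashboard's exact percentile method is not stated, so treat that choice as an assumption.

```python
import math
import statistics
from typing import Optional, Sequence


def gap_summary(turn_times: Sequence[float], ttl_seconds: float) -> Optional[dict]:
    """Summarize gaps between consecutive turns against a cache TTL."""
    gaps = [later - earlier for earlier, later in zip(turn_times, turn_times[1:])]
    if not gaps:
        return None  # fewer than two turns means there is no gap to measure
    ordered = sorted(gaps)
    # Nearest-rank percentile: the smallest value covering 95% of gaps.
    p95 = ordered[math.ceil(0.95 * len(ordered)) - 1]
    above = sum(1 for g in gaps if g > ttl_seconds)
    return {
        "median": statistics.median(gaps),
        "p95": p95,
        "share_above_ttl": above / len(gaps),
    }


if __name__ == "__main__":
    print(gap_summary([0, 10, 30, 330, 340], ttl_seconds=60))
```

A high `share_above_ttl` is the condition the text warns about: most turns arrive after the cache has expired.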

Cache health over time

Per-day stacked counts of cost events by cache_state. Invalidated turns paid cacheWrite cost without cacheRead savings, typically because the TTL is too short. Fresh = first turn of a session; cannot be cached.

Token decomposition

Token totals across the window. The operator converts tokens to dollars using the model's published prices: cacheRead is ~10% of the input price; cacheWrite is ~125%. Total cost shown below.
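The weighting above can be worked through with a short calculation. The sketch assumes prices quoted in dollars per million tokens and a hypothetical token-count dictionary that also includes output tokens; the prices in the usage line are illustrative, not any model's actual rates.

```python
from typing import Mapping


def estimate_cost(
    tokens: Mapping[str, int],
    input_price: float,              # dollars per million input tokens
    output_price: float,             # dollars per million output tokens
    cache_read_ratio: float = 0.10,  # cacheRead is ~10% of input price
    cache_write_ratio: float = 1.25, # cacheWrite is ~125% of input price
) -> float:
    """Approximate dollar cost of a token decomposition."""
    weighted = (
        tokens.get("input", 0) * input_price
        + tokens.get("output", 0) * output_price
        + tokens.get("cacheRead", 0) * input_price * cache_read_ratio
        + tokens.get("cacheWrite", 0) * input_price * cache_write_ratio
    )
    return weighted / 1_000_000


if __name__ == "__main__":
    window = {"input": 200_000, "output": 50_000, "cacheRead": 3_000_000, "cacheWrite": 400_000}
    print(round(estimate_cost(window, input_price=3.0, output_price=15.0), 2))
```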

Cost per session

Distribution of session costs (heavy-tailed, roughly logarithmic bins). The top 10 most expensive sessions are listed below: the "go read these" surface.

Context size trajectory

Average prompt tokens (input + cacheRead + cacheWrite) per event per day. Catches conversation bloat that the cache-state signals don't see directly.
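The daily average above is prompt tokens summed per event, then averaged per day. A minimal sketch, assuming each event is a dictionary with hypothetical `day`, `input`, `cacheRead`, and `cacheWrite` fields.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Mapping


def avg_prompt_tokens_per_day(events: Iterable[Mapping]) -> Dict[date, float]:
    """Average (input + cacheRead + cacheWrite) tokens per event, per day."""
    totals = defaultdict(lambda: [0, 0])  # day -> [token sum, event count]
    for ev in events:
        prompt = ev["input"] + ev["cacheRead"] + ev["cacheWrite"]
        bucket = totals[ev["day"]]
        bucket[0] += prompt
        bucket[1] += 1
    return {day: total / count for day, (total, count) in sorted(totals.items())}


if __name__ == "__main__":
    sample = [
        {"day": date(2026, 5, 10), "input": 100, "cacheRead": 900, "cacheWrite": 0},
        {"day": date(2026, 5, 10), "input": 200, "cacheRead": 1800, "cacheWrite": 0},
    ]
    print(avg_prompt_tokens_per_day(sample))
```

A steadily rising average across days is the "conversation bloat" signal the text describes.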

Sessions to read

Top sessions worth investigating, grouped by ground-truth criteria. A session showing up in multiple categories (e.g. expensive AND highly invalidated) is informative: that's where to look first.

Cost Optimization

Budget caps, warning thresholds, and spend governance — the levers you set to keep costs in bounds. For observational data, see the Usage page.

Loading…

Cost Efficiency Score

Loading…

Context & Session Settings

Loading…

Cost Profiles

Loading…

Spending Caps & Enforcement

Loading…

Runaway Session Monitor

Loading…

Turn Audit

Loading…

Spike Explorer

Top LLM calls by cost (Budget Hawk v2)

Loading…

AI Optimization

Session classification, model routing, and tier configuration.

Loading…

Model Freshness

Compares each bot's tier assignments to the latest recommended models for the providers you have keys for.

Loading…

Model Catalog

Select a bot above to manage its model catalog.

Tier Definitions

Which models live at each tier

Loading tier config…

Routing Rules

Select a bot above to load routing config.

Fallback Configuration

Loading fallback config…

Embedding Providers

Order = OpenClaw failover for memory_search

Loading embedding config…

Live Session Routing Status

Phase 2

Real-time session tier assignments, confidence scores, and manual overrides. Coming in Phase 2.

How Sessions Are Routed

Classifier accuracy

—

Session Classification Quality

Every Saturday, Evolve audits whether sessions were classified correctly.

Overview

Analytics

Plugins

Apps

New Application

Recommended for

App Gallery

Import App Package

Install App

Forge Jobs

App Reliability

Applications → Apps

App Gallery → Apps

Forge Jobs → Apps

Skills

Recommendations

Suggestion Queue

Recovery → Maintenance

Settings

Maintenance

Upgrade Safety Report

Step 1: Persistent Server

Step 2: SSH Key Setup

Step 3: Tunnel Configuration

Step 4: Install Tunnel on Laptop

Step 5: Browser URL Shortcut

Step 6: Verify

⚠ Danger Zone

Pause everything

Roll a bot back to a known-good day

Recent pod-state commits

Per-bot rollback history

Pod-state rollback log

Pause-all audit log

MCP Server Catalog

Edit plugin hook policy

Edit hook baseline

Plugin Catalog

Edit plugin config

Open Advisories

Probe History

Install MCP Server

Add API Key

Rotate Key

View config

Set up integration

Set up Google Workspace

Add a skill

Edit Model

Reports

Alerts → Maintenance

Dismiss signal

Security

Sessions → Usage

Usage

Cost Optimization

AI Optimization

Add Bot

Review & Deploy

Deploying…

Import API Usage from Logs

Create a New App

Upgrading Evolve…

Uninstall Evolve