  • contact@verticalserve.com
Install in your environment

Install InsightTestBench

Self-host in one command. Bench API + UI + MySQL + the worker engine all come up via docker-compose. Bring your own provider credentials and the URL of the app you want to test.

One-command install

The reference install is docker-compose — bench API + UI + MySQL + the InsightWorker engine in one stack. Suitable for a laptop, a single VPS, or a small-team self-host. Scale up to K8s when you've outgrown one box.

Recommended — works for everything from laptop to team self-host

docker-compose

All four services come up together. Migrations run on every bench start so a clean MySQL gets the schema automatically. Persistent volumes keep your projects, screenshots, and history between restarts.

Install:

git clone https://github.com/verticalserve/insighttestbench
cd insighttestbench
cp .env.example .env
# fill in JWT_SECRET (`openssl rand -base64 48`)
# fill in AWS creds (Bedrock) or ANTHROPIC_API_KEY
docker compose up -d --build
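
If you would rather script the JWT_SECRET step than edit .env by hand, a small helper along these lines may do it. This is a sketch, not part of the InsightTestBench repo: set_env_var is a hypothetical function name, and it assumes .env holds plain KEY=value lines.

```shell
# set_env_var FILE KEY VALUE
# Replace the KEY=... line in FILE, or append it if the key is not there yet.
# Assumption: VALUE contains no '|' or '&' (true for base64 output from openssl).
set_env_var() {
  file=$1
  key=$2
  value=$3
  if grep -q "^${key}=" "$file"; then
    # '|' is the sed delimiter because base64 output can contain '/'.
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Usage, from the repository root after `cp .env.example .env`:
#   set_env_var .env JWT_SECRET "$(openssl rand -base64 48)"
```

The write goes through a temporary file rather than `sed -i`, because the in-place flag behaves differently on GNU and BSD sed.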

Then:

  • UI: http://localhost:5273
  • API: http://localhost:8200
  • Docs: http://localhost:8200/docs
  • Click Sign in / register in the top-right avatar menu — first user becomes admin automatically.
  • Click Create from brief, paste a sentence about an app you want to test, and watch the bench bootstrap.
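
The first build can take a while, so the URLs above may not answer straight away. One way to wait for the stack is a small retry loop like the sketch below. The retry function is hypothetical, and treating a successful response from the /docs URL as "the API is up" is an assumption, not a documented health check.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds or ATTEMPTS tries have been used.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 0
}

# Usage: poll the docs endpoint for up to about a minute.
#   retry 30 2 curl -fsS -o /dev/null http://localhost:8200/docs && echo "bench API is up"
```
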
Dev-only fast-start: leave DEV_MODE=true in .env and the bench skips auth entirely, injecting a synthetic admin user into every request. Set DEV_MODE=false before exposing the bench anywhere shared.

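
Because DEV_MODE=true is easy to forget, a deploy script could refuse to continue while it is set. The check below is a sketch: dev_mode_enabled is a hypothetical helper, and it assumes the flag is written exactly as `DEV_MODE=true` on its own uncommented line.

```shell
# dev_mode_enabled FILE
# Succeeds (exit 0) when FILE contains an uncommented DEV_MODE=true line.
dev_mode_enabled() {
  grep -Eq '^DEV_MODE=true[[:space:]]*$' "$1"
}

# Usage, before exposing the stack beyond localhost:
#   if dev_mode_enabled .env; then
#     echo "DEV_MODE=true: auth is disabled, refusing to continue" >&2
#     exit 1
#   fi
```
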
For multi-team production

Kubernetes (Helm)

Helm chart with horizontal scaling, rolling upgrades, ingress via your existing controller, customer-managed MySQL / RDS. Coming with the next release — talk to us if you need it sooner.

In the meantime, the docker-compose stack is the supported production install for single-team deployments.

Hosted (managed)

Don't want to run your own stack? We can host the bench in our environment and point it at your staging apps via outbound HTTPS. Your secrets stay yours — we never see them. Per-customer instance, not multi-tenant.

Contact us for hosted pricing + setup.

Prerequisites

Required

  • Docker with the Compose plugin, since the install runs `docker compose` (Docker Desktop on Mac/Windows includes it and works for local dev)
  • ~8 GB free disk for the Chromium download + image layers
  • A model provider — AWS Bedrock with Anthropic models, OR an Anthropic direct API key. (OpenAI / Azure / Gemini supported via the worker's provider config.)
  • JWT_SECRET — random 32+ char string. openssl rand -base64 48 works.
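
The 32-character minimum for JWT_SECRET is simple to check before starting the stack. A minimal sketch, assuming the secret sits on a `JWT_SECRET=` line in .env (jwt_secret_ok is a hypothetical name):

```shell
# jwt_secret_ok FILE
# Succeeds when FILE has a JWT_SECRET value of at least 32 characters.
# If the key appears more than once, the last occurrence is checked.
jwt_secret_ok() {
  secret=$(sed -n 's/^JWT_SECRET=//p' "$1" | tail -n 1)
  [ "${#secret}" -ge 32 ]
}

# Usage:
#   jwt_secret_ok .env || echo "JWT_SECRET is missing or shorter than 32 characters" >&2
```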

Optional

  • Credentials of the app under test — only needed if the app requires login. Bench stores the env-var NAMES only; values stay in your environment.
  • Webhook URL — Slack incoming-webhook, Discord, generic, etc., for scheduled-run notifications.
  • Reverse proxy: Caddy / nginx-proxy / Traefik in front for TLS termination when exposing publicly.
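
Since the bench stores only the names of the credential variables, the values have to be present in the environment at run time. The sketch below checks that by name. missing_env_vars is a hypothetical helper, and STAGING_APP_USER / STAGING_APP_PASSWORD are made-up names standing in for whatever you register with the bench.

```shell
# missing_env_vars NAME...
# Print each NAME that is unset or empty in the environment.
# Exit status is 0 only when every NAME has a value.
missing_env_vars() {
  missing=0
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "$name"
      missing=1
    fi
  done
  [ "$missing" -eq 0 ]
}

# Usage (variable names are illustrative):
#   export STAGING_APP_USER=qa-bot
#   export STAGING_APP_PASSWORD=...   # value lives only in your environment
#   missing_env_vars STAGING_APP_USER STAGING_APP_PASSWORD || echo "export these first" >&2
```

printenv only sees exported variables, which matches what a child process such as a container would inherit.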

Configure your LLM provider

The bench needs at least one provider configured. Most operations use a strong model (Claude Opus 4.7 by default), and the RCA (root-cause analysis) agent uses a fast model (Claude Haiku 4.5). Pick whichever provider your org already pays for.

AWS Bedrock (default)

AWS_REGION=us-east-1
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
BEDROCK_MODEL=us.anthropic.claude-opus-4-7-v1
BEDROCK_MODEL_FAST=us.anthropic.claude-haiku-4-5-20251001-v1

Azure OpenAI

AZURE_OPENAI_ENDPOINT=https://my.openai.azure.com
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_DEPLOYMENT=my-gpt-deployment

Custom (on-prem GPU)

CUSTOM_LLM_BASE_URL=http://gpu-box.internal:8000/v1
CUSTOM_LLM_API_KEY=...   # if your endpoint needs it
CUSTOM_LLM_MODEL=meta-llama/Llama-3.1-70B-Instruct

Anthropic direct

ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_MODEL=claude-opus-4-7
ANTHROPIC_MODEL_FAST=claude-haiku-4-5
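
A quick way to confirm that at least one of these blocks is filled in is to check which variables are non-empty. The sketch below does only that. detect_provider is a hypothetical helper, and both the set of variables it treats as required and the order it checks them in are assumptions, not the bench's actual selection logic.

```shell
# detect_provider
# Print the first provider whose variables are all non-empty, or "none".
# Returns non-zero when no provider block looks complete.
detect_provider() {
  if [ -n "${AWS_REGION:-}" ] && [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo bedrock
  elif [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo anthropic
  elif [ -n "${AZURE_OPENAI_ENDPOINT:-}" ] && [ -n "${AZURE_OPENAI_API_KEY:-}" ] && [ -n "${AZURE_OPENAI_DEPLOYMENT:-}" ]; then
    echo azure-openai
  elif [ -n "${CUSTOM_LLM_BASE_URL:-}" ] && [ -n "${CUSTOM_LLM_MODEL:-}" ]; then
    echo custom
  else
    echo none
    return 1
  fi
}

# Usage, assuming .env contains only shell-compatible KEY=value lines:
#   set -a; . ./.env; set +a
#   detect_provider
```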

Need help with the install?

Tell us about your environment and we'll walk you through a 30-minute pilot install on a sandbox VM.
