Integration Guide
Set up Kill Switch financial protection for your project. There are three integration paths; use whichever fits your architecture.
Integration Paths
| Method | Best For | Setup Time |
|---|---|---|
| REST API | Direct HTTP integration, CI/CD, webhooks | ~5 min |
| Edge Agent | Cloudflare Workers monitoring with local kill-switch | ~10 min |
| Spend Guard | In-app budget limits for GPU/AI services (RunPod, Gemini, etc.) | ~15 min |
1. REST API
The Kill Switch API at https://api.kill-switch.net provides full programmatic control over monitoring, rules, and kill switches.
Quick Start
# Health check
curl https://api.kill-switch.net/
# List supported providers
curl https://api.kill-switch.net/providers
# List preset rules (no auth needed)
curl https://api.kill-switch.net/rules/presets
# Trigger a kill switch (requires auth)
curl -X POST https://api.kill-switch.net/rules/agent/trigger \
  -H "Authorization: Bearer YOUR_JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "threatDescription": "RunPod spend exceeded $50 in 2 hours",
    "severity": "critical",
    "recommendedActions": [
      { "type": "disconnect", "target": "my-worker" }
    ],
    "autoExecute": false
  }'
Key Endpoints
| Endpoint | Auth | Purpose |
|---|---|---|
| GET /providers | None | List Cloudflare, GCP, AWS, RunPod with default thresholds |
| GET /rules/presets | None | DDoS, cost-runaway, GPU-runaway, Lambda-loop, etc. |
| POST /cloud-accounts | JWT | Connect a cloud provider for monitoring |
| POST /check | JWT | Run monitoring check on all accounts |
| POST /rules/agent/trigger | JWT | AI agent triggers a kill switch |
| POST /agent/report | API Key | Edge agent submits metrics |
| GET /analytics/overview | JWT | FinOps dashboard (daily costs, savings) |
| POST /database/kill | JWT | Initiate database kill sequence |
Full API reference: API Docs (OpenAPI)
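The authenticated trigger call from the Quick Start can also be prepared from TypeScript. The following is a minimal sketch, not an official client: the URL, headers, and body fields come from the curl example above, while the type names and the `buildTriggerRequest` helper are hypothetical. It only builds the request, so it can be inspected or logged before anything is sent.

```typescript
// Base URL and body fields are taken from the curl example in this guide.
// Type and function names are illustrative, not part of a Kill Switch SDK.
const KILL_SWITCH_API = "https://api.kill-switch.net";

interface RecommendedAction {
  type: string;   // e.g. "disconnect", as in the curl example
  target: string; // e.g. a Worker name
}

interface TriggerBody {
  threatDescription: string;
  severity: string; // the guide shows "critical"; other values are not listed here
  recommendedActions: RecommendedAction[];
  autoExecute: boolean;
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds (but does not send) the POST /rules/agent/trigger request.
function buildTriggerRequest(jwt: string, body: TriggerBody): PreparedRequest {
  if (jwt.trim() === "") {
    throw new Error("A JWT is required for POST /rules/agent/trigger");
  }
  return {
    url: `${KILL_SWITCH_API}/rules/agent/trigger`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${jwt}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  };
}

const triggerRequest = buildTriggerRequest("YOUR_JWT", {
  threatDescription: "RunPod spend exceeded $50 in 2 hours",
  severity: "critical",
  recommendedActions: [{ type: "disconnect", target: "my-worker" }],
  autoExecute: false,
});
```

The prepared object can be handed to any HTTP client, for example `fetch(triggerRequest.url, triggerRequest)`. Keeping `autoExecute` set to `false` mirrors the curl example.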
2. Edge Agent (Cloudflare Workers)
The edge agent is a lightweight Cloudflare Worker that runs in your account. Your API tokens never leave your infrastructure.
Deploy
# Clone and deploy
git clone https://github.com/divinci-ai/kill-switch.git
cd kill-switch/packages/agent
# Edit thresholds in wrangler.toml, then:
wrangler deploy
# Set your Cloudflare credentials (they stay in your account)
wrangler secret put CLOUDFLARE_API_TOKEN
wrangler secret put CLOUDFLARE_ACCOUNT_ID
Configuration
# wrangler.toml
[vars]
GUARDIAN_API_URL = "https://api.kill-switch.net"
DO_REQUEST_THRESHOLD = "1000000"
DO_WALLTIME_HOURS_THRESHOLD = "100"
WORKER_REQUEST_THRESHOLD = "10000000"
[triggers]
crons = ["*/5 * * * *"] # Check every 5 minutes
What It Monitors
| Metric | Default Threshold | Action on Breach |
|---|---|---|
| Durable Object requests/day | 1,000,000 | Auto-disconnect routes |
| DO wall-time hours/day | 100 hours | Auto-disconnect routes |
| Worker requests/day | 10,000,000 | Auto-disconnect routes |
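To make the threshold semantics concrete, here is a small sketch of how the `[vars]` values could be compared against one day's usage. It is an assumption-laden illustration rather than the agent's actual code: the `DailyUsage` shape and the `findBreaches` and `parseThreshold` helpers are hypothetical, and treating a breach as "strictly above the threshold" is a guess. The variable names and default values are the ones shown in wrangler.toml.

```typescript
// Mirrors the [vars] block in wrangler.toml; the values are strings there.
interface AgentVars {
  DO_REQUEST_THRESHOLD: string;
  DO_WALLTIME_HOURS_THRESHOLD: string;
  WORKER_REQUEST_THRESHOLD: string;
}

// Hypothetical shape for one day's usage; the agent's real metrics type is not shown in this guide.
interface DailyUsage {
  doRequests: number;
  doWallTimeHours: number;
  workerRequests: number;
}

// Converts a string threshold to a positive number, failing loudly on bad config.
function parseThreshold(raw: string, label: string): number {
  const value = Number(raw);
  if (!Number.isFinite(value) || value <= 0) {
    throw new Error(`Invalid threshold for ${label}: "${raw}"`);
  }
  return value;
}

// Returns the labels of every metric whose usage is above its threshold (assumed: strictly above).
function findBreaches(vars: AgentVars, usage: DailyUsage): string[] {
  const checks: Array<[string, number, number]> = [
    ["Durable Object requests/day", usage.doRequests,
      parseThreshold(vars.DO_REQUEST_THRESHOLD, "DO_REQUEST_THRESHOLD")],
    ["DO wall-time hours/day", usage.doWallTimeHours,
      parseThreshold(vars.DO_WALLTIME_HOURS_THRESHOLD, "DO_WALLTIME_HOURS_THRESHOLD")],
    ["Worker requests/day", usage.workerRequests,
      parseThreshold(vars.WORKER_REQUEST_THRESHOLD, "WORKER_REQUEST_THRESHOLD")],
  ];
  return checks.filter(([, used, limit]) => used > limit).map(([label]) => label);
}

const sampleVars: AgentVars = {
  DO_REQUEST_THRESHOLD: "1000000",
  DO_WALLTIME_HOURS_THRESHOLD: "100",
  WORKER_REQUEST_THRESHOLD: "10000000",
};

// Only the Durable Object request count is over its limit in this sample.
const sampleBreaches = findBreaches(sampleVars, {
  doRequests: 1500000,
  doWallTimeHours: 20,
  workerRequests: 9000000,
});
```

Parsing the strings up front means a typo in wrangler.toml surfaces as an error instead of silently disabling a check.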
3. Spend Guard (In-App Budget Limits)
For services the edge agent can't see (RunPod GPU, Google Gemini, external APIs), add the Spend Guard directly to your app. It uses D1 to track per-job costs and enforces daily budgets.
How It Works
- Before every generation request, checkSpendBudget() runs a single D1 query
- If any limit is exceeded, returns 429 with a clear message
- After job creation, recordSpend() logs the estimated cost
- When the provider reports completion, updateActualCost() replaces the estimate
- At 80% budget utilization, PagerDuty/Discord/Slack alerts fire
- At 100%, all generation is blocked
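The steps above can be sketched with an in-memory ledger standing in for the D1 table. This is an illustration under stated assumptions, not the contents of spend-guard.ts: the `InMemorySpendGuard` class, its method signatures, and the `SpendRow` and `BudgetDecision` types are hypothetical, and only a single global daily budget is modeled. The method names and the 80% and 100% cut-offs follow the list above.

```typescript
// In-memory stand-in for the D1-backed spend log (illustration only).
interface SpendRow {
  provider: string;
  userId: string;
  jobId: string;
  costUsd: number;
  day: string; // e.g. "2025-01-01"
}

interface BudgetDecision {
  allowed: boolean;
  alert: boolean;      // true once utilization reaches 80%
  utilization: number; // spent / budget
  reason?: string;     // set when blocked; a caller would return it with HTTP 429
}

class InMemorySpendGuard {
  private readonly rows: SpendRow[] = [];
  private readonly dailyBudgetUsd: number;

  constructor(dailyBudgetUsd: number) {
    if (dailyBudgetUsd <= 0) throw new Error("dailyBudgetUsd must be positive");
    this.dailyBudgetUsd = dailyBudgetUsd;
  }

  // Sums the day's recorded cost and compares it with the budget.
  checkSpendBudget(day: string): BudgetDecision {
    const spent = this.rows
      .filter((row) => row.day === day)
      .reduce((sum, row) => sum + row.costUsd, 0);
    const utilization = spent / this.dailyBudgetUsd;
    if (utilization >= 1) {
      return {
        allowed: false,
        alert: true,
        utilization,
        reason: `Daily budget of $${this.dailyBudgetUsd} reached`,
      };
    }
    return { allowed: true, alert: utilization >= 0.8, utilization };
  }

  // Logs the estimated cost right after job creation.
  recordSpend(row: SpendRow): void {
    this.rows.push({ ...row });
  }

  // Replaces the estimate once the provider reports the real cost.
  updateActualCost(jobId: string, actualUsd: number): boolean {
    const row = this.rows.find((r) => r.jobId === jobId);
    if (row === undefined) return false;
    row.costUsd = actualUsd;
    return true;
  }
}

const guard = new InMemorySpendGuard(50);
const DAY = "2025-01-01";

guard.recordSpend({ provider: "runpod", userId: "u1", jobId: "job-1", costUsd: 30, day: DAY });
const afterFirstJob = guard.checkSpendBudget(DAY); // 30 of 50 spent: allowed, no alert

guard.recordSpend({ provider: "runpod", userId: "u1", jobId: "job-2", costUsd: 15, day: DAY });
const nearLimit = guard.checkSpendBudget(DAY);     // 45 of 50 spent: allowed, alert fires

guard.updateActualCost("job-2", 25);
const overLimit = guard.checkSpendBudget(DAY);     // 55 of 50 spent: blocked
```

Recording an estimate first and correcting it later keeps the budget check conservative while a job is still running.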
Install
# Copy spend-guard.ts into your project
cp kill-switch/INTEGRATION.md . # Full setup instructions
# The spend_log D1 table auto-creates on first use
# No migration needed
Wire Into Your Generate Endpoint
import { checkSpendBudget, recordSpend } from "./services/spend-guard";
// Before processing
const budget = await checkSpendBudget("runpod", userId);
if (!budget.allowed) {
  return new Response(budget.reason, { status: 429 });
}
// After creating the job
await recordSpend("runpod", userId, jobId);
Default Budget Limits
| Limit | Default | Protects Against |
|---|---|---|
| Global daily spend | $50/day | Total platform runaway |
| Per-user daily jobs | 100/day | Single-account abuse |
| RunPod daily spend | $25/day | GPU cost explosion |
| RunPod job timeout | 30 min | Stuck/infinite GPU jobs |
| Max concurrent GPU jobs | 8 | Parallel job flood |
| VEO daily requests | 200/day | API quota burn |
| TTS daily requests | 500/day | TTS abuse |
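Two of the limits in this table are not daily counters: the job timeout and the concurrency cap. The sketch below shows one way those two checks could look, using the default values from the table. The `DEFAULT_LIMITS` object, the `GpuJob` type, and both helper functions are hypothetical names chosen for illustration; the real Spend Guard may structure this differently.

```typescript
// Defaults copied from the table above; the object shape itself is an assumption.
const DEFAULT_LIMITS = {
  globalDailySpendUsd: 50,
  perUserDailyJobs: 100,
  runpodDailySpendUsd: 25,
  runpodJobTimeoutMinutes: 30,
  maxConcurrentGpuJobs: 8,
  veoDailyRequests: 200,
  ttsDailyRequests: 500,
} as const;

// Hypothetical record of a running GPU job.
interface GpuJob {
  jobId: string;
  startedAtMs: number; // epoch milliseconds
}

// A new GPU job may start only while fewer than the maximum are running.
function canStartGpuJob(runningJobs: GpuJob[]): boolean {
  return runningJobs.length < DEFAULT_LIMITS.maxConcurrentGpuJobs;
}

// Returns jobs that have been running longer than the timeout and should be cancelled.
function findTimedOutJobs(runningJobs: GpuJob[], nowMs: number): GpuJob[] {
  const timeoutMs = DEFAULT_LIMITS.runpodJobTimeoutMinutes * 60 * 1000;
  return runningJobs.filter((job) => nowMs - job.startedAtMs > timeoutMs);
}

const MINUTE_MS = 60 * 1000;
const sampleJobs: GpuJob[] = [
  { jobId: "a", startedAtMs: 0 },              // running for 31 minutes at "now"
  { jobId: "b", startedAtMs: 20 * MINUTE_MS }, // running for 11 minutes at "now"
];
const stuckJobs = findTimedOutJobs(sampleJobs, 31 * MINUTE_MS);
```

Passing the current time in as a parameter keeps the timeout check deterministic and easy to test.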
4. Alerting
Alerts fire automatically when spend approaches limits.
| Channel | Trigger | Behavior |
|---|---|---|
| PagerDuty | 80% warning, 95% critical | Pages on-call, deduped per severity/day |
| Discord | Same thresholds | Rich embed with per-provider breakdown |
| Slack | Same thresholds | Text message with severity emoji |
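The thresholds and the "deduped per severity/day" behavior can be illustrated as follows. This is a sketch under assumptions: the 80% and 95% cut-offs come from the table, but `severityFor`, `DailyAlertDeduper`, and the key format are hypothetical, and an in-memory set stands in for whatever storage the real alerting uses.

```typescript
type AlertSeverity = "warning" | "critical";

// Maps budget utilization to a severity: 80% is a warning, 95% is critical.
function severityFor(spentUsd: number, budgetUsd: number): AlertSeverity | null {
  if (budgetUsd <= 0) throw new Error("budgetUsd must be positive");
  const utilization = spentUsd / budgetUsd;
  if (utilization >= 0.95) return "critical";
  if (utilization >= 0.8) return "warning";
  return null;
}

// Lets each (severity, day) pair through once, so on-call is not paged repeatedly.
class DailyAlertDeduper {
  private readonly sent = new Set<string>();

  shouldSend(severity: AlertSeverity, day: string): boolean {
    const key = `${severity}:${day}`; // hypothetical key format
    if (this.sent.has(key)) return false;
    this.sent.add(key);
    return true;
  }
}

const deduper = new DailyAlertDeduper();
const firstWarning = deduper.shouldSend("warning", "2025-01-01");  // first warning of the day
const repeatWarning = deduper.shouldSend("warning", "2025-01-01"); // same severity, same day
const escalation = deduper.shouldSend("critical", "2025-01-01");   // new severity, same day
```

Because the key includes the severity, an escalation from warning to critical still produces a fresh alert on the same day.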
Configure PagerDuty
# Store routing key in Cloudflare Secrets Store
wrangler secrets-store secret create \
0b7ac993cf26413ea6e2f1b5ede20b25 \
--name PAGERDUTY_ROUTING_KEY \
--scopes workers --remote
# Bind in wrangler.toml
[[secrets_store_secrets]]
binding = "PAGERDUTY_ROUTING_KEY"
store_id = "0b7ac993cf26413ea6e2f1b5ede20b25"
secret_name = "PAGERDUTY_ROUTING_KEY"
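For orientation, this is roughly what a spend alert sent with that routing key could look like. The sketch builds a PagerDuty Events API v2 "trigger" event; the field names (`routing_key`, `event_action`, `dedup_key`, `payload.summary`, `payload.source`, `payload.severity`) and the enqueue URL belong to PagerDuty's public API. Everything else is an assumption: the `buildSpendAlertEvent` helper, the summary wording, the `source` value, and the dedup key format are hypothetical and may not match what Kill Switch actually sends.

```typescript
// Public PagerDuty Events API v2 endpoint.
const PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue";

interface PagerDutyEvent {
  routing_key: string;
  event_action: "trigger";
  dedup_key: string;
  payload: {
    summary: string;
    source: string;
    severity: "critical" | "warning";
    custom_details: Record<string, number>;
  };
}

// Builds the JSON body for a spend alert. Sending it is left to the caller.
function buildSpendAlertEvent(
  routingKey: string,
  severity: "critical" | "warning",
  spentUsd: number,
  budgetUsd: number,
  day: string, // e.g. "2025-01-01"
): PagerDutyEvent {
  const percent = Math.round((spentUsd / budgetUsd) * 100);
  return {
    routing_key: routingKey,
    event_action: "trigger",
    // One key per severity per day, so repeated checks update a single incident (assumed format).
    dedup_key: `kill-switch-spend-${severity}-${day}`,
    payload: {
      summary: `Spend at ${percent}% of daily budget ($${spentUsd} of $${budgetUsd})`,
      source: "kill-switch-spend-guard", // hypothetical source name
      severity,
      custom_details: { spentUsd, budgetUsd },
    },
  };
}

const sampleEvent = buildSpendAlertEvent("ROUTING_KEY", "critical", 48, 50, "2025-01-01");
```

The event would be sent as a JSON POST to `PAGERDUTY_EVENTS_URL`. Reusing the same `dedup_key` is what lets PagerDuty group repeated triggers into one incident.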
Install the CLI with npm install -g @kill-switch/cli, then run ks onboard to connect any provider interactively. See the CLI docs for details.
5. Supported Providers
| Provider | Resources Monitored | Kill Actions |
|---|---|---|
| Cloudflare | Workers, Durable Objects, R2, D1, Queues, Stream, Zones | Disconnect routes, disable workers.dev, delete worker |
| GCP | Compute Engine, GKE, BigQuery, Cloud Functions, GCS | Stop instances, scale down, disable billing |
| AWS | EC2, Lambda, RDS, ECS, EKS, SageMaker, S3 | Stop instances, throttle concurrency, disable functions |
| RunPod | GPU Pods (on-demand & spot), Serverless Endpoints, Network Volumes | Stop pod, terminate pod, scale down endpoints |
| Redis | Redis Cloud, AWS ElastiCache, Self-hosted (memory, connections, ops/sec) | Kill connections, scale down, flush, pause cluster |
| MongoDB | Atlas clusters, Self-hosted (storage, connections, ops/sec) | Kill connections, isolate (IP whitelist), pause/scale cluster |
| OpenAI | GPT API token usage, request counts, daily cost | Rotate credentials (manual) |
| Anthropic | Claude API token usage, daily cost | Rotate credentials (manual) |
| xAI (Grok) | Grok API token usage, daily cost | Rotate credentials (manual) |
| Replicate | GPU predictions, model usage, daily cost | Rotate credentials (manual) |
| Snowflake | Warehouse credits, query costs, data scanning | Scale down warehouse, suspend warehouse |
| Vercel | Function invocations, bandwidth, build minutes | Scale down, disable service |
| Datadog | Host count, log ingestion, custom metrics | Rotate credentials, mute monitors |
| Neon | Serverless Postgres compute hours, storage, data transfer | Scale down, delete project |
| Neo4j Aura | Graph DB instances, memory, storage, instance count | Pause instance, scale down, delete |
6. RunPod Setup Guide
RunPod is a GPU cloud platform popular for ML training and inference. Kill Switch monitors your GPU pods, serverless endpoints, and network volumes — and can automatically shut down runaway resources before the bill arrives.
Credentials
RunPod uses a single API key for authentication — simpler than AWS or GCP.
- Go to runpod.io/console/user/settings
- Scroll to API Keys and click Create API Key
- Copy the key — it will only be shown once
Connect via CLI or dashboard:
# CLI (one command)
ks onboard --provider runpod \
--runpod-api-key "YOUR_API_KEY" \
--name "ML Training" \
--shields cost-runaway,gpu-runaway
# Or connect at https://app.kill-switch.net/accounts/connect/runpod
What's Monitored
| Resource | Metrics Tracked | Cost Estimation |
|---|---|---|
| GPU Pods (on-demand) | Running count, GPU type, uptime | Per-GPU hourly rate × 24h (A100: $1.64/hr, H100: $3.29/hr, RTX 4090: $0.69/hr) |
| GPU Pods (spot) | Running count, preemption risk | ~70% discount from on-demand rates |
| Serverless Endpoints | Active workers, min/max scaling | Standby worker hourly cost |
| Network Volumes | Storage size (GB) | $0.07/GB/month |
Kill Switch uses RunPod's costPerHr field from the API when available. For pods where this isn't reported, it falls back to built-in GPU pricing estimates.
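The fallback logic described above can be sketched as a small estimator. It is illustrative only: the hourly rates, the roughly 70% spot discount, and the $0.07/GB/month volume price are the figures from the table, but the `PodInfo` shape and function names are hypothetical, and two points are assumptions: that `costPerHr` covers the whole pod, and that the fallback rate is multiplied by the GPU count.

```typescript
// On-demand fallback rates in USD per GPU-hour, from the table above.
const FALLBACK_HOURLY_USD: Record<string, number> = {
  "A100": 1.64,
  "H100": 3.29,
  "RTX 4090": 0.69,
};
const SPOT_DISCOUNT = 0.7;            // "~70% discount", so this is approximate
const VOLUME_USD_PER_GB_MONTH = 0.07;

// Hypothetical pod shape; costPerHr is optional because the API may not report it.
interface PodInfo {
  gpuType: string;
  gpuCount: number;
  spot: boolean;
  costPerHr?: number; // assumed to be the price of the whole pod
}

// Prefers the API-reported rate and falls back to the built-in price table.
function estimateDailyPodCostUsd(pod: PodInfo): number {
  if (pod.costPerHr !== undefined) {
    return pod.costPerHr * 24;
  }
  const onDemand = FALLBACK_HOURLY_USD[pod.gpuType];
  if (onDemand === undefined) {
    throw new Error(`No fallback price for GPU type "${pod.gpuType}"`);
  }
  const perGpuHourly = pod.spot ? onDemand * (1 - SPOT_DISCOUNT) : onDemand;
  return perGpuHourly * pod.gpuCount * 24;
}

function estimateMonthlyVolumeCostUsd(sizeGb: number): number {
  return sizeGb * VOLUME_USD_PER_GB_MONTH;
}

// A single on-demand A100 with no reported rate: 1.64 x 24, about $39.36 per day.
const a100Daily = estimateDailyPodCostUsd({ gpuType: "A100", gpuCount: 1, spot: false });
// A reported rate takes priority over the table: 2.00 x 24 = $48 per day.
const reportedDaily = estimateDailyPodCostUsd({ gpuType: "A100", gpuCount: 1, spot: false, costPerHr: 2 });
```

Throwing on an unknown GPU type is a deliberate choice in this sketch: a silent zero would hide exactly the kind of cost the monitor is meant to catch.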
Kill Actions
| Action | Applies To | Reversible | What Happens |
|---|---|---|---|
| stop-pod | GPU Pods | Yes | Stops the pod. Container disk and network volume data are preserved. Restart anytime. |
| terminate-pod | GPU Pods | No | Terminates the pod. Container disk is destroyed. Network volume data survives. Use as a last resort. |
| scale-down | Serverless Endpoints | Yes | Sets workersMin and workersMax to 0. No new requests are processed. Scale back up in the dashboard. |
By default, kill actions use stop-pod, which preserves your data. terminate-pod is used only when auto-delete is explicitly enabled or when a violation reaches critical severity (2x threshold). Your network volumes are always safe: kill actions never delete them.
Default Thresholds
These are applied automatically when you connect a RunPod account. Adjust them in the dashboard or via the API.
| Threshold | Default | What It Protects Against |
|---|---|---|
| GPU Pod count | 4 pods | Forgotten or leaked pods left running |
| Spot Pod count | 8 pods | Spot pod sprawl (higher limit since they're cheaper) |
| Serverless Workers | 10 workers | Endpoint autoscaling runaway |
| Network Volume storage | 500 GB | Unbounded data accumulation |
| Daily cost | $50/day | Overall spend cap |
| Monthly spend limit | $1,500/month | Billing shock prevention |
A violation at 1x the threshold triggers a warning. At 2x the threshold, severity escalates to critical and auto-kill actions execute (if enabled).
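The escalation rule can be written down in a few lines. This is a sketch of the rule as described here, not Kill Switch's implementation: `classifyViolation` and `shouldAutoKill` are hypothetical names, and treating "at 1x" and "at 2x" as inclusive comparisons is an assumption.

```typescript
type ViolationSeverity = "ok" | "warning" | "critical";

// 1x the threshold is a warning, 2x is critical (both assumed inclusive).
function classifyViolation(value: number, threshold: number): ViolationSeverity {
  if (threshold <= 0) throw new Error("threshold must be positive");
  if (value >= 2 * threshold) return "critical";
  if (value >= threshold) return "warning";
  return "ok";
}

// Auto-kill actions run only for critical violations, and only when enabled.
function shouldAutoKill(severity: ViolationSeverity, autoKillEnabled: boolean): boolean {
  return severity === "critical" && autoKillEnabled;
}

// Using the default GPU Pod count threshold of 4 pods:
const threePods = classifyViolation(3, 4); // below the threshold
const fourPods = classifyViolation(4, 4);  // at 1x the threshold
const eightPods = classifyViolation(8, 4); // at 2x the threshold
```

With the defaults in the table, eight running GPU pods or a $100 day would be the points at which automatic kill actions could run.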
7. Redis Setup Guide
Monitor Redis Cloud, AWS ElastiCache, or self-hosted Redis instances for memory spikes, connection overload, and cost runaway.
Credentials
Self-hosted: Provide a Redis URL (redis://user:pass@host:6379).
Redis Cloud: Account Key + Secret Key from Redis Cloud API Keys, plus your Subscription ID.
ElastiCache: AWS Access Key + Secret Key + Region + Cluster ID.
ks onboard --provider redis --redis-url "redis://user:pass@host:6379" --name "Production Redis"
# Or connect at https://app.kill-switch.net/accounts/connect/redis
Default Thresholds
| Threshold | Default |
|---|---|
| Memory Usage | 512 MB |
| Connected Clients | 100 |
| Commands/sec | 10,000 |
| Daily Cost | $25/day |
8. MongoDB Setup Guide
Monitor MongoDB Atlas clusters or self-hosted instances for storage growth, connection overload, and cost spikes.
Credentials
Atlas: Create API keys at Organization > Access Manager > API Keys. The key needs the "Project Read Only" and "Project Cluster Manager" roles.
Self-hosted: Provide a MongoDB URI (mongodb+srv://user:pass@host/db).
ks onboard --provider mongodb --atlas-public-key PUB --atlas-private-key PRIV --atlas-project-id PROJ --cluster-name Cluster0
# Or connect at https://app.kill-switch.net/accounts/connect/mongodb
Default Thresholds
| Threshold | Default |
|---|---|
| Storage | 10 GB |
| Active Connections | 200 |
| Operations/sec | 5,000 |
| Daily Cost | $30/day |
9. OpenAI Setup Guide
Monitor GPT API token usage, request counts, and daily spend. Catch runaway agent loops before they drain your budget.
Credentials
- Go to platform.openai.com/api-keys
- Create a new API key (starts with sk-)
- Optionally provide your Organization ID from Settings
ks onboard --provider openai --openai-api-key "sk-..." --name "Production OpenAI"
# Or connect at https://app.kill-switch.net/accounts/connect/openai
Default Thresholds
| Threshold | Default |
|---|---|
| Tokens/day | 1,000,000 |
| Requests/day | 10,000 |
| Daily Cost | $50/day |
10. Anthropic Setup Guide
Monitor Claude API token usage and daily spend.
Credentials
- Go to console.anthropic.com/settings/keys
- Create a new API key (starts with sk-ant-)
ks onboard --provider anthropic --anthropic-api-key "sk-ant-..." --name "Production Claude"
# Or connect at https://app.kill-switch.net/accounts/connect/anthropic
Default Thresholds
| Threshold | Default |
|---|---|
| Tokens/day | 1,000,000 |
| Daily Cost | $50/day |
11. xAI (Grok) Setup Guide
Monitor Grok API token usage and daily spend.
Credentials
- Go to console.x.ai/api-keys
- Create a new API key
ks onboard --provider xai --xai-api-key "xai-..." --name "Grok API"
# Or connect at https://app.kill-switch.net/accounts/connect/xai
Default Thresholds
| Threshold | Default |
|---|---|
| Tokens/day | 1,000,000 |
| Daily Cost | $50/day |
12. Replicate Setup Guide
Monitor GPU prediction costs, model usage, and daily spend on Replicate.
Credentials
- Go to replicate.com/account/api-tokens
- Create a new token (starts with r8_)
ks onboard --provider replicate --replicate-api-token "r8_..." --name "ML Predictions"
# Or connect at https://app.kill-switch.net/accounts/connect/replicate
Default Thresholds
| Threshold | Default |
|---|---|
| Predictions/day | 100 |
| GPU Hours/day | 4 |
| Daily Cost | $25/day |
13. Snowflake Setup Guide
Monitor Snowflake warehouse credits, query costs, and data scanning. Auto-suspend warehouses on threshold breach.
Credentials
Provide your Snowflake account name (from the URL), username, and password.
ks onboard --provider snowflake --snowflake-account "xy12345.us-east-1" \
--snowflake-username "USER" --snowflake-password "PASS" \
--warehouse "COMPUTE_WH" --name "Production Snowflake"
# Or connect at https://app.kill-switch.net/accounts/connect/snowflake
Kill Actions
| Action | What Happens | Reversible |
|---|---|---|
| scale-down | Resize warehouse to X-SMALL | Yes |
| stop-instances | Suspend warehouse entirely | Yes |
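In Snowflake terms, these two actions correspond to standard ALTER WAREHOUSE statements. The sketch below maps each action name to the statement it plausibly issues. The SQL syntax is Snowflake's own, but the mapping is an assumption about how Kill Switch implements the actions, and `toSnowflakeSql` is a hypothetical helper. Both statements are reversible (resize back, or `ALTER WAREHOUSE ... RESUME`).

```typescript
type SnowflakeKillAction = "scale-down" | "stop-instances";

// Accept only plain unquoted identifiers so the name can be interpolated safely.
const UNQUOTED_IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_$]*$/;

// Returns the ALTER WAREHOUSE statement assumed to back each kill action.
function toSnowflakeSql(action: SnowflakeKillAction, warehouse: string): string {
  if (!UNQUOTED_IDENTIFIER.test(warehouse)) {
    throw new Error(`Unsupported warehouse name: "${warehouse}"`);
  }
  if (action === "scale-down") {
    return `ALTER WAREHOUSE ${warehouse} SET WAREHOUSE_SIZE = 'XSMALL'`;
  }
  return `ALTER WAREHOUSE ${warehouse} SUSPEND`;
}

const scaleDownSql = toSnowflakeSql("scale-down", "COMPUTE_WH");
const suspendSql = toSnowflakeSql("stop-instances", "COMPUTE_WH");
```

Running statements like these requires a role with the MODIFY or OPERATE privilege on the warehouse, which is worth checking when choosing the user passed to ks onboard.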
Default Thresholds
| Threshold | Default |
|---|---|
| Credits/day | 10 |
| Warehouses | 3 |
| Daily Cost | $100/day |
14. Vercel Setup Guide
Monitor Vercel function invocations, bandwidth usage, and build minutes.
Credentials
- Go to vercel.com/account/tokens
- Create a new token with appropriate scope
- Optionally provide your Team ID (from Team Settings)
ks onboard --provider vercel --vercel-api-token "TOKEN" --name "Production Vercel"
# Or connect at https://app.kill-switch.net/accounts/connect/vercel
Default Thresholds
| Threshold | Default |
|---|---|
| Function Invocations/day | 100,000 |
| Bandwidth/day | 100 GB |
| Daily Cost | $50/day |
15. Datadog Setup Guide
Monitor Datadog host count, log ingestion volume, and custom metrics costs.
Credentials
You need both an API Key and an Application Key:
- API Key: Organization Settings > API Keys
- Application Key: Organization Settings > Application Keys
- Optionally specify --datadog-site eu for the EU region (default: US)
ks onboard --provider datadog --datadog-api-key "KEY" --datadog-application-key "APP_KEY" --name "Production Datadog"
# Or connect at https://app.kill-switch.net/accounts/connect/datadog
Default Thresholds
| Threshold | Default |
|---|---|
| Host Count | 50 |
| Log Ingestion/day | 10 GB |
| Daily Cost | $100/day |
16. Neon Setup Guide
Monitor Neon serverless Postgres compute hours, storage, and data transfer. Scale down or pause a runaway project before the bill spikes.
Credentials
- Go to console.neon.tech > Account Settings > API Keys
- Create an API key and copy it (shown once)
- Find your Project ID under Project Settings
# Connect at the dashboard (Neon onboarding is dashboard-driven):
# https://app.kill-switch.net/accounts/connect/neon
# API key: neon_api_key_...
# Project ID: your-project-id
Kill Actions
| Action | What Happens | Reversible |
|---|---|---|
| scale-down | Suspend all compute endpoints (autosuspend) | Yes |
| delete | Delete the project (last resort) | No |
Default Thresholds
| Threshold | Default |
|---|---|
| Compute | 80 CU-hrs/month |
| Storage | 400 MB |
| Data Transfer | 4 GB |
| Daily Cost | $1/day (monthly limit $30) |
17. Neo4j Aura Setup Guide
Monitor Neo4j Aura graph database instances — memory, storage, and instance count. Pause or scale down before idle instances rack up cost.
Credentials
- Go to console.neo4j.io > Account > API Credentials
- Create credentials — you get a Client ID and Client Secret
- Optionally note a specific Instance ID (otherwise all instances are monitored)
# CLI (one command):
ks onboard --provider neo4j \
--neo4j-client-id "CLIENT_ID" \
--neo4j-client-secret "CLIENT_SECRET" \
--name "Production Neo4j" --shields cost-runaway
# Or connect at https://app.kill-switch.net/accounts/connect/neo4j
Kill Actions
| Action | What Happens | Reversible |
|---|---|---|
| pause-cluster | Pause the Aura instance | Yes |
| scale-down | Resize to a smaller tier | Yes |
| delete | Delete the instance (last resort) | No |
Default Thresholds
| Threshold | Default |
|---|---|
| Running Instances | 3 |
| Memory | 8 GB |
| Storage | 16 GB |
| Daily Cost | $20/day (monthly limit $600) |
Need Help?
- Full API Reference (OpenAPI 3.1)
- CLI Documentation
- GitHub Repository
- INTEGRATION.md (detailed Claude Code setup guide)