LLM Runtime

AI Core

Your LLM calls fail at 3am. You don't know until morning. Switching models means rewriting code.

### How it works

Drop-in LLM proxy with 7 levels of fallback. It picks the model for each request according to that request's strategy (speed_first or quality_first). It detects user frustration and escalates to a premium model. It tracks every token spent. Send one API call and the runtime handles provider selection, failover, streaming, and budget tracking automatically.
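To make the fallback idea concrete, here is a minimal Python sketch of a provider cascade: try each level in order and return the first success. This illustrates the general technique only. It is not the runtime's actual implementation, and the names `Provider`, `complete_with_fallback`, `flaky_primary`, and `steady_secondary` are hypothetical stand-ins.

```python
from typing import Callable, Sequence

# Hypothetical provider type: takes a prompt, returns a completion or raises.
Provider = Callable[[str], str]


class AllProvidersFailed(Exception):
    """Raised when every level of the cascade has failed."""


def complete_with_fallback(
    prompt: str, cascade: Sequence[tuple[str, Provider]]
) -> tuple[str, str]:
    """Try each provider in order; return (name, completion) from the first success."""
    errors: list[str] = []
    for name, call in cascade:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real proxy would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stand-in providers for the demo (assumptions, not real backends).
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("upstream timed out")


def steady_secondary(prompt: str) -> str:
    return f"echo: {prompt}"


used, answer = complete_with_fallback(
    "What is JTBD?",
    [("primary", flaky_primary), ("secondary", steady_secondary)],
)
print(used, answer)  # prints: secondary echo: What is JTBD?
```

The caller sees one function call; which level answered is reported back but never has to be handled in application code.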

### API Example

curl -X POST https://api.nevrai.com/v1/chat \
  -H "Authorization: Bearer nvr_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What is JTBD?"}
    ]
  }'
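
The same request can be built from Python with only the standard library. This sketch mirrors the curl call above (same URL, headers, and body). The helper name `build_chat_request` is hypothetical, and the response schema is not described on this page, so the sketch stops at constructing the request.

```python
import json
import urllib.request

API_URL = "https://api.nevrai.com/v1/chat"  # endpoint from the curl example


def build_chat_request(api_key: str, content: str) -> urllib.request.Request:
    """Build the same POST as the curl example, without sending it."""
    body = json.dumps(
        {"messages": [{"role": "user", "content": content}]}
    ).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request("nvr_live_YOUR_KEY", "What is JTBD?")

# To actually send it (needs a real key and network access):
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(response.read().decode("utf-8"))  # response format is an unknown here
```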

### Key Features

> 7-level cascade (Groq → OpenRouter → bootstrap fallback)
> speed_first / quality_first strategy selection
> Auto-escalation on user frustration detection
> SSE streaming out of the box
> Per-request budget tracking and token accounting
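
For the SSE streaming feature, a client needs to split the stream into events. The sketch below parses the standard Server-Sent Events wire format (`data:` lines, with a blank line ending each event). How this runtime encodes tokens inside each `data:` payload is not specified on this page, so the sample events are invented and the helper name `iter_sse_data` is hypothetical.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format drops one leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event, id, retry) are ignored in this sketch.
    # An event not terminated by a blank line is discarded, as in the SSE format.


# Invented sample stream; real payloads may be JSON or another encoding.
sample = ["data: Jobs\n", "\n", ": keep-alive\n", "data: to be\n", "data: done\n", "\n"]
chunks = list(iter_sse_data(sample))
print(chunks)  # prints: ['Jobs', 'to be\ndone']
```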

### Pricing

Free: 1,000 req/mo

Starter ($49): 10,000 req/mo

Pro ($149): 100,000 req/mo
