
Trusted by developers

The best model for every request. Automatically.

One API that gets you the fastest, most reliable, and lowest-cost results across all models and providers.

Valid JSON · Automatic fallback · Cost ceilings · Fast routing


Features

Designed for production-grade AI applications

Accurate results, predictable spend, and fast execution — without managing model complexity


Fast Execution

Requests run on the fastest available model or region. Latency is minimized without compromising accuracy or output quality.
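
As a rough sketch of that idea (the candidate list, latency numbers, and pickFastest helper below are illustrative, not part of the Perf API):

// Illustrative only: route to the candidate with the lowest recently measured latency.
type Candidate = { provider: string; model: string; region: string; p50LatencyMs: number };

const candidates: Candidate[] = [
  { provider: "openai", model: "gpt-4o-mini", region: "us-east", p50LatencyMs: 420 },
  { provider: "anthropic", model: "claude-3-haiku", region: "eu-west", p50LatencyMs: 380 },
];

// Pick the lowest-latency candidate; accuracy and cost filtering would happen before this step.
function pickFastest(pool: Candidate[]): Candidate {
  return pool.reduce((best, c) => (c.p50LatencyMs < best.p50LatencyMs ? c : best));
}

console.log(pickFastest(candidates).model); // -> "claude-3-haiku"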


Cost Control

Requests stay within your defined budget. You set the limits, and the runtime ensures predictable, stable spend — without surprises.
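
A minimal sketch of how a per-request cost ceiling can be enforced; the option names and figures here are hypothetical, not Perf's actual configuration:

// Sketch of per-request and per-month cost ceilings; names and numbers are made up.
type CostPolicy = { maxUsdPerRequest: number; maxUsdPerMonth: number };

const policy: CostPolicy = { maxUsdPerRequest: 0.02, maxUsdPerMonth: 50 };

// A candidate model is only eligible if its estimated price keeps both ceilings intact.
function withinBudget(estimatedUsd: number, spentThisMonthUsd: number, p: CostPolicy): boolean {
  return estimatedUsd <= p.maxUsdPerRequest && spentThisMonthUsd + estimatedUsd <= p.maxUsdPerMonth;
}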

class AutomationAgent:
    def __init__(self, activation_limit):
        self.activation_limit = activation_limit
        self.current_mode = "idle"

    def evaluate_task(self, workload_value):
        # Engage the agent only when the workload exceeds the configured limit.
        if workload_value > self.activation_limit:
            self.current_mode = "engaged"
            return "Automation agent has been successfully activated!"
        return "No activation needed. Agent stays idle."

    def get_current_mode(self):
        return f"Current operational mode: {self.current_mode}"

Structured Output

Every response is validated, repaired, or retried until it meets the expected structure. No malformed JSON. No silent failures.
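
A hand-rolled sketch of the validate-and-retry idea, assuming a plain JSON payload; this is not the runtime's actual implementation:

// Strip common wrappers (markdown code fences), then parse; report failure instead of throwing.
type ParseResult = { ok: true; value: unknown } | { ok: false };

function tryParse(raw: string): ParseResult {
  const cleaned = raw.replace(/^```(?:json)?\s*/i, "").replace(/```\s*$/, "").trim();
  try {
    return { ok: true, value: JSON.parse(cleaned) };
  } catch {
    return { ok: false };
  }
}

// Ask again until a structurally valid response comes back, or fail loudly.
async function completeWithValidJson(ask: () => Promise<string>, maxAttempts = 3): Promise<unknown> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = tryParse(await ask());
    if (result.ok) return result.value;
  }
  throw new Error(`No valid JSON after ${maxAttempts} attempts`);
}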

How it works

One request in. Correct output back.

The runtime inspects your request, chooses the best model, and validates the output before returning a correct response.

1

Inspect the request

The runtime reads your input and identifies what the task requires — structured output, extraction, reasoning, or classification.
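
Purely as an illustration of what that inspection step could look like, with made-up heuristics and labels:

// Toy heuristics for labelling what a request needs; the real runtime's signals are not shown here.
type TaskKind = "structured_output" | "extraction" | "reasoning" | "classification";

function classifyRequest(prompt: string, expectsJson: boolean): TaskKind {
  if (expectsJson) return "structured_output";
  if (/extract|pull out|list every/i.test(prompt)) return "extraction";
  if (/which category|label this|is this spam/i.test(prompt)) return "classification";
  return "reasoning";
}

classifyRequest("Extract all invoice numbers from this email", false); // -> "extraction"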

2

Pick the best model

The most suitable model is chosen automatically across providers and regions, based on accuracy needs, latency, and your cost limits.
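
A simplified sketch of scoring candidates on accuracy, latency, and cost; the fields and weights are invented for the example:

// Score each candidate and drop anything over the cost ceiling before choosing.
type ModelCandidate = {
  name: string;
  accuracy: number;       // 0..1 benchmark score for this task type
  p50LatencyMs: number;
  usdPer1kTokens: number;
};

function pickModel(pool: ModelCandidate[], maxUsdPer1k: number): ModelCandidate | null {
  const affordable = pool.filter((m) => m.usdPer1kTokens <= maxUsdPer1k);
  if (affordable.length === 0) return null;
  // Weights are arbitrary; a real router would tune them per task.
  const score = (m: ModelCandidate) => m.accuracy - 0.0005 * m.p50LatencyMs - 2 * m.usdPer1kTokens;
  return affordable.reduce((best, m) => (score(m) > score(best) ? m : best));
}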

3

Validate the output

Responses are checked and repaired before being returned. Invalid JSON or incorrect structure triggers an automatic retry or fallback.
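
To make the retry-or-fallback flow concrete, here is a hand-rolled version; the isValid check and model list are placeholders for what Perf handles automatically:

// Walk an ordered list of models and return the first response that passes validation.
async function callWithFallback(
  models: string[],
  run: (model: string) => Promise<string>,
  isValid: (output: string) => boolean
): Promise<{ model: string; output: string }> {
  for (const model of models) {
    const output = await run(model);
    if (isValid(output)) return { model, output };
  }
  throw new Error("All models returned invalid output");
}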

Code example

One line to add intelligent model selection

Replace your model client and get correct, validated output on every request.

// today: manual model selection with hand-rolled fallback

import OpenAI from "openai";
import Anthropic from "@anthropic-ai/sdk";

const openai = new OpenAI({ apiKey: process.env.OPENAI });
const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC });

const response = await openai.chat.completions.create({
  model: "gpt-4o-mini",         // pick a model
  messages,                     // may return bad JSON
});

let data;

try {
  data = JSON.parse(response.choices[0].message.content);
} catch {
  // retry with a different provider
  const retry = await anthropic.messages.create({
    model: "claude-3-haiku",
    messages,
  });
  data = JSON.parse(retry.content[0].text);
}
// automatic model selection at runtime with Perf

import Perf from "perf";

const perf = new Perf({ apiKey: process.env.PERF_KEY });

const response = await perf.chat({
  messages       
});

Pricing

Production-ready AI infrastructure, built to scale

Pay yearly and save 2 months


Starter

Best for experimenting and personal projects

Free

10K tokens per month

1 Project

Core runtime orchestration

Community support

Logs retained for 7 days

Pro

Best for indie developers and early-stage startups

$49 per month

100K tokens per month

3 Projects

Bring Your Own Keys (BYOK)

Email support

Logs retained for 30 days

Growth

Best for production workloads

$199 per month

1M tokens per month

10 Projects

Custom controls for costs & model

Priority support

Logs retained for 365 days

Team

Best for teams running AI at scale

$499 per month

10M tokens per month

Unlimited Projects

Up to 3 Organizations

Team access and governance

SLA-backed priority support

Full log history


Overages are $0.50 per 1K tokens. Plans upgrade automatically when usage reaches the next tier.

FAQ

Frequently asked questions

How is Perf different from a model router?

Does Perf add latency?

What happens if a model returns invalid JSON or a malformed response?

Do I need API keys for every provider?

Do I lose control over which models Perf can use?

Do I need to change my prompts?

How does cost control work?

Can I see which model Perf selected for a request?



Ready to make your AI stack more predictable?

Perf handles model selection, validation, and fallbacks so your application stays stable, consistent and within budget.
