Inference supply reference

Browse model coverage and batch rates before creating a quote.

Workflows
responses / 1 provider

Gemma 4 E4B it

AUTONOMOUSc published batch-capable responses supply for Gemma 4 E4B it.

Rates
$0.08079 representative provider quote
Source
Provider base rates updated May 22, 4:25 PM from 1 provider quote.
responses / vision / 1 provider / -10% discount

GPT-5.4

OpenAI general-purpose batch model for document, extraction, and support workflows.

Rates
Input $1.20 / Cached input $0.1247 / Output $7.19 per 1M tokens
Source
Discounted customer rates updated May 22, 4:25 PM from 1 provider quote; BatchRouter fee included.
responses / vision / 1 provider / -10% discount

GPT-5.4 Mini

OpenAI lower-cost batch model for broad extraction, tagging, and rollup work.

Rates
Input $0.3596 / Cached input $0.0360 / Output $2.16 per 1M tokens
Source
Discounted customer rates updated May 22, 4:25 PM from 1 provider quote; BatchRouter fee included.
responses / vision / 1 provider / -10% discount

GPT-5.4 Nano

OpenAI lowest-cost batch lane for simple classification, extraction, and enrichment.

Rates
Input $0.0959 / Cached input $0.0096 / Output $0.5994 per 1M tokens
Source
Discounted customer rates updated May 22, 4:25 PM from 1 provider quote; BatchRouter fee included.
responses / vision / 1 provider / -10% discount

GPT-5.5

OpenAI flagship batch model for difficult async reasoning and long-context review.

Rates
Input $2.40 / Cached input $0.2398 / Output $14.38 per 1M tokens
Source
Discounted customer rates updated May 22, 4:25 PM from 1 provider quote; BatchRouter fee included.
embeddings / 1 provider / -10% discount

Text Embedding 3 Small

Batch embeddings for semantic search, clustering, and retrieval pipelines.

Rates
Input $0.0959 / 1M tokens
Source
Discounted customer rates updated May 22, 4:25 PM from 1 provider quote; BatchRouter fee included.
responses / 1 provider / -10% discount

Claude Haiku 4.5

Anthropic economical batch lane for tagging, enrichment, and simple transformations.

Rates
Input $0.4795 / Cached input $0.0479 / Output $2.40 per 1M tokens
Source
Discounted customer rates updated May 22, 4:25 PM from 1 provider quote; BatchRouter fee included.
responses / 1 provider / -10% discount

Claude Opus 4.7

Anthropic frontier batch lane for long-context review and hard async reasoning.

Rates
Input $2.40 / Cached input $0.2398 / Output $11.99 per 1M tokens
Source
Discounted customer rates updated May 22, 4:25 PM from 1 provider quote; BatchRouter fee included.
responses / 1 provider / -10% discount

Claude Sonnet 4.6

Anthropic balanced batch model for document review, extraction, and workflow routing.

Rates
Input $1.44 / Cached input $0.1439 / Output $7.19 per 1M tokens
Source
Discounted customer rates updated May 22, 4:25 PM from 1 provider quote; BatchRouter fee included.
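
Because every rate above is quoted per 1M tokens, a rough job estimate is a weighted sum of token counts. The sketch below is illustrative only: `RATES` and `estimate_cost` are hypothetical names, not part of any BatchRouter API, and it assumes cached input tokens are billed at the cached rate in place of the regular input rate. Only two of the listed models are copied in; an actual quote may differ.

```python
from decimal import Decimal

# Discounted customer rates copied from the listing, in USD per 1M tokens.
# Hypothetical lookup table for illustration; extend with other models as needed.
RATES = {
    "GPT-5.4 Mini": {
        "input": Decimal("0.3596"),
        "cached_input": Decimal("0.0360"),
        "output": Decimal("2.16"),
    },
    "Claude Haiku 4.5": {
        "input": Decimal("0.4795"),
        "cached_input": Decimal("0.0479"),
        "output": Decimal("2.40"),
    },
}

PER_MILLION = Decimal(1_000_000)


def estimate_cost(model, input_tokens, cached_input_tokens, output_tokens):
    """Return an estimated USD cost for one batch job.

    Assumption: input_tokens counts only uncached input, so cached and
    uncached tokens are never double-billed.
    """
    rates = RATES[model]
    total = (
        Decimal(input_tokens) * rates["input"]
        + Decimal(cached_input_tokens) * rates["cached_input"]
        + Decimal(output_tokens) * rates["output"]
    )
    return total / PER_MILLION


if __name__ == "__main__":
    # 2M uncached input, 1M cached input, 0.5M output tokens on GPT-5.4 Mini.
    print(estimate_cost("GPT-5.4 Mini", 2_000_000, 1_000_000, 500_000))
```

Using `Decimal` rather than floats keeps the four-decimal rates exact when summing across large token counts.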