Inference supply reference
Browse model coverage and batch rates before creating a quote.
Gemma 4 E4B it
AUTONOMOUSc-published batch-capable responses supply for Gemma 4 E4B it.
- Rates
  - $0.08079 representative provider quote
- Source
  - Provider base rates updated May 22, 4:25 PM from 1 provider quote.
GPT-5.4
OpenAI general-purpose batch model for document, extraction, and support workflows.
- Rates
  - Input $1.20 / Cached input $0.1247 / Output $7.19 per 1M tokens
- Source
  - Discounted customer rates updated May 22, 4:25 PM from 1 provider quote; BatchRouter fee included.
GPT-5.4 Mini
OpenAI lower-cost batch model for broad extraction, tagging, and rollup work.
- Rates
  - Input $0.3596 / Cached input $0.0360 / Output $2.16 per 1M tokens
- Source
  - Discounted customer rates updated May 22, 4:25 PM from 1 provider quote; BatchRouter fee included.
GPT-5.4 Nano
OpenAI lowest-cost batch lane for simple classification, extraction, and enrichment.
- Rates
  - Input $0.0959 / Cached input $0.0096 / Output $0.5994 per 1M tokens
- Source
  - Discounted customer rates updated May 22, 4:25 PM from 1 provider quote; BatchRouter fee included.
GPT-5.5
OpenAI flagship batch model for difficult async reasoning and long-context review.
- Rates
  - Input $2.40 / Cached input $0.2398 / Output $14.38 per 1M tokens
- Source
  - Discounted customer rates updated May 22, 4:25 PM from 1 provider quote; BatchRouter fee included.
Text Embedding 3 Small
Batch embeddings for semantic search, clustering, and retrieval pipelines.
- Rates
  - Input $0.0959 per 1M tokens
- Source
  - Discounted customer rates updated May 22, 4:25 PM from 1 provider quote; BatchRouter fee included.
Claude Haiku 4.5
Anthropic economical batch lane for tagging, enrichment, and simple transformations.
- Rates
  - Input $0.4795 / Cached input $0.0479 / Output $2.40 per 1M tokens
- Source
  - Discounted customer rates updated May 22, 4:25 PM from 1 provider quote; BatchRouter fee included.
Claude Opus 4.7
Anthropic frontier batch lane for long-context review and hard async reasoning.
- Rates
  - Input $2.40 / Cached input $0.2398 / Output $11.99 per 1M tokens
- Source
  - Discounted customer rates updated May 22, 4:25 PM from 1 provider quote; BatchRouter fee included.
Claude Sonnet 4.6
Anthropic balanced batch model for document review, extraction, and workflow routing.
- Rates
  - Input $1.44 / Cached input $0.1439 / Output $7.19 per 1M tokens
- Source
  - Discounted customer rates updated May 22, 4:25 PM from 1 provider quote; BatchRouter fee included.
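
Estimating a batch cost from these rates

The per-1M-token rates listed above can be turned into a rough pre-quote estimate with a few lines of Python. This is only a sketch: the `RATES` table and the `estimate_cost` helper are hypothetical names and are not part of any BatchRouter API. It assumes cached input tokens are counted separately from regular input tokens and billed at the cached rate. The Gemma 4 E4B it entry is left out because its listing does not state a unit.

```python
from decimal import Decimal

# The listing quotes every rate per 1M tokens.
TOKENS_PER_RATE_UNIT = Decimal(1_000_000)

# USD per 1M tokens, copied from the listing above (May 22 snapshot).
# Hypothetical lookup table for illustration only.
RATES = {
    "GPT-5.4": {"input": "1.20", "cached_input": "0.1247", "output": "7.19"},
    "GPT-5.4 Mini": {"input": "0.3596", "cached_input": "0.0360", "output": "2.16"},
    "GPT-5.4 Nano": {"input": "0.0959", "cached_input": "0.0096", "output": "0.5994"},
    "GPT-5.5": {"input": "2.40", "cached_input": "0.2398", "output": "14.38"},
    "Text Embedding 3 Small": {"input": "0.0959"},
    "Claude Haiku 4.5": {"input": "0.4795", "cached_input": "0.0479", "output": "2.40"},
    "Claude Opus 4.7": {"input": "2.40", "cached_input": "0.2398", "output": "11.99"},
    "Claude Sonnet 4.6": {"input": "1.44", "cached_input": "0.1439", "output": "7.19"},
}


def estimate_cost(model, input_tokens=0, cached_input_tokens=0, output_tokens=0):
    """Return an estimated cost in USD as a Decimal.

    Assumption: each token kind is billed independently at its listed rate.
    """
    rates = RATES[model]  # raises KeyError for models not in the table
    counts = {
        "input": input_tokens,
        "cached_input": cached_input_tokens,
        "output": output_tokens,
    }
    total = Decimal(0)
    for kind, count in counts.items():
        if count == 0:
            continue
        if kind not in rates:
            # For example, the embeddings entry lists an input rate only.
            raise ValueError(f"{model} has no listed {kind} rate")
        total += Decimal(count) * Decimal(rates[kind])
    return total / TOKENS_PER_RATE_UNIT
```

For example, `estimate_cost("GPT-5.4 Mini", input_tokens=2_000_000, output_tokens=500_000)` works out to 2 × $0.3596 + 0.5 × $2.16 = $1.7992. `Decimal` is used instead of `float` so that four-decimal rates do not pick up binary rounding error. The listed rates are a point-in-time snapshot, so an actual quote may differ.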
