Overview
GPT-4o is OpenAI’s multimodal model (text, vision, audio) announced May 2024, focused on interactive latency and structured outputs.
Inputs and Outputs
- Inputs: text, images, audio (provider-reported streaming)
- Outputs: text, JSON, audio
Inference & Decoding
Supports common decoding controls and structured outputs via JSON Schema or function calling. Speculative/assisted generation is supported provider-side.
Safety & Compliance
Use moderation endpoints and policy guidance. Watermarking support is partial/experimental where applicable.
Evals & Quality
OpenAI cites competitive results across reasoning/coding; public third-party evals vary by prompt.
Deployment & Ops
Available via OpenAI and Azure OpenAI; runtime features include streaming and structured outputs.
Pricing & Licensing
Commercial API; per-token pricing; see platform pricing pages.
References
See YAML references.