Overview
Gemini 1.5 Pro is Google’s multimodal model highlighting very large context and function calling.
Inputs and Outputs
- Inputs: text, images, audio, video frames
- Outputs: text, JSON
Inference & Decoding
Supports structured outputs via schema-based function calling and common decoding.
Safety & Compliance
Google safety settings and policies apply.
Evals & Quality
Cited strong results on multimodal and reasoning tasks.
Deployment & Ops
Available via Gemini API; integrated tooling across Google cloud/services.
Pricing & Licensing
Commercial API with token-based pricing.
References
See YAML references.