Gemini AI Models Explained: A Decision-Maker's Guide to Google's Model Architecture
Google's Gemini model family isn't a product — it's a signal. A signal that the AI arms race has entered its infrastructure phase, and organizations still treating AI as a department-level experiment are about to get out-engineered by those who understand model selection as a systems decision. The teams winning with AI in 2026 aren't the ones who deployed the flashiest chatbot. They're the ones who built architectures.
Since Google DeepMind launched the Gemini model family, the landscape has fractured into a spectrum of capability tiers — from lightweight edge-deployable variants to frontier multimodal powerhouses. As of 2026, Gemini 3 and its derivatives represent Google's most sophisticated AI infrastructure yet, rivaling OpenAI's GPT-class models across reasoning, code generation, and long-context processing [1]. For operations leaders and technology decision-makers, the question is no longer 'should we use AI?' — it's 'which model, deployed how, inside which workflow architecture?'
This guide breaks down the full Gemini model family with engineering precision — what each model is built for, how they compare against each other and the broader AI field, and most critically, how to select and integrate the right Gemini model as a functional component inside a production-grade automation system — not as an isolated toy.
What Is the Gemini AI Model Family? Google's Architecture Decoded
Gemini is Google DeepMind's flagship family of multimodal large language models, designed to process and generate text, code, images, audio, and video — not as separate capabilities bolted together, but as a unified processing architecture [1]. That distinction matters more than most decision-makers realize. Legacy AI systems retrofitted with vision plugins or audio connectors carry architectural debt at the seams. Gemini was trained on mixed-modality data from the ground up, which gives it structural coherence that retrofitted multimodal models simply cannot replicate.
The family is tiered by capability and compute cost, and understanding those tiers as a distributed processing architecture — not as a product lineup with a good-better-best structure — is the first systems-thinking shift required. You don't select a Gemini model the way you select a SaaS subscription tier. You select it the way a network engineer selects a routing protocol: based on latency requirements, throughput constraints, and the specific data physics of each workflow node.
Models are accessible via Google AI Studio, the Gemini API, Vertex AI, and Firebase AI Logic — each with materially different compliance, scalability, and integration profiles [2]. Choosing the wrong deployment surface for a regulated workload isn't a configuration error. It's an architectural mistake that creates compliance debt from day one.
The Gemini Model Tiers: Ultra, Pro, Flash, and Nano
Gemini Ultra / Gemini 3: Frontier-class, highest reasoning capability. Built for complex multi-step workflows, full-document legal analysis, clinical reasoning support, and agentic task execution across long context windows. This is not a general-purpose model — it is a specialized processor for your most cognitively demanding workflow nodes.
Gemini Pro / Gemini 3.1 Pro: The production workhorse. For most enterprise automation pipelines, Pro-tier models function as the central processor — balancing capability depth with cost efficiency across document-heavy reasoning tasks: contract review, intake processing, proposal generation, and financial analysis [3].
Gemini Flash: Latency-optimized for high-throughput, real-time applications. Customer-facing agents, document triage queues, email classification — any workflow node where sub-second response is a system requirement, not a preference. Flash is where you route volume. Pro is where you route complexity.
Gemini Nano: On-device model for mobile and edge deployments where data cannot leave the local environment. In regulated industries — healthcare practices operating under HIPAA, law firms protecting attorney-client privilege — Nano is not a lightweight alternative. It is a compliance requirement disguised as a model tier.
How Gemini Models Are Deployed: API, Vertex AI, and Firebase AI Logic
Deployment surface selection is not a developer decision. It is an architectural and compliance decision that belongs in the same conversation as model selection itself [4].
Google AI Studio and the Gemini API offer rapid prototyping access with low friction — but they do not offer the compliance posture that regulated industries require. Build experiments here. Do not run production workloads here.
Vertex AI is the enterprise deployment layer. VPC service controls, audit logging, IAM granularity, data residency controls, and model version management are built into the platform. For any organization operating under HIPAA, SOC 2, or legal privilege frameworks, Vertex AI is the only architecturally defensible deployment surface [4].
Firebase AI Logic targets developer-embedded use cases — AI logic baked into mobile and web application layers. High utility for product teams building AI-native user experiences, but not the right surface for back-office workflow automation.
Which Is the Latest Gemini AI Model? Tracking the 2026 Model Frontier
As of 2026, Gemini 3 and Gemini 3.1 Pro represent the leading edge of Google's publicly available model family [3]. Gemini 3 introduces significant improvements across the three pillars of real workflow automation: long-context reasoning, multimodal understanding, and agentic task execution. These aren't incremental benchmark improvements — they represent qualitative shifts in what these models can do inside a production pipeline.
Gemini 3's context window expansion — reaching up to 1 million tokens in frontier variants — means that full-document legal analysis, longitudinal patient record synthesis, and complete codebase reasoning are no longer edge cases requiring specialized chunking architectures. They are default capabilities [2].
But here's the operational reality that most teams miss: Google's release cadence has accelerated. A model governance strategy is now a mandatory component of any enterprise AI architecture. Organizations that lock into a single model version without version management are building brittle automation pipelines that will break quietly every time Google updates an underlying model. Build for model interchangeability from day one. Pin versions in production. Test before migrating. Treat model updates the way you treat software dependency upgrades — with discipline, not optimism.
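The version-pinning discipline described above can be sketched as a small governance gate: production stays on an explicitly pinned model version, and a candidate version is promoted only after it clears a regression suite. This is a minimal illustration in plain Python, not the Gemini API itself; the model IDs and the pass-rate threshold are hypothetical placeholders.

```python
# Minimal model-governance sketch: production pipelines reference an explicit
# pin, and a candidate version replaces it only after passing regression tests.
# Model IDs and the 0.98 threshold are hypothetical placeholders.

PINNED = {"contract-review": "gemini-3.1-pro-001"}  # explicit, auditable pins

def promote(workflow: str, candidate: str, regression_pass_rate: float,
            threshold: float = 0.98) -> str:
    """Swap the pinned model only if the candidate clears the regression bar."""
    if regression_pass_rate >= threshold:
        PINNED[workflow] = candidate
    return PINNED[workflow]

# A candidate that fails regression leaves production untouched:
assert promote("contract-review", "gemini-3.1-pro-002", 0.91) == "gemini-3.1-pro-001"
# One that passes is promoted:
assert promote("contract-review", "gemini-3.1-pro-002", 0.99) == "gemini-3.1-pro-002"
```

The point of the sketch is the shape of the discipline, not the numbers: the pin is data, the promotion is a gated operation, and nothing in the pipeline references a floating "latest" alias.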
What's the Best Gemini Model for Enterprise Workflows? A Use-Case-Driven Framework
'Best' is an architectural question, not a benchmark question. The right model is the one that fits the latency, cost, compliance, and capability requirements of a specific workflow node — not the one with the highest MMLU score on a leaderboard.
A mature AI system uses multiple Gemini model tiers simultaneously, routing tasks to the appropriate model the way a network routes packets to the appropriate endpoint. Pro handles the reasoning-intensive back-office tasks. Flash handles the synchronous user-facing interactions. Nano handles the edge cases where data sovereignty is non-negotiable. The failure mode of most mid-market AI deployments is selecting a frontier model for every task — creating unnecessary cost and latency overhead throughout the entire automation pipeline.
Which Gemini Model Is Right for You? Decision Matrix
| Use Case | Recommended Model | Context Window | Relative Cost | Latency Profile |
|---|---|---|---|---|
| Complex legal/clinical reasoning | Gemini 3 / Ultra | Up to 1M tokens | High | Seconds-scale (acceptable for async work) |
| Document-heavy enterprise workflows | Gemini 3.1 Pro | Long | Medium | Moderate |
| Real-time customer-facing agents | Gemini Flash | Medium | Low | Sub-second |
| High-volume document classification | Gemini Flash-Lite | Medium | Very Low | Minimal |
| On-device / offline / regulated edge | Gemini Nano | Short | Minimal | Local (no network hop) |
Selection flowchart logic: Start with compliance — if data cannot leave the device, Nano is the answer. If the task is synchronous and user-facing, Flash. If the task requires multi-document reasoning or complex output generation, Pro. If the task involves frontier reasoning across massive context, Gemini 3 / Ultra. If budget is the primary constraint, default down one tier and evaluate quality degradation against the cost delta.
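That flowchart can be written down directly as routing logic. A minimal sketch in plain Python: the tier names follow this article's taxonomy, and the boolean inputs are deliberate simplifications of the real constraints.

```python
def select_tier(data_must_stay_on_device: bool,
                synchronous_user_facing: bool,
                frontier_context_reasoning: bool,
                multi_document_reasoning: bool) -> str:
    """Encode the selection flowchart: compliance first, then latency,
    then reasoning depth. The deepest reasoning requirement wins."""
    if data_must_stay_on_device:
        return "Nano"               # compliance trumps everything else
    if synchronous_user_facing:
        return "Flash"              # sub-second latency is a system requirement
    if frontier_context_reasoning:
        return "Gemini 3 / Ultra"   # frontier reasoning across massive context
    if multi_document_reasoning:
        return "Pro"                # document-heavy back-office reasoning
    return "Flash"                  # default down-tier; evaluate quality vs cost

# Regulated edge data routes to Nano regardless of other requirements:
assert select_tier(True, True, True, True) == "Nano"
assert select_tier(False, False, False, True) == "Pro"
```

In a production orchestrator the inputs would be derived from task metadata rather than passed by hand, but the ordering of the checks is the substance: compliance, then latency, then capability.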
Legal and Healthcare Use Cases: Model Selection Under Compliance Pressure
Boutique law firms and healthcare practices cannot treat model selection as a marketing decision. It is a risk management decision. The wrong model on the wrong deployment surface is a liability exposure, not a productivity experiment.
Long-context Pro and Gemini 3 models are structurally suited for full-document legal analysis, multi-exhibit review, and longitudinal patient record synthesis. The context window depth that makes these models expensive is precisely what makes them capable of holding an entire case file or patient history in working memory — a capability that shorter-context models simply cannot replicate without complex chunking architectures that introduce their own error surfaces.
On-device Nano deployments address scenarios where protected health information or privileged attorney-client communications cannot traverse external APIs under any circumstances. This is not paranoia. This is compliance architecture.
Vertex AI's enterprise controls — data processing agreements, model versioning locks, audit trails, VPC isolation — are not optional compliance theater. They are infrastructure requirements for any regulated-industry deployment [4]. If your current AI deployment doesn't have these controls in place, you're not running a production system. You're running a prototype with production data.
Operations and Mid-Market Enterprise Use Cases
For mid-market enterprises with 50–500 employees, the highest-ROI applications of the Gemini model family sit at the intersection of volume and reasoning complexity. Flash models power high-volume document classification, email triage, and intake routing where throughput is the primary system constraint. Pro models handle reasoning-intensive back-office tasks: proposal generation, vendor contract analysis, financial report summarization.
Multimodal capabilities unlock an entire class of use cases that text-only models cannot serve: invoice processing from scanned PDFs, visual inspection workflows, mixed-media knowledge bases where images and text coexist inside the same document. Organizations still running OCR pipelines and manual extraction workflows on structured documents have a Gemini Flash integration waiting to replace that entire process.
Gemini vs. GPT-5 vs. Claude 4.5: An Honest Benchmark Comparison
The 'Gemini vs. GPT' question is the wrong frame — but it's also unavoidable, so let's address it with precision rather than vendor loyalty.
As of 2026, the 'big 4' AI model families — Google Gemini, OpenAI GPT, Anthropic Claude, and Meta Llama — each have distinct architectural strengths, licensing models, and deployment constraints. No single provider holds a universal performance advantage across all task classes. Anyone claiming otherwise is selling something.
Context Window: Gemini 3's 1M token context window is a structural differentiator. GPT-5 and Claude 4.5 offer competitive long-context capabilities, but Gemini's architecture was designed for this from the ground up, giving it an edge on full-document and multi-document reasoning tasks [2].
Multimodal Capabilities: Gemini's native multimodality — trained on mixed-modality data rather than retrofitted — gives it a structural advantage on document-image-text fusion tasks. GPT-5 has strong vision capabilities, and Claude 4.5 has improved significantly, but Gemini's architectural coherence across modalities remains a differentiated capability.
Coding Performance: On HumanEval and similar coding benchmarks, GPT-5 and Gemini 3 are closely matched at the frontier. Gemini's deep integration with Google's own code infrastructure gives it contextual advantages in certain code generation scenarios. Claude 4.5 performs strongly on code comprehension and refactoring tasks. Run domain-specific evaluations against your actual codebase — headline benchmark scores will not predict performance on your specific use cases [1].
Pricing: Flash-tier Gemini models represent the most cost-efficient option for high-volume production workloads among the big 4. Frontier-tier pricing across providers is competitive, with per-token costs that vary by use-case profile. Meta's Llama family — open-weight and self-hostable — changes the cost equation entirely for organizations with infrastructure capacity to run their own inference.
Real-World Task Accuracy: LMSYS Chatbot Arena rankings shift regularly, and no single model dominates across all categories. The operationally relevant insight is this: benchmark performance varies by task class. Do not extrapolate from headline numbers to your specific workflow requirements. Run domain-specific evaluations against representative samples from your actual operational data.
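One practical consequence of the pricing comparison above: how you split volume across tiers dominates total cost. A back-of-envelope sketch in Python, using hypothetical per-million-token prices (placeholders only; check current published pricing before building a real cost model on numbers like these):

```python
# Back-of-envelope monthly inference cost for a routing split across tiers.
# Per-million-token prices below are HYPOTHETICAL placeholders.
PRICE_PER_M_TOKENS = {"flash": 0.10, "pro": 1.25, "frontier": 5.00}  # USD

def monthly_cost(token_volume_m: dict) -> float:
    """token_volume_m maps tier -> millions of tokens per month."""
    return sum(PRICE_PER_M_TOKENS[t] * v for t, v in token_volume_m.items())

# Routing 90% of volume to Flash and reserving Pro/frontier for hard tasks:
split = {"flash": 900, "pro": 90, "frontier": 10}
assert monthly_cost(split) == 252.5  # 90.00 + 112.50 + 50.00
```

Under these placeholder prices, sending the same 1,000M tokens entirely to the frontier tier would cost 5,000 rather than 252.50, which is why tier routing, not model choice alone, is the cost lever.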
Where Gemini Has a Structural Advantage
Native multimodality, context window depth at scale, and Google ecosystem integration are Gemini's three structural advantages. For organizations on Google Workspace and GCP, Gemini is the lowest-impedance integration path — fewer data hops, fewer security surface areas, and a compliance posture that aligns with existing Google Cloud agreements [4]. Vertex AI's enterprise controls are also more mature than many competing deployment platforms for regulated-industry workloads.
Where Gemini Has Limitations to Engineer Around
Model availability and API rate limits on consumer tiers create production reliability risks — Vertex AI enterprise contracts are the architectural solution, not a premium upsell. Gemini's ecosystem advantage is simultaneously a lock-in risk. Teams building on Gemini should architect for model portability using abstraction layers — orchestration frameworks that allow model swapping without rebuilding prompt architecture or integration logic. Treat the model as a replaceable component, not a fixed dependency.
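The abstraction layer recommended above can be as simple as a provider-agnostic interface that pipeline code depends on, with vendor SDK calls isolated behind it. A minimal sketch using Python's structural typing; the backend classes are stubs, not real SDK bindings.

```python
from typing import Protocol

class ChatModel(Protocol):
    """Provider-agnostic interface: pipelines depend on this, never on a SDK."""
    def generate(self, prompt: str) -> str: ...

class GeminiBackend:
    def generate(self, prompt: str) -> str:
        # In production this would call the Gemini API; stubbed for the sketch.
        return f"[gemini] {prompt}"

class LocalBackend:
    def generate(self, prompt: str) -> str:
        # A self-hosted or alternative-vendor model behind the same interface.
        return f"[local] {prompt}"

def run_pipeline(model: ChatModel, doc: str) -> str:
    # Pipeline logic is identical regardless of which backend is injected.
    return model.generate(f"Summarize: {doc}")

assert run_pipeline(GeminiBackend(), "contract").startswith("[gemini]")
assert run_pipeline(LocalBackend(), "contract").startswith("[local]")
```

Swapping vendors then becomes a one-line injection change rather than a rewrite of prompt architecture and integration logic, which is the portability property the section argues for.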
How to Integrate Gemini Models Into a Production Automation System
Model selection is 20% of the integration challenge. The other 80% is data pipelines, prompt architecture, output validation, error handling, and workflow orchestration. A Gemini model deployed without structured input/output schemas is a prototype. Organizations that deploy Gemini as a standalone chatbot and call it 'AI integration' have built an expensive parlor trick, not a workflow asset.
If you're serious about building production-grade Gemini integrations, the most efficient next step is to schedule a Systems Audit — a structured assessment of your current workflow architecture against the model selection and integration requirements of your specific operational context.
The Systems Architecture for Gemini-Powered Workflow Automation
Layer 1 — Data Infrastructure: Structured knowledge bases, document stores, and CRM/EHR connectors that feed clean, properly formatted context into the model. Garbage in, garbage out is not a cliché — it is a data physics law. The model cannot compensate for unstructured, inconsistent input data.
Layer 2 — Model Orchestration: Routing logic that selects the appropriate Gemini tier — or alternative model — based on task type, cost budget, and latency requirements. This is the nervous system of your AI architecture. Without it, you're making model selection decisions manually at the task level, which doesn't scale.
Layer 3 — Output Processing: Validation layers, human-in-the-loop checkpoints, and downstream system writes that make model output actionable. Raw model output is not a deliverable. Validated, formatted, system-integrated output is.
Layer 4 — Monitoring and Governance: Audit trails, performance dashboards, and model version management that keep the system defensible in regulated environments. This layer is what separates an AI system from an AI experiment. It is also the layer that most organizations skip until a compliance audit or a production failure forces the conversation.
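Layer 3's principle, that raw model output never flows downstream unchecked, can be illustrated with a minimal validation gate. The required keys here are a hypothetical schema for a contract-review output, purely for illustration.

```python
import json

REQUIRED_KEYS = {"summary", "risk_level", "citations"}  # hypothetical schema

def validate_output(raw: str) -> dict:
    """Reject model output that is not valid JSON or is missing required
    fields, so only structured, schema-conformant data reaches downstream
    systems."""
    data = json.loads(raw)  # raises ValueError on malformed output
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"model output missing keys: {sorted(missing)}")
    return data

ok = validate_output('{"summary": "...", "risk_level": "low", "citations": []}')
assert ok["risk_level"] == "low"
```

Real systems would layer type checks, value constraints, and human-in-the-loop review on top of this, but the architectural point stands: validation is a pipeline stage, not an afterthought.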
Common Integration Failure Modes to Engineer Around
Prompt injection vulnerabilities in customer-facing agents require input sanitization and output guardrails — non-negotiable for any externally exposed AI endpoint.
Hallucination in high-stakes outputs — legal citations, clinical data, financial figures — requires retrieval-augmented generation (RAG) architecture, source grounding, and confidence scoring mechanisms. Do not deploy generative models on high-stakes factual tasks without retrieval grounding.
Model version drift when Google updates underlying models requires version pinning in production and regression test suites that run on every model update before it reaches live workflows.
Cost overruns from inefficient model selection — the most preventable failure mode — require task-level cost monitoring and automatic tier downgrading for routine tasks that don't require frontier capability.
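The hallucination guardrail described above has a simple core check: every source the model cites must exist in the set the RAG layer actually retrieved. A minimal sketch of that grounding check, with hypothetical document IDs:

```python
def grounded_citations(output_citations: list[str],
                       retrieved_ids: set[str]) -> bool:
    """Return False if the model cites any source that was never retrieved —
    the basic signal for a hallucinated citation in a RAG pipeline."""
    return all(c in retrieved_ids for c in output_citations)

retrieved = {"doc-12", "doc-31", "doc-44"}   # IDs returned by the retriever
assert grounded_citations(["doc-12", "doc-44"], retrieved)
assert not grounded_citations(["doc-12", "doc-99"], retrieved)  # never retrieved
```

An ungrounded citation should route the output to human review rather than downstream systems; for legal citations, clinical data, and financial figures, this check is a floor, not a complete defense.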
Gemini Models and the Future of AI Infrastructure: What Decision-Makers Must Plan For
The Gemini model family will continue to expand throughout 2026 and beyond. Agentic AI — Gemini models operating autonomously across multi-step workflows with tool use and external API calls — is the next production frontier [5]. Organizations building on Gemini today must architect for model evolution: abstraction layers, modular prompt systems, and vendor-agnostic orchestration frameworks.
The competitive advantage in AI is shifting from 'which model you use' to 'how well your systems harness model capabilities at scale.' Decision-makers who treat model selection as a one-time procurement decision rather than an ongoing architectural practice will find themselves rebuilding from scratch every 18 months. That is not a prediction. It is a pattern already visible in the organizations that bet on first-generation GPT-3 integrations without building for model portability.
Frequently Asked Questions: Gemini AI Models
What models does Gemini use? As of 2026, the Gemini model family includes Gemini 3 (frontier), Gemini 3.1 Pro (enterprise production), Gemini Flash (latency-optimized), Gemini Flash-Lite (high-volume/low-cost), and Gemini Nano (on-device/edge). Each tier has distinct technical designations and capability profiles accessible via the Gemini API documentation [2].
Which is the latest Gemini AI model? Gemini 3 and Gemini 3.1 Pro represent the current frontier as of 2026, with Gemini 3 leading on reasoning depth and context window capacity [3].
What are the big 4 AI models? Google Gemini, OpenAI GPT, Anthropic Claude, and Meta Llama — the four dominant model families in 2026, each with distinct architectural strengths, licensing structures, and deployment constraints.
Is Gemini better than GPT? Task-dependent. Gemini has structural advantages in native multimodality and long-context processing. GPT-5 remains competitive on certain reasoning benchmarks. The correct comparison axis is integration fit with your existing infrastructure and compliance posture — not aggregate benchmark scores.
What are the top 5 AI models now? Gemini 3, GPT-5, Claude 4.5, Meta Llama (latest), and Gemini 3.1 Pro represent the leading operational models in 2026. Selection should be driven by use-case alignment, deployment requirements, and total cost of ownership — not marketing positioning.
What are the 8 AI models? Beyond the big 4 families, the broader landscape includes Mistral, Cohere Command, Amazon Titan, and various open-weight variants. For regulated enterprise deployments, the decision framework should center on compliance posture and ecosystem integration — which narrows the field considerably.
The Bottom Line
The Gemini AI model family represents one of the most capable and enterprise-ready AI infrastructures available in 2026 — but capability without architecture is noise. Decision-makers who approach Gemini as a feature to bolt onto existing workflows will extract a fraction of its potential. Those who treat model selection as an engineering discipline — matching capability tiers to workflow requirements, deploying on compliant infrastructure, and building for model evolution — will build systems that compound in value.
The model is not the system. The system is the system. Gemini is one of the most powerful components available to build it with.
Your organization's AI capability ceiling is not set by Google's model roadmap — it's set by the quality of the architecture you build around it. If you're evaluating Gemini models for a specific workflow challenge in legal, healthcare, or enterprise operations, the next step isn't another research cycle. It's a systems audit. Schedule your AI Systems Audit and we'll map the right model selection and integration architecture against your actual operational requirements — no off-the-shelf recommendations, no vendor-aligned advice.
Frequently Asked Questions: Extended Answers
Q: What models does Gemini use?
The Gemini AI model family is organized into four primary tiers, each designed for different capability and compute requirements. As of 2026, the lineup includes Gemini Ultra, the most powerful frontier-class model built for complex reasoning and multimodal tasks; Gemini Pro, a balanced model suited for enterprise-grade text, code, and analytical workloads; Gemini Flash, an optimized variant designed for high-speed, cost-efficient inference; and Gemini Nano, a lightweight model built for on-device and edge deployments. The latest generation, Gemini 3 and its derivatives, represents Google DeepMind's most advanced iteration yet. All Gemini models are natively multimodal, meaning they were trained to process text, code, images, audio, and video as a unified architecture — not as separate capabilities patched together. This is a meaningful structural distinction from competitors that retrofitted vision or audio onto existing language models. Organizations can access these models through Google AI Studio, the Gemini API, Vertex AI, or Firebase AI Logic, each offering different compliance, scalability, and integration profiles depending on the deployment context.
Q: What are the 8 AI models?
While there is no single definitive list of exactly eight AI models, the most commonly referenced frontier and widely deployed large language models in 2026 include: Google's Gemini Ultra and Gemini Pro (part of the Gemini AI models family), OpenAI's GPT-5 class, Anthropic's Claude 4.5, Meta's Llama 4, Mistral Large, and Amazon's Titan or Nova class models via AWS Bedrock. Each serves distinct use cases. Gemini AI models stand out for their native multimodality and deep integration with Google's infrastructure stack, including Vertex AI and Workspace. GPT-class models lead in developer ecosystem maturity. Claude models are favored for safety-focused enterprise deployments. Llama 4 provides open-weight flexibility for organizations that need on-premises control. The right model depends on your workflow requirements — latency, cost per token, modality needs, compliance constraints, and existing infrastructure investment should all factor into the selection decision rather than defaulting to a single popular name.
Q: Which is the latest Gemini AI model?
As of 2026, the latest Gemini AI model is Gemini 3 and its associated derivatives, released by Google DeepMind. Gemini 3 represents the most advanced iteration of the Gemini model family, with improvements across multimodal reasoning, long-context processing, and code generation that place it in direct competition with OpenAI's most capable GPT-class models. Within the Gemini 3 generation, Google has maintained the tiered architecture — Ultra, Pro, Flash, and Nano variants — allowing organizations to deploy the appropriate capability level for each workload node. The Flash and Nano variants of Gemini 3 in particular have seen significant performance-per-cost improvements, making them viable for high-volume production pipelines where inference costs at scale are a primary constraint. For decision-makers, staying current with model versions matters less than understanding which tier within the current generation is appropriate for each specific workflow. Subscribing to Google DeepMind's release notes and Vertex AI changelogs is the most reliable way to track production-relevant updates.
Q: What's the best Gemini model?
There is no single best Gemini AI model — the right model depends entirely on the specific workload, latency requirements, cost constraints, and deployment context. That said, here is a practical framework: Gemini Ultra is the strongest performer for complex, high-stakes tasks such as advanced reasoning, scientific analysis, long-document synthesis, and sophisticated multimodal processing. It carries the highest compute cost and is best reserved for tasks where quality directly impacts business outcomes. Gemini Pro is the most versatile choice for enterprise workflows — balancing capability and cost effectively across text generation, code assistance, and analytical tasks. It is the default recommendation for most production deployments. Gemini Flash is optimized for speed and cost efficiency, making it the right selection for high-throughput pipelines, real-time applications, or any workflow where response latency is a primary constraint. Gemini Nano is purpose-built for on-device or edge deployments where cloud connectivity is limited or data residency requirements prohibit external API calls. Organizations running mature AI architectures often deploy multiple Gemini tiers simultaneously, routing tasks to the appropriate model based on complexity — a pattern that maximizes both performance and cost efficiency.
Q: What are the big 4 AI models?
The term 'big 4 AI models' is not a formally defined industry category, but in 2026, the four most strategically significant frontier model families dominating enterprise AI adoption are Google's Gemini AI models, OpenAI's GPT series, Anthropic's Claude series, and Meta's Llama series. Google's Gemini stands out for native multimodality and tight integration with enterprise infrastructure via Vertex AI. OpenAI's GPT models, particularly GPT-5, maintain the largest developer ecosystem and broadest third-party tooling support. Anthropic's Claude series leads in safety-focused deployments and is favored in regulated industries. Meta's Llama series provides open-weight flexibility, enabling on-premises deployment without API dependency — a critical requirement for organizations with strict data sovereignty or compliance obligations. For technology decision-makers, the strategic question is not which of these is universally best, but which aligns with your existing infrastructure, compliance requirements, and workflow architecture. Many mature organizations deploy models from multiple families, selecting based on task-specific fit rather than organizational loyalty to a single vendor.
Q: Is Gemini better than GPT?
Comparing Gemini AI models to GPT-class models requires specificity — neither family is universally superior across all tasks and contexts. As of 2026, Gemini 3 and GPT-5 are competitive at the frontier level, with meaningful differences in specific capability areas. Gemini has a structural advantage in native multimodality: because Gemini was trained on mixed-modality data from the ground up, it handles tasks involving text, images, audio, video, and code as a unified architecture. GPT models have been retrofitted with multimodal capabilities, which can introduce inconsistencies at modality boundaries. Gemini also has a natural advantage for organizations already operating within Google's infrastructure — Workspace, BigQuery, Vertex AI, and Firebase all integrate more cleanly with Gemini than with GPT. GPT models, conversely, have a more mature third-party ecosystem, broader developer tooling, and a longer track record in enterprise production environments. For most decision-makers, the choice between Gemini and GPT should be driven by infrastructure fit, specific capability requirements, and total cost of ownership — not benchmark rankings, which shift with every model release.
Q: What are the top 5 AI models now?
As of 2026, the five most capable and widely deployed AI models across enterprise and developer use cases are: Google Gemini Ultra (Gemini 3 generation) — Google DeepMind's flagship frontier model, leading in native multimodal capability and long-context processing; OpenAI GPT-5 — the most mature enterprise AI model family with the broadest ecosystem and tooling support; Anthropic Claude 4.5 — favored for safety-critical and compliance-sensitive deployments; Meta Llama 4 — the leading open-weight model enabling on-premises and sovereign AI deployments; and Mistral Large — a strong European-origin model gaining traction in organizations requiring GDPR-aligned infrastructure. Each model has distinct strengths. Gemini AI models in particular offer the deepest integration with Google's enterprise infrastructure stack, making them the default choice for organizations already invested in Google Cloud. The 'top 5' designation shifts frequently as new model versions release, so decision-makers should evaluate models based on current benchmark performance in their specific task categories — not on general popularity rankings.
Q: What is the $900,000 AI job?
The '$900,000 AI job' refers to highly publicized compensation packages for elite AI researchers and machine learning engineers at frontier AI companies, including Google DeepMind, OpenAI, Anthropic, and Meta. These roles — typically Principal Research Scientists, Distinguished Engineers, or AI Architects working directly on frontier model development — have commanded total compensation packages in the $800,000 to over $1,000,000 range in 2025 and 2026, primarily driven by equity grants rather than base salary alone. The acute scarcity of researchers capable of training and fine-tuning models at the scale required for systems like Gemini AI models has created extreme compensation pressure at the top of the talent market. For most organizations, this talent market reality underscores a critical strategic point: building proprietary frontier models is not a realistic path for the vast majority of enterprises. The practical alternative is developing deep operational expertise in deploying, fine-tuning, and architecting workflows around existing frontier models like Gemini — a capability gap that is generating its own emerging category of high-value AI engineering and AI operations roles, typically in the $150,000 to $400,000 range depending on scope and seniority.
References
[1] IBM Think: Google Gemini. https://www.ibm.com/think/topics/google-gemini
[2] Gemini API documentation: Models. https://ai.google.dev/gemini-api/docs/models
[3] Google AI Studio: Gemini 3. https://aistudio.google.com/models/gemini-3
[4] Google Cloud Vertex AI documentation: Generative AI models. https://docs.cloud.google.com/vertex-ai/generative-ai/docs/models
[5] Google Gemini. https://gemini.google.com/