Google Launches Gemma 4: Four Open Models with Frontier-Level Efficiency
Summary
- Gemma 4 released in four sizes — E2B, E4B, 26B MoE, 31B Dense — under Apache 2.0 license
- 31B model ranks #3 and 26B ranks #6 among all open models on Arena AI text leaderboard
- All models support multimodal input (video, images); edge variants add native audio input
- Context windows up to 256K tokens; natively trained on 140+ languages
- Gemma ecosystem surpasses 400 million downloads and 100,000 community variants
Details
Gemma 4 released in four sizes under Apache 2.0
Google released Gemma 4 in E2B, E4B, 26B MoE, and 31B Dense configurations. The Apache 2.0 license allows commercial use and modification without restrictions, making these models broadly accessible for enterprise and research deployment.
31B model ranks #3 and 26B ranks #6 among open models globally
Rankings are based on the Arena AI text leaderboard as of April 1, 2026. Google states the 26B MoE model outperforms models with 20x more parameters, underscoring the efficiency gains achieved in this generation.
Native agentic workflow support across the entire model family
All Gemma 4 models include native function-calling, structured JSON output, and system instruction handling. These features enable developers to build autonomous agents that interact with external APIs and execute multi-step workflows without additional scaffolding.
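These agentic features map onto the OpenAI-compatible chat completion schema that most open-model servers (including llama.cpp's server, mentioned in the community reactions below) expose. A minimal sketch of a function-calling request body follows; the tool name, model id, and endpoint conventions are illustrative assumptions, not details from the Gemma 4 announcement.

```python
import json

# Hypothetical tool definition in the OpenAI-compatible schema;
# the "get_weather" tool is an illustrative placeholder.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Request body combining system instructions, a user turn, and the tool
# list; a model with native function-calling can answer with a structured
# tool_call instead of free text. The model id is an assumed name.
request_body = {
    "model": "gemma-4-e4b",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "What's the weather in Zurich?"},
    ],
    "tools": [weather_tool],
}

print(json.dumps(request_body, indent=2))
```

Because the schema is server-agnostic, the same payload shape works whether the model runs behind a local llama.cpp server or a hosted endpoint.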
Full multimodal support including video, images; audio on edge models
All four models process images and video at variable resolutions, with particular strength in OCR and chart understanding. The E2B and E4B edge models additionally support native audio input for speech recognition and understanding.
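Image input in OpenAI-compatible servers is typically passed as a content-parts message with a base64 data URL. A sketch of what an OCR/chart-reading request message might look like, with placeholder image bytes (the format is an assumption about the serving layer, not a documented Gemma 4 API):

```python
import base64
import json

# Placeholder bytes standing in for a real PNG of a chart.
fake_png = base64.b64encode(b"\x89PNG placeholder bytes").decode()

# Multimodal user turn in the OpenAI-style content-parts format:
# one text part with the instruction, one image part with a data URL.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Extract the totals from this chart."},
        {
            "type": "image_url",
            "image_url": {"url": f"data:image/png;base64,{fake_png}"},
        },
    ],
}

print(json.dumps(message)[:80])
```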
Context windows of 128K (edge) and 256K (larger models)
The E2B and E4B models support 128K token context windows for long documents and extended conversations. The 26B and 31B models extend this to 256K tokens, enabling full repository ingestion or book-length documents in a single prompt.
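A rough fit check makes the 256K figure concrete. This sketch uses the common ~4-characters-per-token heuristic; the actual Gemma tokenizer will yield different exact counts.

```python
# Estimate whether a text fits in the 256K-token window of the
# 26B/31B models, assuming ~4 characters per token (a rough
# heuristic, not the real tokenizer's ratio).
CONTEXT_WINDOW = 256_000

def fits_in_context(num_chars: int, chars_per_token: float = 4.0) -> bool:
    """Return True if the estimated token count fits in the window."""
    est_tokens = num_chars / chars_per_token
    return est_tokens <= CONTEXT_WINDOW

# A 300-page book at ~2,000 characters per page is ~600K chars,
# roughly 150K tokens: comfortably inside the window.
book_chars = 300 * 2_000
print(fits_in_context(book_chars))   # True

# A large repository at ~5M characters (~1.25M tokens) would not fit.
print(fits_in_context(5_000_000))    # False
```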
Models natively trained on 140+ languages
Gemma 4's multilingual training coverage is designed to support global application development without relying on translation layers, which is particularly relevant for developers building localized products in lower-resource languages.
Gemma ecosystem: 400M+ downloads and 100,000+ community variants
These figures span all Gemma generations since the first release. The scale of community adoption — fine-tunes, quantizations, and derivative models — reflects the practical utility of the open-weight format for downstream customization.
Yale University used Gemma for Cell2Sentence-Scale cancer therapy research
Google cited this collaboration as a concrete example of high-impact fine-tuning: the project explored new pathways for cancer therapy using the model's ability to process and reason over biological data.
Gemma 4 shares research lineage with Gemini 3
Google positions Gemma 4 as built from the same underlying research and technology as its flagship Gemini 3 proprietary models, suggesting architectural and training advances flow from the closed to the open model line.
Edge-first sizing targets Android devices and laptop GPUs
The E2B and E4B models are explicitly designed for billions of Android devices and consumer hardware, reflecting a strategy to embed Gemma into on-device applications where latency, privacy, and cost make cloud inference impractical.
What This Means
Gemma 4 raises the ceiling for what developers can deploy on commodity hardware — a 31B model ranking third among all open models globally means teams can now access near-frontier reasoning without cloud dependency or large GPU clusters. The inclusion of native agentic features, long context, and full multimodal support across all four sizes makes this a strong foundation for production AI applications beyond research use. With Apache 2.0 licensing and 400 million downloads already behind the Gemma brand, this release is positioned to become a significant reference point in open model benchmarking and community development through 2026.
Sentiment
Broadly excited about local efficiency and benchmark performance
“Google Gemma 4 is here - and it delivers 🤯 Here's HOW TO run it on your hardware (runs on most devices) with llama.cpp to give you a Chat UI + OpenAI chat completion endpoint instantly!”
“Gemma 4 is here! The best open-source model you can run on your machine. Day-0 support in a llama.cpp. Check it out!”
“Gemma-4-31B is now live in Text Arena - ranking #3 among open models (#27 overall), matching much larger models at 10× smaller scale! ... Congrats to @GoogleDeepMind on a major step forward for open models!”
“Qwen 3.5 27B is still the top scoring model in this category of memory usage... I don't believe there is currently any hint at Gemma 4 being currently the top local inference model.”
Split
~90/10 positive/skeptical — most praise efficiency and benchmarks, with minor doubts about whether it tops rivals like Qwen for local inference.
