What is it?
Gemma 4 E2B is Google DeepMind’s mobile-first member of the Gemma 4 family, released in April 2026. With 2.3 billion effective parameters (using the Per-Layer Embedding architecture) and a 1.5 GB quantized footprint, it is purpose-built to run entirely on consumer phones: no cloud calls, no streaming, no privacy compromises. Cove uses Gemma 4 across all four of its apps (Travel, Voice, Photo, Health), making it the most widely deployed model in Cove’s consumer lineup at the time of writing.
Note on parameter counts: official labels say “E2B = 2.3B effective parameters”, referring to weights active in each forward pass. The Per-Layer Embedding (PLE) lookup tables bring the total weight count to ~5.1B, but those tables are accessed selectively rather than computed through. The 1.5 GB quantized footprint is what hits your phone’s storage.
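As a rough sanity check on those numbers, the arithmetic below works out how many bits per weight a 1.5 GB file implies. It is only a sketch: it assumes decimal gigabytes (1 GB = 10^9 bytes), and since the text does not say whether the 1.5 GB figure covers all ~5.1B weights or only the 2.3B effective ones, both cases are computed. The function name is illustrative.

```python
def bits_per_weight(footprint_gb: float, params_billion: float) -> float:
    """Average bits stored per weight for a given file size and parameter count."""
    total_bits = footprint_gb * 1e9 * 8  # assumes 1 GB = 10**9 bytes
    return total_bits / (params_billion * 1e9)


FOOTPRINT_GB = 1.5  # quantized footprint quoted above
EFFECTIVE_B = 2.3   # effective parameters, in billions
TOTAL_B = 5.1       # approximate total including PLE tables, in billions

print(f"effective only: {bits_per_weight(FOOTPRINT_GB, EFFECTIVE_B):.2f} bits/weight")
print(f"all weights:    {bits_per_weight(FOOTPRINT_GB, TOTAL_B):.2f} bits/weight")
```

Under these assumptions the result is roughly 5.2 bits per weight in the first case and 2.4 in the second. Read this as an order-of-magnitude check, not a statement about the actual quantization scheme.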
Core specs at a glance
(See the spec card above.)
What devices can run it?
Gemma 4 E2B runs comfortably on flagship Android phones (Pixel 8 and newer, Galaxy S24 and newer, OnePlus 12 and newer) and on the iPhone 15 Pro, 15 Pro Max, and iPhone 16 family. It will technically install on devices with 6 GB RAM, but token throughput drops sharply below 8 GB. M-series iPads and recent MacBook Air / Pro models are also supported, where it benefits from higher memory bandwidth.
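The RAM thresholds above could be turned into a simple device gate along these lines. This is a minimal sketch, not a vendor API: the function name and tier labels are hypothetical, and treating devices below 6 GB as unsupported is an assumption, since the text only describes behavior at 6 GB and 8 GB.

```python
def ram_tier(ram_gb: float) -> str:
    """Classify a device by RAM using the 6 GB and 8 GB thresholds described above."""
    if ram_gb >= 8:
        return "comfortable"
    if ram_gb >= 6:
        return "installs, reduced throughput"
    # Behavior below 6 GB is not stated in the text; treated as unsupported here.
    return "unsupported (assumption)"


for ram in (4, 6, 8, 12):
    print(f"{ram} GB -> {ram_tier(ram)}")
```

In a real app the RAM figure would come from the platform's own memory query, and a measured tokens-per-second check on first launch would be a more reliable gate than RAM alone.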
Strengths and limitations
Strengths. Best-in-class size-to-quality ratio for general text tasks, native multimodal support (text + vision + audio), Apache 2.0 license, and Google’s active maintenance with quarterly updates. Distillation from larger Gemini family models gives it broader knowledge than its parameter count suggests.
Limitations. Trails Phi-4-multimodal on math and reasoning benchmarks. The 128K context is now on par with Llama 3.2, so context length is no longer the bottleneck for long documents. Multilingual quality, however, is uneven: strong on the top 20 languages, weaker on under-represented ones.
When to choose it (and when not to)
Choose Gemma 4 E2B if: you need a balanced general-purpose on-device model, you want text + vision + audio in one runtime, you’re shipping to phones with at least 6 GB RAM (ideally 8 GB or more), and license simplicity matters.
Skip it if: your workload is reasoning-heavy (use Phi-4-multimodal or DeepSeek-R1 Distill), you need million-token context (still cloud-only territory), or you’re targeting Apple-only and want first-party tools (use Apple Foundation Models).
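The choose/skip guidance above can be written down as a small decision helper. This is an illustrative sketch only: the `Requirements` fields, the `recommend` function, and the order in which the conditions are checked are all assumptions made for the example, and the returned strings simply echo the alternatives named in the text.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical project requirements; field names are illustrative."""
    reasoning_heavy: bool = False
    needs_million_token_context: bool = False
    apple_only_first_party: bool = False


def recommend(req: Requirements) -> str:
    """Map requirements to the options named in the guidance above."""
    if req.needs_million_token_context:
        return "cloud model"  # million-token context is described as cloud-only
    if req.reasoning_heavy:
        return "Phi-4-multimodal or DeepSeek-R1 Distill"
    if req.apple_only_first_party:
        return "Apple Foundation Models"
    return "Gemma 4 E2B"  # balanced general-purpose default


print(recommend(Requirements()))
print(recommend(Requirements(reasoning_heavy=True)))
```

A real selection process would also weigh device RAM, licensing, and language coverage, which this sketch leaves out.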
How it compares to similar on-device models
The two closest alternatives are Microsoft Phi-4-multimodal (larger, sharper reasoning, MIT license, also text + vision + audio) and Qwen 3.5 2B (stronger Chinese, comparable size, 262K context). For a full side-by-side comparison, see the leaderboard.
In a real Cove app
Cove Travel uses Gemma 4 for camera-based menu translation and offline voice translation; Cove Voice uses it for AI-summarized voice notes. Both apps demonstrate that Gemma 4 E2B is production-ready for consumer use cases, not just a research demo.