SAN FRANCISCO — xAI has released Quality Mode for its Grok Imagine API, bringing a significant leap in photorealistic image generation to enterprise developers and teams. The launch, announced on May 6, 2026, introduces a new tier of image output built on Aurora — xAI's proprietary autoregressive Mixture-of-Experts model — designed to produce cinematic-quality visuals, accurate text rendering within images, and fine-grained creative control through natural-language prompts.
Quality Mode is available through the Grok Imagine API at resolutions up to 2K (2048×2048), across 13 aspect ratios ranging from 2:1 to 1:2, and in JPEG, PNG, and WebP output formats.
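To make those limits concrete, here is a minimal sketch of how a client might turn an aspect ratio into pixel dimensions before sending a request. It is illustrative only: the function names `dimensions_for` and `check_format` are hypothetical, and pinning the longer side to 2048 pixels is an assumption, since the article does not say how the API sizes non-square output or which 13 ratios it accepts.

```python
from fractions import Fraction

MAX_SIDE = 2048                            # 2K ceiling quoted in the article
OUTPUT_FORMATS = {"jpeg", "png", "webp"}   # formats quoted in the article
WIDEST, TALLEST = Fraction(2, 1), Fraction(1, 2)


def dimensions_for(aspect: str, max_side: int = MAX_SIDE) -> tuple[int, int]:
    """Return (width, height) for a "W:H" ratio.

    Assumption: the longer side is pinned to max_side and the shorter side
    is rounded to the nearest pixel. The real API may size images differently.
    """
    w_part, h_part = (int(part) for part in aspect.split(":"))
    if w_part <= 0 or h_part <= 0:
        raise ValueError(f"{aspect} must use positive integers")
    ratio = Fraction(w_part, h_part)
    if not TALLEST <= ratio <= WIDEST:
        raise ValueError(f"{aspect} is outside the 2:1 to 1:2 range")
    if ratio >= 1:
        return max_side, round(max_side / ratio)
    return round(max_side * ratio), max_side


def check_format(fmt: str) -> str:
    """Normalize an output format name and reject anything not listed."""
    normalized = fmt.lower()
    if normalized not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {fmt}")
    return normalized
```

Under these assumptions, `dimensions_for("16:9")` gives `(2048, 1152)` and `dimensions_for("1:2")` gives `(1024, 2048)`, while a ratio such as `3:1` is rejected as out of range.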
What Aurora Delivers
Most image generation systems rely on diffusion models, which start from random noise and progressively denoise it under the guidance of a prompt. Aurora takes a fundamentally different approach: it generates images autoregressively, token by token, the same way that Grok generates text. This architectural choice yields measurable differences in output quality: consistent facial structure across a session, accurate material textures, and cinematic lighting behavior that diffusion models often struggle to replicate.
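The token-by-token idea can be illustrated with a toy sketch. This is not Aurora's implementation: the four-token vocabulary, the stand-in "model" `toy_next_token_weights`, and the grid layout are all hypothetical, chosen only to show that each new token is sampled conditioned on the tokens already produced.

```python
import random

VOCAB_SIZE = 4  # hypothetical number of discrete image tokens


def toy_next_token_weights(prefix: list[int]) -> list[float]:
    """Stand-in for a trained model: favor repeating the most recent token."""
    weights = [1.0] * VOCAB_SIZE
    if prefix:
        weights[prefix[-1]] += 3.0
    return weights


def generate_autoregressively(length: int, seed: int = 0) -> list[int]:
    """Sample tokens one at a time, each conditioned on all earlier tokens."""
    rng = random.Random(seed)
    tokens: list[int] = []
    for _ in range(length):
        weights = toy_next_token_weights(tokens)
        next_token = rng.choices(range(VOCAB_SIZE), weights=weights, k=1)[0]
        tokens.append(next_token)
    return tokens


def to_grid(tokens: list[int], width: int) -> list[list[int]]:
    """Arrange the flat token sequence into rows, like patches of an image."""
    return [tokens[i:i + width] for i in range(0, len(tokens), width)]
```

Calling `to_grid(generate_autoregressively(16), 4)` yields a 4×4 grid of token ids. A real system would replace the toy weights with a learned network and decode the tokens into pixels; a diffusion model, by contrast, would refine the whole image at once over several denoising steps.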
The practical result is higher realism across a wide range of styles and subjects. In independent leaderboard evaluations, Grok Imagine ranked among the top five text-to-image models globally as of early May 2026.
Text Rendering and Creative Control
Two of Quality Mode's most distinctive capabilities are strong in-image text rendering and multi-image reference composition. Most image generators produce illegible or distorted text when asked to include written copy in an image. Quality Mode renders accurate, legible typography — menus, signage, labels — from a natural-language description alone.





