On April 6, 2025, Meta announced the release of Llama 4, a new collection of AI models designed to enhance its AI assistant across various platforms, including WhatsApp, Messenger, and Instagram. The two models, Llama 4 Scout and Llama 4 Maverick, are now available for download, with a third model, Llama 4 Behemoth, currently in training.
- Meta announces Llama 4 AI models
- Llama 4 Scout runs on a single GPU
- Llama 4 Behemoth has 288 billion active parameters
- MoE architecture conserves model resources
- Llama 4 criticized for license restrictions
- LlamaCon conference scheduled for April 29th
Meta’s Llama 4 collection includes Llama 4 Scout, a compact model that runs on a single Nvidia H100 GPU, and Llama 4 Maverick, positioned against other leading models such as GPT-4o and Gemini 2.0 Flash. Llama 4 Scout offers a 10-million-token context window and, according to Meta, outperforms Google’s Gemini 2.0 Flash-Lite and Mistral 3.1 across a range of benchmarks.
Key specifications for the Llama 4 models include:
- Llama 4 Scout: 10-million-token context window, optimized for single GPU usage.
- Llama 4 Maverick: Designed to compete with advanced models like GPT-4o.
- Llama 4 Behemoth (in development): 288 billion active parameters out of roughly 2 trillion total.
Meta has adopted a “mixture of experts” (MoE) architecture for Llama 4, in which each input activates only a subset of the model’s parameters (the “experts”) rather than the full network, reducing the compute required per query. The company plans to reveal more about its AI strategy at the upcoming LlamaCon conference on April 29, 2025. Despite being labeled “open-source,” the Llama 4 license imposes restrictions on commercial use, requiring very large entities to seek permission from Meta before deploying the models.
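To make the efficiency gain of mixture-of-experts concrete, here is a minimal sketch of top-k expert routing in NumPy. This is an illustrative toy, not Meta's implementation: the gating matrix, expert functions, and dimensions are all hypothetical, and real MoE layers route per token inside a transformer block.

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=2):
    """Toy MoE layer: route input x through only the top_k scoring experts."""
    logits = x @ gate_w                      # gating score for each expert
    chosen = np.argsort(logits)[-top_k:]     # indices of the top_k experts
    weights = np.exp(logits[chosen])
    weights /= weights.sum()                 # softmax over the chosen experts only
    # Only the selected experts execute, so compute scales with top_k,
    # not with the total number of experts (the source of MoE's efficiency).
    return sum(w * experts[i](x) for w, i in zip(weights, chosen))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
gate_w = rng.normal(size=(d, n_experts))
# Each "expert" here is just a small linear map (hypothetical stand-in).
expert_ws = [rng.normal(size=(d, d)) for _ in range(n_experts)]
experts = [lambda v, w=w: v @ w for w in expert_ws]

x = rng.normal(size=d)
y = moe_forward(x, gate_w, experts, top_k=2)
```

The key point the sketch shows: with 4 experts and `top_k=2`, only half the expert parameters are touched per input, which is how a model like Behemoth can hold ~2 trillion total parameters while activating only 288 billion per query.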
Meta’s introduction of the Llama 4 models marks a significant step in its AI development, emphasizing performance and resource efficiency. The company aims to position itself competitively against other leading AI providers while navigating the ongoing debate over what qualifies as open-source licensing.