Snowflake Cortex AI to Host Llama 3.1 Collection of Multilingual Open Source LLMs

July 23, 2024

Snowflake will host the Llama 3.1 collection of multilingual open source large language models (LLMs) in Snowflake Cortex AI for enterprises to easily harness and build powerful AI applications at scale.

This offering includes Meta’s largest and most powerful open source LLM, Llama 3.1 405B, with Snowflake developing and open sourcing the inference system stack to enable real-time, high-throughput inference and further democratize powerful natural language processing and generation applications. Snowflake’s AI Research Team has optimized Llama 3.1 405B for both inference and fine-tuning, supporting a massive 128K context window from day one, while enabling real-time inference with up to 3x lower end-to-end latency and 1.4x higher throughput than existing open source solutions. Moreover, it allows for fine-tuning on the massive model using just a single GPU node — eliminating costs and complexity for developers and users — all within Cortex AI.

By partnering with Meta, Snowflake is providing customers with easy, efficient, and trusted ways to seamlessly access, fine-tune, and deploy Meta’s newest models in the AI Data Cloud, with a comprehensive approach to trust and safety built-in at the foundational level.

“Snowflake’s world-class AI Research Team is blazing a trail for how enterprises and the open source community can harness state-of-the-art open models like Llama 3.1 405B for inference and fine-tuning in a way that maximizes efficiency,” said Vivek Raghunathan, VP of AI Engineering, Snowflake. “We’re not just bringing Meta’s cutting-edge models directly to our customers through Snowflake Cortex AI. We’re arming enterprises and the AI community with new research and open source code that supports 128K context windows, multi-node inference, pipeline parallelism, 8-bit floating point quantization, and more to advance AI for the broader ecosystem.”

Snowflake’s AI Research Team continues to push the boundaries of open source innovations through its regular contributions to the AI community and transparency around how it is building cutting-edge LLM technologies. In tandem with the launch of Llama 3.1 405B, Snowflake’s AI Research Team is now open sourcing its Massive LLM Inference and Fine-Tuning System Optimization Stack in collaboration with DeepSpeed, Hugging Face, vLLM, and the broader AI community. This breakthrough establishes a new state-of-the-art for open source inference and fine-tuning systems for multi-hundred billion parameter models.

Snowflake’s Massive LLM Inference and Fine-Tuning System Optimization Stack uses advanced parallelism techniques and memory optimizations, enabling fast and efficient AI processing, without needing complex and expensive infrastructure. For Llama 3.1 405B, Snowflake’s system stack delivers real-time, high-throughput performance on just a single GPU node and supports a massive 128k context windows across multi-node setups. This flexibility extends to both next-generation and legacy hardware, making it accessible to a broader range of businesses. Moreover, data scientists can fine-tune Llama 3.1 405B using mixed precision techniques on fewer GPUs, eliminating the need for large GPU clusters. As a result, organizations can adapt and deploy powerful enterprise-grade generative AI applications easily, efficiently, and safely.

Snowflake’s AI Research Team has also developed optimized infrastructure for fine-tuning inclusive of model distillation, safety guardrails, retrieval augmented generation (RAG), and synthetic data generation so that enterprises can easily get started with these use cases within Cortex AI.

Snowflake is making Snowflake Cortex Guard generally available to further safeguard against harmful content for any LLM application or asset built in Cortex AI — either using Meta’s latest models, or the LLMs available from other leading providers including AI21 Labs, Google, Mistral AI, Reka, and Snowflake itself. Cortex Guard leverages Meta’s Llama Guard 2, further unlocking trusted AI for enterprises so they can ensure that the models they’re using are safe.

Industry News

Check Point Software Emerges as a Leader and Only Outperformer Among 14 Vendors in Enterprise Firewalls, According to Latest GigaOm Radar Report

April 21, 2025

Check Point® Software Technologies Ltd.(link is external) announced that it has ranked as a Leader and the only Outperformer for its Check Point Quantum(link is external) Security Solutions in GigaOm’s latest Radar for Enterprise Firewall report(link is external).

Postman Introduces Bring Your Own Key (BYOK) Encryption and Spec Hub

April 21, 2025

Postman announced new releases designed to help organizations build APIs faster, more securely, and with less friction.

SnapLogic Releases AgentCreator 3.0

April 21, 2025

SnapLogic announced AgentCreator 3.0, an evolution in agentic AI technology that eliminates the complexity of enterprise AI adoption.

GitLab Duo with Amazon Q Released

April 17, 2025

GitLab announced the general availability of GitLab Duo with Amazon Q.

Perforce Delphix Partners with Liquibase

April 17, 2025

Perforce Software and Liquibase announced a strategic partnership to enhance secure and compliant database change management for DevOps teams.

Spacelift Launches Saturnhead AI

April 17, 2025

Spacelift announced the launch of Saturnhead AI — an enterprise-grade AI assistant that slashes DevOps troubleshooting time by transforming complex infrastructure logs into clear, actionable explanations.

CodeSecure Integrates with FOSSA

April 16, 2025

CodeSecure and FOSSA announced a strategic partnership and native product integration that enables organizations to eliminate security blindspots associated with both third party and open source code.

Bauplan Launches with $7.5 Million in Seed Funding

April 16, 2025

Bauplan, a Python-first serverless data platform that transforms complex infrastructure processes into a few lines of code over data lakes, announced its launch with $7.5 million in seed funding.

Perforce Introduces Kafka Service Bundle

April 15, 2025

Perforce Software announced the launch of the Kafka Service Bundle, a new offering that provides enterprises with managed open source Apache Kafka at a fraction of the cost of traditional managed providers.

LambdaTest Launches HyperExecute MCP Server

April 14, 2025

LambdaTest announced the launch of the HyperExecute MCP Server, an enhancement to its AI-native test orchestration platform, HyperExecute.

Cloudflare Announces Workers VPC and VPC Private Link

April 14, 2025

Cloudflare announced Workers VPC and Workers VPC Private Link, new solutions that enable developers to build secure, global cross-cloud applications on Cloudflare Workers.

Nutrient Expands Cloud-Based Services

April 14, 2025

Nutrient announced a significant expansion of its cloud-based services, as well as a series of updates to its SDK products, aimed at enhancing the developer experience by allowing developers to build, scale, and innovate with less friction.

Check Point Recognized for #1 AI-Powered Cyber Security Platform by Miercom

April 10, 2025

Check Point® Software Technologies Ltd.(link is external) announced that its Infinity Platform has been named the top-ranked AI-powered cyber security platform in the 2025 Miercom Assessment.

Orca Introduces Bitbucket App

April 10, 2025

Orca Security announced the Orca Bitbucket App, a cloud-native seamless integration for scanning Bitbucket Repositories.

Live API for Gemini Models in Preview

April 10, 2025

The Live API for Gemini models is now in Preview, enabling developers to start building and testing more robust, scalable applications with significantly higher rate limits.

DEVOPSdigest

Industry News

Upcoming Webinars

On-Demand Webinars

Analyst Reports

White Papers

Media Partners

The Latest

Hot Topics

Industry News

Search form

Upcoming Webinars

On-Demand Webinars

Analyst Reports

White Papers

Media Partners

User login

The Latest

Hot Topics