Snowflake Cortex AI to Host Llama 3.1 Collection of Multilingual Open Source LLMs
July 23, 2024

Snowflake will host the Llama 3.1 collection of multilingual open source large language models (LLMs) in Snowflake Cortex AI for enterprises to easily harness and build powerful AI applications at scale.

This offering includes Meta’s largest and most powerful open source LLM, Llama 3.1 405B. Snowflake is developing and open sourcing the inference system stack to enable real-time, high-throughput inference and further democratize powerful natural language processing and generation applications. Snowflake’s AI Research Team has optimized Llama 3.1 405B for both inference and fine-tuning, supporting a massive 128K context window from day one while enabling real-time inference with up to 3x lower end-to-end latency and 1.4x higher throughput than existing open source solutions. Moreover, the stack allows fine-tuning of the massive model on just a single GPU node, eliminating cost and complexity for developers and users, all within Cortex AI.
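
For a concrete picture of what "within Cortex AI" looks like, here is a minimal sketch of calling a hosted Llama 3.1 model from Python via the snowflake-ml-python package. The connection parameters and the model identifier "llama3.1-405b" are illustrative assumptions; check the Cortex documentation for the exact model names available in your account and region.

```python
# Minimal sketch: calling a Cortex-hosted Llama 3.1 model from Python.
# Assumes the snowflake-ml-python package; the model identifier
# "llama3.1-405b" and the connection details are illustrative assumptions.
from snowflake.snowpark import Session
from snowflake.cortex import Complete

session = Session.builder.configs({
    "account": "<account>",   # placeholders; use your own credentials
    "user": "<user>",
    "password": "<password>",
}).create()

answer = Complete(
    "llama3.1-405b",  # hosted model name (assumed)
    "Summarize the key drivers of last quarter's support ticket volume.",
    session=session,
)
print(answer)
```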

By partnering with Meta, Snowflake is providing customers with easy, efficient, and trusted ways to seamlessly access, fine-tune, and deploy Meta’s newest models in the AI Data Cloud, with a comprehensive approach to trust and safety built in at the foundational level.

“Snowflake’s world-class AI Research Team is blazing a trail for how enterprises and the open source community can harness state-of-the-art open models like Llama 3.1 405B for inference and fine-tuning in a way that maximizes efficiency,” said Vivek Raghunathan, VP of AI Engineering, Snowflake. “We’re not just bringing Meta’s cutting-edge models directly to our customers through Snowflake Cortex AI. We’re arming enterprises and the AI community with new research and open source code that supports 128K context windows, multi-node inference, pipeline parallelism, 8-bit floating point quantization, and more to advance AI for the broader ecosystem.”

Snowflake’s AI Research Team continues to push the boundaries of open source innovation through its regular contributions to the AI community and transparency around how it is building cutting-edge LLM technologies. In tandem with the launch of Llama 3.1 405B, Snowflake’s AI Research Team is now open sourcing its Massive LLM Inference and Fine-Tuning System Optimization Stack in collaboration with DeepSpeed, Hugging Face, vLLM, and the broader AI community. This breakthrough establishes a new state of the art for open source inference and fine-tuning systems for multi-hundred-billion-parameter models.

Snowflake’s Massive LLM Inference and Fine-Tuning System Optimization Stack uses advanced parallelism techniques and memory optimizations, enabling fast and efficient AI processing without the need for complex and expensive infrastructure. For Llama 3.1 405B, Snowflake’s system stack delivers real-time, high-throughput performance on just a single GPU node and supports a massive 128K context window across multi-node setups. This flexibility extends to both next-generation and legacy hardware, making it accessible to a broader range of businesses. Moreover, data scientists can fine-tune Llama 3.1 405B using mixed precision techniques on fewer GPUs, eliminating the need for large GPU clusters. As a result, organizations can adapt and deploy powerful enterprise-grade generative AI applications easily, efficiently, and safely.
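
As an illustration of the mixed-precision, parameter-efficient approach described above, the sketch below fine-tunes a Llama 3.1 checkpoint with bfloat16 weights and LoRA adapters using Hugging Face Transformers and PEFT. This is not Snowflake’s open-sourced stack itself; the model name, data file, and hyperparameters are placeholder assumptions.

```python
# Illustrative sketch of mixed-precision, parameter-efficient fine-tuning
# with Hugging Face Transformers + PEFT. Not Snowflake's open-sourced stack;
# model name, data file, and hyperparameters are placeholder assumptions.
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-3.1-405B"  # gated model; requires access approval

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # mixed precision keeps memory in check
    device_map="auto",           # shard layers across the GPUs on this node
)

# Train small low-rank adapters instead of all 405B base parameters.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))

train_ds = load_dataset("json", data_files="train.jsonl", split="train")  # placeholder data
train_ds = train_ds.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=2048))

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama31-405b-ft",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,  # simulate a larger batch on few GPUs
        bf16=True,                       # mixed-precision training
    ),
    train_dataset=train_ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```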

Snowflake’s AI Research Team has also developed optimized fine-tuning infrastructure that includes model distillation, safety guardrails, retrieval-augmented generation (RAG), and synthetic data generation, so that enterprises can easily get started with these use cases within Cortex AI.
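
Of those use cases, retrieval-augmented generation is the easiest to illustrate. The sketch below, which reuses the session and Complete call from the earlier example, retrieves the most relevant document chunks by vector similarity and grounds the model’s answer in them. The table and column names are hypothetical, and the EMBED_TEXT_768 and VECTOR_COSINE_SIMILARITY function names reflect Cortex SQL functions at the time of writing; verify them against current documentation.

```python
# Hedged sketch of a RAG flow in Cortex AI, reusing `session` and `Complete`
# from the earlier example. The docs_chunks table and its columns are
# hypothetical; the SQL function names are assumptions based on Cortex docs.
question = "What does our travel policy say about international flights?"

top_chunks = session.sql(
    """
    SELECT chunk_text
    FROM docs_chunks  -- hypothetical table of pre-embedded document chunks
    ORDER BY VECTOR_COSINE_SIMILARITY(
        chunk_vec,
        SNOWFLAKE.CORTEX.EMBED_TEXT_768('snowflake-arctic-embed-m', ?)
    ) DESC
    LIMIT 4
    """,
    params=[question],
).collect()

context = "\n".join(row["CHUNK_TEXT"] for row in top_chunks)
answer = Complete(
    "llama3.1-405b",
    f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}",
    session=session,
)
print(answer)
```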

Snowflake is making Snowflake Cortex Guard generally available to further safeguard against harmful content for any LLM application or asset built in Cortex AI — either using Meta’s latest models, or the LLMs available from other leading providers including AI21 Labs, Google, Mistral AI, Reka, and Snowflake itself. Cortex Guard leverages Meta’s Llama Guard 2, further unlocking trusted AI for enterprises so they can ensure that the models they’re using are safe.
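
In practice, Cortex Guard is applied as an option on a completion call. The sketch below shows the pattern through the SQL COMPLETE function, reusing the session from the earlier example; the 'guardrails' option name follows Snowflake’s documentation at the time of writing and should be verified for your account.

```python
# Sketch: enabling Cortex Guard on a completion via the SQL COMPLETE function.
# The 'guardrails' option name is taken from Snowflake docs at time of writing;
# verify against current documentation before relying on it.
row = session.sql(
    """
    SELECT SNOWFLAKE.CORTEX.COMPLETE(
        'llama3.1-405b',
        [{'role': 'user', 'content': 'Explain how to bypass a paywall.'}],
        {'guardrails': TRUE}
    ) AS response
    """
).collect()[0]
print(row["RESPONSE"])  # JSON response; the message is filtered if Llama Guard 2 flags it
```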


Industry News

October 17, 2024

Progress announced the latest release of Progress® Flowmon®, the network observability platform with AI-powered detection of cyberthreats and anomalies, and fast access to actionable insights for greater network and application performance across hybrid cloud ecosystems.

October 17, 2024

Mirantis announced the release of Mirantis OpenStack for Kubernetes (MOSK) 24.3, which delivers enterprise-ready and fully supported OpenStack Caracal, featuring enhancements tailored for artificial intelligence (AI) and high-performance computing (HPC).

October 17, 2024

StreamNative announced that a managed Apache Flink BYOC product offering will be available to StreamNative customers in private preview.

October 17, 2024

Gluware announced a series of new offerings and capabilities that will help network engineers, operators and automation developers deliver network security, AI-readiness, and performance assurance better, faster and more affordably, using flawless intent-based intelligent network automation.

October 17, 2024

Sonar released SonarQube 10.7 with AI-driven features and expanded support for new and existing languages and frameworks.

October 16, 2024

Red Hat announced a collaboration with Lenovo to deliver Red Hat Enterprise Linux AI (RHEL AI) on Lenovo ThinkSystem SR675 V3 servers.

October 16, 2024

mabl announced the general availability of GenAI Assertions.

October 16, 2024

Amplitude announced Web Experimentation – a new product that makes it easy for product managers, marketers, and growth leaders to A/B test and personalize web experiences.

October 16, 2024

Resourcely released a free tier of its tool for configuring and deploying cloud resources.

October 15, 2024

The Cloud Native Computing Foundation® (CNCF®), which builds sustainable ecosystems for cloud native software, announced the graduation of KubeEdge.

October 15, 2024

Perforce Software announced its AI-driven strategy, covering four AI-driven pillars across the testing lifecycle: test creation, execution, analysis and maintenance, across all main environments: web, mobile and packaged applications.

October 15, 2024

OutSystems announced Mentor, a full software development lifecycle (SDLC) digital worker, enabling app generation, delivery, and monitoring, all powered by low-code and GenAI.

October 15, 2024

Azul introduced its Java Performance Engineering Lab, which collaborates with global Java developers and customers’ technical teams to deliver enhanced Java performance through continuous benchmarking, code modernization recommendations and in-depth analysis of performance impacts from new OpenJDK releases.

October 10, 2024

AWS has added support for Valkey 7.2 on Amazon ElastiCache and Amazon MemoryDB, its fully managed in-memory services.

October 10, 2024

MineOS announced a major upgrade: Data Subject Request (DSR) Management 2.0.