Cloudflare Adds New Capabilities to Workers AI
September 26, 2024

Cloudflare announced powerful new capabilities for Workers AI, its serverless AI platform, and its suite of AI application building blocks, to help developers build faster, more capable AI applications.

Applications built on Workers AI can now benefit from faster inference, bigger models, improved performance analytics, and more. Workers AI is the easiest platform on which to build global AI applications and run AI inference close to users, wherever in the world they are.

Cloudflare’s globally distributed network helps minimize network latency, setting it apart from networks built on resources concentrated in a limited number of data centers. Cloudflare’s serverless inference platform, Workers AI, now has GPUs in more than 180 cities around the world, built for global accessibility and low latency for end users everywhere. With this network of GPUs, Workers AI has one of the largest global footprints of any AI platform, and has been designed to run AI inference as close to the user as possible and help keep customer data closer to home.

“As AI took off last year, no one was thinking about network speeds as a reason for AI latency, because it was still a novel, experimental interaction. But as we get closer to AI becoming a part of our daily lives, the network, and milliseconds, will matter,” said Matthew Prince, co-founder and CEO, Cloudflare. “As AI workloads shift from training to inference, performance and regional availability are going to be critical to supporting the next phase of AI. Cloudflare is the most global AI platform on the market, and having GPUs in cities around the world is going to be what takes AI from a novel toy to a part of our everyday life, just like faster Internet did for smartphones.”

Cloudflare is also introducing new capabilities that make it the easiest platform on which to build AI applications:

- Upgraded performance and support for larger models: Cloudflare is enhancing its global network with more powerful GPUs for Workers AI, upgrading AI inference performance and enabling inference on significantly larger models such as Llama 3.1 70B, as well as the Llama 3.2 collection at 1B, 3B, and 11B parameters (with 90B coming soon). By supporting larger models, faster response times, and larger context windows, AI applications built on Cloudflare’s Workers AI can handle more complex tasks with greater efficiency, creating natural, seamless end-user experiences.

- Improved monitoring and optimization of AI usage with persistent logs: New persistent logs in AI Gateway, available in open beta, allow developers to store users’ prompts and model responses for extended periods to better analyze and understand how their application performs. With persistent logs, developers can gain more detailed insights from users’ experiences, including the cost and duration of requests, to help refine their applications. More than two billion requests have traveled through AI Gateway since its launch last year.

- Faster and more affordable queries: Vector databases make it easier for models to remember previous inputs, powering search, recommendation, and text-generation use cases. Cloudflare’s vector database, Vectorize, is now generally available, and as of August 2024 supports indexes of up to five million vectors each, up from 200,000 previously. Median query latency is down to 31 milliseconds (ms), from 549 ms. These improvements allow AI applications to find relevant information quickly with less data processing, which also means more affordable AI applications.
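From a developer's perspective, the larger models are reached through the Workers AI binding inside a Worker. The following is a minimal sketch of that calling pattern; the model ID and response shape follow Cloudflare's documented conventions, but treat the exact identifiers as assumptions to verify against the current docs, and note the stub binding exists only so the sketch runs outside the Workers runtime:

```typescript
// Sketch: calling one of the larger Llama models via the Workers AI binding.
// The model ID ("@cf/meta/llama-3.1-70b-instruct") and the { response } shape
// follow Cloudflare's published conventions but are assumptions here.
interface Ai {
  run(
    model: string,
    input: { messages: { role: string; content: string }[] }
  ): Promise<{ response?: string }>;
}
interface Env {
  AI: Ai;
}

export async function answer(env: Env, question: string): Promise<string> {
  const result = await env.AI.run("@cf/meta/llama-3.1-70b-instruct", {
    messages: [
      { role: "system", content: "You are a concise assistant." },
      { role: "user", content: question },
    ],
  });
  return result.response ?? "";
}

// Stub binding so the sketch can be exercised outside the Workers runtime.
export const stubEnv: Env = {
  AI: { run: async () => ({ response: "stubbed model output" }) },
};
```

In a deployed Worker, `env.AI` is injected by the platform via an `[ai]` binding in the project configuration rather than constructed by hand.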
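The persistent logs described above are captured by routing requests through an AI Gateway endpoint instead of calling a provider directly. A hedged sketch of that routing, where the account ID, gateway ID, and model name are placeholders and the URL pattern follows Cloudflare's documented gateway scheme:

```typescript
// Sketch: requests sent through an AI Gateway URL are recorded by persistent
// logs (prompt, response, cost, duration). The URL pattern mirrors
// Cloudflare's documented scheme; IDs below are placeholders.
export function gatewayUrl(
  accountId: string,
  gatewayId: string,
  provider: string,
  path: string
): string {
  return `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/${provider}/${path}`;
}

// Example: proxying an OpenAI-style chat completion through the gateway.
export async function chatViaGateway(
  apiKey: string,
  accountId: string,
  gatewayId: string,
  prompt: string
): Promise<unknown> {
  const res = await fetch(gatewayUrl(accountId, gatewayId, "openai", "chat/completions"), {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "gpt-4o-mini", // placeholder model name
      messages: [{ role: "user", content: prompt }],
    }),
  });
  return res.json();
}
```

Because only the base URL changes, existing provider SDK code can typically be pointed at the gateway without other modifications.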
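A typical use of Vectorize is retrieving the most similar stored vectors for an embedding, e.g. for retrieval-augmented generation. The sketch below mirrors the documented binding shape (`query` with a `topK` option returning scored `matches`), but the field names should be checked against the current Vectorize docs; the stub index exists only so the sketch runs outside the Workers runtime:

```typescript
// Sketch: querying a Vectorize index from a Worker. The interface mirrors
// the documented binding shape; treat field names as assumptions.
interface VectorizeMatch {
  id: string;
  score: number;
}
interface VectorizeIndex {
  query(vector: number[], opts: { topK: number }): Promise<{ matches: VectorizeMatch[] }>;
}

// Return the IDs of the k most similar stored vectors for an embedding.
export async function topMatchIds(
  index: VectorizeIndex,
  embedding: number[],
  k = 5
): Promise<string[]> {
  const { matches } = await index.query(embedding, { topK: k });
  // Matches come back ordered by similarity score, most relevant first.
  return matches.map((m) => m.id);
}

// Stub index so the sketch can be exercised outside the Workers runtime.
export const stubIndex: VectorizeIndex = {
  query: async (_vector, opts) => ({
    matches: [
      { id: "doc-1", score: 0.93 },
      { id: "doc-2", score: 0.87 },
    ].slice(0, opts.topK),
  }),
};
```

In production the embedding passed to `topMatchIds` would itself come from an embedding model, and the returned IDs would key into the original documents.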
