Integration Brings Cerebras Inference Capabilities to Hugging Face Hub

AI hardware company Cerebras has teamed up with Hugging Face, the open source platform and community for machine learning, to integrate its inference capabilities into the Hugging Face Hub. The collaboration gives more than 5 million developers access to models running on Cerebras' CS-3 system, the companies said in a statement, with reported inference speeds significantly higher than conventional GPU solutions.

Cerebras Inference, now available on Hugging Face, processes more than 2,000 tokens per second. Recent benchmarks indicate that models such as Llama 3.3 70B running on Cerebras' system can reach speeds exceeding 2,200 tokens per second, offering a performance increase over leading GPU-based solutions.

"By making Cerebras Inference available through Hugging Face, we're enabling developers to access alternative infrastructure for open source AI models," said Andrew Feldman, CEO of Cerebras, in a statement.

For Hugging Face's 5 million developers, the integration provides a streamlined way to leverage Cerebras' technology. Users can select "Cerebras" as their inference provider within the Hugging Face platform, instantly accessing one of the industry's fastest inference capabilities.

Demand for high-speed, high-accuracy AI inference is growing, especially for test-time compute and agentic AI applications. Open source models optimized for Cerebras' CS-3 architecture enable faster and more precise AI reasoning, the companies said, with speed gains ranging from 10 to 70 times compared to GPUs.

"Cerebras has been a leader in inference speed and performance, and we're thrilled to partner to bring this industry-leading inference on open source models to our developer community," commented Julien Chaumond, CTO of Hugging Face.

Developers can access Cerebras-powered AI inference by selecting supported models on Hugging Face, such as Llama 3.3 70B, and choosing Cerebras as their inference provider.
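For developers working in Python, that selection step can also be made programmatically. Below is a minimal sketch, assuming a recent version of the huggingface_hub library whose InferenceClient accepts a provider argument; the provider identifier "cerebras", the model ID meta-llama/Llama-3.3-70B-Instruct, and the placeholder access token are assumptions for illustration, not details taken from the announcement.

# Minimal sketch: routing a chat completion through Cerebras on the
# Hugging Face Hub. Assumes huggingface_hub's InferenceClient supports
# a `provider` argument (install with: pip install huggingface_hub).
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="cerebras",   # assumed provider identifier
    api_key="hf_xxx",      # placeholder Hugging Face access token
)

# Request a completion from a supported model (Llama 3.3 70B is named in the article).
response = client.chat_completion(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[{"role": "user", "content": "In one sentence, what is wafer-scale inference?"}],
    max_tokens=128,
)

print(response.choices[0].message.content)

If the provider is configured correctly, the script should print a short completion generated on Cerebras hardware; the same call pattern should apply to any other model that lists Cerebras as a supported inference provider.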

About the Author



John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI, and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS. He can be reached at [email protected].


