Integration Brings Cerebras Inference Capabilities to Hugging Face Hub — Campus Technology

You are currently viewing Integration Brings Cerebras Inference Capabilities to Hugging Face Hub — Campus Technology

Integration Brings Cerebras Inference Capabilities to Hugging Face Hub

AI {hardware} firm Cerebras has teamed up with Hugging Face, the open supply platform and group for machine studying, to combine its inference capabilities into the Hugging Face Hub. This collaboration offers greater than 5 million builders with entry to fashions operating on Cerebras’ CS-3 system, the businesses stated in an announcement, with reported inference speeds considerably larger than standard GPU options.

Cerebras Inference, now obtainable on Hugging Face, processes greater than 2,000 tokens per second. Latest benchmarks point out that fashions akin to Llama 3.3 70B operating on Cerebras’ system can attain speeds exceeding 2,200 tokens per second, providing a efficiency enhance in comparison with main GPU-based options.

“By making Cerebras Inference obtainable by Hugging Face, we’re enabling builders to entry different infrastructure for open supply AI fashions,” stated Andrew Feldman, CEO of Cerebras, in an announcement.

For Hugging Face’s 5 million builders, this integration offers a streamlined technique to leverage Cerebras’ expertise. Customers can choose “Cerebras” as their inference supplier throughout the Hugging Face platform, immediately accessing one of many {industry}’s quickest inference capabilities.

The demand for high-speed, high-accuracy AI inference is rising, particularly for test-time compute and agentic AI purposes. Open supply fashions optimized for Cerebras’ CS-3 structure allow sooner and extra exact AI reasoning, the businesses stated, with velocity positive aspects starting from 10 to 70 instances in comparison with GPUs.

“Cerebras has been a frontrunner in inference velocity and efficiency, and we’re thrilled to companion to carry this industry-leading inference on open supply fashions to our developer group,” commented Julien Chaumond, CTO of Hugging Face.

Builders can entry Cerebras-powered AI inference by deciding on supported fashions on Hugging Face, akin to Llama 3.3 70B, and selecting Cerebras as their inference supplier.

Concerning the Creator



John K. Waters is the editor in chief of numerous Converge360.com websites, with a deal with high-end improvement, AI and future tech. He is been writing about cutting-edge applied sciences and tradition of Silicon Valley for greater than two many years, and he is written greater than a dozen books. He additionally co-scripted the documentary movie Silicon Valley: A 100 Yr Renaissance, which aired on PBS.  He might be reached at [email protected].



Source link

Leave a Reply