AWS, Microsoft, Google, Others Make DeepSeek-R1 AI Model Available on Their Platforms
Leading cloud service providers are now making the open source DeepSeek-R1 reasoning model available on their platforms. The Chinese startup generated intense interest for its ability to leverage more efficient processing and reduce compute resource consumption, which is a key driver of high AI costs.
Amazon Web Services (AWS), Microsoft, and Google Cloud have all made the model available to their customers, but as of this writing they had yet to implement the per-token pricing structure used for other AI models such as Meta's Llama 3.
Instead, DeepSeek-R1 users on these cloud platforms pay only for the computing resources they consume, rather than for the amount of text the model generates. AWS and Google have reported that this approach aligns with existing pricing models for open-source AI.
DeepSeek released its latest DeepSeek-V3 model in December 2024. It was followed by the release of DeepSeek-R1, DeepSeek-R1-Zero, and DeepSeek-R1-Distill on Jan. 20, 2025. The DeepSeek-R1-Zero model reportedly features 671 billion parameters, and the DeepSeek-R1-Distill lineup offers models ranging from 1.5 billion to 70 billion parameters. On Jan. 27, 2025, the company expanded its portfolio with Janus-Pro-7B, a vision-based AI model.
DeepSeek-R1 is positioned as a cost-efficient alternative to proprietary AI models, particularly for organizations with large-scale AI deployments. The model was designed to process information more efficiently, reducing the overall compute burden.
However, cloud providers may ultimately profit more from infrastructure rentals than direct model usage fees, industry watchers have observed. And renting cloud servers for AI workloads often costs more than accessing models via APIs. AWS, for example, charges up to $124 per hour for an AI-optimized cloud server, which translates to nearly $90,000 per month for continuous usage. Microsoft Azure customers don't need to rent dedicated servers for DeepSeek, but they still pay for underlying computing power, leading to variable pricing depending on how efficiently they run the model.
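The monthly figure follows from simple arithmetic. As a rough sketch, assuming a 30-day month of uninterrupted use (the function name and the 30-day month are illustrative assumptions, not taken from AWS pricing documentation):

```python
HOURS_PER_DAY = 24


def monthly_server_cost(hourly_rate_usd: float, days: int = 30) -> float:
    """Cost of running one server around the clock for `days` days.

    The 30-day default is an assumption made for illustration.
    """
    return hourly_rate_usd * HOURS_PER_DAY * days


# $124 per hour is the upper-end AWS rate cited in the article.
cost = monthly_server_cost(124.0)
print(f"${cost:,.0f} per month")  # prints "$89,280 per month"
```

The result, $89,280, is consistent with the "nearly $90,000" cited above. Actual bills depend on instance type, region, and any committed-use discounts.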
In contrast, organizations using Meta's Llama 3.1 through AWS pay $3 per 1 million tokens, a significantly lower upfront cost for those with intermittent AI needs. Tokens represent processed text, with 1,000 tokens equivalent to approximately 750 words, according to AI infrastructure provider Anyscale.
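To get a feel for where the two pricing structures cross over, the sketch below estimates how many tokens a month an organization would have to process before a continuously rented $124-per-hour server costs less than $3 per million tokens. The function names and the 30-day month are assumptions for illustration. The comparison also mixes two different models (a server rented to run DeepSeek-R1 versus per-token access to Llama 3.1), so treat the result as an order-of-magnitude guide only.

```python
def per_token_cost(tokens: float, price_per_million_usd: float = 3.0) -> float:
    """Bill for `tokens` tokens at a flat per-million-token price."""
    return tokens / 1_000_000 * price_per_million_usd


def tokens_to_words(tokens: float, words_per_1k_tokens: float = 750.0) -> float:
    """Approximate word count, using the 1,000 tokens ~ 750 words rule of thumb."""
    return tokens / 1000.0 * words_per_1k_tokens


def break_even_tokens(monthly_server_usd: float,
                      price_per_million_usd: float = 3.0) -> float:
    """Monthly token volume at which per-token billing equals the server bill."""
    return monthly_server_usd / price_per_million_usd * 1_000_000


# Assumed: $124/hour, 24 hours a day, 30 days (illustrative month length).
server_bill = 124.0 * 24 * 30
threshold = break_even_tokens(server_bill)

print(f"Break-even: {threshold / 1e9:.2f} billion tokens per month")
print(f"Roughly {tokens_to_words(threshold) / 1e9:.2f} billion words")
```

Under these assumptions the crossover sits near 30 billion tokens a month, which suggests why per-token pricing is attractive for intermittent workloads.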
Smaller cloud providers, including Together AI and Fireworks AI, have already implemented fixed per-token pricing for DeepSeek-R1, a structure that could become more common as demand for cost-effective AI models grows.
For organizations seeking the lowest cost, DeepSeek-R1 is available via its parent company's API at $2.19 per million tokens, three to four times cheaper than some Western cloud providers. However, routing AI workloads through Chinese servers raises data privacy and security concerns. Sensitive business information could be subject to Chinese government regulations, including potential data sharing under local laws. And many organizations are wary about sending proprietary or customer data to servers outside their jurisdiction, especially in regions with less stringent privacy protections.
AWS, Microsoft, and Google have not disclosed how many customers are actively using DeepSeek-R1.
About the Author
John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS. He can be reached at [email protected].