Here’s a number that’s hard to even conceptualize: 5 quintillion. That’s 5 followed by 18 zeros. And according to DigitalOcean CEO Paddy Srinivasan, that’s roughly how many AI inference tokens the world will chew through annually by 2030.
The driving force behind this projection isn’t humans chatting with chatbots. It’s AI agents talking to other AI agents, autonomously generating workloads that dwarf anything we’ve seen from the current generation of AI products. Srinivasan estimates that autonomous agents will account for approximately 70% of all tokens generated per year by the end of the decade.
The scale of what’s coming
To put Srinivasan’s prediction in context, consider the current baseline. Around 50 trillion inference tokens are processed daily right now. DigitalOcean’s projections suggest that number will balloon to over 500 trillion daily by 2030. That’s a tenfold increase in daily throughput, and DigitalOcean ties it to the 4 to 5 quintillion annual figure.
Goldman Sachs appears to be singing from a similar hymn sheet. The investment bank’s May 2026 report forecasts a 24-fold increase in monthly token consumption, projecting it will reach 120 quadrillion tokens per month by 2030.
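The figures above mix daily, monthly, and annual units, so a quick conversion helps when comparing them. The sketch below is a back-of-the-envelope check only, assuming a 365-day year and 12 equal months (neither assumption comes from DigitalOcean or Goldman Sachs). Under those assumptions, 500 trillion tokens a day comes to well under a quintillion a year, so the headline annual figure presumably rests on definitions or growth assumptions that the quoted daily number does not capture.

```python
# Back-of-the-envelope unit check on the token figures quoted in the article.
# Assumptions (not from the source): a 365-day year and 12 equal months.

TRILLION = 10**12
QUADRILLION = 10**15
QUINTILLION = 10**18

DAYS_PER_YEAR = 365
MONTHS_PER_YEAR = 12


def annual_from_daily(tokens_per_day: int) -> int:
    """Scale a daily token rate to a yearly total."""
    return tokens_per_day * DAYS_PER_YEAR


def annual_from_monthly(tokens_per_month: int) -> int:
    """Scale a monthly token rate to a yearly total."""
    return tokens_per_month * MONTHS_PER_YEAR


# Figures as quoted in the article.
daily_now = 50 * TRILLION
daily_2030 = 500 * TRILLION
goldman_monthly_2030 = 120 * QUADRILLION
goldman_growth_multiple = 24

# Derived values.
growth_multiple = daily_2030 // daily_now
annual_2030_from_daily = annual_from_daily(daily_2030)
goldman_annual_2030 = annual_from_monthly(goldman_monthly_2030)
goldman_implied_baseline = goldman_monthly_2030 // goldman_growth_multiple

print(f"Daily growth multiple: {growth_multiple}x")
print(f"500T/day annualized: {annual_2030_from_daily / QUADRILLION:.1f} quadrillion")
print(f"Goldman 2030 annualized: {goldman_annual_2030 / QUINTILLION:.2f} quintillion")
print(f"Goldman implied monthly baseline: {goldman_implied_baseline / QUADRILLION:.0f} quadrillion")
```

The Goldman Sachs numbers are internally consistent on this reading: a 24-fold rise to 120 quadrillion a month implies a starting point of about 5 quadrillion a month.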
According to DigitalOcean, agentic workloads use 4 times more CPU and 15 times more tokens than traditional AI interactions.
DigitalOcean is betting big on the shift
Srinivasan isn’t just making predictions from the sidelines. DigitalOcean launched its AI-Native Cloud platform in April 2026, purpose-built to handle the specific demands of agent-driven workloads. The platform targets the infrastructure layer that autonomous agents need: more compute, more memory, more everything.
In Q1 2026, DigitalOcean’s revenue run-rate from AI customers hit $170 million, representing a 221% year-over-year increase.
DigitalOcean has historically carved out its niche by serving small and mid-sized developers, the companies too big for a shared hosting plan but too small for a bespoke AWS architecture. The AI-Native Cloud play represents a strategic pivot toward high-margin inference services.
As AI agents become more sophisticated, they generate compound workloads. One agent might call another agent, which queries a database, which triggers a third agent to synthesize results. Each step in that chain consumes tokens and compute.
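The compounding effect can be sketched with a toy model of that chain. Everything here is hypothetical: the `Step` class, the step names, and the per-step token counts are made-up illustrations, not DigitalOcean measurements. The point is only that the cost of one top-level request is the sum over every step it triggers.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One hop in an agent chain (hypothetical model for illustration)."""
    name: str
    tokens: int  # tokens consumed by this step alone (made-up values)
    calls: list["Step"] = field(default_factory=list)


def total_tokens(step: Step) -> int:
    """Tokens consumed by a step plus everything it triggers downstream."""
    return step.tokens + sum(total_tokens(child) for child in step.calls)


# The chain described above: agent -> agent -> database -> synthesizing agent.
synthesizer = Step("synthesis agent", tokens=1200)
database = Step("database query", tokens=300, calls=[synthesizer])
second_agent = Step("second agent", tokens=800, calls=[database])
first_agent = Step("first agent", tokens=500, calls=[second_agent])

print(total_tokens(first_agent))  # 500 + 800 + 300 + 1200 = 2800
```

In this made-up example, a request that costs 500 tokens at the first agent ends up consuming 2,800 once the downstream steps are counted.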
What this means for investors
Human-driven AI interactions are bursty. You ask a question, get an answer, maybe follow up a few times, then close the tab. Agent-driven workloads are persistent, running continuously, scaling horizontally, and consuming resources around the clock.
For investors evaluating the AI infrastructure space, the key metric to track isn’t just total token consumption. It’s the share of tokens generated by agents rather than by humans. As that share climbs toward the 70% level Srinivasan describes, the companies with infrastructure specifically optimized for persistent, compute-heavy agent workloads will likely capture disproportionate value.
Disclosure: This article was edited by Editorial Team. For more information on how we create and review content, see our Editorial Policy.
