
Jun 7, 2026

As agent use grows, Cisco targets the token budget problem

Cisco is building observability and control tools across every layer of the AI stack to help enterprises manage token consumption.


At Cisco Live 2026, company executives made token cost management a central theme, arguing that enterprises are spending on AI without sufficient visibility into what they are consuming or what they are getting for it.

Cisco Chief Product Officer Jeetu Patel described a layered assessment model that spans GPU utilization, model performance, application behavior, and agent activity, with token spend as the connective thread across all of them.

The core commercial concern is that organizations are reaching the end of budget cycles having spent more than they expected, without corresponding gains in output or value.

The token billing model that now dominates AI services is structurally different from the fixed-cost software licensing most enterprises are accustomed to. Costs vary with prompt length, task complexity, and reasoning depth. Chain-of-thought reasoning, which enables more accurate model responses, can consume up to a hundred times more tokens per inference than a standard, non-reasoning request.

As AI becomes embedded in more workflows and as more capable models get deployed, consumption tends to rise even as per-token prices fall. This is a version of Jevons' Paradox: cheaper tokens do not necessarily mean lower total spending, because cheaper access encourages more expansive use. The result is that budgeting for AI looks less like procuring software and more like managing a utility with unpredictable demand.
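The arithmetic behind both points can be made concrete with a minimal sketch. Every number here is hypothetical: the prices are not any vendor's real rates, the request volumes are invented, and the hundredfold increase in output tokens is simply the upper bound mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical per-million-token prices in USD (illustrative only)."""
    input_per_million: float
    output_per_million: float


def request_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request: tokens times price, billed per million tokens."""
    return (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    ) / 1_000_000


def monthly_spend(pricing: Pricing, requests: int,
                  input_tokens: int, output_tokens: int) -> float:
    """Total spend for a month of identical requests."""
    return requests * request_cost(pricing, input_tokens, output_tokens)


old_pricing = Pricing(input_per_million=2.0, output_per_million=8.0)
new_pricing = Pricing(input_per_million=1.0, output_per_million=4.0)  # prices halve

# Same prompt, but the reasoning request emits 100x the output tokens (assumed).
standard = request_cost(old_pricing, input_tokens=1_000, output_tokens=500)
reasoning = request_cost(old_pricing, input_tokens=1_000, output_tokens=50_000)

# Jevons-style effect: prices halve, but usage triples (assumed volumes).
before = monthly_spend(old_pricing, requests=100_000,
                       input_tokens=1_000, output_tokens=500)
after = monthly_spend(new_pricing, requests=300_000,
                      input_tokens=1_000, output_tokens=500)

print(f"standard request:  ${standard:.4f}")
print(f"reasoning request: ${reasoning:.4f}")
print(f"monthly spend before: ${before:,.2f}, after: ${after:,.2f}")
```

Under these assumptions the reasoning request costs far more than the standard one even though the prompt is identical, and total monthly spend rises by half after the price cut because volume grew faster than the price fell.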

Cisco's response is to position itself as the observability and control layer for this problem. The stack assessment Patel described is not a product announcement in itself but a framing for how Cisco intends to organize a set of tools across infrastructure, model, application, and agent layers.

The most operationally specific element he described is agent behavior monitoring, which Cisco is building in part through its recent acquisition of Galileo, a company that developed models capable of detecting when an agent is operating outside its expected parameters.

Patel described the ability to intercept and terminate an agent that is consuming tokens beyond defined limits. The Galileo acquisition is a build-versus-buy decision in favor of buying specialized capability, rather than developing behavioral evaluation models internally. That choice reflects both the speed at which the agentic AI market is moving and the difficulty of building reliable behavior assessment from scratch.
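As a rough illustration of what terminating an agent at a token limit could look like, here is a minimal sketch. It is not Cisco's or Galileo's implementation or API. The class names, the per-step token counts, and the policy of a hard stop at the limit are all assumptions made for the example.

```python
class BudgetExceeded(Exception):
    """Raised when an agent's cumulative token use passes its limit."""


class TokenBudgetGuard:
    """Hypothetical guard that tracks one agent's cumulative token use."""

    def __init__(self, limit_tokens: int) -> None:
        self.limit_tokens = limit_tokens
        self.used_tokens = 0
        self.terminated = False

    def record(self, tokens: int) -> None:
        """Record tokens consumed by one step; trip the guard past the limit."""
        if self.terminated:
            raise BudgetExceeded("agent already terminated")
        self.used_tokens += tokens
        if self.used_tokens > self.limit_tokens:
            self.terminated = True
            raise BudgetExceeded(
                f"used {self.used_tokens} tokens, limit is {self.limit_tokens}"
            )


def run_agent(step_token_counts: list[int], guard: TokenBudgetGuard) -> int:
    """Run steps in order and stop at the first one that breaches the budget.

    Each integer stands in for the tokens a real agent step would consume.
    Returns the number of steps that completed within budget.
    """
    completed = 0
    for tokens in step_token_counts:
        try:
            guard.record(tokens)
        except BudgetExceeded:
            break  # intercept: no further steps are executed
        completed += 1
    return completed


guard = TokenBudgetGuard(limit_tokens=1_000)
completed_steps = run_agent([400, 400, 400, 400], guard)
print(completed_steps, guard.used_tokens, guard.terminated)
```

In this sketch the breach is detected only after the offending step has already consumed its tokens, so actual use can overshoot the limit by one step. A production control would more likely estimate cost before dispatching a step, and behavioral detection of the kind attributed to Galileo goes well beyond a simple counter.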

Patel's broader argument is that enterprises are still in an early phase where they are learning to use AI effectively, and that token spending during this phase does not yet produce proportionate value. His framing is that value accretes meaningfully only after organizations get past the familiarization stage, and that the risk to the market is a pullback if costs and outputs stay out of alignment for too long.

For Cisco, the commercial opportunity is to reduce that risk by giving enterprises the instrumentation to manage spending before it becomes a boardroom problem.





© 2026 Uplink.