
8 billion tokens a day forced AT&T to rethink AI orchestration — and cut costs by 90%


When your token usage averages 8 billion a day, you have a massive scale problem. This was the case at AT&T, and chief data officer Andy Markus and his team recognized that it simply wasn’t feasible (or economical) to push everything through large reasoning models.

So, when building out an internal Ask AT&T personal assistant, they reconstructed the orchestration layer. The result: a multi-agent stack built on LangChain where large language model “super agents” direct smaller, underlying “worker” agents performing more concise, purpose-driven work.

This flexible orchestration layer has dramatically improved latency, speed and response times, Markus told VentureBeat. Most notably, his team has seen up to 90% cost savings.

“I believe the future of agentic AI is many, many, many small language models (SLMs),” he said. “We find small language models to be just about as accurate, if not as accurate, as a large language model ...
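To make the "super agent directs worker agents" idea concrete, here is a minimal sketch in plain Python. It is not AT&T's implementation and does not use LangChain. Every name in it (`Worker`, `SuperAgent`, the keyword lists, the handler functions) is hypothetical, and the keyword match stands in for the routing decision that a large language model would make in a real system.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Worker:
    """A small, single-purpose agent (hypothetical stand-in for an SLM)."""
    name: str
    keywords: List[str]
    handle: Callable[[str], str]


class SuperAgent:
    """Routes each request to a cheap worker, else to a larger fallback model."""

    def __init__(self, workers: List[Worker], fallback: Callable[[str], str]):
        self.workers = workers
        self.fallback = fallback

    def route(self, request: str) -> Optional[Worker]:
        # Assumption: keyword matching replaces the LLM's routing decision.
        text = request.lower()
        for worker in self.workers:
            if any(keyword in text for keyword in worker.keywords):
                return worker
        return None

    def run(self, request: str) -> str:
        worker = self.route(request)
        if worker is None:
            # No specialised worker matched, so use the expensive model.
            return self.fallback(request)
        return worker.handle(request)


# Hypothetical workers; real ones would call small language models.
billing = Worker("billing", ["bill", "invoice"], lambda r: "[billing-slm] " + r)
network = Worker("network", ["outage", "signal"], lambda r: "[network-slm] " + r)

agent = SuperAgent(
    workers=[billing, network],
    fallback=lambda r: "[large-model] " + r,
)

if __name__ == "__main__":
    for question in ["Why is my bill higher?", "Is there an outage?", "Write a poem"]:
        print(agent.run(question))
```

The design point of the sketch is that the expensive model is only invoked when no narrow worker applies. That is one plausible way an orchestration layer could reduce cost, though the excerpt does not say how AT&T's routing actually works.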


Copyright of this story belongs solely to VentureBeat.