Using AI to click around on a website burns 45x as many tokens as just using APIs
theregister.co.uk
Businesses deploying AI agents to automate computer usage may be spending far more money than necessary if those agents try to emulate human visual interaction.
Reflex, an enterprise application platform, recently set out to compare vision agents with API agents.
A vision agent in this context refers to an AI agent that mimics human interaction by relying on image processing and optical character recognition to operate an application. In this instance, that's Claude Sonnet navigating a web app user interface via browser-use 0.12, a tool for automated web browser operation.
An API agent here refers to Claude Sonnet interacting with a web app via tools and APIs. The agent calls the same handling mechanisms that the UI calls and receives structured data in response, rather than a web page screenshot that must be analyzed.
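The API-agent pattern described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `ORDERS` data, the `list_orders` handler, the `TOOLS` registry, and the `call_tool` dispatcher are hypothetical names, not Reflex's code or the benchmark's actual setup. It only shows the general idea that a tool call reaches the same handler a UI event would, and that the agent gets back structured text instead of a screenshot to interpret.

```python
import json

# Hypothetical in-memory data standing in for the web app's state.
ORDERS = [
    {"id": 1, "customer": "Ada", "total": 120.0},
    {"id": 2, "customer": "Grace", "total": 75.5},
]


def list_orders(min_total: float = 0.0) -> list[dict]:
    """Hypothetical handler: the same function a UI control would trigger."""
    return [order for order in ORDERS if order["total"] >= min_total]


# Illustrative registry of tools the API agent may call by name.
TOOLS = {"list_orders": list_orders}


def call_tool(name: str, arguments: dict) -> str:
    """Dispatch a model-issued tool call and return the result as JSON text."""
    result = TOOLS[name](**arguments)
    return json.dumps(result)


# The agent receives compact structured data rather than page pixels.
print(call_tool("list_orders", {"min_total": 100}))
```

In a vision-agent setup, the equivalent step would return an image of the rendered page, which the model must then read before it can decide what to do next.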
"Two agents target the same running app: one drives the UI via screenshots and ...
Copyright of this story solely belongs to theregister.co.uk.

