OpenAI’s New GPT‑5.4 Surpasses Human Benchmark in Desktop Navigation and Reasoning Tests

2 days, 22 hours ago extremetech.com

OpenAI has introduced its latest frontier model, GPT‑5.4, to ChatGPT, Codex, and the API. As with every new generation, it is said to improve on its predecessor in reasoning, agentic tasks, and long‑context professional work.

The model is available in two versions: the standard GPT‑5.4 and the higher‑end GPT‑5.4 Pro. OpenAI says GPT‑5.4 can handle complex knowledge tasks across more than forty professions, from finance and law to engineering and data work, and claims it replaces the need for a separate code‑specialized model in most cases.

GPT‑5.4 can operate desktops and web browsers using screenshots, a mouse, and a keyboard. On the OSWorld‑Verified benchmark for GUI navigation, GPT‑5.4 reaches 75% task success, well above GPT‑5.2's 47.3% and ahead of the 72.4% human baseline on the same test.

The model also scored higher on web ...
