
DeepSeek tests “sparse attention” to slash AI processing costs


Ever wonder why ChatGPT slows down during long conversations? The culprit is a fundamental mathematical challenge: in standard attention, the cost of processing a text sequence grows roughly with the square of its length, so long conversations demand massive computational resources even with the efficiency tricks that companies have already deployed. While US tech giants can afford to throw more hardware at the problem, Chinese AI company DeepSeek, which is cut off from a steady supply of some advanced AI chips by export restrictions, has extra motivation to squeeze more performance from less silicon.
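To see that scaling concretely, here is a minimal sketch in Python, assuming a single attention head and toy NumPy arrays; it is an illustration of standard dot-product attention in general, not DeepSeek's or any other company's production code. The score matrix has one entry per pair of tokens, so doubling the conversation length quadruples the work.

```python
# Minimal sketch of standard (dense) scaled dot-product attention.
# Illustrative only: a single head, toy NumPy code, no relation to any
# production implementation.
import numpy as np

def dense_attention(Q, K, V):
    """Each of the n query tokens is compared against all n key tokens."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # (n, n) score matrix: quadratic in length
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ V

rng = np.random.default_rng(0)
n, d = 8, 4                                         # hypothetical tiny sequence
Q = K = V = rng.normal(size=(n, d))
out = dense_attention(Q, K, V)                      # works, but scores has n*n entries

for length in (1_000, 2_000, 4_000):                # hypothetical context lengths
    print(f"{length:>5} tokens -> {length * length:>12,} pairwise scores per head")
```

Doubling the context from 2,000 to 4,000 tokens goes from 4 million to 16 million pairwise scores per head, which is the slowdown long chats run into.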

On Monday, DeepSeek released an experimental version of its latest simulated reasoning language model, DeepSeek-V3.2-Exp, which introduces what it calls "DeepSeek Sparse Attention" (DSA). It's the company's implementation of a computational technique likely already used in some of the world's most prominent AI models. OpenAI pioneered sparse transformers in 2019 and used the technique to build GPT-3, while Google Research published work on "Reformer" models using similar concepts ...
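DeepSeek has not published a drop-in reference implementation in the excerpt above, so the following is only a generic illustration of the sparse-attention idea: each query token keeps the k highest-scoring keys instead of attending to all n of them. The function name topk_sparse_attention and the parameter k are hypothetical, not DSA's actual API, and this toy version still forms the dense score matrix for clarity, which real systems avoid so they actually save compute.

```python
# Generic top-k sparse attention sketch (an illustration of the broad idea,
# NOT DeepSeek's DSA). Each query keeps only its k best-matching keys,
# cutting the per-query attention work from n tokens down to k.
import numpy as np

def topk_sparse_attention(Q, K, V, k=4):
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)          # dense scores shown for clarity only;
                                           # real sparse kernels never materialize this
    # Keep each row's k largest scores; mask the rest to -inf before softmax.
    kth = np.partition(scores, -k, axis=-1)[:, -k:].min(axis=-1, keepdims=True)
    scores = np.where(scores >= kth, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
n, d = 16, 8
Q = K = V = rng.normal(size=(n, d))
out = topk_sparse_attention(Q, K, V, k=4)  # each token attends to only 4 others
```

The design trade-off is the same one the article describes: by skipping most token pairs, the model does far less work per step on long inputs, at the cost of betting that the dropped pairs did not matter much.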

