DeepSeek’s conditional memory fixes silent LLM waste: GPU cycles lost to static lookups
When an enterprise LLM retrieves a product name, technical specification, or standard contract clause, it uses expensive GPU computation designed for complex reasoning just to access static information. This happens millions of times per day, and each lookup wastes cycles and inflates infrastructure costs.
DeepSeek's newly released research on "conditional memory" addresses this architectural limitation directly. The work introduces Engram, a module that separates static pattern retrieval from dynamic reasoning. It delivers results that challenge assumptions about what memory is actually for in neural networks. The paper was co-authored by DeepSeek founder Liang Wenfeng.
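The article doesn't give implementation details, but the described separation maps to a familiar pattern: route each token either to a cheap static lookup table or to the full, expensive compute path. Below is a minimal PyTorch sketch under that assumption; the class name `ConditionalMemoryLayer`, the soft gate, and all dimensions are illustrative, not DeepSeek's actual Engram design.

```python
# Minimal sketch of a "conditional memory" layer, assuming a learned gate
# that blends a cheap static lookup with a full reasoning path. Structure
# and names are hypothetical; the article does not specify Engram's design.
import torch
import torch.nn as nn

class ConditionalMemoryLayer(nn.Module):
    def __init__(self, d_model: int, memory_slots: int):
        super().__init__()
        # Static path: a fixed lookup table, cheap to query at inference.
        self.memory = nn.Embedding(memory_slots, d_model)
        self.key_proj = nn.Linear(d_model, memory_slots)
        # Dynamic path: an MLP standing in for full reasoning compute.
        self.reasoning = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )
        # Gate deciding, per token, how much to rely on static memory.
        self.gate = nn.Linear(d_model, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Static lookup: soft-select memory slots for each token.
        slot_weights = torch.softmax(self.key_proj(x), dim=-1)
        static_out = slot_weights @ self.memory.weight
        # Dynamic reasoning over the same input.
        dynamic_out = self.reasoning(x)
        # Blend the two paths; a hard top-k gate could skip compute entirely.
        g = torch.sigmoid(self.gate(x))
        return g * static_out + (1 - g) * dynamic_out

# Usage: a batch of 2 sequences, 16 tokens, 64-dim embeddings.
layer = ConditionalMemoryLayer(d_model=64, memory_slots=256)
out = layer(torch.randn(2, 16, 64))
```

The soft gate here keeps the sketch differentiable end to end; the efficiency argument in the article presumably depends on a harder routing decision that avoids running the expensive path at all for static lookups.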
Through systematic experiments, DeepSeek found the optimal balance between computation and memory: 75% of sparse model capacity allocated to dynamic reasoning and 25% to static lookups. Notably, the memory system improved reasoning more than knowledge retrieval.
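As a back-of-the-envelope illustration of that split (only the 75/25 ratio comes from the article; the parameter budget and names below are hypothetical):

```python
# Illustrative division of a sparse model's parameter budget between the
# dynamic reasoning path and the static memory path, per the reported ratio.
def split_capacity(total_params: int, reasoning_share: float = 0.75):
    reasoning = int(total_params * reasoning_share)
    memory = total_params - reasoning
    return reasoning, memory

reasoning, memory = split_capacity(10_000_000_000)  # example 10B budget
print(f"reasoning params: {reasoning:,}, memory params: {memory:,}")
```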
Complex reasoning benchmarks jumped from 70% to 74% accuracy, while knowledge-focused tests improved from 57% to 61%. These improvements ...