Memory plays a pivotal role in human cognition, granting individuals the ability to sift through the incessant buzz of information that saturates daily life. Unlike the richly nuanced memory systems of human beings, however, large language models (LLMs) lack any comparable capability. These models indiscriminately store and process all previous inputs, which can significantly hamper their performance and escalate operational costs, especially during long-duration tasks.

In a manner reminiscent of the human brain's selective memory retention, where vital information is conserved while trivial details fade into obscurity, artificial intelligence systems also require mechanisms for smart memory management. Without such mechanisms, demands for computational resources and memory grow without bound as models become increasingly expansive.

Researchers have long sought methods to imbue AI systems with memory capabilities that closely mirror those of humans.

Traditional approaches relied heavily on preset rules to manage memory within the models, selectively retaining or discarding data based on temporal order or attention scores. Such methods have proven too mechanical, often failing to discriminate the significance of information as effectively as human memory does. While aiming for efficiency, they sometimes inadvertently compromise the overall performance of these models.

Against this backdrop, a research team at the Japanese startup Sakana AI has proposed an innovative solution: Neural Attention Memory Models (NAMMs). The approach draws inspiration from the pivotal role of evolutionary processes in shaping human memory systems. By harnessing evolutionary algorithms to train a dedicated neural network, NAMMs can actively choose and retain significant information, thereby enhancing both efficiency and model performance.

NAMMs operate on principles akin to those guiding human memory: they assess the importance of information based on its long-term utility.

The underlying mechanism comprises three core components: the feature extraction system, the memory management network, and the evolutionary optimization strategy.
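To make the overall pipeline concrete, the sketch below scores each cached token from its attention history and keeps only tokens above a threshold. The scoring function, threshold, and array shapes here are placeholders standing in for NAMMs' learned memory network, not the actual implementation.

```python
import numpy as np

def prune_kv_cache(attention_matrix, score_fn, threshold=0.0):
    """Toy illustration: score each cached token from its attention
    history and keep only tokens whose score exceeds a threshold.
    `score_fn` is a stand-in for the learned memory network."""
    # attention_matrix: (num_queries, num_cached_tokens) of attention weights
    scores = np.array([score_fn(attention_matrix[:, t])
                       for t in range(attention_matrix.shape[1])])
    return scores > threshold  # boolean retention mask, one entry per token

# Example with a stand-in scorer: mean attention received, minus a bias.
rng = np.random.default_rng(0)
attn = rng.random((8, 16))
mask = prune_kv_cache(attn, score_fn=lambda col: col.mean() - 0.5)
print(mask.shape)  # one keep/drop decision per cached token
```

Tokens whose mask entry is False would simply be evicted from the KV cache, which is what shrinks memory use at inference time.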

First comes the feature extraction mechanism. NAMMs apply a short-time Fourier transform (STFT) to the column vectors of the attention matrix. Specifically, the method uses a Hann window of size 32 to produce spectrogram representations with 17 complex-valued frequency bins. This representation not only preserves the frequency characteristics of attention values over time but also significantly compresses the data. Experimental evidence suggests that this spectral representation is notably more effective than simply using raw attention values or manually crafted features.
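A minimal sketch of this feature extraction, assuming a hop size and framing scheme the article does not specify: a real FFT over a Hann window of 32 samples yields exactly 32 // 2 + 1 = 17 complex frequency bins per frame, matching the figure quoted above.

```python
import numpy as np

def stft_features(attn_column, win_size=32, hop=16):
    """Spectrogram of one attention-matrix column. The window size of
    32 matches the paper's description; the hop length is an
    illustrative assumption."""
    window = np.hanning(win_size)
    frames = []
    for start in range(0, len(attn_column) - win_size + 1, hop):
        segment = attn_column[start:start + win_size] * window
        frames.append(np.fft.rfft(segment))  # 17 complex values per frame
    return np.stack(frames)                  # shape: (num_frames, 17)

spec = stft_features(np.random.default_rng(1).random(128))
print(spec.shape[1])  # 17 frequency bins
```

Each token's attention history is thereby compressed into a small, fixed-width spectral summary that the memory network can score.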

Next comes the design of the Backward Attention Memory (BAM) architecture.

This innovation forms a critical cornerstone of NAMMs, introducing a distinctive attention mechanism that permits tokens to attend only to relevant "future" content in the KV cache. This design creates a rivalry between tokens, allowing the model to learn which tokens are the most informative to retain. For instance, when confronted with repeated sentences or words, the model favors the most recently encountered instance, as it carries the fuller context.
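The "backward" masking idea can be sketched as the mirror image of a causal mask: each cached token attends only to tokens that arrive at or after it. The details below (mask construction, plain softmax) are illustrative assumptions, not the paper's exact BAM architecture.

```python
import numpy as np

def backward_attention(scores):
    """Mask attention scores so token i attends only to positions
    j >= i (the reverse of a causal mask), then softmax."""
    n = scores.shape[0]
    mask = np.triu(np.ones((n, n), dtype=bool))    # allow j >= i only
    masked = np.where(mask, scores, -np.inf)
    # standard softmax over the allowed positions
    e = np.exp(masked - masked.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

w = backward_attention(np.random.default_rng(2).random((4, 4)))
print(w[-1])  # the newest token can only attend to itself
```

Under this masking, older duplicates of a repeated token see their newer copies and can be outcompeted by them, which matches the retention behavior described above.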

For the optimization strategy, the research team adopted the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). Traditional gradient-descent methods struggle with the discrete decisions inherent in memory management. In contrast, CMA-ES emulates natural evolutionary processes to directly optimize non-differentiable objective functions. The team further employed an incremental evolution scheme, beginning with a single task and gradually expanding the number of training tasks, which acts as a regularizer and improves the model's generalization.
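To show why a gradient-free method fits here, the sketch below runs a simplified (mu, lambda) evolution strategy on a non-differentiable step-function objective. Full CMA-ES additionally adapts a covariance matrix over the search distribution; this stripped-down version is a stand-in to illustrate the principle, not the algorithm the team used.

```python
import numpy as np

def simple_es(objective, dim, pop=32, elite=8, sigma=0.5, iters=60, seed=3):
    """Minimal (mu, lambda) evolution strategy: sample a population
    around the current mean, keep the elite, and recentre on them.
    Lower objective values are better."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)
    for _ in range(iters):
        samples = mean + sigma * rng.standard_normal((pop, dim))
        fitness = np.array([objective(x) for x in samples])
        elite_idx = np.argsort(fitness)[:elite]  # best `elite` candidates
        mean = samples[elite_idx].mean(axis=0)
        sigma *= 0.97                            # slowly anneal the step size
    return mean

# A non-differentiable objective (stepped distance to the target):
# gradient descent gets no signal here, but the ES still converges.
target = np.ones(4)
best = simple_es(lambda x: np.floor(10 * np.abs(x - target)).sum(), dim=4)
print(best)
```

The same mechanism extends to memory management: the "genome" would parameterize the scoring network, and fitness would be downstream task performance with the pruned cache.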

The team selected Llama 3-8b as the foundational model for training NAMMs, conducting comprehensive evaluations on the LongBench, InfiniteBench, and ChouBun benchmarks.

The results show that NAMMs significantly improve the performance of the Llama 3-8b transformer, surpassing existing hand-crafted memory management techniques such as H2O and L2.

For instance, on the LongBench benchmark, NAMMs reduced the KV cache to just 25% of its original size while simultaneously achieving an 11% performance gain. On InfiniteBench, performance rose from a baseline of 1.05% to 11%, all while the cache shrank to a mere 40% of its original size.

Additionally, NAMMs exhibit a noteworthy property: impressive zero-shot transferability. The research team found that NAMMs trained solely on language tasks transfer seamlessly to other architectures and modalities. For example, when applied to the Llava Next Video-7B model, NAMMs delivered commendable results on the LongVideoBench and MLVU benchmarks, achieving a 1% gain on visual tasks while reducing the video-frame cache by 72%.

In the domain of reinforcement learning, NAMMs yielded a 9% performance increase in D4RL benchmark tests utilizing decision transformers, concurrently shrinking the cache size to 81% of its original size.

Delving deeper into the operational schema of NAMMs, the research team uncovered a sophisticated memory management strategy emerging within the model.

By observing retention patterns across layers, they found that the model tends to preserve more older tokens in the early and middle layers, likely because these layers process and aggregate information across long distances. In data-intensive coding tasks, by contrast, the model learned to retain a relatively larger share of tokens.

Ultimately, NAMMs continue Sakana AI's established research methodology of drawing insights from nature and optimizing AI systems by mimicking natural evolutionary processes. This research trajectory aligns with the company's expertise in model merging and evolutionary optimization.

Similar to the automated “evolution” algorithm previously developed by Sakana AI, which autonomously identifies and integrates superior models, NAMMs operationalize evolutionary algorithms to enhance memory management systems, achieving continual performance improvements without human intervention.

This approach has helped propel the year-old startup to a $1.5 billion valuation, on the back of $210 million in Series A funding.

Looking forward, the research team aspires to explore more complex memory model designs, potentially investigating finer-grained feature extraction techniques or examining how NAMMs can be synergized with other optimization technologies.

They stated, “This work is just the beginning of exploring the potential of our new class of memory models, and we anticipate that it may unlock a multitude of new opportunities for the evolution of Transformers in future generations.”