НИИ Антропогенеза

Google presented Atlas (A powerful Titan): a new architecture with long-term in-context memory

Google presented Atlas (A powerful Titan): a new architecture with long-term in-context memory that learns how to memorize the context at test time
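
To give a feel for what "memorizing the context at test time" can mean, here is a minimal sketch. It is not Atlas and not the update rule from the paper. It is a generic linear associative memory that starts empty and is fitted to key/value pairs from the context by plain gradient steps on the squared recall error (the classic delta rule). The names `recall`, `memorize`, `context`, the learning rate and the number of sweeps are all assumptions made for illustration. The part the post highlights, learning *how* to memorize, is not modeled here.

```python
def recall(memory, key):
    """Read from the memory: returns the matrix-vector product memory @ key."""
    return [sum(w * x for w, x in zip(row, key)) for row in memory]


def memorize(pairs, dim_out, dim_in, lr=0.5, sweeps=100):
    """Fit a linear memory to (key, value) pairs at test time.

    The memory starts at zero. Each update is one gradient step on
    0.5 * ||memory @ key - value||^2, whose gradient is outer(err, key).
    Hypothetical helper for illustration only, not the Atlas update rule.
    """
    memory = [[0.0] * dim_in for _ in range(dim_out)]
    for _ in range(sweeps):
        for key, value in pairs:
            # Recall error for this pair under the current memory.
            err = [p - t for p, t in zip(recall(memory, key), value)]
            for i in range(dim_out):
                for j in range(dim_in):
                    memory[i][j] -= lr * err[i] * key[j]
    return memory


# Toy "context": two unit-norm keys that are not orthogonal to each other,
# so the memory has to resolve the interference between them.
context = [
    ([1.0, 0.0, 0.0], [2.0, -1.0]),
    ([0.6, 0.8, 0.0], [0.5, 3.0]),
]

memory = memorize(context, dim_out=2, dim_in=3)
for key, value in context:
    print(key, "->", [round(x, 4) for x in recall(memory, key)], "target", value)
```

No weights are trained ahead of time in this sketch. Everything the memory knows comes from the pairs it sees while reading the context, which is the general idea the post refers to.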

Atlas even outperforms Titans, and is more effective than Transformers and modern linear RNNs in language modeling tasks

It further improves the effective context length of Titans and scales to a 10M-token context window with over 80% accuracy on the BABILong benchmark

Bonus: building on the ideas behind Atlas, the authors also discuss another family of models that are strict generalizations of softmax attention