[T@W intro] Drew Jaegle — Long-Context Anymodal Generation with Perceivers

Drew Jaegle (Research Scientist at DeepMind) gives an overview of his upcoming talk at the Transformers at Work workshop.

Title: Long-Context Anymodal Generation with Perceivers

Abstract: A central goal of Artificial Intelligence is the development of systems that flexibly process data from any modality for any task. Perceivers are a family of architectures that scale well to very large inputs in many modalities by encoding data to a latent bottleneck. But latent-space encoding handles all elements in a single pass, while autoregressive generation, which has become the go-to tool for generation in language and many other domains, assumes processing happens one element at a time. I will describe Perceiver AR, a recently proposed long-context autoregressive model that avoids these problems by carefully restructuring the Perceiver latent space. Perceiver AR obtains state-of-the-art performance on generation benchmarks on images, language, and music, while scaling to inputs several orders of magnitude longer than Transformer-XL, even when using very deep architectures. Perceiver AR's long context window allows it to easily support data without a natural left-to-right ordering, and its latent structure allows the compute budget to be adapted at evaluation time for either improved performance or reduced generation time.
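
The abstract's core idea is that the latent queries are tied to the final positions of the input: one causally masked cross-attention reads the full long context into a short latent sequence, and the remaining self-attention layers operate only on those latents. The following is a minimal sketch of that structure in plain NumPy, under stated assumptions: the function names, the toy embeddings, and the omission of learned projection weights and output heads are illustrative simplifications, not the DeepMind implementation.

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v, mask=None):
    # q: [m, d]; k, v: [n, d]. Standard scaled dot-product attention,
    # with disallowed positions masked to a large negative score.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    if mask is not None:
        scores = np.where(mask, scores, -1e9)
    return softmax(scores) @ v

def perceiver_ar_sketch(inputs, num_latents, depth):
    # inputs: [n, d] embedded tokens (toy embeddings for illustration).
    # The last `num_latents` positions act as latent queries; a single
    # causally masked cross-attention reads the whole context, then a
    # small causal self-attention stack refines the latents.
    n, d = inputs.shape
    m = num_latents
    latents = inputs[-m:]                       # queries = final m positions
    # Causal cross-attention mask: the latent at absolute position p may
    # only attend to input positions <= p.
    pos_q = np.arange(n - m, n)[:, None]        # [m, 1]
    pos_k = np.arange(n)[None, :]               # [1, n]
    cross_mask = pos_k <= pos_q                 # [m, n]
    latents = attention(latents, inputs, inputs, cross_mask)
    # Causal self-attention over the (much shorter) latent sequence.
    self_mask = np.tril(np.ones((m, m), dtype=bool))
    for _ in range(depth):
        latents = latents + attention(latents, latents, latents, self_mask)
    return latents                              # one output per target position

# Example: a context far longer than the latent sequence.
rng = np.random.default_rng(0)
x = rng.normal(size=(8192, 64)).astype(np.float32)
out = perceiver_ar_sketch(x, num_latents=1024, depth=4)
print(out.shape)   # (1024, 64): outputs only for the final 1024 positions

The point of this layout is the cost split: the single cross-attention scales with num_latents x context length, while the deep stack scales only with the latent length, so the context can grow far beyond what a standard decoder-only Transformer of the same depth could afford.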