DeepSeek AI Releases Janus: A 1.3B Multimodal Model with Image Generation Capabilities

Researchers from DeepSeek-AI, the University of Hong Kong, and Peking University propose Janus, an autoregressive framework that unifies multimodal understanding and generation by using two separate visual encoding pathways. Unlike prior models that rely on a single visual encoder for both tasks, Janus gives each task its own pathway, and both feed into one unified transformer. This decoupling eases the conflict that arises when one encoder has to serve both understanding and generation, and it adds flexibility: each task can use the encoding method that suits it best. The name "Janus" reflects this duality, after the two-faced Roman god associated with transitions and coexistence.
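To make the two-pathway idea concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the encoder functions, the `shared_transformer` stub, the `forward` helper, and `D_MODEL` are all hypothetical names made up for this example, and the toy math inside them is not what Janus actually computes. The only point it shows is the routing: a task-specific visual encoder is chosen first, and its output then goes through the same shared sequence model either way.

```python
from typing import Callable, Dict, List

Vector = List[float]
Image = List[List[float]]  # toy grayscale image with pixel values in [0, 1]

D_MODEL = 4  # hypothetical shared embedding width


def understanding_encoder(image: Image) -> List[Vector]:
    """Toy stand-in for a semantic encoder: one coarse feature vector per row."""
    return [[sum(row) / len(row)] * D_MODEL for row in image]


def generation_encoder(image: Image) -> List[Vector]:
    """Toy stand-in for a discrete tokenizer: one quantized code per pixel."""
    codes = [int(round(p * 15)) for row in image for p in row]  # 16 toy codes
    return [[code / 15.0] * D_MODEL for code in codes]


# Each task gets its own visual pathway (the decoupling described above).
ENCODERS: Dict[str, Callable[[Image], List[Vector]]] = {
    "understanding": understanding_encoder,
    "generation": generation_encoder,
}


def shared_transformer(sequence: List[Vector]) -> List[Vector]:
    """Placeholder for the single unified model: a causal running mean,
    so position i only depends on positions 0..i."""
    outputs: List[Vector] = []
    running = [0.0] * D_MODEL
    for count, vec in enumerate(sequence, start=1):
        running = [r + v for r, v in zip(running, vec)]
        outputs.append([r / count for r in running])
    return outputs


def forward(task: str, image: Image, text_embeddings: List[Vector]) -> List[Vector]:
    """Route the image through the task's encoder, then the shared model."""
    visual_tokens = ENCODERS[task](image)
    return shared_transformer(text_embeddings + visual_tokens)


image = [[0.0, 1.0], [0.5, 0.5]]
prompt = [[0.0] * D_MODEL]  # one toy text embedding
und_out = forward("understanding", image, prompt)
gen_out = forward("generation", image, prompt)
```

In this toy setup the two pathways produce different numbers of visual tokens for the same image (one per row versus one per pixel), yet both sequences pass through the same `shared_transformer`, which mirrors the description of separate encoders feeding a unified transformer.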

Audio created by NotebookLM and reviewed by a human

#opensource #artificialintelligence #neuralnetworks #datascience #ai

Marktechpost AI

Comments

is it possible to run it on a macbook pro m3 max 128gb ram?

BeastModeDR
