Apple’s In-House Multimodal AI Model Manzano: Combining Image Understanding and Generation
Apple is developing a multimodal AI model called Manzano that integrates image understanding and image generation in a single system, aiming to avoid the trade-offs existing models face when handling both kinds of visual task. Manzano uses a hybrid image tokenizer that produces both continuous and discrete tokens from a shared encoder, reducing conflict between the two tasks. The architecture has three parts: the hybrid tokenizer, a unified language model, and a separate image decoder. The model is offered at parameter scales ranging from 900 million to 3.52 billion and supports multiple output resolutions.
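To make the hybrid-tokenizer idea concrete, here is a minimal PyTorch-style sketch. It is not Apple's implementation; the module names, dimensions, and the nearest-codebook quantization step are illustrative assumptions. The point it shows is the shared encoder feeding two paths: a continuous adapter for the understanding path and a discrete codebook lookup that yields token IDs for the generation path.

```python
import torch
import torch.nn as nn

class HybridTokenizer(nn.Module):
    """Illustrative hybrid image tokenizer: one shared encoder, two outputs.

    - continuous embeddings -> understanding path (fed to the language model)
    - discrete codebook IDs -> generation path (later decoded by an image decoder)
    """
    def __init__(self, patch_dim=768, dim=512, codebook_size=1024):
        super().__init__()
        # Stand-in for a shared vision backbone (e.g. a ViT); here just an MLP.
        self.encoder = nn.Sequential(
            nn.Linear(patch_dim, dim), nn.GELU(), nn.Linear(dim, dim)
        )
        # Continuous adapter: projects shared features for the understanding path.
        self.cont_adapter = nn.Linear(dim, dim)
        # Codebook for the discrete (generation) path.
        self.codebook = nn.Embedding(codebook_size, dim)

    def forward(self, patches):
        feats = self.encoder(patches)                 # (B, N, dim) shared features
        continuous = self.cont_adapter(feats)         # understanding tokens
        # Quantize: index of the nearest codebook vector for each patch feature.
        codebook = self.codebook.weight.unsqueeze(0).expand(feats.size(0), -1, -1)
        dists = torch.cdist(feats, codebook)          # (B, N, codebook_size)
        discrete_ids = dists.argmin(dim=-1)           # generation token IDs
        return continuous, discrete_ids

# Toy usage: 2 images, 196 patches each, 768-dim patch features.
tokenizer = HybridTokenizer()
patches = torch.randn(2, 196, 768)
cont, ids = tokenizer(patches)
print(cont.shape, ids.shape)  # torch.Size([2, 196, 512]) torch.Size([2, 196])
```

Because both token types come from the same encoder, the language model sees a single visual feature space for understanding and generation, which is the mechanism the article credits with reducing task conflict.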