The Kimi-VL-A3B-Thinking-2506 model has been released
Moonshot AI (月之暗面, literally "Dark Side of the Moon") has released Kimi-VL-A3B-Thinking-2506, a vision-language model that takes images and text as input, reasons about the image content, and generates text output. For example, a user can upload an image of Hong Tailang hitting Hui Tailang with a frying pan (the two wolf characters from the animated series Pleasant Goat and Big Big Wolf) and ask why Hui Tailang doesn't fight back. The 2506 release is a fine-tuned update built on Kimi-VL-A3B-Instruct. Its biggest improvement is support for image inputs of up to 3.2 million pixels, four times the limit of the previous version. It also outperforms Qwen2.5-VL-7B in a range of tests.
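To make the pixel figures concrete, here is a minimal Python sketch of a client-side check that scales an image's dimensions down to fit a 3.2-million-pixel budget. The function name `fit_to_pixel_budget` and the idea of resizing before upload are assumptions for illustration only. The model's own preprocessing may handle large images differently, and the announcement does not describe it.

```python
import math

# Pixel budget quoted in the announcement for the 2506 release.
MAX_PIXELS = 3_200_000
# Budget implied for the previous version by the "4 times" figure (3.2M / 4).
PREVIOUS_MAX_PIXELS = MAX_PIXELS // 4


def fit_to_pixel_budget(width, height, max_pixels=MAX_PIXELS):
    """Return (width, height) scaled down, keeping the aspect ratio,
    so that width * height does not exceed max_pixels.

    Hypothetical helper: not part of any Kimi-VL API.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    pixels = width * height
    if pixels <= max_pixels:
        return width, height  # already within budget, leave unchanged
    # Scaling both sides by sqrt(budget / pixels) scales the area by budget / pixels.
    scale = math.sqrt(max_pixels / pixels)
    # Round down so the product cannot exceed the budget.
    return max(1, math.floor(width * scale)), max(1, math.floor(height * scale))


if __name__ == "__main__":
    print(fit_to_pixel_budget(1920, 1080))  # within budget, unchanged
    print(fit_to_pixel_budget(4000, 3000))  # 12 million pixels, scaled down
```

Under these assumptions, a 1920×1080 image (about 2.07 million pixels) fits the new budget as-is but would have exceeded the 0.8 million pixels implied for the previous version.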