Alibaba Group has introduced a new version of its artificial intelligence technology, Qwen VLo, which enables users to create and edit images using both text and visual inputs. This release is part of the company's broader range of AI services.
The new model is an upgrade of the earlier Qwen2.5-VL and supports both text-to-image and image-to-image generation. Through a technique called progressive generation, users can watch an image take shape as it is created.
In a post on X, the company announced the release of the new model, along with its features and a link to access it.
According to the company's blog post on GitHub, Qwen VLo is a unified multimodal understanding and generation model. It not only “understands the world” but also generates high-quality images based on that understanding.
Text-to-image and image-to-image generation
With Qwen VLo, you can send a prompt such as ‘generate a picture of a dog’, or upload an image of a dog and ask the model to edit it. According to the blog post, previous models struggled with semantic inconsistencies, such as misinterpreting a car as another object or failing to retain key features of the car. The company says Qwen VLo addresses this and can correctly identify key features of a car, such as its model and color.
Open-ended instruction-based editing
When editing an image, Qwen VLo responds to open-ended instructions such as ‘add a sun to the sky’ or ‘make the photo look like it is from the 19th century’. It can also carry out traditional perception tasks, such as predicting depth maps, segmentation maps, detection maps, and edge information. It can perform several of these editing functions at the same time.
Multilingual support for prompts
Users can write their instructions in multiple languages, including Chinese and English. According to the company, the model will understand instructions regardless of the language.
Alibaba, best known for its e-commerce services, has been integrating AI into its products and building standalone offerings around Qwen. In February, Chief Executive Officer Eddie Wu went so far as to say the company’s “primary objective” is now artificial general intelligence, meaning AI systems with human-level intellectual capabilities.
Source: Hindustan Times
Bd-pratidin English/ ANI