Alibaba Cloud launches open-source large vision language model

Qwen-VL is the multimodal version of Qwen-7B, the 7-billion-parameter version of Alibaba Cloud’s large language model Tongyi Qianwen, which is also open source and available on ModelScope.

Upgrade Staff

Published

September 4, 2023

Alibaba Cloud, the digital technology and intelligence backbone of Alibaba Group, has launched two open-source large vision language models (LVLMs): Qwen-VL and its conversationally fine-tuned counterpart, Qwen-VL-Chat. The models can comprehend images, text, and bounding boxes in prompts and support multi-round question answering in both English and Chinese.

Qwen-VL is the multimodal version of Qwen-7B, the 7-billion-parameter version of Alibaba Cloud’s large language model Tongyi Qianwen, which is also open source and available on ModelScope. Capable of understanding both image inputs and text prompts in English and Chinese, Qwen-VL can perform tasks such as answering open-ended questions about images and generating image captions.

Qwen-VL-Chat caters to more complex interactions, such as comparing multiple image inputs and engaging in multi-round question answering. Leveraging alignment techniques, this AI assistant exhibits a range of creative capabilities, including writing poetry and stories based on input images, summarizing the content of multiple pictures, and solving mathematical questions displayed in images.

Contribution to open source and inclusivity

In a bid to democratize AI technologies, Alibaba Cloud has shared the models’ code, weights, and documentation with academics, researchers, and commercial institutions worldwide. This contribution to the open-source community is accessible via Alibaba’s AI model community ModelScope and the collaborative AI platform Hugging Face. For commercial uses, companies with over 100 million monthly active users can request a license from Alibaba Cloud.

The introduction of these models, with their ability to extract meaning and information from images, has the potential to change how people interact with visual content. For instance, with their image comprehension and question-answering capabilities, the models could in the future provide information assistance to visually impaired individuals during online shopping.

The Qwen-VL model was pre-trained on image and text datasets. Whereas other open-source large vision language models process and understand images at a resolution of 224*224, Qwen-VL can handle image input at a resolution of 448*448, resulting in better image recognition and comprehension.
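To put the resolution figures in perspective, the short Python sketch below compares how much visual information each input size carries. The pixel counts follow directly from the resolutions quoted above. The patch size of 14 is an assumption made purely for illustration (the article does not state how the model splits an image), and `image_stats` is a hypothetical helper, not part of any Qwen-VL code.

```python
# Compare a 224*224 input with a 448*448 input.
# PATCH_SIZE is an illustrative assumption; the article does not specify it.
PATCH_SIZE = 14


def image_stats(side: int, patch: int = PATCH_SIZE) -> dict:
    """Return pixel and patch counts for a square image with the given side."""
    patches_per_side = side // patch
    return {
        "pixels": side * side,
        "patches": patches_per_side * patches_per_side,
    }


low = image_stats(224)
high = image_stats(448)

print("224*224:", low)
print("448*448:", high)
print("pixel ratio:", high["pixels"] / low["pixels"])
```

Doubling each side quadruples the number of pixels, which suggests why fine detail such as small text in an image may be easier to recognize at the higher resolution.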

Based on various benchmarks, Qwen-VL recorded outstanding performance on several visual language tasks, including zero-shot captioning, general visual question answering, text-oriented visual question answering, and object detection.

Qwen-VL-Chat has also achieved leading results in both Chinese and English for text-image dialogue and for its level of alignment with humans, according to Alibaba Cloud’s own benchmark test. This test involved over 300 images, 800 questions, and 27 categories.

Earlier this month, Alibaba Cloud open-sourced its 7-billion-parameter LLMs, Qwen-7B and Qwen-7B-Chat, as part of its ongoing contribution to the open-source community. The two models have had over 400,000 downloads within a month of their launch.
