HEADLINES

Alibaba earns top spot in global visual question answering leaderboard

Published August 27, 2021

Alibaba has secured first place on the latest global Visual Question Answering (VQA) leaderboard, outperforming the human baseline on the same task. This is the first time a machine has outperformed humans at understanding images in order to answer text questions: Alibaba’s algorithm recorded an accuracy of 81.26% in answering image-related questions, compared with human performance of 80.83% (on the test-standard split).

The challenge, organized annually since 2015 by CVPR, a leading international computer vision conference, attracts participants from around the world, including Facebook, Microsoft and Stanford University. Each evaluation item presents an image and a related natural language question, and participants must provide an accurate natural language answer. This year, the challenge contained more than 250,000 images and 1.1 million questions.

The breakthrough in machine intelligence for answering image-related questions was made possible by the innovative algorithm design from Alibaba DAMO Academy, the global research and development initiative of Alibaba Group. By leveraging its proprietary technologies, including diverse visual representations, multimodal pretrained language models, and adaptive cross-modal semantic fusion and alignment technology, the Alibaba team made significant progress not only in analyzing the images and understanding the intent of the questions, but also in answering them with proper reasoning and expressing the answers in a human-like conversational style.

The VQA technology has already been widely applied across Alibaba’s ecosystem. For example, it has been used in Alibaba’s intelligent chatbot Alime Shop Assistant, which is used by tens of thousands of merchants on Alibaba’s retail platforms.


“We are proud that we have achieved another significant milestone in machine intelligence, which underscores our continuous efforts in driving the research and development in related AI fields,” said Si Luo, Head of Natural Language Processing (NLP) at Alibaba DAMO Academy. “This is not implying humans will be replaced by robots one day. Rather, we are confident that smarter machines can be used to assist our daily work and life, and hence, people can focus on the creative tasks that they are best at.”

VQA can be used in a wide range of areas, Si Luo added. For example, it can help users search for products on e-commerce sites, support the analysis of medical images for initial disease diagnosis, and assist with smart driving, where an in-car AI assistant can offer basic analysis of photos captured by the in-car camera.

This is not the first time Alibaba’s machine-learning model has eclipsed others. Alibaba’s model also topped the GLUE benchmark rankings, an industry leaderboard widely regarded as the most important benchmark for NLP models. Alibaba’s model significantly outperformed the human baselines, marking a key milestone in the development of robust natural language understanding systems.

In 2019, Alibaba’s model exceeded human scores on the Microsoft Machine Reading Comprehension (MS MARCO) dataset, one of the artificial intelligence world’s most challenging tests of reading comprehension. The model scored 0.54 in the MS MARCO question-answering task, outperforming the human score of 0.539, a benchmark provided by Microsoft. In 2018, Alibaba also scored higher than the human benchmark on the Stanford Question Answering Dataset, another of the most popular machine reading comprehension challenges worldwide.

Alibaba’s model “AliceMind” earned the top spot in the global VQA Challenge 2021


