Alibaba’s DAMO Academy, the group’s global research program, marks another major breakthrough in the machine-reading capability of artificial intelligence with its Natural Language Processing (NLP) model topping the GLUE benchmark rankings.
The model reached the top of the GLUE benchmark rankings, an industry table perceived as the most important baseline test for NLP models, on March 3. DAMO’s NLP model also significantly outperformed the human baselines, a key milestone in the development of robust natural language understanding systems.
DAMO’s existing model, which is widely deployed in Alibaba’s ecosystem, powers the company’s customer service AI chatbot, retail platform search engines and anonymous healthcare data analysis systems. The model was also used in the text analysis of medical records and in epidemiological investigations by centers for disease control in different cities in China to fight COVID-19.
Si Luo, head of NLP Research at Alibaba DAMO Academy, said: “We are excited to achieve a new breakthrough in driving research in NLP development. As a core technology, not only does NLP underpin Alibaba’s various businesses, which serve hundreds of millions of customers, but it also serves as a critical technology in fighting the coronavirus. We hope we can continue to leverage our leading technologies and contribute to the community during this difficult time.”
Alibaba’s multitask machine-learning model StructBERT delivers impressive empirical results on a variety of downstream tasks, resulting in a GLUE benchmark score of 90.3, higher than the human baseline of 87.1. The model, which is based on the pretrained language model BERT and incorporates word and sentence structures, also boosts performance in many language-understanding applications such as sentiment analysis, textual entailment and question-answering.
General Language Understanding Evaluation (GLUE) is a platform for evaluating and analyzing NLP systems. It attracts key global AI players, including Google, Facebook, Microsoft and Stanford, to participate every year. The GLUE benchmark is an industry table perceived as the most important baseline test for training, evaluating and analyzing NLP systems.
In the Philippines, adoption of AI and NLP has been picking up pace these past few years. Just recently, Manila Mayor Francisco Moreno Domagoso announced that the city will adopt AI technology to create a platform allowing Manila residents to file complaints that can be submitted directly to the Office of the Mayor and the department tasked with handling such complaints.
Although faced with technology hurdles such as a lack of internet infrastructure and slow connectivity, businesses in the Philippines have been eager to adopt AI and related technologies. The national government announced last year that it will craft an AI roadmap to help transform the country and make it more competitive globally.
Alibaba has leveraged its proprietary technologies in recent months to help contain the coronavirus. Alibaba DAMO Academy has teamed up with Chinese medical institutions to develop an AI system that can expedite diagnosis and analysis of the virus. In February, Alibaba Cloud made its cloud-based AI-powered computing platform available for free to global research institutions to accelerate viral gene sequencing, protein screening and other research in treating or preventing the spread of the virus.
This recent GLUE top score is not the first time Alibaba’s machine-learning model has outdone others. On June 20, 2019, Alibaba’s model bested human scores in the Microsoft Machine Reading Comprehension (MS MARCO) dataset, one of the AI industry’s most challenging tests for reading comprehension. The model scored 0.54 in the MS MARCO question-answering task, outperforming the human score and Microsoft benchmark of 0.539. In 2018, Alibaba also scored higher than the human benchmark in the Stanford Question Answering Dataset, another popular machine reading-comprehension challenge worldwide.