A thousand times more powerful than BERT and capable of multitasking to connect information for users in new ways: ever since its presentation at the last Google I/O, Google MUM has been described in sensational terms, which helped raise expectations around the Multitask Unified Model technology (nicknamed MUM). According to the first tests, the hype is fully justified, and the concrete results of MUM’s applications to Search are already outstanding.
Google MUM applied to Search: first results
So says Pandu Nayak, Google Fellow and Vice President of Search, in a post on the company blog describing how MUM has improved Google searches for information on COVID-19 vaccines.
The starting problem is easy to understand: across different languages there are over 800 variants of vaccine names – AstraZeneca, CoronaVac, Moderna, Pfizer, Sputnik and other widely distributed vaccines all have “many different names around the world” – so people googling for vaccine information may use any number of queries, such as “Coronavaccin Pfizer”, “mrna-1273” or “Covaccine”, a list that could go on almost indefinitely.
For Google it is crucial “to correctly identify all these names to provide people with the latest reliable information about the vaccine”, but pinning down the different ways people refer to vaccines around the world normally “takes a lot of time and hundreds of human hours”.
This is where the MUM technology comes in: it managed to “identify over 800 variants of vaccine names in more than 50 languages in a matter of seconds”, completing at once the task of identifying and matching vaccine names across all languages. After validating these results, Google applied them to Search to give people a way to “find timely and high-quality information about COVID-19 vaccines around the world”.
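MUM itself is not publicly available, but the core idea – matching name variants across languages by meaning rather than by spelling – can be sketched with open tools. The snippet below is a minimal illustration assuming the sentence-transformers library; the model choice, candidate strings and similarity threshold are demo assumptions, not Google’s actual pipeline.

```python
# Illustrative sketch only: we approximate cross-lingual name matching
# with an open multilingual sentence encoder, since MUM is not public.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

# A small seed list of "official" names...
canonical = ["Pfizer COVID-19 vaccine", "Moderna COVID-19 vaccine"]

# ...and strings as users might actually type them, in various languages.
candidates = [
    "Coronavaccin Pfizer",  # Dutch
    "vacuna de Moderna",    # Spanish
    "mrna-1273",            # Moderna's development code name
    "weather tomorrow",     # unrelated query
]

canon_emb = model.encode(canonical, convert_to_tensor=True)
cand_emb = model.encode(candidates, convert_to_tensor=True)

# Cosine similarity between every candidate and every canonical name.
scores = util.cos_sim(cand_emb, canon_emb)

for cand, row in zip(candidates, scores):
    best = row.argmax().item()
    if row[best] > 0.5:  # arbitrary demo threshold
        print(f"{cand!r} -> {canonical[best]} ({row[best]:.2f})")
    else:
        print(f"{cand!r} -> no confident match")
```

The embed-and-compare pattern is what makes the task scale: once names live in a shared multilingual vector space, a small official list can be checked against huge numbers of candidate strings without per-language rules.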
MUM’s characteristics
In its first official test, thanks to its ability to transfer knowledge, MUM proved able to accomplish in a few seconds a job that would otherwise have taken several weeks.
The MUM technology is based on a Transformer architecture and, according to the presentation given by Prabhakar Raghavan, can perform several activities simultaneously: it understands and generates language – it is trained across 75 different languages – and it is multimodal, able to learn information from multiple sources and formats, such as images, text and video.
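For a concrete picture: MUM reportedly uses a text-to-text framework in the T5 family, and public multilingual relatives such as mT5 give a rough feel for what “one Transformer that both understands and generates language” looks like in code. The sketch below assumes the Hugging Face transformers library and uses mT5 purely as a stand-in; it is not MUM or its API.

```python
# Minimal sketch: a multilingual, text-to-text Transformer (mT5, which
# covers 101 languages) standing in for the general idea behind MUM.
from transformers import AutoTokenizer, MT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

# Text-to-text framing: every task is "string in, string out",
# regardless of the input language (here, German).
inputs = tokenizer("Wann wurde der Impfstoff zugelassen?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)

# Note: the raw pretrained checkpoint only learned a span-corruption
# objective, so it needs task fine-tuning before outputs are meaningful.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```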
To clarify how it works, Nayak used the example of a person reading a book: if they are multilingual, they will be able to share the book’s main conclusions in the other languages they speak – depending on how fluent they are – because their understanding of the book does not depend on the language it was read in, or on translation. MUM transfers knowledge across languages in a very similar way.
Also thanks to this property, MUM does not have to learn a new capability for each language: it can transfer what it learns between languages, helping Google quickly scale improvements even to languages with less training data than others. This, Nayak explains, is partly due to MUM’s sample efficiency, as it requires far fewer data inputs than previous models to accomplish the same task.
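A minimal way to see this kind of cross-lingual transfer with public tools is to train a classifier only on English examples over multilingual embeddings, then apply it to languages it never saw labels for. Everything below – the encoder choice, the toy queries and the labels – is an assumption for illustration, not MUM’s actual training setup.

```python
# Toy sketch of zero-shot cross-lingual transfer: a classifier trained only
# on English data works on other languages because the multilingual encoder
# maps equivalent sentences close together. All data here is made up.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

# English-only training data: 1 = vaccine-related query, 0 = anything else.
train_texts = [
    "where can I get the covid vaccine",
    "pfizer vaccine side effects",
    "best pizza near me",
    "football scores today",
]
train_labels = [1, 1, 0, 0]

clf = LogisticRegression().fit(encoder.encode(train_texts), train_labels)

# Queries in languages the classifier never saw a single label for.
test_texts = [
    "dove posso vaccinarmi contro il covid",  # Italian
    "efectos secundarios de la vacuna",       # Spanish
    "meilleure pizzeria près de chez moi",    # French
]
for text, pred in zip(test_texts, clf.predict(encoder.encode(test_texts))):
    print(pred, text)
```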
The difficulties of vaccine-related searches
In the case of vaccines, starting from only a small sample of official names, MUM was able to quickly identify all the variants across languages.
Ambiguity around names is more common than we might imagine at first glance, and Nayak cites some English terms that mean practically the same thing, like soda and pop, sweater and jumper, or soccer and football: sometimes this depends on a characteristic of the language itself, other times on trends or cultural nuances, or simply on the part of the world we happen to be in when we search.
With COVID-19, people searched for information from all over the world, and Google “had to learn to identify all the different phrases used to refer to the novel coronavirus, to make sure that the timely, quality information made available by trusted health authorities – such as the World Health Organization and the Centers for Disease Control and Prevention – surfaced”. A year later, the search engine faces a similar challenge with vaccine names, only this time it has a powerful new tool that helps it simplify and win the challenge: precisely the Multitask Unified Model (MUM) technology.
New MUM applications are on the way
Identifying the names of anti-Covid vaccines is MUM’s first and, so far, only confirmed application in search results, but Google expects to keep using this technology in ways that make Search increasingly useful to people, whether by improving existing services or by creating new tools.
Nayak writes that Big G is “eager to discover the many ways in which MUM can make Search even more useful for people in the future”, starting with the ability to deliver essential information to people in a timely manner, wherever they are.
Early internal tests indicate that MUM will not only improve many aspects of today’s search systems, but will also help create entirely new ways to search for and explore information.
As users, then, it will be interesting to see at work the potential of a system that could offer concrete benefits in real-world use cases, allowing us to launch searches in ways we previously thought too complex for search engines to understand. If this innovation proves as effective and sci-fi as its premises suggest, it will also help Google maintain its position as market leader in Search, or even gain a further advantage over its competitors.