Google Announces MUM: Multitask Unified Model

As part of several announcements made on Tuesday, May 18, and yesterday, May 20, Google announced a brand-new core component of its search algorithm called MUM, or Multitask Unified Model.

What is the significance of this new model?

According to Google, it is similar to BERT but 1,000 times more powerful. In other words, BERT may soon look dated by comparison.

Pandu Nayak, Google Fellow and Vice President of Search, went into more detail about how MUM works.

This new model aims to make sense of multiple queries simultaneously. This synthesis of information allows MUM to provide answers based on its understanding of multiple queries.

Google’s goal with MUM is to make Google a searcher’s one-stop shop for answers: in effect, an AI-powered digital personal assistant. The aim appears to be keeping people on Google rather than sending them to another site to find answers.

Nayak remarked that today’s search engines are not yet sophisticated enough to act the way a real expert would, by anticipating the related questions behind a single query and answering them as well.

He offered the following example:

“Take this scenario: You’ve hiked Mt. Adams. Now you want to hike Mt. Fuji next fall, and you want to know what to do differently to prepare. Today, Google could help you with this, but it would take many thoughtfully considered searches — you’d have to search for the elevation of each mountain, the average temperature in the fall, difficulty of the hiking trails, the right gear to use, and more. After a number of searches, you’d eventually be able to get the answer you need.

But if you were talking to a hiking expert; you could ask one question — “what should I do differently to prepare?” You’d get a thoughtful answer that takes into account the nuances of your task at hand and guides you through the many things to consider.”

Indeed, this technology is likely to shift user search behaviors dramatically in the years to come after it is released.

How Does MUM Work?

While Google has not spelled out exactly which problems MUM attempts to solve, it appears to be engineered for long-form questions and should help with some of the more ambiguous search queries.

Long-form questions are, essentially, big-picture questions that must be broken into several smaller questions to reach a complete answer or solution.

Google already does pretty well with simple answers. But certain topics require multiple searches to gather all the information someone needs to make an informed decision on something consequential.

MUM aims to get rid of this multi-tiered search query process and save the searcher work in the end.
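
To make the contrast concrete, here is a minimal Python sketch of the manual, multi-tiered process described above: one big-picture question is split into smaller lookups, each lookup runs as its own query, and the results are merged at the end. The `FACTS` table and the `search` and `compare` functions are hypothetical names invented for this illustration, and nothing here reflects how Google Search or MUM is actually implemented.

```python
# Hypothetical in-memory "search index": (mountain, attribute) -> fact.
# Elevations are rounded public figures; the season values are placeholders.
FACTS = {
    ("Mt. Adams", "elevation_m"): 3743,
    ("Mt. Fuji", "elevation_m"): 3776,
    ("Mt. Adams", "trip_season"): "placeholder: previous trip",
    ("Mt. Fuji", "trip_season"): "fall",
}


def search(mountain, attribute):
    """Stand-in for one single-purpose search query."""
    return FACTS.get((mountain, attribute))


def compare(old, new, attributes):
    """Run one query per mountain per attribute, then merge the results."""
    report = {}
    queries_run = 0
    for attribute in attributes:
        old_value = search(old, attribute)
        new_value = search(new, attribute)
        queries_run += 2
        report[attribute] = {
            "old": old_value,
            "new": new_value,
            "differs": old_value != new_value,
        }
    return report, queries_run


report, queries_run = compare("Mt. Adams", "Mt. Fuji", ["elevation_m", "trip_season"])
print(queries_run)  # 4 separate lookups for a single big-picture question
for attribute, row in report.items():
    print(attribute, row)
```

Even in this toy version, two attributes cost four lookups, and the person asking still has to decide which attributes matter. That bookkeeping is the work MUM is meant to take over.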

Like BERT, MUM is built on a transformer architecture. In its write-up on long-form open-domain question answering, Google explains the following about how text generation works in its Routing Transformer (RT) model:

“The main workhorse of NLP models is the Transformer architecture, in which each token in a sequence attends to every other token in a sequence, resulting in a model that scales quadratically with sequence length. The RT model introduces a dynamic, content-based sparse attention mechanism that reduces the complexity of attention in the Transformer model from n^2 to n^1.5, where n is the sequence length, which enables it to scale to long sequences. This allows each word to attend to other relevant words anywhere in the entire piece of text, unlike methods such as Transformer-XL where a word can only attend to words in its immediate vicinity.

The key insight of the RT work is that each token attending to every other token is often redundant, and may be approximated by a combination of local and global attention. Local attention allows each token to build up a local representation over several layers of the model, where each token attends to a local neighborhood, facilitating local consistency and fluency. Complementing local attention, the RT model also uses mini-batch k-means clustering to enable each token to attend only to a set of most relevant tokens.

Attention maps for the content-based sparse attention mechanism used in Routing Transformer. The word sequence is represented by the diagonal dark colored squares. In the Transformer model (left), each token attends to every other token. The shaded squares represent the tokens in the sequence to which a given token (the dark square) is attending. The RT model uses both local attention (middle), where tokens attend only to other tokens in their local neighborhood, and routing attention (right), in which a token only attends to clusters of tokens most relevant to it in context. The dark red, green and blue tokens only attend to the corresponding color of lightly shaded tokens.

We pre-train an RT model on the Project Gutenberg (PG-19) data-set with a language modeling objective, i.e, the model learns to predict the next word given all the previous words, so as to be able to generate fluent paragraph long text.”

Clearly, the effectiveness of this transformer model should not be underestimated. Research like this most likely informs the transformer architecture at the core of MUM.
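
The complexity claim in the quoted passage can be checked with a small experiment. The Python sketch below builds three boolean attention masks with NumPy (dense, local, and cluster-based) and counts how many token pairs each one allows. It is a simplified illustration only: the function names are made up for this example, the cluster assignment is hard-coded rather than learned with k-means, and the real Routing Transformer differs in many details.

```python
import numpy as np


def full_mask(n):
    """Dense attention: every token attends to every token (n * n pairs)."""
    return np.ones((n, n), dtype=bool)


def local_mask(n, window):
    """Local attention: token i attends to tokens within `window` positions."""
    idx = np.arange(n)
    return np.abs(idx[:, None] - idx[None, :]) <= window


def routing_mask(cluster_ids):
    """Routing-style attention: tokens attend only within their own cluster."""
    cluster_ids = np.asarray(cluster_ids)
    return cluster_ids[:, None] == cluster_ids[None, :]


n = 16
# Hypothetical assignment: 4 clusters of 4 tokens each (sqrt(n) per cluster).
# A real model would derive clusters from k-means over learned token vectors.
clusters = np.arange(n) % 4

print(full_mask(n).sum())            # 256 pairs, i.e. n^2
print(local_mask(n, 1).sum())        # 46 pairs
print(routing_mask(clusters).sum())  # 64 pairs, i.e. n^1.5
```

With sqrt(n) clusters of sqrt(n) tokens each, every token attends to sqrt(n) others, so the total is n * sqrt(n) = n^1.5 pairs. That is where the exponent in the quote comes from, at least under this idealized even split.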

What, Exactly, Is Question Answering?

Wikipedia calls question answering a computer science discipline. This discipline operates within the information retrieval and natural language processing (NLP) fields.

The primary goal of question answering is to build systems that automatically answer questions posed by human beings in a natural language.

Existing search technologies are not able to extrapolate an answer from a series of questions, say a series of 10 questions that each reveal different information about a specific subject. At least, not yet.

For every question, you have to perform a separate query. Then, your brain looks for the next logical query to surface the information you need, and so on.

Question answering is an opportunity for Google to create a complete question-and-answer system that can answer any question completely and accurately, and make logical “next-query” decisions the way a human being can.

Theoretically, all you would need to do is ask one question, and you could surface the information you need for all of the questions related to that initial query.

Advanced Question Answering

This is not a new challenge in the tech world. In July 2019, Facebook published a research paper on long-form question answering and released what it described as the first large-scale dataset, along with code and baseline models for long-form QA, on its GitHub account.

They also proposed the following hypothesis: questions about daily tasks should be something that’s relatively simple for any intelligent assistant. An assistant should be able to help out with a myriad of daily tasks. In order to do so, the AI must have the ability to also assist with answers to a very wide range of questions all at once.

At the current level, search engines as smart assistants can only answer questions directly, singularly and specifically, leaving much of the searching up to the person in order to find all the information that’s relevant to what the person is trying to accomplish.

Advanced question answering enables the digital smart assistant to perform sophisticated analysis through AI and provide the searcher with all of the answers they are looking for. This prevents them from having to perform a multitude of different searches to arrive at their final understanding of a complex question.

This is at the heart of Google’s MUM. Judging from how Google describes it, we are fairly confident that the company is trying to solve the problem of answering ambiguous questions by integrating MUM into Search.

In a few bullet points, MUM helps us to:

solve complex problems without having to perform multiple searches;
reach our goal faster by not having to do those searches ourselves;
find alternative answers to related queries we may have thought about but will not need until a later date; and
run more efficient searches, so that we find the solution several times faster.

With the research papers that have been released, we are confident that Google is attempting to solve these problems once and for all.

But Didn’t Google Release Passage Ranking? Doesn’t This Solve the Issue?

No, it does not. Passage ranking is nowhere near advanced enough to handle multiple questions at once and synthesize an answer from them.

Passage ranking was meant to surface answers to singular queries only. It was not meant to synthesize complex information.
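
As a rough illustration of what "surfacing an answer to a single query" means, the Python sketch below scores each passage on a page against one query and returns the best match. It uses simple term overlap, which is a stand-in chosen for brevity and not Google’s actual passage ranking method; the passages and the function names `tokenize` and `rank_passages` are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_passages(query, passages):
    """Score each passage by how many query terms it contains, best first."""
    query_terms = tokenize(query)
    scored = [(len(query_terms & tokenize(p)), p) for p in passages]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical passages taken from one imaginary page.
passages = [
    "Our shop opens at nine and closes at five on weekdays.",
    "Mt. Fuji is usually climbed between July and early September.",
    "Hiking boots should be broken in before a long climb.",
]

best_score, best_passage = rank_passages("when is Mt. Fuji climbed", passages)[0]
print(best_score, best_passage)
```

Note what the sketch cannot do: it matches one query to one passage, and has no way to combine the climbing-season passage with the boots passage into a single piece of advice.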

Even with this advancement, Google still has significant trouble answering questions that have a compare-and-contrast component.

How Google Provides Reliable Information in 2021

Providing reliable information is a core component of the process of making sure that question-answer queries are accurate.

In a blog post, Danny Sullivan gives a simplified explanation of how Google provides reliable information:

First, we fundamentally design our ranking systems to identify information that people are likely to find useful and reliable.
To complement those efforts, we also have developed a number of Search features that not only help you make sense of all the information you’re seeing online, but that also provide direct access to information from authorities—like health organizations or government entities.
Finally, we have policies for what can appear in Search features to make sure that we’re showing high quality and helpful content.

All of these steps work through the singular question model, where you must perform multiple searches in order to find and understand information about multiple aspects of any given subject.

MUM takes everything above and pushes it a step further, synthesizing information from multiple queries, sources, and entities so you don’t have to perform multiple searches in quite the same way.

A Brief History of Question Answering Systems

Let’s take a relatively brief trip down memory lane and examine exactly what happened before this major advancement.

Question answering systems have existed before, although they haven’t exactly been as advanced as MUM.

Two early systems were BASEBALL and LUNAR. BASEBALL, which dates to the early 1960s, answered questions about Major League Baseball over a period of one year.

LUNAR, developed in the early 1970s, answered questions about the rocks brought back by the Apollo moon missions. While nowhere near as advanced as MUM, both were reportedly effective at answering questions within their chosen domains.

LUNAR was reportedly able to answer 90% of the in-domain questions posed to it by people who had not been trained on the system.

To get an idea of how these systems worked language-wise, we can think of them as being quite similar to the first chatbot programs.

The 1970s also brought knowledge bases, which helped target question answering systems to specific fields of knowledge.

As the technology progressed, we have seen significant improvements in text comprehension along with answering questions.

This included advancements in certain technologies such as computational linguistics, which eventually led to the development of NLP.

In the field of information retrieval, there is what is called an open-domain question answering system, which aims to return a single answer in direct reply to a user’s question.

It is only because of the combination of NLP and advanced question answering that we now have Google’s technology called the Multitask Unified Model.

That is not a detailed history, but it should give you a basic understanding of where we came from and how we ended up with MUM.

Examples of Ambiguity with Search Queries

With the advent of MUM, Google may be able to finally solve the issue of ambiguity with certain search queries.

These queries have ambiguity because they involve a compare and contrast component to properly answer them. This compare and contrast component creates ambiguities in search that are not easily rectified.

By solving the problem of ambiguity in search queries, Google could essentially become a smart digital assistant, able to help with any task requiring complex compare-and-contrast processing.

One such example of ambiguity in a search query could be the following:

What is the best nearest restaurant?

Google Search’s existing technology cannot answer this definitively. It can provide a list of suggestions based on peer recommendations and reviews, but that’s about as far as it goes.

What MUM could theoretically do is provide you with a direct definitive answer based on both questions: the nearest restaurant as well as the best one.
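
The idea of answering "nearest" and "best" at the same time can be sketched as a simple weighted score. The Python example below is only an illustration under stated assumptions: the restaurants and their numbers are invented, and the `combined_score` function, its 5 km cutoff, and its equal weighting are arbitrary choices for this example, not anything Google has described.

```python
from dataclasses import dataclass


@dataclass
class Restaurant:
    name: str
    distance_km: float  # distance from the searcher
    rating: float       # average review score on a 0-5 scale


def combined_score(r, max_distance_km=5.0, weight_rating=0.5):
    """Blend quality and proximity into one score between 0 and 1."""
    quality = r.rating / 5.0
    proximity = max(0.0, 1.0 - r.distance_km / max_distance_km)
    return weight_rating * quality + (1.0 - weight_rating) * proximity


# Hypothetical candidates; the numbers are invented for illustration.
candidates = [
    Restaurant("Corner Diner", distance_km=0.5, rating=3.0),
    Restaurant("Harbor Bistro", distance_km=1.0, rating=4.5),
    Restaurant("Hilltop Grill", distance_km=4.0, rating=5.0),
]

nearest = min(candidates, key=lambda r: r.distance_km)
best = max(candidates, key=lambda r: r.rating)
best_nearest = max(candidates, key=combined_score)
print(nearest.name, best.name, best_nearest.name)
# Corner Diner Hilltop Grill Harbor Bistro
```

The nearest restaurant and the best-rated restaurant are different places, and the blended answer is a third one. That gap between the two single-criterion answers is exactly the ambiguity in the original query.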

Another long-form question answering example that could be improved with MUM is one like, “What makes restaurant A better than restaurant B?”

Currently, you need a taste test (along with other elements of being human) to figure out the answer to that question.

With MUM, that question could be answered by an AI model that has been trained on, and can process, a treasure trove of documents. Such a model could compare and contrast multiple documents, such as reviews and menus, and provide you with an answer as a result.

Not to mention that MUM does not need to be human to answer the question, because it could deduce from these multiple documents what other humans think of the restaurants, and use their opinions, taste and perception in defining the answer.
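
A very rough way to picture "deducing what other humans think" is to score each review and average the scores per restaurant. The Python sketch below does this with a tiny hand-made word list. Everything in it is an assumption made for illustration: the reviews are invented, the `POSITIVE` and `NEGATIVE` word sets are arbitrary, and a production system would rely on trained language models rather than word counting.

```python
import re

# Tiny hand-made sentiment lexicon; a real system would use a trained model.
POSITIVE = {"great", "delicious", "friendly", "fresh"}
NEGATIVE = {"slow", "bland", "rude", "cold"}


def review_score(review):
    """Count positive words minus negative words in one review."""
    words = re.findall(r"[a-z]+", review.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def compare_restaurants(reviews_by_restaurant):
    """Average the per-review scores for each restaurant and pick the highest."""
    averages = {
        name: sum(review_score(r) for r in reviews) / len(reviews)
        for name, reviews in reviews_by_restaurant.items()
    }
    winner = max(averages, key=averages.get)
    return winner, averages


# Hypothetical reviews, invented for illustration.
reviews = {
    "Restaurant A": ["Great pasta and friendly staff.", "Delicious but slow service."],
    "Restaurant B": ["Bland soup and rude waiter.", "Fresh salad, cold fries."],
}

winner, averages = compare_restaurants(reviews)
print(winner, averages)
# Restaurant A {'Restaurant A': 1.0, 'Restaurant B': -1.0}
```

The point of the sketch is the shape of the task, many documents in and one comparative answer out, rather than the scoring method itself.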

MUM’s Impact on Search: Zero-Click Searches

The zero-click search controversy, which Rand Fishkin has written about and which we have covered previously, could intensify as these types of results increase.

From featured snippets to rich snippets and other types of results that yield zero-clicks, we could see even more zero-clicks as a result of Google implementing MUM.

By using MUM as a method of providing answers directly to the searcher, there would be no incentive to click through to another website.

This is one reason the community is so up in arms about this new technology: it has the potential to increase zero-click searches. If Google can provide the answer itself and doesn’t need to link to it, what incentive is there to drive traffic to other websites?

This also raises questions about copyright and publishing rights. If Google generates answers on its own, it will still need documents containing the text on which to train the new question answering system.

Do they plan on providing credit to the website itself in the form of a link? Do they plan on providing information on where these potential zero-clicks go? What do we do if Google replaces the entire first page with nothing but potential answers for a user’s question? Do we reformulate our SEO strategy? If so, how?

All of this remains to be seen. For now, we need to be cautious about lauding this as the future of search because, editorially speaking, it appears to have a lot of bad juju behind it.

Google’s MUM Technology and Its Impact on Search

Truthfully, we don’t know the impact this is going to have on search. We don’t know what impact it’s going to have on search’s reliability. We also won’t know what to expect until its full release.

We do know that questions and answers are the “big thing” right now. Many search queries are already phrased as questions.

With the advent of MUM, question-and-answer queries are clearly not going away and will only increase in importance.

Because Google appears to want to eliminate ambiguity in such queries, its focus on question answering is likely to grow from here.

It will, indeed, be interesting to see what MUM is capable of when it’s released.