[Full Translation] What Is the Google BERT Update? A Thorough Walkthrough of the Official Information

This is the second installment of the series "Reading Google's official announcements in full translation."

On October 25, 2019, Google announced the "BERT update," so this time we will go through that announcement.

■ Google's tweets:

Apologies! Forgot to a link to our blog post with more information about BERT. You’ll find that here: https://t.co/NuKVdg6HYM

— Google SearchLiaison (@searchliaison) October 28, 2019

Meet BERT, a new way for Google Search to better understand language and improve our search results. It’s now being used in the US in English, helping with one out of every 10 searches. It will come to more counties and languages in the future. pic.twitter.com/RJ4PtC16zj

— Google SearchLiaison (@searchliaison) October 25, 2019

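BERT is a new way for Google Search to better understand language and improve search results. It is currently used for English searches in the US, where it helps with one in every 10 searches. It will be rolled out to more countries and languages in the future.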

The "BERT update" was announced on October 25, 2019.

BERT stands for "Bidirectional Encoder Representations from Transformers," an AI-based natural language processing technique. The "BERT update" refers to the changes that result from introducing BERT into Google's search systems.

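This article walks through Google's official announcement in full translation. If your main question about this update is

"What measures do I need to take?" or
"I don't need the details, I just want the gist,"

please read the companion article "The Key Points of the Google BERT Update, Explained: From Basics to Countermeasures" instead.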

Incidentally, although the BERT update was announced on October 25, 2019, the update itself appears to have already been underway by then.

[Google Official] Reading the BERT Update Announcement in Full Translation

On October 25, 2019, Google published information about the BERT update on its official page under the title "Understanding searches better than ever before."

From here on, we will go through the English text of that page in order.

Let's get started!

Search understanding that is better than ever
Understanding searches better than ever before

If there’s one thing I’ve learned over the 15 years working on Google Search, it’s that people’s curiosity is endless. We see billions of searches every day, and 15 percent of those queries are ones we haven’t seen before–so we’ve built ways to return results for queries we can’t anticipate.

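If I had to name one thing I have learned from spending 15 years working on Google Search, it is that people's curiosity is endless. We see billions of searches every day, and even so, 15 percent of them are queries we have never seen before. That is why we have built ways to return results for queries we cannot anticipate.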

When people like you or I come to Search, we aren’t always quite sure about the best way to formulate a query. We might not know the right words to use, or how to spell something, because often times, we come to Search looking to learn–we don’t necessarily have the knowledge to begin with.

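When you or I search, we are not always sure of the best way to phrase a query. We may not know the right words or the right spelling, because we often come to Search in order to learn and do not necessarily have the knowledge to begin with.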

At its core, Search is about understanding language. It’s our job to figure out what you’re searching for and surface helpful information from the web, no matter how you spell or combine the words in your query. While we’ve continued to improve our language understanding capabilities over the years, we sometimes still don’t quite get it right, particularly with complex or conversational queries. In fact, that’s one of the reasons why people often use “keyword-ese,” typing strings of words that they think we’ll understand, but aren’t actually how they’d naturally ask a question.

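At its core, Search is about understanding language. Whatever spelling or combination of words your query uses, our job is to figure out what you are looking for and surface helpful information from the web. Although we have improved our language understanding capabilities over the years, we still sometimes get it wrong, particularly with complex or conversational queries. In fact, this is one of the reasons people often resort to "keyword-ese" (strings of keywords they expect Google to understand), even though that is not how they would naturally ask a question.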

With the latest advancements from our research team in the science of language understanding–made possible by machine learning–we’re making a significant improvement to how we understand queries, representing the biggest leap forward in the past five years, and one of the biggest leaps forward in the history of Search.

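Thanks to our research team's latest advances in the science of language understanding, made possible by machine learning, we are significantly improving how we understand queries. This is the biggest leap forward in the past five years, and one of the biggest in the history of Search.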

Applying the BERT model to Search
Applying BERT models to Search

Last year, we introduced and open-sourced a neural network-based technique for natural language processing (NLP) pre-training called Bidirectional Encoder Representations from Transformers, or as we call it–BERT, for short. This technology enables anyone to train their own state-of-the-art question answering system.

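Last year, we introduced and open-sourced BERT (short for "Bidirectional Encoder Representations from Transformers"), a neural-network-based technique for natural language processing (NLP) pre-training. With this technology, anyone can train their own state-of-the-art question answering system.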

This breakthrough was the result of Google research on transformers: models that process words in relation to all the other words in a sentence, rather than one-by-one in order. BERT models can therefore consider the full context of a word by looking at the words that come before and after it—particularly useful for understanding the intent behind search queries.

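This breakthrough came out of Google's research on transformers: models that process each word in relation to all the other words in a sentence, rather than one word at a time in order. A BERT model can therefore take in the full context of a word by looking at the words before and after it, which is particularly useful for understanding the intent behind a search query.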
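
As a rough illustration of the difference described above (this is not BERT itself, and not Google's implementation), the following Python sketch contrasts the context available to a strictly left-to-right reader with the context available when the words on both sides are considered. The function names `left_context` and `full_context` are hypothetical and exist only for this example.

```python
def left_context(tokens, i):
    """Words a strictly left-to-right reader has seen before position i."""
    return tokens[:i]


def full_context(tokens, i):
    """Words on both sides of position i (every word except the one at i)."""
    return tokens[:i] + tokens[i + 1:]


# Example query taken from Google's announcement.
query = "2019 brazil traveler to usa need a visa".split()
i = query.index("to")

print(left_context(query, i))  # only the words before "to"
print(full_context(query, i))  # the words before and after "to"
```

When only the left-hand words are visible, the destination "usa" is not yet available at the point where "to" is read. With the words on both sides in view, "to" can be interpreted together with both "brazil" and "usa."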

But it’s not just advancements in software that can make this possible: we needed new hardware too. Some of the models we can build with BERT are so complex that they push the limits of what we can do using traditional hardware, so for the first time we’re using the latest Cloud TPUs to serve search results and get you more relevant information quickly.

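Advances in software alone did not make this possible; we also needed new hardware. Some of the models we can build with BERT are so complex that they push the limits of what traditional hardware can do. So, for the first time, we are using the latest Cloud TPUs to serve search results and get you more relevant information quickly.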

Deciphering queries
Cracking your queries

So that’s a lot of technical details, but what does it all mean for you? Well, by applying BERT models to both ranking and featured snippets in Search, we’re able to do a much better job helping you find useful information. In fact, when it comes to ranking results, BERT will help Search better understand one in 10 searches in the U.S. in English, and we’ll bring this to more languages and locales over time.

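That is a lot of technical detail, but what does it mean for you? By applying BERT models to both ranking and featured snippets in Search, we can do a much better job of helping you find useful information. In fact, for ranking, BERT will help Search better understand one in 10 English-language searches in the U.S., and we will extend this to more languages and regions over time.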

Particularly for longer, more conversational queries, or searches where prepositions like “for” and “to” matter a lot to the meaning, Search will be able to understand the context of the words in your query. You can search in a way that feels natural for you.

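Particularly for longer, more conversational queries, or for searches where prepositions such as "for" and "to" matter a lot to the meaning, Search will be able to understand the context of the words in your query. You can search in a way that feels natural to you.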

To launch these improvements, we did a lot of testing to ensure that the changes actually are more helpful. Here are some of the examples that showed up our evaluation process that demonstrate BERT’s ability to understand the intent behind your search.

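Before launching these improvements, we did a lot of testing to make sure the changes really are more helpful. Here are some examples from our evaluation process that demonstrate BERT's ability to understand the intent behind a search.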

Here’s a search for “2019 brazil traveler to usa need a visa.” The word “to” and its relationship to the other words in the query are particularly important to understanding the meaning. It’s about a Brazilian traveling to the U.S., and not the other way around. Previously, our algorithms wouldn’t understand the importance of this connection, and we returned results about U.S. citizens traveling to Brazil. With BERT, Search is able to grasp this nuance and know that the very common word “to” actually matters a lot here, and we can provide a much more relevant result for this query.

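Take the search "2019 brazil traveler to usa need a visa." Here the word "to" and its relationship to the other words are particularly important to the meaning: the query is about a Brazilian traveling to the U.S., not the other way around. Previously, our algorithms did not understand the importance of this connection and returned results about U.S. citizens traveling to Brazil. With BERT, Search can grasp this nuance, recognize that the very common word "to" matters a lot here, and provide a much more relevant result for the query.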

Difference in search results for "2019 brazil traveler to usa need a visa."
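
Google has not published how its earlier algorithms handled this query, so the following is only a toy model. It shows why any matching scheme that ignores word order cannot tell the two directions of travel apart: once a query is reduced to a bag of words, the two readings become identical. The helper name `bag_of_words` is an assumption made for this example.

```python
from collections import Counter


def bag_of_words(text):
    """Count the words in a text, discarding their order."""
    return Counter(text.lower().split())


to_usa = bag_of_words("brazil traveler to usa")
to_brazil = bag_of_words("usa traveler to brazil")

# Both phrases contain exactly the same words, so the bags are equal
# even though the direction of travel is reversed.
print(to_usa == to_brazil)  # True
```

A model that reads "to" in relation to the words on either side of it, as the announcement describes, has the information needed to tell these two queries apart.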

Let’s look at another query: “do estheticians stand a lot at work.” Previously, our systems were taking an approach of matching keywords, matching the term “stand-alone” in the result with the word “stand” in the query. But that isn’t the right use of the word “stand” in context. Our BERT models, on the other hand, understand that “stand” is related to the concept of the physical demands of a job, and displays a more useful response.

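Let's look at another query: "do estheticians stand a lot at work." Our previous systems took a keyword-matching approach and matched the word "stand" in the query with the term "stand-alone" in a result. But that is not the sense in which "stand" is used here. The BERT model, by contrast, understands that "stand" relates to the physical demands of a job, and displays a more useful response.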

Difference in search results for "Do Estheticians Stand A Lot At Work"
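
The failure mode in this example can be reproduced with a very small keyword matcher. The sketch below is a simplified assumption about what "matching keywords" might look like, not Google's actual system. The document text is hypothetical, and `keywords` and `keyword_overlap` are names invented for the illustration.

```python
import re


def keywords(text):
    """Lowercase the text and split it on anything that is not a letter or digit.

    Note that this turns "stand-alone" into the two keywords "stand" and "alone".
    """
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_overlap(query, document):
    """Return the keywords that the query and the document have in common."""
    return keywords(query) & keywords(document)


query = "do estheticians stand a lot at work"
document = "stand-alone certification courses"  # hypothetical page text

# The only shared keyword is "stand", which comes from "stand-alone"
# and has nothing to do with standing on your feet at work.
print(keyword_overlap(query, document))
```

Because the matcher only compares surface strings, it treats the two uses of "stand" as the same word, which is exactly the kind of mismatch the announcement says BERT helps avoid.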

Here are some other examples where BERT has helped us grasp the subtle nuances of language that computers don’t quite understand the way humans do.

Here are some other examples where BERT has helped us grasp subtle nuances of language that computers do not understand the way humans do.

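This is the difference in search results for the query "can you get medicine for someone pharmacy." BERT understands that "for someone" (on someone else's behalf) plays an important role, whereas the previous results missed it and showed general information about filling prescriptions.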

Query: "parking on a hill with no curb." Our previous systems were confused by this type of query. They placed too much weight on the word "curb" and ignored the word "no," failing to understand how critical it is to the query. As a result, they returned results about parking on a hill with a curb.

Previously, the results included a page with books in the "young adult" category. BERT can better understand that "adult" was being matched out of context, and picks out a more helpful result.

Improving Search across more languages
Improving Search in more languages

We’re also applying BERT to make Search better for people across the world. A powerful characteristic of these systems is that they can take learnings from one language and apply them to others. So we can take models that learn from improvements in English (a language where the vast majority of web content exists) and apply them to other languages. This helps us better return relevant results in the many languages that Search is offered in.

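We are also applying BERT to make Search better for people around the world. A powerful characteristic of these systems is that they can take what they learn in one language and apply it to others. So we can take models that learn from improvements in English (the language in which the vast majority of web content exists) and apply them to other languages. This helps us return more relevant results in the many languages in which Search is offered.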

For featured snippets, we’re using a BERT model to improve featured snippets in the two dozen countries where this feature is available, and seeing significant improvements in languages like Korean, Hindi and Portuguese.

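For featured snippets, too, we are using a BERT model to improve them in the two dozen countries where the feature is available, and we are seeing significant improvements in languages such as Korean, Hindi and Portuguese.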

Search is still an unsolved problem
Search is not a solved problem

No matter what you’re looking for, or what language you speak, we hope you’re able to let go of some of your keyword-ese and search in a way that feels natural for you. But you’ll still stump Google from time to time. Even with BERT, we don’t always get it right. If you search for “what state is south of Nebraska,” BERT’s best guess is a community called “South Nebraska.” (If you’ve got a feeling it’s not in Kansas, you’re right.)

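Whatever you are looking for and whatever language you speak, we hope you can let go of some of your "keyword-ese" (strings of keywords) and search in a way that feels natural to you. Even so, you will still stump Google from time to time. Even with BERT, we do not always get it right. If you search for "what state is south of Nebraska," BERT's best guess is a community called "South Nebraska." (If you have a feeling it is not in Kansas, you are right.)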

Language understanding remains an ongoing challenge, and it keeps us motivated to continue to improve Search. We’re always getting better and working to find the meaning in– and most helpful information for– every query you send our way.

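Language understanding remains an ongoing challenge, and it keeps us motivated to continue improving Search. We are always getting better, working to find the meaning in, and the most helpful information for, every query you send our way.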

Google BERT Update: Translation Summary

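To summarize the BERT update:

BERT translation summary

✔ No particular countermeasures are needed for the BERT update

✔ What the introduction of BERT ultimately demands is good content

✔ With BERT, Google Search understands prepositions

✔ BERT affects one in 10 English-language searches in the US

✔ BERT affects featured snippets

✔ Google wants users to move away from "keyword-ese" searching

That is what it comes down to.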

The summary points above are explained in detail in this article, so please take a look:

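Checking the official source before taking in information from influencers and entrepreneurs lets you read that information without bias or prejudice, and improves both your understanding of it and its transparency.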

Whenever something changes, why not make a habit of reading the original text?

This "full translation series" also covers Google's core algorithm update from September of this year (2019), the "September 2019 Core Update." If you would like to review that update, please read this article as well: