MatrixNet: New Level of Search Quality

Search Quality

The job of a search engine is, first and foremost, to provide answers to users’ queries. In response to each query, a search engine returns links to web pages it finds in its index – a database of web pages known to this particular search engine. Thus, an answer to the user’s query comes in the form of search results – a list of hyperlinks to web pages whose content matches the query.

This is how it works:

These days, a search query that returns fewer than a dozen results is hard to find. Most searches retrieve links to millions of web pages. The number of pages potentially matching any given search query keeps growing along with the internet. It doesn’t make much sense to show the user every potentially matching page – a person would have to browse through dozens of resources before anything useful came up. Instead, a search engine ranks the search results, placing the most relevant of them on top.

Looking at these search results, the user may feel quite satisfied, not really satisfied, or not satisfied at all. This subjective feeling of getting (or not getting) what one was searching for is what describes the quality of search from the user’s point of view – is this information useful for me? The trick is to describe and measure all these subjective attitudes and to take everyone into account. The quality of search depends on how well search results are ranked. Ranking means sorting search results in a way that meets users’ expectations.

Machine learning

It’s impossible to build a perfect algorithm that would come up with the best possible result for every possible query. Yandex’s search engine processes almost 200,000,000 queries every day. Almost half of these queries are unique. To deal with this load of questions successfully, a search engine has to be able to make decisions based on previous experience, that is, it has to learn.

Machine learning is essential not only in search technology. Speech or text recognition, for instance, is also impossible without a machine being able to learn. The term ‘machine learning’, coined in the ‘50s, basically means the effort to make a computer perform tasks that come naturally to humans but are difficult to break down into algorithmic patterns ‘understandable’ by machines. A machine that can learn is a machine that can make its own decisions based on input algorithms, empirical data and experience.

Decision making, however, is a human quality, which a machine cannot really master. What it can do, though, is learn to create and apply a rule that helps to decide whether a particular web page is a good answer to a user’s question or not.

This rule is based on properties of web pages and users’ queries. Some of these properties, like the number of links leading to a particular page, are static: they describe a web page alone. Others, like whether a web page has words matching a search query, how many and where on the page, are dynamic: they describe both a web page and a search query. There are also properties specific only to search queries, such as geolocation. For a search engine, this means that to give a good answer to a user’s question it has to factor in where this question has come from.

These quantifiable properties of web pages and search queries are called ranking factors. These factors are key in performing exact searches and making the decision on which results are the most relevant. For a search engine to return relevant results for a user’s query, it needs to consider a multitude of such factors.
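
As a rough illustration of the three kinds of factors described above, the sketch below computes one factor of each kind for a single query and page. The `Page` and `Query` classes, the factor names and the sample values are all hypothetical and are not taken from Yandex’s actual factor set.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Page:
    url: str
    text: str
    inbound_links: int  # static: describes the page alone


@dataclass
class Query:
    text: str
    region: str  # query-only: where the question came from


def ranking_factors(query: Query, page: Page) -> Dict[str, float]:
    """Compute one illustrative factor of each kind for a (query, page) pair."""
    query_words = query.text.lower().split()
    page_words = page.text.lower().split()
    # Dynamic: depends on both the page and the query.
    matches = sum(page_words.count(word) for word in query_words)
    return {
        "inbound_links": float(page.inbound_links),             # static
        "matching_words": float(matches),                       # dynamic
        "query_from_moscow": float(query.region == "Moscow"),   # query-only
    }


page = Page("example.com/apples", "Apple pie recipes with fresh apple slices", 120)
query = Query("apple pie", "Moscow")
print(ranking_factors(query, page))
# {'inbound_links': 120.0, 'matching_words': 3.0, 'query_from_moscow': 1.0}
```

A real system would compute a great many such values per page; the point here is only that each factor is a number the ranking formula can consume.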

Three types of ranking factors:

To approximate users’ expectations, a search engine requires sample user queries and matching results that have already been judged satisfactory. Assessors – people who decide whether a particular web page offers a ‘good’ response to a certain search query – provide these evaluations. A number of search responses, together with the corresponding queries, make up a learning sample from which a search engine ‘learns to find’ dependencies between the properties of these web pages and their relevance. To represent real users’ search patterns truthfully, a learning sample has to include all kinds of search queries in the same proportion as they occur in real life.

After a search engine has found dependencies between web pages in the learning sample and their properties, it can choose the best ranking formula for the results it can deliver in response to a specific user’s query and place the most relevant of them above all the rest.
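
To make the idea of learning a formula from assessors’ judgments concrete, here is a minimal sketch that uses a perceptron, one of the simplest learning rules. It is a stand-in chosen for brevity, not the method Yandex uses, and the factor values, labels and page names are invented for illustration.

```python
from typing import List, Tuple

# Hypothetical learning sample: (matching_words, inbound_links) factor values
# scaled to [0, 1], with an assessor label: 1 = good answer, 0 = bad answer.
LEARNING_SAMPLE: List[Tuple[Tuple[float, float], int]] = [
    ((0.9, 0.8), 1),
    ((0.8, 0.3), 1),
    ((0.7, 0.6), 1),
    ((0.2, 0.9), 0),
    ((0.1, 0.2), 0),
    ((0.3, 0.4), 0),
]


def score(weights, bias, factors):
    """The 'ranking formula': a weighted sum of factor values."""
    return sum(w * f for w, f in zip(weights, factors)) + bias


def learn_formula(sample, max_epochs=1000):
    """Perceptron training: nudge the weights after every misjudged page."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for factors, label in sample:
            predicted = 1 if score(weights, bias, factors) > 0 else 0
            error = label - predicted  # +1, 0 or -1
            if error:
                mistakes += 1
                weights = [w + error * f for w, f in zip(weights, factors)]
                bias += error
        if mistakes == 0:  # every assessor judgment is reproduced
            break
    return weights, bias


weights, bias = learn_formula(LEARNING_SAMPLE)

# Rank unseen pages by the learned formula, most relevant first.
new_pages = {"page_a": (0.85, 0.1), "page_b": (0.15, 0.95)}
ranked = sorted(new_pages, key=lambda p: score(weights, bias, new_pages[p]), reverse=True)
print(ranked)  # ['page_a', 'page_b']
```

With this particular sample the learned formula rewards matching words and ends up ranking the unseen page with many matches above the one with many links. A production ranking formula is far more complex, but the loop is the same in spirit: compare the formula’s output with the assessors’ labels and adjust.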

Think of teaching a machine how to pick the most delicious apples. First, assessors take a bite of each apple in a ‘tasting crate’ and put all tasty apples to the right and all sour apples to the left. This crate contains all sorts of apples in the same proportion as they are likely to grow in the garden. A machine cannot taste apples, but it can analyze their properties, like size, color, sugar content, firmness, presence or absence of a leaf. The tasting crate is a learning sample, which allows the machine to learn to select the apples with the winning combination of properties: size, color, sweetness and firmness. Errors are unavoidable, though. For instance, if a machine does not have any information about insect larvae, the best apples it has selected might hide a worm. To minimize the probability of error, a machine needs to consider as many properties of the apples as possible.

MatrixNet

Machine learning has been used in search technologies since the early noughties. Different search systems use different models. One of the problems in machine learning is overfitting. An algorithm that overfits its data is like a sophomore medical student who diagnoses himself with every disease he has read about in his textbook. Not having been exposed to real practice yet, he makes up causes for the natural things he observes.

When a computer uses a large number of factors (properties of web pages and search queries, in our case) on a relatively small learning sample (‘good’ results as estimated by assessors), it begins to find dependencies that do not exist. For example, a learning sample might accidentally include two different pages that share the same particular combination of factors: both are 2 KB in size, have a purple background and feature text that starts with “A”. And, by sheer chance, both pages happen to be relevant to the search query [apple]. A computer may deem this accidental combination of factors to be essential for a search result to be relevant to the search query [apple]. At the same time, all web pages offering really relevant and useful information about apples, but lacking this particular combination of factors, will be considered less important.
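
The purple-page example can be reproduced with a toy ‘model’ that simply memorizes the factor combinations of the good pages in its learning sample. This is a deliberately extreme sketch of overfitting under invented data; the `mentions_apple` factor and all sample values are assumptions made for the illustration.

```python
# Hypothetical learning sample for the query [apple]:
# (size_kb, background, first_letter, mentions_apple) -> assessor label.
learning_sample = [
    ((2, "purple", "A", True), True),
    ((2, "purple", "A", True), True),    # a second page with the same accidental combination
    ((40, "white", "T", False), False),
    ((15, "grey", "W", False), False),
]

# Pages the model has never seen, with the labels an assessor would give them.
unseen_pages = [
    ((30, "white", "H", True), True),    # useful apple page, different look
    ((12, "green", "B", True), True),
    ((8, "white", "C", False), False),
]

# Overfitted rule: a page is good only if its exact factor combination
# was seen among the good pages of the learning sample.
memorized_good = {factors for factors, is_good in learning_sample if is_good}


def overfitted_rule(factors):
    return factors in memorized_good


def simple_rule(factors):
    # Uses only the one factor that plausibly matters for [apple].
    return factors[3]


def accuracy(rule, pages):
    return sum(rule(factors) == label for factors, label in pages) / len(pages)


for name, rule in (("overfitted", overfitted_rule), ("simple", simple_rule)):
    print(name, accuracy(rule, learning_sample), round(accuracy(rule, unseen_pages), 2))
# overfitted 1.0 0.33
# simple 1.0 1.0
```

Both rules are perfect on the learning sample, but the memorizing rule rejects every useful apple page that lacks the accidental combination of size, color and first letter.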

In 2009 Yandex launched MatrixNet, a new method of machine learning. A key feature of this method is its resistance to overfitting, which allows Yandex’s search engine to take into account a very large number of factors when it decides how relevant search results are. At the same time, the search system does not need a larger sample of search results to learn how to tell the ‘good’ from the ‘not so good’. This safeguards the system from making mistakes by finding dependencies that do not exist.

MatrixNet makes it possible to generate a very long and complex ranking formula, which considers a multitude of various factors and their combinations. Alternative machine learning methods either produce simpler formulas using a smaller number of factors or require a larger learning sample. MatrixNet builds a formula based on tens of thousands of factors, which significantly increases the relevance of search results.
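
The text above does not say how MatrixNet assembles its formula. As a generic illustration of how a ranking formula can grow long by accumulating many small terms, the sketch below uses gradient boosting over one-split decision ‘stumps’. This is a stand-in technique chosen for the example, not a description of MatrixNet itself; the factor values, the relevance targets and the `RATE` constant are assumptions.

```python
from typing import List, Sequence, Tuple

Row = Sequence[float]
Stump = Tuple[int, float, float, float]  # factor index, threshold, left value, right value
RATE = 0.5  # shrinkage: each new term corrects only part of the remaining error


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def fit_stump(rows: List[Row], residuals: List[float]) -> Stump:
    """Pick the single factor/threshold split that best fits the residuals."""
    best, best_err = None, float("inf")
    for j in range(len(rows[0])):
        # Skip the largest value so the right-hand side is never empty.
        for t in sorted({row[j] for row in rows})[:-1]:
            left = [r for row, r in zip(rows, residuals) if row[j] <= t]
            right = [r for row, r in zip(rows, residuals) if row[j] > t]
            lv, rv = mean(left), mean(right)
            err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if err < best_err:
                best, best_err = (j, t, lv, rv), err
    return best


def stump_value(stump: Stump, row: Row) -> float:
    j, t, lv, rv = stump
    return lv if row[j] <= t else rv


def evaluate(formula: List[Stump], row: Row) -> float:
    """The ranking formula: a sum of many small terms."""
    return sum(RATE * stump_value(s, row) for s in formula)


def boost(rows: List[Row], targets: List[float], rounds: int) -> List[Stump]:
    """Grow the formula one term at a time, each fitted to the remaining error."""
    formula: List[Stump] = []
    for _ in range(rounds):
        residuals = [y - evaluate(formula, row) for row, y in zip(rows, targets)]
        formula.append(fit_stump(rows, residuals))
    return formula


def mse(formula: List[Stump], rows: List[Row], targets: List[float]) -> float:
    return mean([(y - evaluate(formula, row)) ** 2 for row, y in zip(rows, targets)])


# Hypothetical factor values per page and assessor relevance grades in [0, 1].
rows = [(0.9, 0.8), (0.8, 0.3), (0.7, 0.6), (0.2, 0.9), (0.1, 0.2), (0.3, 0.4)]
targets = [1.0, 0.7, 0.8, 0.3, 0.0, 0.2]
for rounds in (1, 5, 50):
    error = mse(boost(rows, targets, rounds), rows, targets)
    print(rounds, "terms, error on the learning sample:", round(error, 4))
```

The error on the learning sample shrinks as terms are added, which shows how the formula becomes longer and more detailed. A falling error on the learning sample says nothing about overfitting on its own; checking that would need assessor judgments the formula has not seen.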

Another important feature of MatrixNet is that it allows a ranking formula to be customized for a specific class of search queries. Importantly, tweaking the ranking algorithm for, say, music searches will not undermine the quality of ranking for other types of queries. A ranking algorithm is like complex machinery with dozens of buttons, switches, levers and gauges. Commonly, any single turn of any single switch in a mechanism results in a global change in the whole machine. MatrixNet, however, makes it possible to adjust specific parameters for specific classes of queries without causing a major overhaul of the whole system.

Change of a single parameter in different ranking formulas:

In addition, MatrixNet can automatically choose sensitivity for specific ranges of ranking factors. It’s like trying to hear someone whisper on an airfield. Figuratively speaking, MatrixNet can hear both the whisper and the sound of planes landing or taking off.

Ranking

For each user’s query, a search engine has to evaluate the properties of millions of pages, assess their relevance and rank them accordingly, with the most relevant on top. Scanning each page in succession would either require a huge number of servers (that could deal with all those pages very quickly) or take a lot of time – but a searcher cannot wait. MatrixNet solves this problem, as it makes it possible to check web pages against a very large number of ranking factors without increasing processing power.

In response to each query, more than a thousand servers perform a search simultaneously. Each server searches within its own part of the index to produce a list of the best results. This list is guaranteed to include the web pages most relevant to the query.

The next step is to produce one final list of top results based on all those lists of the most relevant pages produced by each server. These results are then ranked using that long and complicated MatrixNet formula, which considers a multitude of ranking factors and their combinations. Thus, the most relevant websites find their way to the top of the search results, and the user receives an answer to their question almost instantly.
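
The two steps described above, a per-server shortlist followed by a final ranking of the merged shortlists, can be sketched as follows. The cheap per-server score, the weights in `full_formula` and the sample pages are assumptions made for the illustration; the text does not say how each server picks its best results, and a simple loop stands in for servers working in parallel.

```python
import heapq
from typing import List, NamedTuple


class Page(NamedTuple):
    url: str
    matching_words: float  # hypothetical factor values scaled to [0, 1]
    inbound_links: float


def cheap_score(page: Page) -> float:
    """Assumed quick estimate each server uses to shortlist its best pages."""
    return page.matching_words


def full_formula(page: Page) -> float:
    """Stand-in for the long final ranking formula (weights are invented)."""
    return 0.7 * page.matching_words + 0.3 * page.inbound_links


def search_shard(shard: List[Page], k: int) -> List[Page]:
    """One server: return the k best pages from its own part of the index."""
    return heapq.nlargest(k, shard, key=cheap_score)


def search(shards: List[List[Page]], per_shard: int, final: int) -> List[Page]:
    # Step 1: every server produces its own shortlist.
    candidates = [page for shard in shards for page in search_shard(shard, per_shard)]
    # Step 2: merge the shortlists and rank them with the full formula.
    candidates.sort(key=full_formula, reverse=True)
    return candidates[:final]


shards = [
    [Page("a.example", 0.9, 0.2), Page("b.example", 0.4, 0.9), Page("c.example", 0.1, 0.1)],
    [Page("d.example", 0.8, 0.9), Page("e.example", 0.7, 0.1), Page("f.example", 0.2, 0.8)],
]
print([page.url for page in search(shards, per_shard=2, final=3)])
# ['d.example', 'a.example', 'b.example']
```

Because the expensive formula runs only on the merged shortlists rather than on every page in the index, its cost depends on the number of shortlisted candidates, not on the size of the index.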

This is approximately how ranking works:
