Yandex Data Factory Predicts ‘Churn’ for World of Tanks
2 March, 2015
Customer loyalty and satisfaction are crucial in community-based gaming, where every single player matters, and devoted, experienced gamers are especially valuable for the game. Our big data unit, Yandex Data Factory, took game churn prediction – knowing how many gamers are likely to leave the game – to another level. Wargaming, the international MMOG developer behind World of Tanks, one of the world’s most financially successful games with over 100 million registered players, can now determine more accurately which players are likely to stop playing soon and take measures to prevent that.
The challenge presented to the YDF team was to help increase WoT players’ loyalty and satisfaction with minimal effort and at minimal cost. To approach this challenge, a sample dataset of 100,000 random players who had played 20 games or more in the past year was selected – this was done to exclude those who joined the game by accident or just to give it a try. Based on a similar concept used in telecom and on Wargaming’s own understanding, YDF analysts defined a ‘churner’ as a player who had zero games in the month following a gaming session. Next, the raw data for the ‘churners’, which included over 100 parameters – personal (obfuscated payment balances, purchase logs, etc.), as well as gaming (game logs, number of battles, battle types, number of destroyed tanks, clan battles data, free experience, etc.) – was fed to our proprietary machine-learning algorithm, MatrixNet, to find similarities in gamers' behaviour and personal profiles. As a result, a probability of churn was assigned to every gamer in the dataset.
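The sampling rule and the ‘churner’ definition above can be sketched in a few lines. This is only an illustration of the labelling step, not YDF’s actual pipeline: the function names are hypothetical, and treating "the month following" as a 30-day window is an assumption.

```python
from datetime import date, timedelta

MIN_GAMES = 20    # eligibility threshold stated in the post
WINDOW_DAYS = 30  # assumption: "the month following" approximated as 30 days


def is_eligible(game_dates, as_of):
    """True if the player had at least MIN_GAMES games in the year before as_of."""
    year_ago = as_of - timedelta(days=365)
    return sum(1 for d in game_dates if year_ago < d <= as_of) >= MIN_GAMES


def is_churner(game_dates, session_date):
    """True if the player has no games in the WINDOW_DAYS after session_date."""
    window_end = session_date + timedelta(days=WINDOW_DAYS)
    return not any(session_date < d <= window_end for d in game_dates)
```

Labels produced this way would serve as the training target; the model itself (MatrixNet in the post) is proprietary and not shown here.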
WoT could then apply this churn prediction formula to the whole gaming community to spot top potential churners and target customer retention measures, such as special offers, new frictions, bonuses or community activities, specifically at them. The YDF formula’s churn predictions proved at least 20-30% more accurate than the current standard used in the gaming industry. Churn prevention – developing a formula for personalised retention measures – is the next challenge that YDF is ready to take on. Read more about YDF's churn prediction project for Wargaming.
Yandex’s School of Data Analysis Joins LHCb Collaboration
13 January, 2015
The Yandex School of Data Analysis has joined the collaboration behind CERN’s Large Hadron Collider beauty (LHCb) experiment. The experiment is one of four large particle detector experiments at the Large Hadron Collider, and collects data to study the interactions of heavy particles called b-hadrons.
As a result of this collaboration, the LHCb researchers will receive continuous support from existing applications (EventIndex, EventFilter) and the development of new services designed for the LHCb by the Yandex School of Data Analysis. YSDA will contribute its data processing skills and capabilities, and perform interdisciplinary research and development on the edge of physics and data science that will serve the aims and needs of the LHCb experiment.
LHCb experiment. Photo by Tim Parchikov.
The researchers at the LHCb experiment are seeking, among other things, to explain the imbalance of matter and antimatter in the observable universe. This programme requires collecting, processing and analysing a very large amount of data. Yandex has already been contributing its search technologies, computing capabilities and machine-learning methods to the LHCb experiment since 2011, helping the physicists gain quick access to the data they need. Since January 2013, Yandex has been providing its core machine-learning technology MatrixNet for the needs of particle physics as an associate member of CERN openlab, CERN’s collaboration with industrial partners.
The Yandex School of Data Analysis is now part of the game, with its exceptional talent, a strong tradition in hard-core mathematics, and proven experience of converting new theoretical knowledge into practical solutions. The YSDA is the only member of the LHCb collaboration that does not specialise in physics. Other collaborators in the project include such prestigious institutions as MIT (USA), EPFL (Switzerland), University of Oxford and Imperial College, London (UK).
The Yandex School of Data Analysis is a free Master’s-level program in computer science and data analysis, offered by Yandex since 2007 to graduates in engineering, mathematics, computer science or related fields. It trains specialists in data analysis and information retrieval. The school’s program includes courses in machine learning, data structures and algorithms, computational linguistics and other related subjects. It runs a number of joint programs, both at Master’s and PhD levels, with leading education and research institutions including the Moscow Institute of Physics and Technology, the National Research University Higher School of Economics (HSE), and the Department of Mechanics and Mathematics of Moscow State University. In seven years, the Yandex School of Data Analysis has trained more than 320 specialists.
Yandex Data Factory Opens for Business
9 December, 2014
As far as the laws of mathematics refer to reality, they are not certain,
and as far as they are certain, they do not refer to reality.
A search engine is all about very big data and very advanced mathematics. For more than 17 years, we at Yandex have been developing and implementing technologies and algorithms that pick, out of a billion pages on the internet, the one that answers a web user’s question or solves their problem.
The technologies that power our search are based on machine learning – an approach that automates the process of making a decision. Our core machine learning technology, MatrixNet, not only makes its own decisions, based on previous experience, about whether a certain piece of information is a good answer to a user’s question, but it also does so based on relatively limited experience.
At this point in time, when we can feel that our technologies can be put to use in spheres other than internet search, we are prepared to offer what we’ve got for a larger range of applications.
Today, at the LeWeb innovation conference in Paris, we’re cutting the red ribbon for Yandex Data Factory, our new B2B service for corporate and enterprise clients who would like to use our machine-learning technologies to turn the large volumes of data they possess into hands-on business tools and, by doing so, increase sales, cut costs, optimise processes, prevent losses, forecast demand, and develop new or improve existing methods of audience targeting.
We first branched out of our natural realm with our collaboration with CERN on their Large Hadron Collider beauty (LHCb) experiment. For this project we trained our MatrixNet to search for specific types of particle collisions, or events, among thousands of terabytes of information about these events registered by the detector in the LHCb. Yandex provided the LHCb researchers with instant access to the details of any specific event.
The success of this project gave us reasons to believe it can be repeated in other areas of application. Any industry producing large amounts of data and focused on business goals could benefit from our expertise and our MatrixNet-based technologies: personalisation of search suggestions, recommendations or search results, image or speech recognition, road traffic monitoring and prediction, word form prediction and ranking for machine translation, demographic profiling for audience targeting.
Prior to today's announcement, we ran pilot projects for about a year, designing experimental custom-made solutions for clients all over the world. Most of these projects used existing data to train a MatrixNet-based model, which was then applied to new data. Depending on the client’s goal, the model either generated suggestions for buying a specific product or predicted, with a high degree of accuracy and based on the behaviour of thousands or millions of shoppers with similar behaviour patterns, exactly which product would be bought.
Using this machine-learning technique, we helped one of the leading European banks increase their sales by matching each of their products that needed upselling with the best communication channel for each customer. By applying MatrixNet to behaviour data on a few million of the bank’s clients, we created a model that could predict the net present value of communicating a product to a specific client via a specific channel. This model was then applied to the bank’s new data to generate personalised product recommendations for each client, paired with a communication channel and ranked by potential net profit value. Preliminary results of the first wave of the bank’s marketing campaign, which was run on three million clients, were used to fine-tune the original model, which, in turn, was used in the second wave on a much larger number of the bank’s customers. The resulting sales increase beat the increase forecast by the bank’s own analysts by 13%.
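To illustrate the pairing-and-ranking step described above, here is a minimal sketch that assumes the model’s predictions are already available as a plain dictionary. The function name, the data layout and the sample values are all hypothetical; the post does not describe how the bank’s system was actually implemented.

```python
def rank_recommendations(predicted_value):
    """Pick the best channel per (client, product) and rank offers per client.

    predicted_value maps (client, product, channel) to a predicted value.
    Returns {client: [(product, best_channel, value), ...]}, highest value first.
    """
    best = {}
    for (client, product, channel), value in predicted_value.items():
        key = (client, product)
        # Keep only the channel with the highest predicted value for this pair.
        if key not in best or value > best[key][1]:
            best[key] = (channel, value)

    ranked = {}
    for (client, product), (channel, value) in best.items():
        ranked.setdefault(client, []).append((product, channel, value))
    for offers in ranked.values():
        offers.sort(key=lambda offer: offer[2], reverse=True)
    return ranked


# Made-up predictions for a single client.
sample = {
    ("c1", "card", "email"): 12.0,
    ("c1", "card", "sms"): 15.0,
    ("c1", "loan", "call"): 40.0,
    ("c1", "loan", "email"): 22.0,
}
recommendations = rank_recommendations(sample)
```

For the sample data, the client’s list starts with the loan offered by phone call, followed by the card offered by SMS.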
The same machine-learning approach, together with our own data and expertise in geolocation, helped a road and traffic management agency make their accident predictions 30 times more accurate. To enable the agency to take measures to prevent road traffic accidents, we provided them with one-hour forecasts for traffic jams, as well as alerts for high-risk traffic conditions, in real time, and visualised potential congestion on interactive maps. Using MatrixNet, we first trained predictive formulas on our own UGC information about almost 40,000 road accidents and 5bn speed tracks mined over 2.5 years, complemented by the information provided by the agency: traffic information (i.e., the number of cars passing through a given segment of the road at any given time), information about road conditions (type of surface, number of lanes, gradient, etc.) and weather information. These formulas were then applied to larger data sets, and a predictive system for road traffic accidents was developed and deployed in the agency’s situation rooms.
Currently, we’re continuing to work on about 20 projects in various stages of completion across the globe. In essence, we're continuing to experiment, but this time, we know in which direction, or rather – in which directions – we are to move. While the majority of our potential partners, as well as data, come from finance, telecommunications, retail, logistics, utilities, and even the new-fangled 'smart cities', anyone who has data and a business goal can discover new opportunities brought about by mathematics. No matter what industry your business is in, mathematics will work for you. Despite what Einstein said.
New Yandex.Browser Paves Way to Future
27 November, 2014
At a point in time when web pages have stopped merely hosting content and now look more like fully-fledged applications interacting with their users in more ways than one; when websites no longer redirect their visitors to other places on the internet to give them what they need – web browsers cannot remain the same square windows through which to look at 'carved-in-stone' content.
Facebook.com users, for instance, can now watch a video right in their timeline, play music and talk to their friends. Soundcloud.com isn't just a music hosting website: it offers its visitors a nearly professional music recording, streaming and sharing experience, complete with advanced search functions and equaliser settings. In the existing environment, when most web-based resources – from social networking websites to newspapers to shopping platforms – are expected to have their mobile reincarnation as an app, the role of the desktop browser, as well as its look and feel, cannot remain the same.
In response to, and in accordance with, the evolutionary change of the web, Yandex is releasing a new alpha version of its browser. The new streamlined Yandex.Browser is a new step in its evolution. It reflects the current trend in web user experience, which puts an emphasis on interaction and personalisation. The new Yandex.Browser lets users experience the web directly, while offering secure protection from the darker side of the internet. It is designed to respond to all the current needs of a web user, which aren’t limited to mere browsing, but now also include shopping, reading websites in a foreign language, and booking flights, trains, taxis, hotel rooms or restaurant tables.
By bringing all these changes to our browser, we're hoping to make the internet more user-friendly for everyone. The new Yandex.Browser is a weighty contribution to our goal of creating a smart and transparent environment for a happy and comfortable internet experience. Instant page view, 'pages as apps', see-through user interface, rich search results, personalisation, integrated Yandex products and services and many more – are all implemented in the new Yandex.Browser, a trailblazer for the future of internet experience. With user feedback, we're hoping to understand how well we're faring on this path.
The new alpha version of Yandex.Browser is currently available for download for Windows and OS X devices.
In Memory of Alexey Yakovlevich Chervonenkis
29 September, 2014
Alexey Yakovlevich Chervonenkis tragically died on September 22. A professor of the Moscow Institute of Physics and Technology as well as Royal Holloway, University of London, and a lecturer at the Yandex School of Data Analysis, he made a huge contribution to the theory of machine learning.
So far there have been three periods in the science of machine learning: pre-computer, computational, and the contemporary period of big data.
The first great work of Chervonenkis and Vapnik was an article published in 1971. Its theory of the uniform convergence of the frequencies of occurrence of events to their probabilities set the course for the development of this field of science for several decades ahead.
This was the period of the “theoretical” development of machine learning. At that time, only some kind of M-200 or, at best, a BESM was available for computing so there was not yet even any talk of widespread 'practical application in the nation's industry'. But even then it could already be used to find targets in the air, for example, or to help detect abnormalities in echocardiograms.
Then came the second period in the history of machine learning – the computational stage. In the 1990s people learned, for example, how to quite effectively recognise and digitise texts (including handwritten documents) and keep e-mail free of spam. Half of these methods worked on the renowned SVM (Support Vector Machine) method conceived in the early 1990s by Chervonenkis and Vapnik (Vapnik-Chervonenkis dimension). In the mid-2000s all the well-known companies worked on SVMs – including us, and Yahoo!, and Google, and Amazon. SVM is described in any textbook on the subject.
And then came the third era in the development of machine learning, with the appearance of big data and methods for working with it. Now it appears that everything around us, all objects and services, will become a bit smarter and learn to help us in every detail, anticipating our desires to some extent. This is similar to how various mechanical and chemical inventions have changed our lives, only now in a slightly different sphere.
In this third era, Chervonenkis taught at the Yandex School of Data Analysis, and presented the development of his fundamental 1971 work at our conference.
Alexey Chervonenkis loved to walk. He would walk 20 kilometres a day – around Moscow, or London, or forests – that’s how he thought. In summer he had an operation and he couldn’t walk for three weeks. Then one day he started walking again – first a kilometre, then two, then three. And last week he set off on a 20-kilometre walk along a familiar route through Losiny Ostrov National Park.
New Marketplace to Organise Household Services in Russia
23 September, 2014
Yandex launches a marketplace for household services. The new web-based service exchange allows providers of a variety of services ranging from appliance repairs and installations to cleaning and moving home to find their clients, while those who require such services can find the best deals. The marketplace website lists service providers’ information, including their prices and client reviews. Service consumers can leave their feedback and also rate the quality of the service they received to help others choose the best provider.
The new marketplace, called Yandex.Master, where ‘master’ means someone who does their job really well, is currently available in Moscow and St Petersburg. Residents of Russia’s two largest cities make about 800,000 searches per month for help with small errands and household services. Yandex.Master was designed to bring transparency, structure and safety to a chaotic market of private household services in Russia. Provider ratings and client feedback on the marketplace are expected to promote open competition, which in turn will organise service pricing and improve the quality of services in general.
Yandex has already seen success with structuring an offline market. Yandex.Taxi, our taxi service aggregator, was launched in Moscow in 2011 when the ‘gypsy cab’ culture reigned on the road. Random pricing, old and dirty cars, unpredictable and illegal drivers, and road accidents have all been eliminated with the introduction of a desktop service supplemented with an app that was based on automated algorithms, aggregation, ranking, client feedback and tight control of service quality. It took only slightly more than one year to completely transform the taxi service market in Moscow and St Petersburg and make a taxi ride predictable price-wise, safe and enjoyable for any client. Taxi service companies in these two cities receive more than 700,000 taxi bookings per month through Yandex.Taxi. It’s a valuable client source for them and it makes them care about their reputation, while Yandex receives a percentage from each booking.
Just like Yandex.Taxi, Yandex.Master also has a strict quality control system, which makes sure each service provider is vetted before their offer appears on the website. Yandex.Master’s quality control team makes trial service requests and manages negative feedback. Providers’ feedback history and pricing also play a part in how service quality is maintained. Yandex.Master is available at master.yandex.ru and will also be released soon as an app for iPhone. Currently, the marketplace partners with over 70 service providers and aggregators. To try this form of marketing, service providers will be able to promote their offers on the marketplace for free until 2015, when we’re planning to introduce a pay-per-lead model.
Yandex's Translation App for iPhone and iPad Now Provides Translations When Offline
5 September, 2014
Translation apps on mobile devices sure come in handy when you’re travelling to different countries where you don’t know the local language. But usually they use the internet to perform their translations, which means you need to go online to get help communicating with people or deciphering restaurant menus.
We have recently released an offline version of our Yandex.Translate mobile app for iPhone and iPad. So now this application can work without connecting to the internet, saving you the cost of internet roaming or the trouble of finding a wi-fi hotspot.
Yandex.Translate can be installed from App Store for free. After that, just go to the settings and choose the language pair or pairs that you need to translate to and from, and download the translation database onto your device. Five language pairs with English are available for offline translation: English-German, English-French, English-Italian, English-Spanish and English-Russian. If an English-speaking user is travelling from the UK to Spain, they’ll only need the English-Spanish offline translation database to get by day-to-day, while the other language pairs will only be available online (the app warns about this). After returning home to the UK, the user can delete the English-Spanish offline translation data to free up space.
While we’re on the subject of space and size, we really went to great lengths to get the balance right. As you might know, statistical machine translation is based on searching and indexing parallel texts on the internet. We look for already translated texts and phrases, compare them with the original and rank them according to how often they occur. You can read more about that here. These parallel texts, phrases and word combinations are quite cumbersome, taking up gigabytes on our servers.
For the translation app to work offline, we had to streamline the parallel translation database, so that only the most common translations remained. For example, if the full translation database gives 100 different ways to translate “where can I get the best tapas in Barcelona” into Spanish, the mobile version will retain only the 10 most commonly used translations.
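The streamlining described above amounts to keeping only the most frequent translation variants for each source phrase. The sketch below shows that idea on a toy phrase table; the function name, the table layout and the counts are hypothetical, and the app’s real database format is not described in this post.

```python
from collections import Counter


def prune_phrase_table(table, keep=10):
    """Keep only the `keep` most frequent translations of each source phrase.

    table maps a source phrase to {translation: occurrence count}.
    """
    return {
        source: dict(Counter(variants).most_common(keep))
        for source, variants in table.items()
    }


# Toy table: one phrase with 100 variants and made-up counts 1..100.
full_table = {
    "where can I get the best tapas": {f"variant {i}": i for i in range(1, 101)},
}
offline_table = prune_phrase_table(full_table, keep=10)
```

In this toy example the offline table retains the ten variants with the highest counts (`variant 91` through `variant 100`) and drops the other ninety.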
We understand that this kind of streamlining can lower the quality of offline translations, and solving this problem was our biggest challenge. We ran a multitude of different tests and experiments to determine the optimal database size that would retain an acceptable translation quality for offline mobile gadgets. For every kind of translation, the optimal size turned out to be 500 MB. A larger database (that is, one with more translation variations) brings only insignificant improvements in quality, even if its size is multiplied several times over. And reducing the size and the number of translation options causes a serious loss of quality without freeing up much space.
Yandex’s offline translation app can help users in daily life situations in foreign countries, whether they’re on vacation or a business trip: telling a taxi driver where to take them, comprehending what’s on a menu, understanding what street signs and warning signs say. These are the situations in which mobile translation apps are most often used. Our statistics also show that Yandex.Translate is used in private correspondence, school homework and university assignments, for reading tourism guides or news on the internet, for translating recipes, poems and songs. The average length of translations on mobile devices is five to seven words. At present, Yandex.Translate handles about 400,000 translations on mobile platforms every day. Most of our users are in Russia, but we are working on making the app popular outside our home country.
Since we first announced it in December 2012, the capabilities of mobile Yandex.Translate have grown enormously: it now “knows” 44 languages, offering a text-to-speech function for some – meaning it not only translates a phrase from your native language into a foreign one, but also lets you hear how it sounds when spoken.
Yandex.Taxi and MTS Take Clients for a Supercar Ride
26 August, 2014
Yandex.Taxi, our cab-hailing service, has teamed up with MTS, one of Russia’s leading mobile providers, to give a few lucky Muscovites a chance to have a free ride in a luxury car.
For the three months starting in August, after making an order for a taxi through the Yandex.Taxi app or on the service’s website, a lucky customer in Russia’s capital has a chance to have a free ride to their chosen destination in one of four luxury cars – two Maserati Quattroportes, a Porsche Panamera or a Chevrolet Camaro. The journey will also be complemented with free high-speed WiFi 4G LTE service, courtesy of MTS.
The MTS-sponsored supercar ride ‘lottery’ is part of Yandex.Taxi’s regular service, which automatically picks the available taxi closest to the client’s location and delivers a safe and comfortable cab to the client in about seven minutes on average.
Currently, Yandex.Taxi serves more than 630,000 taxi bookings in Moscow per month. Over the three months of the ‘lottery’, each supercar is expected to respond to about 17 cab requests per day, which gives approximately one in 300 Yandex.Taxi rides a chance to be something special.
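The one-in-300 estimate can be checked with a quick calculation from the figures given in this post: four supercars, about 17 requests per car per day, and 630,000 bookings per month. The 30-day month is an assumption made only for this back-of-the-envelope check.

```python
CARS = 4                        # two Maseratis, a Porsche and a Chevrolet
REQUESTS_PER_CAR_PER_DAY = 17   # expected figure quoted in the post
BOOKINGS_PER_MONTH = 630_000    # Moscow bookings quoted in the post
DAYS_PER_MONTH = 30             # assumption for the estimate

supercar_rides_per_day = CARS * REQUESTS_PER_CAR_PER_DAY
bookings_per_day = BOOKINGS_PER_MONTH / DAYS_PER_MONTH
rides_per_supercar_ride = bookings_per_day / supercar_rides_per_day

print(f"about one ride in {round(rides_per_supercar_ride)}")
```

With these inputs, 68 supercar rides a day against 21,000 daily bookings works out to roughly one ride in 309, which is consistent with the "approximately one in 300" quoted above.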
Everything Is Solved – Yandex.Algorithm Programming Championship Celebrates Winners in Berlin
5 August, 2014
Yandex's annual competitive programming championship, Yandex.Algorithm, has produced its winners. The top winner is Gennady Korotkevich of Belarus (currently a student of St. Petersburg ITMO University) with four solutions out of a possible six and -66 minutes of penalty time. He received his well-deserved 300,000 rubles (about €6,000 of the total prize fund of €10,800) – this is the second time in a row that he has won the grand prix of Yandex.Algorithm.
Kazuhiro Hosaka (Tokyo University, Japan) also solved four problems but with -90 minutes of penalty time, and was awarded 150,000 rubles (about €3,000) of the prize fund and second place. And student Qinshi Wang from Tsinghua University (Beijing, China) claimed the remaining 90,000 (€1,800) by solving four problems with -125 minutes of penalty time and finishing in third place.
The final round of Yandex.Algorithm took place on August 1 at the Radisson Blu SAS Hotel in Berlin, just next door to the newly opened Yandex R&D office in the capital of Germany.
Having started in 2011 with only a few programmers flexing their algorithmic muscles in a special event organised by Yandex's Summer School, this year Yandex.Algorithm saw 3,890 competitors from 72 countries vying for a place in the finals. Luck, reinforced with talent and skill, was on the side of 25 participants from Belarus, Kazakhstan, Ukraine, Russia, Poland, China, Taiwan, Japan and the United States who reached the final round.
Competition in the finals was tough, as expected, with some of the world's strongest players in competitive programming showing what they are worth against previous Yandex.Algorithm winners, as well as multiple champions of other renowned international championships, including ACM ICPC and TopCoder Open, and hands-on computer engineers working at Facebook and Google. Yandex employees are excluded from participation under the terms and conditions of the contest.
All problems in the contest, including the six algorithmic tasks in the final round, were developed by a team of professional computer engineers and active competitive programmers from Carnegie Mellon University, Moscow State University, St. Petersburg State University, Google and Yandex. You can see the problems and solutions here. In addition to having practical experience solving problems similar to those presented in the contest on a daily basis, the Yandex specialists who have contributed to the contest share their knowledge with students at the Yandex School of Data Analysis.
The Yandex School of Data Analysis is a free Master’s-level program in computer science and data analysis offered by Yandex to graduates in engineering, mathematics, computer science or related fields. Three hundred and twenty-two students have graduated from the school since it was founded in 2007. Headquartered in Moscow at Yandex, the School of Data Analysis partners with leading research centres and has branches in other cities in Russia, Ukraine and Belarus.
Yandex School of Data Analysis Graduates Win Medals at World Team Programming Championship
27 June, 2014
Yandex School of Data Analysis graduates Mikhail Kolupayev and Vyacheslav Alipov have won bronze medals at the World Finals of the International Collegiate Programming Contest (ACM ICPC). The 2014 graduates of the school competed in this week’s finals in Yekaterinburg as members of the National Research University Higher School of Economics team.
ACM ICPC is the premier team programming championship worldwide and the most renowned competition of its kind.
First place at ACM ICPC 2014 went to a team from St. Petersburg State University. Team members Dmitry Egorov and Egor Suvorov are students of St. Petersburg’s Computer Science Center, which the School of Data Analysis helped create. The other gold medal-winning teams (ACM ICPC awards gold to the top four) are from Moscow State University, Peking University and the National Taiwan University.
Yandex congratulates the winners and place-getters, and looks forward to the Algorithm-2014 individual programming finals this August in Berlin, organised by the company.
© 1997–2015 Yandex