July 21, 2015

Providing 20,000 Autocomplete Suggestions per Second: A Brief HOW TO

Implementing a feature like autocomplete may seem trivial at first glance. But as the requirements are specified in more detail, it clearly becomes a bigger technical challenge. When you have to consider non-functional requirements (such as latency and scalability) for a website like willhaben.at, which receives more than a billion page impressions per month, things get quite a bit trickier.

The term autocomplete is used to describe a range of different features, and the actual requirements depend heavily on the overall context and on the goal to be achieved with the feature. Normally, what we mean by "autocomplete" is completing the word a user is currently typing by offering a range of possible words that match the input string. But that is only one of the many forms this feature can take.

So let's have a closer look at some particularly useful implementations of autocomplete solutions.

Different concepts for autocomplete suggestions

Display specific products

Geizhals.at provides an excellent example of a purposeful autocomplete feature. It suggests specific products to users based on their input, so a user can select a product directly without going to the search result page at all. Furthermore, it provides a preview of what would be found when submitting a search with the suggested search terms.

autocomplete at geizhals.at

As geizhals.at wants to suggest specific products to its users, it focuses on showing results that are as specific as possible in order to significantly shorten the path to the product page.

Suggest relevant keywords

A completely different approach is used by Amazon. Instead of suggesting specific products, it displays search terms and categories. In addition, its autocomplete implementation also features spell correction.

autocomplete on amazon.de

Unlike geizhals.at, Amazon does not aim to provide exact product-based suggestions, but rather more generic search-based suggestions.

These two approaches may seem similar at first glance, but they are in fact significantly different from a technical point of view.

Choose what you need

Our needs at willhaben.at are a little different. Unlike Amazon or Geizhals, we only have user-generated content, and we also do not want to suggest terms that are too product-specific. The goal was to suggest generic keywords, as Amazon does. For example, when a user types in "ip", we are almost certain that they are searching for an iPhone or iPad, so we want to suggest a list of all relevant Apple products. This is the most helpful information we can show the user at this point, and the suggestions serve as an indicator of both suitable products and product availability. As we do not have a product catalogue or anything like it, we have to auto-generate the suggestions ourselves.

As a result, this is an example of what we would not want to provide as a suggestion - as you can see, the output is way too long, specific and contains too much irrelevant data:


Consequently, we wanted to arrive at more generic suggestions that also illustrate the variety of user-generated input, as you can see in the example below:

autocomplete at willhaben.at
However, when implementing autocomplete, there are more requirements to consider:
  • control what is shown
    • As we only have user-generated content, the suggested words are also based on what users are selling on our platform. 
  • synonyms
    • These enhance the autocomplete suggestions. For example, it does not matter which of the following terms the user types in; autocomplete always provides the same results: 90er, 90 er, 90ern, 90-er, 90iger, 90-iger
  • output composition
    • We only provide suggestions that are between one and three words long. These suggestions are automatically generated from all the items we have. Suggestions in that range are generic enough to let the user find a lot of adverts, but at the same time specific enough to adequately narrow down the search.
  • performance
    • As the largest website in Austria (with over 1,000,000,000 page impressions per month), performance is a key concern. Our autocomplete implementation can handle over 20,000 requests per second per instance, even including infix autocompletion.
  • latency
    • The response time of the autocomplete component should not exceed 20 ms, because data transmission, especially on mobile devices, takes considerable time. If the overall response takes longer than 100 ms, typing tends to feel laggy.
  • error handling
    • A feature like this must be designed in a very fault-tolerant way. We have to ensure that an error in a supporting feature like this (as opposed to core features, which always have to be up and running) does not affect other parts of our platform. This approach has recently been described as resilient software design.
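The synonym handling described above can be sketched as a simple normalization step applied both at indexing time and at query time. This is an illustrative Python sketch, not our actual (Java/SOLR-based) implementation; the variant table would in practice be maintained by hand from observed user input:

```python
# Map spelling variants to one canonical token. The table below is
# illustrative; the real mapping is curated from real user queries.
SYNONYMS = {
    "90 er": "90er",
    "90ern": "90er",
    "90-er": "90er",
    "90iger": "90er",
    "90-iger": "90er",
}

def normalize(query):
    """Lower-case the input and collapse known variants to a canonical form."""
    q = query.lower().strip()
    return SYNONYMS.get(q, q)

# All variants resolve to the same term, so they hit the same suggestions:
variants = ["90er", "90 er", "90ern", "90-er", "90iger", "90-iger"]
assert {normalize(v) for v in variants} == {"90er"}
```

Because the same normalization runs on both sides, it does not matter which variant the user types: the lookup key is always identical.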

Choose your tool

Our search back end is powered by Apache SOLR. This infrastructure component has proven to be bulletproof, so we decided to stick with it. There are, of course, other tools that provide similar functionality, such as Elasticsearch. Both SOLR and Elasticsearch are based on Lucene (https://lucene.apache.org/core/), so their feature sets are nearly identical.

While we chose to create our own autocomplete solution, SOLR does provide out-of-the-box components that would be suitable for implementing autocomplete functionality. Each of these components (each of which would deserve a blog post in its own right) provides a wide range of configuration options and may therefore fit different needs. The figure below shows a quick comparison:

comparison SOLR components

As we can see, some components handle spell correction well, while others support multi-term suggestions but do not return a result count. In conclusion, none of the components shown provides all the features that we wanted.

Therefore, we decided to generate the suggestions on our own and use SOLR as a search back end for the suggestions that match the user input. The system environment (without load balancers, etc.) looks approximately like this:

autocomplete architecture

As you can see in the illustration above, front end devices such as mobile phones and browsers access our autocomplete feature directly. The requests therefore do not pass through the normal front end servers, since routing them through those servers would add latency.

All relevant item titles are exported from the database to a file. This makes it very easy to add other sources for autocomplete terms, and the import does not depend on an up-and-running database.

Content Push & SOLR

The heart of the feature is the AC content push. It takes every title and generates the suggestions from it. For example, take an item with the following title:

"Super Lederhose aus Bisonleder"

The generated suggestions are:
  • super lederhose
  • lederhose aus bisonleder
  • super
  • lederhose
  • bisonleder
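The rule implied by this example can be sketched as follows: take every contiguous run of one to three words from the (lower-cased) title, and discard any candidate that starts or ends with a linking word such as "aus". This Python sketch is a guess at the rule reconstructed from the example above, not our actual implementation, and the stopword list is illustrative:

```python
# Linking words and tokens we never want at the edge of a suggestion
# (illustrative subset; the real list is much longer).
STOPWORDS = {"aus", "und", "cm"}

def generate_suggestions(title, max_len=3):
    """All contiguous 1..max_len word n-grams of the title that neither
    start nor end with a stopword."""
    tokens = title.lower().split()
    suggestions = set()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            suggestions.add(" ".join(gram))
    return suggestions

# Reproduces exactly the five suggestions from the example title:
assert generate_suggestions("Super Lederhose aus Bisonleder") == {
    "super", "lederhose", "bisonleder",
    "super lederhose", "lederhose aus bisonleder",
}
```

Note how "lederhose aus bisonleder" survives (the linking word sits in the middle), while "aus bisonleder" and "lederhose aus" are dropped.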
Spell correction is also applied at this step. For example, for the misspelled term "Fahrad", the generated suggestion will be:
  • Fahrrad
This spell correction works both ways, i.e. at indexing time and at query time. If the user enters "Fahrad" in the search field, the suggestions will not contain "Fahrad" but "Fahrrad". Because we know exactly what users are searching for and which mistakes they make, we can provide spell correction that no standard algorithm could match. That said, there are quite good general-purpose algorithms out there, using for example the Levenshtein distance to measure how far apart two strings are. Depending on your requirements, one of those may fit your needs better than implementing your own dictionary-based approach as we did.

We also remove linking words (Bindewörter) and other tokens that we do not want to appear in suggestions; for example, tokens like "cm" or "und" are stripped from the beginnings and ends of suggestions. Several different algorithms are applied to generate only valid suggestions. Based on over 3,000,000 adverts, this takes about one hour to compute.

At the end of the process, we have about 20,000,000 suggestions ready. These terms are indexed in SOLR along with some metrics that define how to order them. This information is needed to determine which 10 suggestions should be shown to the user. The index size is below 2 GB, so not even a high-memory server is needed.
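Conceptually, the lookup then reduces to a prefix match ordered by the stored metric, capped at 10. The following toy in-memory stand-in illustrates that ordering logic only; the real lookup is done by SOLR, and the suggestion terms and metric values here are invented for the example:

```python
# Each suggestion carries a metric, e.g. how many adverts it matches.
# These values are made up for illustration.
SUGGESTIONS = {"iphone": 9500, "ipad": 7200, "iphone 6": 4100, "ipod": 800}

def suggest(prefix, limit=10):
    """Prefix matches, ordered by descending metric, capped at `limit`."""
    matches = [s for s in SUGGESTIONS if s.startswith(prefix.lower())]
    return sorted(matches, key=lambda s: -SUGGESTIONS[s])[:limit]

# Typing "ip" surfaces the most popular Apple-related terms first:
assert suggest("ip") == ["iphone", "ipad", "iphone 6", "ipod"]
```

Because the metric is computed ahead of time, ranking at query time is just a sort over a handful of prefix matches, which keeps the response well inside the latency budget.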

Precomputing the autocomplete terms results in very fast query times: we now see response times between 5 and 10 ms, depending on the load. The higher the load, the shorter the response times, because the cache hit ratio improves.

Autocomplete Component

This component provides a REST interface that is queried by the clients, e.g. via the jQuery autocomplete plugin. It simply takes the string the user has entered and returns an array of suggestions. This design has proven very useful for providing a service with high availability and low latency.
As shown in the figure "autocomplete architecture" above, this component queries the SOLR servers. To provide a high degree of fault tolerance, e.g. during a deployment, we use Hystrix from the Netflix OSS stack. If the SOLR servers are not available, or when there is heavy load, the component stops querying the servers and triggers a fall-back procedure. To support this, the component maintains caches so that it can serve most requests even when the SOLR back end is down. Designing components like this gives us a very resilient system environment.
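Hystrix itself is a Java library; the fall-back-to-cache pattern it gives us can be sketched roughly in Python as below. This is a simplified illustration of the idea (no circuit-breaker state machine, no thread pools), and the class and function names are invented for the example:

```python
class SuggestClient:
    """Illustrative fall-back pattern: cache every successful answer and
    serve the last cached answer when the back end fails or times out."""

    def __init__(self, backend):
        self.backend = backend  # callable that queries SOLR (assumed)
        self.cache = {}

    def suggest(self, prefix):
        try:
            result = self.backend(prefix)
            self.cache[prefix] = result  # keep a copy for the fall-back
            return result
        except Exception:
            # Back end down or too slow: serve stale suggestions rather
            # than propagating the error to the rest of the platform.
            return self.cache.get(prefix, [])

def flaky_backend(prefix):
    raise TimeoutError("SOLR not reachable")

client = SuggestClient(flaky_backend)
assert client.suggest("ip") == []          # degraded, but no error escapes
client.cache["ip"] = ["iphone", "ipad"]    # pretend an earlier call succeeded
assert client.suggest("ip") == ["iphone", "ipad"]  # served from the cache
```

The key property is that a failure in this supporting feature degrades gracefully (empty or stale suggestions) instead of cascading into the core platform.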


The requirements of an autocomplete solution are the driving force behind the technical solution. Depending on the functional and non-functional requirements, such as fault tolerance and latency, the appropriate setup has to be chosen. A wide range of tools and technologies is available to provide autocomplete features in their different facets.

Last but not least: do not forget about your users, and perform user tests. Even if the technical solution turns out to be the latest and fanciest around, it is only as good as the value it provides to the users. Or in other words: don't make it a case of style over substance.
