What Is Amazon Kendra? Review Of The AWS Search Service

Amazon Kendra is the new search engine from Amazon Web Services (AWS), intended to facilitate access to information using machine learning.

With Kendra, Amazon is trying to conquer further digital channels by competing directly with companies such as Google and Microsoft.

What is AWS’ Amazon Kendra search service?

Kendra’s goal is a user-friendly combination of data with advanced natural language processing to achieve the best possible customer experience.

The idea is that Kendra focuses on natural language queries (e.g. “What is Amazon Kendra?”) instead of simple keyword searches (e.g. “AWS Kendra”) and answers them as precisely as possible.

Using machine learning, Kendra then tries to extract the best possible answer from all linked data sources, deliver it and highlight the most important passages.

This approach follows the development of Google search, which no longer simply returns a document but also shows which information within the document was decisive for the match.

Content Search Challenges

Amazon Kendra tries to offer an elegant solution for search, especially full-text search over content. To understand the benefits of Kendra, one must first understand some of the challenges in content search:

BIG DATA IN SEARCH: LARGE AND RAPIDLY CHANGING AMOUNTS OF DATA

As the amount of data available for the search increases, so do the challenges facing the search system. Performing a live full-text search across all documents has not been sufficient for some time; instead, precomputed hash tables (often called inverted indexes) have moved to the centre of search. These hash tables, constructed from the relevant words within documents, show which terms can be found in which documents.
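As a rough illustration of such a term-to-document table, here is a minimal sketch in Python. The sample documents, the simple tokenizer and the function names are assumptions made for this example; a production search engine adds stemming, stop words and ranking on top.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents, for illustration only.
documents = {
    "doc1": "Amazon Kendra is a search service from AWS",
    "doc2": "Full-text search over large document collections",
    "doc3": "AWS offers storage and compute services",
}
index = build_index(documents)
print(sorted(search(index, "aws search")))  # prints ['doc1']
```

The lookup itself only touches the entries for the query terms, which is why the table answers queries quickly even when the document collection is large.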

While this approach allows quick matching from query to result, creating the tables is a costly intermediate step. Since the underlying data changes ever more quickly in the age of big data and user-generated content, the update mechanism has to be considered early on.

There are two methods here: a full re-index vs change log tables. As the name suggests, the first rebuilds the index in the hash tables completely (e.g. at night as a batch process), while the second uses only the changes in the database as the basis for updating the index.

The former is easier to implement and certainly makes sense for smaller amounts of data. The latter is more complex but allows the search index to be updated while the system is running.
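The difference between the two update strategies can be sketched as follows. The change log format, a list of (operation, document ID, new text) tuples, and all names are assumptions for this illustration and do not describe how any particular search engine stores its changes.

```python
from collections import defaultdict


def full_reindex(documents):
    """Rebuild the whole term -> document IDs table from scratch."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def apply_change_log(index, documents, change_log):
    """Update index and documents in place from (operation, doc_id, new_text) entries."""
    for operation, doc_id, new_text in change_log:
        # Remove the postings of the old version, if the document existed.
        old_text = documents.pop(doc_id, None)
        if old_text is not None:
            for term in set(old_text.lower().split()):
                index[term].discard(doc_id)
                if not index[term]:
                    del index[term]
        # Add the postings of the new version for inserts and updates.
        if operation == "upsert":
            documents[doc_id] = new_text
            for term in new_text.lower().split():
                index[term].add(doc_id)
    return index


# Hypothetical data, for illustration only.
documents = {"doc1": "kendra search service", "doc2": "aws storage"}
index = full_reindex(documents)

change_log = [
    ("upsert", "doc3", "kendra connectors"),
    ("delete", "doc2", None),
    ("upsert", "doc1", "kendra query service"),
]
apply_change_log(index, documents, change_log)

# The incrementally updated index matches a full rebuild of the new state.
print(dict(index) == dict(full_reindex(documents)))  # prints True
```

The incremental variant only touches the terms of the changed documents, which is what makes updates during operation feasible, at the price of more bookkeeping.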

Lastly, infrastructure is a central concern for search services. As the number of queries grows, the underlying architecture has to scale. In particular, NoSQL systems such as MongoDB are often regarded as an answer to problems that relational database systems (RDBMS) cannot solve.

OUTPUT THE BEST SEARCH RESULT: SEARCH INTENT VS SEARCH RESULT

The technical basis of data availability and query processing is not the only relevant factor, however: at the heart of search is delivering the best possible results to the user.

The search intent is the ground truth (e.g. a user wants a white bike), while the search input (e.g. “white bike”) does not always clearly reflect this meaning (does the user want any bike or one particular bike?).

To serve the intent as well as possible, it is generally necessary to weigh all available data (product information, behavioural data, analytics data, etc.) and order the results according to the request.

A ranking cocktail (e.g. the title is more relevant than the description), frequencies (e.g. which results are clicked more often), methods of natural language processing (e.g. TF-IDF) and machine learning (e.g. neural networks) can all be used for this.
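To make the combination of a ranking cocktail and TF-IDF more concrete, here is a minimal sketch. The field weights, the smoothed IDF variant and the sample products are assumptions chosen for the example, not values used by any specific search product.

```python
import math
from collections import Counter

# Assumed weights: a hit in the title counts three times as much as one in the description.
FIELD_WEIGHTS = {"title": 3.0, "description": 1.0}


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def idf(term, products):
    """Smoothed inverse document frequency across all fields of all products."""
    containing = sum(
        1
        for product in products
        if any(term in tokenize(product[field]) for field in FIELD_WEIGHTS)
    )
    return math.log((1 + len(products)) / (1 + containing)) + 1.0


def score(query, product, products):
    """Sum field weight * term frequency * IDF over all query terms and fields."""
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        counts = Counter(tokenize(product[field]))
        for term in tokenize(query):
            total += weight * counts[term] * idf(term, products)
    return total


def rank(query, products):
    """Return the products ordered from highest to lowest score."""
    return sorted(products, key=lambda p: score(query, p, products), reverse=True)


# Hypothetical product data, for illustration only.
products = [
    {"id": "C", "title": "garden chair", "description": "white wooden chair"},
    {"id": "B", "title": "bike helmet", "description": "white helmet for bike riders"},
    {"id": "A", "title": "white bike", "description": "city bike with basket"},
]
print([p["id"] for p in rank("white bike", products)])  # prints ['A', 'B', 'C']
```

Product A wins because both query terms appear in its heavily weighted title. Click frequencies or a learned model could be added as further ingredients of the same weighted sum.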

MEANT BUT NOT WRITTEN: SYNONYMS, ABBREVIATIONS, TYPING ERRORS IN THE SEARCH QUERY

Finally, a major aspect of content search is resolving unclear user queries and mapping them to possible results. The hash-table approach described above would not map the request “AWS Kendra” to “Amazon Web Services” and would not return a valid result.

The same goes for synonyms, abbreviations and typing errors. A variety of approaches address these challenges: from phonetic search (against typing errors) and business rules (to define synonyms) to statistical methods (e.g. fuzzy search) and machine learning models, there are many ways to process difficult but valid search queries.
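Two of these techniques, synonym rules and fuzzy matching, can be sketched with the Python standard library. The synonym table, the vocabulary and the similarity cutoff of 0.8 are assumptions for this example; `difflib.get_close_matches` serves here as a simple stand-in for a dedicated fuzzy search component.

```python
import difflib

# Hypothetical business rules that map a term to additional search terms.
SYNONYMS = {
    "aws": ["amazon", "web", "services"],
    "bike": ["bicycle"],
}

# Hypothetical vocabulary: the terms known to the search index.
VOCABULARY = ["aws", "amazon", "web", "services", "kendra", "search", "bike", "bicycle"]


def correct(term, vocabulary, cutoff=0.8):
    """Return the closest known term, or the term itself if nothing is similar enough."""
    if term in vocabulary:
        return term
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else term


def expand_query(query):
    """Correct typing errors, then append the synonyms of each term."""
    expanded = []
    for raw_term in query.lower().split():
        term = correct(raw_term, VOCABULARY)
        expanded.append(term)
        expanded.extend(SYNONYMS.get(term, []))
    return expanded


print(expand_query("AWS Kendar"))
# prints ['aws', 'amazon', 'web', 'services', 'kendra']
```

With this expansion, a query for “AWS Kendar” also reaches documents that only mention “Amazon Web Services” and “Kendra”, which a plain term lookup would miss.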

Benefits and features of AWS Kendra

Kendra from Amazon Web Services aims to address all three of these challenges through its basic functionality.

BIG DATA & SCALING BY KENDRA

Thanks to its direct integration into the AWS cloud services, scaling is not an issue for Kendra. Both storage via S3 and cloud computing via EC2 or Lambda are almost infinitely scalable.

Large amounts of data that have to be processed in a highly complex manner are therefore not a problem. In general, Kendra allows access to all common data sources using a variety of connectors (currently 17 common connectors: S3, file systems (SMB), Web crawler, Databases, SharePoint Online, SharePoint on-prem, Box, Dropbox, Exchange, OneDrive, Google Drive, Salesforce, Confluence, Jira, ServiceNow, Zendesk, Jive).

SEARCH INTENT VS SEARCH RESULT OPTIMIZATION BY KENDRA

The next interesting aspect is the matching between search intent and search results. This challenge is at the heart of Kendra. The main goal of Kendra is to provide answers to questions in natural language (e.g. “What functions does AWS Kendra have?”).

The extended input of a natural question allows the search engine to search for and find the most precise answer possible. But even with simple keywords (e.g. “functions”), Kendra points the user to the relevant passages in a document.
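How such a query and the marked passages might look in code is sketched below with boto3, the AWS SDK for Python. The `query` operation of the `kendra` client takes an `IndexId` and a `QueryText`; the index ID, the region and the sample response are placeholders written by hand for this example, and the sketch assumes that `EndOffset` marks the position just after the highlighted text.

```python
def query_kendra(index_id, question, region="us-east-1"):
    """Send a natural language question to a Kendra index.

    Requires boto3 and valid AWS credentials; index_id is a placeholder
    for the ID of your own index.
    """
    import boto3  # imported lazily so the helpers below work without it

    client = boto3.client("kendra", region_name=region)
    return client.query(IndexId=index_id, QueryText=question)


def mark_highlights(excerpt):
    """Wrap the highlighted spans of a document excerpt in ** markers."""
    text = excerpt["Text"]
    highlights = sorted(
        excerpt.get("Highlights", []),
        key=lambda h: h["BeginOffset"],
        reverse=True,
    )
    # Insert markers from the end so that earlier offsets stay valid.
    for highlight in highlights:
        begin, end = highlight["BeginOffset"], highlight["EndOffset"]
        text = text[:begin] + "**" + text[begin:end] + "**" + text[end:]
    return text


def summarize(response):
    """Return (result type, marked excerpt) pairs for all result items."""
    return [
        (item["Type"], mark_highlights(item["DocumentExcerpt"]))
        for item in response.get("ResultItems", [])
    ]


def span(text, phrase):
    """Build a highlight entry for the first occurrence of phrase in text."""
    begin = text.index(phrase)
    return {"BeginOffset": begin, "EndOffset": begin + len(phrase)}


# Hand-written sample in the shape of a query response, for illustration only.
sample_text = "Amazon Kendra is an intelligent search service powered by machine learning."
sample_response = {
    "ResultItems": [
        {
            "Type": "ANSWER",
            "DocumentExcerpt": {
                "Text": sample_text,
                "Highlights": [
                    span(sample_text, "Amazon Kendra"),
                    span(sample_text, "intelligent search service"),
                ],
            },
        }
    ]
}

for result_type, excerpt in summarize(sample_response):
    print(result_type, excerpt)
# prints: ANSWER **Amazon Kendra** is an **intelligent search service** powered by machine learning.
```

Against a real index, `summarize(query_kendra("<your-index-id>", "What functions does AWS Kendra have?"))` would process the live response in the same way.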

As a fallback mechanism, Kendra also returns a list of document links ranked by a deep learning model, which the user can follow accordingly. The “classic” way of giving certain attributes in structured data (e.g. title, date, hits) more weight (the “ranking cocktail”) is also possible with Kendra.

All search results are based on domain-specific models (currently available for 16 domains: industrial manufacturing, IT, legal, financial services, tourism and hospitality, insurance, pharmaceuticals, oil and gas, media and entertainment, healthcare, human resources, communications, telecommunications, mining, food and beverages and automotive), which further increases the quality of the search results.

SYNONYMS, ABBREVIATIONS AND AUTOMATIC IMPROVEMENT OF RESULTS

While AWS already covers the first two points with Kendra, the tool is still weak on the last challenge. The emphasis is on “still”, however, because most of the associated functions are marked as “available soon” (as of July 2020).

For example, synonyms are to be covered by lists. Amazon Kendra is also meant to learn which results fit well and to offer automatic search completion (so-called suggest).

The analysis of search activity in Kendra (e.g. by tracking the search behaviour and the quality of the results) is particularly promising, as it allows conclusions about the success of the search engine to be drawn quickly.

Amazon would like to provide a range of metrics (e.g. most frequent search queries, most popular results, quality metrics such as mean reciprocal rank (MRR) and ratings) to support the optimization of the system.
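Of these metrics, the mean reciprocal rank is easy to compute by hand: for each query, take 1 divided by the position of the first relevant result (0 if none is returned) and average over all queries. The following sketch uses made-up query logs; how Kendra itself collects relevance feedback is not covered here.

```python
def reciprocal_rank(ranked_results, relevant):
    """Return 1 / position of the first relevant result, or 0.0 if there is none."""
    for position, doc_id in enumerate(ranked_results, start=1):
        if doc_id in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(queries):
    """Average the reciprocal rank over (ranked_results, relevant) pairs."""
    if not queries:
        return 0.0
    return sum(reciprocal_rank(results, relevant) for results, relevant in queries) / len(queries)


# Hypothetical log: the results shown per query and the documents judged relevant.
queries = [
    (["doc1", "doc2", "doc3"], {"doc1"}),  # relevant result at position 1 -> 1.0
    (["doc4", "doc5", "doc6"], {"doc5"}),  # relevant result at position 2 -> 0.5
    (["doc7", "doc8", "doc9"], {"doc0"}),  # no relevant result            -> 0.0
]
print(mean_reciprocal_rank(queries))  # prints 0.5
```

An MRR close to 1 means that users usually find a relevant result at the very top, while a low value indicates that relevant results are buried or missing.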

Examples of applications from AWS Kendra / Use Cases

As shown at the beginning, there are many use cases for AWS Kendra. Above all, any application that searches large amounts of text is particularly well suited to Kendra.

INTERNAL SEARCH USING AWS KENDRA

In the enterprise environment, in particular, search queries for knowledge in documents are increasing. Use cases range from internal FAQs to product information, research documents, document indexing, internal knowledge databases, and onboarding new employees using extensive material.

EXTERNAL SEARCH QUERIES BY KENDRA, E.G., ON-SITE SEARCH

In addition to optimizing the internal search, the most demanding discipline for any content search is serving external customers or interested parties.

Whether in the support area, with digital products (e.g. magazines), e-commerce descriptions, content articles or more: leading users quickly to their goal is central to not losing them.

DATA MANAGEMENT USING INDEXING FROM KENDRA FOR ECOMMERCE OR PRODUCT SEARCHES

While there are specialized solutions for searching within product data (e.g. Fredhopper), AWS Kendra also tries to conquer this field. The intended use would be primarily in e-commerce, i.e. in online shops, to find the best product for the prospective customer.