AWS Elasticsearch: your data comes alive

The Challenge

Amazon Elasticsearch Service is a fully managed service that makes it easy to deploy, secure, and operate Elasticsearch at scale with zero downtime. The service offers open-source Elasticsearch APIs, managed Kibana, and integrations with Logstash and other AWS services, enabling you to securely ingest data from any source and to search, analyze, and visualize it in real time. Amazon Elasticsearch Service lets you pay only for what you use, with no upfront costs or usage requirements. With Amazon Elasticsearch Service, you get the ELK stack you need without the operational overhead.

The client loads a considerable amount of data into its databases every day, and with fast, careful, and detailed preparation and analysis that data becomes a valuable business resource. The client's search cluster contains an extremely comprehensive data set: it indexes over 20 documents daily, with millions of records covering all the products that can be purchased on the site. Each product document includes all the information related to that product: the title, description, image link, keywords, meta information for the technical specifications, stock status (which shows how many days it will take to ship the product), pricing information, product category, and so on.

Amazon Elasticsearch Service allows data to be indexed at the same time as it is loaded into the database. Once the load is done, the client has data that is ready to search and analyze.

 

Solution

From tailing a simple log file to complex, business-critical analytics, the ELK stack comes together to cover the whole range of use cases.

In our use case, after a file with millions of records has been placed on S3, several worker Lambda functions start asynchronously to process the file and index the searchable data. Indexing runs in parallel with loading the data into the database. The index itself provides autocomplete search based on n-grams. The final API consumed by the end user provides a fast and natural way to search a large data set with millions of records.
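The worker step described above can be sketched roughly as follows. This is a minimal illustration, not the client's implementation: the function names, the "sku" id field, and the batch size are assumptions made for the example. It shows only the pure parts of a worker, namely reading the bucket and key from a standard S3 event notification and serializing records into the newline-delimited body that the Elasticsearch _bulk endpoint expects. Downloading the file and sending the signed HTTP request are left out.

```python
import json
from typing import Dict, Iterable, List, Tuple
from urllib.parse import unquote_plus


def s3_objects_from_event(event: Dict) -> List[Tuple[str, str]]:
    """Extract (bucket, key) pairs from an S3 event notification."""
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def chunked(items: List[Dict], size: int) -> Iterable[List[Dict]]:
    """Split records into batches so each bulk request stays bounded."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def to_bulk_ndjson(index: str, docs: Iterable[Dict], id_field: str = "sku") -> str:
    """Serialize documents into a _bulk request body.

    Each document becomes two lines: an action line and the source line.
    The id_field name ("sku") is a hypothetical choice for this sketch.
    """
    lines = []
    for doc in docs:
        action = {"index": {"_index": index, "_id": str(doc[id_field])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    # The bulk body must be terminated by a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = [{"sku": "A1", "title": "Phone"}, {"sku": "B2", "title": "Case"}]
    for batch in chunked(sample, 1):
        print(to_bulk_ndjson("products", batch), end="")
```

Batching keeps each request small enough that several Lambda workers can index different parts of the same file independently, which is what lets indexing proceed alongside the database load.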

AWS provides endpoints for managing indexes and for checking the search and analytics behavior of indexed data. AWS also provides out-of-the-box Kibana for data visualization. Thanks to the real-time nature of Elasticsearch, users can refine their search before clicking the search button, because they are shown a preview of their results.
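As a rough illustration of the results preview, the request sent while the user is still typing could look something like the sketch below. The index name, the "title" field, and the helper names are assumptions for the example. The query body uses the standard Elasticsearch match query, and the snippet only builds the request; it does not send it.

```python
import json
from typing import Dict


def search_path(index: str) -> str:
    """Return the REST path of the search endpoint for an index."""
    return f"/{index}/_search"


def build_preview_query(text: str, field: str = "title", size: int = 5) -> Dict:
    """Build a small search body suitable for a results preview.

    The field name is hypothetical; "operator": "and" requires every
    typed term to match, which keeps the preview narrow.
    """
    return {
        "size": size,          # only a handful of hits for the preview
        "_source": [field],    # return just the field shown to the user
        "query": {
            "match": {field: {"query": text.strip(), "operator": "and"}}
        },
    }


if __name__ == "__main__":
    print("POST", search_path("products"))
    print(json.dumps(build_preview_query("wireless mou"), indent=2))
```

Keeping the preview request small (few hits, one returned field) is one way to make it cheap enough to issue on each keystroke or after a short pause in typing.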

At the heart of the ELK stack is Elasticsearch, a JSON-based, RESTful search engine designed to scale to millions of events per second with high reliability. On AWS, the ELK stack provides full-text search and metrics alongside log analytics use cases such as risk management, market intelligence, e-commerce personalization, compliance, security analysis, and fraud detection.

Benefits and Results

  • Data is ready to search and analyze right after it is loaded into the system. The end user does not wait for additional processing time while a large amount of data is indexed.
  • The client can reduce the number of servers and resources needed for data search, because Elasticsearch handles the heavy indexing and time-consuming search.
  • Indexing of data is fast and reliable. When the built-in analyzers do not meet your needs, Elasticsearch allows you to create a custom analyzer that uses the appropriate combination of tokenizers and filters.
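To make the custom analyzer point concrete, here is one possible shape of an index definition for autocomplete. It is a sketch under assumptions: the source only says "n-grams", so the use of the edge_ngram token filter, the analyzer names, the "title" field, and the gram sizes are illustrative choices rather than the client's configuration. The mapping uses the typeless format of Elasticsearch 7.x. The small edge_ngrams helper mimics in plain Python what the filter emits for a single token.

```python
from typing import Dict, List


def edge_ngrams(token: str, min_gram: int, max_gram: int) -> List[str]:
    """Return the prefixes an edge n-gram filter would emit for one token."""
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


def autocomplete_index_body(min_gram: int = 2, max_gram: int = 15) -> Dict:
    """Build settings and mappings for an index with a custom analyzer.

    All names below (autocomplete_filter, autocomplete_index,
    autocomplete_search, title) are hypothetical.
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "autocomplete_filter": {
                        "type": "edge_ngram",
                        "min_gram": min_gram,
                        "max_gram": max_gram,
                    }
                },
                "analyzer": {
                    # Index time: split into words, lowercase, emit prefixes.
                    "autocomplete_index": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "autocomplete_filter"],
                    },
                    # Search time: no n-grams, so the typed text is matched
                    # against the stored prefixes as-is.
                    "autocomplete_search": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase"],
                    },
                },
            }
        },
        "mappings": {
            "properties": {
                "title": {
                    "type": "text",
                    "analyzer": "autocomplete_index",
                    "search_analyzer": "autocomplete_search",
                }
            }
        },
    }


if __name__ == "__main__":
    print(edge_ngrams("phone", 2, 4))
```

Using a separate search analyzer without the n-gram filter is a common design choice: the prefixes are generated once at index time, and a partially typed word matches them directly at query time.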

 
