Step 5 - Elasticsearch


EcommCo’s Data Lake uses Amazon Elasticsearch Service for data governance over Data Providers' submissions, Curated Datasets, and Published Data. All metadata pertaining to data in the Data Lake is indexed and available in the Elasticsearch Service dashboard.

a. Demonstrate data governance

The diagram below illustrates how new data arriving in S3 triggers event notifications that are published to SNS, which in turn invoke Lambda functions that index the data's metadata in the Elasticsearch Service.
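The S3-to-SNS-to-Lambda flow described above can be sketched roughly as follows. This is a minimal illustration, not the Quick Start's actual function: the names `documents_from_sns_event` and `lambda_handler` and the document field names (`size_bytes`, `event_name`, and so on) are assumptions chosen for the example. It relies only on the standard shapes of the SNS event that Lambda receives and of the S3 notification carried inside it, and it stops short of sending anything to the Elasticsearch domain.

```python
import json
from urllib.parse import unquote_plus


def documents_from_sns_event(event):
    """Turn an SNS-delivered S3 notification into metadata documents.

    The document field names below are hypothetical, chosen for illustration.
    """
    docs = []
    for sns_record in event.get("Records", []):
        # SNS delivers the S3 notification as a JSON string in Sns.Message.
        s3_notification = json.loads(sns_record["Sns"]["Message"])
        # S3 test events carry no "Records" list, so default to empty.
        for s3_record in s3_notification.get("Records", []):
            obj = s3_record["s3"]["object"]
            docs.append({
                "bucket": s3_record["s3"]["bucket"]["name"],
                # Object keys arrive URL-encoded in S3 event notifications.
                "key": unquote_plus(obj["key"]),
                "size_bytes": obj.get("size"),
                "etag": obj.get("eTag"),
                "event_name": s3_record.get("eventName"),
                "event_time": s3_record.get("eventTime"),
            })
    return docs


def lambda_handler(event, context):
    """Hypothetical Lambda entry point for the SNS subscription."""
    docs = documents_from_sns_event(event)
    # A real function would send each document to the Elasticsearch
    # domain at this point; that call is omitted from this sketch.
    return {"indexed_candidates": len(docs)}
```

Keeping the parsing in a separate function from the handler makes it possible to exercise the logic locally with a hand-built event before wiring it to a live SNS topic.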

b. Observe data being indexed

{% include 'error_box.html' %}

When you click this button, the following steps will be performed within your AWS account:

  • Metadata of objects in the following S3 buckets is indexed in the Elasticsearch Service:
    • Submissions bucket
    • Curated Datasets bucket
    • Published bucket
  • Note: Metadata indexing was enabled when this Data Lake Quick Start was launched in your account.
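One way to picture how metadata from several buckets could be indexed in a single request is the Elasticsearch bulk API, whose body is newline-delimited JSON: an action line followed by the document itself. The sketch below only builds that body. The function name `build_bulk_body`, the index name `datalake-metadata`, the bucket names, and the `bucket/key` document ID scheme are all assumptions for illustration, and older Elasticsearch versions may additionally require a document type in the action line.

```python
import json


def build_bulk_body(index_name, docs):
    """Build a newline-delimited JSON body for the Elasticsearch bulk API.

    Each document is expected to carry "bucket" and "key" fields; the
    "bucket/key" document ID is an assumption made for this sketch.
    """
    lines = []
    for doc in docs:
        doc_id = f"{doc['bucket']}/{doc['key']}"
        # Action line: index this document under a deterministic ID so that
        # re-indexing the same object overwrites rather than duplicates.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the metadata document itself.
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical metadata for one object in each of the three buckets.
example_docs = [
    {"bucket": "submissions", "key": "provider-a/orders.csv", "size_bytes": 2048},
    {"bucket": "curated-datasets", "key": "orders/part-0000.parquet", "size_bytes": 1024},
    {"bucket": "published", "key": "orders/summary.json", "size_bytes": 512},
]
bulk_body = build_bulk_body("datalake-metadata", example_docs)
```

The resulting string would be sent in a POST to the domain's `_bulk` endpoint with the `Content-Type: application/x-ndjson` header; request signing and error handling are left out here.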