AWS Data Exchange

AWS Data Exchange makes it easy to find, subscribe to, and use third-party data in the cloud. It simplifies access to data and gives you a secure, easy-to-use way to consume third-party data products. With datasets from AWS Data Exchange, you can make more data-driven decisions. AWS Data Exchange has hundreds of data products from multiple data providers. After creating a subscription, you can export the data into your Amazon S3 bucket and then analyze it with services such as Amazon Athena and Amazon Redshift, build machine learning models with Amazon SageMaker, transform and process it with Amazon EMR and AWS Glue, or add it to a data lake with AWS Lake Formation.

Task - learn about AWS Data Exchange.

Watch the following video to learn about AWS Data Exchange.

Here are the important AWS Data Exchange terms:

  • An Asset is a data object (a single file). Each asset is uniquely identifiable by an ID. One or more assets are grouped together to create a revision.

  • A Revision is a point-in-time view or update of a data set. Each revision is uniquely identifiable by an ID. One or more revisions are grouped together to form a data set.

  • A Data Set is a logical grouping of data that you subscribe to. Each data set is uniquely identifiable by an ID.
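
The relationship between these three terms can be sketched as a small data model. This is only an illustration of the hierarchy described above, not the AWS Data Exchange API: the class names, field names, IDs, and file names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Asset:
    """A single data object (one file), identified by its own ID."""
    asset_id: str
    name: str


@dataclass
class Revision:
    """A point-in-time view of a data set, made up of one or more assets."""
    revision_id: str
    assets: List[Asset] = field(default_factory=list)


@dataclass
class DataSet:
    """The logical grouping you subscribe to, made up of one or more revisions."""
    data_set_id: str
    name: str
    revisions: List[Revision] = field(default_factory=list)

    def asset_count(self) -> int:
        """Total number of assets across every revision."""
        return sum(len(revision.assets) for revision in self.revisions)


# Hypothetical IDs and file names, for illustration only.
sample = DataSet(
    data_set_id="ds-example",
    name="Example data set",
    revisions=[
        Revision(
            revision_id="rev-example-1",
            assets=[Asset("asset-1", "images.zip"), Asset("asset-2", "metadata.csv")],
        )
    ],
)
print(sample.asset_count())
```

A data set with a single revision, like the one used later in this workshop, would have exactly one entry in `revisions`.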

This is how you use AWS Data Exchange as a subscriber:

  • Browse the catalog – Explore the data products published by sellers on AWS Data Exchange.
  • Subscribe to the product – Once you have identified a product you want to use, subscribe to it. Charges for paid products appear on your AWS bill. The product you will use in today’s workshop is free.
  • Use the product – After you subscribe, you have access to the product and can export the dataset into an Amazon S3 bucket.
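
For readers who prefer scripting to the console, the export in the last step might look roughly like the sketch below. It assumes the boto3 `dataexchange` client and its `create_job` and `start_job` calls; the helper names, the bucket name, and the IDs are hypothetical, and the client is passed in so the logic can be tried without AWS credentials.

```python
from typing import Any, Dict


def build_export_details(data_set_id: str, revision_id: str, bucket: str) -> Dict[str, Any]:
    """Build the Details payload for an EXPORT_REVISIONS_TO_S3 job."""
    return {
        "ExportRevisionsToS3": {
            "DataSetId": data_set_id,
            "RevisionDestinations": [
                {"Bucket": bucket, "RevisionId": revision_id}
            ],
        }
    }


def export_revision_to_s3(client: Any, data_set_id: str, revision_id: str, bucket: str) -> str:
    """Create and start an export job on the given client; return the job ID."""
    job = client.create_job(
        Type="EXPORT_REVISIONS_TO_S3",
        Details=build_export_details(data_set_id, revision_id, bucket),
    )
    # Jobs are created in a waiting state and must be started explicitly.
    client.start_job(JobId=job["Id"])
    return job["Id"]
```

With boto3 installed and credentials configured, this could be called as `export_revision_to_s3(boto3.client("dataexchange"), "<data set ID>", "<revision ID>", "<your bucket>")`. The job runs asynchronously, so the files appear in the bucket only after the job finishes.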

Task - subscribe to a dataset

In today’s workshop, you will subscribe to a sample dataset (500 Image & Metadata free sample) from Shutterstock, which contains a single revision. As you will see, the dataset contains camera footage of a workplace that allows pets, as well as some images from Whole Foods, a grocery store chain.

Step 1: Subscribe to AWS Data Exchange product

  1. Log in to your AWS account.
  2. Open the AWS Data Exchange service.
  3. Browse the different dataset listings.
  4. Search for the 500 Image & Metadata free sample product from Shutterstock.
  5. Choose Continue to Subscribe.
  6. Choose Subscribe.
  7. Note that you do not need to run the export job.
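
If you want to confirm the subscription from code rather than the console, a minimal sketch along the following lines could list the data sets you are entitled to and their revisions. It assumes the boto3 `dataexchange` client with its `list_data_sets` and `list_data_set_revisions` calls; the helper names are hypothetical, and nothing is exported.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional


def _items(call: Callable[..., Dict[str, Any]], key: str, **kwargs: Any) -> Iterator[Dict[str, Any]]:
    """Yield every item from a paginated list call, following NextToken."""
    while True:
        response = call(**kwargs)
        yield from response.get(key, [])
        token = response.get("NextToken")
        if not token:
            return
        kwargs["NextToken"] = token


def summarize_entitled_data_sets(client: Any) -> List[Dict[str, Any]]:
    """Return each subscribed (entitled) data set with its revision IDs."""
    summary = []
    for data_set in _items(client.list_data_sets, "DataSets", Origin="ENTITLED"):
        revisions = _items(
            client.list_data_set_revisions, "Revisions", DataSetId=data_set["Id"]
        )
        summary.append(
            {
                "id": data_set["Id"],
                "name": data_set["Name"],
                "revision_ids": [revision["Id"] for revision in revisions],
            }
        )
    return summary


def print_entitled_data_sets(region_name: Optional[str] = None) -> None:
    """Print the summary using a real client (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the helpers above work without boto3

    client = boto3.client("dataexchange", region_name=region_name)
    for entry in summarize_entitled_data_sets(client):
        print(entry["name"], entry["id"], entry["revision_ids"])
```

After subscribing, calling `print_entitled_data_sets()` in the Region where you subscribed should include the Shutterstock sample with a single revision ID. It may take a short while for a new subscription to show up.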

Congratulations, you have just completed step 1 of the following architecture diagram.