Building a Data Lake on AWS – Whitepaper – AWS

A data lake is an architectural approach that allows you to store massive amounts of data in a central location, so that it is readily available to be categorized, processed, analyzed, and consumed by various groups within an organization.

Because data, both structured and unstructured, can be stored as is, there is no need to convert it to a predefined schema, and you no longer need to know in advance what questions will be asked of the data.
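The idea of storing data as is and deciding on structure only when a question arises can be illustrated with a minimal sketch. It is not an AWS implementation: the in-memory `raw_store` dictionary, the object keys, the sample records, and the function names are all hypothetical stand-ins for a central storage location.

```python
import csv
import io
import json

# Hypothetical in-memory "lake": raw objects kept exactly as they arrived,
# in different formats and with no shared, predefined schema.
raw_store = {
    "events/2024-01-01.jsonl": (
        '{"user": "ann", "amount": 12.5}\n'
        '{"user": "bob", "amount": 7.0, "coupon": "X1"}\n'
    ),
    "orders/2024-01-01.csv": "user,amount\nann,3.5\ncarol,20.0\n",
}


def read_records(key):
    """Parse a raw object at read time, based on its file extension."""
    body = raw_store[key]
    if key.endswith(".jsonl"):
        return [json.loads(line) for line in body.splitlines() if line.strip()]
    if key.endswith(".csv"):
        return list(csv.DictReader(io.StringIO(body)))
    raise ValueError(f"unsupported format: {key}")


def total_spend_by_user():
    """A question chosen after the data was stored: total amount per user."""
    totals = {}
    for key in raw_store:
        for record in read_records(key):
            user = record["user"]
            # CSV values arrive as strings, so convert when reading.
            totals[user] = totals.get(user, 0.0) + float(record["amount"])
    return totals
```

Neither source was reshaped when it was stored; the only structure imposed is the pair of fields (`user`, `amount`) that this particular question needs, and extra fields such as `coupon` are simply ignored until another question calls for them.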

