# Extract Transform Load Explained

![rw-book-cover](https://a0.awsstatic.com/libra-css/images/logos/aws_logo_smile_1200x630.png)

## Metadata
- Author: [[aws.amazon.com]]
- Full Title: Extract Transform Load Explained
- Category: #articles
- Summary: ETL (Extract, Transform, Load) tools prepare raw data for data warehouses by transforming it in various ways. This includes basic tasks like cleaning and deduplicating data, as well as advanced transformations such as joining and summarizing data. Overall, ETL helps improve data quality and makes analysis easier.
- URL: https://aws.amazon.com/what-is/etl/

## Highlights
- Extract, transform, and load (ETL) is the process of combining data from multiple sources into a large, central repository called a data warehouse. ETL uses a set of business rules to clean and organize raw data and prepare it for [storage](https://aws.amazon.com/what-is/cloud-storage/), [data analytics](https://aws.amazon.com/what-is/data-analytics/), and [machine learning (ML)](https://aws.amazon.com/what-is/machine-learning/). ([View Highlight](https://read.readwise.io/read/01jxbe9mmzzaf9rrw2s7jtttqm))
- A [data warehouse](https://aws.amazon.com/what-is/data-warehouse/) is a central repository that can store multiple databases. Within each database, you can organize your data into tables and columns that describe the data types in the table. The data warehouse software works across multiple types of storage hardware, such as solid state drives (SSDs), hard drives, and other cloud storage, to optimize your data processing. ([View Highlight](https://read.readwise.io/read/01jxbed2j9jv0mhsem12m4mvce))
- With a [data lake](https://aws.amazon.com/what-is/data-lake/), you can store your structured and unstructured data in one centralized repository and at any scale. You can store data as is without having to first structure it based on questions you might have in the future. Data lakes also allow you to run different types of analytics on your data, like SQL queries, big data analytics, full-text search, real-time analytics, and machine learning (ML) to guide better decisions. ([View Highlight](https://read.readwise.io/read/01jxbedhesddwc3k3q1xzt3wxd))
- The ETL process works in three steps:
    1. Extract the relevant data from the source database
    2. Transform the data so that it is better suited for analytics
    3. Load the data into the target database

  ([View Highlight](https://read.readwise.io/read/01jxbee6cb84swr7bs60fn8pgc))
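
The three ETL steps in the last highlight can be sketched in a few lines of Python. This is a minimal illustration, not the AWS tooling the article describes: two in-memory SQLite databases stand in for the source and target databases, and the table names (`raw_orders`, `customer_totals`), the sample rows, and the function names are all hypothetical. The transform step applies the kinds of rules named in the summary: cleaning, deduplicating, and summarizing.

```python
import sqlite3


def extract(source):
    """Step 1: pull the relevant rows from the source database."""
    return source.execute(
        "SELECT order_id, customer, amount FROM raw_orders"
    ).fetchall()


def transform(rows):
    """Step 2: clean, deduplicate, and summarize the raw rows."""
    seen_ids = set()
    totals = {}
    for order_id, customer, amount in rows:
        if customer is None or amount is None:
            continue  # cleaning: drop rows with missing values
        if order_id in seen_ids:
            continue  # deduplication: keep the first copy of each order
        seen_ids.add(order_id)
        name = customer.strip().lower()  # cleaning: normalize names
        totals[name] = totals.get(name, 0.0) + float(amount)  # summarizing
    return sorted(totals.items())


def load(target, rows):
    """Step 3: write the transformed rows into the target database."""
    target.execute(
        "CREATE TABLE IF NOT EXISTS customer_totals "
        "(customer TEXT PRIMARY KEY, total REAL)"
    )
    target.executemany(
        "INSERT OR REPLACE INTO customer_totals VALUES (?, ?)", rows
    )
    target.commit()


def run_demo():
    # Hypothetical source database with messy sample data.
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE raw_orders (order_id INTEGER, customer TEXT, amount TEXT)"
    )
    source.executemany(
        "INSERT INTO raw_orders VALUES (?, ?, ?)",
        [
            (1, " Alice ", "10.50"),
            (1, " Alice ", "10.50"),  # duplicate of order 1
            (2, "bob", "4.00"),
            (3, "ALICE", "2.25"),
            (4, None, "9.99"),  # missing customer
        ],
    )
    # Hypothetical target database standing in for the warehouse.
    target = sqlite3.connect(":memory:")
    load(target, transform(extract(source)))
    return target.execute(
        "SELECT customer, total FROM customer_totals ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    print(run_demo())
```

Keeping each step in its own function mirrors the extract, transform, load order of the highlight and lets the transform rules be tested without touching either database.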