Data Warehousing on AWS

I. Overview:

Data Warehousing on AWS introduces you to concepts, strategies, and best practices for designing a cloud-based data warehousing solution using Amazon Redshift. The course demonstrates how to ingest, store, and transform data in the data warehouse. Topics covered include the purpose of Amazon Redshift, how Amazon Redshift addresses business and technical challenges, the features and capabilities of Amazon Redshift, designing a data warehousing solution on AWS by applying best practices based on the Well-Architected Framework, integration with AWS and non-AWS products and services, performance tuning, orchestration, and securing and monitoring Amazon Redshift.

II. Duration: 24 hours (3 days)
III. Objectives:
  • Describe Amazon Redshift architecture and its roles in a modern data architecture
  • Design and implement a data warehouse in the cloud using Amazon Redshift
  • Identify and load data into an Amazon Redshift data warehouse from a variety of sources
  • Analyze data using SQL notebooks in Amazon Redshift query editor v2
  • Design and implement a disaster recovery strategy for an Amazon Redshift data warehouse
  • Perform maintenance and performance tuning on an Amazon Redshift data warehouse
  • Secure and manage access to an Amazon Redshift data warehouse
  • Share data between multiple Redshift clusters in an organization
  • Orchestrate workflows in the data warehouse using AWS Step Functions state machines
  • Create an ML model and configure predictors using Amazon Redshift ML
IV. Intended Audience:
  • Data engineers
  • Data architects
  • Database architects
  • Database administrators
  • Database developers
V. Prerequisites:

Fundamentals of Analytics on AWS – Part 1 (digital course), Fundamentals of Analytics on AWS – Part 2 (digital course), Building Data Lakes on AWS (instructor-led training), Building Data Analytics Solutions Using Amazon Redshift (instructor-led training).

VI. Course outline:

1. Module 1: Data Warehouse Concepts

  • Overview
  • Modern Data Architecture
  • Introduction to the Course Story
  • Data Warehousing with Amazon Redshift
  • Amazon Redshift Serverless Architecture
  • Instructor Demonstration: Creating an Amazon Redshift Serverless Data Warehouse
  • Instructor Demonstration: Amazon Redshift Provisioned Architecture
  • Knowledge Check
  • Lab 1: Access and Examine Data with Amazon Redshift Serverless

2. Module 2: Setting Up Amazon Redshift

  • Overview
  • Data Models for Amazon Redshift
  • Instructor Demonstration: Amazon Redshift Data Model
  • Instructor Demonstration: Connecting to Amazon Redshift through AWS Secrets Manager
  • Data Management
  • Instructor Demonstration: Distribution Styles, Sort Keys, and Compression Encodings
  • Managing Permissions
  • Knowledge Check
  • Lab 2: Setting Up a Data Warehouse Using Amazon Redshift Serverless

3. Module 3: Loading Data

  • Overview
  • Setting the Context
  • Loading Data from Amazon S3
  • Instructor Demonstration: Load Validation and Troubleshooting
  • ETL and ELT
  • Loading Streaming Data
  • Loading Data from Relational Databases
  • Instructor Demonstration: Aurora Zero-ETL
  • Knowledge Check
  • Lab 3: Populating the Data Warehouse

4. Module 4: Deep Dive into SQL Query Editor v2 and Notebooks

  • Overview
  • Features of Amazon Redshift query editor v2
  • Instructor Demonstration: Using Amazon Redshift query editor v2
  • Advanced Queries
  • Instructor Demonstration: Aggregation Extensions
  • Knowledge Check
  • Lab 4: Data Wrangling for Amazon Redshift

5. Module 5: Disaster Recovery

  • Overview
  • Disaster Recovery
  • Instructor Demonstration: AWS Backup
  • Knowledge Check

6. Module 6: Amazon Redshift Performance Tuning

  • Overview
  • Amazon Redshift Performance Tuning
  • Table Maintenance and Materialized Views
  • Query Analysis
  • Workload Management
  • Amazon Redshift Monitoring
  • Knowledge Check
  • Lab 5: Performance Tuning the Data Warehouse

7. Module 7: Securing Amazon Redshift

  • Overview
  • Authentication with Amazon Redshift
  • Access Control with Amazon Redshift
  • Instructor Demonstration: Amazon Redshift Row-Level Security
  • Instructor Demonstration: Amazon Redshift Column-Level Security
  • Instructor Demonstration: Amazon Redshift Data Masking
  • Instructor Demonstration: Access Control with AWS Lake Formation
  • Data Encryption with Amazon Redshift
  • Auditing and Compliance with Amazon Redshift
  • Knowledge Check
  • Lab 6: Securing Amazon Redshift

8. Module 8: Orchestration

  • Overview
  • Overview of Data Orchestration
  • Orchestration with AWS Step Functions
  • Orchestration with Amazon Managed Workflows for Apache Airflow
  • Instructor Demonstration: Creating State Machines
  • Knowledge Check
  • Lab 7: Orchestrating the Data Warehouse Pipeline

9. Module 9: Amazon Redshift ML

  • Overview
  • Machine Learning Overview
  • Knowledge Check
  • Getting Started with Amazon Redshift ML
  • Amazon Redshift ML Workflow Scenarios
  • Amazon Redshift ML Usage
  • Knowledge Check
  • Lab 8: Predicting Ticket Sales with Amazon Redshift ML

10. Module 10: Amazon Redshift Data Sharing

  • Overview
  • Overview of Data Sharing in Amazon Redshift
  • Amazon DataZone for Data as a Service
  • Instructor Demonstration: Amazon DataZone
  • Knowledge Check
  • Challenge Lab: The Lifecycle of the Data Warehouse