layout flashcard-topic
title DP-900 - Azure Data Analytics
main_card_title Azure Data Analytics
main_card_bg #6586c3
card_bg #9aacd5
cards
title description
Data Analytics
Goal - Convert raw data to intelligence
title description
Data Analytics Approach
Ingest, Process, Store (data warehouse or a data lake), Analyze
title description
Data Ingestion
Capture raw data from various sources (stream or batch)
title description
Data Processing
Clean, filter, aggregate, and transform data to prepare for analysis
title description
Data Storage
Store data in a warehouse or lake for easy retrieval
title description
Data Querying
Run queries to analyze the data and gain insights
title description
Data Visualization
Create visualizations that help the business spot trends, outliers, and patterns in data
title description
Descriptive analytics
Based on historical/current data, monitor status and generate alerts.
title description
Diagnostic analytics
Take findings from descriptive analytics and dig deeper to understand why something is happening.
title description
Predictive analytics
Predict the probability of future outcomes based on historical data to mitigate risk and identify opportunities.
title description
Prescriptive analytics
Use insights from predictive analytics to recommend actions and make data-driven decisions.
title description
Cognitive analytics
Combine traditional analytics techniques with AI and ML features to build analytics tools that think like humans.
title description
Big Data - 3Vs
Volume, Variety, Velocity
title description
Data warehouse
Petabyte-scale storage with dedicated compute; data is stored after processing; may use specialized hardware. Example: Azure Synapse Analytics
title description
Data lake
Retains raw data, typically uses object storage, supports ad-hoc analysis - Azure Data Lake Storage Gen2
title description
Star Schema
Data warehouses organize data as Dimensions and Facts. De-normalized and easier to query.
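A minimal sketch of a star schema, using Python's built-in `sqlite3` module as a stand-in for a real warehouse. The table names (`fact_sales`, `dim_product`, `dim_date`) and the sample rows are hypothetical and only illustrate how one fact table joins to its dimensions.

```python
import sqlite3


def build_star_schema(conn):
    """Create one fact table and two dimension tables (hypothetical names)."""
    conn.executescript("""
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
        CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
        CREATE TABLE fact_sales (
            product_id INTEGER REFERENCES dim_product(product_id),
            date_id INTEGER REFERENCES dim_date(date_id),
            quantity INTEGER,
            amount REAL
        );
    """)
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Laptop", "Hardware"), (2, "Mouse", "Hardware"), (3, "Antivirus", "Software")],
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20240101, 2024, 1), (20240201, 2024, 2)],
    )
    # Each fact row holds measures plus keys that point at the dimensions.
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(1, 20240101, 2, 2000.0), (2, 20240101, 5, 100.0),
         (3, 20240201, 1, 50.0), (1, 20240201, 1, 1000.0)],
    )


def sales_by_category(conn):
    """Aggregate the fact table, grouped by an attribute of a dimension."""
    rows = conn.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
print(sales_by_category(conn))  # {'Hardware': 3100.0, 'Software': 50.0}
```

The query needs only one join per dimension, which is why this layout is usually described as easy to query.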
title description
Azure Synapse Analytics
End-to-end analytics solutions with SQL and Spark pools
title description
Azure Data Factory
Fully managed serverless service for ETL and data integration
title description
Power BI
Unify data and create BI reports & dashboards
title description
Azure HDInsight
Managed Apache Hadoop service on Azure (also runs other open-source frameworks such as Spark and Kafka)
title description
Azure Databricks
Managed Apache Spark service
title description
Massively Parallel Processing (MPP)
Split processing across multiple compute nodes - Spark, Azure Synapse Analytics, etc.
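The scatter-gather idea behind MPP can be sketched as follows. This is an illustration only: the threads here stand in for separate compute nodes, and the function names (`partition`, `node_partial_sum`, `mpp_sum`) are hypothetical rather than part of Spark or Azure Synapse Analytics.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, node_count):
    """Distribute rows round-robin across the simulated nodes."""
    return [rows[i::node_count] for i in range(node_count)]


def node_partial_sum(rows):
    """Work done independently on one node: aggregate its own slice."""
    return sum(rows)


def mpp_sum(rows, node_count=4):
    """Scatter the rows, aggregate each slice in parallel, then combine."""
    parts = partition(rows, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(node_partial_sum, parts))
    # The final step merges the per-node results into one answer.
    return sum(partials)


print(mpp_sum(list(range(1, 101))))  # 5050
```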
title description
Batch Pipelines
Buffer data and process it in groups. Typically read from storage (for example, Azure Data Lake Storage) and then process.
title description
Streaming Pipelines
Process data in real time, as each event arrives
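To contrast the two pipeline styles, here is a small sketch in plain Python. The helper names (`batches`, `batch_totals`, `streaming_totals`) and the sample readings are assumptions made for illustration; a real pipeline would read from storage or an event source instead of a list.

```python
from itertools import islice


def batches(events, size):
    """Buffer events into groups of at most `size` items."""
    it = iter(events)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def batch_totals(events, size):
    """Batch style: results appear only once a whole group is processed."""
    return [sum(chunk) for chunk in batches(events, size)]


def streaming_totals(events):
    """Streaming style: emit an updated result after every event."""
    total = 0
    for value in events:
        total += value
        yield total


readings = [3, 1, 4, 1, 5, 9]
print(batch_totals(readings, 4))         # [9, 14]
print(list(streaming_totals(readings)))  # [3, 4, 8, 9, 14, 23]
```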
title description
Apache Parquet
Open-source columnar storage format with high compression.
title description
ETL
Extract, Transform, and Load - Retrieve data, process it, and then store it
title description
ELT
Extract, Load, and Transform - Data is stored before it is transformed
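The difference in ordering between ETL and ELT can be sketched with an in-memory dictionary standing in for the data store. Everything here (the `RAW` sample lines, the `store` keys, and the function names) is hypothetical; it only shows where the transform step sits in each approach.

```python
RAW = [" alice ,30", "BOB,25", "  ,40"]  # hypothetical source records


def extract():
    """Retrieve the raw records from the source."""
    return list(RAW)


def transform(records):
    """Clean names, parse ages, and drop records with no name."""
    cleaned = []
    for line in records:
        name, age = line.split(",")
        name = name.strip().title()
        if name:
            cleaned.append({"name": name, "age": int(age)})
    return cleaned


def run_etl():
    """ETL: transform first, so only processed data reaches the store."""
    store = {}
    store["people"] = transform(extract())
    return store


def run_elt():
    """ELT: load the raw data first, then transform it inside the store."""
    store = {}
    store["raw_people"] = extract()
    store["people"] = transform(store["raw_people"])
    return store


print(run_etl())
print(run_elt())
```

Both runs end with the same cleaned `people` records; the ELT run also keeps the raw lines, which matches the data lake habit of retaining raw data.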