Extract - Raw data is copied or exported from source locations to a staging area during data extraction data mangement.
Transform - In this stage, raw data undergoes data processing. In this step, the data is processed and consolidated for its intended analytical use case.
Load - The converted data is moved from the staging area into the target data warehouse in this final stage.
An ETL pipeline (or data pipeline) is the system that allows ETL activities to take place. Data pipelines are a collection of tools and actions used to transfer data from one system with ites own manner of data storage and processing to another system where it might be kept and managed differently.
OLAP - Online Analytical Processing- is an approach to answer multi-dimensional analytical queries. Operates on Data Warehouse.
OLTP - Online Transaction Processing - facilitates and manages transaction oriented applications. Operates on Database.
CRUD - Create, Read, Update and Delete Operations are performed on the Database
OLTP Architecture - Database Server, Application Server, User Interface
OLTP (Online Transacti onal Processing) | OLAP (Online Analytical Processing) |
Current data | Historical data |
Day-to-day Transactional operations | Data analysis and decision making |
Normalised data structures | Star schema or Snow flake Schema |
Simple Queries | Complex Queries |
Requires Fast Response | Can have Longer Response Times |
Used by Front Line Employees, Managers | Analyst, Executives and Decision Makers |
Data in OLTP is processed | Data in OLAP is processed is periodically refreshed |
Used by Front Line Employees | Analyst, Executives and Decision Makers |
Oracle, MySQL, SQL Server, DB2 | Tableau, Power BI and SAP |
Star Schema
A star schema is a type of data modeling technique used in data warehousing to represent data in a structured way.
The fact table in a star schema contains the measures or metric
The dimension table in a star schema contain the descriptive
Snowflake Schema