Data Warehouse - A storage area used for analytical purposes. A data warehouse combines data from multiple sources and generally includes historical data. Its key characteristics are:
1. Non-Volatile (data is not modified once loaded)
2. Subject-Oriented
3. Integrated
4. Time-Variant
Extract - Raw data is copied or exported from source locations to a staging area.
Transform - In this stage, the raw data in the staging area is processed and consolidated for its intended analytical use case.
Load - In this final stage, the transformed data is moved from the staging area into the target data warehouse.
An ETL pipeline (or data pipeline) is the system that carries out these ETL activities. Data pipelines are a collection of tools and processes used to transfer data from one system, with its own manner of data storage and processing, to another system where it may be stored and managed differently.
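The three stages above can be sketched as a minimal pipeline. This is an illustrative toy, not a real ETL tool: the field names (`customer`, `amount`) and the in-memory "warehouse" list are assumptions made for the example.

```python
# Minimal ETL sketch: extract rows from a source, transform them in a
# staging area, then load them into a target store. All names are illustrative.

def extract(source_rows):
    """Copy raw records from the source into a staging area."""
    return list(source_rows)  # staging copy; the source is left untouched

def transform(staged):
    """Clean and consolidate the staged data for analytical use."""
    cleaned = []
    for row in staged:
        cleaned.append({
            "customer": row["customer"].strip().title(),  # normalise names
            "amount": round(float(row["amount"]), 2),     # normalise amounts
        })
    return cleaned

def load(target, rows):
    """Move the transformed rows into the target warehouse table."""
    target.extend(rows)
    return target

warehouse = []
source = [{"customer": "  alice  ", "amount": "19.999"}]
load(warehouse, transform(extract(source)))
print(warehouse)  # [{'customer': 'Alice', 'amount': 20.0}]
```

A real pipeline would read from databases or files and write to a warehouse, but the extract → transform → load hand-off is the same shape.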
OLAP - Online Analytical Processing - is an approach to answering multi-dimensional analytical queries. It operates on a data warehouse.
OLTP - Online Transaction Processing - facilitates and manages transaction-oriented applications. It operates on a database.
CRUD - Create, Read, Update and Delete operations are performed on the database.
OLTP Architecture - Database Server, Application Server, User Interface
| OLTP (Online Transaction Processing) | OLAP (Online Analytical Processing) |
|---|---|
| Current data | Historical data |
| Day-to-day transactional operations | Data analysis and decision making |
| Normalised data structures | Star schema or snowflake schema |
| Simple queries | Complex queries |
| Requires fast response times | Can tolerate longer response times |
| Used by front-line employees and managers | Used by analysts, executives and decision makers |
| Data is processed in real time | Data is periodically refreshed from OLTP sources |
| Used for running the business | Used for analysing the business |
| Provides a detailed, flat relational view of data | Provides summarized and consolidated data |
| Examples: Oracle, MySQL, SQL Server, DB2 | Examples: Tableau, Power BI and SAP |
| Example: database | Example: data warehouse |
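The contrast in the table can be seen in the shape of the queries themselves. A rough sketch using SQLite (the `orders` table and its columns are invented for the example): an OLTP-style lookup touches one current row, while an OLAP-style query aggregates across history.

```python
import sqlite3

# Contrast sketch: the same orders table queried in OLTP style
# (single-row lookup) and OLAP style (aggregate over many rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany("INSERT INTO orders (region, amount) VALUES (?, ?)",
                 [("North", 100.0), ("North", 250.0), ("South", 75.0)])

# OLTP: a simple transactional lookup of one current record.
row = conn.execute("SELECT amount FROM orders WHERE id = 1").fetchone()

# OLAP: an analytical aggregation across the whole table.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(row, totals)  # (100.0,) [('North', 350.0), ('South', 75.0)]
```

In practice the OLAP query would run against a warehouse fed by the OLTP system, not against the live transactional table.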
In database testing, the focus is on tables, relationships, columns and datatypes.
In ETL testing, the focus is on the data itself.
Data Model
1. Conceptual Data Model - High-level entities and their relationships are defined
2. Logical Data Model - Entities, attributes, primary keys and foreign keys are defined
3. Physical Data Model - Tables, columns and their datatypes are defined
| Feature | Conceptual | Logical | Physical |
|---|---|---|---|
| Entity Names | ✓ | ✓ | |
| Entity Relationships | ✓ | ✓ | |
| Attributes | | ✓ | |
| Primary Keys | | ✓ | ✓ |
| Foreign Keys | | ✓ | ✓ |
| Table Names | | | ✓ |
| Column Names | | | ✓ |
| Column Data Types | | | ✓ |
Types of Schema - Schema is a collection of entities and their relationships
Star Schema
A star schema is a data modeling technique used in data warehousing to represent data in a structured way. It contains a fact table in the middle with dimension tables around it.
The fact table in a star schema contains the measures or metrics and holds foreign-key relationships to the dimension tables. Its primary key is typically a composite of those foreign keys.
The dimension tables in a star schema contain the descriptive attributes that give context to the measures in the fact table.
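A small star schema can be sketched with SQLite DDL. The table and column names (`fact_sales`, `dim_date`, `dim_product`) are illustrative, but they show the key points: measures live in the fact table, descriptive attributes live in the dimensions, and the fact table's primary key is a composite of its dimension foreign keys.

```python
import sqlite3

# Star schema sketch: a central fact table whose foreign keys
# point outward at the surrounding dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, full_date TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE fact_sales (
    date_id    INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    units_sold INTEGER,  -- measure
    revenue    REAL,     -- measure
    PRIMARY KEY (date_id, product_id)  -- composite of the dimension FKs
);
""")
conn.execute("INSERT INTO dim_date VALUES (1, '2024-01-01')")
conn.execute("INSERT INTO dim_product VALUES (10, 'Widget', 'Hardware')")
conn.execute("INSERT INTO fact_sales VALUES (1, 10, 5, 49.95)")

# Analytical query: join the fact to a dimension and aggregate a measure.
result = conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales f JOIN dim_product p ON f.product_id = p.product_id
    GROUP BY p.category
""").fetchall()
print(result)  # [('Hardware', 49.95)]
```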
Snowflake Schema - An extension of the star schema in which the dimension tables are further normalised into multiple related tables.
Slowly Changing Dimensions (SCD) - Dimension attributes whose values change slowly over time. Common handling strategies are Type 1 (overwrite the old value), Type 2 (add a new versioned row and keep the history) and Type 3 (store the previous value in an extra column).
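A Type 2 change can be sketched as follows. This is a simplified in-memory illustration, not a warehouse implementation; the field names (`customer_id`, `city`, `is_current`, the date columns) are assumptions for the example.

```python
# SCD Type 2 sketch: instead of overwriting a changed attribute,
# expire the current dimension row and append a new current row.

def scd_type2_update(dim_rows, key, new_city, effective_date):
    """Expire the current row for `key` and add a new versioned row."""
    for row in dim_rows:
        if row["customer_id"] == key and row["is_current"]:
            row["is_current"] = False          # close out the old version
            row["end_date"] = effective_date
    dim_rows.append({                          # new current version
        "customer_id": key,
        "city": new_city,
        "start_date": effective_date,
        "end_date": None,
        "is_current": True,
    })
    return dim_rows

dim_customer = [{"customer_id": 1, "city": "Chennai",
                 "start_date": "2023-01-01", "end_date": None, "is_current": True}]
scd_type2_update(dim_customer, 1, "Bengaluru", "2024-06-01")
print(len(dim_customer))  # 2: one expired row plus one current row
```

Because the old row is kept, historical facts can still be joined to the city that was valid when they occurred.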
ETL Tools - Examples include Informatica PowerCenter, Talend, SSIS and DataStage.
Types of Data Load in ETL Testing - Full Load (the entire dataset is loaded into the target) and Incremental Load (only records that are new or changed since the last load are loaded).
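An incremental load is often driven by a watermark, such as the highest id or timestamp already loaded. A minimal sketch, assuming an integer `id` watermark (the field name and mechanism are illustrative):

```python
# Incremental load sketch: load only source rows newer than the watermark.

def incremental_load(target, source, last_loaded_id):
    """Append only source rows with an id greater than the watermark."""
    new_rows = [r for r in source if r["id"] > last_loaded_id]
    target.extend(new_rows)
    # advance the watermark to the highest id now in the target
    return max((r["id"] for r in target), default=last_loaded_id)

warehouse = [{"id": 1}, {"id": 2}]                       # already loaded
source = [{"id": 1}, {"id": 2}, {"id": 3}, {"id": 4}]    # full source extract
watermark = incremental_load(warehouse, source, last_loaded_id=2)
print(len(warehouse), watermark)  # 4 4
```

A full load would simply replace or reload the whole target instead of filtering by the watermark.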
Masking - Hiding or encrypting sensitive data is called masking.
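Two common flavours of masking can be sketched briefly: static masking, which blanks out part of a value, and pseudonymization, which replaces a value with a consistent one-way token. The function names and the salt value are illustrative.

```python
import hashlib

# Masking sketch: hide sensitive values before data reaches a
# test or analytics environment.

def mask_card(number):
    """Static masking: keep only the last four digits visible."""
    return "*" * (len(number) - 4) + number[-4:]

def pseudonymize(email, salt="demo-salt"):  # salt value is illustrative
    """One-way hashing: the same input always maps to the same token."""
    return hashlib.sha256((salt + email).encode()).hexdigest()[:12]

print(mask_card("4111111111111111"))  # ************1111
```

Pseudonymization preserves joins (the same email always yields the same token) while keeping the real value out of the masked dataset.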