Data Warehouse Schemas, ETL and Reporting Tools

Last updated on Sep 27 2021
Jatin Gupta


A schema is a logical description of a database in which fact and dimension tables are joined in a logical manner. A data warehouse is maintained in the form of a Star, Snowflake, or Fact Constellation schema.

Star Schema

A Star schema contains one fact table and multiple dimension tables. Each dimension is represented by a single dimension table, and these tables are not normalized. Each dimension table contains a set of attributes.

Characteristics

• In a Star schema, there is only one fact table and multiple dimension tables.
• In a Star schema, each dimension is represented by a single dimension table.
• Dimension tables are not normalized in a Star schema.
• Each dimension table is joined to the fact table through a foreign key.

The following illustration shows the sales data of a company with respect to the four dimensions, namely Time, Item, Branch, and Location.


There is a fact table at the center. It contains the keys to each of the four dimensions. The fact table also contains the measures, namely dollars sold and units sold.

Note − Each dimension has only a single dimension table, and each table holds a set of attributes. For example, the location dimension table contains the attribute set {location_key, street, city, province_or_state, country}. This constraint may cause data redundancy.

For example − “Vancouver” and “Victoria” are both cities in the Canadian province of British Columbia. The entries for such cities cause data redundancy in the attributes province_or_state and country.
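As a sketch of the idea, the star schema above can be expressed in SQL (run here through Python's built-in sqlite3; the table layouts follow the sales example in the text, but the exact column types and sample rows are illustrative assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.executescript("""
-- One denormalized dimension table per dimension.
CREATE TABLE dim_time (time_key INTEGER PRIMARY KEY, day TEXT, month TEXT, year INTEGER);
CREATE TABLE dim_item (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT);
CREATE TABLE dim_branch (branch_key INTEGER PRIMARY KEY, branch_name TEXT);
CREATE TABLE dim_location (
    location_key INTEGER PRIMARY KEY,
    street TEXT, city TEXT, province_or_state TEXT, country TEXT
);

-- The central fact table holds one foreign key per dimension plus the measures.
CREATE TABLE fact_sales (
    time_key INTEGER REFERENCES dim_time(time_key),
    item_key INTEGER REFERENCES dim_item(item_key),
    branch_key INTEGER REFERENCES dim_branch(branch_key),
    location_key INTEGER REFERENCES dim_location(location_key),
    dollars_sold REAL,
    units_sold INTEGER
);
""")

# The redundancy described above: Vancouver and Victoria both repeat
# the province and country values in the denormalized dimension.
cur.executemany(
    "INSERT INTO dim_location VALUES (?, ?, ?, ?, ?)",
    [(1, "1 Main St", "Vancouver", "British Columbia", "Canada"),
     (2, "2 King St", "Victoria", "British Columbia", "Canada")],
)
rows = cur.execute(
    "SELECT COUNT(*), COUNT(DISTINCT province_or_state) FROM dim_location"
).fetchone()
# Two rows, but only one distinct province: the redundancy is visible.
```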

Snowflake Schema

Some dimension tables in the Snowflake schema are normalized. The normalization splits up the data into additional tables as shown in the following illustration.


Unlike in the Star schema, the dimension tables in a Snowflake schema are normalized.

For example − The item dimension table of the star schema is normalized and split into two dimension tables, namely the item and supplier tables. The item dimension table now contains the attributes item_key, item_name, type, brand, and supplier_key.
The supplier_key is linked to the supplier dimension table, which contains the attributes supplier_key and supplier_type.

Note − Due to the normalization in the Snowflake schema, redundancy is reduced; the schema therefore becomes easier to maintain and saves storage space.
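The split described above can be sketched in plain Python: starting from a denormalized item dimension, the supplier attribute is factored out into its own table and replaced by a supplier_key. The attribute names follow the example in the text; the sample rows are made up for illustration.

```python
# Denormalized item dimension, as it would appear in a star schema.
star_item_dim = [
    {"item_key": 1, "item_name": "Laptop", "type": "electronics",
     "brand": "Acme", "supplier_type": "wholesale"},
    {"item_key": 2, "item_name": "Monitor", "type": "electronics",
     "brand": "Acme", "supplier_type": "wholesale"},
    {"item_key": 3, "item_name": "Desk", "type": "furniture",
     "brand": "Oak Co", "supplier_type": "retail"},
]

# Normalize: give each distinct supplier_type one row in a supplier
# table, then replace the repeated attribute with a foreign key.
supplier_dim, supplier_keys, item_dim = [], {}, []
for row in star_item_dim:
    stype = row["supplier_type"]
    if stype not in supplier_keys:
        supplier_keys[stype] = len(supplier_keys) + 1
        supplier_dim.append(
            {"supplier_key": supplier_keys[stype], "supplier_type": stype})
    item_dim.append({"item_key": row["item_key"],
                     "item_name": row["item_name"],
                     "type": row["type"],
                     "brand": row["brand"],
                     "supplier_key": supplier_keys[stype]})
```

The repeated "wholesale" value is now stored once in the supplier table, which is the redundancy saving the note describes.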

Fact Constellation Schema (Galaxy Schema)

A fact constellation has multiple fact tables. It is also known as a Galaxy Schema.
The following illustration shows two fact tables, namely Sales and Shipping −


The sales fact table is the same as the one in the Star schema. The shipping fact table has five dimension keys, namely item_key, time_key, shipper_key, from_location, and to_location. The shipping fact table also contains two measures, namely dollars_cost and units_shipped. It is also possible to share dimension tables between fact tables.

For example − Time, item, and location dimension tables are shared between the sales and shipping fact table.
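The sharing of dimension tables between fact tables can be sketched as follows (SQLite via Python again; only the item dimension is shown, and column choices are illustrative):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE dim_item (item_key INTEGER PRIMARY KEY, item_name TEXT);

-- Two fact tables in one constellation, both referencing the same dimension.
CREATE TABLE fact_sales (item_key INTEGER, dollars_sold REAL);
CREATE TABLE fact_shipping (item_key INTEGER, from_location TEXT, to_location TEXT);
""")
db.execute("INSERT INTO dim_item VALUES (1, 'Laptop')")
db.execute("INSERT INTO fact_sales VALUES (1, 999.0)")
db.execute("INSERT INTO fact_shipping VALUES (1, 'Vancouver', 'Toronto')")

# The same dimension row serves both fact tables.
sold = db.execute("""SELECT d.item_name, f.dollars_sold
                     FROM fact_sales f JOIN dim_item d USING (item_key)""").fetchone()
shipped = db.execute("""SELECT d.item_name, f.to_location
                        FROM fact_shipping f JOIN dim_item d USING (item_key)""").fetchone()
```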

Data Warehouse – ETL & Reporting Tools

An ETL tool extracts data from heterogeneous data sources, transforms the data (by applying calculations, joining fields and keys, removing incorrect data fields, and so on), and loads it into a data warehouse.

Extraction

A staging area is required during the ETL load, for several reasons. Source systems are available for extraction only during specific time windows, which are usually shorter than the total data-load time. A staging area therefore allows you to extract data from a source system and hold it before the time slot ends.

The staging area is also needed when you want to bring data from multiple sources together, or when you want to join two or more systems.

For example − You cannot run a single SQL query that joins two tables residing in two physically different databases.

The extraction time slots of different systems vary with their time zones and operational hours. The data extracted from the source systems can be reused by multiple data warehouse systems, operational data stores, and so on.
ETL also lets you perform complex transformations, which requires extra space to store the data.
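As a minimal illustration of why staging helps, the sketch below extracts rows from two physically separate source systems (simulated here as two independent SQLite connections) into a shared staging area, where a join that no single SQL query could run across both sources becomes possible. All system, table, and column names are hypothetical.

```python
import sqlite3

# Two independent source systems; one SQL query cannot join across them.
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
crm.execute("INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob')")

erp = sqlite3.connect(":memory:")
erp.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
erp.execute("INSERT INTO orders VALUES (1, 99.5), (1, 20.0), (2, 5.0)")

# Extract each source into the staging area within its availability window.
staging = {
    "customers": crm.execute("SELECT id, name FROM customers").fetchall(),
    "orders": erp.execute("SELECT customer_id, amount FROM orders").fetchall(),
}

# Once both extracts are staged, the cross-system join is straightforward.
names = dict(staging["customers"])
totals = {}
for customer_id, amount in staging["orders"]:
    totals[names[customer_id]] = totals.get(names[customer_id], 0.0) + amount
```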


Transform

In data transformation, you apply a set of functions to the extracted data to load it into the target system. Data that does not require any transformation is known as a direct move, or pass-through data.

You can apply different transformations to the data extracted from the source system. For example, you can perform customized calculations: if you want the sum of sales revenue and it is not stored in the database, you can apply a SUM aggregation during transformation and load the result.

For example − If a table stores the first name and the last name in separate columns, you can concatenate them before loading.
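Both transformations mentioned above, a derived sum of sales and a name concatenation, can be sketched in a few lines of Python (the column names and sample rows are hypothetical):

```python
# Rows as they arrive from the extraction step.
extracted = [
    {"first_name": "Ada", "last_name": "Lovelace", "sales": 120.0},
    {"first_name": "Alan", "last_name": "Turing", "sales": 80.0},
]

# Concatenate first and last name into a single column before loading.
transformed = [
    {"full_name": f"{r['first_name']} {r['last_name']}", "sales": r["sales"]}
    for r in extracted
]

# A customized calculation: sum of sales revenue, not stored in the source.
sum_of_sales = sum(r["sales"] for r in transformed)
```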

Load

During the Load phase, data is loaded into the end target, which can be a flat file or a data warehouse system.
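A minimal sketch of the load step, writing the same transformed rows to both kinds of target named above: a flat file (an in-memory CSV here) and a warehouse table (SQLite stands in for the warehouse; all names are illustrative).

```python
import csv
import io
import sqlite3

rows = [("Ada Lovelace", 120.0), ("Alan Turing", 80.0)]

# Target 1: a flat file.
flat_file = io.StringIO()
writer = csv.writer(flat_file)
writer.writerow(["full_name", "sales"])
writer.writerows(rows)

# Target 2: a data warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE fact_sales (full_name TEXT, sales REAL)")
warehouse.executemany("INSERT INTO fact_sales VALUES (?, ?)", rows)
loaded = warehouse.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]
```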

BI Reporting Tool

BI (Business Intelligence) tools let business users create basic, medium, and complex reports from the transactional data in a data warehouse, typically by first creating Universes with the Information Design Tool (IDT) or Universe Design Tool (UDT). Various SAP and non-SAP data sources can be used to create reports.

There are quite a few BI reporting, dashboard, and data visualization tools available in the market, including the following −

• SAP Business Objects Web Intelligence (WebI)
• Crystal Reports
• SAP Lumira
• Dashboard Designer
• IBM Cognos
• Microsoft BI Platform
• Tableau Business Intelligence
• JasperSoft
• Oracle BI (OBIEE)
• Pentaho
• QlikView
• SAP BW
• SAS Business Intelligence
• Necto
• Tibco Spotfire

So, this brings us to the end of the blog. This Tecklearn ‘Data Warehouse Schemas, ETL and Reporting Tools’ blog helps you with commonly asked questions if you are looking for a job in Cognos BI. If you wish to learn Cognos Analytics and build a career in the Business Intelligence domain, check out our interactive IBM Cognos Analytics Training, which comes with 24*7 support to guide you throughout your learning period. Please find the link for course details:

IBM Cognos Analytics 11 Training


About the Course

IBM Cognos 11 training provides in-depth knowledge of how to build complex and advanced reports and dashboards using the IBM Cognos reporting tool, enabling you to become a Cognos developer. Major topics of this Cognos reports and dashboards training include Cognos architecture, the components of Cognos, Analysis Studio, OLTP systems, Query Studio, advanced report design, Cognos Framework Manager, and Cognos Administration. By the end of this training, you will have gained insight into the IBM Cognos reporting tool by implementing real-time Cognos projects, and you will be able to apply this knowledge in real-world environments.

Why Should you take IBM Cognos Analytics 11 Training?

• The average Cognos developer salary is $78,417, or an equivalent hourly rate of $38 (source: Erieri.com).
• Cognos has a market share of around 9% globally.
• Cognos Analytics V11 enhances the efficiency and capabilities of business users, report authors, and administrators alike through its simplified, graduated user experience.

What you will Learn in this Course?

Overview of IBM Cognos BI and Cognos BI Architecture

• Overview of IBM Cognos Analytics 11
• IBM Cognos Framework Manager Modelling
• New UI features of Cognos Analytics 11

Cognos Analytics 11: Data Modules

• Overview of Cognos Analytics 11 Data Modules
• Creation of Data Server Connection
• Using Data Modules to model data without using the Framework Manager
• Data Discovery through intent-based modelling
• Creating Data modules with inputs from various sources

Working with Dashboards

• Process of Dashboarding in IBM Cognos
• Creating a Dashboard
• Filter Data in Dashboard
• Widget-to-Widget Communication
• Sorting, Grouping and Calculating Data
• Storytelling

Cognos Environment

• Active Report
• New Visualization Types
• Sharing and Embedding Contents

Cognos Reports

• Cognos Report Studio
• Reporting using Filters, Formatting, Charts
• Advanced Report-Building Techniques
• Cognos Active Reports
• Geospatial Analytics

Cognos Administration

• Deep Dive into Cognos Administration
• Analysing and Administering Queries

Dimension Modelling

• New Features in Smart Modelling
• Transformer Cube and Drill through
• Dynamic Cube and Drill Through
