Data Modeling Concept in Datawarehouse

Data Modeling Concept in Datawarehouse :  


Data Modeling is about representing the real world set of data structures or entities and their relationships in the form of Data Models, required for a database. Simply put, data model is a visual representation of the database.
Data Modeling consists of various types and phases like
1.conceptual data modeling,
2.logical data modeling,
3.physical data modeling,
4.enterprise data modeling,
5.relational data modeling and
6.dimensional data modeling.

Conceptual Data Modeling:
Conceptual Data Modeling visualizes the overall structure of the database and provides high-level information about the subject areas or data structures of an organization and it does not contain much detailed level of information about attributes.
Logical Data Modeling: Logical Data Modeling is an extension to Conceptual Data Modeling and its includes almost all of the entities, attributes and their relationship. A logical Data Model will not contain any attribute specific information like type, length etc., instead it defines and conveys business information and rules.

Physical Data Modeling: Physical Data Model includes data structures like tables, columns, properties and the relationship between them.

Enterprise Data Modeling:
Enterprise Data Modeling is known as global business model as it consolidates the information across the enterprise.

Relational Data Modeling:
Relational Data Model revolves around Entity-Relationship Modeling where entities(tables) are normalized to avoid possible redundancy and this type is the prefered technique in OLTP.
Dimensional Modeling:
Dimensional Model comprises a fact table and many dimension tables and is used for calculating summarized data. Since Business Intelligence reports are used in measuring the facts(aggregates) across multiple dimensions, dimensional data modeling is the prefered modeling technique in a BI environment. A Fact table contains various measures or facts like sales amount, loan amount etc., whereas a Dimension table describes the particular entity like time, state etc., based on which the required facts are measured.

Star Schema:
In a Dimensional Data Model, a fact table is the centralized table which is connected to multiple dimensions related to that fact. This type of approach is known as the Star Schema design based on which data warehouse and data marts are built. Since BI takes advantage of data displayed in the form of mutli-dimensional cubes, this star schema approach helps analyze and produce complex reports very easy by slicing and dicing the dimensions of interest. From the sample diagram shown below, the required Fact 'Loan Amount' can be calculated across various dimensions like state, branch, time, product, loan officer and investor dimensions.
Sample Star Schema Diagram
Star Schema
ETL Process:
ETL, an acronym for 'Extraction, Transformation and Loading' is a collection of processes associated with extracting the source data, transforming that data and finally loading that data into a data warehouse. Before loading the required into data warehouse, it should be transformed in order to meet the needs of the data warehouse. This transformation involves several processes like data cleansing, data profiling, data type conversion, validating for referential integrity, performing aggregation if needed, denormalization and normalization.
Data Warehouse:
Data warehouse is a centralized repository where all the information for analysis is kept in an organization. This is the data collected from variant sources for the purpose of analytical processing and reporting. This data is non-volatile and a data warehouse is built on a dimensional data model. From this data warehouse, data can be extracted for reporting needs with the help of query tools or many data marts can be built based on subject area requirements.
Data Mart:
Data Mart is subject oriented, basically a sub-set of data warehouse, built for the purpose of analyzing a particular line of business or department. It holds the data specific to a particular subject area like sales, purchase etc. Data marts can be of derived from a data warehouse or built for the sole purpose of BI directly from the source and like data warehouse, data marts are also constructed from dimensional data models.
What is OLAP?
OLAP, an acronym for 'Online Analytical Processing' is a technique by which the data sourced from a data warehouse or data mart is visualized and summarized to provide perspective multidimensional view across multiple dimensions. Generally OLAP refers to OLAP Tools(e.g Cognos, Business Objects etc.,) that help to accomplish these tasks. Since data warehouse is designed using a dimensional data model, data is represented in the form of data cubes enabling us to aggregate facts, slice and dice across several dimensions. OLAP tools provide options to drill-down the data from one hierarchy to another hierarchy.
For example sales amount can be calculated for a particular year or it can be drilled down to its next hierarchies like month, week, day etc. In the same way, data can be rolled up for summarization from product to product group, product group to product sub-class then from product sub-class to product class. Thus with this cube structure, data can be viewed from multiple points providing the data analysts, a greater insight into data.
There are many OLAP hybrids or variants like MOLAP(Multidimensional OLAP), HOLAP(Hybrid OLAP), ROLAP(Relational OLAP), DOLAP(Desktop OLAP or Database OLAP) available in the market and can be used depending on the needs and requirements of an organization
OLAP - Examples:
Topmost executives of an organization are really interested in aggregated facts or numbers to take decisions rather than querying several databases (that are normalized) to get the data and do the comparison by themselves. OLAP tools visualize the data in an understandable format, like in the form of Scorecards and Dashboards with Key Performance Indicators enabling managers to monitor and take immediate actions. In todays business life, OLAP plays a vital role by assisting decision makers in the field of banking and finance, hospitals, insurance, manufacturing, pharmaceuticals etc., to measure facts across geography, demography, product, and sales.
OLAP can be performed in data warehouses that undergo frequent updates and that do not. Following are some of the examples to show how OLAP solves complex queries involving facts to be measured across company’s best-interested dimensions.
  • Comparison of sales (fact) of a product (dimension) over years (dimension) in the same region (dimension).
  • How may members (fact) have opened a savings account (dimension), in USA branch (dimension), over a period (dimension)?
  • How many mortgage loans (fact) have been approved in fixed mortgage (dimension) or Adjustable Rate Mortgage (dimension) in New York City (dimension), over a period (dimension)?
  • What is the total sales value (fact) of a particular product (dimension) in a particular grocery store (dimension), over a period (dimension)?
  • What is the amount spent (fact) for a particular product promotion (dimension) in a particular branch (dimension) or in a particular city (dimension), over a period (dimension)?
What is Data Mining?
Data Mining is a set of processes related to analyzing and discovering useful, actionable knowledge buried deep beneath large volumes of data stores or data sets. This knowledge discovery involves finding patterns or behaviors within the data that lead to some profitable business action. Data Mining requires generally large volumes of data including history data as well as current data to explore the knowledge. Once the required amount of data has been accumulated from various sources, it is cleaned, validated and prepared for storing it in the data warehouse or data mart. BI reporting Tools capture the required facts from these data to be used by the knowledge discovery process. Data Mining can be accomplished by utilizing one or more of the traditional knowledge discovery techniques like Market Basket Analysis, Clustering, Memory Based Reasoning, Link Analysis, Neural Networks and so on.
Data Mining Life Cycle:
  • Find out the Business Problem: Consider a company's current year sales dropped by a percentage when compared to the previous year. By using OLAP Tools, the exact sales fact can be determined across several dimensions like region, time etc.
  • Knowledge Discovery: Given this business problem, various reasons for the decrease in sales have to be analyzed utilizing one or more of the Data Mining Techniques. Causes may include poor quality or service of the product or flaws in marketing schemes or less demand for the product or seasonal changes or regulations enforced by the Government or competitors pressure, and so on. The exact solutions have to be found out in order to resolve this sales drop, which we call it as the Knowledge Discovery here.
  • Implement the Knowledge: Based on above discovery, proper actions should be taken in order to overcome the business problem.
  • Analyze the Results: Once it is been implemented, results need to be monitored and measured to find out outcomes of that action.
OLAP vs Data Mining:
OLAP helps organizations to find out the measures like sales drop, productivity, service response time, inventory in hand etc. Simply, OLAP tell us 'What has happened' and Data Mining helps to find out 'Why it has happened' at the first place. Data Mining can also be used to predict 'What will happen in the future' with the help of data patterns available within the organization and publicly available data.
For example if a borrower with bad credit and employment history apllies for a mortgage loan, his/her application may be denied by a mortgage lender since he/she may default the loan if approved. The mortgage lender would have come to this decision based upon the historical data previously mined following a similar pattern.
Business Intelligence Tool Guide:
Business Intelligence Tools assist organizations to improve their overall performance by helping them to plan, track, monitor, analyze and report the business activities. These tools improve customer relationship management thereby increasing company's profitability significantly. Below are the few guidelines which may be of help while working with BI tools.
·                            How to install and setup the BI software?
·                            How to get license and training from BI software vendors?
·                            How to create users, administrators and assign privileges to users?
·                            How to connect to the different database servers from BI applications?
·                            How to understand and work on BI data models or universe?
·                            How to frame the select statement according to the business requirements?
o                                             How to select the tables that have to be used in the report?
o                                             How to select the columns that are required for reporting?
o                                             How to write the join condition to join (inner join, outer join, equi join) different tables in select statement?
o                                             How to write multiple select statements in a single report?
o                                             How to write the filters (null, in, equal to, greater than) that are required after the where clause?
o                                             How to create dimensions and facts?
o                                             How to drill up and drill down?
o                                             How to set user prompts for user to enter values?
o                                             How to process the query and retrieve the results?
·                            How to work on results?
o                                             How to modify field formats?
o                                             How to sort data?
o                                             How to create computer items like date functions, numeric and string functions?
·                            How to create pivots?
o                                             How to add data?
o                                             How to create totals?
o                                             How to group data?
·                            How to create charts?
·                            How to create reports?
o                                             How to work on reporting body?
o                                             How to work on report group headers?
o                                             How to work on report header/footer?
o                                             How to work on page header/footer?
o                                             How to design the report layout?
o                                             How to use page breaks?
·                            How to schedule, monitor, modify, delete, and refresh a job(report)?
·                            How to write report design document, /report testing document, test reports and get user acceptance?
·                            How to distribute reports and results via email, printers, intranet server, and web?
·                            How to export and import data?
·                            How to track on scorecards, balancing scorecards, forecasting, key performance indicators and dashboards?
Business Intelligence & Key Performance Indicators:
Key Performance Indicators, commonly referred to as KPIs, are a list of measurements that are identified as critical factors in achieving the organizational goals or mission. KPIs are often identified in a business to help them drive a business towards its success and are associated with a number of business activities like Customer Relationship Management(CRM), Supply Chain Analytics or any other activity that is happening within the organization.
Requirements of a good KPI:
There can be a number of factors related with the success of a company; All of these factors cannot be chosen as the indicators; Only those that are mission critical, strictly adhering to the organizational goals and accurately measurable should be selected as the company's KPIs. It is always better to keep the number of KPIs to a minimum to make sure that greater focus can be given to each of these indicators. So the important factors to be considered in selecting a KPI are as follows:
  • Measurable: A KPI should be quantifiable in terms of numbers.
  • Reflect the organizational Goals: A KPI should drive a business towards success.
  • Actionable: It should help the managers to initiate some business action as a result of all the analysis and measures lead by KPI.
Examples of KPIs:
A KPI may reflect regional sales by sales person, supply chain statistics by supplier, productivity by units, customer satisfaction, customer growth or it may reflect employee turnover. In either case, it should give a high-level, real time information to the top level managers enabling them to concentrate in the company's success
What is a Dashboard in Business terms?
A Business Intelligence Dashboard visually represents the key organizational performance data in a near real time, user friendly manner that can be understood instantaneously. Technically speaking, a Dashboard is a visual representation that reflects the Key Performance Indicators(KPIs) of interest for managerial review and not only that it enables them to drill-down further. Business Intelligence Dashboard is similar in function to a car dashboard in that it displays and provides access to the powerful analytical systems and key performance metrics in a form enabling business executives to analyze trends and more effectively manage their areas of responsibility.
Features of Dashboard:
A typical web based Business Intelligence Dashboard encompasses the following features:
  • Web based Interface: Managers can gain broad visibility into the real-time key measurements of a business with the help of this multi-window, intuitive and interactive interface.
  • Role Based View: Executives can clearly track their organization's overall performance against its goals.
  • Reports: Configurable, user-level as well as management-level reports.
  • Charting and Graphing: Dashboards are better known for their easy one-click charts and graphs that gives instant access to complex solutions.
  • Pre-defined Performance Metrics: All the Dashboards are built with the common pre-defined metrics by default which eases the business user in tracking the regular yet important performance metrics.
Benefits of using Business Intelligence Dashboard:
Dashboards quickly convert and communicate the complex corporate data into a meaningful display of charts, graphs, gauges and other formats concurrently. A dynamic, intelligence Dashboard will allow the managers to drill-down data to go deeper into the anaysis. It eliminates the need to go through several reports, in one shot Dashboard gives a clear picture about how a company is performing in its critical areas
Scorecards:
Scorecards are similar to Dashboards in a way that it provides easy-to-understand, summarized, at-a-glance data for the managers and top officials to tell them about their company's present and past performance. Scorecards thus help to monitor the Key Performance Indicators accurately and to communicate the goals and strategies across the organization in an efficient and elegant manner. In a Business Intelligence environment, Scorecards allows managers to set metrics or targets and monitor them to see their impact on every department.
Balanced Scorecards:
A methodology created by Drs. Robert S. Kaplan and David P. Norton in 1992, is a management concept which helps managers at all levels monitor results in their key areas, including financial performance, customer knowledge, internal business processes and learning. It has been implemented in thousands of corporations, organizations and government agencies worldwide.
Dashboard Softwares:
Following are few of the famous Business Intelligence Dashboard softwares available in the current market.
Cognos ReportNet: A product from Cognos that helps to build, distribute, and view multi-object dashboards with graphical elements such as gauges, charts, and clickable pictures. Connects dashboards with underlying tabular data, letting people go from a high-level view to a deeper understanding.
Source: www.Cognos.com
MicroStrategy: Microstrategy Dashboards are designed to deliver maximum visual impact in a format optimized for quick absorption, using a combination of tables, graphics, gauges, dials and other graphical indicators, as well as conditional formatting, free-form labels, borders and background colors.
Source: www.microstrategy.com
BusinessObjects Dashboard Manager: Dashboard Manager lets one choose the level of detail they need - an overview, a specific chart or graph, or the underlying report.
Source: www.businessobjects.com
The Compliance Management Dashboard: A product from Hyperion provides easy-to-use screens, meters, and stoplights that quickly convey critical information. The Compliance Management Dashboard includes a data model with pre-built connectors to data sources, a set of compliance metrics, and a series of easy-to-use screens, meters, and stoplights designed to quickly convey critical information needed by finance executives.
Business Intelligence Tools Directory:
  • Actuate
  • Business Objects
  • Cognos
  • HummingBird
  • Hyperion
  • Information Builders
  • MicroStrategy
  • ProClarity
  • Siebel

No comments:

Post a Comment