Thursday, June 4, 2015

3rd (Last) GSLC) for Database Management Class (BE11)

Dear students of Database Management Class (BE11),
Below is a list of questions for 3rd (last) GSLC). Each of you only needs to choose a question and then answer it as soon as possible. You may not choose and answer a question which has been taken and answered by someone (thus redundant answers are not allowed). Please be the first as you will have the greater freedom to choose a question as your preference. If you become the last one to answer then you will not have a choice any more.



1. Describe how a dimensional model (DM) differs from an Entity–Relationship (ER) model.
2. Describe how the fact and dimensional tables of a star schema differ.
3. The star, snowflake, and starflake schemas offer important advantages in a data warehouse environment.  Describe these advantages.
4. Describe the main activities associated with each step of the Nine-Step Methodology for data warehouse database design.
5. Describe the purpose of assessing the dimensionality of a data warehouse.
6. Describe OLAP applications and identify the characteristics of such applications.
7. Describe the characteristics of multidimensional data and how this data can be represented.
8. Describe Codd’s rules for OLAP tools.
9. Describe the architecture, characteristics, and issues associated with each of the following categories of OLAP tools:
(a) MOLAP,
(b) ROLAP,
(c) HOLAP,
(d) DOLAP.

Rules:
1. You may answer in the comment field below or at binusmaya forum
2. Only high quality answer will be rewarded 10 points for your final exam.
3. Deadline is 7th June 2015.

Thank you.

8 comments:

  1. Aendy suwarno - 1601226283
    hi sir so this is my answer about Nine-Step Methodology for data warehouse database design no 4

    1. Choosing the process
    choosing the process means tha we make the step by step so it become structured. and Choosing the process is important because it defines a specific design target and allows the grain, dimensions, and facts to be declared

    2. Choosing the grain
    in choosing grain step we see all numeric thing that comes from the table so it can be establishes exactly what a single fact table row represents

    3. Identifying and conforming the dimensions
    Dimensions in here is provide the “who, what, where, when, why, and how” context surrounding a business process event. Dimension tables contain the descriptive attributes used by BI applications for filtering and grouping the facts

    4. Choosing the facts
    seeing all the fact that connect between the all the entity to the fact table

    5. Storing pre-calculations in the fact table
    put all the calcuation entity inside the fact table

    6. Rounding out the dimension tables
    Make the relation of all table so it have connection between the entity and the fact table

    7. Choosing the duration of the database
    choosing the duration for the databe so it can be evaluate and report and go to archival storage. the duration usually in term of 3 until 5 years

    8. Tracking slowly changing dimensions
    this step telling us to see is there any plus inside each of the entity is change or not

    9. Deciding the query priorities and the query modes
    choosing the priority for each entity so it get the information about all plan usually this step is calculated by decision maker

    ReplyDelete
  2. 1. Multi-dimensional conceptual view
    2. Transparency
    3. Accessibility
    4. Consistent reporting performance
    5. Client–server architecture
    6. Generic dimensionality
    7. Dynamic sparse matrix handling
    8. Multi-user support
    9. Unrestricted cross-dimensional operations
    10. Intuitive data manipulation
    11. Flexible reporting
    12. Unlimited dimensions and aggregation levels

    1. Multi-dimensional conceptual view
    OLAP tools should provide users with a multi-dimensional model that corresponds to users’ views of the enterprise and is intuitively analytical and easy to use. Interestingly, this rule is given various levels of support by vendors of OLAP tools who argue that a multi-dimensional conceptual view of data can be delivered without multi-dimensional storage.

    2. Transparency
    The OLAP technology, the underlying database and architecture, and the possible heterogeneity
    of input data sources should be transparent to users. This requirement is to preserve the user’s productivity and proficiency with familiar frontend environments and tools.

    3. Accessibility
    The OLAP tool should be able to access data required for the analysis from all heterogeneous enterprise data sources such as relational, non-relational, and legacy systems.

    4. Consistent reporting performance
    As the number of dimensions, levels of aggregations, and the size of the database increases, users should not perceive any significant degradation in performance. There should be no alteration in the way the key figures are calculated. The system models should be robust enough to cope with changes to the enterprise model.

    5. Client–server architecture
    The OLAP system should be capable of operating efficiently in a client–server environment.
    The architecture should provide optimal performance, flexibility, adaptability, scalability, and interoperability.

    6. Generic dimensionality
    Every data dimension must be equivalent in both structure and operational capabilities.
    In other words, the basic structure, formulae, and reporting should not be biased towards any one dimension.

    7. Dynamic sparse matrix handling
    The OLAP system should be able to adapt its physical schema to the specific analytical model that optimizes sparse matrix handling to achieve and maintain the required level of performance. Typical multi-dimensional models can easily comprise millions of cell references, many of which may have no appropriate data at any one point in time. These nulls should be stored in an efficient way and not have any adverse impact on the accuracy or speed of data access.

    8. Multi-user support
    The OLAP system should be able to support a group of users working concurrently on the same or different models of the enterprise’s data.

    9. Unrestricted cross-dimensional operations
    The OLAP system must be able to recognize dimensional hierarchies and automatically perform associated roll-up calculations within and across
    dimensions.

    10. Intuitive data manipulation
    Slicing and dicing (pivoting), drill-down, and consolidation (roll-up), and other manipulations should be accomplished via direct ‘point-and-click’ and ‘drag-and-drop’ actions on the cells of the cube.

    11. Flexible reporting
    The ability to arrange rows, columns, and cells in a fashion that facilitates analysis by intuitive visual presentation of analytical reports must exist. Users should be able to retrieve any view of the data that they require.

    12. Unlimited dimensions and aggregation levels
    Depending on business requirements, an analytical model may have numerous dimensions, each having multiple hierarchies. The OLAP system should not impose any artificial restrictions on the number of dimensions or aggregation levels.

    ReplyDelete
  3. sorry for my late post sir, here my answer for question no.9

    MOLAP
    MOLAP tools use specialized data structures and multi-dimensional database management systems (MDDBMSs) to organize, navigate, and analyze data. To enhance query performance the data is typically aggregated and stored according to predicted usage

    issue associated :
    - Navigation and analysis of data are limited because the data is designed according to
    previously determined requirements.
    - Only a limited amount of data can be efficiently stored and analyzed.

    ROLAP
    Relational OLAP (ROLAP) is the fastest-growing type of OLAP tool. This growth is in response to users’ demands to analyze ever-increasing amounts of data and due to the realization that users cannot store all the data they require in MOLAP databases.

    issue associated :
    - Performance problems associated with the processing of complex queries that require
    multiple passes through the relational data.
    - Development of middleware to facilitate the development of multi-dimensional applications,
    that is software that converts the two-dimensional relation into a multidimensional
    structure.
    - Development of an option to create persistent multi-dimensional structures, together
    with facilities to assist in the administration of these structures.

    HOLAP
    Hybrid OLAP (HOLAP) tools provide limited analysis capability, either directly against RDBMS products, or by using an intermediate MOLAP server. HOLAP tools deliver selected data directly from the DBMS or via a MOLAP server to the desktop (or local server) in the form of a data cube, where it is stored, analyzed, and maintained locally.

    issue associated :
    -The architecture results in significant data redundancy and may cause problems for networks
    that support many users.
    - Ability of each user to build a custom data cube may cause a lack of data consistency
    among users.
    - Only a limited amount of data can be efficiently maintained.

    DOLAP
    An increasingly popular category of OLAP tools is Desktop OLAP (DOLAP). DOLAP tools store the OLAP data in client-based files and support multi-dimensional processing using a client multi-dimensional engine. DOLAP requires that relatively small extracts of data are held on client machines.

    issue associated :
    - Provision of appropriate security controls to support all parts of the DOLAP environment.
    - Reduction in the effort involved in deploying and maintaining the DOLAP tools.
    - Current trends are towards thin client machines.

    ReplyDelete
  4. fact table is centralized table in a star scema. fact table is usually have 2 types of collums. one of them containing facts and one of them containing foreign keys to dimention tables. Primary key of a fact table usually composite key that made up of many froreign keys.
    fact tables on data marts is populated by data from OLTP system or data warehouse. a snapshot of daata is reguraly taken and moved to the data mart.that take place every day week month depending on the companies requirements.
    facts table might contain either detail level of facts that have been agregated. in a real world fact table can contain no facts and that called factkless fact table.

    star scema is the simples data warehouse schema. it called that ecause the scema is connected from one and the other. and they resemble a star. a simple star scema may contain some measure such as ssales, time, product, market. that dimention will collect information about each things.
    star scema is usually composed on many facts table. set of dimension table and join tables. they will join the relation that connected to a dimention table to the fact tables.
    advantages of star sceme is providing intuitive mapping and optimize performance

    Fagra Hanif 1601270813
    Answer Number 2

    ReplyDelete
  5. answer for number 3

    The main advantages of star schemas are that they:
    - Reduced Joins
    - Faster Query Operation.
    - Provide a direct and intuitive mapping between the business entities being analyzed by end users and the schema design.
    - Provides highly optimized performance for typical data warehouse queries.
    - normalized tables are easier to maintain.it also saves the storage space.

    snow flake:
    - Some OLAP multidimensional database modeling tools are optimized for snowflake schemas.
    - Normalizing attributes results in storage savings, the tradeoff being additional complexity in source query joins.
    - normalized structures are easier to update and maintain.
    - Distributed data
    - Easier to obtain fact-less data

    starflake: because starflake is a hybrid mode from star schema and snowflake schema so it have many advantages like:
    - Efficiency : The consistency of the underlying database structure allows more efficient access to the data by various tools including report writers and query tools.
    - Ability to handle changing requirements :The start schema can adapt to changes in the user’s requirements, as all dimensions are equivalent in terms of providing access to the fact table.
    - Extensibility : The dimensional model is extensible.
    -Ability to model common business situations : There are a growing number of standard approaches for handling common modeling situations in the business world.
    - Predictable query processing : Data warehouse applications that drill down will simply be adding more dimension attributes from within a single star schema.

    ReplyDelete
  6. im sorry for the late post sir
    here is the answer to number 1
    -ivandiwira 1601258890

    Dimensional model and Entity Relation model

    Dimensional modelling is very flexible for the user perspective. Dimensional data model is mapped for creating schemas. Where as ER Model is not mapped for creating shemas and does not use in conversion of normalization of data into denormalized form.

    ER Model is utilized for OLTP databases that uses any of the 1st or 2nd or 3rd normal forms, where as dimensional data model is used for data warehousing and uses 3rd normal form.

    ER model contains normalized data where as Dimensional model contains denormalized data.

    ER modeling that models an ER diagram represents the entire businesses or applications processes. This diagram can be segregated into multiple Dimensional models. This is to say, an ER model will have both logical and physical model. The Dimensional model will only have physical model.

    ReplyDelete
  7. Hardian Ikhsan
    1601247483
    Database Management - Multidimensional Data
    Question No. : 7


    MULTIDIMENSIONAL DATA

    The multidimensional data model is a part of On-Line Analytical Processing (OLAP). Because OLAP is on-line, it must provide answers quickly. Users pose queries during interactive sessions, not in batch jobs that run overnight. And because OLAP is also analytic, the queries are complex. The multidimensional data model is designed to solve complex queries in real time.

    Multidimensional data found often in a OLAP applications. A multidimensional database (MDB) is a type of database that is optimized for data warehouse and online analytical processing (OLAP) applications. Multidimensional databases are frequently created using input from existing relational databases.

    Spreadsheet is one of the example on representing dimensional data.
    1 dimension could be represented by either a single column or a single row of data.
    2 dimensions could be represented by multiple columns and rows of data.
    3 dimentions could be represented by multiple spreadsheets. Each spreadsheet containing the same number of rows and columns.

    here's an example about how multidimensional data are being analyzed with crosstab. Suppose that our dataset contains the following sales information:

    - SalesMonth
    - State
    - AmountPaid

    The Crosstab component can be used to produce a grid showing the total, average and number of sales summarized by state for each month. The output of the report is shown in the link below.

    https://www.digital-metaphors.com/image/screenshots/crosstab6.png

    ReplyDelete