OLAP Operations in the Multidimensional Data Model

In the multidimensional model, records are organized into dimensions, and each dimension includes multiple levels of abstraction described by concept hierarchies. This organization gives users the flexibility to view data from different perspectives. A number of OLAP data cube operations exist to materialize these different views, allowing interactive querying and analysis of the data at hand. Hence, OLAP provides a user-friendly environment for interactive data analysis.

Consider the OLAP operations that can be performed on multidimensional data. The figure shows a data cube for the sales of a shop. The cube contains the dimensions location, time, and item, where location is aggregated with respect to city values, time with respect to quarters, and item with respect to item types.

Roll-Up

The roll-up operation (also known as drill-up or aggregation) performs aggregation on a data cube, either by climbing up a concept hierarchy for a dimension or by dimension reduction. Roll-up is like zooming out on the data cube. The figure shows the result of a roll-up performed on the dimension location. The hierarchy for location is defined as street < city < province or state < country. The roll-up operation aggregates the data by ascending the location hierarchy from the level of city to the level of country.

When a roll-up is performed by dimension reduction, one or more dimensions are removed from the cube. For example, consider a sales data cube with two dimensions, location and time. Roll-up may be performed by removing the time dimension, which results in an aggregation of the total sales by location rather than by location and by time.

The following figure illustrates how roll-up works.
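Both forms of roll-up can be sketched in a few lines of Python. This is only an illustration: the cube is modeled as a plain dictionary keyed by (city, quarter, item type), and the sales figures, the `CITY_TO_COUNTRY` mapping, and the function names are assumptions made for the example, not part of any OLAP product.

```python
from collections import defaultdict

# Hypothetical sales cells keyed by (city, quarter, item type); figures are made up.
sales = {
    ("Toronto", "Q1", "Mobile"): 605,
    ("Vancouver", "Q1", "Mobile"): 825,
    ("New York", "Q1", "Mobile"): 1087,
    ("Chicago", "Q1", "Mobile"): 854,
    ("Toronto", "Q2", "Mobile"): 680,
    ("Chicago", "Q2", "Mobile"): 943,
}

# Assumed city -> country step of the location hierarchy.
CITY_TO_COUNTRY = {
    "Toronto": "Canada",
    "Vancouver": "Canada",
    "New York": "USA",
    "Chicago": "USA",
}


def roll_up_location(cube):
    """Climb the location hierarchy from city to country, summing sales."""
    result = defaultdict(int)
    for (city, quarter, item), amount in cube.items():
        result[(CITY_TO_COUNTRY[city], quarter, item)] += amount
    return dict(result)


def roll_up_drop_time(cube):
    """Dimension reduction: remove the time dimension and sum over it."""
    result = defaultdict(int)
    for (city, _quarter, item), amount in cube.items():
        result[(city, item)] += amount
    return dict(result)


by_country = roll_up_location(sales)
by_city_item = roll_up_drop_time(sales)
```

In both cases the grand total is unchanged; only the granularity at which it is reported becomes coarser.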

Drill-Down

The drill-down operation (also called roll-down) is the reverse of roll-up. Drill-down is like zooming in on the data cube. It navigates from less detailed data to more detailed data. Drill-down can be performed either by stepping down a concept hierarchy for a dimension or by adding dimensions.

The figure shows a drill-down operation performed on the dimension time by stepping down a concept hierarchy defined as day < month < quarter < year. Drill-down occurs by descending the time hierarchy from the level of quarter to the more detailed level of month.

Because a drill-down adds more detail to the given data, it can also be performed by adding a new dimension to a cube. For example, a drill-down on the central cube of the figure can occur by introducing an additional dimension, such as customer group.

The following figure illustrates how drill-down works.
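A drill-down along the time hierarchy can be sketched as follows. The sketch assumes the month-level detail is still available, since a coarser view cannot be split back into months on its own; the data, the `MONTH_TO_QUARTER` mapping, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical month-level sales keyed by (city, month); figures are made up.
monthly_sales = {
    ("Toronto", "Jan"): 200, ("Toronto", "Feb"): 180, ("Toronto", "Mar"): 225,
    ("Toronto", "Apr"): 240, ("Toronto", "May"): 210, ("Toronto", "Jun"): 230,
}

# Assumed month -> quarter step of the time hierarchy.
MONTH_TO_QUARTER = {
    "Jan": "Q1", "Feb": "Q1", "Mar": "Q1",
    "Apr": "Q2", "May": "Q2", "Jun": "Q2",
}


def quarterly_view(detail):
    """The less detailed view: sales aggregated to quarter level."""
    result = defaultdict(int)
    for (city, month), amount in detail.items():
        result[(city, MONTH_TO_QUARTER[month])] += amount
    return dict(result)


def drill_down(detail, city, quarter):
    """Return the month-level cells behind one (city, quarter) cell."""
    return {
        month: amount
        for (c, month), amount in detail.items()
        if c == city and MONTH_TO_QUARTER[month] == quarter
    }


quarters = quarterly_view(monthly_sales)
q1_months = drill_down(monthly_sales, "Toronto", "Q1")
```

The month-level cells returned by `drill_down` add up to the quarter-level cell they were drilled from.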

Slice

A slice is a subset of the cube that corresponds to a single value for one member of a dimension. For example, a slice operation is executed when the user makes a selection on one dimension of a three-dimensional cube, which results in a two-dimensional slice. So, the slice operation performs a selection on one dimension of the given cube, resulting in a sub-cube.

The following figure illustrates how slice works.

Here, slice is applied to the dimension time using the criterion time = "Q1".

It forms a new sub-cube by fixing a single value on one dimension.
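The time = "Q1" slice can be sketched like this, again modeling the cube as a dictionary keyed by (city, quarter, item type). The data, the `TIME` axis constant, and the `slice_cube` helper are assumptions made for illustration.

```python
# Hypothetical sales cells keyed by (city, quarter, item type); figures are made up.
sales = {
    ("Toronto", "Q1", "Mobile"): 605,
    ("Toronto", "Q1", "Modem"): 14,
    ("Vancouver", "Q1", "Mobile"): 825,
    ("Toronto", "Q2", "Mobile"): 680,
    ("Vancouver", "Q2", "Modem"): 31,
}

TIME = 1  # position of the time dimension inside each key


def slice_cube(cube, axis, value):
    """Fix one dimension to a single value and drop that dimension from the key."""
    return {
        key[:axis] + key[axis + 1:]: amount
        for key, amount in cube.items()
        if key[axis] == value
    }


q1 = slice_cube(sales, TIME, "Q1")
```

The result is two-dimensional: each remaining key is a (city, item type) pair, because the time dimension has been fixed to "Q1".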

Dice

The dice operation defines a sub-cube by performing a selection on two or more dimensions.

The following figure shows the dice operation.

The dice operation on the cube based on the following selection criteria involves three dimensions:

  • (location = "Toronto" or "Vancouver")

  • (time = "Q1" or "Q2")

  • (item = "Mobile" or "Modem")
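The three selection criteria above can be applied in a short Python sketch. The cube contents and the `dice` function are hypothetical; only the criteria themselves come from the text.

```python
# Hypothetical sales cells keyed by (city, quarter, item type); figures are made up.
sales = {
    ("Toronto", "Q1", "Mobile"): 605,
    ("Toronto", "Q2", "Modem"): 14,
    ("Vancouver", "Q1", "Modem"): 31,
    ("Vancouver", "Q3", "Mobile"): 825,  # excluded: Q3 is not selected
    ("Chicago", "Q1", "Mobile"): 854,    # excluded: Chicago is not selected
    ("Toronto", "Q1", "Phone"): 400,     # excluded: Phone is not selected
}


def dice(cube, locations, quarters, items):
    """Keep the cells whose value on every dimension is in the allowed set."""
    return {
        key: amount
        for key, amount in cube.items()
        if key[0] in locations and key[1] in quarters and key[2] in items
    }


sub_cube = dice(
    sales,
    locations={"Toronto", "Vancouver"},
    quarters={"Q1", "Q2"},
    items={"Mobile", "Modem"},
)
```

Unlike slice, dice keeps all three dimensions in the key; it only narrows the range of values along each of them.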

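Pivot

The pivot operation is also called rotation. Pivot is a visualization operation that rotates the data axes in the view to provide an alternative presentation of the data. It may involve swapping the rows and columns, or moving one of the row dimensions into the column dimensions.

The following figure shows the pivot operation.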
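Swapping rows and columns, the simplest case of pivot, can be sketched on a two-dimensional view. The view is modeled as a nested dictionary (row label, then column label); the data and the `pivot` function name are assumptions for illustration.

```python
# Hypothetical 2-D view: rows are item types, columns are cities; figures are made up.
view = {
    "Mobile": {"Toronto": 605, "Vancouver": 825},
    "Modem": {"Toronto": 14, "Vancouver": 31},
}


def pivot(table):
    """Rotate the view so that former columns become rows and vice versa."""
    rotated = {}
    for row_label, row in table.items():
        for col_label, value in row.items():
            rotated.setdefault(col_label, {})[row_label] = value
    return rotated


rotated = pivot(view)
```

No value changes and nothing is aggregated; applying `pivot` twice gives back the original view.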

Other OLAP Operations

The drill-across operation executes queries involving more than one fact table. The drill-through operation uses relational SQL facilities to drill through the bottom level of a data cube down to its back-end relational tables.

Other OLAP operations may include ranking the top-N or bottom-N elements in lists, as well as calculating moving averages, growth rates, interest, internal rates of return, depreciation, currency conversions, and statistical functions.

OLAP offers analytical modeling capabilities, including a calculation engine for deriving ratios, variance, and so on, and for computing measures across multiple dimensions. It can generate summarizations, aggregations, and hierarchies at each granularity level and at every dimension intersection. OLAP also provides functional models for forecasting, trend analysis, and statistical analysis. In this context, the OLAP engine is a powerful data analysis tool.
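Two of the calculations mentioned above, top-N ranking and a moving average, can be sketched as follows. The input figures and the helper names `top_n` and `moving_average` are made up for the example; a real OLAP engine would expose these through its own query language.

```python
# Hypothetical totals per city and per quarter; figures are made up.
city_totals = {"Toronto": 1285, "Vancouver": 1650, "Chicago": 1797, "New York": 2100}
quarterly_sales = [600, 680, 760, 700]


def top_n(totals, n):
    """Rank members by their measure and keep the n largest."""
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


def moving_average(values, window):
    """Simple moving average over a trailing window of the given size."""
    return [
        sum(values[i - window + 1: i + 1]) / window
        for i in range(window - 1, len(values))
    ]


best_two = top_n(city_totals, 2)
smoothed = moving_average(quarterly_sales, 2)
```

A bottom-N ranking follows the same pattern with `reverse=False`.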

14 basic OLAP operations

  • Drill-up

  • Drill-down

  • Slice

  • Dice

  • Pivot

  • Scoping

  • Screening

  • Drill across

  • Drill through

  • Sort

  • Add measure

  • Drop measure

  • Union

  • Difference

OLAP Operations in Data Mining

OLAP is a widely used Business Intelligence technology for organizing and analyzing large amounts of data. OLAP databases are stored as multidimensional cubes, where each cube holds the data that a cube administrator considers relevant. Through OLAP operations, a user can obtain a specific view of the cube and extract the required information from it, for example as a Pivot Table or Pivot Chart report.

The commonly cited OLAP operations are Drill-up, Drill-down, Pivot, and Slice-and-Dice. Here we expand the list and walk through the OLAP operations used in data mining, including slicing and dicing, with examples.

But before defining the individual OLAP operations, let's look at which query languages are used.

OLAP language

OLAP operations can be expressed in two query languages: SQL and MDX.

SQL (Structured Query Language) is a language for managing relational databases and manipulating their data; it works on two-dimensional tables.

MDX (Multidimensional Expressions) is a language for expressing analytical queries. Its principal difference from SQL is that MDX can reference multiple dimensions. MDX was originally introduced by Microsoft, and its syntax resembles SQL.

The two languages differ and each has its own peculiarities. However, OLAP operations are expressed in a broadly similar way in SQL and MDX.

Our product, Ranet OLAP, uses the MDX query language, which is why the examples here focus on MDX OLAP operations.

OLAP operations:

So let’s outline the typical OLAP operations now.

Drill Up

Drill-up is usually discussed together with its counterpart, drill-down. Drill-up aggregates data in the cube either by ascending a concept hierarchy for a dimension or by dimension reduction, producing measures at a coarser granularity. To see the broader perspective defined by the concept hierarchy, the user groups columns and combines their values. When drill-up is done by dimension reduction, one or more dimensions are removed from the data cube. Some sources treat drill-up and roll-up as synonyms.

Here's a typical example of a drill-up (roll-up) operation:

Drill down

Drill-down is the opposite of Drill-up. It is carried out either by descending a concept hierarchy for a dimension or by adding a new dimension. It lets a user move from a less detailed cube to more detailed data. When drill-down is done by adding dimensions, one or more dimensions are appended to the data cube to provide more detail.

Have a look at an OLAP Drill-down example in use:

Slice

The next pair to discuss is slice and dice. The Slice operation selects a single value on one dimension of a given cube and returns a new sub-cube, which presents the information from another point of view. Using Slice requires choosing the granularity level of the dimension being sliced.

An OLAP Slice example looks like this:

Dice

Dice selects on two or more dimensions of a given cube and, like Slice, returns a new sub-cube. To locate a single cell in a cube, a value has to be specified for each dimension.

The diagram below shows how Dice operation works:

Pivot

This OLAP operation rotates the axes of a cube to provide an alternative view of the data. Pivot regroups the data along other dimensions, which helps in analyzing the performance of a company or enterprise.

Here’s an example of Pivot in operation:

Scoping

Scoping restricts the view of database objects to a specified subset. It lets users retrieve and update only the data values they need. Scoping is most useful when the amount of data is huge and a user needs to limit access to a specified subset.

Screening

Screening is conducted to limit the set of data extracted.

Drill across

Drill across and Drill through are another pair of related operations. Drill across combines cells from several data cubes that share the same schema.
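One way to picture drill across is as matching up cells that sit at the same coordinates in two cubes. The sketch below takes that reading as an assumption; the two cubes, their measures, and the `drill_across` function are hypothetical.

```python
# Two hypothetical cubes sharing the (city, quarter) dimensions; figures are made up.
sales = {("Toronto", "Q1"): 605, ("Vancouver", "Q1"): 825}
returns = {("Toronto", "Q1"): 12, ("Vancouver", "Q1"): 9, ("Chicago", "Q1"): 4}


def drill_across(left, right):
    """Pair up the measures of cells whose coordinates appear in both cubes."""
    return {key: (left[key], right[key]) for key in left.keys() & right.keys()}


combined = drill_across(sales, returns)
```

Cells present in only one cube, such as the Chicago cell here, are left out of the combined result under this interpretation.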

Drill through

Drill through lets a user navigate from data at the lowest level of a cube to the data in the operational systems from which the cube was derived. It is typically used to find the cause of outlier values in a data cube.

Sort

Sort returns the cube with the members of a dimension sorted.

Add Measure

This OLAP operation adds new measures to a cube.

Drop Measure

In contrast to Add Measure, Drop Measure removes a measure from a data cube when it is no longer needed.
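Add Measure and Drop Measure can be sketched by letting each cell hold a small dictionary of measures. The cube layout, the derived "profit" measure, and the helper names are assumptions for this example.

```python
# Hypothetical cube: each cell holds several named measures; figures are made up.
cube = {
    ("Toronto", "Q1"): {"revenue": 605, "cost": 400},
    ("Vancouver", "Q1"): {"revenue": 825, "cost": 500},
}


def add_measure(cells, name, compute):
    """Return a new cube in which every cell gains a measure derived from the others."""
    return {key: {**measures, name: compute(measures)} for key, measures in cells.items()}


def drop_measure(cells, name):
    """Return a new cube without the named measure."""
    return {
        key: {n: v for n, v in measures.items() if n != name}
        for key, measures in cells.items()
    }


with_profit = add_measure(cube, "profit", lambda m: m["revenue"] - m["cost"])
without_cost = drop_measure(with_profit, "cost")
```

Both helpers return new dictionaries, so the original cube is left unchanged.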

Union

Union combines several cubes that have the same schema but different instances.

Difference

Difference removes from a cube the cells that also belong to another cube. The two cubes must have the same schema.
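Union and Difference can be sketched as set-like operations on cell coordinates. The two cubes and the function names are hypothetical, and the sketch assumes that a cell present in both cubes carries the same value in each.

```python
# Two hypothetical cubes with the same (city, quarter) schema; figures are made up.
east = {("Toronto", "Q1"): 605, ("New York", "Q1"): 1087}
west = {("Vancouver", "Q1"): 825, ("Toronto", "Q1"): 605}


def union(a, b):
    """All cells of both cubes (assumes shared cells hold identical values)."""
    merged = dict(a)
    merged.update(b)
    return merged


def difference(a, b):
    """Cells of cube a whose coordinates do not occur in cube b."""
    return {key: value for key, value in a.items() if key not in b}


all_cells = union(east, west)
only_east = difference(east, west)
```

If shared cells could hold different values, the sketch would need an explicit rule for resolving the conflict, which the text does not specify.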

Questions

To sum up, let's go through the most frequently asked questions about OLAP operations.

How to define the concept of OLAP and the operations it supports?

Online Analytical Processing (OLAP) is a technology that supports multidimensional analysis of business data, complex calculations, and data modeling. OLAP databases are stored as multidimensional cubes, where each cube holds the data that a cube administrator considers relevant. OLAP operations help a user obtain a specific view of the cube and extract the required information from it.

What are the different types of OLAP operations?

We can distinguish 14 basic OLAP operations:

  • Drill-up

  • Drill-down

  • Slice

  • Dice

  • Pivot

  • Scoping

  • Screening

  • Drill across

  • Drill through

  • Sort

  • Add measure

  • Drop measure

  • Union

  • Difference

More detail can be found earlier in the article, where the typical OLAP operations are discussed with examples.

What is the difference between slice and dice in OLAP?

The Slice operation selects on one dimension of a given cube and returns a new sub-cube, which presents the information from another point of view. The Dice operation, by contrast, selects on two or more dimensions of a cube.

In conclusion, an OLAP system holds historical data, which the operations described above present in a summarized, multidimensional view. These operations make the data flexible and convenient to analyze.
