Mondrian MDX 执行原理剖析

Mondrian MDX 执行原理剖析

Mondrian是一个开源项目。一个用Java写成的OLAP(在线分析性处理)引擎。它用MDX语言实现查询,从关系数据库(RDBMS)中读取数据。然后经过Java API用多维的方式对结果进行展示。Mondrian如何将MDX转换成相应的SQL去底层数据源查询数据的?这个就是本文要剖析的问题。

模型结构

<code><schema><dimension>    <hierarchy>        <table>        <level>        <level>        <level>    /<hierarchy>/<dimension><dimension>    <hierarchy>        <table>        <level>        <level>        <level>    /<hierarchy>/<dimension><cube>        <table>            <dimensionusage>    <dimensionusage>            <measure>    <measure>    <measure>    <measure>            <calculatedmember>        <formula>[Measures].[销售额] - [Measures].[成本]/<formula>        <calculatedmemberproperty>    /<calculatedmember>    /<cube>/<schema>/<code>

表结构

数据库:derby

dim_time


Mondrian MDX 执行原理剖析

dim_store


Mondrian MDX 执行原理剖析

fact_sales


Mondrian MDX 执行原理剖析

查询解析

1、单轴+单指标:查询销售额


Mondrian MDX 执行原理剖析

<code>select {[Measures].[销售额]} ON COLUMNS from [销售情况]/<code>

翻译成的SQL:

<code>select sum("fact_sales"."sales") as "m0" from "fact_sales" as "fact_sales"/<code>

2、单轴+单维度和指标:查询日期和销售额


Mondrian MDX 执行原理剖析

<code> select {Crossjoin([时间].[年].Members, {[Measures].[销售额]})} ON COLUMNS from [销售情况]/<code>

一个轴上出现不同的维度需要使用 Crossjoin

翻译成的SQL:

<code>-- 获取所有的年份select "dim_time"."time_year" from "dim_time" as "dim_time" group by "dim_time"."time_year" order by CASE WHEN "dim_time"."time_year" IS NULL THEN 1 ELSE 0 END,         "dim_time"."time_year" ASC-- 求所有年份select count(distinct "time_year") from "dim_time"-- 获取所有年份和求和数据select "dim_time"."time_year" as "c0",       sum("fact_sales"."sales") as "m0" from "dim_time" as "dim_time", "fact_sales" as "fact_sales" where "fact_sales"."time_id" = "dim_time"."time_id" group by "dim_time"."time_year"/<code>

3、单轴+双维度和单指标:查询日期和销售额


Mondrian MDX 执行原理剖析

<code>select {  Crossjoin(                 Crossjoin([时间].[年].Members, [销售点].[省].Members),                {[Measures].[销售额]}    )      } ON COLUMNSfrom [销售情况]/<code>

翻译成的SQL:

<code>// 获取年select "dim_time"."time_year" from "dim_time" as "dim_time" group by "dim_time"."time_year" order by CASE WHEN "dim_time"."time_year" IS NULL THEN 1 ELSE 0 END,          "dim_time"."time_year" ASC          // 获取省份 select "dim_store"."store_province" from "dim_store" as "dim_store"  group by "dim_store"."store_province"  order by CASE WHEN "dim_store"."store_province" IS NULL THEN 1 ELSE 0 END,          "dim_store"."store_province" ASC          select count(distinct "time_year") from "dim_time"  select count(distinct "store_province") from "dim_store"   select "dim_time"."time_year" as "c0", "dim_store"."store_province" as "c1",             sum("fact_sales"."sales") as "m0"   from "dim_time" as "dim_time", "fact_sales" as "fact_sales", "dim_store" as "dim_store"   where "fact_sales"."time_id" = "dim_time"."time_id"         and "fact_sales"."store_id" = "dim_store"."store_id"   group by "dim_time"."time_year", "dim_store"."store_province"  /<code>

4、双轴+单维度和指标:查询日期和销售额

Mondrian MDX 执行原理剖析

<code>select {Crossjoin([时间].[年].Members, {[Measures].[销售额]})} ON COLUMNS,  {[销售点].[省].Members} ON ROWSfrom [销售情况]/<code>

翻译成的SQL:

<code>// 获取年select "dim_time"."time_year" from "dim_time" as "dim_time" group by "dim_time"."time_year" order by CASE WHEN "dim_time"."time_year" IS NULL THEN 1 ELSE 0 END,            "dim_time"."time_year" ASC  // 获取省份 select "dim_store"."store_province" from "dim_store" as "dim_store"  group by "dim_store"."store_province"  order by CASE WHEN "dim_store"."store_province" IS NULL THEN 1 ELSE 0 END,         "dim_store"."store_province" ASC    // 获取省份个数  select count(distinct "store_province") from "dim_store"    // 获取日期个数  select count(distinct "time_year") from "dim_time"    // 获取中间数据  select "dim_time"."time_year" as "c0", "dim_store"."store_province" as "c1",  sum("fact_sales"."sales") as "m0" from "dim_time" as "dim_time",   "fact_sales" as "fact_sales", "dim_store" as "dim_store"   where "fact_sales"."time_id" = "dim_time"."time_id"     and "fact_sales"."store_id" = "dim_store"."store_id"   group by "dim_time"."time_year", "dim_store"."store_province"         /<code>

5、双轴+双维度和指标:查询日期和销售额

Mondrian MDX 执行原理剖析

<code>select {Crossjoin([时间].[年].Members, {[Measures].[销售额]})} ON COLUMNS,  {[销售点].[市].Members} ON ROWSfrom [销售情况]/<code>

翻译成的SQL:

<code>select "dim_time"."time_year" from "dim_time" as "dim_time" group by "dim_time"."time_year" order by CASE WHEN "dim_time"."time_year" IS NULL THEN 1 ELSE 0 END,     "dim_time"."time_year" ASCselect "dim_store"."store_province", "dim_store"."store_city" from "dim_store" as "dim_store" group by "dim_store"."store_province", "dim_store"."store_city" order by CASE WHEN "dim_store"."store_province" IS NULL THEN 1 ELSE 0 END,       "dim_store"."store_province" ASC,       CASE WHEN "dim_store"."store_city" IS NULL THEN 1 ELSE 0 END,      "dim_store"."store_city" ASC       // 获取城市个数 select count(distinct "store_city") from "dim_store"     // 获取日期个数  select count(distinct "time_year") from "dim_time"      select "dim_time"."time_year" as "c0", "dim_store"."store_city" as "c1",     sum("fact_sales"."sales") as "m0"   from "dim_time" as "dim_time", "fact_sales" as "fact_sales",     "dim_store" as "dim_store"   where "fact_sales"."time_id" = "dim_time"."time_id"     and "fact_sales"."store_id" = "dim_store"."store_id"    group by "dim_time"."time_year", "dim_store"."store_city" /<code>

6、分页双轴+双维度和指标:查询日期和销售额


Mondrian MDX 执行原理剖析

分页的效果可以对比查看上图。

<code>select {Crossjoin([时间].[年].Members, {[Measures].[销售额]})} ON COLUMNS,  {Subset([销售点].[市].Members, 0, 3)} ON ROWSfrom [销售情况]/<code>

翻译成的SQL:

<code>select "dim_time"."time_year" from "dim_time" as "dim_time" group by "dim_time"."time_year" order by CASE WHEN "dim_time"."time_year" IS NULL THEN 1 ELSE 0 END,     "dim_time"."time_year" ASCselect "dim_store"."store_province", "dim_store"."store_city" from "dim_store" as "dim_store" group by "dim_store"."store_province", "dim_store"."store_city" order by CASE WHEN "dim_store"."store_province" IS NULL THEN 1 ELSE 0 END,       "dim_store"."store_province" ASC,       CASE WHEN "dim_store"."store_city" IS NULL THEN 1 ELSE 0 END,      "dim_store"."store_city" ASC       // 获取城市个数 select count(distinct "store_city") from "dim_store"     // 获取日期个数  select count(distinct "time_year") from "dim_time"      select "dim_time"."time_year" as "c0", "dim_store"."store_city" as "c1",     sum("fact_sales"."sales") as "m0"   from "dim_time" as "dim_time", "fact_sales" as "fact_sales",     "dim_store" as "dim_store"   where "fact_sales"."time_id" = "dim_time"."time_id"     and "fact_sales"."store_id" = "dim_store"."store_id"    group by "dim_time"."time_year", "dim_store"."store_city" /<code>

从上面的SQL可以看出分页其实是一个内存的分页。

7、单字段排序

Mondrian MDX 执行原理剖析

<code>select {Crossjoin([时间].[年].Members, {[Measures].[销售额]})} ON COLUMNS,  {Order([销售点].[省].Members, [Measures].[销售额], DESC)} ON ROWSfrom [销售情况]/<code>

翻译成的SQL:

<code>select "dim_time"."time_year" from "dim_time" as "dim_time" group by "dim_time"."time_year" order by CASE WHEN "dim_time"."time_year" IS NULL THEN 1 ELSE 0 END,     "dim_time"."time_year" ASC  select "dim_store"."store_province" from "dim_store" as "dim_store" group by "dim_store"."store_province" order by CASE WHEN "dim_store"."store_province" IS NULL THEN 1 ELSE 0 END,     "dim_store"."store_province" ASC   select count(distinct "time_year") from "dim_time"  select count(distinct "store_province") from "dim_store"   // 这条SQL在没有Order的情况是没有的 select sum("fact_sales"."sales") as "m0" from "fact_sales" as "fact_sales"  // 这条SQL在没有Order的情况是没有的 select "dim_store"."store_province" as "c0", sum("fact_sales"."sales") as "m0"  from "dim_store" as "dim_store", "fact_sales" as "fact_sales"  where "fact_sales"."store_id" = "dim_store"."store_id"  group by "dim_store"."store_province"  select "dim_time"."time_year" as "c0", "dim_store"."store_province" as "c1",         sum("fact_sales"."sales") as "m0"  from "dim_time" as "dim_time", "fact_sales" as "fact_sales",         "dim_store" as "dim_store"  where "fact_sales"."time_id" = "dim_time"."time_id"         and "fact_sales"."store_id" = "dim_store"."store_id"   group by "dim_time"."time_year", "dim_store"."store_province" /<code>

8、过滤

Mondrian MDX 执行原理剖析

<code>with member [Measures].[销售额 环比上季] as 'ParallelPeriod([时间].[季度], (- 1))'select {[Measures].[销售额], [Measures].[销售额 环比上季]} ON COLUMNS,  {Filter([时间].[季度].Members, ([Measures].[销售额] > 4210000))} ON ROWSfrom [销售情况]/<code>

翻译成的SQL:

<code>// 获取明细select "dim_time"."time_year", "dim_time"."time_quarter" from "dim_time" as "dim_time", "fact_sales" as "fact_sales" where "fact_sales"."time_id" = "dim_time"."time_id" group by "dim_time"."time_year", "dim_time"."time_quarter" having (sum("fact_sales"."sales") > 4210000) order by CASE WHEN "dim_time"."time_year" IS NULL THEN 1 ELSE 0 END,         "dim_time"."time_year" ASC,     CASE WHEN "dim_time"."time_quarter" IS NULL THEN 1 ELSE 0 END,     "dim_time"."time_quarter" ASC       select "dim_time"."time_quarter"  from "dim_time" as "dim_time"  where ("dim_time"."time_year" = '2012')  group by "dim_time"."time_quarter"  order by CASE WHEN "dim_time"."time_quarter" IS NULL THEN 1 ELSE 0 END,     "dim_time"."time_quarter" ASC  select "dim_time"."time_quarter" from "dim_time" as "dim_time"  where ("dim_time"."time_year" = '2013')  group by "dim_time"."time_quarter"  order by CASE WHEN "dim_time"."time_quarter" IS NULL THEN 1 ELSE 0 END,         "dim_time"."time_quarter" ASC   // 获取年  select "dim_time"."time_year" from "dim_time" as "dim_time"   group by "dim_time"."time_year"   order by CASE WHEN "dim_time"."time_year" IS NULL THEN 1 ELSE 0 END,     "dim_time"."time_year" ASC    select count(distinct "time_year") from "dim_time"    select count(distinct "time_quarter") from "dim_time"    // 获取日期明细数据  select "dim_time"."time_year" as "c0", "dim_time"."time_quarter" as "c1",     sum("fact_sales"."sales") as "m0"   from "dim_time" as "dim_time", "fact_sales" as "fact_sales"   where "fact_sales"."time_id" = "dim_time"."time_id"   group by "dim_time"."time_year", "dim_time"."time_quarter"    /<code>


高级计算解析

1、累计分析(YTD/MTD/QTD)

Mondrian MDX 执行原理剖析

<code>with member [Measures].[销售额 YTD] as 'Sum(Ytd([时间].[季度].CurrentMember), [Measures].[销售额])'select {[Measures].[销售额], [Measures].[销售额 YTD]} ON COLUMNS,  {[时间].[季度].Members} ON ROWSfrom [销售情况]/<code>

翻译成的SQL:

<code>// 获取年-季度select "dim_time"."time_year", "dim_time"."time_quarter" from "dim_time" as "dim_time" group by "dim_time"."time_year", "dim_time"."time_quarter" order by CASE WHEN "dim_time"."time_year" IS NULL THEN 1 ELSE 0 END,        "dim_time"."time_year" ASC,     CASE WHEN "dim_time"."time_quarter" IS NULL THEN 1 ELSE 0 END,     "dim_time"."time_quarter" ASC  // 获取年 select "dim_time"."time_year" from "dim_time" as "dim_time"  group by "dim_time"."time_year"  order by CASE WHEN "dim_time"."time_year" IS NULL THEN 1 ELSE 0 END,     "dim_time"."time_year" ASC  // 获取季度 select "dim_time"."time_quarter" from "dim_time" as "dim_time"  where ("dim_time"."time_year" = '2013')  group by "dim_time"."time_quarter"  order by CASE WHEN "dim_time"."time_quarter" IS NULL THEN 1 ELSE 0 END,     "dim_time"."time_quarter" ASC    select count(distinct "time_year") from "dim_time"    select count(distinct "time_quarter") from "dim_time"    //获取明细数据  select "dim_time"."time_year" as "c0", "dim_time"."time_quarter" as "c1",     sum("fact_sales"."sales") as "m0"   from "dim_time" as "dim_time", "fact_sales" as "fact_sales"   where "fact_sales"."time_id" = "dim_time"."time_id"   group by "dim_time"."time_year", "dim_time"."time_quarter"/<code>

2、单维度对比

Mondrian MDX 执行原理剖析

<code>with member [Measures].[销售额 环比上季] as 'ParallelPeriod([时间].[季度], (- 1))'select {[Measures].[销售额], [Measures].[销售额 环比上季]} ON COLUMNS,  {[时间].[季度].Members} ON ROWSfrom [销售情况]/<code>

翻译成的SQL:

<code>// 获取年-季度select "dim_time"."time_year", "dim_time"."time_quarter" from "dim_time" as "dim_time" group by "dim_time"."time_year", "dim_time"."time_quarter" order by CASE WHEN "dim_time"."time_year" IS NULL THEN 1 ELSE 0 END,         "dim_time"."time_year" ASC,     CASE WHEN "dim_time"."time_quarter" IS NULL THEN 1 ELSE 0 END,     "dim_time"."time_quarter" ASC  // 获取年select "dim_time"."time_year" from "dim_time" as "dim_time" group by "dim_time"."time_year" order by CASE WHEN "dim_time"."time_year" IS NULL THEN 1 ELSE 0 END,     "dim_time"."time_year" ASC  // 获取季度select "dim_time"."time_quarter" from "dim_time" as "dim_time" where ("dim_time"."time_year" = '2013') group by "dim_time"."time_quarter" order by CASE WHEN "dim_time"."time_quarter" IS NULL THEN 1 ELSE 0 END, "dim_time"."time_quarter" ASC    select count(distinct "time_year") from "dim_time"    select count(distinct "time_quarter") from "dim_time"    //获取明细数据 select "dim_time"."time_year" as "c0", "dim_time"."time_quarter" as "c1",         sum("fact_sales"."sales") as "m0"  from "dim_time" as "dim_time", "fact_sales" as "fact_sales"  where "fact_sales"."time_id" = "dim_time"."time_id"  group by "dim_time"."time_year", "dim_time"."time_quarter"/<code>


3、维度交叉对比

Mondrian MDX 执行原理剖析

<code>with member [Measures].[销售额 环比上季] as 'ParallelPeriod([时间].[季度], (- 1))'select {Crossjoin([销售点].[省].Members, {[Measures].[销售额], [Measures].[销售额 环比上季]})} ON COLUMNS,  {[时间].[季度].Members} ON ROWSfrom [销售情况]/<code> 

翻译成的SQL:

<code>// 获取省份select "dim_store"."store_province" from "dim_store" as "dim_store" group by "dim_store"."store_province" order by CASE WHEN "dim_store"."store_province" IS NULL THEN 1 ELSE 0 END,         "dim_store"."store_province" ASC  // 获取季度select "dim_time"."time_year", "dim_time"."time_quarter" from "dim_time" as "dim_time" group by "dim_time"."time_year", "dim_time"."time_quarter" order by CASE WHEN "dim_time"."time_year" IS NULL THEN 1 ELSE 0 END,         "dim_time"."time_year" ASC,     CASE WHEN "dim_time"."time_quarter" IS NULL THEN 1 ELSE 0 END,     "dim_time"."time_quarter" ASC  // 获取年select "dim_time"."time_year" from "dim_time" as "dim_time" group by "dim_time"."time_year" order by CASE WHEN "dim_time"."time_year" IS NULL THEN 1 ELSE 0 END, "dim_time"."time_year" ASC  //select "dim_time"."time_quarter" from "dim_time" as "dim_time" where ("dim_time"."time_year" = '2013') group by "dim_time"."time_quarter" order by CASE WHEN "dim_time"."time_quarter" IS NULL THEN 1 ELSE 0 END, "dim_time"."time_quarter" ASC  select count(distinct "time_year") from "dim_time"    select count(distinct "time_quarter") from "dim_time"    select count(distinct "store_province") from "dim_store"    //获取明细数据select "dim_time"."time_year" as "c0", "dim_time"."time_quarter" as "c1",         "dim_store"."store_province" as "c2",     sum("fact_sales"."sales") as "m0"  from "dim_time" as "dim_time", "fact_sales" as "fact_sales",         "dim_store" as "dim_store"  where "fact_sales"."time_id" = "dim_time"."time_id"         and "fact_sales"."store_id" = "dim_store"."store_id"  group by "dim_time"."time_year", "dim_time"."time_quarter",         "dim_store"."store_province"/<code>

基于内存的对比

3、总计/小计

Mondrian MDX 执行原理剖析

<code>select {Crossjoin([时间].[年].Members, {[Measures].[销售额]})} ON COLUMNS,  {Hierarchize({[销售点].[市].Members, [销售点].[省].Members})} ON ROWSfrom [销售情况]/<code>

翻译成的SQL:

<code>// 获取年份select "dim_time"."time_year" from "dim_time" as "dim_time" group by "dim_time"."time_year" order by CASE WHEN "dim_time"."time_year" IS NULL THEN 1 ELSE 0 END, "dim_time"."time_year" ASC  // 获取省份和城市select "dim_store"."store_province", "dim_store"."store_city" from "dim_store" as "dim_store" group by "dim_store"."store_province", "dim_store"."store_city" order by CASE WHEN "dim_store"."store_province" IS NULL THEN 1 ELSE 0 END,     "dim_store"."store_province" ASC,   CASE WHEN "dim_store"."store_city" IS NULL THEN 1 ELSE 0 END,   "dim_store"."store_city" ASC  // 获取省份select "dim_store"."store_province" from "dim_store" as "dim_store" group by "dim_store"."store_province" order by CASE WHEN "dim_store"."store_province" IS NULL THEN 1 ELSE 0 END,     "dim_store"."store_province" ASCselect count(distinct "time_year") from "dim_time"select count(distinct "store_city") from "dim_store"    //获取城市明细数据select "dim_time"."time_year" as "c0", "dim_store"."store_city" as "c1",        sum("fact_sales"."sales") as "m0" from "dim_time" as "dim_time", "fact_sales" as "fact_sales", "dim_store" as "dim_store"where "fact_sales"."time_id" = "dim_time"."time_id"         and "fact_sales"."store_id" = "dim_store"."store_id"  group by "dim_time"."time_year", "dim_store"."store_city"// 省份明细数据select "dim_time"."time_year" as "c0", "dim_store"."store_province" as "c1",         sum("fact_sales"."sales") as "m0" from "dim_time" as "dim_time", "fact_sales" as "fact_sales", "dim_store" as "dim_store" where "fact_sales"."time_id" = "dim_time"."time_id"         and "fact_sales"."store_id" = "dim_store"."store_id"  group by "dim_time"."time_year", "dim_store"."store_province"/<code> 
Mondrian MDX 执行原理剖析


分享到:


相關文章: