HUGmn techday2010 – March 11 in Minneapolis

Tim Tow | Mar 4, 2010 09:46 +0000
I will be presenting next week at the Hyperion User Group - Minnesota 'TechDay 2010' event next week in Minneapolis.  The user group has lined up a number of great speakers including fellow Oracle ACE Directors Edward Roske from interRel Consulting and Eric Helmer from The Hackett Group.  Oracle Essbase product managers Alan Lee and Karen Zubetz and HFM product manager Mike Malwitz will also be doing presentations along with a number of other consultants. 

Alan Lee will be speaking on the new features coming in Essbase 11.1.2 and Karen Zubetz will discussing the new OBI Applications Essbase Integrator which will, in essence, spin Essbase cubes from Oracle BI Server sources.  Mike Malwitz will talk about what is new, and what is coming, with HFM.  My presentations are on Essbase 11 installation and on converting a VBA application to Dodeca.  Here is the full agenda for the meeting: http://www.harbinger-group.com/HUGpage.html#.

I will be in Minneapolis for a couple of days and I have promised myself I would take a bit of time to enjoy my trip.  I am planning to visit the climbing gym in St Paul on Wednesday night (Vertical Endeavors - http://www.verticalendeavors.com/st_paul/index.html) and am planning to ski on Thursday night (Afton Alps - http://www.aftonalps.com).  Anyone interested in joining me?

decimal separator

Vetriselvan | Mar 3, 2010 17:25 +0000
How to change the grouping to decimal separator.

Here are few examples, how to do it.





There are more functions available, but these would be help to people who would love to start from simple.

Note : add the first two lines inside the form-field.
Winrichman.blogspot.com

Inside the Oracle BI Server Part 3 : BI Server In-Memory Joins

(author unknown) | Mar 3, 2010 08:58 +0000

Inside the Oracle BI Server Part 3 : BI Server In-Memory Joins

Mark Rittman | Mar 3, 2010 03:00 +0000

In the previous two postings in this series, I looked at the architecture of the Oracle BI Server, and how it processes incoming queries from Oracle BI Answers. In the latter article I touched on the concept of BI Server in-memory joins, and in this article I want to expand on this topic and look at just what goes on when the BI Server is called upon to combine data from multiple sources.

When the BI Server executes a query plan, it handles the data in four separate stages:

Bis32

  • Firstly, filters and functions are applied to the data from each data source
  • Then, the data from these data sources are aggregated as required
  • Then they are joined together (or “stitched” together), and
  • Then, any calculations and/or aggregations that are applied across data sources are applied

In simple OBIEE environments, data used by a request will come from a single database, and therefore any joins that need to be performed by the BI Server will automatically be “pushed down” to the underlying database. In the cases though where more than one physical database is being used to provide data for a query, this join will instead need to be performed by the BI Server “in memory”. This ability to “federate” data sources, and therefore produce reports and analysis that span multiple data sources, but present the data to users as if it was a single database, is one of the key unique features of OBIEE and sets it apart from tools like Discoverer which are really restricted to reporting against single data sources.

So given this capability, how does it work under the covers? When does the BI Server perform a join in-memory, and when does it get done at the underlying database level? Where can we see what is happening, and can we predict what method the BI Server will use when performing a join? Finally, what algorithm does the BI Server use when performing these joins, and how does it use memory and disk when during the process?

To illustrate how the process works, there are a number of join scenarios that we need to consider. Some relate to joining fact and dimension tables together, and others relate to joining fact tables that share conforming dimensions, or hold conforming data sets of differing granularity.

Joining Fact and Dimension Tables Together

The BI Server semantic layer requires you to organize your business model and mapping layer into a star schema. This star schema may have one or more logical dimension tables, that join to one or more logical fact tables. The logical fact tables typically have conforming dimensions, so that you can create requests that span multiple fact tables and multiple dimension tables.

Taking for the moment joins between fact and dimension tables, depending on how the underlying physical or logical table source joins are set up in the semantic model, these may be either inner joins, left outer joins, right outer joins or full outer joins. The simple example to consider is a business model that is mapped to a single physical database, so that all logical table sources point to the same underlying data source, as shown in the screenshot below:

Bis25-2

In this case, if we issued a request against this business model that required data from a dimension table and a fact table, the BI Server would push the join between logical table sources down to the underlying database, a single SQL query would be generated and the execution plan from a level 5 query log entry would look like this:

-------------------- Execution plan:

RqList <<2105>> [for database 3023:2820:orcl3,44]
PRODUCTS.PROD_SUBCATEGORY_DESC as c1 GB [for database 3023:2820,44],
sum(SALES.QUANTITY_SOLD by [ PRODUCTS.PROD_SUBCATEGORY_DESC] ) as c2 GB [for database 3023:2820,44]
Child Nodes (RqJoinSpec): <<2136>> [for database 3023:2820:orcl3,44]
PRODUCTS T2874
SALES T2911
DetailFilter: PRODUCTS.PROD_ID = SALES.PROD_ID [for database 0:0]
GroupBy: [ PRODUCTS.PROD_SUBCATEGORY_DESC]  [for database 3023:2820,44]
OrderBy: c1 asc [for database 3023:2820,44]

The same would apply to a left outer join between table sources in the same database, a right outer join or a full outer join. The BI Server doesn’t do any work here except to issue a single SQL query, and you can see just the one “RqList” (request list) in the execution plan, indicating again that the BI Server thinks it only needs to put together one query to satisfy the request.

If, however, one of the logical dimension tables had its logical table source re-pointed to a separate physical database, as shown in the screenshot below, the BI Server would now have to do the join itself, as it can’t be pushed down to the underlying database (as there are now two of them).

Bis26

In this case, two SQL queries would be issued, one against each of the two physical databases, and the BI Server would do the join in-memory (or to disk, I’ll elaborate on this later on). The corresponding logical execution plan from a level 5 log file would now look like this:

-------------------- Execution plan:

RqBreakFilter <<2465>>[1] [for database 0:0,0]
RqList <<2466>> [for database 0:0,0]
D1.c2 as c1 [for database 3023:2500,44],
sum(D1.c5 by [ D1.c2]  at_distinct [ D1.c2, D1.c3] ) as c2 [for database 0:0,0]
Child Nodes (RqJoinSpec): <<2478>> [for database 0:0,0]
(
RqList <<2482>> [for database 0:0,0]
D902.c1 as c2 GB [for database 3023:2500,44],
D901.c2 as c3 [for database 3023:132,44],
D901.c3 as c5 [for database 3023:132,44]
Child Nodes (RqJoinSpec): <<2490>> [for database 0:0,0]

(
RqList <<2495>> [for database 3023:132:orcl,44]
SALES.PROD_ID as c2 [for database 3023:132,44],
sum(SALES.QUANTITY_SOLD by [ SALES.PROD_ID] ) as c3 [for database 3023:132,44]
Child Nodes (RqJoinSpec): <<2504>> [for database 3023:132:orcl,44]
SALES T211
GroupBy: [ SALES.PROD_ID]  [for database 3023:132,44]
OrderBy: c2 asc [for database 3023:132,44]
) as D901
InnerJoin <<2492>> On D901.c2 = D902.c2; actual join vectors:  [ 0 ] =  [ 1 ]

(
RqList <<2517>> [for database 3023:2500:orcl2,44]
PRODUCTS.PROD_SUBCATEGORY_DESC as c1 GB [for database 3023:2500,44],
PRODUCTS.PROD_ID as c2 [for database 3023:2500,44]
Child Nodes (RqJoinSpec): <<2523>> [for database 3023:2500:orcl2,44]
PRODUCTS T2502
OrderBy: c2 asc [for database 3023:2500,44]
) as D902
OrderBy: c2, c3 [for database 0:0,0]
) as D1
OrderBy: c1 asc [for database 0:0,0]

Notice the “InnerJoin <<2492>> On D901.c2 = D902.c2; actual join vectors: [ 0 ] = [ 1 ]“ that is in the middle of the execution plan, between the two main Rqlists – this tells you that the BI Server is doing the join, as it would only appear here if it couldn’t be pushed down to the underlying database. You might also find references to LeftOuterJoin, RightOuterJoin and FullOuterJoin here, depending on how the join between the tables is defined in the physical or logical table source joins in your semantic layer.

Joining Facts with Conforming Dimensions Together

Another situation occurs when you are joining fact tables together that share conforming dimensions. A simple example of this is where you create a request that requires data from two or more fact tables that share conforming dimensions, such as those shown in the screenshot below:

Bis25-1

As requests such as these can potentially lead to “fan trap” issues (explained in this blog post), the BI Server knows that it has to generate two logical queries and join, or “stitch” them together to avoid the fan trap. If both fact tables are sourced from the same physical database, and this database supports subquery factoring (the “WITH” clause that you see in Oracle 10gR2/11g SQL statements) then it will generate the following execution plan, which has a FullOuterStitchJoin between the two inner RqList blocks:

RqBreakFilter <<3571>>[3] [for database 0:0,0]
    RqList <<3462>> [for database 3023:2820:orcl3,46]
        D1.c1 as c1 GB [for database 3023:2820,46],
        D2.c1 as c2 GB [for database 3023:2820,46],
        case  when D1.c2 is not null then D1.c2 when D2.c2 is not null then D2.c2 end  as c3 GB [for database 3023:2820,46]
    Child Nodes (RqJoinSpec): <<3567>> [for database 3023:2820:orcl3,46]
        (
            RqList <<3474>> [for database 3023:2820:orcl3,46]
                sum(COSTS.UNIT_COST by [ PRODUCTS.PROD_SUBCATEGORY_DESC] ) as c1 GB [for database 3023:2820,46],
                PRODUCTS.PROD_SUBCATEGORY_DESC as c2 GB [for database 3023:2820,46]
            Child Nodes (RqJoinSpec): <<3507>> [for database 3023:2820:orcl3,46]
                PRODUCTS T2874
                COSTS T2830
            DetailFilter: COSTS.PROD_ID = PRODUCTS.PROD_ID [for database 0:0]
            GroupBy: [ PRODUCTS.PROD_SUBCATEGORY_DESC]  [for database 3023:2820,46]
        ) as D1 FullOuterStitchJoin <<3565>> On D1.c2 = D2.c2
        (
            RqList <<3511>> [for database 3023:2820:orcl3,46]
                sum(SALES.AMOUNT_SOLD by [ PRODUCTS.PROD_SUBCATEGORY_DESC] ) as c1 GB [for database 3023:2820,46],
                PRODUCTS.PROD_SUBCATEGORY_DESC as c2 GB [for database 3023:2820,46]
            Child Nodes (RqJoinSpec): <<3544>> [for database 3023:2820:orcl3,46]
                PRODUCTS T2874
                SALES T2911
            DetailFilter: PRODUCTS.PROD_ID = SALES.PROD_ID [for database 0:0]
            GroupBy: [ PRODUCTS.PROD_SUBCATEGORY_DESC]  [for database 3023:2820,46]
        ) as D2
    OrderBy: c3 asc [for database 3023:2820,46]

The BI Server Navigator then generates a single SQL statement off of this execution plan, which queries both fact tables using subquery factoring, and then brings the results together in the main body of the statement:

-------------------- Sending query to database named orcl3 (id: <<3462>>):
WITH
SAWITH0 AS (select sum(T2830.UNIT_COST) as c1,
     T2874.PROD_SUBCATEGORY_DESC as c2
from
     PRODUCTS T2874,
     COSTS T2830
where  ( T2830.PROD_ID = T2874.PROD_ID )
group by T2874.PROD_SUBCATEGORY_DESC),
SAWITH1 AS (select sum(T2911.AMOUNT_SOLD) as c1,
     T2874.PROD_SUBCATEGORY_DESC as c2
from
     PRODUCTS T2874,
     SALES T2911
where  ( T2874.PROD_ID = T2911.PROD_ID )
group by T2874.PROD_SUBCATEGORY_DESC)
select distinct SAWITH0.c1 as c1,
     SAWITH1.c1 as c2,
     case  when SAWITH0.c2 is not null then SAWITH0.c2 when SAWITH1.c2 is not null then SAWITH1.c2 end  as c3
from
     SAWITH0 full outer join SAWITH1 On SAWITH0.c2 = SAWITH1.c2
order by c3

If the physical database doesn’t support subquery factoring, such as Oracle Database 10gR1 or higher, then the BI Server generates a slightly different execution plan, again with a FullOuterStitchJoin, like this:

-------------------- Execution plan:

RqBreakFilter <<3115>>[3] [for database 0:0,0]
    RqList <<3006>> [for database 0:0,0]
        D903.c1 as c1 GB [for database 3023:2820,44],
        D903.c2 as c2 GB [for database 3023:2820,44],
        case  when D903.c3 is not null then D903.c3 when D903.c4 is not null then D903.c4 end  as c3 GB [for database 3023:2820,44]
    Child Nodes (RqJoinSpec): <<3117>> [for database 0:0,0]
        (
            RqList <<3160>> [for database 0:0,0]
                D901.c1 as c1 GB [for database 3023:2820,44],
                D902.c1 as c2 GB [for database 3023:2820,44],
                D901.c2 as c3 [for database 3023:2820,44],
                D902.c2 as c4 [for database 3023:2820,44]
            Child Nodes (RqJoinSpec): <<3163>> [for database 0:0,0]

                    (
                        RqList <<3018>> [for database 3023:2820:orcl3,44]
                            sum(COSTS.UNIT_COST by [ PRODUCTS.PROD_SUBCATEGORY_DESC] ) as c1 GB [for database 3023:2820,44],
                            PRODUCTS.PROD_SUBCATEGORY_DESC as c2 GB [for database 3023:2820,44]
                        Child Nodes (RqJoinSpec): <<3051>> [for database 3023:2820:orcl3,44]
                            PRODUCTS T2874
                            COSTS T2830
                        DetailFilter: COSTS.PROD_ID = PRODUCTS.PROD_ID [for database 0:0]
                        GroupBy: [ PRODUCTS.PROD_SUBCATEGORY_DESC]  [for database 3023:2820,44]
                        OrderBy: c2 asc [for database 3023:2820,44]
                    ) as D901 FullOuterStitchJoin <<3109>> On D901.c2 = D902.c2; actual join vectors:  [ 1 ] =  [ 1 ]

                    (
                        RqList <<3055>> [for database 3023:2820:orcl3,44]
                            sum(SALES.AMOUNT_SOLD by [ PRODUCTS.PROD_SUBCATEGORY_DESC] ) as c1 GB [for database 3023:2820,44],
                            PRODUCTS.PROD_SUBCATEGORY_DESC as c2 GB [for database 3023:2820,44]
                        Child Nodes (RqJoinSpec): <<3088>> [for database 3023:2820:orcl3,44]
                            PRODUCTS T2874
                            SALES T2911
                        DetailFilter: PRODUCTS.PROD_ID = SALES.PROD_ID [for database 0:0]
                        GroupBy: [ PRODUCTS.PROD_SUBCATEGORY_DESC]  [for database 3023:2820,44]
                        OrderBy: c2 asc [for database 3023:2820,44]
                    ) as D902
        ) as D903
    OrderBy: c3 asc [for database 0:0,0]

This is then resolved for this database into two separate SQL statements, which then joined “in-memory” together by the BI Server.

-------------------- Sending query to database named orcl3 (id: <<3018>>):

select sum(T2830.UNIT_COST) as c1,
     T2874.PROD_SUBCATEGORY_DESC as c2
from
     PRODUCTS T2874,
     COSTS T2830
where  ( T2830.PROD_ID = T2874.PROD_ID )
group by T2874.PROD_SUBCATEGORY_DESC
order by c2

+++Administrator:2a0000:2a0005:----2010/02/28 15:05:31

-------------------- Sending query to database named orcl3 (id: <<3055>>):

select sum(T2911.AMOUNT_SOLD) as c1,
     T2874.PROD_SUBCATEGORY_DESC as c2
from
     PRODUCTS T2874,
     SALES T2911
where  ( T2874.PROD_ID = T2911.PROD_ID )
group by T2874.PROD_SUBCATEGORY_DESC
order by c2


Joining Table Sources within a Logical Fact

Another situation is a fact table may have more than one logical table source, because individual measures are sourced from different data sources or perhaps measures may be mapped in at differing levels of granularity (this blog post describes such a scenario). In this case, again the BI Server will initially try and push the join down to the underlying database, something that may be possible if a single physical database is used and we can use a technique like subquery factoring; more likely though it will require the BI Server to issue two or more physical SQL statements and then bring the results back together again using a FullOuterStitchJoin.

-------------------- Execution plan:

RqList <<7829>> [for database 0:0,0]
    D1.c1 as c1 [for database 0:0,0],
    D1.c2 as c2 [for database 0:0,0],
    D1.c3 as c3 [for database 0:0,0],
    D1.c4 as c4 [for database 3023:4210,44]
Child Nodes (RqJoinSpec): <<7842>> [for database 0:0,0]
    (
        RqList <<7809>> [for database 0:0,0]
            D1.c1 as c1 [for database 0:0,0],
            D1.c2 as c2 [for database 0:0,0],
            D1.c3 as c3 [for database 0:0,0],
            D1.c4 as c4 [for database 3023:4210,44],
            D1.c5 as c5 [for database 0:0,0]
        Child Nodes (RqJoinSpec): <<7824>> [for database 0:0,0]
            (
                RqBreakFilter <<7808>>[1,2,5] [for database 0:0,0]
                    RqList <<7604>> [for database 0:0,0]
                        case  when D903.c1 is not null then D903.c1 when D903.c2 is not null then D903.c2 end  as c1 GB [for database 0:0,0],
                        case  when D903.c3 is not null then D903.c3 when D903.c4 is not null then D903.c4 end  as c2 GB [for database 0:0,0],
                        D903.c5 as c3 GB [for database 0:0,0],
                        D903.c6 as c4 GB [for database 3023:4210,44],
                        case  when D903.c7 is not null then D903.c7 when D903.c8 is not null then D903.c8 end  as c5 GB [for database 0:0,0]
                    Child Nodes (RqJoinSpec): <<7844>> [for database 0:0,0]
                        (
                            RqList <<7915>> [for database 0:0,0]
                                D901.c1 as c1 [for database 0:0,0],
                                D902.c1 as c2 [for database 3023:4210,44],
                                D902.c2 as c3 [for database 3023:4210,44],
                                D901.c2 as c4 [for database 0:0,0],
                                D901.c3 as c5 GB [for database 0:0,0],
                                D902.c3 as c6 GB [for database 3023:4210,44],
                                D901.c4 as c7 [for database 0:0,0],
                                D902.c4 as c8 [for database 3023:4210,44]
                            Child Nodes (RqJoinSpec): <<7918>> [for database 0:0,0]

                                    (
                                        RqList <<7851>> [for database 0:0,0]
                                            D1.c2 as c1 [for database 0:0,0],
                                            D1.c3 as c2 [for database 0:0,0],
                                            D1.c1 as c3 GB [for database 0:0,0],
                                            D1.c4 as c4 [for database 0:0,0]
                                        Child Nodes (RqJoinSpec): <<7854>> [for database 0:0,0]
                                            (
                                                RqBreakFilter <<7687>>[2,3] [for database 0:0,0]
                                                    RqList <<8040>> [for database 0:0,0]
                                                        D1.c1 as c1 [for database 0:0,0],
                                                        D1.c2 as c2 [for database 0:0,0],
                                                        D1.c3 as c3 [for database 0:0,0],
                                                        D1.c4 as c4 [for database 0:0,0]
                                                    Child Nodes (RqJoinSpec): <<8058>> [for database 0:0,0]
                                                        (
                                                            RqList <<7972>> [for database 3023:4483:Quotas,2]
                                                                sum(QUANTITY_QUOTAS.QUOTA by [ CATEGORY.CATEGORY, MONTHS.MONTH_MON_YYYY] ) as c1 [for database 3023:4483,2],
                                                                MONTHS.MONTH_MON_YYYY as c2 [for database 3023:4483,2],
                                                                CATEGORY.CATEGORY as c3 [for database 3023:4483,2],
                                                                MONTHS.MONTH_YYYYMM as c4 [for database 3023:4483,2]
                                                            Child Nodes (RqJoinSpec): <<7682>> [for database 3023:4483:Quotas,2]
                                                                CATEGORY T4486
                                                                MONTHS T4488
                                                                QUANTITY_QUOTAS T4492
                                                            DetailFilter: CATEGORY.CATEGORY = QUANTITY_QUOTAS.CATEGORY and MONTHS.MONTH_YYYYMM = QUANTITY_QUOTAS.MONTH_YYYYMM [for database 0:0]
                                                            GroupBy: [ CATEGORY.CATEGORY, MONTHS.MONTH_YYYYMM, MONTHS.MONTH_MON_YYYY]  [for database 3023:4483,2]
                                                        ) as D1
                                                    OrderBy: c2, c3 [for database 0:0,0]
                                            ) as D1
                                        OrderBy: c1 asc, c2 asc [for database 0:0,0]
                                    ) as D901 FullOuterStitchJoin <<7800>> On D901.c1 =NullsEqual D902.c1 and D901.c2 =NullsEqual D902.c2; actual join vectors:  [ 0 1 ] =  [ 0 1 ]

                                    (
                                        RqList <<7880>> [for database 3023:4210:orcl4,44]
                                            D2.c2 as c1 [for database 3023:4210,44],
                                            D2.c3 as c2 [for database 3023:4210,44],
                                            D2.c1 as c3 GB [for database 3023:4210,44],
                                            D2.c4 as c4 [for database 3023:4210,44]
                                        Child Nodes (RqJoinSpec): <<7883>> [for database 3023:4210:orcl4,44]
                                            (
                                                RqBreakFilter <<7760>>[2,3] [for database 3023:4210:orcl4,44]
                                                    RqList <<7989>> [for database 3023:4210:orcl4,44]
                                                        sum(ITEMS.QUANTITY by [ PRODUCT.CATEGORY, TIMES.MONTH_MON_YYYY] ) as c1 [for database 3023:4210,44],
                                                        TIMES.MONTH_MON_YYYY as c2 [for database 3023:4210,44],
                                                        PRODUCT.CATEGORY as c3 [for database 3023:4210,44],
                                                        TIMES.MONTH_YYYYMM as c4 [for database 3023:4210,44]
                                                    Child Nodes (RqJoinSpec): <<7755>> [for database 3023:4210:orcl4,44]
                                                        PRODUCT T4256
                                                        TIMES T4264
                                                        ITEMS T4239
                                                        ORDERS T4248
                                                    DetailFilter: ITEMS.ORDID = ORDERS.ORDID and ITEMS.PRODID = PRODUCT.PRODID and ORDERS.ORDERDATE = TIMES.DAY_ID [for database 0:0]
                                                    GroupBy: [ PRODUCT.CATEGORY, TIMES.MONTH_MON_YYYY, TIMES.MONTH_YYYYMM]  [for database 3023:4210,44]
                                            ) as D2
                                        OrderBy: c1 asc, c2 asc [for database 3023:4210,44]
                                    ) as D902
                        ) as D903
                    OrderBy: c1, c2, c5 [for database 0:0,0]
            ) as D1
        OrderBy: c5 asc, c2 asc, c4 asc [for database 0:0,0]
    ) as D1

Again, notice the FullOuterStitchJoin in the execution plan – this indicates that facts (as opposed to facts and dimensions) are being joined together.

This in turn leads to two separate SQL statements. The one against the “orcl” database is more complex because the results then need to be mapped to the aggregation level that the second source, “quotas”, comes in at:

-------------------- Sending query to database named Quotas (id: <<7972>>):
select sum(T4492."QUOTA") as c1,
     T4488."MONTH_MON_YYYY" as c2,
     T4486."CATEGORY" as c3,
     T4488."MONTH_YYYYMM" as c4
from
     "CATEGORY" T4486,
     "MONTHS" T4488,
     "QUANTITY_QUOTAS" T4492
where  ( T4486."CATEGORY" = T4492."CATEGORY" and T4488."MONTH_YYYYMM" = T4492."MONTH_YYYYMM" )
group by T4486."CATEGORY", T4488."MONTH_YYYYMM", T4488."MONTH_MON_YYYY"

+++Administrator:2b0000:2b000a:----2010/02/24 17:18:51

-------------------- Sending query to database named orcl4 (id: <<7880>>):

select D2.c2 as c1,
     D2.c3 as c2,
     D2.c1 as c3,
     D2.c4 as c4
from
     (select D1.c1 as c1,
               D1.c2 as c2,
               D1.c3 as c3,
               D1.c4 as c4
          from
               (select sum(T4239.QUANTITY) as c1,
                         T4264.MONTH_MON_YYYY as c2,
                         T4256.CATEGORY as c3,
                         T4264.MONTH_YYYYMM as c4,
                         ROW_NUMBER() OVER (PARTITION BY T4256.CATEGORY, T4264.MONTH_MON_YYYY ORDER BY T4256.CATEGORY ASC, T4264.MONTH_MON_YYYY ASC) as c5
                    from
                         PRODUCT T4256,
                         TIMES T4264,
                         ITEMS T4239,
                         ORDERS T4248
                    where  ( T4239.ORDID = T4248.ORDID and T4239.PRODID = T4256.PRODID and T4248.ORDERDATE = T4264.DAY_ID )
                    group by T4256.CATEGORY, T4264.MONTH_MON_YYYY, T4264.MONTH_YYYYMM
               ) D1
          where  ( D1.c5 = 1 )
     ) D2
order by c1, c2

So, to summarize things so far:

  • Where possible, the BI Server will try and generate a single SQL statement to resolve a request
  • And if possible, any joins that are required between tables will be pushed down to the database
  • If table data sources are located on separate physical databases, the BI Server will request the individual data source data blocks, and then join the results together in-memory using an inner, left outer, right outer or full outer join as appropriate
  • If facts (or measures within a fact) are being joined together, the BI Server will need to generate one logical query per logical table source, and bring the data together with a full outer stitch join
  • As mentioned above, if it’s possible to do this stitch join at the database level (using, for example, a WITH clause), it’ll do so
  • Otherwise the BI Server will generate separate SQL statements and join the data together in-memory

When an in-memory BI Server join happens between two tables, it will bring back both sets of data from the two (or more) table sources and then perform a sort-merge join to bring the data together. If possible, it will push the sort back to the underlying database and just do the “merge” part of the join, and it’ll in all likelihood page some of the temporary data to TMP files in $ORACLEBIDATA/tmp depending on the load on the server, available memory and the number of concurrent queries that it is running. The NQSConfig.INI BI Server parameter VIRTUAL_TABLE_PAGE_SIZE determines the point at which temporary data is paged to disk, and on a Unix server you can experiment with increasing it from its default setting if you have lots of unused memory available (the docs suggest that this will probably not have much of a positive effect, though).

Fragmented Data Sources

Another variation on a join that the BI Server can do is a “union” between two queries. This is most common when you have fragmented data sources, such as the example below where part of the data in the sales table comes from one table, and part from another.

Bis34-1

In this case, the logical execution plan will contain RqUnionAll between the inner RqList request lists, to show that the BI Server knows it needs to union all the two queries.

-------------------- Execution plan:

RqList <<7569>> [for database 3023:6594:orcl7,44]
    D3.c2 as c1 GB [for database 3023:6594,44],
    sum(D3.c3 by [ D3.c2] ) as c2 GB [for database 3023:6594,44]
Child Nodes (RqJoinSpec): <<7695>> [for database 3023:6594:orcl7,44]
    (
        RqList <<7613>> [for database 3023:6594:orcl7,44]
            PRODUCTS.PROD_SUBCATEGORY_DESC as c2 [for database 3023:6594,44],
            SALES_UPTO_2003.AMOUNT_SOLD as c3 [for database 3023:6594,44]
        Child Nodes (RqJoinSpec): <<7617>> [for database 3023:6594:orcl7,44]
            PRODUCTS T6596
            SALES T6629
        DetailFilter: PRODUCTS.PROD_ID = SALES_UPTO_2003.PROD_ID [for database 0:0]
        RqUnion All <<7690>> [for database 3023:6594:orcl7,44]
        RqList <<7668>> [for database 3023:6594:orcl7,44]
            PRODUCTS.PROD_SUBCATEGORY_DESC as c2 [for database 3023:6594,44],
            SALES_BEYOND_2003.AMOUNT_SOLD as c3 [for database 3023:6594,44]
        Child Nodes (RqJoinSpec): <<7672>> [for database 3023:6594:orcl7,44]
            PRODUCTS T6596
            SALES T6637
        DetailFilter: PRODUCTS.PROD_ID = SALES_BEYOND_2003.PROD_ID [for database 0:0]
    ) as D3
GroupBy: [ D3.c2]  [for database 3023:6594,44]
OrderBy: c1 asc [for database 3023:6594,44]

Then, depending on whether the BI Server can resolve this using a single query or multiple queries against separate data source, either a single SQL statement like the one below will be issued, or separate statements will be issued and the BI Server will do the union all in memory.

select D3.c2 as c1,
     sum(D3.c3) as c2
from
     ((select T6596.PROD_SUBCATEGORY_DESC as c2,
               T6629.AMOUNT_SOLD as c3
          from
               PRODUCTS T6596,
               SALES T6629 /* SALES_UPTO_2003 */
          where  ( T6596.PROD_ID = T6629.PROD_ID )
          union all
          select T6596.PROD_SUBCATEGORY_DESC as c2,
               T6637.AMOUNT_SOLD as c3
          from
               PRODUCTS T6596,
               SALES T6637 /* SALES_BEYOND_2003 */
          where  ( T6596.PROD_ID = T6637.PROD_ID ) )
     ) D3
group by D3.c2
order by c1

Driving Tables (Parameterized Nested Loop Joins)

I mentioned in the paragraph above that BI Server joins are typically done using the sort-merge algorithm. One variation on this though is when you set one of the two tables in a business model and mapping logical join to be a driving table, typically because you are federating fact and dimension tables and one table is much smaller than the other, as shown in the screenshot below.

Bis31-1

The first thing to understand with driving tables is that they are regarded as a “hint” by the BI Server, and the BI Server may well choose to ignore the setting if it makes more sense to perform the join as normal (presumably, when both tables are relatively small). If the driving table instruction is followed, though, the BI Server will always do the join in-memory, even if both tables come from logical table sources pointing to the same physical database. In the execution plan shown below, you can see the InnerJoin (left drive) that indicates a parameterized nested loop join (PNLJ) will be required, and as the name suggests the BI Server will perform a nested loop join rather than the sort-merge join that it usually uses to join tables together.

-------------------- Execution plan:
RqBreakFilter <<8705>>[1] [for database 0:0,0]
    RqList <<8972>> [for database 0:0,0]
        D1.c2 as c1 [for database 3023:2500,44],
        sum(D1.c5 by [ D1.c2]  at_distinct [ D1.c2, D1.c3] ) as c2 [for database 0:0,0]
    Child Nodes (RqJoinSpec): <<8984>> [for database 0:0,0]
        (
            RqList <<8463>> [for database 0:0,0]
                D901.c1 as c2 GB [for database 3023:2500,44],
                D902.c2 as c3 [for database 3023:5035,44],
                D902.c3 as c5 [for database 3023:5035,44]
            Child Nodes (RqJoinSpec): <<8707>> [for database 0:0,0]

                    (
                        RqList <<8757>> [for database 3023:2500:orcl2,44]
                            PRODUCTS.PROD_NAME as c1 GB [for database 3023:2500,44],
                            PRODUCTS.PROD_ID as c2 [for database 3023:2500,44]
                        Child Nodes (RqJoinSpec): <<8760>> [for database 3023:2500:orcl2,44]
                            PRODUCTS T2502
                        DetailFilter: PRODUCTS.PROD_NAME = '128MB Memory Card' or PRODUCTS.PROD_NAME = '3 1/2" Bulk diskettes, Box of 100' or PRODUCTS.PROD_NAME = '5MP Telephoto Digital Camera' or PRODUCTS.PROD_NAME = '64MB Memory Card' or PRODUCTS.PROD_NAME = 'Deluxe Mouse' or PRODUCTS.PROD_NAME = 'Envoy Ambassador' or PRODUCTS.PROD_NAME = 'Envoy External 8X CD-ROM' or PRODUCTS.PROD_NAME = 'Martial Arts Champions' or PRODUCTS.PROD_NAME = 'Model A3827H Black Image Cartridge' or PRODUCTS.PROD_NAME = 'Model C93822D Wireless Phone Battery' or PRODUCTS.PROD_NAME = 'Model CD13272 Tricolor Ink Cartridge' or PRODUCTS.PROD_NAME = 'PCMCIA modem/fax 28800 baud' or PRODUCTS.PROD_NAME = 'SIMM- 16MB PCMCIAII card' or PRODUCTS.PROD_NAME = 'Smash up Boxing' or PRODUCTS.PROD_NAME = 'Unix/Windows 1-user pack' [for database 0:0]
                        OrderBy: c2 asc [for database 3023:2500,44]
                    ) as D901
                InnerJoin  (left drive) <<8806>> On D901.c2 = D902.c2; actual join vectors:  [ 1 ] =  [ 0 ]

                    (
                        RqList <<8790>> [for database 3023:5035:orcl5,44]
                            SALES.PROD_ID as c2 [for database 3023:5035,44],
                            sum(SALES.AMOUNT_SOLD by [ SALES.PROD_ID] ) as c3 [for database 3023:5035,44]
                        Child Nodes (RqJoinSpec): <<8793>> [for database 3023:5035:orcl5,44]
                            SALES T5126
                        DetailFilter: SALES.PROD_ID = ?1 or SALES.PROD_ID = ?2 or SALES.PROD_ID = ?3 or SALES.PROD_ID = ?4 or SALES.PROD_ID = ?5 or SALES.PROD_ID = ?6 or SALES.PROD_ID = ?7 or SALES.PROD_ID = ?8 or SALES.PROD_ID = ?9 or SALES.PROD_ID = ?10 or SALES.PROD_ID = ?11 or SALES.PROD_ID = ?12 or SALES.PROD_ID = ?13 or SALES.PROD_ID = ?14 or SALES.PROD_ID = ?15 or SALES.PROD_ID = ?16 or SALES.PROD_ID = ?17 or SALES.PROD_ID = ?18 or SALES.PROD_ID = ?19 or SALES.PROD_ID = ?20 [for database 0:0]
                        GroupBy: [ SALES.PROD_ID]  [for database 3023:5035,44]
                        OrderBy: c2 asc [for database 3023:5035,44]
                    ) as D902
            OrderBy: c2, c3 [for database 0:0,0]
        ) as D1
    OrderBy: c1 asc [for database 0:0,0]

Then then leads to the following parameterized SQL statements being issued, with the first statement representing the “driving” query, and the second the “probing” one against the larger table.

-------------------- Sending query to database named orcl2 (id: <<8757>>):

select T2502.PROD_NAME as c1,
     T2502.PROD_ID as c2
from
     PRODUCTS T2502
where  ( T2502.PROD_NAME in ('128MB Memory Card', '3 1/2" Bulk diskettes, Box of 100', '5MP Telephoto Digital Camera', '64MB Memory Card', 'Deluxe Mouse', 'Envoy Ambassador', 'Envoy External 8X CD-ROM', 'Martial Arts Champions', 'Model A3827H Black Image Cartridge', 'Model C93822D Wireless Phone Battery', 'Model CD13272 Tricolor Ink Cartridge', 'PCMCIA modem/fax 28800 baud', 'SIMM- 16MB PCMCIAII card', 'Smash up Boxing', 'Unix/Windows 1-user pack') )
order by c2

+++Administrator:2c0000:2c000a:----2010/02/24 21:06:47

-------------------- Sending query to database named orcl5 (id: <<8790>>):

select T5126.PROD_ID as c2,
     sum(T5126.AMOUNT_SOLD) as c3
from
     SALES T5126
where  ( T5126.PROD_ID in (:PARAM1, :PARAM2, :PARAM3, :PARAM4, :PARAM5, :PARAM6, :PARAM7, :PARAM8, :PARAM9, :PARAM10, :PARAM11, :PARAM12, :PARAM13, :PARAM14, :PARAM15, :PARAM16, :PARAM17, :PARAM18, :PARAM19, :PARAM20) )
group by T5126.PROD_ID
order by c2

In reality you rarely see driving table joins being used as there are much better solutions to bringing together small and large tables together – the main one being to co-locate the tables and then push the join down to the database, rather than bring both datasets together and have the BI Server join them in memory instead (this also applies to a lesser degree to all BI Server joins). But this could be a useful “quick fix” until such time as you can co-locate the data, and its useful to remember that these types of joins are always done by the BI Server due to the need to iterate through drive/probe operations.

Persist Connnection Pools

One final variation on BI Server execution plans and join types is when you set up a “persist connection pool”. Persist connection pools are typically used in two scenarios; firstly, where Oracle/Siebel Marketing is being used, and secondly, where the underlying physical database doesn’t handle large numbers of values in an IN-list. In this case, you can set up a second connection pool within a physical database and specify it as the persist connection pool, as shown in the screenshot below:

Bis30-1

I’ve never encountered a persist connection pool “in the wild”, so to speak, but an example query log output from when one was used is shown below. In this instance, the first query was sent to a MS Analysis Services database, and a persist connection pool was used to materialize the in-list results into a database table which is then joined back to the ORDERS table in the final query, rather than have the BI Server do the join in-memory.

-------------------- Sending query to database named FoodMart (id: <<10980>>):
With
  member [Measures].[YearAnc] as 'ancestor([Time].Currentmember,[Time].[Year]).name'
  set [Q] as '{{[Time].[Year].members}}'
  select
    {[measures].[YearAnc]} on columns,
    {[Q]} on rows
  from     [Sales]

-------------------- Sending query to database named SQLDB_Northwind (id: CreateTable TransGateway):
CREATE TABLE TTCH5C5DEL554110000020000003 ( column1 VARCHAR2(8) )

-------------------- Sending query to database named SQLDB_Northwind (id: <<11057>>):
select distinct TO_NUMBER(TO_CHAR(T1864.OrderDate, 'yyyy'), '9999') as c1
from
     Orders T1864
where  ( TO_NUMBER(TO_CHAR(T1864.OrderDate, 'yyyy'), '9999') in (select column1 from TTCH5C5DEL554110000020000003) )


Conclusions

So, there you have it. The join strategy of the BI Server, as is the case with functions and calculations, is to wherever possible push them down to the underlying database. If this can’t be done, because either the database version doesn’t support features like subquery factoring, or if the data for the request is being sourced from more than one physical databas, the BI Server will do the join itself, initially in-memory but usually with temporary data being paged to disk.

There are two main types of BI Server join; regular (inner, left outer, right outer and fullouter) joins for bringing together fact and dimension tables; and full outer stitch joins, for bringing together facts and measures. There are also variations for handling joins from very small tables to very large tables (driving tables, or parameterized nested loop joins), or when the physical database doesn’t support large in-lists, however these issues are usually better handled by co-locating data or upgrading the database.

Finally, even though the BI Server is pretty clever at doing these types of joins, you’re usually better trying to invest your time in physically bringing your data together into a data mart or data warehouse than spending too much time fine-tuning these joins, though a knowledge of how they work (and how to read a level 5 execution plan) can be useful if you have to understand, or tune, an existing system in-place. Of course the level 5 execution plan doesn’t really tell you anything you couldn’t determine by looking at the design of the RPD – there’s nothing that goes on beyond this that might change the execution plan for a certain set of data, unlike the Oracle database which changes the plan from database to database depending on the distribution and nature of the data – but its interesting to get a peek into the workings of the BI Server Navigator module.

How to build Excel Dashboards – 6

admin | Mar 2, 2010 21:49 +0000

Previous – Excel Dashboards – 5

Create Year and Category Parameters

Remember, we have not built any charts yet. It is very important that we build our foundation Qlet and its corresponding parameters so that all of the drop down parameters can be leveraged in the charts.

It definitely helps to plan your dashboard development. Since the dashboard tool is easy to work with, we should still make an attempt to always plan so that we don’t duplicate our effort.

During the development, it may happen that you will get lots of ideas on implementing different ways. I would suggest to save the base dashboard definition once you have frozen all the parameters and Qlets. Before you start building charts, create a copy of the dashboard file as backup.

In our data, we have details about items that are sold on a particular date. We have the unit price, and quantity information. We also have the category for each item that they belong to. So it makes sense to include a parameter for the item category so that we can select only the items belonging to a particular category.

Right click on the “Category Name”, navigate to Create ==> Create Parameter

Image

It creates a new parameter named “Category Name”

Image

 

You may place this anywhere you feel appropriate.

Similarly, right click on the “Or Year” column and create the parameter

Image

 

Image

 

NOTE: Often it does not make sense to have a year drop down and a date range parameter.

Because if the year value and the date ranges fall apart or do not intersect then you will not see any data output. In this tutorial we provided the year and date ranges only for demonstration so if you need only one, do not use the other.

Next – Create Excel Dashboard – 7

ShareThis

How to build Excel Dashboards – 5

admin | Mar 2, 2010 21:39 +0000

Previous Article – Create Excel Dashboards – 4

Add Date Hierarchy Columns

Most of the data, has some form of date information. Now date is very granular information. Very often, to generate trending information, you need to derive year and month columns.

In the dashboard designer, you can easily generate Date hierarchy columns for any kind of reporting.

In our example, we have order date, so we will generate the Year, Quarter, Month, Day , Week and other flavors of the same information for better representation.

Right click on the “Order Date” column and navigate to Create ‘ “Add Year, Quarter, Months column

Image

 

Image

 

Click on “Add” and it will add the following columns to the right of the primary column “Order Date”

Image

 

You may not need all of the date derived columns but it is worth understanding what they stand for.

The derive columns, have a prefix of “Or” which are the first two letters of the primary date column “Order Date”. The remaining part explains the actual part of the date.
1. Year : represents the year component of the date
2. Mth N : represents the numerical month.
3. Mth Disp: represents the month number but padded with zero on the left
4. Mth NN: represents month number and 3 letter month name
5. Yr Mth : represents the year and month number
6. Qtr: represents the quarter number
7. Week : represents the week of year
8. Dy: represents day of year
9. … and so on.

The concept of date hierarchy even though simple to implement is very powerful. Within few minutes you can derive important “Insight” into your data. We will now begin the process of deriving intelligence out of boring data. Now begins the fun part.

Create Date Filter and Parameters

If you work with real transaction application, you already know that the amount of data is huge and the data has some form of date column. This date could be transaction date, invoice date, order date, payment date etc.

One of the dashboard and reporting best practices is to have some form of date filter and/or parameters so users can dynamically choose the date range. This helps in narrowing down the data for analysis.

Let us create a simple date filter.

One of the typical requirement is to see data for “as of today”.
In the dashboard designer it is few click process to implement this.
Image

Right click on the Date column (order date), Navigate to

Create ==> Create Date Filter

Image

 

The detail use of the Date Filter is covered in a separate tutorial. As we have very minimal use of the date filter for the dashboard that we are building we will just touch the basics.

In the date filter, on the left you see the column name and on the drop down you see different options such as below
Image

” First Day of Month
” Last Day of Month
” First Day of Qtr
” Last Day of Qtr
” First Day of Week
” Last Day of Week
” First Day of Year
” Last day of Year

For our purpose we just limited our data to be less than today.

Click on “use this filter” and it will be applied to the Qlet.

One of the best practice is to avoid hard coding of any values in the Query. For e.g, The dashboard designer allows you to build drop down list of values as parameters to the query. We will leverage this option to the fullest so that the dashboard is extremely useful and can respond to data changes.

We will create a From and To Date range parameter

Right click on the “Order Date” column, Navigate to Create ‘ Create Parameter
It instantly creates two Data Parameters
Image

We arrange them and size them appropriately as below

Image

 

Once arranged and formatted, you can type the date or popup the date picker by click on the button on the side

Image

 

Now, right click on the Title and select “Refresh”

Image

 

Change the From and To Date values and refresh, the Qlet will change the data accordingly.

Next – Create Excel Dashboards – 6

ShareThis

How to Create Excel Dashboards – 4

admin | Mar 2, 2010 21:13 +0000

Previous Part Build Excel Dashboards – 3

We will Add Calculations and Formulas

Add Calculations
Since this is a raw extract of actual transactions, it may need to add additional calculations and columns to create useful metrics.

For e.g in our sample file, we have the “Unit Price” and “Quantity” columns. We will create calculation to derive the “Amount” column

Image

 

Right click on the Quantity Column, Navigate to Popup Menu

Create ‘ –>Create Calculation
It shows a calculation wizard.
Image

1. Click on “Unit Price”
2. Click on * multiplication operator
3. Click on Quantity
4. Rename the column to “Amount”
5. Click on “Use this Formula”

As you see, the “Amount” column is added right next to the Quantity column

Image

 

Next – Add Date Hierarchy Columns Excel Dashboards – 5

ShareThis

OBIEE Quirky Report Name

John Minkjan | Mar 2, 2010 12:47 +0000

Have you ever noticed when a report name starts with  “CON”, the file system name will be converted to “co%6E”.

image

Let start an insider joke: “It’s the usenet handle of the original programmer ;-)”

Anyhow it’s one of those things you will have to keep in  mind when scripting reports.

Till Next Time

OBIA 7.9.6.1 Performance Recommendations

Christian Berg | Mar 2, 2010 10:30 +0000
Thanks to rnm1978 for alerting me to the availability of this rather juicy document on MOS:

https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=870314.1

And the really cool part?

Oracle positions Exadata solution as fast and efficient hardware for addressing I/O bottlenecks in large volume environments. The internal benchmarks for running Oracle BI Applications on Exadata will be published soon.


Sweeeet!

OBIA 7.9.6 Performance Recommendations

rnm1978 | Mar 2, 2010 09:40 +0000
A new document has been published by Oracle, discussing ways of improving performance for OBIA 7.9.6 and 7.9.6.1. Its primary focus is around improving ETL performance. There’s some very interesting content including hardware sizing recommendations, and I’d strongly recommend anyone working with OBIA reads it. It’s called “Oracle Business Intelligence Applications Version 7.9.6 Performance Recommendations” [...]

Wisdom of Crowds BI Market Study

Tim Tow | Mar 2, 2010 09:21 +0000
Howard Dresner, formerly the Chief Strategy Officer at Hyperion, is now an independent BI Strategy consultant and is currently conducting a BI market survey. His “Wisdom of Crowds” Business Intelligence Market Study is not underwritten or sponsored by any vendors and no detailed data will be shared with the outside world. If you are interested in having your opinions help shape the BI market, you can participate in the study at http://www.surveymonkey.com/s/QMZTVSV.

I have known Howard for a number of years and we became fast friends when we learned we were both pilots. For some reason, however, my 210 is faster than his airplane although I didn't buy his explanation that salaries for 'Chief Strategy Officers' are not that high.. Rather, I think it boils down to the fact that I spend a greater part of my salary for the care and feeding of my plane!



Oracle ACE Award

sranka | Mar 2, 2010 08:10 +0000

Dear All,

I have been confirmed with Prestigious Oracle ACE award. I want to take this moment to thank all friends, colleague, and family for all the support and feedback. More than me, you all deserve this award.

About Oracle ACE Program : Click Here .

My Profile in Oracle ACE Directory

Please continue your support.


Thanks,

Sunil S Ranka
“Superior BI is the antidote to Business Failure”


OBIEE Content at ODTUG Kaleidoscope 2010

Mark Rittman | Mar 2, 2010 08:04 +0000

As well as organizing our own BI Forum in the UK, another event I’ve had a hand in is ODTUG Kaleidoscope 2010, which is due to run from June 27th – July 1st in Washington D.C. I’ve been the content lead for the BI, DW and Hyperion Reporting Tools stream, and if you were thinking of coming to the BI Forum but couldn’t make it over to the UK, this event has a similar level of OBIEE content and might be of interest to you.

Odtug

Like the BI Forum, we wanted to focus on the technical aspects of OBIEE, and as the event is running Stateside we were able to get some of the key product managers from Oracle to come and present, who don’t usually make it to general user group events like this but were keen to contribute to our OBIEE focus. Here’s a taster for what’s on the agenda:

OBIEE 11g Integration with Oracle ADF Business Components
Palaniappan Chidambaram, Oracle Corporation

With Oracle Middleware getting smarter and robust, it’s time to update the data source of OBIEE from relational database to Fusion ADF Business Model Components (light-weight Java View Objects). This is an efficient way to add pervasive BI to the enterprise applications built using Fusion ADF. Learn the new OBIEE 11g and Fusion ADF integration with added metadata on security, UI hints, and much more.

I met Palanippan back at last year’s Open World, and he’s responsible for some of the OBIEE/ADF integration that’s coming with the new Fusion Release of Oracle Applications. Palanippan will be talking about something that will be of interest to both OBIEE, and ADF developers, and will be speaking about it from the perspective of someone responsible for its features and usage.

Best Practices for Performance, Scalability, and Reliability with Oracle BI Enterprise Edition
Mike Durran, Oracle Corporation, and
Oracle OBIEE Metadata Modeling Best Practices and Tips for Concurrent Development
Alan Fuller, Oracle Corporation

The initial setup and configuration of an Oracle BI system can reap benefits in terms of ongoing performance and reliability. Mike’s session describes the creation of a system that can scale to your enterprise from the initial install, configuration for optimum performance, ongoing monitoring, and troubleshooting tips.

In Alan’s session, a senior member of the OBIEE product management team will cover best practices for metadata modeling in OBIEE including: using the power of the enterprise semantic layer for performance tuning and query optimization, rapid development processes, managing frequently encountered stumbling blocks, and making it easy for end users. Also covered will be tips for concurrent development of metadata across multiple developers.

From speaking with past attendees of Kaleidoscope, a regular bit of feedback is that people want to hear “best practice” sessions on their technology area of interest. Mike and Alan are therefore going to give Oracle’s view on best practices for OBIEE performance, and RPD modeling, which will also give the audience the change to discuss their best practices and comment on the ones put forward by Oracle.

Web Services and Application Integration with Oracle Business Intelligence EE Suite
David Granholm, Oracle Corporation

A variety of techniques can be used to build novel applications which extend the core capabilities of the OBIEE Suite. Approaches include URL-based integration, Web services leveraging SOAP methods, portal and WebCenter integration, and custom applications built in Oracle’s Application Development Framework (ADF) using Oracle JDeveloper – Oracle’s main development tool for Java-based SOA applications. Each of these approaches and a few real-world examples will be discussed.

David is a really good speaker, and this session will complement Palanippan’s and extend the discussion to SOAP and web services. Again, something of interest for general application developers as well as OBIEE developers.

Oracle Business Intelligence Applications Essbase Integrator
Alaric Thomas, Oracle Corporation

Oracle has a number of powerful business intelligence technologies in its portfolio, and we are rapidly integrating these technologies to provide more value and lower TCO for customers. In this session, Alaric Thomas will discuss how the Oracle BI Applications – Essbase Integrator will bring together the capabilities of the Oracle BI Applications and Oracle Essbase to further leverage a common applications information model delivering value to both IT and end users.

I blogged on this presentation a while ago, and this will be an excellent opportunity to see how the integration of Essbase and Oracle BI Apps is taking place, using technology available now rather than being based on the Fusion BI Apps.

Oracle® Hyperion Smart View for Office, Fusion Edition
Toufic Wakim, Oracle Corporation

For those working with a financial application, Oracle Essbase or Oracle Business Intelligence Suite Enterprise Edition Plus, Oracle Hyperion Smart View for Office brings access to data and report templates and the ability to author reports in the Microsoft Office framework. This session gives an overview of this product and demonstrates new features. It also shows how to use it with Oracle Hyperion Planning, Oracle Hyperion Financial Management, Oracle Business Intelligence Suite Enterprise Edition Plus, and Oracle Essbase.

SmartView is supposed to replace the BI Office add-in for OBIEE, but in its current incarnation doesn’t work too well. Apparently a new version that is more suited to OBIEE will be released shortly, and Toufic hopefully will discuss and demonstrate this version at Kaleidoscope. If you’re interested in the future of MS Office integration with OBIEE, this will be a must-see session.

Apart from these sessions, there’s a couple from myself on OBIEE 11g new features (assuming I can get sign-off to present, as 11g is doubtful for release by the time of Kaleidoscope), and a whole bunch of OWB, DW and ODI sessions as well as content aimed at Hyperion users upgrading to OBIEE. Outside of the sessions we’ll be organizing a social event specifically for the BI community, and there’ll of course be other great Essbase, Hyperion, Oracle and development sessions running concurrently.

Registration is open now, and if you’re serious about OBIEE and based Stateside (or even outside the USA, and fancy a trip to DC), this is definitely the event to attend in 2010.

Win a Free Kaleidoscope Registration

Tim Tow | Mar 1, 2010 21:56 +0000
ODTUG is set to announce a new contest that gives you a chance to win a free registration to Kaleidoscope or, alternatively, a suite upgrade at the Marriott Wardman Park.

All you need to do is create a 3 minute (or less) YouTube video on “Why You Need to Attend ODTUG Kaleidoscope 2010”.  Make it funny, musical, or informational.  Mostly, make it memorable.  Make it viral and we will see you in DC!

To qualify, make sure to include the ODTUG Kaleidoscope logo and submit your video for consideration by emailing you name and YouTube video link to Crystal Walton at crystal@odtug.com.  By the way, Crystal works on the small team of full-time ODTUG people who work very hard to make Kaleidoscope the best Hyperion conference in the world.

Video Contest Guidelines: Video must be three minutes or less, contain no inappropriate content, and be posted publicly to YouTube by March 31, midnight EST. All videos must remain in public domain until after July 1st, 2010. The winning video will be selected from among all the entries. The judges’ decision will be final. Judges may include members of the ODTUG Kaleidoscope Conference Committee and others. Winner will be notified by April 2nd, 2010. Winner will have the choice of a complimentary non-transferable registration to ODTUG Kaleidoscope 2010 or a suite upgrade at the Marriott Wardman Park.

Essbase 11 Formatted Values, Text and Date Measures

Tim Tow | Mar 1, 2010 20:08 +0000
There was a question on the Network54 recently about formatted values in Essbase 11, so I thought I would post a little info from 'under the covers' with formatted values and how they relate to Text Measures and Dates.

First, Essbase does support formatted values in Essbase 11. You can see in this screenshot from EAS, the format string is a property of the member.












The format string is in MDX format and there seems to be no valid examples in the docs of how to actually do a format.  Thanks to the beauty of the internet and the hard work of Venkat at RittmanMead Consulting, I found an example on their blog.  In the screenshot below, you will see the result of the formatted value:





Under the covers, there is a setting at the API level that determines if the formatted values are retrieved from the database.  Uses of Smart View control this setting using the 'Format String' option in the Options dialog.

The same setting also governs whether Date and Text Measures are returned as either the formatted values or as numbers (which are what are stored in the database).   Here is a screenshot of how the Java API returns formatted values with both the formatted values returned and the the formatted values not returned:








The value circled in yellow is a formatted value; note that the number below is the numeric value (which can also be returned in the same retrieve at the API level).  The value circled in green is a Date measure; I don't think my date values are valid though.  It seems limited that the earliest date supported is January 1, 1970 (and I think I used Excel dates to fill the cell which starts at January 1, 1900.

The values circled in red and blue are Text Measures.  Text Measures get their values from Smart Lists.  Here is the EAS editor showing my simple Smart List:















OK, so it says 'Text Lists' in the dialog, but if you look at the API level, you see Smart Lists:










Now you probably know more about Formatted Values, Text and Date Measures than you ever cared to know.

Inside the Oracle BI Server Part 2 : How Is A Query Processed?

Mark Rittman | Mar 1, 2010 01:15 +0000

In the first article on this series about the Oracle BI Server, I looked at the architecture and functions within this core part of the OBIEE product set. in this article, I want to look closer at what happens when a query (or “request”) comes into the BI Server, and how it translates it into the SQL, MDX, file and XML requests that then get passed to the underlying data sources.

In the previous article, I laid out a conceptual diagram of the BI Server and talked about how the Navigator turned incoming queries into one or more physical database queries. As a recap, here’s the architecture diagram again:

Bis2-1

Now as we all know, the BI Server uses a three-layer metadata model that exposes one or more databases (or “subject areas”) for ODBC-compliant query tools to run queries against. Here’s a typical metadata model that takes a number of physical data sources, joins them together into a smaller number of business model and mapping models, and then presents them out to the query tool (usually, Oracle BI Answers) as a set of databases made up of relational tables, columns and joins.

Bis7

Usually you access this metadata model using Oracle BI Answers, which presents you with an initial choice of subject areas (databases in ODBC terminology) and then displays the contents of one of them as a list of tables and columns (in 11g, you’ll be able to to include tables from multiple subject areas in queries as long as there are tables in common between them).

Bis9

Other ODBC-compliant query tools, such as Microsoft Excel, Cognos or Business Objects, can access these subject areas and run queries against them just as if it was a regular database. Here’s Microsoft Excel 2007 building a query against the same subject area:

Bis8

What Happens When the BI Server Handles a Query?

So just what happens then, when a query (or “request’) comes in from one of these sources, and needs to be processed in order to return results to the user? As you’re probably aware, the BI Server doesn’t itself hold data (except cached results from other queries, when this feature is enabled); instead, it translates the incoming “logical” query into one or more outgoing “physical” queries against the relevant data sources. As such, a logical model presented to users might be mapped to data in an Oracle data warehouse, an E-Business Suite application, some data in a Teradata data warehouse, some measures in an Essbase cube and even some data in an Excel spreadsheet. The BI server resolves this complexity by creating a simplified, star schema business model over these data sources so that the user can query it as if it’s a single source of data.

If you’re used to the Oracle database, you’ll probably know that it has various components that are used to resolve queries – the library cache, query rewrite, table and system statistics, etc – and both rule-based and cost-based optimizers that are used to generate a query plan. For most modern Oracle systems, a statistics-based cost-based optimizer (most famously documented by Jonathan Lewis in this book) is used to generate a number of potential execution plans (which can be displayed in a 10035 trace), with the lowest cost being chosen to run the query. Now whilst the equivalent process isn’t really documented for the BI Server, what it appears to do is largely follow a rule-based approach with a small amount of statistics being used (or not used, as I’ll mention in a moment). In essence, the following sources of metadata information are consulted when creating the query plan for the BI Server;

  • The presentation (subject area) layer to business model layer mapping rules;
  • The logical table sources for each of the business columns used in the request;
  • The dimension level mappings for each of the logical table sources;
  • The “Number of Elements at this Level” count for each dimension level (potentially the statistics bit, though anecdotally I’ve heard that these figures aren’t actually used by the BI Server);
  • Whether caching is enabled, and if so, whether the query can be found in the cache;
  • What physical features are enabled for the particular source database for each column (and whether they are relational, multi-dimensional, file, XML or whatever)
  • Specific rules for generating time-series queries, binning etc, and
  • Security settings and filters

As far as I can tell, there are no indexes, no statistics (apart from the dimension level statistics mentioned above) and no hints; there is however query rewrite and aggregates, as the BI Server allows aggregate tables to be defined which are then mapped in to specific levels in a dimension hierarchy. Cleverly, the back-end data source doesn’t even have to be an SQL database, and can in fact be a multi-dimensional database such as Essbase, Oracle OLAP or Microsoft Analysis Services, with the multi-dimensional dataset that they return converted into a row-based dataset that can be joined to other data coming in from a more traditional relational database.

“A Day in the Life of a Query”

A good way of looking at what Oracle has termed “A day in the life of a query”, is to take a look at some slides from a presentation that Oracle used regularly around the time of the introduction of Oracle BI EE. I’ll go through it slide by slide and add some interpretation from myself.

1. A query comes in from Answers or any other ODBC query tool, asking for one or more columns from a subject area. Overall, the function within the BI Server that deals with this is called Intelligent Request Generation, marked in yellow in the diagram below.

Bis10

2. The query is then passed to the Logical Request Generation engine, marked in yellow in the diagram below. The request itself requires the Brand, Closed Revenue (ultimately held in the GL system), Service Requests (held in the CRM system) and Share of Revenue (a calculated, or derived, measure). As such it’s going to require multiple physical SQL queries and multi-pass calculations, all of which will be worked out by another part of the BI Server architecture, the Navigator.

Bis11

3. Once the logical request has been generated but before its passed off to the Navigator, a check is made (if this feature is enabled) as to whether the logical request can be found in the cache. Cache Services will either do a fast, or more comprehensive match of the incoming request against those stored in the query cache, and if found, return the results from there rather than have the BI Server run physical SQL against the business model’s data sources.

Bis12

For a more detailed look at what Cache Services does, the old Siebel Analytics Administration Tool documentation has a good flowchart that explains what goes on:

Bis13

The key bit is the Cache Hit step. In general, a cache hit will occur if the following conditions are met:

  • Caching is enabled (CACHE=Y in the NQSConfig.INI file);
  • The WHERE clause in the logical SQL is semantically the same, or a logical subset of a cached statement;
  • All of the columns in the SELECT list have to exist in the cached query, or they must be able to be calculated from them;
  • It has equivalent join conditions, so that the resultant joined table of any incoming query has to be the same as (or a subset of) the cached results
  • If DISTINCT is used, the cached copy has to use this attribute as well
  • Aggregation levels have to be compatible, being either the same or more aggregated than the cached query
  • No further aggregation (for example, RANK) can be used in the incoming query
  • Any ORDER BY clause has to use columns that are also in the cached SELECT list

In addition, there are two NQSConfig.INI parameters that I think were added in the last few releases (as I can’t find them mentioned in the Siebel Analytics documentation) are USE_ADVANCED_HIT_DETECTION and MAX_SUBEXPR_SEARCH_DEPTH. The latter determines how many levels into an expression (for example, SUM(MAX(SIN(COS(TAN(ABS(TRUNC(PROFIT)))))))) that the cache hit detector will go in trying to get a match, whilst the former turns on some additional cache hit searches that you might want to enable if caching is important but not otherwise happening. Unfortunately the docs don’t really expand on what these additional searches are or the performance impact that they can introduce, so if anyone has any more information on this, I’d be glad to hear.

4. If the cache can’t provide the answer to the request, the request then gets passed to the Navigator. The Navigator handles the logical request “decision tree” and determines how complex the request is, what data sources (logical table sources) need to be used, whether there are any aggregates that can be used, and overall what is the best way to satisfy the request, based on how you’ve set up the presentation, business model and mapping, and physical layers in your RPD.

Bis14

5. Within the Navigator, the Multi-Pass / Sub-Request Logic function analyzes the incoming request and works out the complexity of the query. It works out whether it requires multiple passes (for example, calculates the average of two aggregated measures), or whether the request is based on the results of another request (in other words, uses a sub-request). The BI Server then uses this information to work out the optimal way to gather the required data and do the calculations; in the example used in the slides, the revenue share calculation is based on the other two measures and is therefore considered “multi-pass”.

Bis15

6. A measure used within the business model and mapping layer may be “fragmented”, which means that it is logically partitioned so that historic information, for examples, comes from a data warehouse whilst current information comes from an OLTP application. The Fragment Optimization Engine within the Navigator sits between the incoming request and the Execution Engine and where appropriate, transforms the base-level SQL into “fragmented” SQL against each of the data sources mapped into the fragmented measure. For more background information on fragmentation, check out this old blog post on the subject.

Bis17

7. The final function within the Navigator is the Aggregate Navigator, which uses the logical table source mappings together (in theory) with the dimension level statistics to determine the most efficient table to fetch the data from (i.e. the table with the least number of records to successfully fulfil a request).

Bis16

8. The Optimized Query Rewrites function within the BI Server then takes the query plan generated by the Navigator and rewrites it to use the features of the underlying database engines, adding RANK(OVER()) calculations if Oracle is being used, for example (referred to as “function shipping”) or just getting the raw data and having the BI Server do the calculations afterwards, if working with a database that doesn’t support analytic SQL functions. This part of the BI Server is also responsible for generating XML queries, or MDX queries for OLAP sources,
which are then sent to the underlying physical databases, in parallel, so that they can retrieve their relevant data sets.

Bis18

9. Once the data is retrieved, the results combined together and any further calculations applied, the results are returned to the calling application via the ODBC interface, and also copied to the cache along with the logical SQL query if caching is enabled.

Bis21

The BI Server’s knowledge of what each source database can support, in terms of SQL functions, is determined by the contents of the DBFeatures.INI configuration file which can in turn be over-ridden by the “Features” tab in the Database settings in the physical database model.

Bis19

I think I’ve also noticed that, from release to release of OBIEE, the way that time-series queries, for example, get resolved into physical SQL queries changes over time, as Oracle get better at generating efficient SQL queries to resolve complex calculations. It’s also the case that currently, for Essbase data sources, very few of the functions used by the BI Server get function-shipped to their equivalent MDX functions, though this is meant to be improving in the forthcoming 11g release (and in the meantime, you can use EVALUATE and EVALUATE_AGGR to call MDX functions directly).

Level 5 Logging, and Logical Execution Plans

You can see what goes on when a complex, multi-pass request that requires multiple data sources is sent through from Answers and gets logged in the NQQuery.log file with level 5 logging enabled. The query requests “quantity” information that is held in an Oracle database, “quotas” that comes from an Excel spreadsheet, and “variance” which is derived from quantity minus quotas. Both columns need to be aggregated before the variance calculation can take place, and you can see from the logs the Navigator being used to resolve the query.

Starting off, this is the logical request coming through.

-------------------- Logical Request (before navigation):

RqList
    Times.Month Name as c1 GB,
    Quantity:[DAggr(Items.Quantity by [ Times.Month Name, Times.Month ID] )] as c2 GB,
    Quota:[DAggr(Items.Quota by [ Times.Month Name, Times.Month ID] )] as c3 GB,
    Quantity:[DAggr(Items.Quantity by [ Times.Month Name, Times.Month ID] )] - Quota:[DAggr(Items.Quota by [ Times.Month Name, Times.Month ID] )] as c4 GB,
    Times.Month ID as c5 GB
OrderBy: c5 asc

Then the navigator breaks the query down, works out what sources, multi-pass calculations and aggregates can be used, and generates the logical query plan.

-------------------- Execution plan:

RqList <<993147>> [for database 0:0,0]
    D1.c1 as c1 [for database 0:0,0],
    D1.c2 as c2 [for database 3023:491167,44],
    D1.c3 as c3 [for database 0:0,0],
    D1.c4 as c4 [for database 0:0,0]
Child Nodes (RqJoinSpec): <<993160>> [for database 0:0,0]
    (
        RqList <<993129>> [for database 0:0,0]
            D1.c1 as c1 [for database 0:0,0],
            D1.c2 as c2 [for database 3023:491167,44],
            D1.c3 as c3 [for database 0:0,0],
            D1.c4 as c4 [for database 0:0,0],
            D1.c5 as c5 [for database 0:0,0]
        Child Nodes (RqJoinSpec): <<993144>> [for database 0:0,0]
            (
                RqBreakFilter <<993128>>[1,5] [for database 0:0,0]
                    RqList <<992997>> [for database 0:0,0]
                        case  when D903.c1 is not null then D903.c1 when D903.c2 is not null then D903.c2 end  as c1 GB [for database 0:0,0],
                        D903.c3 as c2 GB [for database 3023:491167,44],
                        D903.c4 as c3 GB [for database 0:0,0],
                        D903.c3 - D903.c4 as c4 GB [for database 0:0,0],
                        case  when D903.c5 is not null then D903.c5 when D903.c6 is not null then D903.c6 end  as c5 GB [for database 0:0,0]
                    Child Nodes (RqJoinSpec): <<993162>> [for database 0:0,0]
                        (
                            RqList <<993219>> [for database 0:0,0]
                                D902.c1 as c1 [for database 0:0,0],
                                D901.c1 as c2 [for database 3023:491167,44],
                                D901.c2 as c3 GB [for database 3023:491167,44],
                                D902.c2 as c4 GB [for database 0:0,0],
                                D902.c3 as c5 [for database 0:0,0],
                                D901.c3 as c6 [for database 3023:491167,44]
                            Child Nodes (RqJoinSpec): <<993222>> [for database 0:0,0]

                                    (
                                        RqList <<993168>> [for database 3023:491167:ORCL,44]
                                            D1.c2 as c1 [for database 3023:491167,44],
                                            D1.c1 as c2 GB [for database 3023:491167,44],
                                            D1.c3 as c3 [for database 3023:491167,44]
                                        Child Nodes (RqJoinSpec): <<993171>> [for database 3023:491167:ORCL,44]
                                            (
                                                RqBreakFilter <<993051>>[2] [for database 3023:491167:ORCL,44]
                                                    RqList <<993263>> [for database 3023:491167:ORCL,44]
                                                        sum(ITEMS.QUANTITY by [ TIMES.MONTH_MON_YYYY] ) as c1 [for database 3023:491167,44],
                                                        TIMES.MONTH_MON_YYYY as c2 [for database 3023:491167,44],
                                                        TIMES.MONTH_YYYYMM as c3 [for database 3023:491167,44]
                                                    Child Nodes (RqJoinSpec): <<993047>> [for database 3023:491167:ORCL,44]
                                                        TIMES T492004
                                                        ITEMS T491980
                                                        ORDERS T491989
                                                    DetailFilter: ITEMS.ORDID = ORDERS.ORDID and ORDERS.ORDERDATE = TIMES.DAY_ID [for database 0:0]
                                                    GroupBy: [ TIMES.MONTH_MON_YYYY, TIMES.MONTH_YYYYMM]  [for database 3023:491167,44]
                                            ) as D1
                                        OrderBy: c1 asc [for database 3023:491167,44]
                                    ) as D901 FullOuterStitchJoin <<993122>> On D901.c1 =NullsEqual D902.c1; actual join vectors:  [ 0 ] =  [ 0 ]

                                    (
                                        RqList <<993192>> [for database 0:0,0]
                                            D2.c2 as c1 [for database 0:0,0],
                                            D2.c1 as c2 GB [for database 0:0,0],
                                            D2.c3 as c3 [for database 0:0,0]
                                        Child Nodes (RqJoinSpec): <<993195>> [for database 0:0,0]
                                            (
                                                RqBreakFilter <<993093>>[2] [for database 0:0,0]
                                                    RqList <<993319>> [for database 0:0,0]
                                                        D1.c1 as c1 [for database 0:0,0],
                                                        D1.c2 as c2 [for database 0:0,0],
                                                        D1.c3 as c3 [for database 0:0,0]
                                                    Child Nodes (RqJoinSpec): <<993334>> [for database 0:0,0]
                                                        (
                                                            RqList <<993278>> [for database 3023:496360:Quotas,2]
                                                                sum(QUANTITY_QUOTAS.QUOTA by [ MONTHS.MONTH_MON_YYYY] ) as c1 [for database 3023:496360,2],
                                                                MONTHS.MONTH_MON_YYYY as c2 [for database 3023:496360,2],
                                                                MONTHS.MONTH_YYYYMM as c3 [for database 3023:496360,2]
                                                            Child Nodes (RqJoinSpec): <<993089>> [for database 3023:496360:Quotas,2]
                                                                MONTHS T496365
                                                                QUANTITY_QUOTAS T496369
                                                            DetailFilter: MONTHS.MONTH_YYYYMM = QUANTITY_QUOTAS.MONTH_YYYYMM [for database 0:0]
                                                            GroupBy: [ MONTHS.MONTH_YYYYMM, MONTHS.MONTH_MON_YYYY]  [for database 3023:496360,2]
                                                        ) as D1
                                                    OrderBy: c2 [for database 0:0,0]
                                            ) as D2
                                        OrderBy: c1 asc [for database 0:0,0]
                                    ) as D902
                        ) as D903
                    OrderBy: c1, c5 [for database 0:0,0]
            ) as D1
        OrderBy: c5 asc [for database 0:0,0]
    ) as D1

Notice the “FullOuterStitchJoin” in the middle of the plan? We’ll look into this more in the next posting in this series. For now though, this logical query plan is then passed to the Optimized Query Rewrites and Execution Engine, which then generates in this case two physical SQL statements that are then passed back, and “stitch joined”, by the BI Server, before performing the post-aggregation calculation required for the variance measure.

-------------------- Sending query to database named ORCL (id: <<993168>>):

select D1.c2 as c1,
     D1.c1 as c2,
     D1.c3 as c3
from
     (select D1.c1 as c1,
               D1.c2 as c2,
               D1.c3 as c3
          from
               (select sum(T491980.QUANTITY) as c1,
                         T492004.MONTH_MON_YYYY as c2,
                         T492004.MONTH_YYYYMM as c3,
                         ROW_NUMBER() OVER (PARTITION BY T492004.MONTH_MON_YYYY ORDER BY T492004.MONTH_MON_YYYY ASC) as c4
                    from
                         CUST_ORDER_HISTORY.TIMES T492004,
                         CUST_ORDER_HISTORY.ITEMS T491980,
                         CUST_ORDER_HISTORY.ORDERS T491989
                    where  ( T491980.ORDID = T491989.ORDID and T491989.ORDERDATE = T492004.DAY_ID )
                    group by T492004.MONTH_MON_YYYY, T492004.MONTH_YYYYMM
               ) D1
          where  ( D1.c4 = 1 )
     ) D1
order by c1

+++Administrator:2b0000:2b000e:----2010/02/23 16:04:42

-------------------- Sending query to database named Quotas (id: <<993278>>):
select sum(T496369."QUOTA") as c1,
     T496365."MONTH_MON_YYYY" as c2,
     T496365."MONTH_YYYYMM" as c3
from
     "MONTHS" T496365,
     "QUANTITY_QUOTAS" T496369
where  ( T496365."MONTH_YYYYMM" = T496369."MONTH_YYYYMM" )
group by T496365."MONTH_YYYYMM", T496365."MONTH_MON_YYYY"


Memory Usage and Paging Files

If you follow the BI Server at the process level during these steps, you’ll find that memory usage is largely determined at startup time by the size and complexity of the RPD thats attached online, and then goes up by around 50MB when the first query is executed. After that, memory usage tends to go up the more concurrent sessions that are run, and also when cross-database joins are performed. You’ll also find TMP files being created in $ORACLEBIDATA/tmp directory, which are used by the BI Server to hold temporary data as it pages out from memory, again typically when cross-database joins are used but also when it needs to perform additional aggregations that can’t be put into the physical SQL query.

Bis20

These files can get fairly big (up to 2GB in some cases) and can be created even when a single data source is used, typically for grouping data or as we’ll see in the next posting, when joining data across fact tables. They are usually cleared down when the BI Server and Presentation Server are restarted, but bear in mind when creating complex calculations that they can get pretty I/O intensive on the BI Server hardware.

So that’s the basics in terms of how basic queries are processed by the BI Server, and how the various BI Server components and engines process the query as it goes through the various stages. Again, if anyone knows any more, please add it as a comment, but for now that’s it and I’ll be back in a few days with part 3, on BI Server In-Memory Joins.

Oracle BI EE 10.1.3.4.1 – Puzzlers – Puzzle 4 – Bypassing Security

Venkatakrishnan J | Feb 28, 2010 13:36 +0000

On to the 4th puzzle in this Puzzle series. This is a very interesting Puzzle at least from the perspective of BI EE Security. We all know that BI EE provides comprehensive security within the repository. For example, the screenshot below shows that the column CHANNEL_DESC can be accessed only by the exec user.

image

Now when we log in as any user(users who do not belong to Administrators group) and open the report containing this secured column, we will either be getting an error or will be shown as NULL in reports depending on the PROJECT_INACCESSIBLE_COLUMN_AS_NULL property in the NQSConfig.ini. The question is how do we bypass the security and still show the CHANNEL_DESC column(with their values of course) in reports. I know this sounds a bit malicious and can even be read as a security hack, but this is currently possible in BI EE(in the solution i will let you know how to use a workaround to disable this though). The question or the Puzzle today is how do we achieve this. There are some potential use cases of this

1. If you do not have access to the repository but still want to look at certain security protected columns to validate certain reports

2. If you do not have a column in the repository at all (but exists in the database) and you still want to access it.

Remember this is not about enabling Direct Database Requests as in most cases that will always be disabled. For example, if you look at the report below, CHANNEL_DESC has become null for the user i have logged in as this user does not have access to this column(PROJECT_INACCESSIBLE_COLUMN_AS_NULL  is set to YES in my case).

image

If you look at the SQL, you will notice that the CHANNEL_DESC column is not even pushed back to the database.

WITH
SAWITH0 AS (select sum(1) as c1,
     T4167.CHANNEL_ID as c2
from
     CHANNELS T4167
group by T4167.CHANNEL_ID)
select distinct SAWITH0.c2 as c1,
     cast(NULL as  VARCHAR ( 1 ) ) as c2,
     SAWITH0.c1 as c3
from
     SAWITH0
order by c1, c2

which is good and as expected. Now, the puzzle is to somehow bypass this security and display the CHANNEL_DESC column as shown below.

image

Remember, there are 2 pre-conditions to this

1. No Direct Database Requests

2. No changing the repository to add a new column in the presentation layer

Oracle BI EE 10.1.3.4.1 – Solutions – Puzzle 3

Venkatakrishnan J | Feb 27, 2010 14:09 +0000

After almost a week of me giving out the Puzzle 3, there was hardly any interest for this one (just couple of odd replies requesting clarification of the Puzzle). Probably this is because this one has no direct practical usage and also there are lots of possibilities. But to me this is one very important Puzzle as in many cases when we are called in for repository tuning, the first question that we normally get in such situations is, why does BI EE generate such a big query when the same report can be solved by a very simple query. In such cases, we need to know where to look at and also understand what can cause BI EE to generate long SQLs. Remember, there is no theoretical limit to the length of the SQL generated (we can make it to generate as big a SQL as we want). The puzzle was meant primarily to know the possibilities of what can make BI EE to generate long SQLs. I always try to visualize a repository by looking at the SQL. That generally helps in doing further analysis on a pre-built repository.

Solution 1 – Conforming Dimensions:

This is probably the easiest and the most common reason why BI EE generates a lot of sub-queries. Always conforming dimensions should be used only when absolutely necessary as that will start generating sub-queries for every fact. For example, if you look at the repository below, it shows a very simple Business Model and Mapping layer containing one dimension with one Logical table source pointing to the physical CHANNELS table.It also contains 3 Facts each having a count metric (Mapped to 1 in the BMM layer for all the 3 columns)

image

image

There is also one more Logical Column that basically adds all the 3 columns together using a logical calculation

image

Now, when you generate a report using the CHANNEL_TOTAL and the logical calculated column, you will notice that BI EE will generate 3 sub-queries and then will bring them together as shown below

image

So the solution is you can create n number of conforming dimensions like this to make BI EE to produce n sub-queries thereby making the SQL very long. In this case there is no need for actually using conforming dimensions. The same SQL can actually be converted into a single SQL with all the counts (without the sub-queries). This basically demonstrates a bad use of Conforming Dimensions.

Solution 2 – Fragmentation:

This is another way of generating big SQLs. Same source can be made to appear as part of UNION ALL queries using Fragmentation. For example, if you look at the repository below

image

there are basically 3 logical table sources that contribute to the Fact Count. Each logical table source is modeled in a way such that all of them contribute to the Count and follow Parallel Fragmentation as shown below

image image

image

And if you look at the SQL, you will notice that BI EE will fire 3 UNION ALLs to generate the count. You can make this query as big as you want by adding more and more logical table sources.

image

Solution 3 – Level Based Measures:

This is another possible solution where incorrect use of Level-Based aggregation can start generating pretty complex queries. In your queries, if you start noticing Partition By using ROW_NUMBER or SUM() OVER functions then that means level based measures are being used somewhere (not in all cases but in most of them). For example, lets look at the repository below

image

As you notice we basically have have 3 count columns each dependent on the other. Count1 is a normal measure assigned to a constant 1 and to the lowermost level in the Channel Dimension. Count 1 is a measure which is equal to measure Count(logically calculated) but assigned to the Channel Class level. Count3 is equal to measure Count2(logically calculated) but assigned to the Total level.

image

image

image

As you see, this basically demonstrates why logically calculated measures can have different level assignments than their base members. If you look at the SQL generated,

image

you will notice a number of sub-queries which will equal to the number of level assignments for each dependent measure.

In all the cases above, the queries were generated using a single Physical table and from just using 2 report attributes. Much more complex queries are possible using other methods but most of them will be a variation of the 3 listed above. Puzzle 4 to follow tomorrow.

OBIEE Presentation Server Always Refresh

John Minkjan | Feb 27, 2010 13:42 +0000

If you have a separate development presentation server, you can use the developersprompt: (http://obiee101.blogspot.com/2008/10/obiee-developersprompt.html) for bypassing the BI servercache. If you always want to bypass the Presentation server cache add this to the instanceconfig file of the development presentation server:

<WebConfig>
<ServerInstance>
<ForceRefresh>TRUE</ForceRefresh>
</ServerInstance>
</WebConfig>

Till Next Time