New Version Of XPLAN_ASH Tool – Video Tutorial

A new major release (version 3.0) of my XPLAN_ASH tool is available for download.

You can download the latest version here.

In addition to many changes to the way the information is presented and many other smaller changes to functionality there is one major new feature: XPLAN_ASH now also (more...)

ASM AU Size And LMT AUTOALLOCATE

When using Locally Managed Tablespaces (LMT) with variable, system managed extent sizes (AUTOALLOCATE) and data files residing in ASM the Allocation Unit (AU) size can make a significant difference to the algorithm that searches for free extents.The corresponding free extent search algorithm when searching for free extents (more...)

Free Webinar

It's webinar time again.

Join me on Wednesday, May 8th at AllThingsOracle.com for an overview session on the specifics of Oracle Parallel Execution.

The session starts at 16:00 UK (17:00 Central European) time. The webinar is totally free and the recording will made available afterwards.

Here's the link to (more...)

"Cost-free" joins – 2

In the previous post I've demonstrated an unexpected Nested Loop Join caused by an extreme data distribution. Although unexpected at first sight, the performance of the execution plan selected by the optimizer is decent - provided the estimates are in the right ballpark.Here is another case of an unexpected execution plan, this time about Merge Joins.

Merge Joins

In order to appreciate why the execution plan encountered is unexpected, first a quick summary about how Merge Joins work:A Merge Join is essentially a Nested Loop operation from one sorted row source into another sorted row source. In contrast (more...)

"Cost-free" joins – 1

Recently I came across some interesting edge cases regarding the costing of joins. They all have in common that they result in (at first sight) unexpected execution plans, but only some of them are actual threats to performance.

Outer Joins


The first one is about outer joins with an extreme data distribution. Consider the following data setup:


create table t1
as
select
rownum as id
, rpad('x', 100) as filler
, case when rownum > 1e6 then rownum end as null_fk
from
dual
connect by
level <= 1e6
;

exec dbms_stats.gather_table_stats(null, 't1')

create table t2
as
select
rownum as id
, rpad('x', 100) as filler
from
dual
connect by
level <= 1e6
;

exec dbms_stats.gather_table_stats(null, 't2')

create /*unique*/ index t2_idx on t2 (id);

The following query is a simple outer join between T1 and T2 and the default, unhinted execution plan that I get from 11.2.0.1 (11.1.0.7 and 10.2.0.4 show similar results):


select
t1.filler as t1_filler
, t2.filler as t2_filler
from
t1
, t2
where
t1.null_fk = t2.id (+)
;

---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000K| 202M| 4204 (1)| 00:00:51 |
| 1 | NESTED LOOPS OUTER | | 1000K| 202M| 4204 (1)| 00:00:51 |
| 2 | TABLE ACCESS FULL | T1 | 1000K| 101M| 4202 (1)| 00:00:51 |
| 3 | TABLE ACCESS BY INDEX ROWID| T2 | 1 | 106 | 2 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | T2_IDX | 1 | | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T1"."NULL_FK"="T2"."ID"(+))

The optimizer preferred a Nested Loop join albeit the fact that the number of estimated loop iterations is pretty large. Notice in particular the cost column: Although the inner rowsource is estimated to be started 1000K times, the cost of doing so corresponds to just a single execution.

For reference here is a cost estimate for a similar operation that corresponds to the expected costing model:


select /*+ use_nl(t1 t2) */
t1.filler as t1_filler
, t2.filler as t2_filler
from
t1
, t2
where
t1.id = t2.id (+)
;

---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000K| 202M| 3006K (1)| 10:01:21 |
| 1 | NESTED LOOPS OUTER | | 1000K| 202M| 3006K (1)| 10:01:21 |
| 2 | TABLE ACCESS FULL | T1 | 1000K| 101M| 4200 (1)| 00:00:51 |
| 3 | TABLE ACCESS BY INDEX ROWID| T2 | 1 | 106 | 3 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | T2_IDX | 1 | | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T1"."ID"="T2"."ID"(+))

I now had to force the Nested Loop join via a hint, because by default other join methods were preferred by the optimizer. The cost of a single iteration of the loop has now increased to 3, although the access is exactly the same as before (T2 random table access via index lookup of T2.ID), and the cost of the Nested Loop join corresponds to the known formula: Estimated starts times the cost of a single iteration, which is 3000K in this case here, plus the 4200 of the Full Table Scan for accessing the outer row source, plus CPU costing overhead.

So what makes the difference between the two? It's the data. The column name chosen for the column in T1 already reveals what's so special: The join column used (NULL_FK) is actually NULL for all rows.

The optimizer takes this into account and assumes that none of those rows from the driving row source will actually match a row of the inner row source - in fact the lookup to the inner row source could be short-circuited in some way, since a NULL value by definition isn't supposed to find a match for this join. I haven't investigated to what extent the runtime engine does this, however in the Rowsource Statistics the inner row source is started the expected number of times, although no logical I/O is recorded for it, but some CPU time, so at least some work seems to be done there.

Modifying the test case so that more of the FKs are actually non-null shows that the cost calculation is scaled accordingly. In fact the cost calculation is more or less the same than that of a corresponding inner join that could filter out those driving rows with NULL values in the join column.

The overall performance of the execution plan is quite decent, so although it looks quite unusual it performs pretty well.

In the second part I'll show another interesting, unexpected join execution plan that however can cause real performance problems.

QB_NAME hint query block name length limitation

Oracle 10g introduced the QB_NAME hint that can come handy in case hints need to be applied to more complex statements, in particular when possibly multiple layers of views / subqueries are involved.Jonathan Lewis has a older blog post that describes more details.Just in case you wonder why sometimes apparently the QB_NAME hint - along with all other hints that refer to the assigned query block name - seems to be ignored: One possible reason is that it looks like there is an undocumented length limitation of the query block names that can be assigned - 20 characters (more...)

Exadata Smart Scan Projection Limitation

Here is an interesting limitation to Exadata Smart Scans - if more than 254 columns from a table (not HCC compressed, more on that in moment) need to be projected, Smart Scans for that particular segment will be disabled and Exadata will fall back to conventional I/O. This means that the number of columns in the projection clause can make a significant difference to performance, since only Smart Scans allow taking advantage of offloading and particularly avoiding I/O via Storage Indexes.Now the expression "254 columns" might ring a bell, since it is the maximum number of columns that Oracle (more...)

HAVING Cardinality

When performing aggregate GROUP BY operations an additional filter on the aggregates can be applied using the HAVING clause.Usually aggregates are one of the last steps executed before the final result set is returned to the client.However there are various reasons, why a GROUP BY operation might be somewhere in the middle of the execution plan operation, for example it might be part of a view that cannot be merged (or was hinted not to be merged using the NO_MERGE hint), or in the more recent releases (11g+) the optimizer decided to use the GROUP BY PLACEMENT transformation (more...)

Hash Join Buffered

Introduction

A few years ago Jonathan Lewis published a blog post that described one of the interesting side effects of Oracle's Parallel Execution implementation: Sometimes operations that usually are non-blocking will be turned into blocking ones. Mostly these are represented by additional BUFFER SORT operation that show up in the parallel version of an execution plan from 10g on (pre-10g does the same internally but doesn't show in the execution plan), but there is a special case which is the HASH JOIN BUFFERED operation that gets used with the HASH data distribution of the join row sources.Jonathan came to (more...)

DBMS_XPLAN.DISPLAY_CURSOR And Parallel Execution

Introduction

DBMS_XPLAN.DISPLAY_CURSOR can be used to get more insights into the actual resource consumption on execution plan operation level when using the GATHER_PLAN_STATISTICS hint (from 10g on), or increasing the STATISTICS_LEVEL parameter to ALL (on session level, on system level the overhead is probably prohibitive).As soon as a SQL execution is done (either successfully, cancelled or with error) the corresponding extended data in the child cursor gets populated/updated and the additional information about the actual runtime profile can be accessed via V$SQL_PLAN_STATISTICS resp. V$SQL_PLAN_STATISTICS_ALL - this is what DISPLAY_CURSOR uses to populate the additional columns in the formatted (more...)

DOAG 2012 Conference

This is just a friendly reminder that I'll be talking at this year's DOAG Oracle conference about "Cost-Based Optimizer Basics" on Thursday, the 22nd of November, "Raum 1" at 13:00.

Additionally, I've reserved a slot at the "Unconference" (15:00 the same day) where I plan to present some of the interesting stuff that I cover as part of my "Parallel Execution Masterclass".

And soon the technical content will be back at this blog where I have lots of interesting stuff in the pipeline, mainly covering Parallel Execution topics.

OTN Mini Series "Understanding Parallel Execution"

As already announced previously OTN has now managed to publish my two part series called "Understanding Parallel Execution".

Part 1
Part 2

Thanks to Bob Rhubart from Oracle for his support in getting the articles published.

The articles should give you also an idea of what I cover in my "Mastering Oracle Parallel Execution" one day Oracle Expert seminar. More information can be found here.

New Version Of XPLAN_ASH Utility

A new version 2.0 of the XPLAN_ASH utility introduced here is available for download.You can download the latest version here.The change log tracks the following changes:- Access check- Conditional compilation for different database versions- Additional activity summary- Concurrent activity information (what is/was going on at the same time)- Experimental stuff: Additional I/O summary- More pretty printing- Experimental stuff: I/O added to Average Active Session Graph (renamed to Activity Timeline)- Top Execution Plan Lines and Top Activities added to Activity Timeline- Activity Timeline is now also shown for serial (more...)

Multiple Adverts

Just a couple of announcements:

1. I had the message already a couple of days added to the "Upcoming Public Appearances" sidebar on the right hand side of this page, now the official home page is available and registration open for the "CBO Days" at Trivadis 11th and 12th December in Zurich.

Join Mohamed Zait (Manager of the Query Optimizer Group at Oracle), Maria Colgan (Senior Principal Product Manager at Oracle), Jonathan Lewis, Jože Senegačnik, Christian Antognini and myself for two days fully focused on the Cost-Based Optimizer. You will not only learn about the history and present, but also (more...)

DOAG 2012

This year again I'm going to present at the DOAG 2012 conference in November. Since it is a popular topic I decided to talk about "Cost-Based Optimizer Basics" there, too. According to the official schedule the presentation will take place on Thursday, the 22nd of November, "Raum 1" at 13:00.

Like last year, I'll also try to do some Unconference sessions in addition - if it is going to take place, which I don't know yet.

I'll post some updates as soon as I know more details.

Exchange Partition, Virtual Columns And Column Statistics

Here is an odd bug that can lead to some nasty side effects when using the EXCHANGE PARTITION technique. It is probably there for a very long time, simply because it depends on the usage of virtual columns, and the basic technique of virtual columns was introduced way back in the Oracle 8i times with the introduction of Function Based Indexes.

The problem isn't the exchange partition operation itself, but the accompanying swap of object statistics information, in particular the column statistics.

Look the following sequence of DDL and DML commands and pay then special attention to the output (more...)

Parallel Execution Analysis Using ASH – The XPLAN_ASH Tool

Webinar Material

Thanks everyone who attended my last webinar at AllThingsOracle.com.

The recording of the webinar and all scripts used during the live demos are now available for download.

The README.TXT contains a description of the scripts used - there are a couple of helper scripts along with the main scripts, so that should allow you understanding which scripts you need to use for reproducing the issues demonstrated.

Furthermore the README contains links to some of the resources on the Internet that address questions that were raised during the Q+A session.


Thanks again to AllThingsOracle.com and James Murtagh for hosting the event.

Interesting blog notes

Just a pointer to two blog posts that I find worth mentioning:

1. Christo Kutrovsky from Pythian writes about some quirks he found regarding Parallel Distribution of aggregation and analytic functions. In particular the lower part of the post (which is not about the initial Interval Partitioning issue) gives a lot of food for thought how the chosen Parallel Distribution can influence the performance of operations

2. Alexander Anokhin gives some geek insights into how Oracle performs logical I/O and in particular how the "buffer is pinned count" statistics is maintained by Oracle. He even introduces another cool tool (" (more...)

Free Webinar

In a couple of days, on Wednesday, 1st of August, I'll be presenting another free webinar hosted at AllThingsOracle.com.

Although it is called "Oracle Cost-Based Optimizer Advanced Session", don't be mislead by the title.

It is not a truly "advanced" session, but rather I'll try to delve into various topics that I could only mention briefly or had to omit completely during the first webinar on the Cost-Based Optimizer.

In principle it's going to be a selection of the most recurring issues that I come across during my consultancy work:

- I'm going to spend some time (more...)