What is overloading and how and when do I use it

Dear Patrick,

Recently I heard someone talk about overloading in Java. What is it, is it possible in PL/SQL and if so, how would I use it?

Ramesh Cumar

Dear Ramesh,

Overloading is the technique of creating multiple programs with the same name that can be called with different sets of parameters. It is definitely possible to apply this technique in PL/SQL; in fact, Oracle does this a lot in their own built-in (more...)

Celebrate Batman Day with SQL

Because today is Batman Day, it calls for a celebration of the SQL type:

SQL> select listagg (b)
  2  within group (order by null) ||' Batman'
  3  from (select 0f/0 b
  4  from dual
  5  connect by level <= 14
  6  );

LISTAGG(B)WITHINGROUP(ORDERBYNULL)||'BATMAN'
--------------------------------------------------------------------------------
NanNanNanNanNanNanNanNanNanNanNanNanNanNan Batman

When would you use a normal table function?

Dear Patrick,

Last year I did a presentation on table functions at KScope. One of the questions I got was: ‘If pipelined table functions provide their results faster, why would you want to use a normal table function?’ I couldn’t come up with the answer then, maybe you can help?

Erik van Roon

Dear Erik,

Let’s start by briefly explaining what table functions are. Table functions are functions that return a collection of (more...)

What’s the difference between SEMI-JOIN and ANTI-JOIN?

Dear Patrick,

What is an ANTI-JOIN? And what is the difference between the SEMI-JOIN and the ANTI-JOIN?

Lillian Sturdey

Dear Lillian,

First of all, neither SEMI-JOIN nor ANTI-JOIN is part of the SQL syntax; they are more of a pattern. You might expect to be able to write something like:

[PATRICK]SQL>SELECT d.deptno, d.dname, d.loc
               FROM dept d
               SEMI JOIN emp e ON (e.deptno = d.deptno)
             /

to get all the (more...)

Find first n gaps in integer primary key

Setup

sokrates@12.1 > create table t( pk int primary key check(pk > 0));

Table created.

sokrates@12.1 > insert /*+ignore_row_on_dupkey_index(t(pk)) */ into t
sokrates@12.1 > select trunc(dbms_random.value(1, 1e5)) from dual
sokrates@12.1 > connect by level <= 1e5 
sokrates@12.1 > /

63187 rows created.

Finding the first n gaps

sokrates@12.1 > variable n number
sokrates@12.1 > exec :n := 1000

PL/SQL procedure successfully completed.

sokrates@12.1 > set autotr  (more...)

What is the difference between NVL and COALESCE?

Dear Patrick,

Could you tell me what the difference is between NVL and COALESCE?

Kindest regards,
Paul McCurdey

Dear Paul,

NVL returns the value of the first argument if it doesn’t evaluate to NULL, otherwise it will return the value of the second argument.
COALESCE returns the first argument that doesn’t evaluate to NULL. That can be any one of the arguments.
So they are definitely similar – but there are significant differences.
First of (more...)

RegExp: Constraint to prevent spaces at the beginning or end.

Even though a space is a regular character, the client didn't want spaces at the beginning or end of a string. Any spaces in the middle were fine.
Of course this could be handled by the application, but it must also be implemented in the database. Using a check constraint with a regular expression will prevent the end user from entering unwanted data.

To try things out, let's just start with a simple table with (more...)

SUBSTR Versus LIKE in Oracle 11.2

I was reading an old SQL tuning book which was printed in 2002. It said that a where clause with like could often use an index whereas a similar clause using substr could not. I wondered if this might still be the case in an Oracle 11.2.0.1 database. To find out, I created a table:

SQL> conn andrew/reid
Connected.
SQL> create table tab1 as
  2  select table_name from dba_tables
  3  (more...)

Real World SQL and PL/SQL: Advice from the Experts



Because my hero is Cary Millsap, I'm going to do what he did and publish my foreword Preface. All joking aside, I consider myself incredibly fortunate to have been included in this project. I learned...a lot, by simply trying to find the author's mistakes (and there were not many). There was a lot more work than I expected, as well. (Technical) Editing is a lot easier than writing, to be sure.

Brendan Tierney and (more...)

An introduction to Oracle Database for beginners: What is a relational database management system?—Part II

When you start thinking in terms such as security management, availability management, continuity management, change management, incident management, problem management, configuration management, release management, and capacity management, the business of database administration begins to make coherent sense and you become a more effective database administrator. (read more)

An introduction to Oracle Database for beginners: What is a relational database management system?—Part IV

Database management systems such as Oracle are the interface between users and databases. Database management systems differ in the range of features they provide, but all of them offer certain core features such as transaction management, data integrity, and security. And, of course, they offer the ability to create databases and to define their structure, as well as to store, retrieve, update, and delete the data in the databases. (read more)

An introduction to Oracle Database for beginners: What is a relational database management system?—Part V

Physical Data Independence means that you and I shouldn’t have to worry about implementation details such as the storage structures used to store data. (read more)

An introduction to Oracle Database for beginners: What is a relational database management system?—Part I

Oracle provides a convenient virtual machine (VM) containing a complete and ready-to-use installation of Oracle Database 12c on Linux. All you need to do is to download and install the Oracle VirtualBox virtualization software and then import a ready-to-use VM. (read more)

Outer Join with OR and Lateral View Decorrelation

Use of ANSI SQL is a personal thing.

Historically I have not been a fan apart from where it makes things easier/possible.

This reticence was mainly due to optimizer bugs and limitations in the earlier days.

Recently I have been using it much more because I find that the developers I interact with prefer it / understand it better.

You might/should be aware that Oracle will rewrite ANSI SQL to an Oracle syntax representation, this (more...)

Believe it or Not: Converting an Inner Join to an Outer Join to improve performance

The cost-based optimizer tries to merge views whenever possible but sometimes we may want to override this behavior; that is, we may want the optimizer to push predicates into the view instead of merging the view into the main query. If the main query performs an inner join to the view, it becomes necessary to convert the inner join to an outer join if the database version is less than 12.1.0.2. (read more)

What’s in a name? – “Brittany” edition

How do you spell “Brittany”? The picture above has four well-known women with four different spellings of the name. It turns out there are nearly 100 different ways that Americans have spelled it. The US Social Security Administration names data lets us tease out all the spellings and find out which ones are most popular – and when.

Here’s how Americans have spelled “Brittany” each year. This is a graph of SSA applications for each (more...)

What’s in a name? or rather, in the SSA Names data

One of the amazing things about being a DBA/developer in 2016 is the sheer amount of freely available, downloadable data to play with. One fun publicly available data set is the American Social Security Administration names data. It contains all names for which SSNs were issued for each year, with the number of occurrences (although names with fewer than 5 occurrences are not included, to protect individual privacy).

What’s so fun about this dataset?

* It’s already normalized

* It updates only once a year, and then only by adding another year’s worth of data, so it’s easy to keep current

* Almost everyone can relate to this dataset personally – almost everyone’s name is in there!

* At about 1.8 million rows, it’s not particularly large, but it’s large enough to be interesting to play with.

The one slight annoyance is that the data is in over 100 files, one per year: too many to load one-by-one manually. So here’s a blog post on loading it into your Oracle database, with scripts.

1. Visit the URL:
https://catalog.data.gov/dataset/baby-names-from-social-security-card-applications-national-level-data

2. Download and unzip names.zip. This zip archive contains one file for each year from 1880 to 2015. The files are named yobXXXX.txt, e.g. yob2015.txt.
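
Before loading anything, it can be worth confirming that the archive really unpacked into one file per year. The following is a minimal sketch, assuming the files sit in a single directory and follow the yobXXXX.txt naming described in this step. The function name missing_years is made up for this illustration.

```shell
# Hypothetical helper: print every year in a range for which no
# yobYYYY.txt file exists in the given directory.
# The body runs in a subshell ( ... ) so its variables do not leak.
missing_years() (
  dir=$1
  year=$2
  last=$3
  while [ "$year" -le "$last" ]; do
    [ -f "$dir/yob$year.txt" ] || echo "$year"
    year=$((year + 1))
  done
)

# Example call for the range mentioned above, run from the unzip directory:
# missing_years . 1880 2015
```

No output means every file in the range is present.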

3. Create a table to hold the names data:

DROP TABLE names;
CREATE TABLE names (year NUMBER(4), name VARCHAR2(30), sex CHAR(1), freq NUMBER);

4. Load in one year to get a feeling for the data. Let’s load “yob2015.txt”, the most recent year.
Here’s a SQL*Loader control file “names.ctl” to load the data:

[oracle@localhost names]$ cat names.ctl
load data 
infile 'yob2015.txt' "str '\r\n'"
append
into table NAMES
fields terminated by ','
OPTIONALLY ENCLOSED BY '"' AND '"'
trailing nullcols
           ( NAME CHAR(4000),
             SEX CHAR(4000),
             FREQ CHAR(4000),
             YEAR "2015"
           )
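
The "str '\r\n'" clause in the control file tells SQL*Loader that every record ends with a carriage return followed by a line feed. If you want to check that a file really uses those line endings before loading it, something like the sketch below could help. The helper name count_crlf_lines is hypothetical, and the sketch assumes a standard grep.

```shell
# Hypothetical helper: count the lines of a file that end in a carriage
# return, i.e. the lines terminated by CRLF rather than a bare LF.
count_crlf_lines() (
  cr=$(printf '\r')
  # grep -c prints 0 and returns non-zero when nothing matches,
  # so "|| true" keeps the function itself from reporting failure.
  grep -c "${cr}\$" "$1" || true
)

# Example: compare against the total number of lines in the file.
# count_crlf_lines yob2015.txt
# wc -l < yob2015.txt
```

If the two numbers match, every record is CRLF-terminated and the str clause fits the file.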

(By the way, here’s a great tip from That Jeff Smith: use SQL Developer to generate a SQL*Loader ctl file.)
Now let’s use the ctl file to load it:

[oracle@localhost names]$ sqlldr CONTROL=names.ctl   skip=0  
Username:scott/********
 
SQL*Loader: Release 12.1.0.2.0 - Production on Thu Jun 9 10:41:29 2016
 
Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.
 
Path used:      Conventional
Commit point reached - logical record count 20
 
...
Table NAMES:
  32952 Rows successfully loaded.
 
Check the log file:
  names.log
for more information about the load.

5. Let’s take a look at the 2015 data! How about the top 10 names for each sex?

WITH n AS
  ( SELECT name, sex, freq,
           RANK() OVER (PARTITION BY sex ORDER BY freq DESC) AS rank_2015
    FROM names
    WHERE year = 2015 )
SELECT * FROM n
WHERE rank_2015 < 11
ORDER BY sex, rank_2015;

NAME                           S       FREQ  RANK_2015
------------------------------ - ---------- ----------
Emma                           F      20355          1
Olivia                         F      19553          2
Sophia                         F      17327          3
Ava                            F      16286          4
Isabella                       F      15504          5
Mia                            F      14820          6
Abigail                        F      12311          7
Emily                          F      11727          8
Charlotte                      F      11332          9
Harper                         F      10241         10

NAME                           S       FREQ  RANK_2015
------------------------------ - ---------- ----------
Noah                           M      19511          1
Liam                           M      18281          2
Mason                          M      16535          3
Jacob                          M      15816          4
William                        M      15809          5
Ethan                          M      14991          6
James                          M      14705          7
Alexander                      M      14460          8
Michael                        M      14321          9
Benjamin                       M      13608         10

6. Now let’s load the names data for the other 135 years.
First we’ll create a generic “names.ctl”:

$ cat names.ctl
load data 
infile 'yob%%YEAR%%.txt' "str '\r\n'"
append
into table NAMES
fields terminated by ','
OPTIONALLY ENCLOSED BY '"' AND '"'
trailing nullcols
           ( NAME CHAR(4000),
             SEX CHAR(4000),
             FREQ CHAR(4000),
             YEAR "%%YEAR%%"
           )

Now we’ll write a small shell script to replace %%YEAR%% with each year from 1880 to 2014, and load that year’s file.

$ cat names.sh
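#!/usr/bin/bash
# Load yob1880.txt through yob2014.txt (2015 was already loaded in step 4)
export TWO_TASK=orcl
for i in {1880..2014}
do
  echo "generating yob$i.ctl"
  # Fill in the year placeholder to get a control file for this year
  sed "s/%%YEAR%%/$i/g" names.ctl > "yob$i.ctl"
  echo "loading yob$i"
  # username/password is a placeholder for your own credentials
  sqlldr username/password CONTROL="yob$i.ctl"
  echo "done $i"
done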
 
[oracle@localhost names]$ ./names.sh
... massive screen output...
 
[oracle@localhost names]$ grep "error" *.log
yob1880.log:  0 Rows not loaded due to data errors.
yob1881.log:  0 Rows not loaded due to data errors.
yob1882.log:  0 Rows not loaded due to data errors.
yob1883.log:  0 Rows not loaded due to data errors.
...
yob2012.log:  0 Rows not loaded due to data errors.
yob2013.log:  0 Rows not loaded due to data errors.
yob2014.log:  0 Rows not loaded due to data errors.
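
With 135 log files, the grep output above is mostly a wall of zeros. Here is a sketch that prints only the logs reporting rejected rows. It assumes each log contains a line of the form "N Rows not loaded due to data errors." as shown above, and it also accepts a singular "Row" in case a log reports exactly one. The function name logs_with_rejects is invented for this example.

```shell
# Hypothetical helper: print "logfile: count" for every SQL*Loader log
# whose "not loaded due to data errors" count is greater than zero.
logs_with_rejects() (
  for log in "$@"; do
    # The count is the first field of the matching log line.
    count=$(awk '/Rows? not loaded due to data errors/ { print $1; exit }' "$log")
    if [ -n "$count" ] && [ "$count" -ne 0 ]; then
      echo "$log: $count"
    fi
  done
)

# Example: check every per-year log in the current directory.
# logs_with_rejects yob*.log
```

No output means no log reported data errors.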

7. Now we can play with the data a bit!

Here’s a quick look at the popularity of 2015’s top girls’ names since 1880:

WITH n2015 AS 
  ( SELECT name, sex, freq, 
  rank() OVER (partition BY sex ORDER BY freq DESC) AS rank_2015
  FROM names 
  WHERE YEAR=2015 )
, y AS (SELECT  YEAR, sex, SUM(freq) tot FROM names GROUP BY YEAR, sex)
SELECT names.year, names.name, 100*names.freq/tot AS pct_by_sex
FROM n2015, y, names
WHERE n2015.name = names.name AND n2015.sex = names.sex
AND y.year = names.year AND y.sex=names.sex
AND n2015.rank_2015 < 11
AND y.sex='F'
ORDER BY YEAR, name;

I graphed this in SQL Developer:
[Figure: 2015_girls_allyrs]

You can see that Emma, my grandmother’s name, is having a bit of a comeback but is nowhere near the powerhouse it was in the 1880s, when 2% of all girls were named Emma. (For the record, my grandmother was not born in the 1880s!)
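
If you want to sanity-check a single pct_by_sex value from the query above without going through the database, the same arithmetic can be done on a raw year file. This is only a sketch: it assumes each line has the name,sex,count layout implied by the control file, and the function name name_share is hypothetical.

```shell
# Hypothetical helper: percentage of all occurrences for one sex in a
# year file that belong to a given name (100 * freq / total for that sex).
# Arguments: $1 = year file, $2 = name, $3 = sex (F or M)
name_share() {
  awk -F',' -v name="$2" -v sex="$3" '
    { sub(/\r$/, "", $3) }   # strip the CR left over from CRLF line endings
    $2 == sex { total += $3; if ($1 == name) hits += $3 }
    END { if (total > 0) printf "%.2f\n", 100 * hits / total }
  ' "$1"
}

# Example: share of girls named Emma in the earliest file.
# name_share yob1880.txt Emma F
```

The result should agree with the pct_by_sex column for the same year, name and sex, up to rounding.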

My next post will look at the name Brittany and its variants.

Note: You can download names.ctl and names.sh from GitHub.

Top N- queries: using the 12c syntax.

One of the new features in Oracle Database 12c is the new syntax for Top-N queries and pagination. Did we really need this? Should you choose the new syntax over the way we used to do it, with an inline view? I think so: it simply adds syntactic clarity to the query, and in this blog post I will show the difference between the "old" and the "new".

For the examples I will use (more...)

Joining to a pipelined table function and “left correlation”

Oracle 11.2.0.4

A pipelined table function may be called from regular SQL using the TABLE collection expression, e.g.

SELECT *
FROM   TABLE(my_pipelined_function('ABC','DEF'));

where ‘ABC’ and ‘DEF’ are the inputs to the function.

What if you want to call the function repeatedly for several sets of inputs, e.g. testing the function for a variety of values? If those inputs are stored in a table somewhere, it ought to be as easy (more...)

Rounding amounts, divide cents over multiple lines

In previous articles I wrote about dealing with a missing cent when you need to divide a certain amount over multiple lines. In those articles (links are at the bottom) I described a method to calculate the difference on the last row.
Then a question arose (as a comment):
What if for example i have 42 records and i wish to divide 100 by 42. I would get a rounded value of 2.38. If (more...)