SELECT (SELECT dname
FROM scott.dept
WHERE deptno= e.deptno)
AS "Department",
LISTAGG (ename, ', ') WITHIN GROUP (ORDER BY deptno)
"Employees"
FROM scott.emp e
GROUP BY deptno
ORDER BY 1
As
usual we’ll work on the EMP table that is already furnished with a hierarchical
self relation. On that table, indeed, there’s a column, MGR, that stores, for
each employee, his boss’ EMPNO.
Data contained into the table is as follows:
Data contained into the table is as follows:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
|
SQL>
select empno, ename, mgr
2
from emp;
EMPNO
ENAME
MGR
----------
---------- ----------
7369
SMITH 7902
7499
ALLEN 7698
7521
WARD
7698
7566
JONES 7839
7654
MARTIN 7698
7698
BLAKE 7839
7782
CLARK 7839
7788
SCOTT 7566
7839
KING
7844
TURNER 7698
7876
ADAMS 7788
7900
JAMES 7698
7902
FORD
7566
7934
MILLER 7782
|
KING
is the root of the hierarchy since hi has no boss.
The
first example of hierarchical query is the following:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
|
SQL>
select empno, ename, mgr, prior ename, level
2
from emp
3
connect by prior empno = mgr
4
start with mgr is null;
EMPNO
ENAME
MGR PRIORENAME LEVEL
----------
---------- ---------- ---------- ----------
7839
KING
1
7566
JONES 7839
KING
2
7788
SCOTT 7566
JONES
3
7876
ADAMS 7788
SCOTT
4
7902
FORD
7566
JONES
3
7369
SMITH 7902
FORD
4
7698
BLAKE 7839
KING
2
7499
ALLEN 7698
BLAKE
3
7521
WARD
7698
BLAKE
3
7654
MARTIN 7698
BLAKE
3
7844
TURNER 7698
BLAKE
3
7900
JAMES 7698
BLAKE
3
7782
CLARK 7839
KING
2
7934
MILLER 7782
CLARK
3
|
Let’s
analyze it: the CONNECT BY clause, mandatory to make a hierarchical query, is
used to define how each record is connected to the hierarchical superior.
The father of the record having MGR=x has EMPNO=x.
On the other hand, given a record with EMPNO=x, all the records having MGR=x are his sons.
The unary operator PRIOR indicates “the father of”.
START WITH clause is used to from which records we want to start the hierarchy, in our example we want to start from the root of the hierarchy, the employee that has no manager.
The root of the hierarchy could be not unique. In this example it is.
The LEVEL pseudocolumn indicates at which level each record stays in the hierarchy, starting from the root that has level=1.
The father of the record having MGR=x has EMPNO=x.
On the other hand, given a record with EMPNO=x, all the records having MGR=x are his sons.
The unary operator PRIOR indicates “the father of”.
START WITH clause is used to from which records we want to start the hierarchy, in our example we want to start from the root of the hierarchy, the employee that has no manager.
The root of the hierarchy could be not unique. In this example it is.
The LEVEL pseudocolumn indicates at which level each record stays in the hierarchy, starting from the root that has level=1.
Once
understood the quey, we can read our resulting tree: KING is the root and has
level=1.
Under KING there are three employees at level 2(JONES,BLAKE e CLARK). Then the others.
How does Oracle make the hierarchy? First of all it reads the records.
Then it determines the roots applying the START WITH clause.
Then, starting from each root, it determines the first-level sons applying the CONNECT BY clause and so on…
Under KING there are three employees at level 2(JONES,BLAKE e CLARK). Then the others.
How does Oracle make the hierarchy? First of all it reads the records.
Then it determines the roots applying the START WITH clause.
Then, starting from each root, it determines the first-level sons applying the CONNECT BY clause and so on…
To
enhance the layout of the output we can use the following trick:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
|
SQL>
select empno,lpad(' ',level*3,'
')||ename name,
2
mgr, prior ename, level
3
from emp
4
connect by prior empno = mgr
5
start with mgr is null;
EMPNO
NAME
MGR
PRIORENAME LEVEL
----------
------------------ ---------- ---------- ----------
7839
KING
1
7566
JONES
7839
KING
2
7788
SCOTT 7566
JONES
3
7876
ADAMS 7788
SCOTT
4
7902
FORD 7566
JONES
3
7369
SMITH 7902
FORD
4
7698
BLAKE
7839
KING
2
7499
ALLEN 7698
BLAKE
3
7521
WARD 7698
BLAKE
3
7654
MARTIN 7698
BLAKE
3
7844
TURNER 7698
BLAKE
3
7900
JAMES 7698 BLAKE
3
7782
CLARK
7839
KING
2
7934
MILLER 7782
CLARK
3
|
We
just added some white-spaces on the left of the name depending on the level.
Now it’s all more clear…
Records
order defines the hierarchy, adding an ORDER BY clause we could completely lose
the hierarchy.
We can decide how to sort siblings, that is to say records on the same level. For example JONES, BLAKE and CLARK:
We can decide how to sort siblings, that is to say records on the same level. For example JONES, BLAKE and CLARK:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
|
SQL>
select empno,lpad(' ',level*3,'
')||ename nome,
2
mgr, prior ename, level
3
from emp
4
connect by prior empno = mgr
5
start with mgr is null
6
order siblings by ename;
EMPNO
NOME
MGR PRIORENAME LEVEL
----------
------------------ ---------- ---------- ----------
7839
KING
1
7698
BLAKE
7839
KING
2
7499
ALLEN 7698
BLAKE
3
7900
JAMES 7698
BLAKE
3
7654
MARTIN 7698
BLAKE
3
7844
TURNER 7698
BLAKE
3
7521
WARD 7698
BLAKE
3
7782
CLARK
7839
KING
2
7934
MILLER 7782
CLARK
3
7566
JONES
7839
KING
2
7902
FORD 7566
JONES
3
7369
SMITH 7902
FORD
4
7788
SCOTT 7566
JONES
3
7876
ADAMS 7788
SCOTT
4
|
With
no impact on the hierarchy we’ve sorted siblings by name.
If
you work on Oracle9i you can stop reading, what follows is available on
Oracle10g and 11g…
In
any hierarchy infinite loops can raise. For example what happens if KING has a
manager himself?
1
|
SQL>
update emp set mgr=7369 where ename='KING';
|
KING
is one of the SMITH’s sons, SMITH is himself a KING’s great grandson…
What
if we run the same query now?
1
2
3
4
5
6
7
8
|
SQL>
select empno,lpad(' ',level*3,'
')||ename name,
2
mgr, prior ename, level
3
from emp
4
connect by prior empno = mgr
5
start with mgr is null
6
order siblings by ename;
No rows selected
|
Of
course, because there’s no record that has MGR null… So let’s change the START
WITH as follows:
1
2
3
4
5
6
7
|
SQL>
select empno,lpad(' ',level*3,'
')||ename name,
2
mgr, prior ename, level
3
from emp
4
connect by prior empno = mgr
5
start with empno=7839;
ERROR:
ORA-01436:
CONNECT BY loop in user data
|
Here’s
the loop, Oracle can’t create the hierarchy.
Oracle has managed it introducing the NOCYCLE clause.
Oracle has managed it introducing the NOCYCLE clause.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
|
SQL>
select empno,lpad(' ',level*3,'
')||ename name,
2
mgr, prior ename, level
3
from emp
4
connect by nocycle prior empno = mgr
5
start with empno=7839;
EMPNO
NAME
MGR
PRIORENAME LEVEL
----------
------------------ ---------- ---------- ----------
7839
KING
7369
1
7566
JONES
7839
KING
2
7788
SCOTT 7566
JONES
3
7876
ADAMS 7788
SCOTT
4
7902
FORD 7566
JONES
3
7369
SMITH 7902
FORD
4
7698
BLAKE
7839
KING
2
7499
ALLEN 7698
BLAKE
3
7521
WARD 7698
BLAKE
3
7654
MARTIN 7698
BLAKE
3
7844
TURNER 7698
BLAKE
3
7900
JAMES 7698
BLAKE
3
7782
CLARK
7839
KING
2
7934
MILLER 7782
CLARK
3
|
That
clause tells to Oracle to stop when a loop is raised, Oracle goes on on the
other branchs of the tree.
After SMITH another instance of KING (and all his descendant) should be extracted, and so on, generating an infinite output. The NOCYCLE clause prevents that.
After SMITH another instance of KING (and all his descendant) should be extracted, and so on, generating an infinite output. The NOCYCLE clause prevents that.
CONNECT_BY_ISCYCLE
pseudocolumn tells us in which records a loop has been detected:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
|
SQL>
select empno,lpad(' ',level*3,'
')||ename name,
2
mgr, prior ename, connect_by_iscycle
IFLOOP
3
from emp
4
connect by nocycle prior empno = mgr
5
start with empno=7839;
EMPNO
NAME
MGR
PRIORENAME IFLOOP
----------
------------------ ---------- ---------- ----------
7839
KING
7369
0
7566
JONES
7839
KING
0
7788
SCOTT 7566 JONES
0
7876
ADAMS 7788
SCOTT
0
7902
FORD 7566
JONES
0
7369
SMITH 7902
FORD
1
7698
BLAKE
7839 KING
0
7499
ALLEN 7698
BLAKE
0
7521
WARD 7698
BLAKE
0
7654
MARTIN 7698
BLAKE
0
7844
TURNER 7698
BLAKE
0
7900
JAMES 7698
BLAKE
0
7782
CLARK
7839
KING
0
7934
MILLER 7782
CLARK
0
|
In
addition to PRIOR another unary operator exists that’s really useful:
CONNECT_BY_ROOT.
It allows us to display the root of a given record:
It allows us to display the root of a given record:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
|
SQL>
select empno,lpad(' ',level*3,'
')||ename name,
2
connect_by_root ename boss
3
from emp
4
connect by prior empno = mgr
5
start with mgr is null;
EMPNO
NAME
BOSS
----------
------------------ ----------
7839
KING KING
7566
JONES KING
7788
SCOTT KING
7876
ADAMS KING
7902
FORD KING
7369
SMITH KING
7698
BLAKE KING
7499
ALLEN KING
7521
WARD KING
7654
MARTIN KING
7844
TURNER KING
7900
JAMES KING
7782
CLARK KING
7934
MILLER KING
|
In
our example there’s a single root, KING, but if we modify another record:
1
|
SQL>
update emp set mgr=null where ename='BLAKE';
|
We
get:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
|
SQL>
select empno,lpad(' ',level*3,'
')||ename name,
2
connect_by_root ename boss
3
from emp
4
connect by prior empno = mgr
5
start with mgr is null;
EMPNO
NAME
BOSS
----------
------------------ ----------
7698
BLAKE BLAKE
7499
ALLEN BLAKE
7521
WARD BLAKE
7654
MARTIN BLAKE
7844
TURNER BLAKE
7900
JAMES BLAKE
7839
KING KING
7566
JONES KING
7788
SCOTT KING
7876
ADAMS KING
7902
FORD KING
7369
SMITH KING
7782
CLARK KING
7934
MILLER KING
|
Where,
for each record, we display who is the “big chief”…
Another
really useful function is SYS_CONNECT_BY_PATH, it gets as input a field and a
character and builds the full path from the root to the current record using
the character as a separator:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
|
SQL>
select empno,lpad(' ',level*3,'
')||ename name,
2
sys_connect_by_path(ename,'/') bosses
3
from emp
4
connect by prior empno = mgr
5*
start with mgr is null
EMPNO
NAME
BOSSES
----------
------------------ --------------------------------
7839
KING /KING
7566
JONES /KING/JONES
7788
SCOTT /KING/JONES/SCOTT
7876
ADAMS /KING/JONES/SCOTT/ADAMS
7902
FORD /KING/JONES/FORD
7369
SMITH /KING/JONES/FORD/SMITH
7698
BLAKE /KING/BLAKE
7499
ALLEN /KING/BLAKE/ALLEN
7521
WARD /KING/BLAKE/WARD
7654
MARTIN /KING/BLAKE/MARTIN
7844
TURNER /KING/BLAKE/TURNER
7900
JAMES /KING/BLAKE/JAMES
7782
CLARK /KING/CLARK
7934
MILLER /KING/CLARK/MILLER
|
There’s
another useful pseudocolumn to show, CONNECT_BY_ISLEAF.
It
tells us whether a record is a leaf of the tree or not:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
|
SQL>
select empno,lpad(' ',level*3,'
')||ename name,
2
connect_by_isleaf ifleaf
3
from emp
4
connect by prior empno = mgr
5
start with mgr is null;
EMPNO
NAME
IFLEAF
----------
------------------ ----------
7839
KING
0
7566
JONES
0
7788
SCOTT
0
7876
ADAMS 1
7902
FORD
0
7369
SMITH 1
7698
BLAKE
0
7499
ALLEN
1
7521
WARD
1
7654
MARTIN
1
7844
TURNER
1
7900
JAMES
1
7782
CLARK
0
7934
MILLER
1
|
Really
interesting is the ability of CONNECT BY to generate more records from a table
that contains only one row:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
|
SQL>
select level from dual
2
connect by level<=10;
LEVEL
----------
1
2
3
4
5
6
7
8
9
10
|
We
can extract from DUAL, that contains just one row, all the records we need to
make complex computations.
I’ve used this ability, for example, in the post about regular expressions when I showed how REGEXP_COUNT works.
I’ve used this ability, for example, in the post about regular expressions when I showed how REGEXP_COUNT works.
That’s,
more or less, everything about theory, let’s see two practical examples.
The
first one has been already showed in another post: About string concatenation across different records.
1
2
3
4
5
6
7
8
9
10
11
12
|
SELECT deptno,
ltrim(SYS_CONNECT_BY_PATH(ename, ','),',') enames
FROM (select deptno, ename, rank()
over(partition
by deptno order by rownum) num from emp)
where connect_by_isleaf=1
START
WITH num=1
CONNECT BY PRIOR num+1 = num and prior deptno=deptno;
DEPTNO
ENAMES
----------
----------------------------------------
10
CLARK,KING,MILLER
20
SMITH,JONES,SCOTT,ADAMS,FORD
30
ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
|
Using
SYS_CONNECT_BY_PATH we was able to put in a single string, separated by comma,
the names of the employees that work in the same department.
And this has absolutely nothing to do with the MGR column…
And this has absolutely nothing to do with the MGR column…
The
other example is the solution that I proposed to solve a problem posted on the
official Oracle SQL & PL/SQL forum. The question was how to determine
whether a word is the anagram of another one or not.
Here’s the query I wrote to extract all the anagrams of a given word:
Here’s the query I wrote to extract all the anagrams of a given word:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
|
with t as
(select 'bolt' name from dual)
select distinct replace(sys_connect_by_path(c,'
'),' ',null) str
from (select substr(name,level,1) c,
name,
level*1000+ascii(substr(name,level,1))
cod
from t
connect by level<=length(name))
where level=length(name)
connect by nocycle cod != prior cod
STR
-------------------------
oltb
otbl
lbot
tlbo
blto
lotb
ltob
otlb
lbto
tblo
bolt
btol
oblt
lobt
tobl
blot
obtl
tbol
tolb
olbt
ltbo
tlob
botl
btlo
24
rows selected.
|
The
word to anagram was “bolt”.
First
of all the query extracts all the letters from the word marking each with an
unique code:
1
2
3
4
5
6
7
8
9
10
11
12
13
|
SQL>
with t as
2
(select 'bolt' name from dual)
3
select substr(name,level,1) c,
name,
4
level*1000+ascii(substr(name,level,1)) cod
5
from t
6
connect by level<=length(name);
C
NAME
COD
-
---- ----------
b
bolt 1098
o
bolt 2111
l
bolt 3108
t
bolt 4116
|
Then
I’ve used sys_connect_by_path to build all the possible combinations of such
letters.
Interesting, isn’t it?
Interesting, isn’t it?