posts - 40, comments - 58, trackbacks - 0, articles - 0
  BlogJava :: 首页 :: 新随笔 :: 联系 :: 聚合  :: 管理

Oracle分析函数

Posted on 2009-01-20 11:33 Astro.Qi 阅读(905) 评论(0)  编辑  收藏 所属分类: Oracle

分析函数
  它是Oracle分析函数专门针对类似于"经营总额"、"找出一组中的百分之多少" 或"计算排名前几位"等问题设计的。
分析函数运行效率高,使用方便。
  分析函数是基于一组行来计算的。这不同于聚集函数且广泛应用于OLAP环境中。
  Oracle从8.1.6开始提供分析函数,分析函数用于计算基于组的某种聚合值,它和聚合函数的不同之处是
对于每个组返回多行,而聚合函数对于每个组只返回一行。 

语法:
<analytic-function>(<argument>,<argument>,...)
over(
  <query-partition-clause>
  <order-by-clause>
  <windowing-clause>
)
其中:
1 over是关键字,用于标识分析函数。

2 <analytic-function>是指定的分析函数的名字。Oracle分析函数很多。

3 <argument>为参数,分析函数可以选取0-3个参数。

4 分区子句<query-partition-clause>的格式为:
  partition by<value_exp>[,value_expr]...
  关键字partition by子句根据由分区表达式的条件逻辑地将单个结果集分成N组。这里的"分区partition"和"组group"
都是同义词。

5 排序子句order-by-clause指定数据是如何存在分区内的。其格式为:
 order[siblings]by{expr|position|c_alias}[asc|desc][nulls first|nulls last]
其中:
(1)asc|desc:指定了排列顺序。
(2)nulls first|nulls last:指定了包含空值的返回行应出现在有序序列中的第一个或最后一个位置。

6窗口子句windowing-clause
 给出一个固定的或变化的数据窗口方法,分析函数将对这些数据进行操作。在一组基于任意变化或固定的窗口中,
可用该子句让分析函数计算出它的值。
格式:
 {rows|range}
 {between
  {unbounded preceding|current row |<value_expr>{preceding|following}
 }and
 {unbounded preceding|current row |<value_expr>{preceding|following}
 }|{unbounded preceding|current row |<value_expr>{preceding|following
 }}
(1)rows|range:此关键字定义了一个window。
(2)between...and...:为窗品指一个起点和终点。
(3)unbounded preceding:指明窗口是从分区(partition)的第一行开始。
(4)current row:指明窗口是从当前行开始。

create table emp(
 deptno varchar2(20),--部门编码
 ename varchar2(20),--人名
 sal number(10));--工资

insert into emp values('10','andy1',2000);
insert into emp values('10','andy2',3000);
insert into emp values('10','andy3',2000);
insert into emp values('20','leno1',4000);
insert into emp values('20','leno2',8000);
insert into emp values('20','leno3',6000);
insert into emp values('30','jack1',5000);
insert into emp values('30','jack2',6000);
insert into emp values('30','jack3',7000);

1 连续求和
select deptno,ename,sal,sum(sal) over(order by ename)  连续求和 from emp;

DEPTNO               ENAME                        SAL       连续求和
-------------------- -------------------- ----------- ----------
10                   andy1                       2000       2000
10                   andy2                       3000       5000
10                   andy3                       2000       7000
30                   jack1                       5000      12000
30                   jack2                       6000      18000
30                   jack3                       7000      25000
20                   leno1                       4000      29000
20                   leno2                       8000      37000
20                   leno3                       6000      43000


2 不连续求和
select deptno,ename,sal,sum(sal) over()  不连续求和 from emp;

DEPTNO               ENAME                        SAL      不连续求和
-------------------- -------------------- ----------- ----------
10                   andy1                       2000      43000
10                   andy2                       3000      43000
10                   andy3                       2000      43000
20                   leno1                       4000      43000
20                   leno2                       8000      43000
20                   leno3                       6000      43000
30                   jack1                       5000      43000
30                   jack2                       6000      43000
30                   jack3                       7000      43000

3.
select deptno,ename,sal,
sum(sal) over(order by ename) 连续求和,
sum(sal) over() 总和,
100*round(sal/sum(sal) over(),4) "份额(%)"
from emp
/
DEPTNO               ENAME                        SAL       连续求和         总和      份额(%)
-------------------- -------------------- ----------- ---------- ---------- ----------
10                   andy1                       2000       2000      43000       4.65
10                   andy2                       3000       5000      43000       6.98
10                   andy3                       2000       7000      43000       4.65
30                   jack1                       5000      12000      43000      11.63
30                   jack2                       6000      18000      43000      13.95
30                   jack3                       7000      25000      43000      16.28
20                   leno1                       4000      29000      43000        9.3
20                   leno2                       8000      37000      43000       18.6
20                   leno3                       6000      43000      43000      13.95


4.使用子分区查询。
按部门薪水连续的总和.
(1)select deptno,sum(sal) over (partition by deptno order by ename) 按部门连续求总和 from emp;

DEPTNO                       按部门连续求总和
-------------------- ----------------
10                               2000
10                               5000
10                               7000
20                               4000
20                              12000
20                              18000
30                               5000
30                              11000
30                              18000

(2)按部门求总和
select deptno,sum(sal) over (partition by deptno) 按部门连续求总和 from emp ;

DEPTNO                       按部门求总和
-------------------- ----------------
10                               7000
10                               7000
10                               7000
20                              18000
20                              18000
20                              18000
30                              18000
30                              18000
30                              18000

(3)不按部门连续求总和
select deptno,sum(sal) over (order by deptno,ename) 不按部门连续求总和 from emp ;

DEPTNO                        不按部门连续求总和
-------------------- ------------------
10                                 2000
10                                 5000
10                                 7000
20                                11000
20                                19000
20                                25000
30                                30000
30                                36000
30                                43000

(4)不按部门,求所有员工总和,效果等同于sum(sal)
select deptno,sum(sal) over (order by deptno,ename) 不按部门连续求总和 from emp ;

DEPTNO                        不按部门连续求总和
-------------------- ------------------
10                                 2000
10                                 5000
10                                 7000
20                                11000
20                                19000
20                                25000
30                                30000
30                                36000
30                                43000

(5)select deptno,ename,sal,
        sum(sal) over (partition by deptno order by ename) 部门连续求和,--各部门的薪水"连续"求和
        sum(sal) over (partition by deptno) 部门总和,   -- 部门统计的总和,同一部门总和不变
        100*round(sal/sum(sal) over (partition by deptno),4) "部门份额(%)",
        sum(sal) over (order by deptno,ename) 连续求和, --所有部门的薪水"连续"求和
        sum(sal) over () 总和,   -- 此处sum(sal) over () 等同于sum(sal),所有员工的薪水总和
        100*round(sal/sum(sal) over (),4) "总份额(%)"
        from emp;

DEPTNO               ENAME                        SAL       部门连续求和       部门总和     部门份额(%)       连续求和         总和     总份额(%)
-------------------- -------------------- ----------- ------------ ---------- ----------- ---------- ---------- ----------
10                   andy1                       2000         2000       7000       28.57       2000      43000       4.65
10                   andy2                       3000         5000       7000       42.86       5000      43000       6.98
10                   andy3                       2000         7000       7000       28.57       7000      43000       4.65
20                   leno1                       4000         4000      18000       22.22      11000      43000        9.3
20                   leno2                       8000        12000      18000       44.44      19000      43000       18.6
20                   leno3                       6000        18000      18000       33.33      25000      43000      13.95
30                   jack1                       5000         5000      18000       27.78      30000      43000      11.63
30                   jack2                       6000        11000      18000       33.33      36000      43000      13.95
30                   jack3                       7000        18000      18000       38.89      43000      43000      16.28

(6)TOP-N查询
6.1查询各部门中工资最高的记录
select * from(select deptno,ename,sal,row_number() over(partition by deptno order by sal desc) topn from emp)  where topn=1;

DEPTNO               ENAME                        SAL       TOPN
-------------------- -------------------- ----------- ----------
10                   andy2                       3000          1
20                   leno2                       8000          1
30                   jack3                       7000          1

6.2按薪水高低对每个员工在本部门和整个公司内的排名进行排序。
select deptno,ename,sal,dense_rank() over(partition by deptno order by sal desc nulls last) as dept_ranking,
    dense_rank() over(order by sal desc nulls last) as company_ranking 
from emp;

DEPTNO               ENAME                        SAL DEPT_RANKING COMPANY_RANKING
-------------------- -------------------- ----------- ------------ ---------------
20                   leno2                       8000            1               1
30                   jack3                       7000            1               2
20                   leno3                       6000            2               3
30                   jack2                       6000            2               3
30                   jack1                       5000            3               4
20                   leno1                       4000            3               5
10                   andy2                       3000            1               6
10                   andy1                       2000            2               7
10                   andy3                       2000            2               7

5 窗口windows
窗口子句是数据的滑动窗口,该窗口的分析函数将窗口看成一组。
select deptno "部门ID",ename "部门名称",sal "工资",
    sum(sal) over(partition by deptno order by ename rows 2 preceding) "sliding total"
from emp order by deptno,ename;

partition by deptno:相当于group by deptno
rows 2:表示前两行相加
preceding:表示从每个部门的第一行开始。


6 范围窗口
Range windows仅对数据值和日期类型数据有效。(sal)
 select deptno,ename,sal, count(*) over(order by sal asc range 3 preceding) 总计

7 行窗口
是物理单元,包含在窗口中的物理行数。对数据类型没有限制。
计算每个记录与其之前的2个记录的平均工资。
set numformat 9999
select ename,sal,
    avg(sal) over(order by deptno asc rows 2 preceding) avgasc,
    count(*) over(order by deptno asc rows 2 preceding) cntasc,
    avg(sal) over(order by deptno desc rows 2 preceding) avgdes,
    count(*) over(order by deptno desc rows 2 preceding) cntdes
from emp order by deptno;  

ENAME                        SAL     AVGASC     CNTASC     AVGDES     CNTDES
-------------------- ----------- ---------- ---------- ---------- ----------
andy1                       2000       2000          1 3666.66666          3
andy2                       3000       2500          2 5666.66666          3
andy3                       2000 2333.33333          3 2333.33333          3
leno1                       4000       3000          3 5333.33333          3
leno2                       8000 4666.66666          3 6333.33333          3
leno3                       6000       6000          3       6000          3
jack1                       5000 6333.33333          3       5500          2
jack2                       6000 5666.66666          3       6000          1
jack3                       7000       6000          3       6000          3

8 确定每组中的第一行或最后一行
使用first_vale和last_value函数可从一组中选择每一行和最后一行
统计工资每个部门最低或最高的员工信息。
select deptno,ename,sal,first_value(ename) over(partition by deptno order by sal asc) as min_sal_has 
from emp
order by deptno,ename;

select deptno,ename,sal,first_value(ename) over(partition by deptno order by sal desc) as min_sal_has 
from emp
order by deptno,ename;

9 统计各班成绩第一名的同学信息    NAME   CLASS S                         
    ----- ----- ---------------------- 
    fda    1      80                     
    ffd    1      78                     
    dss    1      95                     
    cfe    2      74                     
    gds    2      92                     
    gf     3      99                     
    ddd    3      99                     
    adf    3      45                     
    asdf   3      55                     
    3dd    3      78              
   
    通过:   
    --
    select * from                                                                       
    (                                                                            
    select name,class,s,rank()over(partition by class order by s desc) mm from t2
    )                                                                            
    where mm=1 
    --
    得到结果:
    NAME   CLASS S                       MM                                                                                        
    ----- ----- ---------------------- ---------------------- 
    dss    1      95                      1                      
    gds    2      92                      1                      
    gf     3      99                      1                      
    ddd    3      99                      1          
   
    注意:
    1.在求第一名成绩的时候,不能用row_number(),因为如果同班有两个并列第一,row_number()只返回一个结果          
    2.rank()和dense_rank()的区别是:
      --rank()是跳跃排序,有两个第二名时接下来就是第四名
      --dense_rank()l是连续排序,有两个第二名时仍然跟着第三名

         
  
二:开窗函数           
      开窗函数指定了分析函数工作的数据窗口大小,这个数据窗口大小可能会随着行的变化而变化,举例如下: 
1:     
   over(order by salary) 按照salary排序进行累计,order by是个默认的开窗函数
   over(partition by deptno)按照部门分区
2:
  over(order by salary range between 5 preceding and 5 following)
   每行对应的数据窗口是之前行幅度值不超过5,之后行幅度值不超过5
   例如:对于以下列
     aa
     1
     2
     2
     2
     3
     4
     5
     6
     7
     9
   
   sum(aa)over(order by aa range between 2 preceding and 2 following)
   得出的结果是
            AA                       SUM
            ---------------------- ------------------------------------------------------- 
            1                       10                                                      
            2                       14                                                      
            2                       14                                                      
            2                       14                                                      
            3                       18                                                      
            4                       18                                                      
            5                       22                                                      
            6                       18                                                                
            7                       22                                                                
            9                       9                                                                 
             
   就是说,对于aa=5的一行 ,sum为   5-1<=aa<=5+2 的和
   对于aa=2来说 ,sum=1+2+2+2+3+4=14     ;
   又如 对于aa=9 ,9-1<=aa<=9+2 只有9一个数,所以sum=9    ;
              
3:其它:
     over(order by salary rows between 2 preceding and 4 following)
          每行对应的数据窗口是之前2行,之后4行 
4:下面三条语句等效:           
     over(order by salary rows between unbounded preceding and unbounded following)
          每行对应的数据窗口是从第一行到最后一行,等效:
     over(order by salary range between unbounded preceding and unbounded following)
           等效
     over(partition by null)

只有注册用户登录后才能发表评论。


网站导航: