欢迎大家来到IT世界,在知识的湖畔探索吧!
Hive 0.11.0以后增加了分析函数row_number()、rank()、dense_rank(),说明:
row_number() over ([partition col1] [order by col2])
rank() over ([partition col1] [order by col2])
dense_rank() over ([partition col1] [order by col2])
三个函数都是根据col1字段分组,然后对col2字段进行排序,对排序后的每行生成一个行号,这个行号从1开始递增col1、col2都可以是多个字段,用’,’分隔
三个函数的区别是:
row_number() 是没有重复值的排序(即使记录相等也是不重复的),可以利用它来实现分页; dense_rank() 是连续排序,排名相等会在名次中不会留下空位,两个第二名仍然跟着第三名;rank()是跳跃排序,排名相等会在名次中留下空位,如两个第二名下来就是第四名。
实例:
-
hive> create table t(name string, sub string, score int) row format delimited fields terminated by ‘\t‘;
数据在a.txt文件中
-
a chinese 98
-
a english 90
-
d chinese 88
-
c english 82
-
c math 98
-
b math 89
-
b chinese 79
-
z english 90
-
z math 89
-
z chinese 80
-
e math 99
-
e english 87
-
d english 90
1、row_number
-
hive (test)> select *, row_number() over (partition by sub order by score) as od from t;
2、rank
-
hive (test)> select *, rank() over (partition by sub order by score) as od from t;
3、dense_ran
-
hive (test)> select *, dense_rank() over (partition by sub order by score desc) od from t;
业务实例:
统计每个学科的前三名
hive> select * from (select *, row_number() over (partition by sub order by score desc) as od from t ) tt where od<=3;
语文成绩是80分的排名是多少
hive> select od from (select *, row_number() over (partition by sub order by score desc) as od from t ) t where sub=‘chinese‘ and score=80;
分页查询
hive> select * from (select *, row_number() over () as rn from t) t1 where rn between 1 and 5;
select *, row_number() over () as rn from t;
免责声明:本站所有文章内容,图片,视频等均是来源于用户投稿和互联网及文摘转载整编而成,不代表本站观点,不承担相关法律责任。其著作权各归其原作者或其出版社所有。如发现本站有涉嫌抄袭侵权/违法违规的内容,侵犯到您的权益,请在线联系站长,一经查实,本站将立刻删除。 本文来自网络,若有侵权,请联系删除,如若转载,请注明出处:https://itzsg.com/36236.html