本文介绍了蜂巢:更加安全的SELECT AS和GROUP BY选择方式的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我尝试像这样编写Hive Sql

I try to write Hive Sql like that

SELECT count(1), substr(date, 1, 4) as year
FROM ***
GROUP BY year

但是Hive无法识别别名"year",它抱怨说:失败:SemanticException [错误10004]:行1:79无效的表别名或列引用'year'

But Hive cannot recognize the alias name 'year', it complains that:FAILED: SemanticException [Error 10004]: Line 1:79 Invalid table alias or column reference 'year'

一种解决方案(配置单元:SELECT AS和GROUP BY )建议使用'GROUP BY substr(date,1,4)'.

One solution(Hive: SELECT AS and GROUP BY) suggest to use 'GROUP BY substr(date, 1, 4)'.

有效!但是在某些情况下我要分组的值可能是由蜂巢功能代码的多行产生的,编写类似

It works!However in some cases the value I want to group by may be generated from multiple lines of hive function code, it's very ugly to write code like

SELECT count(1), func1(func2(..........................)) AS something
FROM ***
GROUP BY func1(func2(..........................))

Hive中有什么干净的方法可以做到这一点吗?有什么建议吗?

Is there any clean way in Hive to do that? Any suggestions?

推荐答案

在Group By中指定位置将解决您的问题.即使按SET hive.groupby.orderby.position.alias = false; ,Group By中的此职位编号也会起作用(配置单元0.12)

Specifying the position in Group By will solve your issue. This position number in Group By works even when SET hive.groupby.orderby.position.alias=false;(Hive 0.12)

SELECT count(1), substr(date, 1, 4) as year  
FROM ***
GROUP BY 2;

这篇关于蜂巢:更加安全的SELECT AS和GROUP BY选择方式的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

11-03 08:03