本文介绍了Impala SQL:合并具有重叠日期的行.不支持WHERE EXISTS和递归CTE的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试在Impala SQL中合并具有重叠日期间隔的行.但是,Impala不支持我找到的解决方案.存在和递归CTE.

I am trying to merge rows with overlapping date intervals in a table in Impala SQL. However the solutions I have found to solve this are not supported by Impala eg. WHERE EXISTS and recursive CTEs.

如何在Impala中为此编写查询?

How would I write a query for this in Impala?

    Table: @T
    ID  StartDate   EndDate
    1   20170101    20170201
    2   20170101    20170401
    3   20170505    20170531
    4   20170530    20170531
    5   20170530    20170831
    6   20171001    20171005
    7   20171101    20171225
    8   20171105    20171110

    Required Output:
    StartDate   EndDate
    20170101    20170401
    20170505    20170831
    20171001    20171005

Impala不支持我尝试实现的示例:

Example of what I am trying to achieve that is not supported in Impala:

    SELECT
           s1.StartDate,
           MIN(t1.EndDate) AS EndDate
    FROM @T s1
    INNER JOIN @T t1 ON s1.StartDate <= t1.EndDate
      AND NOT EXISTS(SELECT * FROM @T t2
             WHERE t1.EndDate >= t2.StartDate AND t1.EndDate < t2.EndDate)
    WHERE NOT EXISTS(SELECT * FROM @T s2
                     WHERE s1.StartDate > s2.StartDate AND s1.StartDate <= s2.EndDate)
    GROUP BY s1.StartDate
    ORDER BY s1.StartDate

类似的问题:

合并重叠的日期间隔

消除并减少重叠的日期范围

https://gerireshef.wordpress.com/2010/05/02/packing-date-intervals/

https://www.sqlservercentral.com/Forums/Topic826031-8 -1.aspx

推荐答案

select  min(StartDate)  as StartDate
       ,max(EndDate)    as EndDate

from   (select  StartDate,EndDate
               ,count (is_gap) over
                (
                    order by    StartDate,ID
                )   as range_id

        from   (select  ID,StartDate,EndDate
                       ,case
                            when    max (EndDate) over
                                    (
                                        order by    StartDate,ID
                                        rows        between unbounded preceding
                                                    and     1 preceding
                                    ) < StartDate
                            then    true
                        end as is_gap

                from    t
                ) t
        ) t

group by    range_id

order by    StartDate
;


+------------+------------+
| startdate  | enddate    |
+------------+------------+
| 2017-01-01 | 2017-04-01 |
| 2017-05-05 | 2017-08-31 |
| 2017-10-01 | 2017-10-05 |
| 2017-11-01 | 2017-12-25 |
+------------+------------+

这篇关于Impala SQL:合并具有重叠日期的行.不支持WHERE EXISTS和递归CTE的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

06-11 15:47