For each fullVisitorId, I'm trying to get all the visitIds between date_1 and date_2. These dates are different for every user.

Can anyone give me some pointers on how to do this?

For example:

user_1: I want all visitIds between June 1 and June 20
user_2: I want all visitIds between June 12 and June 27
...and so on


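date_1 and date_2 correspond to important actions (Event hits) the user performed on the website: downloading a trial and making a purchase.

Thanks in advance for any leads.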

Best answer

One possible way to solve this is with analytic functions. For example:

#standardSQL
WITH data AS(
  select '1' as user, '1' as visitid, '20170520' as date, ARRAY<STRUCT<hitNumber INT64, eventInfo STRUCT<eventCategory STRING> >> [STRUCT(1 as hitNumber, STRUCT('event1' as eventCategory) as eventInfo)] hits UNION ALL
  select '1' as user, '2' as visitid, '20170521' as date, ARRAY<STRUCT<hitNumber INT64, eventInfo STRUCT<eventCategory STRING> >> [STRUCT(1 as hitNumber, STRUCT('' as eventCategory) as eventInfo)] hits UNION ALL
  select '1' as user, '3' as visitid, '20170522' as date, ARRAY<STRUCT<hitNumber INT64, eventInfo STRUCT<eventCategory STRING> >> [STRUCT(1 as hitNumber, STRUCT('event2' as eventCategory) as eventInfo)] hits UNION ALL
  select '1' as user, '4' as visitid, '20170523' as date, ARRAY<STRUCT<hitNumber INT64, eventInfo STRUCT<eventCategory STRING> >> [STRUCT(1 as hitNumber, STRUCT('' as eventCategory) as eventInfo)] hits UNION ALL

  select '2' as user, '1' as visitid, '20170520' as date, ARRAY<STRUCT<hitNumber INT64, eventInfo STRUCT<eventCategory STRING> >> [STRUCT(1 as hitNumber, STRUCT('event1' as eventCategory) as eventInfo)] hits UNION ALL
  select '2' as user, '2' as visitid, '20170521' as date, ARRAY<STRUCT<hitNumber INT64, eventInfo STRUCT<eventCategory STRING> >> [STRUCT(1 as hitNumber, STRUCT('event2' as eventCategory) as eventInfo)] hits UNION ALL
  select '2' as user, '3' as visitid, '20170522' as date, ARRAY<STRUCT<hitNumber INT64, eventInfo STRUCT<eventCategory STRING> >> [STRUCT(1 as hitNumber, STRUCT('' as eventCategory) as eventInfo)] hits UNION ALL

  select '3' as user, '1' as visitid, '20170520' as date, ARRAY<STRUCT<hitNumber INT64, eventInfo STRUCT<eventCategory STRING> >> [STRUCT(1 as hitNumber, STRUCT('event1' as eventCategory) as eventInfo)] hits UNION ALL
  select '3' as user, '2' as visitid, '20170521' as date, ARRAY<STRUCT<hitNumber INT64, eventInfo STRUCT<eventCategory STRING> >> [STRUCT(1 as hitNumber, STRUCT('' as eventCategory) as eventInfo)] hits UNION ALL
  select '3' as user, '3' as visitid, '20170522' as date, ARRAY<STRUCT<hitNumber INT64, eventInfo STRUCT<eventCategory STRING> >> [STRUCT(1 as hitNumber, STRUCT('' as eventCategory) as eventInfo)] hits
)

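SELECT
  user,
  visitid,
  date
FROM(
  SELECT
    user,
    visitid,
    date,
    -- earliest date on which this user has an 'event1' hit (date_1)
    MIN(CASE WHEN hits.eventInfo.eventCategory = 'event1' THEN date END) OVER(PARTITION BY user) min_date,
    -- latest date on which this user has an 'event2' hit (date_2)
    MAX(CASE WHEN hits.eventInfo.eventCategory = 'event2' THEN date END) OVER(PARTITION BY user) max_date
  FROM data,
  UNNEST(hits) hits
)
-- a user missing either event gets a NULL bound and is filtered out here
WHERE date BETWEEN min_date AND max_date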


Here, data is a mock of your ga_sessions data (I renamed "fullvisitorid" to "user").

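This assumes that a given user can trigger each event on more than one date (hence MIN for date_1 and MAX for date_2), and that you store the event in the eventCategory field. If your "download" and "purchase" events are defined at the session level, I recommend using the customDimensions field instead of hits.eventInfo.eventCategory.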
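To illustrate the customDimensions variant, here is a minimal sketch. It assumes, purely for illustration, that custom dimension index 1 carries the event name with the values 'download' and 'purchase', and that each session has at most one entry for that index; the mock rows are hypothetical and not taken from the question.

```sql
#standardSQL
-- Hypothetical mock: customDimensions index 1 holds the session-level event name.
WITH data AS(
  SELECT '1' AS user, '1' AS visitid, '20170520' AS date,
    [STRUCT(1 AS index, 'download' AS value)] AS customDimensions UNION ALL
  SELECT '1', '2', '20170521', [STRUCT(1 AS index, '' AS value)] UNION ALL
  SELECT '1', '3', '20170522', [STRUCT(1 AS index, 'purchase' AS value)] UNION ALL
  SELECT '1', '4', '20170523', [STRUCT(1 AS index, '' AS value)]
)

SELECT
  user,
  visitid,
  date
FROM(
  SELECT
    user,
    visitid,
    date,
    MIN(IF(event = 'download', date, NULL)) OVER(PARTITION BY user) AS min_date,
    MAX(IF(event = 'purchase', date, NULL)) OVER(PARTITION BY user) AS max_date
  FROM(
    SELECT
      user,
      visitid,
      date,
      -- scalar subquery: assumes at most one entry with index 1 per session
      (SELECT value FROM UNNEST(customDimensions) WHERE index = 1) AS event
    FROM data
  )
)
-- keeps the sessions dated 20170520 through 20170522 (visits 1 to 3)
WHERE date BETWEEN min_date AND max_date
```

Because the custom dimension is read with a scalar subquery rather than a join against UNNEST, each session stays a single row no matter how many entries the array holds.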

Besides analytic functions, you can also use standard SQL ARRAYs and STRUCTs:

SELECT
  user,
  -- keep only the sessions that fall inside this user's date range
  ARRAY(SELECT AS STRUCT visitid, date FROM UNNEST(user_data) WHERE date BETWEEN min_date AND max_date) user_data
FROM(
  SELECT
    user,
    -- collect every (visitid, date) pair of the user into one array
    ARRAY_AGG((SELECT AS STRUCT visitid, date)) user_data,
    MIN(CASE WHEN EXISTS(SELECT 1 FROM UNNEST(hits) hits WHERE hits.eventInfo.eventCategory = 'event1') THEN date END) min_date,
    MAX(CASE WHEN EXISTS(SELECT 1 FROM UNNEST(hits) hits WHERE hits.eventInfo.eventCategory = 'event2') THEN date END) max_date
  FROM data
  GROUP BY user
)
-- drop users whose filtered array would be empty
WHERE ARRAY_LENGTH(ARRAY(SELECT AS STRUCT visitid, date FROM UNNEST(user_data) WHERE date BETWEEN min_date AND max_date)) > 0


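If the assumptions I made do not match your data, you can still adapt these techniques to query what you need. You can also use the mock data for testing (and adjust it to better fit your dataset).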
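As one way of adapting the analytic-function approach to a real export, the sketch below queries the ga_sessions tables directly. The table path `my-project.my_dataset.ga_sessions_*`, the suffix range, and the category values 'download' and 'purchase' are placeholders, not values from the question; substitute your own. It checks for the events with EXISTS rather than joining against UNNEST(hits), so a session with many hits still yields a single row.

```sql
#standardSQL
SELECT
  fullVisitorId,
  visitId,
  date
FROM(
  SELECT
    fullVisitorId,
    visitId,
    date,
    -- date_1: first session date containing a download event
    MIN(IF(has_download, date, NULL)) OVER(PARTITION BY fullVisitorId) AS min_date,
    -- date_2: last session date containing a purchase event
    MAX(IF(has_purchase, date, NULL)) OVER(PARTITION BY fullVisitorId) AS max_date
  FROM(
    SELECT
      fullVisitorId,
      visitId,
      date,
      EXISTS(SELECT 1 FROM UNNEST(hits) AS h
             WHERE h.eventInfo.eventCategory = 'download') AS has_download,
      EXISTS(SELECT 1 FROM UNNEST(hits) AS h
             WHERE h.eventInfo.eventCategory = 'purchase') AS has_purchase
    -- placeholder table path and date suffixes
    FROM `my-project.my_dataset.ga_sessions_*`
    WHERE _TABLE_SUFFIX BETWEEN '20170501' AND '20170630'
  )
)
WHERE date BETWEEN min_date AND max_date
```

Since date in the export is a STRING in YYYYMMDD form, the BETWEEN comparison orders correctly as plain text; users missing either event end up with a NULL bound and drop out of the result.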

Regarding google-analytics - VisitIds between a (different) date range for each user, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44352715/
