常用查询: 日活、日增、次留,各类排行榜

Google Analytics + Firebase 统计数据在 BigQuery 中常用的查询语句。

日活 = 每日活跃用户数 周活 = 每周活跃用户数

日增 = 每日新增用户数

次留 = 今日新增用户,明日依然活跃数

热门文章 = 每日文章阅读排行

热门来源 = 每日外部来源排行

不同客户端应用,对应事件名可能不同。本文主要以 web 应用 为例。

日活

-- active 日活数
-- user   日活用户数
SELECT count(user_pseudo_id) as active, count(distinct(user_pseudo_id)) as user
FROM `[project-Id].[dataset-Id].events_20220522`
WHERE `event_name` = "session_start"

周活

由于 BigQuery 数据集是按照日进行存储的。所以要计算周活的话,就需要进行联表查询。

BigQuery 联表查询,有两种方式,一种是直接在 FROM 字段中, 增加数据集。如下:

-- active 周活数
-- user   周活用户数
SELECT count(user_pseudo_id) as active, count(distinct(user_pseudo_id)) as user
FROM (
    SELECT * 
    FROM `[project-Id].[dataset-Id].events_20220522`, 
         `[project-Id].[dataset-Id].events_20220521`,
         `[project-Id].[dataset-Id].events_20220520`,
         `[project-Id].[dataset-Id].events_20220519`,
         `[project-Id].[dataset-Id].events_20220518`,
         `[project-Id].[dataset-Id].events_20220517`,
         `[project-Id].[dataset-Id].events_20220516`    
)
WHERE `event_name` = "session_start"

另外一种联表方式是,通过 UNION ALL 关键字进行联合:

-- active 周活数
-- user   周活用户数
SELECT count(user_pseudo_id) as active, count(distinct(user_pseudo_id)) as user
FROM (
    SELECT * FROM `[project-Id].[dataset-Id].events_20220522`
    UNION ALL
    SELECT * FROM `[project-Id].[dataset-Id].events_20220521`
    UNION ALL
    SELECT * FROM `[project-Id].[dataset-Id].events_20220520`
    UNION ALL
    SELECT * FROM `[project-Id].[dataset-Id].events_20220519`
    UNION ALL
    SELECT * FROM `[project-Id].[dataset-Id].events_20220518`
    UNION ALL
    SELECT * FROM `[project-Id].[dataset-Id].events_20220517`
    UNION ALL
    SELECT * FROM `[project-Id].[dataset-Id].events_20220516`
)
WHERE `event_name` = "session_start"

日增

-- user   日增用户数
SELECT count(distinct(user_pseudo_id)) as user
FROM `[project-Id].[dataset-Id].events_20220522`
WHERE `event_name` = "first_open"

次留

-- user   次留用户数
SELECT count(distinct(user_pseudo_id)) as user
FROM `[project-Id].[dataset-Id].events_20220522`  -- 今天
WHERE `event_name` = "first_open"
AND `user_pseudo_id` in (
    SELECT distinct(user_pseudo_id)
    FROM `[project-Id].[dataset-Id].events_20220523` -- 明天
    WHERE `event_name` = "session_start"
)

文章阅读排行

-- page_title       文章标题
-- page_location    文章路径
-- viewed           阅读次数
SELECT page_title, page_location, count(*) as viewed
FROM (
    SELECT `user_pseudo_id`, `event_timestamp`,
       MAX(if(param.key = "page_title", param.value.string_value, NULL)) AS page_title,
       MAX(if(param.key = "page_location", param.value.string_value, NULL)) AS page_location
    FROM (
        SELECT `user_pseudo_id`, `event_timestamp`, param
        FROM `[project-Id].[dataset-Id].events_20220517`, UNNEST(event_params) AS param 
        WHERE event_name = "page_view"
        AND (param.key = "page_location" OR param.key = "page_title")
    )
    GROUP BY `user_pseudo_id`, `event_timestamp`
)
GROUP BY page_title, page_location
ORDER BY viewed DESC

热门来源排行

-- referer       来源地址
-- total           引用次数
SELECT param.value.string_value as referer, count(*) as total 
FROM `[project-Id].[dataset-Id].events_20220517`, UNNEST(`event_params`) AS param 
WHERE event_name = 'page_view' 
    AND param.key = 'page_referrer' 
    AND (CONTAINS_SUBSTR(param.value.string_value, "127.0.0.1") = false)    -- 过滤本站
    AND (CONTAINS_SUBSTR(param.value.string_value, "localhost") = false)    -- 过滤本站 
    AND (CONTAINS_SUBSTR(param.value.string_value, "mbox.dev") = false)     -- 过滤本站
GROUP BY param.value.string_value 
ORDER BY total DESC

← UNNEST 辅助查询 None

评论