題解 | #計算用戶的平均次日留存率#
計算用戶的平均次日留存率
http://www.fangfengwang8.cn/practice/126083961ae0415fbde061d7ebbde453
#第一步: 去重,原表數(shù)據(jù)有個別重復(fù)的,這些會影響統(tǒng)計,先做去重篩選工作
SELECT DISTINCT device_id,question_id, result, DATE FROM question_practice_detail
#第二步:按照device_id分組,以date排序,用窗口函數(shù)lead,構(gòu)造一個字段next_day,表示當(dāng)前日期的后一條記錄
SELECT question_id, result, device_id,DATE, lead(DATE,1) over(PARTITION BY device_id ORDER BY DATE) "next_day" FROM (SELECT DISTINCT device_id,question_id, result, DATE FROM question_practice_detail) t
#第三步:判斷date和nexy_day是否只差一天,是的話表示該用戶第二天有繼續(xù)答題,賦值1以資鼓勵,否則就位0
SELECT *,IF(DATE + INTERVAL 1 DAY = next_day,1,0) "next_day_question" FROM (SELECT question_id, result, device_id,DATE, lead(DATE,1) over(PARTITION BY device_id ORDER BY DATE) "next_day" FROM (SELECT DISTINCT device_id,question_id, result, DATE FROM question_practice_detail) t) tt
#第四步:根據(jù)第三步,賦值1的都是第二天答題的用戶,這時候可以求平均率了
SELECT ROUND(SUM(next_day_question) / COUNT(*),4) "avg_ret" FROM (SELECT *,IF(DATE + INTERVAL 1 DAY = next_day,1,0) "next_day_question" FROM (SELECT question_id, result, device_id,DATE, lead(DATE,1) over(PARTITION BY device_id ORDER BY DATE) "next_day" FROM (SELECT DISTINCT device_id,question_id, result, DATE FROM question_practice_detail) t) tt) ttt