广告点击次数分析

需求分析

  • 广告:ad

    • ad_action。用户行为;0 曝光;1 曝光后点击;2 购买

    • duration。停留时长

    • shop_id。商家id

    • event_type。“ad”

    • ad_type。格式类型;1 JPG;2 PNG;3 GIF;4 SWF

    • show_style。显示风格,0 静态图;1 动态图

    • product_id。产品id

    • place。广告位置;首页=1,左侧=2,右侧=3,列表页=4

    • sort。排序位置

  • 公共字段

  • 分时统计:

    曝光次数、不同用户id数(公共信息中的uid)、不同用户数(公共信息中的device_id)

    点击次数、不同用户id数、不同用户数(device_id)

    购买次数、不同用户id数、不同用户数(device_id)

DWD => DWS(不需要) => ADS;在某个分析中不是所有的层都会用到

创建ADS层表

1
2
3
4
5
6
7
8
9
10
drop table if exists ads.ads_ad_show;
create table ads.ads_ad_show(
cnt bigint,
u_cnt bigint,
device_cnt bigint,
ad_action tinyint,
hour string
)
PARTITIONED BY (`dt` string)
row format delimited fields terminated by ',';

加载ADS层数据

创建/data/lagoudw/script/advertisement/ads_load_ad_show.sh

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
cd /data/script/advertisement

vim ads_load_ad_show.sh

#!/bin/bash
source /etc/profile
if [ -n "$1" ]
then
do_date=$1
else
do_date=`date -d "-1 day" +%F`
fi

sql="
insert overwrite table ads.ads_ad_show partition (dt='$do_date')
select count(1),
count(distinct uid),
count(distinct device_id),
ad_action, hour
from dwd.dwd_ad
where dt='$do_date'
group by ad_action, hour;
"

hive -e "$sql"
1
sh ads_load_ad_show.sh 2020-07-20

漏斗分析(点击率购买率)

需求分析

分时统计:

点击率 = 点击次数 / 曝光次数

购买率 = 购买次数 / 点击次数

创建ADS层表

1
2
3
4
5
6
7
8
drop table if exists ads.ads_ad_show_rate;
create table ads.ads_ad_show_rate(
hour string,
click_rate double,
buy_rate double
)
PARTITIONED BY (`dt` string)
row format delimited fields terminated by ',';
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
-- 方法一
select sum(case when ad_action='0' then cnt end) show_cnt,
sum(case when ad_action='1' then cnt end) click_cnt,
sum(case when ad_action='2' then cnt end) buy_cnt, hour
from ads.ads_ad_show
where dt='2020-08-02' and hour='01'
group by hour ;

-- 方法二
select max(case when ad_action='0' then cnt end) show_cnt,
max(case when ad_action='1' then cnt end) click_cnt,
max(case when ad_action='2' then cnt end) buy_cnt, hour
from ads.ads_ad_show
where dt='2020-08-02' and hour='01'
group by hour ;

加载ADS层数据

创建/data/lagoudw/script/advertisement/ads_load_ad_show_rate.sh

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
cd /data/script/advertisement

vim ads_load_ad_show_rate.sh

#!/bin/bash
source /etc/profile
if [ -n "$1" ]
then
do_date=$1
else
do_date=`date -d "-1 day" +%F`
fi

sql="
with tmp as(
select max(case when ad_action='0' then cnt end) show_cnt,
max(case when ad_action='1' then cnt end) click_cnt,
max(case when ad_action='2' then cnt end) buy_cnt, hour
from ads.ads_ad_show
where dt='$do_date'
group by hour
)
insert overwrite table ads.ads_ad_show_rate partition (dt='$do_date')
select hour, click_cnt / show_cnt as click_rate, buy_cnt / click_cnt as buy_rate
from tmp;
"

hive -e "$sql"
1
sh ads_load_ad_show_rate.sh 2020-07-20

广告效果分析

需求分析

活动曝光效果评估:

行为(曝光、点击、购买)、时间段、广告位、商品,统计对应的次数

时间段、广告位、商品,曝光次数最多的前100个

创建ADS层表

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
drop table if exists ads.ads_ad_show_place;
create table ads.ads_ad_show_place(
ad_action tinyint,
hour string,
place string,
product_id int,
cnt bigint
)
PARTITIONED BY (`dt` string)
row format delimited fields terminated by ',';

drop table if exists ads.ads_ad_show_place_window;
create table ads.ads_ad_show_place_window(
hour string,
place string,
product_id int,
cnt bigint,
rank int
)
PARTITIONED BY (`dt` string)
row format delimited fields terminated by ',';

加载ADS层数据

创建/data/lagoudw/script/advertisement/ads_load_ad_show_page.sh

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
cd /data/script/advertisement

vim ads_load_ad_show_page.sh

#!/bin/bash
source /etc/profile
if [ -n "$1" ]
then
do_date=$1
else
do_date=`date -d "-1 day" +%F`
fi

sql="
insert overwrite table ads.ads_ad_show_place partition (dt='$do_date')
select ad_action, hour, place, product_id, count(1)
from dwd.dwd_ad
where dt='$do_date'
group by ad_action, hour, place, product_id;
"

hive -e "$sql"
1
sh ads_load_ad_show_page.sh 2020-07-20

创建/data/lagoudw/script/advertisement/ads_load_ad_show_page_window.sh

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
cd /data/script/advertisement

vim ads_load_ad_show_page_window.sh

#!/bin/bash
source /etc/profile
if [ -n "$1" ]
then
do_date=$1
else
do_date=`date -d "-1 day" +%F`
fi

sql="
insert overwrite table ads.ads_ad_show_place_window partition (dt='$do_date')
select * from (
select hour, place, product_id, cnt,
row_number() over (partition by hour, place, product_id order by cnt desc) rank
from ads.ads_ad_show_place
where dt='$do_date' and ad_action='0'
) t
where rank <= 100;
"

hive -e "$sql"
1
sh ads_load_ad_show_page_window.sh 2020-07-20

广告分析小结


脚本调用次序:

1
2
3
4
5
6
7
8
9
ods_load_event_log.sh

dwd_load_event_log.sh
dwd_load_ad_log.sh

ads_load_ad_show.sh
ads_load_ad_show_rate.sh
ads_load_ad_show_page.sh
ads_load_ad_show_page_window.sh