🛠 Workshop — KPI Calculation & Cohort Retention Heatmap

Dataset e-commerce: orders + customers + sessions → Tính KPI: revenue, AOV, CAC, churn rate (Python) → Funnel: visit → add_to_cart → checkout → purchase → Cohort: monthly retention heatmap. Output: KPI report + funnel chart + cohort heatmap!

🎯 Mục tiêu workshop

Sau khi hoàn thành workshop này, bạn sẽ:

Tính toàn bộ KPI từ raw data: Revenue, AOV, ARPU, CAC, churn rate, LTV
Xây funnel analysis và visualize drop-off ở mỗi bước
Tạo cohort retention heatmap — phân tích retention theo tháng acquire
Tính LTV/CAC ratio và đánh giá unit economics health
Xuất KPI report tổng hợp cho stakeholder

🧰 Yêu cầu

Yêu cầu	Chi tiết
Kiến thức	Đã hoàn thành Buổi 13 lý thuyết (Business Metrics)
Python	pandas, matplotlib, seaborn (đã học Buổi 7-9)
Thời gian	90–120 phút
Output	KPI report + funnel chart + cohort heatmap (PNG/PDF)

💡 Naming convention

Đặt tên file: HoTen_Buoi13_KPI_Report.ipynb (notebook) + HoTen_Buoi13_Cohort_Heatmap.png (export)

📦 Phần 0: Tạo Dataset (10 phút)

Chúng ta sẽ tạo 3 bảng data mô phỏng e-commerce:

python

import pandas as pd
import numpy as np
from datetime import datetime, timedelta
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

np.random.seed(42)

# ============================================
# TABLE 1: CUSTOMERS (2,000 customers)
# ============================================
n_customers = 2000
signup_dates = pd.date_range('2025-01-01', '2025-06-30', periods=n_customers)
channels = np.random.choice(
    ['Organic', 'Facebook', 'Google', 'TikTok', 'Referral'],
    n_customers,
    p=[0.25, 0.30, 0.20, 0.15, 0.10]
)
customers = pd.DataFrame({
    'customer_id': [f'C{i:04d}' for i in range(1, n_customers + 1)],
    'signup_date': signup_dates,
    'channel': channels,
    'signup_month': signup_dates.to_period('M')
})

print(f"✅ Customers: {len(customers)} rows")
print(customers.head())

python

# ============================================
# TABLE 2: ORDERS (8,500 orders)
# ============================================
n_orders = 8500
order_records = []

for i in range(n_orders):
    # Random customer
    cust_idx = np.random.randint(0, n_customers)
    cust = customers.iloc[cust_idx]

    # Order date: sometime after signup, within 6 months
    days_after = np.random.exponential(scale=30)
    order_date = cust['signup_date'] + timedelta(days=int(days_after))

    # Cap at end of dataset period
    if order_date > pd.Timestamp('2025-09-30'):
        continue

    # Revenue: log-normal distribution (realistic for e-commerce)
    revenue = int(np.random.lognormal(mean=12.0, sigma=0.6))  # ~160K VND center

    order_records.append({
        'order_id': f'O{i:05d}',
        'customer_id': cust['customer_id'],
        'order_date': order_date,
        'revenue': revenue,
        'order_month': order_date.to_period('M')
    })

orders = pd.DataFrame(order_records)
print(f"\n✅ Orders: {len(orders)} rows")
print(orders.head())
print(f"\nRevenue range: {orders['revenue'].min():,} — {orders['revenue'].max():,} VND")

python

# ============================================
# TABLE 3: SESSIONS / FUNNEL (50,000 sessions)
# ============================================
n_sessions = 50000
funnel_steps = []

for i in range(n_sessions):
    session_date = pd.Timestamp('2025-01-01') + timedelta(
        days=np.random.randint(0, 270)
    )
    channel = np.random.choice(
        ['Organic', 'Facebook', 'Google', 'TikTok', 'Referral'],
        p=[0.20, 0.30, 0.22, 0.18, 0.10]
    )

    # Funnel: visit → view_product → add_to_cart → checkout → purchase
    visited = True
    viewed = np.random.random() < 0.55
    added = viewed and (np.random.random() < 0.30)
    checked = added and (np.random.random() < 0.55)
    purchased = checked and (np.random.random() < 0.65)

    funnel_steps.append({
        'session_id': f'S{i:06d}',
        'session_date': session_date,
        'channel': channel,
        'visited': visited,
        'viewed_product': viewed,
        'added_to_cart': added,
        'began_checkout': checked,
        'purchased': purchased
    })

sessions = pd.DataFrame(funnel_steps)
print(f"\n✅ Sessions: {len(sessions)} rows")
print(sessions.head())

Phần 1: KPI Calculation (25 phút)

Bước 1.1: Revenue Metrics

python

# ============================================
# REVENUE METRICS
# ============================================

# Total Revenue
total_revenue = orders['revenue'].sum()

# GMV (giả sử refund rate 4%)
refund_rate = 0.04
gmv = total_revenue / (1 - refund_rate)

# Number of orders
n_total_orders = len(orders)

# AOV (Average Order Value)
aov = total_revenue / n_total_orders

# Active customers (placed at least 1 order)
active_customers = orders['customer_id'].nunique()

# ARPU (Average Revenue Per User)
arpu = total_revenue / active_customers

# Monthly Revenue Trend
monthly_revenue = orders.groupby('order_month')['revenue'].agg(['sum', 'count', 'mean'])
monthly_revenue.columns = ['Revenue', 'Orders', 'AOV']

print("=" * 50)
print("📊 REVENUE METRICS")
print("=" * 50)
print(f"GMV:              {gmv:>15,.0f} VND")
print(f"Net Revenue:      {total_revenue:>15,.0f} VND")
print(f"Total Orders:     {n_total_orders:>15,}")
print(f"AOV:              {aov:>15,.0f} VND")
print(f"Active Customers: {active_customers:>15,}")
print(f"ARPU:             {arpu:>15,.0f} VND")
print(f"\n📈 Monthly Revenue Trend:")
print(monthly_revenue.to_string())

Bước 1.2: Marketing Metrics — CAC by Channel

python

# ============================================
# MARKETING METRICS — CAC by Channel
# ============================================

# Simulated marketing spend by channel (6 months)
marketing_spend = {
    'Organic': 50_000_000,      # 50M (content, SEO)
    'Facebook': 280_000_000,    # 280M
    'Google': 200_000_000,      # 200M
    'TikTok': 180_000_000,      # 180M
    'Referral': 40_000_000      # 40M (referral rewards)
}

# Count new customers by channel
cust_by_channel = customers['channel'].value_counts()

# CAC by channel
cac_by_channel = pd.DataFrame({
    'New_Customers': cust_by_channel,
    'Marketing_Spend': pd.Series(marketing_spend)
})
cac_by_channel['CAC'] = cac_by_channel['Marketing_Spend'] / cac_by_channel['New_Customers']
cac_by_channel = cac_by_channel.sort_values('CAC')

# Total / Blended CAC
total_marketing = sum(marketing_spend.values())
total_new_customers = len(customers)
blended_cac = total_marketing / total_new_customers

print("\n" + "=" * 50)
print("📊 MARKETING METRICS — CAC by Channel")
print("=" * 50)
print(cac_by_channel.to_string())
print(f"\nTotal Marketing Spend: {total_marketing:>12,.0f} VND")
print(f"Total New Customers:   {total_new_customers:>12,}")
print(f"Blended CAC:           {blended_cac:>12,.0f} VND")

Bước 1.3: Customer Metrics — Churn, Retention, LTV

python

# ============================================
# CUSTOMER METRICS — Churn & LTV
# ============================================

# Define "active": placed order in last 60 days (from Sept 30, 2025)
cutoff_date = pd.Timestamp('2025-09-30')
active_window = cutoff_date - timedelta(days=60)

# Last order date per customer
last_order = orders.groupby('customer_id')['order_date'].max().reset_index()
last_order.columns = ['customer_id', 'last_order_date']

# Merge with customers
cust_status = customers.merge(last_order, on='customer_id', how='left')
cust_status['has_ordered'] = cust_status['last_order_date'].notna()
cust_status['is_active'] = cust_status['last_order_date'] >= active_window

# Churn rate (among those who ordered at least once)
ordered_customers = cust_status[cust_status['has_ordered']]
churned = ordered_customers[~ordered_customers['is_active']]
churn_rate = len(churned) / len(ordered_customers)
retention_rate = 1 - churn_rate

# Average customer lifespan
avg_lifespan_months = 1 / (churn_rate / 6)  # approximation over 6-month window

# ARPU monthly
months_in_data = 9  # Jan - Sep
arpu_monthly = total_revenue / active_customers / months_in_data

# LTV
ltv = arpu_monthly * avg_lifespan_months

# LTV / CAC
ltv_cac = ltv / blended_cac

print("\n" + "=" * 50)
print("📊 CUSTOMER METRICS")
print("=" * 50)
print(f"Customers with orders:  {len(ordered_customers):>8,}")
print(f"Active (last 60 days):  {len(ordered_customers[ordered_customers['is_active']]):>8,}")
print(f"Churned:                {len(churned):>8,}")
print(f"Churn Rate:             {churn_rate:>8.1%}")
print(f"Retention Rate:         {retention_rate:>8.1%}")
print(f"Avg Lifespan (months):  {avg_lifespan_months:>8.1f}")
print(f"ARPU (monthly):         {arpu_monthly:>8,.0f} VND")
print(f"LTV:                    {ltv:>8,.0f} VND")
print(f"Blended CAC:            {blended_cac:>8,.0f} VND")
print(f"LTV/CAC Ratio:          {ltv_cac:>8.1f}x")

# Health assessment
if ltv_cac >= 3:
    print(f"\n✅ LTV/CAC = {ltv_cac:.1f}x — Healthy unit economics!")
elif ltv_cac >= 1:
    print(f"\n⚠️ LTV/CAC = {ltv_cac:.1f}x — Marginal, cần optimize")
else:
    print(f"\n🔴 LTV/CAC = {ltv_cac:.1f}x — Thua lỗ! Cần action ngay")

Bước 1.4: LTV/CAC by Channel

python

# ============================================
# LTV/CAC BY CHANNEL
# ============================================

# Revenue per customer by channel
cust_revenue = orders.merge(
    customers[['customer_id', 'channel']], on='customer_id'
)
revenue_by_channel = cust_revenue.groupby('channel')['revenue'].sum()
cust_count_by_channel = cust_revenue.groupby('channel')['customer_id'].nunique()

arpu_by_channel = revenue_by_channel / cust_count_by_channel
ltv_by_channel = arpu_by_channel  # simplified: total revenue as proxy for LTV

# Combine
channel_economics = pd.DataFrame({
    'Customers': cust_by_channel,
    'Revenue': revenue_by_channel,
    'ARPU': arpu_by_channel,
    'CAC': cac_by_channel['CAC'],
    'LTV_CAC': arpu_by_channel / cac_by_channel['CAC']
}).sort_values('LTV_CAC', ascending=False)

print("\n" + "=" * 50)
print("📊 CHANNEL ECONOMICS — LTV/CAC by Channel")
print("=" * 50)
print(channel_economics.round(0).to_string())

# Highlight best and worst
best = channel_economics['LTV_CAC'].idxmax()
worst = channel_economics['LTV_CAC'].idxmin()
print(f"\n🟢 Best channel:  {best} (LTV/CAC = {channel_economics.loc[best, 'LTV_CAC']:.1f}x)")
print(f"🔴 Worst channel: {worst} (LTV/CAC = {channel_economics.loc[worst, 'LTV_CAC']:.1f}x)")

Phần 2: Funnel Analysis & Visualization (20 phút)

Bước 2.1: Overall Funnel

python

# ============================================
# FUNNEL ANALYSIS
# ============================================

# Count users at each stage
funnel_counts = pd.DataFrame({
    'Stage': ['Visit', 'View Product', 'Add to Cart', 'Begin Checkout', 'Purchase'],
    'Users': [
        sessions['visited'].sum(),
        sessions['viewed_product'].sum(),
        sessions['added_to_cart'].sum(),
        sessions['began_checkout'].sum(),
        sessions['purchased'].sum()
    ]
})

funnel_counts['Pct_of_Total'] = (funnel_counts['Users'] / funnel_counts['Users'].iloc[0] * 100).round(1)
funnel_counts['Step_CVR'] = [100.0] + [
    round(funnel_counts['Users'].iloc[i] / funnel_counts['Users'].iloc[i-1] * 100, 1)
    for i in range(1, len(funnel_counts))
]
funnel_counts['Drop_off'] = [0] + [
    funnel_counts['Users'].iloc[i-1] - funnel_counts['Users'].iloc[i]
    for i in range(1, len(funnel_counts))
]

print("=" * 60)
print("📊 E-COMMERCE FUNNEL ANALYSIS")
print("=" * 60)
print(funnel_counts.to_string(index=False))

Bước 2.2: Funnel Visualization

python

# ============================================
# FUNNEL CHART — Horizontal Bar
# ============================================

fig, axes = plt.subplots(1, 2, figsize=(16, 6))

# Chart 1: Funnel bar chart
colors = ['#1976D2', '#2196F3', '#42A5F5', '#64B5F6', '#90CAF9']
bars = axes[0].barh(
    funnel_counts['Stage'][::-1],
    funnel_counts['Users'][::-1],
    color=colors,
    height=0.6
)

# Add labels
for bar, pct in zip(bars, funnel_counts['Pct_of_Total'][::-1]):
    width = bar.get_width()
    axes[0].text(width + 200, bar.get_y() + bar.get_height()/2,
                f'{int(width):,} ({pct}%)',
                va='center', fontsize=10, fontweight='bold')

axes[0].set_title('E-Commerce Purchase Funnel', fontsize=14, fontweight='bold')
axes[0].set_xlabel('Number of Users')
axes[0].spines['top'].set_visible(False)
axes[0].spines['right'].set_visible(False)

# Chart 2: Drop-off waterfall
drop_stages = ['Visit→View', 'View→Cart', 'Cart→Checkout', 'Checkout→Purchase']
drop_values = funnel_counts['Drop_off'].iloc[1:].values
drop_pcts = [round(d / funnel_counts['Users'].iloc[i] * 100, 1)
             for i, d in enumerate(drop_values)]

drop_colors = ['#EF5350' if p > 50 else '#FFA726' if p > 30 else '#66BB6A'
               for p in drop_pcts]

axes[1].bar(drop_stages, drop_values, color=drop_colors, width=0.5)
for i, (v, p) in enumerate(zip(drop_values, drop_pcts)):
    axes[1].text(i, v + 200, f'{v:,}\n({p}% drop)', ha='center', fontsize=9)

axes[1].set_title('Drop-off at Each Stage', fontsize=14, fontweight='bold')
axes[1].set_ylabel('Users Lost')
axes[1].spines['top'].set_visible(False)
axes[1].spines['right'].set_visible(False)

plt.tight_layout()
plt.savefig('funnel_analysis.png', dpi=150, bbox_inches='tight')
plt.show()
print("✅ Saved: funnel_analysis.png")

Bước 2.3: Funnel by Channel

python

# ============================================
# FUNNEL BY CHANNEL
# ============================================

channel_funnel = sessions.groupby('channel').agg({
    'visited': 'sum',
    'viewed_product': 'sum',
    'added_to_cart': 'sum',
    'began_checkout': 'sum',
    'purchased': 'sum'
}).rename(columns={
    'visited': 'Visit',
    'viewed_product': 'View',
    'added_to_cart': 'Cart',
    'began_checkout': 'Checkout',
    'purchased': 'Purchase'
})

# Calculate overall CVR per channel
channel_funnel['Overall_CVR'] = (channel_funnel['Purchase'] / channel_funnel['Visit'] * 100).round(2)
channel_funnel = channel_funnel.sort_values('Overall_CVR', ascending=False)

print("\n" + "=" * 60)
print("📊 FUNNEL BY CHANNEL")
print("=" * 60)
print(channel_funnel.to_string())
print(f"\n🟢 Best CVR: {channel_funnel.index[0]} ({channel_funnel['Overall_CVR'].iloc[0]}%)")
print(f"🔴 Worst CVR: {channel_funnel.index[-1]} ({channel_funnel['Overall_CVR'].iloc[-1]}%)")

Phần 3: Cohort Retention Heatmap (30 phút)

Bước 3.1: Build Cohort Table

python

# ============================================
# COHORT ANALYSIS — Monthly Retention
# ============================================

# Step 1: Xác định cohort cho mỗi customer (tháng đăng ký)
customers['cohort'] = customers['signup_date'].dt.to_period('M')

# Step 2: Merge orders với cohort info
orders_cohort = orders.merge(
    customers[['customer_id', 'cohort']], on='customer_id'
)
orders_cohort['order_period'] = orders_cohort['order_date'].dt.to_period('M')

# Step 3: Tính period_number (months since cohort)
orders_cohort['period_number'] = (
    orders_cohort['order_period'].astype(int) - orders_cohort['cohort'].astype(int)
)

# Step 4: Count unique customers per cohort per period
cohort_data = orders_cohort.groupby(['cohort', 'period_number'])['customer_id'].nunique().reset_index()
cohort_data.columns = ['cohort', 'period_number', 'customers']

# Step 5: Pivot to cohort table
cohort_table = cohort_data.pivot(index='cohort', columns='period_number', values='customers')

# Step 6: Calculate retention rate (% of cohort size)
cohort_sizes = cohort_table[0]  # Month 0 = initial cohort size
retention_table = cohort_table.divide(cohort_sizes, axis=0) * 100

print("=" * 60)
print("📊 COHORT SIZES (Unique Customers)")
print("=" * 60)
print(cohort_table.to_string())

print("\n" + "=" * 60)
print("📊 COHORT RETENTION TABLE (%)")
print("=" * 60)
print(retention_table.round(1).to_string())

Bước 3.2: Cohort Retention Heatmap

python

# ============================================
# COHORT HEATMAP — Seaborn
# ============================================

fig, ax = plt.subplots(figsize=(14, 8))

# Rename columns for display
retention_display = retention_table.copy()
retention_display.columns = [f'Month {i}' for i in retention_display.columns]
retention_display.index = [str(idx) for idx in retention_display.index]

# Create heatmap
sns.heatmap(
    retention_display,
    annot=True,
    fmt='.0f',
    cmap='YlGnBu',
    linewidths=0.5,
    linecolor='white',
    vmin=0,
    vmax=100,
    cbar_kws={'label': 'Retention Rate (%)'},
    ax=ax
)

ax.set_title('Monthly Cohort Retention Heatmap (%)\nE-Commerce Dataset — Jan to Sep 2025',
             fontsize=14, fontweight='bold', pad=20)
ax.set_xlabel('Months Since First Purchase', fontsize=12)
ax.set_ylabel('Signup Cohort', fontsize=12)

plt.tight_layout()
plt.savefig('cohort_heatmap.png', dpi=150, bbox_inches='tight')
plt.show()
print("✅ Saved: cohort_heatmap.png")

Bước 3.3: Retention Curve

python

# ============================================
# AVERAGE RETENTION CURVE
# ============================================

avg_retention = retention_table.mean(axis=0)

fig, ax = plt.subplots(figsize=(10, 6))

ax.plot(avg_retention.index, avg_retention.values,
        'o-', color='#1976D2', linewidth=2.5, markersize=8)

# Annotate churn cliff
if len(avg_retention) > 1:
    max_drop_idx = avg_retention.diff().idxmin()
    ax.annotate(
        f'⚠️ Churn Cliff\n{avg_retention[max_drop_idx]:.0f}%',
        xy=(max_drop_idx, avg_retention[max_drop_idx]),
        xytext=(max_drop_idx + 0.5, avg_retention[max_drop_idx] + 10),
        fontsize=11, fontweight='bold', color='#D32F2F',
        arrowprops=dict(arrowstyle='->', color='#D32F2F', lw=2)
    )

# Fill area under curve
ax.fill_between(avg_retention.index, avg_retention.values,
                alpha=0.15, color='#1976D2')

ax.set_title('Average Retention Curve (All Cohorts)', fontsize=14, fontweight='bold')
ax.set_xlabel('Months Since Signup')
ax.set_ylabel('Retention Rate (%)')
ax.set_ylim(0, 105)
ax.grid(axis='y', alpha=0.3)
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

plt.tight_layout()
plt.savefig('retention_curve.png', dpi=150, bbox_inches='tight')
plt.show()
print("✅ Saved: retention_curve.png")

Phần 4: KPI Summary Report (15 phút)

Bước 4.1: Executive KPI Report

python

# ============================================
# EXECUTIVE KPI REPORT
# ============================================

print("=" * 60)
print("📋 EXECUTIVE KPI REPORT")
print("   E-Commerce Analytics — Jan to Sep 2025")
print("=" * 60)

print("\n💰 REVENUE METRICS")
print(f"   GMV:                    {gmv:>15,.0f} VND")
print(f"   Net Revenue:            {total_revenue:>15,.0f} VND")
print(f"   Total Orders:           {n_total_orders:>15,}")
print(f"   AOV:                    {aov:>15,.0f} VND")
print(f"   ARPU (total):           {arpu:>15,.0f} VND")

print("\n📣 MARKETING METRICS")
print(f"   Total Marketing Spend:  {total_marketing:>15,.0f} VND")
print(f"   New Customers:          {total_new_customers:>15,}")
print(f"   Blended CAC:            {blended_cac:>15,.0f} VND")
print(f"   Best Channel (CAC):     {cac_by_channel['CAC'].idxmin()} ({cac_by_channel['CAC'].min():,.0f} VND)")
print(f"   Worst Channel (CAC):    {cac_by_channel['CAC'].idxmax()} ({cac_by_channel['CAC'].max():,.0f} VND)")

print("\n👥 CUSTOMER METRICS")
print(f"   Active Customers:       {len(ordered_customers[ordered_customers['is_active']]):>15,}")
print(f"   Churn Rate:             {churn_rate:>14.1%}")
print(f"   Retention Rate:         {retention_rate:>14.1%}")
print(f"   LTV:                    {ltv:>15,.0f} VND")
print(f"   LTV/CAC Ratio:          {ltv_cac:>14.1f}x")

print("\n📊 FUNNEL METRICS")
overall_cvr = funnel_counts['Users'].iloc[-1] / funnel_counts['Users'].iloc[0] * 100
print(f"   Total Sessions:         {funnel_counts['Users'].iloc[0]:>15,}")
print(f"   Total Purchases:        {funnel_counts['Users'].iloc[-1]:>15,}")
print(f"   Overall CVR:            {overall_cvr:>14.1f}%")
biggest_drop = funnel_counts.iloc[1:]['Drop_off'].idxmax()
print(f"   Biggest Drop-off:       {funnel_counts.iloc[biggest_drop]['Stage']}")

print("\n🔄 COHORT RETENTION")
print(f"   Avg M1 Retention:       {avg_retention.iloc[1] if len(avg_retention) > 1 else 'N/A':>14.1f}%")
if len(avg_retention) > 3:
    print(f"   Avg M3 Retention:       {avg_retention.iloc[3]:>14.1f}%")

print("\n" + "=" * 60)
print("📌 KEY INSIGHTS & RECOMMENDATIONS")
print("=" * 60)

insights = []
if ltv_cac >= 3:
    insights.append(f"✅ LTV/CAC = {ltv_cac:.1f}x — Unit economics healthy")
else:
    insights.append(f"⚠️ LTV/CAC = {ltv_cac:.1f}x — Need to optimize CAC or improve retention")

if churn_rate > 0.3:
    insights.append(f"⚠️ High churn rate ({churn_rate:.1%}) — Focus on retention programs")
else:
    insights.append(f"✅ Churn rate ({churn_rate:.1%}) — Acceptable range")

insights.append(f"📊 Best ROI channel: {best} — Consider increasing budget")
insights.append(f"🔴 Worst ROI channel: {worst} — Review or reduce spend")

for i, insight in enumerate(insights, 1):
    print(f"   {i}. {insight}")

✅ Deliverables Checklist

Trước khi submit, kiểm tra:

✅ Phần 0: Dataset created (customers, orders, sessions)
✅ Phần 1: KPI calculated
   ├── Revenue: GMV, Net Revenue, AOV, ARPU
   ├── Marketing: CAC by channel, blended CAC
   ├── Customer: Churn rate, retention, LTV, LTV/CAC
   └── Channel Economics: LTV/CAC by channel
✅ Phần 2: Funnel analysis
   ├── Overall funnel (5 stages)
   ├── Funnel chart (saved as PNG)
   └── Funnel by channel comparison
✅ Phần 3: Cohort analysis
   ├── Cohort retention table
   ├── Cohort heatmap (saved as PNG)
   └── Average retention curve (saved as PNG)
✅ Phần 4: Executive KPI report (printed summary)

Tiêu chí đánh giá

Tiêu chí	Trọng số	Mô tả
KPI Calculation	30%	Tính đúng công thức, giải thích ý nghĩa
Funnel Analysis	20%	Xây funnel đúng, identify drop-off, actionable insight
Cohort Heatmap	30%	Build cohort table, visualize heatmap, đọc được insight
Executive Report	20%	Tổng hợp gọn, có insight + recommendation

💡 Nâng cao (tùy chọn)

Nếu hoàn thành sớm, thử thêm:

Revenue Cohort: thay vì retention (%), show revenue per cohort qua thời gian
RFM Segmentation: Phân loại khách theo Recency, Frequency, Monetary
AARRR Dashboard: Tạo 5 KPI cards cho 5 stages AARRR
Churn Prediction: Dùng logistic regression predict churn (Buổi 14+)

🛠 Workshop — KPI Calculation & Cohort Retention Heatmap ​

🎯 Mục tiêu workshop ​

🧰 Yêu cầu ​

📦 Phần 0: Tạo Dataset (10 phút) ​

Phần 1: KPI Calculation (25 phút) ​

Bước 1.1: Revenue Metrics ​

Bước 1.2: Marketing Metrics — CAC by Channel ​

Bước 1.3: Customer Metrics — Churn, Retention, LTV ​

Bước 1.4: LTV/CAC by Channel ​

Phần 2: Funnel Analysis & Visualization (20 phút) ​

Bước 2.1: Overall Funnel ​

Bước 2.2: Funnel Visualization ​

Bước 2.3: Funnel by Channel ​

Phần 3: Cohort Retention Heatmap (30 phút) ​

Bước 3.1: Build Cohort Table ​

Bước 3.2: Cohort Retention Heatmap ​

Bước 3.3: Retention Curve ​

Phần 4: KPI Summary Report (15 phút) ​

Bước 4.1: Executive KPI Report ​

✅ Deliverables Checklist ​

Tiêu chí đánh giá ​

🛠 Workshop — KPI Calculation & Cohort Retention Heatmap

🎯 Mục tiêu workshop

🧰 Yêu cầu

📦 Phần 0: Tạo Dataset (10 phút)

Phần 1: KPI Calculation (25 phút)

Bước 1.1: Revenue Metrics

Bước 1.2: Marketing Metrics — CAC by Channel

Bước 1.3: Customer Metrics — Churn, Retention, LTV

Bước 1.4: LTV/CAC by Channel

Phần 2: Funnel Analysis & Visualization (20 phút)

Bước 2.1: Overall Funnel

Bước 2.2: Funnel Visualization

Bước 2.3: Funnel by Channel

Phần 3: Cohort Retention Heatmap (30 phút)

Bước 3.1: Build Cohort Table

Bước 3.2: Cohort Retention Heatmap

Bước 3.3: Retention Curve

Phần 4: KPI Summary Report (15 phút)

Bước 4.1: Executive KPI Report

✅ Deliverables Checklist

Tiêu chí đánh giá