Skip to content

CTE Applications

Tables

categories — Categories (parent-child hierarchy)

staff — Staff (dept, role, manager)

orders — Orders (status, amount, date)

customers — Customers (grade, points, channel)

order_items — Order items (qty, unit price)

products — Products (name, price, stock, brand)

calendar — Calendar (weekday, holiday)

Concepts

WITH, WITH RECURSIVE, Multiple CTE Chaining, CTE + Aggregate, CTE + JOIN, CTE + Window Function

1. Monthly Sales vs Average

Find monthly sales in 2024 in CTE, and view only months with sales above the overall average.

Hint 1: After collecting monthly sales in CTE, use WHERE revenue >= (SELECT AVG(revenue) FROM cte_name) in external query. Filter. CTE may be referenced multiple times.

Answer
WITH monthly_revenue AS (
    SELECT
        SUBSTR(ordered_at, 1, 7) AS year_month,
        ROUND(SUM(total_amount), 0) AS revenue,
        COUNT(*) AS order_count
    FROM orders
    WHERE ordered_at LIKE '2024%'
      AND status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY SUBSTR(ordered_at, 1, 7)
)
SELECT
    year_month,
    revenue,
    order_count,
    (SELECT ROUND(AVG(revenue), 0) FROM monthly_revenue) AS avg_revenue,
    CASE
        WHEN revenue >= (SELECT AVG(revenue) FROM monthly_revenue) THEN '평균 이상'
        ELSE '평균 미만'
    END AS status
FROM monthly_revenue
WHERE revenue >= (SELECT AVG(revenue) FROM monthly_revenue)
ORDER BY revenue DESC;

Result (4 rows)

year_month revenue order_count avg_revenue status
2024-11 543,313,372.00 547 426,203,627.00 평균 이상
2024-09 536,079,841.00 523 426,203,627.00 평균 이상
2024-03 519,844,502.00 555 426,203,627.00 평균 이상
2024-04 451,877,581.00 466 426,203,627.00 평균 이상

2. Find the total purchase amount for each customer in CTE, then extract the top 5 customers by grade.

Combines CTE and ROW_NUMBER window functions.

Hint 1: In the first CTE, aggregate the total purchase amount by customer, Apply ROW_NUMBER() OVER (PARTITION BY grade ORDER BY total_spent DESC) in the second CTE.

Answer
WITH customer_totals AS (
    SELECT
        c.id,
        c.name,
        c.grade,
        ROUND(SUM(o.total_amount), 0) AS total_spent,
        COUNT(*) AS order_count
    FROM customers AS c
    INNER JOIN orders AS o ON c.id = o.customer_id
    WHERE o.status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY c.id, c.name, c.grade
),
ranked AS (
    SELECT
        *,
        ROW_NUMBER() OVER (
            PARTITION BY grade
            ORDER BY total_spent DESC
        ) AS rn
    FROM customer_totals
)
SELECT
    grade,
    rn AS rank_in_grade,
    name,
    order_count,
    total_spent
FROM ranked
WHERE rn <= 5
ORDER BY
    CASE grade
        WHEN 'VIP' THEN 1
        WHEN 'GOLD' THEN 2
        WHEN 'SILVER' THEN 3
        WHEN 'BRONZE' THEN 4
    END,
    rn;

Result (top 7 of 20 rows)

grade rank_in_grade name order_count total_spent
VIP 1 Allen Snyder 303 403,448,758.00
VIP 2 Jason Rivera 342 366,385,931.00
VIP 3 Brenda Garcia 249 253,180,338.00
VIP 4 Courtney Huff 223 244,604,910.00
VIP 5 Ronald Arellano 219 235,775,349.00
GOLD 1 Sandra Callahan 171 204,611,811.00
GOLD 2 David York 160 199,282,408.00

3. Use CTE to view sales and review summaries for each product at once.

Displays the sales amount, sales volume, number of reviews, and average rating of the top 10 selling products in 2024.

Hint 1: Create the sales CTE and review CTE separately, then combine them into LEFT JOIN in the external query. Products without reviews should also be included.

Answer
WITH sales AS (
    SELECT
        oi.product_id,
        ROUND(SUM(oi.quantity * oi.unit_price), 0) AS total_revenue,
        SUM(oi.quantity) AS units_sold
    FROM order_items AS oi
    INNER JOIN orders AS o ON oi.order_id = o.id
    WHERE o.ordered_at LIKE '2024%'
      AND o.status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY oi.product_id
),
review_summary AS (
    SELECT
        product_id,
        COUNT(*) AS review_count,
        ROUND(AVG(rating), 2) AS avg_rating
    FROM reviews
    GROUP BY product_id
)
SELECT
    p.name          AS product_name,
    cat.name        AS category,
    s.units_sold,
    s.total_revenue,
    COALESCE(r.review_count, 0) AS review_count,
    r.avg_rating
FROM sales AS s
INNER JOIN products   AS p   ON s.product_id  = p.id
INNER JOIN categories AS cat ON p.category_id = cat.id
LEFT  JOIN review_summary AS r ON s.product_id = r.product_id
ORDER BY s.total_revenue DESC
LIMIT 10;

Result (top 7 of 10 rows)

product_name category units_sold total_revenue review_count avg_rating
Razer Blade 18 Black Gaming Laptop 38 165,417,800.00 20 4.10
Razer Blade 16 Silver Gaming Laptop 37 137,007,300.00 19 3.95
MacBook Air 15 M3 Silver MacBook 23 126,065,300.00 4 3.75
ASUS Dual RTX 4060 Ti Black NVIDIA 40 106,992,000.00 16 3.75
ASUS Dual RTX 5070 Ti Silver NVIDIA 106 104,558,400.00 23 3.65
ASUS ROG Swift PG32UCDM Silver Gaming Monitor 48 90,734,400.00 24 3.67
ASUS ROG Strix Scar 16 Gaming Laptop 35 85,837,500.00 13 4.23

4. Practice chaining, where you aggregate results from one CTE and filter them from another CTE.

Find the number of new subscribers per month and look for “surge” months, which are month-on-month increases of 20% or more.

Hint 1: CTE 1: Counting the number of new subscribers per month. CTE 2: Calculate the change rate by taking the previous month's figures with LAG. In the outer query, filter only rows with growth rate >= 20.

Answer
WITH monthly_signups AS (
    SELECT
        SUBSTR(created_at, 1, 7) AS year_month,
        COUNT(*) AS signup_count
    FROM customers
    WHERE created_at >= '2022-01-01'
    GROUP BY SUBSTR(created_at, 1, 7)
),
with_growth AS (
    SELECT
        year_month,
        signup_count,
        LAG(signup_count, 1) OVER (ORDER BY year_month) AS prev_count,
        ROUND(100.0 * (signup_count - LAG(signup_count, 1) OVER (ORDER BY year_month))
            / LAG(signup_count, 1) OVER (ORDER BY year_month), 1) AS growth_pct
    FROM monthly_signups
)
SELECT
    year_month,
    signup_count,
    prev_count,
    growth_pct
FROM with_growth
WHERE growth_pct >= 20
ORDER BY growth_pct DESC;

Result (6 rows)

year_month signup_count prev_count growth_pct
2024-06 68 43 58.10
2024-03 71 48 47.90
2025-10 76 54 40.70
2022-03 59 42 40.50
2024-10 65 51 27.50
2025-07 66 55 20.00

5. Find the proportion of sales by category in CTE and calculate the cumulative proportion (Pareto).

Calculate sales by category as of 2024 and display cumulative proportions in descending order.

Hint 1: After counting sales by category in CTE, Find the cumulative proportion with SUM(share_pct) OVER (ORDER BY revenue DESC ROWS UNBOUNDED PRECEDING).

Answer
WITH category_revenue AS (
    SELECT
        cat.name AS category,
        ROUND(SUM(oi.quantity * oi.unit_price), 0) AS revenue
    FROM order_items AS oi
    INNER JOIN orders     AS o   ON oi.order_id   = o.id
    INNER JOIN products   AS p   ON oi.product_id = p.id
    INNER JOIN categories AS cat ON p.category_id = cat.id
    WHERE o.ordered_at LIKE '2024%'
      AND o.status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY cat.name
),
with_share AS (
    SELECT
        category,
        revenue,
        ROUND(100.0 * revenue / SUM(revenue) OVER (), 1) AS share_pct
    FROM category_revenue
)
SELECT
    category,
    revenue,
    share_pct,
    ROUND(SUM(share_pct) OVER (
        ORDER BY revenue DESC
        ROWS UNBOUNDED PRECEDING
    ), 1) AS cumulative_pct
FROM with_share
ORDER BY revenue DESC;

Result (top 7 of 38 rows)

category revenue share_pct cumulative_pct
Gaming Laptop 636,925,700.00 12.30 12.30
AMD 447,953,400.00 8.70 21.00
Gaming Monitor 353,934,400.00 6.90 27.90
NVIDIA 345,858,700.00 6.70 34.60
2-in-1 340,884,400.00 6.60 41.20
General Laptop 291,760,500.00 5.70 46.90
Professional Monitor 254,590,200.00 4.90 51.80

6. Expand your category hierarchy into recursive CTEs.

Displays the entire path, from the top category (parent_id IS NULL) to subcategories.

Hint 1: Use WITH RECURSIVE. Anchor member: WHERE parent_id IS NULL. Recursive member: JOIN CTE itself and categories with c.parent_id = cte.id. The path connects to parent_path || ' > ' || c.name.

Answer
WITH RECURSIVE cat_tree AS (
    -- 앵커: 최상위 카테고리
    SELECT
        id,
        parent_id,
        name,
        name AS full_path,
        depth,
        0 AS level
    FROM categories
    WHERE parent_id IS NULL

    UNION ALL

    -- 재귀: 자식 카테고리
    SELECT
        c.id,
        c.parent_id,
        c.name,
        ct.full_path || ' > ' || c.name,
        c.depth,
        ct.level + 1
    FROM categories AS c
    INNER JOIN cat_tree AS ct ON c.parent_id = ct.id
)
SELECT
    id,
    full_path,
    level,
    depth
FROM cat_tree
ORDER BY full_path;

Result (top 7 of 53 rows)

id full_path level depth
14 CPU 0 0
16 CPU > AMD 1 1
15 CPU > Intel 1 1
31 Case 0 0
32 Cooling 0 0
33 Cooling > Air Cooling 1 1
34 Cooling > Liquid Cooling 1 1

7. Expand your employee org chart as a recursive CTE.

Displays the path to every employee's immediate supervisor, starting with the manager (manager_id IS NULL).

Hint 1: manager_id in table staff is self-referencing (Self-Join). Anchor: WHERE manager_id IS NULL (top level administrator). Recursive: Associate subordinates with s.manager_id = tree.id.

Answer
WITH RECURSIVE org_tree AS (
    -- 앵커: 최고 관리자
    SELECT
        id,
        name,
        department,
        role,
        manager_id,
        name AS chain,
        0 AS depth
    FROM staff
    WHERE manager_id IS NULL

    UNION ALL

    -- 재귀: 부하 직원
    SELECT
        s.id,
        s.name,
        s.department,
        s.role,
        s.manager_id,
        ot.chain || ' > ' || s.name,
        ot.depth + 1
    FROM staff AS s
    INNER JOIN org_tree AS ot ON s.manager_id = ot.id
)
SELECT
    id,
    name,
    department,
    role,
    depth,
    chain
FROM org_tree
ORDER BY chain;

Result (5 rows)

id name department role depth chain
1 Michael Thomas Management admin 0 Michael Thomas
4 Jaime Phelps Sales manager 1 Michael Thomas > Jaime Phelps
3 Jonathan Smith Management admin 1 Michael Thomas > Jonathan Smith
5 Nicole Hamilton Marketing manager 2 Michael Thomas > Jonathan Smith > Nic...
2 Michael Mcguire Management admin 1 Michael Thomas > Michael Mcguire

8. Find the total number of subordinates under each manager with recursive CTE.

Calculate the overall team size, including direct loads as well as indirect loads (loads of direct reports).

Hint 1: After expanding all parent-child relationships with a recursive CTE, record each parent manager's id as the root. You can find COUNT(*) - 1 (excluding itself) for each root.

Answer
WITH RECURSIVE subordinates AS (
    -- 앵커: 각 직원을 자기 자신의 루트로 시작
    SELECT
        id AS root_id,
        id AS member_id
    FROM staff
    WHERE is_active = 1

    UNION ALL

    -- 재귀: 부하 직원 추가
    SELECT
        sub.root_id,
        s.id
    FROM staff AS s
    INNER JOIN subordinates AS sub ON s.manager_id = sub.member_id
    WHERE s.is_active = 1
)
SELECT
    s.id,
    s.name,
    s.department,
    s.role,
    COUNT(*) - 1 AS total_subordinates
FROM subordinates AS sub
INNER JOIN staff AS s ON sub.root_id = s.id
GROUP BY s.id, s.name, s.department, s.role
HAVING COUNT(*) > 1
ORDER BY total_subordinates DESC;

Result (2 rows)

id name department role total_subordinates
1 Michael Thomas Management admin 4
3 Jonathan Smith Management admin 1

9. Build customer RFM (Recency, Frequency, Monetary) segments with multiple CTEs.

We chain three CTEs: (1) RFM raw metrics, (2) NTILE scores, and (3) segment labeling.

Hint 1: CTE 1(rfm_raw): MAX(ordered_at), COUNT(*), SUM(total_amount). CTE 2(rfm_scored): Apply NTILE(4). CTE 3 or external query: Segments (VIP/Loyal/General/churn risk) are assigned with a combined score of R+F+M.

Answer
WITH rfm_raw AS (
    SELECT
        c.id            AS customer_id,
        c.name,
        c.grade,
        MAX(o.ordered_at) AS last_order_date,
        COUNT(*)          AS frequency,
        ROUND(SUM(o.total_amount), 0) AS monetary
    FROM customers AS c
    INNER JOIN orders AS o ON c.id = o.customer_id
    WHERE o.status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY c.id, c.name, c.grade
),
rfm_scored AS (
    SELECT
        *,
        NTILE(4) OVER (ORDER BY last_order_date ASC) AS r_score,
        NTILE(4) OVER (ORDER BY frequency ASC)       AS f_score,
        NTILE(4) OVER (ORDER BY monetary ASC)         AS m_score
    FROM rfm_raw
),
rfm_segment AS (
    SELECT
        *,
        r_score + f_score + m_score AS total_score,
        CASE
            WHEN r_score + f_score + m_score >= 10 THEN 'Champions'
            WHEN r_score + f_score + m_score >= 8  THEN 'Loyal'
            WHEN r_score + f_score + m_score >= 5  THEN 'Potential'
            ELSE 'At Risk'
        END AS segment
    FROM rfm_scored
)
SELECT
    segment,
    COUNT(*)                     AS customer_count,
    ROUND(AVG(frequency), 1)     AS avg_frequency,
    ROUND(AVG(monetary), 0)      AS avg_monetary,
    ROUND(AVG(r_score + f_score + m_score), 1) AS avg_score
FROM rfm_segment
GROUP BY segment
ORDER BY avg_score DESC;

Result (4 rows)

segment customer_count avg_frequency avg_monetary avg_score
Champions 784 32.90 34,334,528.00 11.00
Loyal 569 8.50 8,532,740.00 8.40
Potential 999 3.40 2,797,314.00 6.00
At Risk 441 1.50 456,944.00 3.50

10. Revenue by Category Tree

After developing the category tree with recursive CTE, count the number of products and sales for each top category.

Hint 1: Track the top root of each category with a recursive CTE. log id AS root_id, name AS root_name in anchor, In recursion, it propagates root_id and root_name from the parent as is. Aggregate by root by joining products, order_items, and orders to the results.

Answer
WITH RECURSIVE cat_root AS (
    -- 앵커: 최상위 카테고리 (자기 자신이 루트)
    SELECT
        id,
        id   AS root_id,
        name AS root_name
    FROM categories
    WHERE parent_id IS NULL

    UNION ALL

    -- 재귀: 자식 카테고리는 부모의 루트를 상속
    SELECT
        c.id,
        cr.root_id,
        cr.root_name
    FROM categories AS c
    INNER JOIN cat_root AS cr ON c.parent_id = cr.id
)
SELECT
    cr.root_name                  AS top_category,
    COUNT(DISTINCT p.id)          AS product_count,
    ROUND(SUM(oi.quantity * oi.unit_price), 0) AS total_revenue,
    SUM(oi.quantity)              AS units_sold
FROM cat_root AS cr
INNER JOIN products   AS p  ON p.category_id = cr.id
INNER JOIN order_items AS oi ON oi.product_id = p.id
INNER JOIN orders     AS o  ON oi.order_id   = o.id
WHERE o.ordered_at LIKE '2024%'
  AND o.status NOT IN ('cancelled', 'returned', 'return_requested')
GROUP BY cr.root_id, cr.root_name
ORDER BY total_revenue DESC;

Result (top 7 of 18 rows)

top_category product_count total_revenue units_sold
Laptop 21 1,395,635,900.00 660
Monitor 19 727,065,300.00 712
Graphics Card 10 713,579,800.00 580
Motherboard 16 398,988,900.00 920
Speakers/Headsets 8 232,144,800.00 921
Storage 13 205,861,200.00 833
Memory (RAM) 14 200,423,600.00 1354

11. Create month sequences with recursive CTEs and create monthly reports, including months with no sales.

Create a 12-month sequence from 2024-01 to 2024-12 with a recursive CTE and LEFT JOIN it with actual sales data.

Hint 1: With a recursive CTE, we create a sequence starting from '2024-01' and incrementing by +1 for each month. Calculate the next month as DATE(year_month || '-01', '+1 month'), formatted as SUBSTR(..., 1, 7). Exit condition: year_month < '2024-12'.

Answer
WITH RECURSIVE months AS (
    SELECT '2024-01' AS year_month
    UNION ALL
    SELECT SUBSTR(DATE(year_month || '-01', '+1 month'), 1, 7)
    FROM months
    WHERE year_month < '2024-12'
),
actual_revenue AS (
    SELECT
        SUBSTR(ordered_at, 1, 7) AS year_month,
        COUNT(*)                 AS order_count,
        ROUND(SUM(total_amount), 0) AS revenue
    FROM orders
    WHERE ordered_at LIKE '2024%'
      AND status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY SUBSTR(ordered_at, 1, 7)
)
SELECT
    m.year_month,
    COALESCE(a.order_count, 0) AS order_count,
    COALESCE(a.revenue, 0)     AS revenue
FROM months AS m
LEFT JOIN actual_revenue AS a ON m.year_month = a.year_month
ORDER BY m.year_month;

Result (top 7 of 12 rows)

year_month order_count revenue
2024-01 314 288,908,320.00
2024-02 416 403,127,749.00
2024-03 555 519,844,502.00
2024-04 466 451,877,581.00
2024-05 385 425,264,478.00
2024-06 389 362,715,211.00
2024-07 381 343,929,897.00

12. Calculate product performance ratings with a 3-step CTE chain.

CTE 1: Sales by product. CTE 2: NTILE(5) rating based on sales. CTE 3: Statistics by grade. The final output includes the number of products by grade, sales range, and representative products (#1 in sales).

Hint 1: CTE 3 to FIRST_VALUE(product_name) OVER (PARTITION BY tier ORDER BY revenue DESC) Extract representative products from each grade. Use DISTINCT to summarize each grade into one row.

Answer
WITH product_sales AS (
    SELECT
        p.id       AS product_id,
        p.name     AS product_name,
        ROUND(SUM(oi.quantity * oi.unit_price), 0) AS revenue,
        SUM(oi.quantity) AS units_sold
    FROM order_items AS oi
    INNER JOIN orders   AS o ON oi.order_id   = o.id
    INNER JOIN products AS p ON oi.product_id = p.id
    WHERE o.ordered_at LIKE '2024%'
      AND o.status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY p.id, p.name
),
tiered AS (
    SELECT
        *,
        NTILE(5) OVER (ORDER BY revenue DESC) AS tier
    FROM product_sales
),
tier_summary AS (
    SELECT DISTINCT
        tier,
        COUNT(*) OVER (PARTITION BY tier) AS product_count,
        MIN(revenue) OVER (PARTITION BY tier) AS min_revenue,
        MAX(revenue) OVER (PARTITION BY tier) AS max_revenue,
        ROUND(AVG(revenue) OVER (PARTITION BY tier), 0) AS avg_revenue,
        FIRST_VALUE(product_name) OVER (
            PARTITION BY tier ORDER BY revenue DESC
        ) AS top_product
    FROM tiered
)
SELECT
    tier,
    CASE tier
        WHEN 1 THEN 'S (최상위)'
        WHEN 2 THEN 'A (우수)'
        WHEN 3 THEN 'B (보통)'
        WHEN 4 THEN 'C (저조)'
        WHEN 5 THEN 'D (부진)'
    END AS tier_label,
    product_count,
    min_revenue,
    max_revenue,
    avg_revenue,
    top_product
FROM tier_summary
ORDER BY tier;

Result (5 rows)

tier tier_label product_count min_revenue max_revenue avg_revenue top_product
1 S (최상위) 44 39,196,400.00 165,417,800.00 66,665,918.00 Razer Blade 18 Black
2 A (우수) 44 18,568,800.00 38,656,400.00 26,834,570.00 SteelSeries Arctis Nova Pro Wireless ...
3 B (보통) 43 10,200,000.00 18,087,300.00 14,174,188.00 JBL Flip 6 Black
4 C (저조) 43 5,400,000.00 10,027,500.00 7,904,735.00 Razer DeathAdder V4 Pro White
5 D (부진) 43 159,000.00 5,188,800.00 2,268,381.00 Arctic Liquid Freezer III 240 White

13. Create a sequence of consecutive days with a recursive CTE and find days (gaps) with no orders.

Generate consecutive dates from 2024-01-01 to 2024-12-31 and extract days with 0 orders.

Hint 1: In a recursive CTE, DATE(date_val, '+1 day') is incremented by one day. Connect to the actual order date with LEFT JOIN and filter out days where the order is NULL. Alternatively, you can use the calendar table if you have one.

Answer
WITH RECURSIVE date_seq AS (
    SELECT '2024-01-01' AS date_val
    UNION ALL
    SELECT DATE(date_val, '+1 day')
    FROM date_seq
    WHERE date_val < '2024-12-31'
),
daily_orders AS (
    SELECT
        DATE(ordered_at) AS order_date,
        COUNT(*)         AS order_count
    FROM orders
    WHERE ordered_at >= '2024-01-01'
      AND ordered_at < '2025-01-01'
      AND status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY DATE(ordered_at)
)
SELECT
    d.date_val AS gap_date,
    CASE CAST(STRFTIME('%w', d.date_val) AS INTEGER)
        WHEN 0 THEN '일'
        WHEN 1 THEN '월'
        WHEN 2 THEN '화'
        WHEN 3 THEN '수'
        WHEN 4 THEN '목'
        WHEN 5 THEN '금'
        WHEN 6 THEN '토'
    END AS day_name
FROM date_seq AS d
LEFT JOIN daily_orders AS o ON d.date_val = o.order_date
WHERE o.order_count IS NULL
ORDER BY d.date_val;

14. Perform “Cohort Retention” analysis with a 4-step CTE chain.

CTE 1: First order month (cohort) for each customer. CTE 2: List of order months by customer. CTE 3: Cohort-relative month mapping. CTE 4: Retention rate by cohort.

Hint 1: The relative month is (CAST(SUBSTR(order_month,1,4) AS INTEGER)*12 + CAST(SUBSTR(order_month,6,2) AS INTEGER)) - Calculated as (CAST(SUBSTR(cohort_month,1,4) AS INTEGER)*12 + CAST(SUBSTR(cohort_month,6,2) AS INTEGER)). Retention rate = Active customers per month / Total customers in the cohort * 100.

Answer
WITH cohort AS (
    SELECT
        customer_id,
        SUBSTR(MIN(ordered_at), 1, 7) AS cohort_month
    FROM orders
    WHERE status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY customer_id
),
order_months AS (
    SELECT DISTINCT
        customer_id,
        SUBSTR(ordered_at, 1, 7) AS order_month
    FROM orders
    WHERE status NOT IN ('cancelled', 'returned', 'return_requested')
),
cohort_activity AS (
    SELECT
        co.cohort_month,
        (CAST(SUBSTR(om.order_month, 1, 4) AS INTEGER) * 12
         + CAST(SUBSTR(om.order_month, 6, 2) AS INTEGER))
      - (CAST(SUBSTR(co.cohort_month, 1, 4) AS INTEGER) * 12
         + CAST(SUBSTR(co.cohort_month, 6, 2) AS INTEGER)) AS months_since,
        COUNT(DISTINCT om.customer_id) AS active_customers
    FROM cohort AS co
    INNER JOIN order_months AS om ON co.customer_id = om.customer_id
    GROUP BY co.cohort_month, months_since
),
cohort_size AS (
    SELECT
        cohort_month,
        COUNT(*) AS total_customers
    FROM cohort
    GROUP BY cohort_month
)
SELECT
    ca.cohort_month,
    cs.total_customers,
    ca.months_since,
    ca.active_customers,
    ROUND(100.0 * ca.active_customers / cs.total_customers, 1) AS retention_pct
FROM cohort_activity AS ca
INNER JOIN cohort_size AS cs ON ca.cohort_month = cs.cohort_month
WHERE ca.cohort_month >= '2024-01'
  AND ca.months_since <= 6
ORDER BY ca.cohort_month, ca.months_since;

Result (top 7 of 147 rows)

cohort_month total_customers months_since active_customers retention_pct
2024-01 30 0 30 100.00
2024-01 30 1 7 23.30
2024-01 30 2 8 26.70
2024-01 30 3 12 40.00
2024-01 30 4 3 10.00
2024-01 30 5 5 16.70
2024-01 30 6 4 13.30

15. Obtain statistics by category depth with recursive CTE and visualize them in tree form.

The tree structure is expressed as text, including the indentation (space according to depth) of each category, The number of direct products in each category and the total number of sub-products are displayed together.

Hint 1: Unfolds the category tree with a recursive CTE, creating sort_path for sorting (e.g. '001/003/007'). Indent is REPLACE(SUBSTR(' ', 1, level * 4), ' ', ' ') or Expressed as CASE level WHEN 0 THEN '' WHEN 1 THEN ' ' WHEN 2 THEN ' ' END, etc.

Answer
WITH RECURSIVE cat_tree AS (
    SELECT
        id,
        parent_id,
        name,
        depth,
        sort_order,
        PRINTF('%03d', sort_order) AS sort_path,
        0 AS level
    FROM categories
    WHERE parent_id IS NULL

    UNION ALL

    SELECT
        c.id,
        c.parent_id,
        c.name,
        c.depth,
        c.sort_order,
        ct.sort_path || '/' || PRINTF('%03d', c.sort_order),
        ct.level + 1
    FROM categories AS c
    INNER JOIN cat_tree AS ct ON c.parent_id = ct.id
),
direct_products AS (
    SELECT category_id, COUNT(*) AS direct_count
    FROM products
    WHERE is_active = 1
    GROUP BY category_id
)
SELECT
    ct.id,
    CASE ct.level
        WHEN 0 THEN ct.name
        WHEN 1 THEN '  ' || ct.name
        WHEN 2 THEN '    ' || ct.name
    END AS tree_display,
    ct.level,
    COALESCE(dp.direct_count, 0) AS direct_products
FROM cat_tree AS ct
LEFT JOIN direct_products AS dp ON ct.id = dp.category_id
ORDER BY ct.sort_path;

Result (top 7 of 53 rows)

id tree_display level direct_products
1 Desktop PC 0 0
2 Pre-built 1 3
3 Custom Build 1 9
4 Barebone 1 1
5 Laptop 0 0
6 General Laptop 1 8
7 Gaming Laptop 1 6