콘텐츠로 이동

도전 문제

사용 테이블

categories — 카테고리 (부모-자식 계층)

customers — 고객 (등급, 포인트, 가입채널)

orders — 주문 (상태, 금액, 일시)

order_items — 주문 상세 (수량, 단가)

products — 상품 (이름, 가격, 재고, 브랜드)

reviews — 리뷰 (평점, 내용)

shipping — 배송 (택배사, 추적번호, 상태)

point_transactions — 포인트 (적립, 사용, 소멸)

product_views — 조회 로그 (고객, 상품, 일시)

inventory_transactions — 재고 입출고 (유형, 수량)

wishlists — 위시리스트 (고객-상품)

calendar — 날짜 차원 (요일, 공휴일)

학습 범위

Window, Analytics, CTE, consecutive, median, retention

1. 중복 리뷰 감지

같은 상품에 2회 이상 리뷰를 작성한 고객을 찾으세요. 고객명, 상품명, 리뷰 횟수를 표시합니다.

힌트 1: - reviewscustomers, products와 JOIN - GROUP BY customer_id, product_idHAVING COUNT(*) >= 2

정답
SELECT
    c.name AS customer_name,
    p.name AS product_name,
    COUNT(*) AS review_count
FROM reviews AS r
INNER JOIN customers AS c ON r.customer_id = c.id
INNER JOIN products AS p ON r.product_id = p.id
GROUP BY r.customer_id, r.product_id, c.name, p.name
HAVING COUNT(*) >= 2
ORDER BY review_count DESC;

실행 결과 (총 321행 중 상위 7행)

customer_name product_name review_count
이영자 G.SKILL Trident Z5 DDR5 64GB 6000MHz 화이트 5
김병철 SteelSeries Aerox 5 Wireless 실버 4
이영자 삼성 DDR5 32GB PC5-38400 4
이영자 삼성 오디세이 G7 32 화이트 4
강명자 ASUS PCE-BE92BT 4
박정수 삼성 DDR4 32GB PC4-25600 4
박정수 Keychron Q1 Pro 실버 4

2. 평일 vs 주말 평균 주문 금액 비교

calendar 테이블을 활용하여 평일과 주말의 평균 주문 금액을 비교하세요. 평일/주말 구분, 주문 수, 평균 주문 금액, 총 매출을 표시합니다.

힌트 1: - calendar.is_weekend 칼럼 활용 - orderscalendar를 날짜로 JOIN

정답
SELECT
    CASE cal.is_weekend
        WHEN 1 THEN 'Weekend'
        ELSE 'Weekday'
    END AS day_type,
    COUNT(*) AS order_count,
    ROUND(AVG(o.total_amount), 2) AS avg_order_value,
    ROUND(SUM(o.total_amount), 2) AS total_revenue
FROM orders AS o
INNER JOIN calendar AS cal
    ON SUBSTR(o.ordered_at, 1, 10) = cal.date_key
WHERE o.status NOT IN ('cancelled', 'returned', 'return_requested')
GROUP BY cal.is_weekend
ORDER BY cal.is_weekend;

실행 결과 (2행)

day_type order_count avg_order_value total_revenue
Weekday 23,745 1,003,330.80 23,824,089,843.00
Weekend 10,953 999,300.48 10,945,338,164.00

3. 전일 대비 주문 수 증감

2024년 12월의 일별 주문 수를 구하고, LAG를 사용하여 전일 대비 주문 수가 증가한 날만 표시하세요.

힌트 1: - SUBSTR(ordered_at, 1, 10)으로 날짜 추출 - LAG(order_count) OVER (ORDER BY order_date)로 전일 값 참조

정답
WITH daily AS (
    SELECT
        SUBSTR(ordered_at, 1, 10) AS order_date,
        COUNT(*) AS order_count
    FROM orders
    WHERE ordered_at LIKE '2024-12%'
      AND status NOT IN ('cancelled')
    GROUP BY SUBSTR(ordered_at, 1, 10)
),
with_prev AS (
    SELECT
        order_date,
        order_count,
        LAG(order_count) OVER (ORDER BY order_date) AS prev_count
    FROM daily
)
SELECT
    order_date,
    order_count,
    prev_count,
    order_count - prev_count AS diff
FROM with_prev
WHERE order_count > prev_count
ORDER BY order_date;

실행 결과 (총 11행 중 상위 7행)

order_date order_count prev_count diff
2024-12-03 15 14 1
2024-12-06 14 9 5
2024-12-07 15 14 1
2024-12-08 18 15 3
2024-12-14 17 13 4
2024-12-17 17 14 3
2024-12-20 18 14 4

4. 카테고리별 3번째로 비싼 상품

각 카테고리에서 가격이 3번째로 비싼 상품을 찾으세요. 카테고리명, 상품명, 가격, 순위를 표시합니다.

힌트 1: - ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY price DESC) - WHERE rn = 3으로 필터

정답
WITH ranked AS (
    SELECT
        cat.name AS category,
        p.name AS product_name,
        p.price,
        ROW_NUMBER() OVER (
            PARTITION BY p.category_id
            ORDER BY p.price DESC
        ) AS rn
    FROM products AS p
    INNER JOIN categories AS cat ON p.category_id = cat.id
    WHERE p.is_active = 1
)
SELECT category, product_name, price, rn AS rank
FROM ranked
WHERE rn = 3
ORDER BY price DESC;

실행 결과 (총 32행 중 상위 7행)

category product_name price rank
게이밍 노트북 Razer Blade 18 블랙 2,987,500.00 3
일반 노트북 ASUS ExpertBook B5 화이트 2,068,800.00 3
조립PC ASUS ROG Strix G16CH 실버 1,879,100.00 3
NVIDIA MSI GeForce RTX 4070 Ti Super GAMING X 1,744,000.00 3
전문가용 모니터 LG 32EP950 OLED 화이트 1,545,700.00 3
게이밍 모니터 LG 울트라기어 27GR95QE 화이트 1,511,700.00 3
2in1 HP Pavilion x360 14 블랙 1,479,700.00 3

5. A/B 버킷 분할

고객 ID의 홀짝(MOD)으로 A/B 그룹을 나누고, 각 그룹의 고객 수, 평균 주문 금액, 평균 주문 횟수를 비교하세요.

힌트 1: - CASE WHEN c.id % 2 = 0 THEN 'A' ELSE 'B' END - customersorders JOIN 후 그룹별 집계

정답
SELECT
    CASE WHEN c.id % 2 = 0 THEN 'A' ELSE 'B' END AS bucket,
    COUNT(DISTINCT c.id) AS customer_count,
    COUNT(o.id) AS total_orders,
    ROUND(AVG(o.total_amount), 2) AS avg_order_value,
    ROUND(1.0 * COUNT(o.id) / COUNT(DISTINCT c.id), 1) AS avg_orders_per_customer
FROM customers AS c
LEFT JOIN orders AS o
    ON c.id = o.customer_id
   AND o.status NOT IN ('cancelled', 'returned', 'return_requested')
GROUP BY CASE WHEN c.id % 2 = 0 THEN 'A' ELSE 'B' END
ORDER BY bucket;

실행 결과 (2행)

bucket customer_count total_orders avg_order_value avg_orders_per_customer
A 2615 17,147 1,010,823.23 6.60
B 2615 17,551 993,495.65 6.70

6. 트리 노드 타입 분류

categories 테이블에서 각 카테고리를 root / inner / leaf로 분류하세요. root: parent_id가 NULL, inner: 자식이 있는 비루트, leaf: 자식이 없는 노드.

힌트 1: - LEFT JOIN categories AS child ON cat.id = child.parent_id - CASE로 parent_id와 child 존재 여부에 따라 분류

정답
SELECT
    cat.id,
    cat.name,
    cat.parent_id,
    cat.depth,
    CASE
        WHEN cat.parent_id IS NULL THEN 'root'
        WHEN EXISTS (SELECT 1 FROM categories c2 WHERE c2.parent_id = cat.id) THEN 'inner'
        ELSE 'leaf'
    END AS node_type
FROM categories AS cat
ORDER BY cat.depth, cat.sort_order;

실행 결과 (총 53행 중 상위 7행)

id name parent_id depth node_type
1 데스크톱 PC NULL 0 root
5 노트북 NULL 0 root
10 모니터 NULL 0 root
14 CPU NULL 0 root
17 메인보드 NULL 0 root
20 메모리(RAM) NULL 0 root
23 저장장치 NULL 0 root

7. 일별 주문 취소율 (최근 30일)

최근 30일간(2025-12-01 ~ 2025-12-31 기준) 일별 전체 주문 수와 취소된 주문 수, 취소율(%)을 계산하세요.

힌트 1: - status = 'cancelled'인 주문 비율 계산 - SUM(CASE WHEN status = 'cancelled' THEN 1 ELSE 0 END)

정답
SELECT
    SUBSTR(ordered_at, 1, 10) AS order_date,
    COUNT(*) AS total_orders,
    SUM(CASE WHEN status = 'cancelled' THEN 1 ELSE 0 END) AS cancelled_orders,
    ROUND(100.0 * SUM(CASE WHEN status = 'cancelled' THEN 1 ELSE 0 END)
        / COUNT(*), 2) AS cancel_rate_pct
FROM orders
WHERE ordered_at BETWEEN '2025-12-01' AND '2025-12-31 23:59:59'
GROUP BY SUBSTR(ordered_at, 1, 10)
ORDER BY order_date;

실행 결과 (총 31행 중 상위 7행)

order_date total_orders cancelled_orders cancel_rate_pct
2025-12-01 23 2 8.70
2025-12-02 20 1 5.00
2025-12-03 21 0 0.0
2025-12-04 19 1 5.26
2025-12-05 24 1 4.17
2025-12-06 22 0 0.0
2025-12-07 20 0 0.0

8. 할부 금액 계산

주문 금액이 500,000원 이상인 주문에 대해 3개월, 6개월, 12개월 할부 시 월 납부액을 계산하세요. 주문번호, 총액, 3/6/12개월 월납부액을 표시합니다.

힌트 1: - ROUND(total_amount / 3, 0) 등으로 단순 분할 - WHERE total_amount >= 500000

정답
SELECT
    order_number,
    total_amount,
    ROUND(total_amount / 3, 0) AS monthly_3m,
    ROUND(total_amount / 6, 0) AS monthly_6m,
    ROUND(total_amount / 12, 0) AS monthly_12m
FROM orders
WHERE total_amount >= 500000
  AND status NOT IN ('cancelled', 'returned', 'return_requested')
ORDER BY total_amount DESC
LIMIT 20;

실행 결과 (총 20행 중 상위 7행)

order_number total_amount monthly_3m monthly_6m monthly_12m
ORD-20201121-08810 50,867,500.00 16,955,833.00 8,477,917.00 4,238,958.00
ORD-20250305-32265 46,820,024.00 15,606,675.00 7,803,337.00 3,901,669.00
ORD-20200209-05404 43,677,500.00 14,559,167.00 7,279,583.00 3,639,792.00
ORD-20251218-37240 38,626,400.00 12,875,467.00 6,437,733.00 3,218,867.00
ORD-20220106-15263 37,987,600.00 12,662,533.00 6,331,267.00 3,165,633.00
ORD-20200820-07684 37,518,200.00 12,506,067.00 6,253,033.00 3,126,517.00
ORD-20220224-15869 35,397,700.00 11,799,233.00 5,899,617.00 2,949,808.00

9. NULL 생년월일 추정

birth_date가 NULL인 고객에게, 같은 등급(grade) 고객의 평균 출생 연도를 대입하세요. 고객명, 등급, 원래 birth_date, 추정 birth_year를 표시합니다.

힌트 1: - AVG(CAST(SUBSTR(birth_date, 1, 4) AS INTEGER))로 등급별 평균 출생 연도 - LEFT JOIN 또는 스칼라 서브쿼리로 NULL 고객에 적용

정답
WITH grade_avg_year AS (
    SELECT
        grade,
        ROUND(AVG(CAST(SUBSTR(birth_date, 1, 4) AS INTEGER)), 0) AS avg_birth_year
    FROM customers
    WHERE birth_date IS NOT NULL
    GROUP BY grade
)
SELECT
    c.id,
    c.name,
    c.grade,
    c.birth_date,
    gay.avg_birth_year AS estimated_birth_year
FROM customers AS c
INNER JOIN grade_avg_year AS gay ON c.grade = gay.grade
WHERE c.birth_date IS NULL
ORDER BY c.grade, c.id
LIMIT 20;

실행 결과 (총 20행 중 상위 7행)

id name grade birth_date estimated_birth_year
7 김명자 BRONZE NULL 1,988.00
13 김정식 BRONZE NULL 1,988.00
14 윤순옥 BRONZE NULL 1,988.00
24 강민석 BRONZE NULL 1,988.00
36 윤지훈 BRONZE NULL 1,988.00
38 박준영 BRONZE NULL 1,988.00
42 최영진 BRONZE NULL 1,988.00

10. 중복 위시리스트 찾기

wishlists 테이블에는 UNIQUE 제약이 있지만, 만약 중복이 있었다면 어떤 SQL로 찾고 가장 오래된 것만 남길 수 있을까요? ROW_NUMBER로 같은 customer_id + product_id 중 가장 최근 것을 식별하세요.

힌트 1: - ROW_NUMBER() OVER (PARTITION BY customer_id, product_id ORDER BY created_at) - rn > 1이면 삭제 대상

정답
WITH ranked AS (
    SELECT
        id,
        customer_id,
        product_id,
        created_at,
        ROW_NUMBER() OVER (
            PARTITION BY customer_id, product_id
            ORDER BY created_at ASC
        ) AS rn
    FROM wishlists
)
SELECT id, customer_id, product_id, created_at, rn,
       CASE WHEN rn > 1 THEN 'DELETE' ELSE 'KEEP' END AS action
FROM ranked
WHERE customer_id IN (
    SELECT customer_id FROM wishlists
    GROUP BY customer_id
    HAVING COUNT(*) > 1
)
ORDER BY customer_id, product_id, rn
LIMIT 30;

실행 결과 (총 30행 중 상위 7행)

id customer_id product_id created_at rn action
88 3 142 2020-10-21 05:28:28 1 KEEP
1202 3 164 2018-12-05 10:45:35 1 KEEP
1456 3 234 2025-06-24 10:23:47 1 KEEP
996 19 4 2017-10-28 18:51:00 1 KEEP
82 19 144 2024-09-29 00:02:22 1 KEEP
1378 81 1 2017-10-25 09:20:52 1 KEEP
1997 81 106 2023-02-11 22:00:20 1 KEEP

11. 가입 채널별 전환율

acquisition_channel별로 가입 고객 수와 첫 주문까지 도달한 고객 수, 전환율(%)을 계산하세요.

힌트 1: - customers.acquisition_channel로 그룹화 - orders에 1건이라도 있으면 전환된 것

정답
SELECT
    COALESCE(c.acquisition_channel, 'unknown') AS channel,
    COUNT(*) AS signup_count,
    COUNT(DISTINCT o.customer_id) AS converted_count,
    ROUND(100.0 * COUNT(DISTINCT o.customer_id) / COUNT(*), 1) AS conversion_rate_pct
FROM customers AS c
LEFT JOIN orders AS o
    ON c.id = o.customer_id
   AND o.status NOT IN ('cancelled', 'returned', 'return_requested')
GROUP BY COALESCE(c.acquisition_channel, 'unknown')
ORDER BY conversion_rate_pct DESC;

실행 결과 (5행)

channel signup_count converted_count conversion_rate_pct
social 8685 741 8.50
search_ad 10,504 838 8.00
referral 5134 384 7.50
direct 3156 210 6.70
organic 9656 620 6.40

12. Flow vs Stock 비교

상품별 재고 입출고 흐름(flow: 입고-출고 합계)과 현재 재고(stock: products.stock_qty)를 비교하세요. 차이가 있는 상품을 식별합니다.

힌트 1: - inventory_transactions에서 SUM(quantity) (quantity는 입고=양수, 출고=음수) - products.stock_qty와 비교

정답
WITH flow AS (
    SELECT
        product_id,
        SUM(quantity) AS net_flow
    FROM inventory_transactions
    GROUP BY product_id
)
SELECT
    p.name AS product_name,
    p.stock_qty AS current_stock,
    f.net_flow AS calculated_stock,
    p.stock_qty - f.net_flow AS discrepancy
FROM products AS p
INNER JOIN flow AS f ON p.id = f.product_id
WHERE p.stock_qty != f.net_flow
ORDER BY ABS(p.stock_qty - f.net_flow) DESC
LIMIT 20;

실행 결과 (총 20행 중 상위 7행)

product_name current_stock calculated_stock discrepancy
넷기어 GS316PP 104 1590 -1486
TP-Link TL-SG1016D 실버 275 1569 -1294
LG 27UQ85R 블랙 26 1280 -1254
CORSAIR RM1000x 화이트 58 1262 -1204
소니 WH-CH720N 실버 89 1259 -1170
Razer Basilisk V3 Pro 35K 화이트 99 1268 -1169
로지텍 G PRO X SUPERLIGHT 2 화이트 152 1300 -1148

13. 연속 동일 상태 주문

고객별로 3건 이상 연속으로 같은 status인 주문을 찾으세요. LAG를 사용하여 이전 2건의 status와 비교합니다.

힌트 1: - LAG(status, 1), LAG(status, 2) 사용 - 3건이 모두 같은 status이면 해당 행 표시

정답
WITH order_seq AS (
    SELECT
        customer_id,
        order_number,
        status,
        ordered_at,
        LAG(status, 1) OVER (PARTITION BY customer_id ORDER BY ordered_at) AS prev_1,
        LAG(status, 2) OVER (PARTITION BY customer_id ORDER BY ordered_at) AS prev_2
    FROM orders
)
SELECT
    os.customer_id,
    c.name AS customer_name,
    os.order_number,
    os.status,
    os.prev_1,
    os.prev_2,
    os.ordered_at
FROM order_seq AS os
INNER JOIN customers AS c ON os.customer_id = c.id
WHERE os.status = os.prev_1
  AND os.status = os.prev_2
ORDER BY os.customer_id, os.ordered_at
LIMIT 30;

실행 결과 (총 30행 중 상위 7행)

customer_id customer_name order_number status prev_1 prev_2 ordered_at
2 김경수 ORD-20160830-00269 confirmed confirmed confirmed 2016-08-30 10:49:39
2 김경수 ORD-20160904-00274 confirmed confirmed confirmed 2016-09-04 08:47:04
2 김경수 ORD-20160915-00287 confirmed confirmed confirmed 2016-09-15 20:07:17
2 김경수 ORD-20161024-00334 confirmed confirmed confirmed 2016-10-24 12:13:06
2 김경수 ORD-20161101-00343 confirmed confirmed confirmed 2016-11-01 10:44:08
2 김경수 ORD-20170122-00444 confirmed confirmed confirmed 2017-01-22 08:39:07
2 김경수 ORD-20170305-00501 confirmed confirmed confirmed 2017-03-05 20:35:01

14. 카테고리별 매출 Top 3 상품

ROW_NUMBER를 사용하여 각 카테고리에서 매출 상위 3개 상품을 구하세요. 카테고리명, 순위, 상품명, 매출을 표시합니다.

힌트 1: - ROW_NUMBER() OVER (PARTITION BY cat.id ORDER BY revenue DESC) - WHERE rn <= 3

정답
WITH product_revenue AS (
    SELECT
        cat.name AS category,
        cat.id AS category_id,
        p.name AS product_name,
        ROUND(SUM(oi.quantity * oi.unit_price), 2) AS revenue,
        ROW_NUMBER() OVER (
            PARTITION BY cat.id
            ORDER BY SUM(oi.quantity * oi.unit_price) DESC
        ) AS rn
    FROM order_items AS oi
    INNER JOIN orders AS o ON oi.order_id = o.id
    INNER JOIN products AS p ON oi.product_id = p.id
    INNER JOIN categories AS cat ON p.category_id = cat.id
    WHERE o.status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY cat.id, cat.name, p.id, p.name
)
SELECT category, rn AS rank, product_name, revenue
FROM product_revenue
WHERE rn <= 3
ORDER BY category, rn;

실행 결과 (총 113행 중 상위 7행)

category rank product_name revenue
2in1 1 레노버 ThinkPad X1 2in1 실버 554,231,700.00
2in1 2 HP Envy x360 15 실버 326,727,400.00
2in1 3 HP Pavilion x360 14 블랙 319,615,200.00
AMD 1 AMD Ryzen 9 9900X 452,187,900.00
AMD 1 MSI Radeon RX 7900 XTX GAMING X 화이트 585,793,600.00
AMD 2 AMD Ryzen 9 9900X 149,725,400.00
AMD 2 ASUS Dual RX 9070 실버 515,058,400.00

15. 멘토링 페어 매칭

같은 부서 내에서 주니어(role='staff')와 시니어(role='manager')를 멘토-멘티 쌍으로 매칭하세요. 부서명, 멘티명, 멘토명을 표시합니다.

힌트 1: - staff 테이블을 self-join: s1.department = s2.department - s1.role = 'staff' AND s2.role = 'manager'

정답
SELECT
    s1.department,
    s1.name AS mentee,
    s1.role AS mentee_role,
    s2.name AS mentor,
    s2.role AS mentor_role
FROM staff AS s1
INNER JOIN staff AS s2
    ON s1.department = s2.department
   AND s2.role = 'manager'
WHERE s1.role = 'staff'
  AND s1.is_active = 1
  AND s2.is_active = 1
ORDER BY s1.department, s1.name;

16. 클래식 리텐션 분석

가입 월(cohort) 기준으로, 가입 후 +1개월, +2개월, +3개월에 다시 주문한 고객 비율을 계산하세요.

힌트 1: - 코호트 = SUBSTR(customers.created_at, 1, 7) - 각 주문의 "가입 후 경과 월수" 계산 - 조건부 COUNT(DISTINCT CASE WHEN ... THEN customer_id END)

정답
WITH cohort AS (
    SELECT
        id AS customer_id,
        SUBSTR(created_at, 1, 7) AS signup_month,
        created_at
    FROM customers
    WHERE created_at >= '2024-01-01' AND created_at < '2024-07-01'
),
cohort_orders AS (
    SELECT
        co.signup_month,
        co.customer_id,
        CAST(
            (CAST(SUBSTR(o.ordered_at, 1, 4) AS INTEGER) * 12
             + CAST(SUBSTR(o.ordered_at, 6, 2) AS INTEGER))
          - (CAST(SUBSTR(co.created_at, 1, 4) AS INTEGER) * 12
             + CAST(SUBSTR(co.created_at, 6, 2) AS INTEGER))
        AS INTEGER) AS months_since_signup
    FROM cohort AS co
    INNER JOIN orders AS o
        ON co.customer_id = o.customer_id
       AND o.status NOT IN ('cancelled', 'returned', 'return_requested')
)
SELECT
    signup_month,
    COUNT(DISTINCT customer_id) AS cohort_size,
    COUNT(DISTINCT CASE WHEN months_since_signup = 1 THEN customer_id END) AS m1,
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_signup = 1 THEN customer_id END)
        / COUNT(DISTINCT customer_id), 1) AS m1_pct,
    COUNT(DISTINCT CASE WHEN months_since_signup = 2 THEN customer_id END) AS m2,
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_signup = 2 THEN customer_id END)
        / COUNT(DISTINCT customer_id), 1) AS m2_pct,
    COUNT(DISTINCT CASE WHEN months_since_signup = 3 THEN customer_id END) AS m3,
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_signup = 3 THEN customer_id END)
        / COUNT(DISTINCT customer_id), 1) AS m3_pct
FROM cohort_orders
GROUP BY signup_month
ORDER BY signup_month;
WITH cohort AS (
    SELECT
        id AS customer_id,
        SUBSTR(created_at, 1, 7) AS signup_month,
        created_at
    FROM customers
    WHERE created_at >= '2024-01-01' AND created_at < '2024-07-01'
),
cohort_orders AS (
    SELECT
        co.signup_month,
        co.customer_id,
        (CAST(SUBSTR(o.ordered_at, 1, 4) AS NUMBER) * 12
         + CAST(SUBSTR(o.ordered_at, 6, 2) AS NUMBER))
      - (CAST(SUBSTR(co.created_at, 1, 4) AS NUMBER) * 12
         + CAST(SUBSTR(co.created_at, 6, 2) AS NUMBER)) AS months_since_signup
    FROM cohort co
    INNER JOIN orders o
        ON co.customer_id = o.customer_id
       AND o.status NOT IN ('cancelled', 'returned', 'return_requested')
)
SELECT
    signup_month,
    COUNT(DISTINCT customer_id) AS cohort_size,
    COUNT(DISTINCT CASE WHEN months_since_signup = 1 THEN customer_id END) AS m1,
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_signup = 1 THEN customer_id END)
        / COUNT(DISTINCT customer_id), 1) AS m1_pct,
    COUNT(DISTINCT CASE WHEN months_since_signup = 2 THEN customer_id END) AS m2,
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_signup = 2 THEN customer_id END)
        / COUNT(DISTINCT customer_id), 1) AS m2_pct,
    COUNT(DISTINCT CASE WHEN months_since_signup = 3 THEN customer_id END) AS m3,
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_signup = 3 THEN customer_id END)
        / COUNT(DISTINCT customer_id), 1) AS m3_pct
FROM cohort_orders
GROUP BY signup_month
ORDER BY signup_month;
WITH cohort AS (
    SELECT
        id AS customer_id,
        SUBSTRING(created_at, 1, 7) AS signup_month,
        created_at
    FROM customers
    WHERE created_at >= '2024-01-01' AND created_at < '2024-07-01'
),
cohort_orders AS (
    SELECT
        co.signup_month,
        co.customer_id,
        (CAST(SUBSTRING(o.ordered_at, 1, 4) AS INT) * 12
         + CAST(SUBSTRING(o.ordered_at, 6, 2) AS INT))
      - (CAST(SUBSTRING(co.created_at, 1, 4) AS INT) * 12
         + CAST(SUBSTRING(co.created_at, 6, 2) AS INT)) AS months_since_signup
    FROM cohort AS co
    INNER JOIN orders AS o
        ON co.customer_id = o.customer_id
       AND o.status NOT IN ('cancelled', 'returned', 'return_requested')
)
SELECT
    signup_month,
    COUNT(DISTINCT customer_id) AS cohort_size,
    COUNT(DISTINCT CASE WHEN months_since_signup = 1 THEN customer_id END) AS m1,
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_signup = 1 THEN customer_id END)
        / COUNT(DISTINCT customer_id), 1) AS m1_pct,
    COUNT(DISTINCT CASE WHEN months_since_signup = 2 THEN customer_id END) AS m2,
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_signup = 2 THEN customer_id END)
        / COUNT(DISTINCT customer_id), 1) AS m2_pct,
    COUNT(DISTINCT CASE WHEN months_since_signup = 3 THEN customer_id END) AS m3,
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_signup = 3 THEN customer_id END)
        / COUNT(DISTINCT customer_id), 1) AS m3_pct
FROM cohort_orders
GROUP BY signup_month
ORDER BY signup_month;

실행 결과 (6행)

signup_month cohort_size m1 m1_pct m2 m2_pct m3 m3_pct
2024-01 28 3 10.70 5 17.90 4 14.30
2024-02 26 7 26.90 3 11.50 5 19.20
2024-03 44 10 22.70 11 25.00 2 4.50
2024-04 26 4 15.40 4 15.40 2 7.70
2024-05 24 4 16.70 7 29.20 2 8.30
2024-06 36 4 11.10 4 11.10 6 16.70

17. 롤링 리텐션 분석

가입 월(cohort) 기준으로, N개월 이후에 "아무 때나" 주문한 고객 비율을 계산하세요. 클래식 리텐션과 달리 정확한 월이 아닌 N개월 이후 어떤 시점이든 포함합니다.

힌트 1: - months_since_signup >= N 조건으로 "N개월 이후"를 판별 - 클래식과 비교하면 항상 같거나 큰 값

정답
WITH cohort AS (
    SELECT
        id AS customer_id,
        SUBSTR(created_at, 1, 7) AS signup_month,
        created_at
    FROM customers
    WHERE created_at >= '2024-01-01' AND created_at < '2024-07-01'
),
cohort_orders AS (
    SELECT
        co.signup_month,
        co.customer_id,
        CAST(
            (CAST(SUBSTR(o.ordered_at, 1, 4) AS INTEGER) * 12
             + CAST(SUBSTR(o.ordered_at, 6, 2) AS INTEGER))
          - (CAST(SUBSTR(co.created_at, 1, 4) AS INTEGER) * 12
             + CAST(SUBSTR(co.created_at, 6, 2) AS INTEGER))
        AS INTEGER) AS months_since_signup
    FROM cohort AS co
    INNER JOIN orders AS o
        ON co.customer_id = o.customer_id
       AND o.status NOT IN ('cancelled', 'returned', 'return_requested')
)
SELECT
    signup_month,
    COUNT(DISTINCT customer_id) AS cohort_size,
    COUNT(DISTINCT CASE WHEN months_since_signup >= 1 THEN customer_id END) AS rolling_m1,
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_signup >= 1 THEN customer_id END)
        / COUNT(DISTINCT customer_id), 1) AS rolling_m1_pct,
    COUNT(DISTINCT CASE WHEN months_since_signup >= 2 THEN customer_id END) AS rolling_m2,
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_signup >= 2 THEN customer_id END)
        / COUNT(DISTINCT customer_id), 1) AS rolling_m2_pct,
    COUNT(DISTINCT CASE WHEN months_since_signup >= 3 THEN customer_id END) AS rolling_m3,
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_signup >= 3 THEN customer_id END)
        / COUNT(DISTINCT customer_id), 1) AS rolling_m3_pct
FROM cohort_orders
GROUP BY signup_month
ORDER BY signup_month;
WITH cohort AS (
    SELECT
        id AS customer_id,
        SUBSTR(created_at, 1, 7) AS signup_month,
        created_at
    FROM customers
    WHERE created_at >= '2024-01-01' AND created_at < '2024-07-01'
),
cohort_orders AS (
    SELECT
        co.signup_month,
        co.customer_id,
        (CAST(SUBSTR(o.ordered_at, 1, 4) AS NUMBER) * 12
         + CAST(SUBSTR(o.ordered_at, 6, 2) AS NUMBER))
      - (CAST(SUBSTR(co.created_at, 1, 4) AS NUMBER) * 12
         + CAST(SUBSTR(co.created_at, 6, 2) AS NUMBER)) AS months_since_signup
    FROM cohort co
    INNER JOIN orders o
        ON co.customer_id = o.customer_id
       AND o.status NOT IN ('cancelled', 'returned', 'return_requested')
)
SELECT
    signup_month,
    COUNT(DISTINCT customer_id) AS cohort_size,
    COUNT(DISTINCT CASE WHEN months_since_signup >= 1 THEN customer_id END) AS rolling_m1,
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_signup >= 1 THEN customer_id END)
        / COUNT(DISTINCT customer_id), 1) AS rolling_m1_pct,
    COUNT(DISTINCT CASE WHEN months_since_signup >= 2 THEN customer_id END) AS rolling_m2,
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_signup >= 2 THEN customer_id END)
        / COUNT(DISTINCT customer_id), 1) AS rolling_m2_pct,
    COUNT(DISTINCT CASE WHEN months_since_signup >= 3 THEN customer_id END) AS rolling_m3,
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_signup >= 3 THEN customer_id END)
        / COUNT(DISTINCT customer_id), 1) AS rolling_m3_pct
FROM cohort_orders
GROUP BY signup_month
ORDER BY signup_month;
WITH cohort AS (
    SELECT
        id AS customer_id,
        SUBSTRING(created_at, 1, 7) AS signup_month,
        created_at
    FROM customers
    WHERE created_at >= '2024-01-01' AND created_at < '2024-07-01'
),
cohort_orders AS (
    SELECT
        co.signup_month,
        co.customer_id,
        (CAST(SUBSTRING(o.ordered_at, 1, 4) AS INT) * 12
         + CAST(SUBSTRING(o.ordered_at, 6, 2) AS INT))
      - (CAST(SUBSTRING(co.created_at, 1, 4) AS INT) * 12
         + CAST(SUBSTRING(co.created_at, 6, 2) AS INT)) AS months_since_signup
    FROM cohort AS co
    INNER JOIN orders AS o
        ON co.customer_id = o.customer_id
       AND o.status NOT IN ('cancelled', 'returned', 'return_requested')
)
SELECT
    signup_month,
    COUNT(DISTINCT customer_id) AS cohort_size,
    COUNT(DISTINCT CASE WHEN months_since_signup >= 1 THEN customer_id END) AS rolling_m1,
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_signup >= 1 THEN customer_id END)
        / COUNT(DISTINCT customer_id), 1) AS rolling_m1_pct,
    COUNT(DISTINCT CASE WHEN months_since_signup >= 2 THEN customer_id END) AS rolling_m2,
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_signup >= 2 THEN customer_id END)
        / COUNT(DISTINCT customer_id), 1) AS rolling_m2_pct,
    COUNT(DISTINCT CASE WHEN months_since_signup >= 3 THEN customer_id END) AS rolling_m3,
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_signup >= 3 THEN customer_id END)
        / COUNT(DISTINCT customer_id), 1) AS rolling_m3_pct
FROM cohort_orders
GROUP BY signup_month
ORDER BY signup_month;

실행 결과 (6행)

signup_month cohort_size rolling_m1 rolling_m1_pct rolling_m2 rolling_m2_pct rolling_m3 rolling_m3_pct
2024-01 28 28 100.00 28 100.00 26 92.90
2024-02 26 26 100.00 26 100.00 25 96.20
2024-03 44 43 97.70 43 97.70 42 95.50
2024-04 26 26 100.00 24 92.30 24 92.30
2024-05 24 24 100.00 24 100.00 24 100.00
2024-06 36 35 97.20 35 97.20 33 91.70

18. DAU/MAU 고착도

2024년 12월 기준으로 product_views의 일별 활성 고객(DAU)과 월간 활성 고객(MAU) 비율을 계산하세요. DAU/MAU가 고착도(Stickiness)입니다.

힌트 1: - DAU = 일별 COUNT(DISTINCT customer_id) - MAU = 해당 월 전체 COUNT(DISTINCT customer_id) (서브쿼리) - Stickiness = DAU / MAU

정답
WITH dau AS (
    SELECT
        SUBSTR(viewed_at, 1, 10) AS view_date,
        COUNT(DISTINCT customer_id) AS daily_active
    FROM product_views
    WHERE viewed_at LIKE '2024-12%'
    GROUP BY SUBSTR(viewed_at, 1, 10)
),
mau AS (
    SELECT COUNT(DISTINCT customer_id) AS monthly_active
    FROM product_views
    WHERE viewed_at LIKE '2024-12%'
)
SELECT
    d.view_date,
    d.daily_active AS dau,
    m.monthly_active AS mau,
    ROUND(100.0 * d.daily_active / m.monthly_active, 2) AS stickiness_pct
FROM dau AS d
CROSS JOIN mau AS m
ORDER BY d.view_date;

실행 결과 (총 31행 중 상위 7행)

view_date dau mau stickiness_pct
2024-12-01 94 1056 8.90
2024-12-02 118 1056 11.17
2024-12-03 111 1056 10.51
2024-12-04 106 1056 10.04
2024-12-05 109 1056 10.32
2024-12-06 111 1056 10.51
2024-12-07 102 1056 9.66

19. 7일 이동 평균 매출

2024년 12월의 일별 매출과 7일 이동 평균을 계산하세요. AVG() OVER (ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)를 사용합니다.

힌트 1: - 일별 매출을 먼저 집계 - AVG(daily_revenue) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)

정답
WITH daily_revenue AS (
    SELECT
        SUBSTR(ordered_at, 1, 10) AS order_date,
        ROUND(SUM(total_amount), 2) AS revenue
    FROM orders
    WHERE ordered_at LIKE '2024-12%'
      AND status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY SUBSTR(ordered_at, 1, 10)
)
SELECT
    order_date,
    revenue,
    ROUND(AVG(revenue) OVER (
        ORDER BY order_date
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ), 2) AS moving_avg_7d
FROM daily_revenue
ORDER BY order_date;

실행 결과 (총 31행 중 상위 7행)

order_date revenue moving_avg_7d
2024-12-01 10,287,445.00 10,287,445.00
2024-12-02 11,732,557.00 11,010,001.00
2024-12-03 11,867,860.00 11,295,954.00
2024-12-04 11,198,303.00 11,271,541.25
2024-12-05 5,489,585.00 10,115,150.00
2024-12-06 15,937,500.00 11,085,541.67
2024-12-07 15,895,514.00 11,772,680.57

20. 3개월 이동 평균 월매출

월별 매출과 3개월 이동 평균을 계산하세요. 최근 24개월 데이터를 사용합니다.

힌트 1: - AVG(revenue) OVER (ORDER BY month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)

정답
WITH monthly AS (
    SELECT
        SUBSTR(ordered_at, 1, 7) AS year_month,
        ROUND(SUM(total_amount), 0) AS revenue
    FROM orders
    WHERE status NOT IN ('cancelled', 'returned', 'return_requested')
      AND ordered_at >= '2024-01-01'
    GROUP BY SUBSTR(ordered_at, 1, 7)
)
SELECT
    year_month,
    revenue,
    ROUND(AVG(revenue) OVER (
        ORDER BY year_month
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ), 0) AS moving_avg_3m
FROM monthly
ORDER BY year_month;

실행 결과 (총 24행 중 상위 7행)

year_month revenue moving_avg_3m
2024-01 288,908,320.00 288,908,320.00
2024-02 403,127,749.00 346,018,035.00
2024-03 519,844,502.00 403,960,190.00
2024-04 451,877,581.00 458,283,277.00
2024-05 425,264,478.00 465,662,187.00
2024-06 362,715,211.00 413,285,757.00
2024-07 343,929,897.00 377,303,195.00

21. 연도 내 누적 월매출

각 연도별로 월별 매출과 연도 내 누적 매출을 계산하세요. PARTITION BY year로 연도마다 누적이 리셋됩니다.

힌트 1: - SUM(revenue) OVER (PARTITION BY year ORDER BY month)

정답
SELECT
    SUBSTR(ordered_at, 1, 4) AS year,
    SUBSTR(ordered_at, 1, 7) AS year_month,
    ROUND(SUM(total_amount), 0) AS monthly_revenue,
    ROUND(SUM(SUM(total_amount)) OVER (
        PARTITION BY SUBSTR(ordered_at, 1, 4)
        ORDER BY SUBSTR(ordered_at, 1, 7)
    ), 0) AS cumulative_revenue
FROM orders
WHERE status NOT IN ('cancelled', 'returned', 'return_requested')
  AND ordered_at >= '2023-01-01'
GROUP BY SUBSTR(ordered_at, 1, 4), SUBSTR(ordered_at, 1, 7)
ORDER BY year_month;

실행 결과 (총 36행 중 상위 7행)

year year_month monthly_revenue cumulative_revenue
2023 2023-01 270,083,587.00 270,083,587.00
2023 2023-02 327,431,648.00 597,515,235.00
2023 2023-03 477,735,354.00 1,075,250,589.00
2023 2023-04 396,849,049.00 1,472,099,638.00
2023 2023-05 349,749,072.00 1,821,848,710.00
2023 2023-06 279,698,633.00 2,101,547,343.00
2023 2023-07 312,983,148.00 2,414,530,491.00

22. 파레토 분석 (고객)

매출의 80%를 생성하는 고객이 전체의 몇 %인지 분석하세요. 고객별 매출 누적 비율과 고객 순위를 표시합니다.

힌트 1: - 고객별 매출을 내림차순 정렬 후 누적 SUM - 전체 매출 대비 누적 비율 계산 - 80% 도달 시점의 고객 수 / 전체 고객 수

정답
WITH customer_revenue AS (
    SELECT
        customer_id,
        ROUND(SUM(total_amount), 2) AS revenue
    FROM orders
    WHERE status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY customer_id
),
ranked AS (
    SELECT
        customer_id,
        revenue,
        SUM(revenue) OVER (ORDER BY revenue DESC) AS cumulative_revenue,
        SUM(revenue) OVER () AS total_revenue,
        ROW_NUMBER() OVER (ORDER BY revenue DESC) AS rank,
        COUNT(*) OVER () AS total_customers
    FROM customer_revenue
)
SELECT
    rank,
    revenue,
    ROUND(100.0 * cumulative_revenue / total_revenue, 2) AS cumulative_pct,
    ROUND(100.0 * rank / total_customers, 2) AS customer_pct,
    CASE
        WHEN 100.0 * cumulative_revenue / total_revenue <= 80 THEN 'Top 80%'
        ELSE 'Remaining'
    END AS pareto_group
FROM ranked
WHERE rank <= 50 OR 100.0 * cumulative_revenue / total_revenue BETWEEN 78 AND 82
ORDER BY rank;

실행 결과 (총 167행 중 상위 7행)

rank revenue cumulative_pct customer_pct pareto_group
1 403,448,758.00 1.16 0.04 Top 80%
2 366,385,931.00 2.21 0.07 Top 80%
3 253,180,338.00 2.94 0.11 Top 80%
4 244,604,910.00 3.65 0.14 Top 80%
5 235,775,349.00 4.32 0.18 Top 80%
6 234,708,853.00 5.00 0.21 Top 80%
7 230,165,991.00 5.66 0.25 Top 80%

23. 구매 주기 분석

고객별 연속 주문 간 평균 일수(구매 주기)를 계산하세요. LAG로 이전 주문일을 가져와 JULIANDAY 차이를 구합니다.

힌트 1: - LAG(ordered_at) OVER (PARTITION BY customer_id ORDER BY ordered_at) - JULIANDAY(ordered_at) - JULIANDAY(prev_ordered_at) - AVG()로 고객별 평균 주기

정답
WITH order_gaps AS (
    SELECT
        customer_id,
        ordered_at,
        LAG(ordered_at) OVER (
            PARTITION BY customer_id ORDER BY ordered_at
        ) AS prev_ordered_at
    FROM orders
    WHERE status NOT IN ('cancelled', 'returned', 'return_requested')
)
SELECT
    c.name AS customer_name,
    c.grade,
    COUNT(*) AS gap_count,
    ROUND(AVG(JULIANDAY(og.ordered_at) - JULIANDAY(og.prev_ordered_at)), 1) AS avg_cycle_days,
    MIN(CAST(JULIANDAY(og.ordered_at) - JULIANDAY(og.prev_ordered_at) AS INTEGER)) AS min_days,
    MAX(CAST(JULIANDAY(og.ordered_at) - JULIANDAY(og.prev_ordered_at) AS INTEGER)) AS max_days
FROM order_gaps AS og
INNER JOIN customers AS c ON og.customer_id = c.id
WHERE og.prev_ordered_at IS NOT NULL
GROUP BY og.customer_id, c.name, c.grade
HAVING COUNT(*) >= 3
ORDER BY avg_cycle_days ASC
LIMIT 20;
WITH order_gaps AS (
    SELECT
        customer_id,
        ordered_at,
        LAG(ordered_at) OVER (
            PARTITION BY customer_id ORDER BY ordered_at
        ) AS prev_ordered_at
    FROM orders
    WHERE status NOT IN ('cancelled', 'returned', 'return_requested')
)
SELECT
    c.name AS customer_name,
    c.grade,
    COUNT(*) AS gap_count,
    ROUND(AVG(CAST(og.ordered_at AS DATE) - CAST(og.prev_ordered_at AS DATE)), 1) AS avg_cycle_days,
    MIN(CAST(og.ordered_at AS DATE) - CAST(og.prev_ordered_at AS DATE)) AS min_days,
    MAX(CAST(og.ordered_at AS DATE) - CAST(og.prev_ordered_at AS DATE)) AS max_days
FROM order_gaps og
INNER JOIN customers c ON og.customer_id = c.id
WHERE og.prev_ordered_at IS NOT NULL
GROUP BY og.customer_id, c.name, c.grade
HAVING COUNT(*) >= 3
ORDER BY avg_cycle_days ASC
FETCH FIRST 20 ROWS ONLY;
WITH order_gaps AS (
    SELECT
        customer_id,
        ordered_at,
        LAG(ordered_at) OVER (
            PARTITION BY customer_id ORDER BY ordered_at
        ) AS prev_ordered_at
    FROM orders
    WHERE status NOT IN ('cancelled', 'returned', 'return_requested')
)
SELECT TOP 20
    c.name AS customer_name,
    c.grade,
    COUNT(*) AS gap_count,
    ROUND(AVG(CAST(DATEDIFF(DAY, og.prev_ordered_at, og.ordered_at) AS FLOAT)), 1) AS avg_cycle_days,
    MIN(DATEDIFF(DAY, og.prev_ordered_at, og.ordered_at)) AS min_days,
    MAX(DATEDIFF(DAY, og.prev_ordered_at, og.ordered_at)) AS max_days
FROM order_gaps AS og
INNER JOIN customers AS c ON og.customer_id = c.id
WHERE og.prev_ordered_at IS NOT NULL
GROUP BY og.customer_id, c.name, c.grade
HAVING COUNT(*) >= 3
ORDER BY avg_cycle_days ASC;

실행 결과 (총 20행 중 상위 7행)

customer_name grade gap_count avg_cycle_days min_days max_days
김병철 VIP 341 10.10 0 112
박정수 VIP 302 10.80 0 88
이미정 VIP 218 12.00 0 74
김승현 VIP 6 12.10 0 26
정유진 VIP 222 12.30 0 125
이지영 GOLD 3 12.40 3 20
강명자 VIP 248 12.90 0 102

24. 첫 구매 후 재구매율

첫 주문 후 30일/60일/90일 이내에 재구매한 고객 비율을 구하세요.

힌트 1: - 고객별 첫 주문일 = MIN(ordered_at) - 두 번째 주문이 첫 주문 + N일 이내인지 확인

정답
WITH first_order AS (
    SELECT
        customer_id,
        MIN(ordered_at) AS first_ordered_at
    FROM orders
    WHERE status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY customer_id
),
repeat_order AS (
    SELECT
        fo.customer_id,
        fo.first_ordered_at,
        MIN(o.ordered_at) AS second_ordered_at
    FROM first_order AS fo
    INNER JOIN orders AS o
        ON fo.customer_id = o.customer_id
       AND o.ordered_at > fo.first_ordered_at
       AND o.status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY fo.customer_id, fo.first_ordered_at
)
SELECT
    COUNT(DISTINCT fo.customer_id) AS total_customers,
    COUNT(DISTINCT CASE
        WHEN JULIANDAY(ro.second_ordered_at) - JULIANDAY(fo.first_ordered_at) <= 30
        THEN fo.customer_id
    END) AS repurchase_30d,
    ROUND(100.0 * COUNT(DISTINCT CASE
        WHEN JULIANDAY(ro.second_ordered_at) - JULIANDAY(fo.first_ordered_at) <= 30
        THEN fo.customer_id
    END) / COUNT(DISTINCT fo.customer_id), 1) AS repurchase_30d_pct,
    COUNT(DISTINCT CASE
        WHEN JULIANDAY(ro.second_ordered_at) - JULIANDAY(fo.first_ordered_at) <= 60
        THEN fo.customer_id
    END) AS repurchase_60d,
    ROUND(100.0 * COUNT(DISTINCT CASE
        WHEN JULIANDAY(ro.second_ordered_at) - JULIANDAY(fo.first_ordered_at) <= 60
        THEN fo.customer_id
    END) / COUNT(DISTINCT fo.customer_id), 1) AS repurchase_60d_pct,
    COUNT(DISTINCT CASE
        WHEN JULIANDAY(ro.second_ordered_at) - JULIANDAY(fo.first_ordered_at) <= 90
        THEN fo.customer_id
    END) AS repurchase_90d,
    ROUND(100.0 * COUNT(DISTINCT CASE
        WHEN JULIANDAY(ro.second_ordered_at) - JULIANDAY(fo.first_ordered_at) <= 90
        THEN fo.customer_id
    END) / COUNT(DISTINCT fo.customer_id), 1) AS repurchase_90d_pct
FROM first_order AS fo
LEFT JOIN repeat_order AS ro ON fo.customer_id = ro.customer_id;
WITH first_order AS (
    SELECT
        customer_id,
        MIN(ordered_at) AS first_ordered_at
    FROM orders
    WHERE status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY customer_id
),
repeat_order AS (
    SELECT
        fo.customer_id,
        fo.first_ordered_at,
        MIN(o.ordered_at) AS second_ordered_at
    FROM first_order fo
    INNER JOIN orders o
        ON fo.customer_id = o.customer_id
       AND o.ordered_at > fo.first_ordered_at
       AND o.status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY fo.customer_id, fo.first_ordered_at
)
SELECT
    COUNT(DISTINCT fo.customer_id) AS total_customers,
    COUNT(DISTINCT CASE
        WHEN CAST(ro.second_ordered_at AS DATE) - CAST(fo.first_ordered_at AS DATE) <= 30
        THEN fo.customer_id
    END) AS repurchase_30d,
    ROUND(100.0 * COUNT(DISTINCT CASE
        WHEN CAST(ro.second_ordered_at AS DATE) - CAST(fo.first_ordered_at AS DATE) <= 30
        THEN fo.customer_id
    END) / COUNT(DISTINCT fo.customer_id), 1) AS repurchase_30d_pct,
    COUNT(DISTINCT CASE
        WHEN CAST(ro.second_ordered_at AS DATE) - CAST(fo.first_ordered_at AS DATE) <= 60
        THEN fo.customer_id
    END) AS repurchase_60d,
    ROUND(100.0 * COUNT(DISTINCT CASE
        WHEN CAST(ro.second_ordered_at AS DATE) - CAST(fo.first_ordered_at AS DATE) <= 60
        THEN fo.customer_id
    END) / COUNT(DISTINCT fo.customer_id), 1) AS repurchase_60d_pct,
    COUNT(DISTINCT CASE
        WHEN CAST(ro.second_ordered_at AS DATE) - CAST(fo.first_ordered_at AS DATE) <= 90
        THEN fo.customer_id
    END) AS repurchase_90d,
    ROUND(100.0 * COUNT(DISTINCT CASE
        WHEN CAST(ro.second_ordered_at AS DATE) - CAST(fo.first_ordered_at AS DATE) <= 90
        THEN fo.customer_id
    END) / COUNT(DISTINCT fo.customer_id), 1) AS repurchase_90d_pct
FROM first_order fo
LEFT JOIN repeat_order ro ON fo.customer_id = ro.customer_id;
WITH first_order AS (
    SELECT
        customer_id,
        MIN(ordered_at) AS first_ordered_at
    FROM orders
    WHERE status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY customer_id
),
repeat_order AS (
    SELECT
        fo.customer_id,
        fo.first_ordered_at,
        MIN(o.ordered_at) AS second_ordered_at
    FROM first_order AS fo
    INNER JOIN orders AS o
        ON fo.customer_id = o.customer_id
       AND o.ordered_at > fo.first_ordered_at
       AND o.status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY fo.customer_id, fo.first_ordered_at
)
SELECT
    COUNT(DISTINCT fo.customer_id) AS total_customers,
    COUNT(DISTINCT CASE
        WHEN DATEDIFF(DAY, fo.first_ordered_at, ro.second_ordered_at) <= 30
        THEN fo.customer_id
    END) AS repurchase_30d,
    ROUND(100.0 * COUNT(DISTINCT CASE
        WHEN DATEDIFF(DAY, fo.first_ordered_at, ro.second_ordered_at) <= 30
        THEN fo.customer_id
    END) / COUNT(DISTINCT fo.customer_id), 1) AS repurchase_30d_pct,
    COUNT(DISTINCT CASE
        WHEN DATEDIFF(DAY, fo.first_ordered_at, ro.second_ordered_at) <= 60
        THEN fo.customer_id
    END) AS repurchase_60d,
    ROUND(100.0 * COUNT(DISTINCT CASE
        WHEN DATEDIFF(DAY, fo.first_ordered_at, ro.second_ordered_at) <= 60
        THEN fo.customer_id
    END) / COUNT(DISTINCT fo.customer_id), 1) AS repurchase_60d_pct,
    COUNT(DISTINCT CASE
        WHEN DATEDIFF(DAY, fo.first_ordered_at, ro.second_ordered_at) <= 90
        THEN fo.customer_id
    END) AS repurchase_90d,
    ROUND(100.0 * COUNT(DISTINCT CASE
        WHEN DATEDIFF(DAY, fo.first_ordered_at, ro.second_ordered_at) <= 90
        THEN fo.customer_id
    END) / COUNT(DISTINCT fo.customer_id), 1) AS repurchase_90d_pct
FROM first_order AS fo
LEFT JOIN repeat_order AS ro ON fo.customer_id = ro.customer_id;

실행 결과 (1행)

total_customers repurchase_30d repurchase_30d_pct repurchase_60d repurchase_60d_pct repurchase_90d repurchase_90d_pct
2793 704 25.20 1016 36.40 1229 44.00

25. 포인트 잔액 불일치 탐지

point_transactions의 SUM(amount)과 customers.point_balance가 일치하지 않는 고객을 찾으세요.

힌트 1: - SUM(pt.amount) 고객별 합계 - customers.point_balance와 비교

정답
WITH point_sum AS (
    SELECT
        customer_id,
        SUM(amount) AS calculated_balance
    FROM point_transactions
    GROUP BY customer_id
)
SELECT
    c.id AS customer_id,
    c.name,
    c.point_balance AS stored_balance,
    COALESCE(ps.calculated_balance, 0) AS calculated_balance,
    c.point_balance - COALESCE(ps.calculated_balance, 0) AS drift
FROM customers AS c
LEFT JOIN point_sum AS ps ON c.id = ps.customer_id
WHERE c.point_balance != COALESCE(ps.calculated_balance, 0)
ORDER BY ABS(c.point_balance - COALESCE(ps.calculated_balance, 0)) DESC
LIMIT 20;

실행 결과 (총 20행 중 상위 7행)

customer_id name stored_balance calculated_balance drift
97 김병철 3,518,880 2,332,397 1,186,483
226 박정수 3,955,828 2,863,301 1,092,527
162 강명자 2,450,166 1,521,994 928,172
549 이미정 2,276,622 1,449,259 827,363
227 김성민 2,297,542 1,516,187 781,355
3 김민재 1,564,015 859,898 704,117
98 이영자 2,218,590 1,514,981 703,609

26. 프로모션 리프트 분석

프로모션 기간 중 일평균 매출과 프로모션 외 기간의 일평균 매출을 비교하세요. 프로모션별 리프트(%)를 계산합니다.

힌트 1: - promotions.started_at ~ ended_at 기간의 전체 매출 / 기간 일수 - 비프로모션 기간의 전체 매출 / 기간 일수

정답
WITH promo_daily AS (
    SELECT
        pr.id AS promo_id,
        pr.name AS promo_name,
        ROUND(SUM(o.total_amount), 0) AS promo_revenue,
        CAST(JULIANDAY(pr.ended_at) - JULIANDAY(pr.started_at) + 1 AS INTEGER) AS promo_days
    FROM promotions AS pr
    INNER JOIN orders AS o
        ON o.ordered_at BETWEEN pr.started_at AND pr.ended_at
       AND o.status NOT IN ('cancelled', 'returned', 'return_requested')
    WHERE pr.started_at >= '2024-01-01'
    GROUP BY pr.id, pr.name, pr.started_at, pr.ended_at
),
overall_daily AS (
    SELECT
        ROUND(SUM(total_amount) / 365.0, 0) AS avg_daily_revenue
    FROM orders
    WHERE ordered_at LIKE '2024%'
      AND status NOT IN ('cancelled', 'returned', 'return_requested')
)
SELECT
    pd.promo_name,
    pd.promo_revenue,
    pd.promo_days,
    ROUND(1.0 * pd.promo_revenue / pd.promo_days, 0) AS promo_avg_daily,
    od.avg_daily_revenue AS baseline_avg_daily,
    ROUND(100.0 * ((1.0 * pd.promo_revenue / pd.promo_days) - od.avg_daily_revenue)
        / od.avg_daily_revenue, 1) AS lift_pct
FROM promo_daily AS pd
CROSS JOIN overall_daily AS od
ORDER BY lift_pct DESC
LIMIT 15;
WITH promo_daily AS (
    SELECT
        pr.id AS promo_id,
        pr.name AS promo_name,
        ROUND(SUM(o.total_amount), 0) AS promo_revenue,
        CAST(pr.ended_at AS DATE) - CAST(pr.started_at AS DATE) + 1 AS promo_days
    FROM promotions pr
    INNER JOIN orders o
        ON o.ordered_at BETWEEN pr.started_at AND pr.ended_at
       AND o.status NOT IN ('cancelled', 'returned', 'return_requested')
    WHERE pr.started_at >= '2024-01-01'
    GROUP BY pr.id, pr.name, pr.started_at, pr.ended_at
),
overall_daily AS (
    SELECT
        ROUND(SUM(total_amount) / 365.0, 0) AS avg_daily_revenue
    FROM orders
    WHERE ordered_at LIKE '2024%'
      AND status NOT IN ('cancelled', 'returned', 'return_requested')
)
SELECT
    pd.promo_name,
    pd.promo_revenue,
    pd.promo_days,
    ROUND(1.0 * pd.promo_revenue / pd.promo_days, 0) AS promo_avg_daily,
    od.avg_daily_revenue AS baseline_avg_daily,
    ROUND(100.0 * ((1.0 * pd.promo_revenue / pd.promo_days) - od.avg_daily_revenue)
        / od.avg_daily_revenue, 1) AS lift_pct
FROM promo_daily pd
CROSS JOIN overall_daily od
ORDER BY lift_pct DESC
FETCH FIRST 15 ROWS ONLY;
WITH promo_daily AS (
    SELECT
        pr.id AS promo_id,
        pr.name AS promo_name,
        ROUND(SUM(o.total_amount), 0) AS promo_revenue,
        DATEDIFF(DAY, pr.started_at, pr.ended_at) + 1 AS promo_days
    FROM promotions AS pr
    INNER JOIN orders AS o
        ON o.ordered_at BETWEEN pr.started_at AND pr.ended_at
       AND o.status NOT IN ('cancelled', 'returned', 'return_requested')
    WHERE pr.started_at >= '2024-01-01'
    GROUP BY pr.id, pr.name, pr.started_at, pr.ended_at
),
overall_daily AS (
    SELECT
        ROUND(SUM(total_amount) / 365.0, 0) AS avg_daily_revenue
    FROM orders
    WHERE ordered_at LIKE '2024%'
      AND status NOT IN ('cancelled', 'returned', 'return_requested')
)
SELECT TOP 15
    pd.promo_name,
    pd.promo_revenue,
    pd.promo_days,
    ROUND(1.0 * pd.promo_revenue / pd.promo_days, 0) AS promo_avg_daily,
    od.avg_daily_revenue AS baseline_avg_daily,
    ROUND(100.0 * ((1.0 * pd.promo_revenue / pd.promo_days) - od.avg_daily_revenue)
        / od.avg_daily_revenue, 1) AS lift_pct
FROM promo_daily AS pd
CROSS JOIN overall_daily AS od
ORDER BY lift_pct DESC;

실행 결과 (총 15행 중 상위 7행)

promo_name promo_revenue promo_days promo_avg_daily baseline_avg_daily lift_pct
연말 감사 세일 2025 416,811,391.00 15 27,787,426.00 14,012,174.00 98.30
깜짝 특가 74,259,378.00 3 24,753,126.00 14,012,174.00 76.70
봄맞이 세일 2025 311,900,316.00 15 20,793,354.00 14,012,174.00 48.40
추석 선물 세일 2024 225,867,067.00 11 20,533,370.00 14,012,174.00 46.50
신학기 노트북 특가 2025 373,010,414.00 22 16,955,019.00 14,012,174.00 21.00
게이밍 기어 페스타 2025 131,978,627.00 8 16,497,328.00 14,012,174.00 17.70
신학기 노트북 특가 2024 350,818,175.00 22 15,946,281.00 14,012,174.00 13.80

27. 카테고리 교차 판매 분석

같은 주문에서 가장 자주 함께 구매되는 카테고리 쌍을 찾으세요.

힌트 1: - 주문별 카테고리 목록을 먼저 구한 뒤 - Self-join으로 모든 카테고리 쌍 생성 (c1.id < c2.id로 중복 제거)

정답
WITH order_categories AS (
    SELECT DISTINCT
        oi.order_id,
        p.category_id,
        cat.name AS category_name
    FROM order_items AS oi
    INNER JOIN products AS p ON oi.product_id = p.id
    INNER JOIN categories AS cat ON p.category_id = cat.id
    INNER JOIN orders AS o ON oi.order_id = o.id
    WHERE o.status NOT IN ('cancelled', 'returned', 'return_requested')
)
SELECT
    oc1.category_name AS category_1,
    oc2.category_name AS category_2,
    COUNT(*) AS co_occurrence
FROM order_categories AS oc1
INNER JOIN order_categories AS oc2
    ON oc1.order_id = oc2.order_id
   AND oc1.category_id < oc2.category_id
GROUP BY oc1.category_name, oc2.category_name
ORDER BY co_occurrence DESC
LIMIT 15;

실행 결과 (총 15행 중 상위 7행)

category_1 category_2 co_occurrence
파워서플라이(PSU) 케이스 3139
SSD 케이스 2999
SSD 파워서플라이(PSU) 2990
AMD 케이스 2209
AMD 파워서플라이(PSU) 2182
DDR5 케이스 1768
Intel 케이스 1757

28. 등급 하락 궤적 추적

customer_grade_history에서 연속 하락(downgrade)이 2회 이상인 고객을 찾으세요. 예: VIP -> GOLD -> SILVER

힌트 1: - LAG(reason) OVER (PARTITION BY customer_id ORDER BY changed_at) 사용 - 현재와 이전 모두 'downgrade'인 행을 찾기

정답
WITH grade_seq AS (
    SELECT
        customer_id,
        old_grade,
        new_grade,
        reason,
        changed_at,
        LAG(reason) OVER (
            PARTITION BY customer_id ORDER BY changed_at
        ) AS prev_reason,
        LAG(old_grade) OVER (
            PARTITION BY customer_id ORDER BY changed_at
        ) AS prev_old_grade
    FROM customer_grade_history
)
SELECT
    gs.customer_id,
    c.name AS customer_name,
    gs.prev_old_grade AS grade_before,
    gs.old_grade AS grade_mid,
    gs.new_grade AS grade_after,
    gs.changed_at
FROM grade_seq AS gs
INNER JOIN customers AS c ON gs.customer_id = c.id
WHERE gs.reason = 'downgrade'
  AND gs.prev_reason = 'downgrade'
ORDER BY gs.customer_id, gs.changed_at;

실행 결과 (총 292행 중 상위 7행)

customer_id customer_name grade_before grade_mid grade_after changed_at
4 진정자 VIP GOLD SILVER 2024-01-01 00:00:00
8 성민석 VIP GOLD SILVER 2022-01-01 00:00:00
10 박지훈 GOLD SILVER BRONZE 2021-01-01 00:00:00
12 장준서 VIP GOLD SILVER 2023-01-01 00:00:00
14 윤순옥 VIP GOLD SILVER 2020-01-01 00:00:00
14 윤순옥 GOLD SILVER BRONZE 2021-01-01 00:00:00
15 강은서 VIP GOLD SILVER 2023-01-01 00:00:00

29. 택배사별 월별 배송 성과 및 추이

택배사별 월별 평균 배송 소요일과 전월 대비 변화를 보여주세요.

힌트 1: - JULIANDAY(delivered_at) - JULIANDAY(shipped_at) 로 배송일 - LAG(avg_days) OVER (PARTITION BY carrier ORDER BY month) 로 전월 비교

정답
WITH monthly_carrier AS (
    SELECT
        carrier,
        SUBSTR(shipped_at, 1, 7) AS ship_month,
        COUNT(*) AS delivery_count,
        ROUND(AVG(JULIANDAY(delivered_at) - JULIANDAY(shipped_at)), 2) AS avg_days
    FROM shipping
    WHERE delivered_at IS NOT NULL
      AND shipped_at IS NOT NULL
      AND shipped_at >= '2024-01-01'
    GROUP BY carrier, SUBSTR(shipped_at, 1, 7)
)
SELECT
    carrier,
    ship_month,
    delivery_count,
    avg_days,
    LAG(avg_days) OVER (PARTITION BY carrier ORDER BY ship_month) AS prev_month_days,
    ROUND(avg_days - LAG(avg_days) OVER (PARTITION BY carrier ORDER BY ship_month), 2) AS mom_change
FROM monthly_carrier
ORDER BY carrier, ship_month;
WITH monthly_carrier AS (
    SELECT
        carrier,
        SUBSTR(shipped_at, 1, 7) AS ship_month,
        COUNT(*) AS delivery_count,
        ROUND(AVG(CAST(delivered_at AS DATE) - CAST(shipped_at AS DATE)), 2) AS avg_days
    FROM shipping
    WHERE delivered_at IS NOT NULL
      AND shipped_at IS NOT NULL
      AND shipped_at >= '2024-01-01'
    GROUP BY carrier, SUBSTR(shipped_at, 1, 7)
)
SELECT
    carrier,
    ship_month,
    delivery_count,
    avg_days,
    LAG(avg_days) OVER (PARTITION BY carrier ORDER BY ship_month) AS prev_month_days,
    ROUND(avg_days - LAG(avg_days) OVER (PARTITION BY carrier ORDER BY ship_month), 2) AS mom_change
FROM monthly_carrier
ORDER BY carrier, ship_month;
WITH monthly_carrier AS (
    SELECT
        carrier,
        SUBSTRING(shipped_at, 1, 7) AS ship_month,
        COUNT(*) AS delivery_count,
        ROUND(AVG(CAST(DATEDIFF(DAY, shipped_at, delivered_at) AS FLOAT)), 2) AS avg_days
    FROM shipping
    WHERE delivered_at IS NOT NULL
      AND shipped_at IS NOT NULL
      AND shipped_at >= '2024-01-01'
    GROUP BY carrier, SUBSTRING(shipped_at, 1, 7)
)
SELECT
    carrier,
    ship_month,
    delivery_count,
    avg_days,
    LAG(avg_days) OVER (PARTITION BY carrier ORDER BY ship_month) AS prev_month_days,
    ROUND(avg_days - LAG(avg_days) OVER (PARTITION BY carrier ORDER BY ship_month), 2) AS mom_change
FROM monthly_carrier
ORDER BY carrier, ship_month;

실행 결과 (총 96행 중 상위 7행)

carrier ship_month delivery_count avg_days prev_month_days mom_change
CJ대한통운 2024-01 126 2.50 NULL NULL
CJ대한통운 2024-02 172 2.55 2.50 0.05
CJ대한통운 2024-03 218 2.54 2.55 -0.01
CJ대한통운 2024-04 192 2.51 2.54 -0.03
CJ대한통운 2024-05 171 2.33 2.51 -0.18
CJ대한통운 2024-06 160 2.44 2.33 0.11
CJ대한통운 2024-07 150 2.51 2.44 0.07

30. 3일 연속 매출 증가 구간

일별 매출이 3일 연속 증가한 구간을 찾으세요. 시작일, 종료일, 연속 일수를 표시합니다.

힌트 1: - LAG로 전일 매출 비교하여 증가 여부 플래그 - 증가 플래그가 끊기는 지점을 그룹 경계로 사용 (island 패턴) - 그룹별 연속 일수 >= 3 필터

정답
WITH daily AS (
    SELECT
        SUBSTR(ordered_at, 1, 10) AS order_date,
        ROUND(SUM(total_amount), 2) AS revenue
    FROM orders
    WHERE status NOT IN ('cancelled', 'returned', 'return_requested')
      AND ordered_at >= '2024-01-01'
    GROUP BY SUBSTR(ordered_at, 1, 10)
),
with_flag AS (
    SELECT
        order_date,
        revenue,
        LAG(revenue) OVER (ORDER BY order_date) AS prev_revenue,
        CASE
            WHEN revenue > LAG(revenue) OVER (ORDER BY order_date) THEN 1
            ELSE 0
        END AS is_increase
    FROM daily
),
with_group AS (
    SELECT
        order_date,
        revenue,
        is_increase,
        SUM(CASE WHEN is_increase = 0 THEN 1 ELSE 0 END) OVER (
            ORDER BY order_date
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS grp
    FROM with_flag
)
SELECT
    MIN(order_date) AS start_date,
    MAX(order_date) AS end_date,
    COUNT(*) AS streak_days
FROM with_group
WHERE is_increase = 1
GROUP BY grp
HAVING COUNT(*) >= 3
ORDER BY start_date;
WITH daily AS (
    SELECT
        SUBSTR(ordered_at, 1, 10) AS order_date,
        ROUND(SUM(total_amount), 2) AS revenue
    FROM orders
    WHERE status NOT IN ('cancelled', 'returned', 'return_requested')
      AND ordered_at >= '2024-01-01'
    GROUP BY SUBSTR(ordered_at, 1, 10)
),
with_flag AS (
    SELECT
        order_date,
        revenue,
        LAG(revenue) OVER (ORDER BY order_date) AS prev_revenue,
        CASE
            WHEN revenue > LAG(revenue) OVER (ORDER BY order_date) THEN 1
            ELSE 0
        END AS is_increase
    FROM daily
),
with_group AS (
    SELECT
        order_date,
        revenue,
        is_increase,
        SUM(CASE WHEN is_increase = 0 THEN 1 ELSE 0 END) OVER (
            ORDER BY order_date
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS grp
    FROM with_flag
)
SELECT
    MIN(order_date) AS start_date,
    MAX(order_date) AS end_date,
    COUNT(*) AS streak_days
FROM with_group
WHERE is_increase = 1
GROUP BY grp
HAVING COUNT(*) >= 3
ORDER BY start_date;
WITH daily AS (
    SELECT
        SUBSTRING(ordered_at, 1, 10) AS order_date,
        ROUND(SUM(total_amount), 2) AS revenue
    FROM orders
    WHERE status NOT IN ('cancelled', 'returned', 'return_requested')
      AND ordered_at >= '2024-01-01'
    GROUP BY SUBSTRING(ordered_at, 1, 10)
),
with_flag AS (
    SELECT
        order_date,
        revenue,
        LAG(revenue) OVER (ORDER BY order_date) AS prev_revenue,
        CASE
            WHEN revenue > LAG(revenue) OVER (ORDER BY order_date) THEN 1
            ELSE 0
        END AS is_increase
    FROM daily
),
with_group AS (
    SELECT
        order_date,
        revenue,
        is_increase,
        SUM(CASE WHEN is_increase = 0 THEN 1 ELSE 0 END) OVER (
            ORDER BY order_date
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS grp
    FROM with_flag
)
SELECT
    MIN(order_date) AS start_date,
    MAX(order_date) AS end_date,
    COUNT(*) AS streak_days
FROM with_group
WHERE is_increase = 1
GROUP BY grp
HAVING COUNT(*) >= 3
ORDER BY start_date;

실행 결과 (총 25행 중 상위 7행)

start_date end_date streak_days
2024-01-04 2024-01-07 4
2024-01-17 2024-01-21 5
2024-01-26 2024-01-28 3
2024-03-04 2024-03-06 3
2024-03-27 2024-03-29 3
2024-05-14 2024-05-16 3
2024-06-20 2024-06-23 4

31. 연속 월 주문 고객

5개월 이상 연속으로 매월 주문한 고객을 찾으세요. 고객명, 연속 시작월, 연속 종료월, 연속 개월 수를 표시합니다.

힌트 1: - 고객별 월별 주문 여부를 먼저 구함 - 연속 월을 감지하려면 "month_number - ROW_NUMBER" 패턴 사용 - 같은 차이값을 가진 행들이 연속 구간

정답
WITH customer_months AS (
    SELECT DISTINCT
        customer_id,
        SUBSTR(ordered_at, 1, 7) AS order_month,
        CAST(SUBSTR(ordered_at, 1, 4) AS INTEGER) * 12
            + CAST(SUBSTR(ordered_at, 6, 2) AS INTEGER) AS month_num
    FROM orders
    WHERE status NOT IN ('cancelled', 'returned', 'return_requested')
),
with_rn AS (
    SELECT
        customer_id,
        order_month,
        month_num,
        month_num - ROW_NUMBER() OVER (
            PARTITION BY customer_id ORDER BY month_num
        ) AS grp
    FROM customer_months
),
streaks AS (
    SELECT
        customer_id,
        MIN(order_month) AS start_month,
        MAX(order_month) AS end_month,
        COUNT(*) AS consecutive_months
    FROM with_rn
    GROUP BY customer_id, grp
    HAVING COUNT(*) >= 5
)
SELECT
    c.name AS customer_name,
    c.grade,
    s.start_month,
    s.end_month,
    s.consecutive_months
FROM streaks AS s
INNER JOIN customers AS c ON s.customer_id = c.id
ORDER BY s.consecutive_months DESC, c.name
LIMIT 20;
WITH customer_months AS (
    SELECT DISTINCT
        customer_id,
        SUBSTR(ordered_at, 1, 7) AS order_month,
        CAST(SUBSTR(ordered_at, 1, 4) AS NUMBER) * 12
            + CAST(SUBSTR(ordered_at, 6, 2) AS NUMBER) AS month_num
    FROM orders
    WHERE status NOT IN ('cancelled', 'returned', 'return_requested')
),
with_rn AS (
    SELECT
        customer_id,
        order_month,
        month_num,
        month_num - ROW_NUMBER() OVER (
            PARTITION BY customer_id ORDER BY month_num
        ) AS grp
    FROM customer_months
),
streaks AS (
    SELECT
        customer_id,
        MIN(order_month) AS start_month,
        MAX(order_month) AS end_month,
        COUNT(*) AS consecutive_months
    FROM with_rn
    GROUP BY customer_id, grp
    HAVING COUNT(*) >= 5
)
SELECT
    c.name AS customer_name,
    c.grade,
    s.start_month,
    s.end_month,
    s.consecutive_months
FROM streaks s
INNER JOIN customers c ON s.customer_id = c.id
ORDER BY s.consecutive_months DESC, c.name
FETCH FIRST 20 ROWS ONLY;
WITH customer_months AS (
    SELECT DISTINCT
        customer_id,
        SUBSTRING(ordered_at, 1, 7) AS order_month,
        CAST(SUBSTRING(ordered_at, 1, 4) AS INT) * 12
            + CAST(SUBSTRING(ordered_at, 6, 2) AS INT) AS month_num
    FROM orders
    WHERE status NOT IN ('cancelled', 'returned', 'return_requested')
),
with_rn AS (
    SELECT
        customer_id,
        order_month,
        month_num,
        month_num - ROW_NUMBER() OVER (
            PARTITION BY customer_id ORDER BY month_num
        ) AS grp
    FROM customer_months
),
streaks AS (
    SELECT
        customer_id,
        MIN(order_month) AS start_month,
        MAX(order_month) AS end_month,
        COUNT(*) AS consecutive_months
    FROM with_rn
    GROUP BY customer_id, grp
    HAVING COUNT(*) >= 5
)
SELECT TOP 20
    c.name AS customer_name,
    c.grade,
    s.start_month,
    s.end_month,
    s.consecutive_months
FROM streaks AS s
INNER JOIN customers AS c ON s.customer_id = c.id
ORDER BY s.consecutive_months DESC, c.name;

실행 결과 (총 20행 중 상위 7행)

customer_name grade start_month end_month consecutive_months
김병철 VIP 2016-07 2022-05 71
이미정 VIP 2018-10 2022-07 46
박정수 VIP 2017-01 2020-09 45
김성민 VIP 2017-05 2020-07 39
강상철 VIP 2020-10 2023-11 38
정유진 VIP 2018-04 2021-03 36
김경희 GOLD 2017-11 2020-09 35

32. 세션 정의 (30분 갭)

product_views를 세션으로 그룹화하세요 (같은 고객의 조회 간 30분 이상 간격이면 새 세션). 고객별 세션 수, 세션당 평균 조회 수를 구하세요.

힌트 1: - LAG(viewed_at)로 이전 조회 시간 - 30분 = (JULIANDAY(현재) - JULIANDAY(이전)) * 24 * 60 > 30 - SUM(is_new_session) OVER (...)로 세션 번호 부여

정답
WITH view_gaps AS (
    SELECT
        customer_id,
        viewed_at,
        CASE
            WHEN LAG(viewed_at) OVER (
                PARTITION BY customer_id ORDER BY viewed_at
            ) IS NULL THEN 1
            WHEN (JULIANDAY(viewed_at) - JULIANDAY(LAG(viewed_at) OVER (
                PARTITION BY customer_id ORDER BY viewed_at
            ))) * 24 * 60 > 30 THEN 1
            ELSE 0
        END AS is_new_session
    FROM product_views
    WHERE customer_id <= 500
),
with_session AS (
    SELECT
        customer_id,
        viewed_at,
        SUM(is_new_session) OVER (
            PARTITION BY customer_id ORDER BY viewed_at
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS session_id
    FROM view_gaps
),
session_stats AS (
    SELECT
        customer_id,
        session_id,
        COUNT(*) AS views_in_session
    FROM with_session
    GROUP BY customer_id, session_id
)
SELECT
    COUNT(DISTINCT customer_id) AS total_customers,
    COUNT(*) AS total_sessions,
    ROUND(1.0 * COUNT(*) / COUNT(DISTINCT customer_id), 1) AS avg_sessions_per_customer,
    ROUND(AVG(views_in_session), 1) AS avg_views_per_session
FROM session_stats;
WITH view_gaps AS (
    SELECT
        customer_id,
        viewed_at,
        CASE
            WHEN LAG(viewed_at) OVER (
                PARTITION BY customer_id ORDER BY viewed_at
            ) IS NULL THEN 1
            WHEN (CAST(viewed_at AS DATE) - CAST(LAG(viewed_at) OVER (
                PARTITION BY customer_id ORDER BY viewed_at
            ) AS DATE)) * 24 * 60 > 30 THEN 1
            ELSE 0
        END AS is_new_session
    FROM product_views
    WHERE customer_id <= 500
),
with_session AS (
    SELECT
        customer_id,
        viewed_at,
        SUM(is_new_session) OVER (
            PARTITION BY customer_id ORDER BY viewed_at
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS session_id
    FROM view_gaps
),
session_stats AS (
    SELECT
        customer_id,
        session_id,
        COUNT(*) AS views_in_session
    FROM with_session
    GROUP BY customer_id, session_id
)
SELECT
    COUNT(DISTINCT customer_id) AS total_customers,
    COUNT(*) AS total_sessions,
    ROUND(1.0 * COUNT(*) / COUNT(DISTINCT customer_id), 1) AS avg_sessions_per_customer,
    ROUND(AVG(views_in_session), 1) AS avg_views_per_session
FROM session_stats;
WITH view_gaps AS (
    SELECT
        customer_id,
        viewed_at,
        CASE
            WHEN LAG(viewed_at) OVER (
                PARTITION BY customer_id ORDER BY viewed_at
            ) IS NULL THEN 1
            WHEN DATEDIFF(MINUTE, LAG(viewed_at) OVER (
                PARTITION BY customer_id ORDER BY viewed_at
            ), viewed_at) > 30 THEN 1
            ELSE 0
        END AS is_new_session
    FROM product_views
    WHERE customer_id <= 500
),
with_session AS (
    SELECT
        customer_id,
        viewed_at,
        SUM(is_new_session) OVER (
            PARTITION BY customer_id ORDER BY viewed_at
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS session_id
    FROM view_gaps
),
session_stats AS (
    SELECT
        customer_id,
        session_id,
        COUNT(*) AS views_in_session
    FROM with_session
    GROUP BY customer_id, session_id
)
SELECT
    COUNT(DISTINCT customer_id) AS total_customers,
    COUNT(*) AS total_sessions,
    ROUND(1.0 * COUNT(*) / COUNT(DISTINCT customer_id), 1) AS avg_sessions_per_customer,
    ROUND(AVG(views_in_session), 1) AS avg_views_per_session
FROM session_stats;

실행 결과 (1행)

total_customers total_sessions avg_sessions_per_customer avg_views_per_session
267 65,248 244.40 1.10

33. 세션 재정의 (10분 갭)

c11-32와 동일하되 세션 갭을 10분으로 변경하세요. 30분 갭 결과와 비교하여 세션 수가 어떻게 달라지는지 확인합니다.

힌트 1: - 30분을 10분으로 변경하기만 하면 됨 - 세션이 더 많이 분할될 것

정답
WITH view_gaps AS (
    SELECT
        customer_id,
        viewed_at,
        CASE
            WHEN LAG(viewed_at) OVER (
                PARTITION BY customer_id ORDER BY viewed_at
            ) IS NULL THEN 1
            WHEN (JULIANDAY(viewed_at) - JULIANDAY(LAG(viewed_at) OVER (
                PARTITION BY customer_id ORDER BY viewed_at
            ))) * 24 * 60 > 10 THEN 1
            ELSE 0
        END AS is_new_session
    FROM product_views
    WHERE customer_id <= 500
),
with_session AS (
    SELECT
        customer_id,
        viewed_at,
        SUM(is_new_session) OVER (
            PARTITION BY customer_id ORDER BY viewed_at
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS session_id
    FROM view_gaps
),
session_stats AS (
    SELECT
        customer_id,
        session_id,
        COUNT(*) AS views_in_session
    FROM with_session
    GROUP BY customer_id, session_id
)
SELECT
    COUNT(DISTINCT customer_id) AS total_customers,
    COUNT(*) AS total_sessions,
    ROUND(1.0 * COUNT(*) / COUNT(DISTINCT customer_id), 1) AS avg_sessions_per_customer,
    ROUND(AVG(views_in_session), 1) AS avg_views_per_session
FROM session_stats;
WITH view_gaps AS (
    SELECT
        customer_id,
        viewed_at,
        CASE
            WHEN LAG(viewed_at) OVER (
                PARTITION BY customer_id ORDER BY viewed_at
            ) IS NULL THEN 1
            WHEN (CAST(viewed_at AS DATE) - CAST(LAG(viewed_at) OVER (
                PARTITION BY customer_id ORDER BY viewed_at
            ) AS DATE)) * 24 * 60 > 10 THEN 1
            ELSE 0
        END AS is_new_session
    FROM product_views
    WHERE customer_id <= 500
),
with_session AS (
    SELECT
        customer_id,
        viewed_at,
        SUM(is_new_session) OVER (
            PARTITION BY customer_id ORDER BY viewed_at
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS session_id
    FROM view_gaps
),
session_stats AS (
    SELECT
        customer_id,
        session_id,
        COUNT(*) AS views_in_session
    FROM with_session
    GROUP BY customer_id, session_id
)
SELECT
    COUNT(DISTINCT customer_id) AS total_customers,
    COUNT(*) AS total_sessions,
    ROUND(1.0 * COUNT(*) / COUNT(DISTINCT customer_id), 1) AS avg_sessions_per_customer,
    ROUND(AVG(views_in_session), 1) AS avg_views_per_session
FROM session_stats;
WITH view_gaps AS (
    SELECT
        customer_id,
        viewed_at,
        CASE
            WHEN LAG(viewed_at) OVER (
                PARTITION BY customer_id ORDER BY viewed_at
            ) IS NULL THEN 1
            WHEN DATEDIFF(MINUTE, LAG(viewed_at) OVER (
                PARTITION BY customer_id ORDER BY viewed_at
            ), viewed_at) > 10 THEN 1
            ELSE 0
        END AS is_new_session
    FROM product_views
    WHERE customer_id <= 500
),
with_session AS (
    SELECT
        customer_id,
        viewed_at,
        SUM(is_new_session) OVER (
            PARTITION BY customer_id ORDER BY viewed_at
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS session_id
    FROM view_gaps
),
session_stats AS (
    SELECT
        customer_id,
        session_id,
        COUNT(*) AS views_in_session
    FROM with_session
    GROUP BY customer_id, session_id
)
SELECT
    COUNT(DISTINCT customer_id) AS total_customers,
    COUNT(*) AS total_sessions,
    ROUND(1.0 * COUNT(*) / COUNT(DISTINCT customer_id), 1) AS avg_sessions_per_customer,
    ROUND(AVG(views_in_session), 1) AS avg_views_per_session
FROM session_stats;

실행 결과 (1행)

total_customers total_sessions avg_sessions_per_customer avg_views_per_session
267 67,538 253.00 1.00

34. 등급별 주문 금액 중앙값

고객 등급별 주문 금액의 중앙값(median)을 계산하세요. SQLite에는 MEDIAN 함수가 없으므로 NTILE 또는 ROW_NUMBER로 구현합니다.

힌트 1: - ROW_NUMBER() OVER (PARTITION BY grade ORDER BY total_amount) - 전체 건수의 절반 위치가 중앙값 - COUNT(*) OVER (PARTITION BY grade)로 전체 건수 파악

정답
WITH grade_orders AS (
    SELECT
        c.grade,
        o.total_amount,
        ROW_NUMBER() OVER (PARTITION BY c.grade ORDER BY o.total_amount) AS rn,
        COUNT(*) OVER (PARTITION BY c.grade) AS cnt
    FROM orders AS o
    INNER JOIN customers AS c ON o.customer_id = c.id
    WHERE o.status NOT IN ('cancelled', 'returned', 'return_requested')
)
SELECT
    grade,
    ROUND(AVG(total_amount), 2) AS median_amount
FROM grade_orders
WHERE rn IN (cnt / 2, cnt / 2 + 1)
GROUP BY grade
ORDER BY
    CASE grade
        WHEN 'VIP' THEN 1 WHEN 'GOLD' THEN 2
        WHEN 'SILVER' THEN 3 WHEN 'BRONZE' THEN 4
    END;

실행 결과 (4행)

grade median_amount
VIP 481,750.00
GOLD 456,200.00
SILVER 423,000.00
BRONZE 317,400.00

35. 택배사별 배송일 중앙값

택배사별 배송 소요일의 중앙값을 계산하세요.

힌트 1: - JULIANDAY(delivered_at) - JULIANDAY(shipped_at)로 배송일 계산 - ROW_NUMBER()로 순위를 매긴 후 중앙 위치의 값 추출

정답
WITH delivery_days AS (
    SELECT
        carrier,
        ROUND(JULIANDAY(delivered_at) - JULIANDAY(shipped_at), 1) AS days,
        ROW_NUMBER() OVER (PARTITION BY carrier ORDER BY JULIANDAY(delivered_at) - JULIANDAY(shipped_at)) AS rn,
        COUNT(*) OVER (PARTITION BY carrier) AS cnt
    FROM shipping
    WHERE delivered_at IS NOT NULL
      AND shipped_at IS NOT NULL
)
SELECT
    carrier,
    ROUND(AVG(days), 2) AS median_days
FROM delivery_days
WHERE rn IN (cnt / 2, cnt / 2 + 1)
GROUP BY carrier
ORDER BY median_days;
WITH delivery_days AS (
    SELECT
        carrier,
        ROUND(CAST(delivered_at AS DATE) - CAST(shipped_at AS DATE), 1) AS days,
        ROW_NUMBER() OVER (PARTITION BY carrier ORDER BY CAST(delivered_at AS DATE) - CAST(shipped_at AS DATE)) AS rn,
        COUNT(*) OVER (PARTITION BY carrier) AS cnt
    FROM shipping
    WHERE delivered_at IS NOT NULL
      AND shipped_at IS NOT NULL
)
SELECT
    carrier,
    ROUND(AVG(days), 2) AS median_days
FROM delivery_days
WHERE rn IN (TRUNC(cnt / 2), TRUNC(cnt / 2) + 1)
GROUP BY carrier
ORDER BY median_days;
WITH delivery_days AS (
    SELECT
        carrier,
        ROUND(CAST(DATEDIFF(DAY, shipped_at, delivered_at) AS FLOAT), 1) AS days,
        ROW_NUMBER() OVER (PARTITION BY carrier ORDER BY DATEDIFF(DAY, shipped_at, delivered_at)) AS rn,
        COUNT(*) OVER (PARTITION BY carrier) AS cnt
    FROM shipping
    WHERE delivered_at IS NOT NULL
      AND shipped_at IS NOT NULL
)
SELECT
    carrier,
    ROUND(AVG(days), 2) AS median_days
FROM delivery_days
WHERE rn IN (cnt / 2, cnt / 2 + 1)
GROUP BY carrier
ORDER BY median_days;

실행 결과 (4행)

carrier median_days
CJ대한통운 3.00
로젠택배 3.00
우체국택배 3.00
한진택배 3.00

36. 디바이스별 퍼널 이탈 분석

device_type(desktop/mobile/tablet)별로 조회 -> 장바구니 -> 구매 퍼널의 각 단계별 전환율과 이탈률을 구하세요.

힌트 1: - product_views.device_type으로 디바이스 구분 - 각 단계의 고유 고객 수를 device_type별로 집계 - 이탈률 = 1 - 전환율

정답
WITH view_step AS (
    SELECT
        device_type,
        COUNT(DISTINCT customer_id) AS viewers
    FROM product_views
    GROUP BY device_type
),
cart_step AS (
    SELECT
        pv.device_type,
        COUNT(DISTINCT c.customer_id) AS carters
    FROM product_views AS pv
    INNER JOIN carts AS c ON pv.customer_id = c.customer_id
    INNER JOIN cart_items AS ci ON c.id = ci.cart_id AND pv.product_id = ci.product_id
    GROUP BY pv.device_type
),
purchase_step AS (
    SELECT
        pv.device_type,
        COUNT(DISTINCT o.customer_id) AS buyers
    FROM product_views AS pv
    INNER JOIN orders AS o ON pv.customer_id = o.customer_id
    INNER JOIN order_items AS oi ON o.id = oi.order_id AND pv.product_id = oi.product_id
    WHERE o.status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY pv.device_type
)
SELECT
    vs.device_type,
    vs.viewers,
    COALESCE(cs.carters, 0) AS carters,
    ROUND(100.0 * COALESCE(cs.carters, 0) / vs.viewers, 2) AS view_to_cart_pct,
    COALESCE(ps.buyers, 0) AS buyers,
    ROUND(100.0 * COALESCE(ps.buyers, 0) / NULLIF(COALESCE(cs.carters, 0), 0), 2) AS cart_to_buy_pct,
    ROUND(100.0 * COALESCE(ps.buyers, 0) / vs.viewers, 2) AS view_to_buy_pct
FROM view_step AS vs
LEFT JOIN cart_step AS cs ON vs.device_type = cs.device_type
LEFT JOIN purchase_step AS ps ON vs.device_type = ps.device_type
ORDER BY vs.device_type;

실행 결과 (3행)

device_type viewers carters view_to_cart_pct buyers cart_to_buy_pct view_to_buy_pct
desktop 3658 486 13.29 2710 557.61 74.08
mobile 3660 513 14.02 2707 527.68 73.96
tablet 3385 201 5.94 2301 1,144.78 67.98

37. 채널 기여도 분석

product_views의 referrer_source별로 조회수와 최종 구매 전환 수를 구하세요. 다중 채널 고객의 경우 마지막 접촉(last-touch) 기준으로 크레딧을 부여합니다.

힌트 1: - 고객별 마지막 referrer_source를 먼저 구함 (최근 product_view) - 해당 채널에 전환 크레딧 부여

정답
WITH last_touch AS (
    SELECT
        pv.customer_id,
        pv.referrer_source,
        ROW_NUMBER() OVER (
            PARTITION BY pv.customer_id
            ORDER BY pv.viewed_at DESC
        ) AS rn
    FROM product_views AS pv
    INNER JOIN orders AS o ON pv.customer_id = o.customer_id
    WHERE o.status NOT IN ('cancelled', 'returned', 'return_requested')
),
channel_views AS (
    SELECT
        referrer_source,
        COUNT(*) AS total_views,
        COUNT(DISTINCT customer_id) AS unique_viewers
    FROM product_views
    GROUP BY referrer_source
),
channel_conversions AS (
    SELECT
        referrer_source,
        COUNT(DISTINCT customer_id) AS conversions
    FROM last_touch
    WHERE rn = 1
    GROUP BY referrer_source
)
SELECT
    cv.referrer_source,
    cv.total_views,
    cv.unique_viewers,
    COALESCE(cc.conversions, 0) AS last_touch_conversions,
    ROUND(100.0 * COALESCE(cc.conversions, 0) / cv.unique_viewers, 2) AS conversion_rate_pct
FROM channel_views AS cv
LEFT JOIN channel_conversions AS cc ON cv.referrer_source = cc.referrer_source
ORDER BY last_touch_conversions DESC;

실행 결과 (6행)

referrer_source total_views unique_viewers last_touch_conversions conversion_rate_pct
search 114,520 3652 995 27.25
direct 65,716 3603 541 15.02
ad 49,030 3549 411 11.58
recommendation 49,429 3544 405 11.43
social 32,707 3402 291 8.55
email 16,469 2929 150 5.12

38. 활성일 아일랜드 탐지

고객별 product_views에서 연속 활성일(island)을 찾으세요. 예: 고객이 월/화/수 조회 -> 목 미조회 -> 금/토 조회이면 2개 아일랜드(3일, 2일).

힌트 1: - 고객별 조회 날짜를 DISTINCT로 추출 - DATE - ROW_NUMBER 패턴으로 연속 날짜 그룹화 - 그룹별 MIN/MAX/COUNT

정답
WITH active_days AS (
    SELECT DISTINCT
        customer_id,
        SUBSTR(viewed_at, 1, 10) AS view_date
    FROM product_views
    WHERE customer_id <= 200
),
with_rn AS (
    SELECT
        customer_id,
        view_date,
        JULIANDAY(view_date) - ROW_NUMBER() OVER (
            PARTITION BY customer_id ORDER BY view_date
        ) AS grp
    FROM active_days
),
islands AS (
    SELECT
        customer_id,
        MIN(view_date) AS island_start,
        MAX(view_date) AS island_end,
        COUNT(*) AS island_days
    FROM with_rn
    GROUP BY customer_id, grp
    HAVING COUNT(*) >= 3
)
SELECT
    c.name AS customer_name,
    i.island_start,
    i.island_end,
    i.island_days
FROM islands AS i
INNER JOIN customers AS c ON i.customer_id = c.id
ORDER BY i.island_days DESC, c.name
LIMIT 20;
WITH active_days AS (
    SELECT DISTINCT
        customer_id,
        SUBSTR(viewed_at, 1, 10) AS view_date
    FROM product_views
    WHERE customer_id <= 200
),
with_rn AS (
    SELECT
        customer_id,
        view_date,
        CAST(view_date AS DATE) - ROW_NUMBER() OVER (
            PARTITION BY customer_id ORDER BY view_date
        ) AS grp
    FROM active_days
),
islands AS (
    SELECT
        customer_id,
        MIN(view_date) AS island_start,
        MAX(view_date) AS island_end,
        COUNT(*) AS island_days
    FROM with_rn
    GROUP BY customer_id, grp
    HAVING COUNT(*) >= 3
)
SELECT
    c.name AS customer_name,
    i.island_start,
    i.island_end,
    i.island_days
FROM islands i
INNER JOIN customers c ON i.customer_id = c.id
ORDER BY i.island_days DESC, c.name
FETCH FIRST 20 ROWS ONLY;
WITH active_days AS (
    SELECT DISTINCT
        customer_id,
        SUBSTRING(viewed_at, 1, 10) AS view_date
    FROM product_views
    WHERE customer_id <= 200
),
with_rn AS (
    SELECT
        customer_id,
        view_date,
        DATEADD(DAY, -ROW_NUMBER() OVER (
            PARTITION BY customer_id ORDER BY view_date
        ), CAST(view_date AS DATE)) AS grp
    FROM active_days
),
islands AS (
    SELECT
        customer_id,
        MIN(view_date) AS island_start,
        MAX(view_date) AS island_end,
        COUNT(*) AS island_days
    FROM with_rn
    GROUP BY customer_id, grp
    HAVING COUNT(*) >= 3
)
SELECT TOP 20
    c.name AS customer_name,
    i.island_start,
    i.island_end,
    i.island_days
FROM islands AS i
INNER JOIN customers AS c ON i.customer_id = c.id
ORDER BY i.island_days DESC, c.name;

실행 결과 (총 20행 중 상위 7행)

customer_name island_start island_end island_days
이영자 2016-01-09 2016-02-17 40
김병철 2020-02-28 2020-03-25 27
김민재 2016-03-05 2016-03-29 25
김병철 2017-10-13 2017-11-06 25
이영자 2016-04-18 2016-05-12 25
이영철 2018-02-19 2018-03-15 25
강명자 2021-08-30 2021-09-21 23

39. 월간 종합 대시보드

한 쿼리로 2024년 월별 종합 대시보드를 만드세요: 매출, 주문 수, 신규 고객 수, 활성 고객 수, 평균 주문 금액, 전월 대비 매출 증감률, 매출 1위 상품명.

힌트 1: - 주문/고객/상품 통계를 각각 CTE로 준비 - 월 기준으로 모두 JOIN - LAG로 전월 대비 증감률 - ROW_NUMBER로 월별 매출 1위 상품

정답
WITH monthly_orders AS (
    SELECT
        SUBSTR(ordered_at, 1, 7) AS year_month,
        COUNT(*) AS order_count,
        ROUND(SUM(total_amount), 0) AS revenue,
        ROUND(AVG(total_amount), 0) AS avg_order_value,
        COUNT(DISTINCT customer_id) AS active_customers
    FROM orders
    WHERE ordered_at LIKE '2024%'
      AND status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY SUBSTR(ordered_at, 1, 7)
),
new_customers AS (
    SELECT
        SUBSTR(created_at, 1, 7) AS year_month,
        COUNT(*) AS new_customer_count
    FROM customers
    WHERE created_at LIKE '2024%'
    GROUP BY SUBSTR(created_at, 1, 7)
),
top_products AS (
    SELECT
        SUBSTR(o.ordered_at, 1, 7) AS year_month,
        p.name AS product_name,
        ROUND(SUM(oi.quantity * oi.unit_price), 0) AS product_revenue,
        ROW_NUMBER() OVER (
            PARTITION BY SUBSTR(o.ordered_at, 1, 7)
            ORDER BY SUM(oi.quantity * oi.unit_price) DESC
        ) AS rn
    FROM order_items AS oi
    INNER JOIN orders AS o ON oi.order_id = o.id
    INNER JOIN products AS p ON oi.product_id = p.id
    WHERE o.ordered_at LIKE '2024%'
      AND o.status NOT IN ('cancelled', 'returned', 'return_requested')
    GROUP BY SUBSTR(o.ordered_at, 1, 7), p.id, p.name
),
with_growth AS (
    SELECT
        mo.year_month,
        mo.revenue,
        mo.order_count,
        COALESCE(nc.new_customer_count, 0) AS new_customers,
        mo.active_customers,
        mo.avg_order_value,
        LAG(mo.revenue) OVER (ORDER BY mo.year_month) AS prev_revenue,
        ROUND(100.0 * (mo.revenue - LAG(mo.revenue) OVER (ORDER BY mo.year_month))
            / NULLIF(LAG(mo.revenue) OVER (ORDER BY mo.year_month), 0), 1) AS mom_growth_pct,
        tp.product_name AS top_product
    FROM monthly_orders AS mo
    LEFT JOIN new_customers AS nc ON mo.year_month = nc.year_month
    LEFT JOIN top_products AS tp ON mo.year_month = tp.year_month AND tp.rn = 1
)
SELECT
    year_month,
    revenue,
    order_count,
    new_customers,
    active_customers,
    avg_order_value,
    mom_growth_pct,
    top_product
FROM with_growth
ORDER BY year_month;

실행 결과 (총 12행 중 상위 7행)

year_month revenue order_count new_customers active_customers avg_order_value mom_growth_pct top_product
2024-01 288,908,320.00 314 52 269 920,090.00 NULL Razer Blade 18 블랙
2024-02 403,127,749.00 416 48 335 969,057.00 39.50 Razer Blade 16 실버
2024-03 519,844,502.00 555 71 421 936,657.00 29.00 Razer Blade 16 실버
2024-04 451,877,581.00 466 53 353 969,694.00 -13.10 ASUS ROG Swift PG32UCDM 실버
2024-05 425,264,478.00 385 43 315 1,104,583.00 -5.90 Razer Blade 18 블랙
2024-06 362,715,211.00 389 68 324 932,430.00 -14.70 ASUS ROG Strix Scar 16
2024-07 343,929,897.00 381 62 311 902,703.00 -5.20 Razer Blade 18 블랙

40. JSON 스펙 조회

products 테이블의 specs(JSON) 칼럼에서 CPU 정보를 추출하세요. specs가 NULL이 아닌 상품의 이름과 CPU 정보를 표시합니다.

힌트 1: - SQLite: JSON_EXTRACT(specs, '$.cpu') 사용 - PostgreSQL: specs->>'cpu' 사용 - WHERE specs IS NOT NULL 필터

정답
SELECT
    name,
    brand,
    price,
    JSON_EXTRACT(specs, '$.cpu') AS cpu,
    JSON_EXTRACT(specs, '$.ram') AS ram,
    JSON_EXTRACT(specs, '$.storage') AS storage
FROM products
WHERE specs IS NOT NULL
  AND JSON_EXTRACT(specs, '$.cpu') IS NOT NULL
ORDER BY price DESC
LIMIT 20;
SELECT
    name,
    brand,
    price,
    JSON_EXTRACT(specs, '$.cpu') AS cpu,
    JSON_EXTRACT(specs, '$.ram') AS ram,
    JSON_EXTRACT(specs, '$.storage') AS storage
FROM products
WHERE specs IS NOT NULL
  AND JSON_EXTRACT(specs, '$.cpu') IS NOT NULL
ORDER BY price DESC
LIMIT 20;
SELECT
    name,
    brand,
    price,
    specs->>'cpu' AS cpu,
    specs->>'ram' AS ram,
    specs->>'storage' AS storage
FROM products
WHERE specs IS NOT NULL
  AND specs->>'cpu' IS NOT NULL
ORDER BY price DESC
LIMIT 20;
SELECT
    name,
    brand,
    price,
    JSON_VALUE(specs, '$.cpu') AS cpu,
    JSON_VALUE(specs, '$.ram') AS ram,
    JSON_VALUE(specs, '$.storage') AS storage
FROM products
WHERE specs IS NOT NULL
  AND JSON_VALUE(specs, '$.cpu') IS NOT NULL
ORDER BY price DESC
FETCH FIRST 20 ROWS ONLY;
SELECT TOP 20
    name,
    brand,
    price,
    JSON_VALUE(specs, '$.cpu') AS cpu,
    JSON_VALUE(specs, '$.ram') AS ram,
    JSON_VALUE(specs, '$.storage') AS storage
FROM products
WHERE specs IS NOT NULL
  AND JSON_VALUE(specs, '$.cpu') IS NOT NULL
ORDER BY price DESC;

실행 결과 (총 20행 중 상위 7행)

name brand price cpu ram storage
MacBook Air 15 M3 실버 Apple 5,481,100.00 Intel Core i9-13900H 8GB 256GB
Razer Blade 18 블랙 Razer 4,353,100.00 Intel Core i7-13700H 8GB 1024GB
Razer Blade 16 실버 Razer 3,702,900.00 AMD Ryzen 9 7945HX 32GB 512GB
ASUS ROG Strix G16CH 화이트 ASUS 3,671,500.00 AMD Ryzen 5 7600X 16GB 2048GB
ASUS ROG Zephyrus G16 ASUS 3,429,900.00 Apple M3 16GB 512GB
ASUS ROG Strix GT35 ASUS 3,296,800.00 Intel Core i7-13700K 64GB 2048GB
Razer Blade 18 블랙 Razer 2,987,500.00 Apple M3 8GB 256GB