SQL Tutorial v4.1 2026-04-16

A hands-on SQL tutorial using a realistic e-commerce database (30 tables · 750K rows).

Learn SQL by querying 10 years of business data from TechShop, a fictional online store selling computers and peripherals. Progress systematically through 27 lessons · 910 exercises, from basics to advanced topics. Supports SQLite, MySQL, PostgreSQL, Oracle, and SQL Server — all examples and exercises include DB-specific tabs for comparison.

About the Author

Youngje Ahn (civilian7@gmail.com) · Fullbit Computing · CEO

I first encountered programming in 1983 with an 8-bit computer and BASIC, and have been a professional programmer since 1989. Over the years, I've worked across finance, government, healthcare, IoT, and real estate, building programs in many languages. Among them, Delphi has been my primary language for over 30 years, starting from version 1.0. I run the Delphi community Delmadang.

Online, I go by "Country Programmer" — a name I adopted while living in the countryside of Hongcheon, Gangwon Province. After encountering AI, the creative drive I had in my twenties came rushing back, and I moved back to the city. I currently run Fullbit Computing in Hanam, Gyeonggi Province, developing a universal query browser called ezQuery.

For feature suggestions, bug reports, or inquiries, please contact civilian7@gmail.com.

What is ezQuery?

ezQuery is a universal query browser that manages multiple databases — SQLite, MySQL, PostgreSQL, Oracle, SQL Server, Tibero, and more — through a single interface. Built with Delphi, it uses DirectX (Direct2D) rendering to smoothly display large datasets.

Fast — Native binary + DirectX rendering for instant scrolling through hundreds of thousands of rows
Keyboard-friendly — Monaco editor-based SQL editor with shortcut-driven workflow
AI Assistant — AI that automatically recognizes schema context for SQL generation, error correction, and execution plan analysis
All-in-one — Table designer, ERD viewer, visual query builder, data import/export
Multi-DB — Designed from the ground up with a DB abstraction layer, making it easy to add new databases
Multilingual — Supports Korean and English by default, with architecture designed for easy addition of other languages

We plan to include this tutorial's database as a sample. The product is currently in development, and we'll announce it separately upon release. We plan to offer a free Community Edition and a paid Pro Edition with advanced features.

Why I Created This Tutorial

It started while creating a sample database for ezQuery. As I worked on it, I recalled the frustrations of my junior days — wanting to learn SQL but having nowhere to practice with real data. So I expanded it beyond a simple sample into a proper tutorial, and I'm sharing it in hopes of helping anyone with the same struggle.

Collaboration with AI

This tutorial was created in collaboration with Claude (Anthropic). From the data model, generator code, lessons, exercises, DB-specific DDL/views/stored procedures, to this document itself — every step involved working with AI.

However, we didn't simply use AI-generated output as-is. At each stage, roles were clearly divided:

Stage	Human	AI
Ideas	Direction setting, requirements definition	Structure proposals, initial drafts
Code	Review/revision, execution/debugging	Code generation, DB syntax conversion
Data	Business logic validation	Bulk data generation/conversion
Documentation	Wrote the first lesson directly (baseline)	Pattern learning, then expanding the rest
Verification	Final judgment	Automated checking of number mismatches, broken links, DDL differences

We repeated this cycle hundreds of times to improve quality.

flowchart TD A["💡 Ideas Direction setting + Structure proposals"] --> B["⌨️ Code Review/revision + Generation/conversion"] B --> C["🗄️ Data Generation Logic validation + Bulk generation"] C --> D["📝 Documentation Baseline writing + Pattern expansion"] D --> E["✅ Verification Final judgment + Automated checks"] E -->|"Design change"| A E -->|"Code fix"| B E -->|"Data correction"| C E -->|"Doc revision"| D

The human focuses on "Is this correct?" while the AI handles "fast and at scale." Thanks to this combination, work that would have taken months alone was completed in a much shorter time.

Why Python?

My primary languages are Delphi and C/C++, but the data generator is written in Python.

Faker library — The only option for generating realistic fake data (Korean names, addresses, phone numbers) by locale
Minimal barrier to entry — Readers are here to learn SQL. They may not know Delphi and have no reason to purchase a commercial license. With Python, anyone can start with a single pip install
Cross-platform — Works identically on Windows, macOS, and Linux
AI collaboration efficiency — Python is the language AI handles best. Collaboration speed is maximized at every stage: code generation, refactoring, and debugging

The right tool for the right job. For a one-time task like data generation, Python — with its rich ecosystem and high accessibility — was the optimal choice.

Why This Tutorial?

SQL is hard to learn from books alone. Memorizing syntax doesn't enable you to write queries, and 10-row samples don't build real-world intuition. This tutorial is a "learn by querying" resource.

Realistic data — 750K rows of data from a growing online store over 10 years includes sales growth trends, year-end peaks, customer churn, NULLs, and outliers. Not textbook-clean data, but the real thing you encounter in practice.

Five databases at once — Solve the same problems in SQLite, MySQL, PostgreSQL, Oracle, and SQL Server. Compare DB-specific syntax with a single tab switch, building SQL skills that aren't tied to any particular database.

Data you build yourself — A seed-based generator is included, letting you freely adjust data size, language, and noise. You can use the tutorial's data as-is or create your own.

Traditional Resources	This Tutorial
Syntax explanations only, no practice data	750K rows with 10-year growth curves, seasonality, and customer behavior
Covers only one DB	SQLite, MySQL, PostgreSQL, Oracle, SQL Server — same problems across five DBs
Answers only	Full answers + explanations + result tables for all 910 problems
Fixed data, no customization	Seed-based generator — freely adjust size, language, and noise
Grammar-listing format	Practice centered on shopping mall business scenarios
English or Korean only	Both data and documentation in Korean/English

Supported Databases

This tutorial supports five databases. Each has different characteristics — choose based on your needs, or try multiple DBs.

SQLite

A single file is the database. No server installation needed — you can use it immediately with just the Python standard library, no pip install required. This is the default DB for this tutorial.

Pros	Cons
No installation needed, deploy by copying a file	Limited concurrent writes (single writer)
Lightweight and fast (embedded)	No user management or access control
Supports most of the SQL standard	No stored procedure support
Embedded in mobile and desktop apps	Not suitable for large-scale concurrent access

MySQL / MariaDB

The world's most widely used open-source RDBMS. It's the standard for web services, and MariaDB is a compatible fork of MySQL. The MySQL SQL in this tutorial runs as-is on MariaDB.

Pros	Cons
Rich ecosystem, hosting support	Some SQL standard non-compliance (FULL OUTER JOIN, etc.)
Excellent read performance, easy replication	Subquery optimization weaker than PG
Convenience features like ENUM, AUTO_INCREMENT	CHECK constraints only enforced from 8.0.16+
Cloud support: AWS RDS, Cloud SQL, etc.	Window functions only from 8.0+

PostgreSQL

The open-source RDBMS with the highest standards compliance. It excels in complex queries, JSON processing, and extensibility, and is especially popular in data analytics and geospatial (PostGIS) domains.

Pros	Cons
Highest SQL standard compliance	May be slightly slower than MySQL for simple reads
Rich types: JSONB, arrays, range types	Configuration somewhat complex for beginners
Native Materialized Views, partitioning	Default replication setup more complex than MySQL
PL/pgSQL, custom types, extension modules	Less web hosting support than MySQL

Oracle

The #1 commercial RDBMS in the enterprise market. Excels in large-scale transaction processing and high availability, widely used in finance, government, and enterprise systems.

Pros	Cons
Enterprise-grade stability and performance	High commercial license costs
Powerful server-side logic with PL/SQL	Syntax differs significantly from other DBs
Advanced features: partitioning, RAC, Data Guard	Complex installation and administration
Rich built-in functions (NVL, DECODE, ROWNUM, etc.)	Free edition (XE) has resource limitations

SQL Server

Microsoft's commercial RDBMS, tightly integrated with the .NET ecosystem and Windows environments. Uses T-SQL (Transact-SQL) extended syntax.

Pros	Cons
Perfect integration with Windows/.NET	Linux support only from 2017 (relatively recent)
Powerful SSMS (Management Studio) admin tool	Commercial license costs (Express is free)
T-SQL extensions (TOP, ISNULL, CONVERT, etc.)	Some differences from SQL standard syntax
Azure SQL cloud integration	Uses TOP/OFFSET FETCH instead of LIMIT

Support Scope

DB	Data Generation	Exercise SQL Tabs	Notes
SQLite			Default DB, single file
MySQL			DDL + INSERT SQL files, `--apply` supported
PostgreSQL			DDL + INSERT SQL files, `--apply` supported
Oracle			PL/SQL, IDENTITY, partitioning, 18 views, 16 SP + 5 functions
SQL Server			T-SQL, NVARCHAR, partitioning, 18 views, 14 SP

Oracle / SQL Server Users

Use python -m src.cli.generate --target oracle or --target sqlserver to generate DDL + INSERT + stored procedure SQL files. Exercises also provide Oracle (NVL, FETCH FIRST N ROWS ONLY, etc.) and SQL Server (TOP, ISNULL, CONVERT, etc.) answer tabs where DB syntax differs.

Future Support Plans

Database	Status	Notes
DB2	Under review	Legacy in finance and government
CUBRID	Under review	Korean open-source RDBMS
Tibero	Under review	Korean commercial RDBMS, Oracle-compatible

The data generator's architecture separates DB-specific exporters as plugins, so adding a new DB only requires writing DDL/data conversion modules. Contributions are welcome.

What You'll Learn

Beginner — 7 Lessons · 62 Problems

Learn the most fundamental SQL syntax. After this stage, you'll be able to query, filter, sort, and aggregate data from a single table. No programming experience required to start.

#	Lesson	What You Learn	What You Can Do
00	Introduction to Databases and SQL	DB concepts, tables/rows/columns	Understand DB structure and run queries in a SQL tool
01	SELECT Basics	Column selection, aliases, DISTINCT	Select only the columns you need
02	Filtering with WHERE	Comparison operators, AND/OR, IN, LIKE, BETWEEN	Extract only rows matching conditions
03	Sorting and Paging	ORDER BY, LIMIT, OFFSET	Sort results in desired order and extract top N rows
04	Aggregate Functions	COUNT, SUM, AVG, MIN, MAX	Calculate summary statistics: totals, sums, averages
05	GROUP BY and HAVING	Group-level aggregation, group filtering	Analyze by category, month, or other groupings
06	NULL Handling	IS NULL, COALESCE, IFNULL	Properly handle empty values and set defaults

Intermediate — 11 Lessons · 84 Problems

Combine multiple tables and transform/manipulate data. After this stage, you'll be able to write most SQL needed in practice — covering JOINs, subqueries, DML, DDL, and transactions.

#	Lesson	What You Learn	What You Can Do
07	INNER JOIN	ON conditions, multiple JOINs	Combine orders + customers + products in a single query
08	LEFT JOIN	Outer joins, NULL matching	Find customers with no orders, products with no reviews
09	Subqueries	Scalar, inline views, WHERE clause	Extract products above average price, top buyers
10	CASE Expressions	Conditional branching, pivoting, categorization	Classify by price range, convert rows to columns
11	Date/Time Functions	Extraction, calculation, formatting, DB differences	Monthly sales trends, days from signup to first order
12	String Functions	SUBSTR, REPLACE, CONCAT	Extract email domains, process product names
13	UNION	Union, INTERSECT, EXCEPT	Combine results from multiple queries
14	INSERT, UPDATE, DELETE	Data insertion, modification, deletion	Register new customers, bulk price changes, data cleanup
15	DDL	CREATE TABLE, ALTER, constraints	Create/modify tables, set PK/FK/CHECK
16	Transactions and ACID	BEGIN, COMMIT, ROLLBACK	Process orders + payments atomically, rollback on failure
17	SELF JOIN and CROSS JOIN	Self-reference, Cartesian product	Query manager hierarchies, generate date × category combinations

Advanced — 8 Lessons · 124 Problems

Expand to production-level analytical queries, performance tuning, and DB design. After this stage, you'll be able to calculate rankings, cumulative totals, and moving averages with window functions, read execution plans, design indexes, and write stored procedures.

#	Lesson	What You Learn	What You Can Do
18	Window Functions	ROW_NUMBER, RANK, LAG/LEAD	Ranking, month-over-month growth, cumulative sums
19	CTEs and Recursive CTEs	WITH clause, recursive hierarchy traversal	Make complex queries readable, traverse category trees
20	EXISTS and Correlated Subqueries	EXISTS/NOT EXISTS patterns	Determine existence of related data matching conditions
21	Views	CREATE VIEW, Materialized View	Save frequently-used complex queries as views for reuse
22	Indexes and Performance	EXPLAIN, index design	Find causes of slow queries and improve with indexes
23	Triggers	BEFORE/AFTER triggers	Automatically log history and validate on data changes
24	JSON Data Queries	JSON extraction/filtering, DB differences	Extract values from JSON columns, conditional searches
25	Stored Procedures	PL/pgSQL, MySQL procedures	Encapsulate business logic inside the database

Exercises — 30 Sets · 640 Problems

Apply the syntax learned in lessons to real business data. Every problem includes an answer, explanation, and result table.

Beginner Exercises (4 Sets · 62 Problems)

Practice SELECT, WHERE, aggregation, GROUP BY, and NULL from beginner lessons (00-06) with real data. Build the ability to accurately extract desired data from a single table.

Set	Required Knowledge	Purpose	What You Gain
Product Search	SELECT, WHERE, ORDER BY, LIMIT	Search products with various condition combinations	Practical sense for filtering, sorting, and paging
Customer Analysis	Aggregate functions, GROUP BY, HAVING	Calculate customer statistics by tier, gender, signup channel	Ability to analyze data by grouping
Order Basics	GROUP BY, HAVING, date filtering	Summarize order data by period and status	Aggregate queries that form the basis of business reports
Fill in the Blanks	All beginner topics	Complete incomplete SQL by filling in blanks	Solidify syntax structures in your mind

Intermediate Exercises (7 Sets · 84 Problems)

Combine JOINs, subqueries, CASE, DML, DDL, and transactions from intermediate lessons (07-17). Build practical ability to combine, transform, and manipulate data across multiple tables.

Set	Required Knowledge	Purpose	What You Gain
JOIN Master	INNER/LEFT/RIGHT JOIN	Combine orders + customers + products + payments in various ways	Ability to judge which JOIN to use when
Date/Time Analysis	Date functions, GROUP BY	Monthly sales trends, day-of-week patterns, period calculations	Practical patterns for working with time-series data
Subqueries & Transformations	Subqueries, EXISTS, CASE	Above-average products, tier classification, data transformation	Ability to solve complex conditions within queries
Constraint Experience	DDL, PK/FK/CHECK/UNIQUE	Try INSERTs that violate constraints and observe errors	Understanding how constraints ensure data integrity
Transactions	BEGIN, COMMIT, ROLLBACK	Experience success/failure scenarios firsthand	Feel the meaning of atomicity and the need for rollback
SQL Debugging	All intermediate topics	Find and fix intentionally broken queries	Debugging ability to read errors and identify causes
Data Quality Checks	NULL, JOIN, aggregation	Detect NULLs, duplicates, and referential integrity violations	Practical patterns for verifying data consistency

Advanced Exercises (12 Sets · 124 Problems)

Apply window functions, CTEs, EXISTS, views, indexes, triggers, JSON, and stored procedures from advanced lessons (18-25) to production-level business scenarios — covering analytical report writing, performance tuning, and technical interview preparation.

Set	Required Knowledge	Purpose	What You Gain
Sales Analysis	Window functions, CTE	Monthly growth rates, moving averages, cumulative sales	Sales analysis queries suitable for executive reports
Customer Segmentation	NTILE, CTE, CASE, JOIN	RFM analysis, cohorts, tier migration patterns	Customer classification techniques used in marketing targeting
Inventory Management	Window functions, subqueries	Track in/out movements, safety stock calculation, ABC classification	Inventory management queries used by operations teams
CS Performance Analysis	JOIN, aggregation, date functions	Per-agent throughput, resolution rates, SLA compliance	Ability to measure CS team performance numerically
Practical SQL Patterns	Window functions, CTE, CASE	Pivoting, gap detection, consecutive ranges, missing data imputation	Patterns for solving common real-world challenges
Understanding Normalization	DDL, FK, design concepts	Identify 1NF-3NF, denormalization trade-offs	Ability to judge table design principles and exceptions
Business Scenarios	Full range, composite	Write composite queries for real business decisions	Ability to answer "What can we learn from this data?"
Query Optimization	EXPLAIN, indexes	EXPLAIN analysis, index design, query refactoring	Ability to find and fix causes of slow queries
Interview Prep	Full range	Commonly asked SQL problems in technical interviews	Confidence to write queries in front of an interviewer
Advanced Analytics	Recursive CTE, JSON, window functions	Hierarchy traversal, JSON processing, composite analysis	Ability to use advanced SQL features fluently
Challenge Problems	Full range, composite	High-difficulty problems combining multiple concepts	Ability to solve complex requirements as queries under time pressure
Execution Plan Analysis	EXPLAIN, indexes, join strategies	Read EXPLAIN output and identify bottlenecks	Ability to diagnose "why a query is slow"

Study Tips

Follow the lessons in numerical order. Each lesson builds on what was covered before.

Type it yourself — Typing by hand sticks in memory better than copy-pasting.
Always solve the review problems — Each lesson ends with problems that check the key concepts.
Modify the queries — Change conditions, add columns, or intentionally cause errors. Understanding why the results change is the key.
Write the same result in different ways — Try different approaches using subqueries, JOINs, CTEs, etc. to feel the pros and cons of each syntax.
Get in the habit of reading error messages — SQL error messages usually pinpoint the cause and location precisely. Read the message before searching online.
Run on other DBs too — Running queries learned in SQLite on MySQL and PostgreSQL helps you feel the differences. Use the DB-specific tabs in each lesson.

Recommended Study Schedule

Adjust to your own pace. Based on 1-2 hours per day.

2 Weeks (Intensive)4 Weeks (Standard)8 Weeks (Relaxed)

For those who want to quickly cover everything. When you can invest 2+ hours per day.

Week	Days	Content
1	Days 1-3	Beginner lessons 00-06 + Beginner exercises
	Days 4-5	Intermediate lessons 07-12 (JOIN, subqueries, CASE, date/string)
	Days 6-7	Intermediate lessons 13-17 (UNION, DML, DDL, transactions) + Intermediate exercises
2	Days 1-3	Advanced lessons 18-21 (window functions, CTE, EXISTS, views)
	Days 4-5	Advanced lessons 22-25 (indexes, triggers, JSON, SP)
	Days 6-7	Intensive advanced exercises

Recommended for working professionals and students. 1 hour/day, 5 days/week.

Week	Days	Content
1	Mon-Fri	Beginner lessons 00-06 + Beginner exercises
2	Mon-Wed	Intermediate lessons 07-11 (JOIN, subqueries, CASE, date/time)
	Thu-Fri	Intermediate lessons 12-14 (string, UNION, DML)
3	Mon-Wed	Intermediate lessons 15-17 (DDL, transactions, SELF/CROSS JOIN) + Intermediate exercises
	Thu-Fri	Advanced lessons 18-19 (window functions, CTE)
4	Mon-Wed	Advanced lessons 20-25 (EXISTS, views, indexes, triggers, JSON, SP)
	Thu-Fri	Intensive advanced exercises

For those new to SQL. 30 min - 1 hour/day, 3-4 days/week.

Week	Content
Weeks 1-2	Beginner lessons 00-06 (1 lesson/day)
Week 3	Beginner exercises (4 sets) + Review
Weeks 4-5	Intermediate lessons 07-12 (JOIN - string functions)
Week 6	Intermediate lessons 13-17 (UNION - SELF JOIN) + Intermediate exercises
Week 7	Advanced lessons 18-25
Week 8	Intensive advanced exercises + Full review

License

Tutorial documents and generated data — CC BY 4.0 (free to use, modify, and redistribute with attribution)
Source code (generator, scripts, etc.) — MIT License

Changelog

Version	Date	Key Changes
v4.1	2026-04-16	SQL Playground (CodeMirror + sql.js, usage guide)
v4.0	2026-04-16	1,065 exercises YAML consolidation, DB comment system, tutorial quality overhaul
v3.6	2026-04-16	Docker-based DB installation guide, Oracle --apply, interactive wizard
v3.5	2026-04-16	Project restructure (src/ consolidation), Oracle/SQL Server SQL answers for 30 problems, SQLite script output, bug fixes
v3.4	2026-04-15	Oracle / SQL Server data generation exporters, full 5 DB support
v3.3	2026-04-15	Oracle / SQL Server exercise SQL tabs
v3.2	2026-04-14	26 lesson review YAML integration (270 problems), total 910 problems
v3.0	2026-04-12	27 lessons · 640 exercises (30 sets) complete, full English translation, persona learning paths, MkDocs build automation, PDF export
v2.1	2026-04-11	Document structure overhaul (schema/setup split), 18 MySQL/PG views added, expanded to 15 stored procedures, trigger/view/SP flowcharts, DB selection guide, study schedule
v2.0	2026-04-01	Full renewal — 30 tables, 3 DB simultaneous support, Korean/English support, seed-based generator
v1.0	2025-12-01	Initial release — SQLite only, 10 tables, basic lessons

Get Started →