データベースについて勉強(クエリーを書く編)

select文の書く順番

  • 取得カラムを指定: select節
  • 対象のテーブルを指定: from節
  • 絞り込みの条件を指定: where節
  • グループ化の条件を指定: group by節
  • グループ化した後の絞り込み条件の指定: having節
  • 並び換えの条件を指定: order by節
  • 取得条件の行数を指定: limit節

select文の実行順番

  • 対象のテーブルを指定: from節
  • 絞り込みの条件を指定: where節
  • グループ化の条件を指定: group by節
  • 取得カラムを指定: select節
  • 集約関数による集約
  • グループ化した後の絞り込み条件の指定: having節
  • 並び換えの条件を指定: order by節
  • 取得条件の行数を指定: limit節

単純なSQLの例

注文量が多い顧客トップ10

SELECT 
  customer_id
  , SUM(order_amount) AS sum_orders
FROM `orders` 
GROUP BY customer_id 
ORDER BY sum_orders DESC 
LIMIT 10;

+---------------+--------------+
|   customer_id | sum_orders   |
|---------------+--------------|
|         44802 | 138220       |
|         73302 | 129500       |
|         18476 | 126550       |
|         84062 | 120040       |
|         88878 | 117310       |
|         74888 | 116410       |
|         84003 | 115630       |
|          3852 | 111790       |
|         46976 | 110300       |
|         81033 | 109350       |
+---------------+--------------+

注文回数が多い顧客トップ10

SELECT 
  customer_id
  , COUNT(order_amount) AS cnt_orders
FROM `orders` 
GROUP BY customer_id 
ORDER BY cnt_orders DESC 
LIMIT 10;

+---------------+--------------+
|   customer_id |   cnt_orders |
|---------------+--------------|
|         20854 |           14 |
|         44802 |           12 |
|         15811 |           11 |
|         43514 |           11 |
|         29160 |           10 |
|         27966 |           10 |
|         84003 |           10 |
|         37264 |           10 |
|         39370 |           10 |
|         81161 |           10 |
+---------------+--------------+

購入された中で単価が高い商品トップ10

SELECT * 
FROM`orders` 
ORDER BY order_amount DESC 
LIMIT 10;
+------------+---------------------+-----------+---------------+----------------+
|   order_id | order_time          |   shop_id |   customer_id |   order_amount |
|------------+---------------------+-----------+---------------+----------------|
|       4907 | 2012-07-13 21:46:43 |        20 |         63496 |          90270 |
|       4385 | 2012-07-04 10:19:15 |        27 |          3988 |          89160 |
|       4643 | 2012-07-08 21:57:18 |        27 |         94938 |          89160 |
|      17936 | 2013-07-12 23:41:26 |        15 |          3220 |          88440 |
|      18545 | 2013-07-21 11:43:05 |        15 |         49411 |          88440 |
|       4878 | 2012-07-13 11:27:32 |        12 |         84062 |          85740 |
|      30422 | 2014-04-22 10:16:30 |        12 |         47756 |          85740 |
|      22974 | 2013-11-06 07:54:57 |        21 |          3852 |          80680 |
|      36850 | 2014-07-12 13:06:43 |        21 |         77181 |          72820 |
|      48964 | 2014-12-24 14:03:02 |        12 |         89736 |          63930 |
+------------+---------------------+-----------+---------------+----------------+

アクセス数が多い上位10件の月を出力

日付の関数がsqliteと違うので注意。

SELECT 
  date_format(request_time, '%Y-%m') AS month
  , COUNT(*) AS cnt
FROM access_log
GROUP BY month
HAVING COUNT(*) > 100000
ORDER BY cnt DESC LIMIT 10;

+---------+--------+
| month   |    cnt |
|---------+--------|
| 2014-12 | 353323 |
| 2014-07 | 326572 |
| 2014-06 | 309823 |
| 2014-11 | 282565 |
| 2014-08 | 262916 |
| 2013-12 | 258500 |
| 2013-07 | 241836 |
| 2014-05 | 229900 |
| 2013-06 | 214118 |
| 2013-11 | 185971 |
+---------+--------+

クロス集計

数が多くて見にくいので、上位10件のみ表示。

SELECT
  date_format(request_time, '%Y-%m') AS month
  , request_path
  , COUNT(DISTINCT customer_id) AS cnt
FROM
  (SELECT * FROM access_log LIMIT 1000000) AS tmp
GROUP BY
  month
  , request_path
ORDER BY cnt DESC
LIMIT 10
;

+---------+----------------+-------+
| month   | request_path   |   cnt |
|---------+----------------+-------|
| 2012-07 | /search        | 20506 |
| 2012-06 | /search        | 18278 |
| 2012-08 | /search        | 16542 |
| 2012-07 | /              | 15607 |
| 2012-05 | /search        | 14996 |
| 2012-06 | /              | 13705 |
| 2012-08 | /              | 12319 |
| 2012-05 | /              | 10983 |
| 2012-11 | /search        | 10835 |
| 2012-04 | /search        | 10554 |
+---------+----------------+-------+

Join

正規化でばらした複数のテーブルを1つにまとめるときに使う。

joinしたtableからアクセス数の多い顧客を10名

SELECT 
  customer_name
  , COUNT(*) AS cnt 
FROM access_log as a
  join customers as c 
  on a.customer_id = c.customer_id 
GROUP BY customer_name 
ORDER BY cnt DESC 
LIMIT 10;

+-----------------+-------+
| customer_name   |   cnt |
|-----------------+-------|
| 黒木 優         |   567 |
| 岩田 優         |   461 |
| 柴田 なつみ     |   442 |
| 根本 優         |   387 |
| 中岡 陽子       |   383 |
| 芹沢 ひろ子     |   382 |
| 畑中 優         |   378 |
| 小出 愛梨       |   371 |
| 藤井 優         |   365 |
| 松山 恵梨香     |   364 |
+-----------------+-------+

2つのテーブルとjoin

誰がいつ検索しているか。

SELECT
  a.request_time
  , c.customer_name
FROM
  access_log as a
  inner join web_pages p
  on a.request_path = p.request_path
  join customers as c
  on a.customer_id = c.customer_id
where
  p.page_category = 'search'
LIMIT 10
;

+---------------------+-----------------+
| request_time        | customer_name   |
|---------------------+-----------------|
| 2012-01-01 00:00:02 | 堀川 沙知絵     |
| 2012-01-01 00:04:07 | 堀川 沙知絵     |
| 2012-01-01 00:03:20 | 森本 美幸       |
| 2012-01-01 00:05:10 | 遠藤 玲那       |
| 2012-01-01 00:05:52 | 遠藤 玲那       |
| 2012-01-01 00:07:11 | 遠藤 玲那       |
| 2012-01-01 00:08:17 | 岩村 璃子       |
| 2012-01-01 00:09:57 | 岩村 璃子       |
| 2012-01-01 00:12:18 | 長谷部 一恵     |
| 2012-01-01 00:12:37 | 長谷部 一恵     |
+---------------------+-----------------+

サブクエリ

SELECTした結果もテーブルと見れば、FROMの対象にもできるよね?という感じ。

SELECT
  s.shop_name
  , m.sales_month
  , m.sales_amount
FROM
  (
    SELECT
      shop_id
      , date_format(order_time, '%Y-%m') AS sales_month
      , SUM(order_amount) as sales_amount
    FROM orders
    GROUP BY shop_id, sales_month
  ) as m
  inner join shops as s on m.shop_id = s.shop_id
HAVING m.sales_amount > 1800000
ORDER BY
  s.shop_id, m.sales_month
;

+-------------+---------------+----------------+
| shop_name   | sales_month   | sales_amount   |
|-------------+---------------+----------------|
| アズーレス  | 2014-07       | 2047020        |
| アズーレス  | 2014-12       | 2231810        |
| スフィンク  | 2014-12       | 1953140        |
+-------------+---------------+----------------+