Learning to Read PostgreSQL Execution Plans, Part 3 (Wait! Is That JOIN Actually Getting Pruned?)
This is koreshiki, swamped with work lately.
For some reason things have been insanely busy. Ever since I switched to working from home, whenever a task drags on I can never find the right moment to call it a day, so I just keep going.
While playing around with PostgreSQL I ran into an interesting phenomenon, so I wrote it up.
First, set up a partitioned table like the following.
CREATE TABLE part_table(
    yyyymm varchar(6),
    col1 int,
    col2 int,
    col3 int,
    PRIMARY KEY(yyyymm, col1)
) PARTITION BY RANGE (yyyymm);

CREATE INDEX idx_part_table_col3 ON part_table(col3);

CREATE TABLE part_table_202301 PARTITION OF part_table FOR VALUES FROM ('202301') TO ('202302');
CREATE TABLE part_table_202302 PARTITION OF part_table FOR VALUES FROM ('202302') TO ('202303');
CREATE TABLE part_table_202303 PARTITION OF part_table FOR VALUES FROM ('202303') TO ('202304');
CREATE TABLE part_table_202304 PARTITION OF part_table FOR VALUES FROM ('202304') TO ('202305');
CREATE TABLE part_table_202305 PARTITION OF part_table FOR VALUES FROM ('202305') TO ('202306');
CREATE TABLE part_table_202306 PARTITION OF part_table FOR VALUES FROM ('202306') TO ('202307');
CREATE TABLE part_table_202307 PARTITION OF part_table FOR VALUES FROM ('202307') TO ('202308');
CREATE TABLE part_table_202308 PARTITION OF part_table FOR VALUES FROM ('202308') TO ('202309');
CREATE TABLE part_table_202309 PARTITION OF part_table FOR VALUES FROM ('202309') TO ('202310');
CREATE TABLE part_table_202310 PARTITION OF part_table FOR VALUES FROM ('202310') TO ('202311');
CREATE TABLE part_table_202311 PARTITION OF part_table FOR VALUES FROM ('202311') TO ('202312');
CREATE TABLE part_table_202312 PARTITION OF part_table FOR VALUES FROM ('202312') TO ('202401');
The table is partitioned by month — a pretty common layout for partitioned tables.
Now let's INSERT some deliberately skewed data into it.
db01=# select yyyymm, count(*), count(case when col3 = 49 then 1 else null end) cnt
       from part_table group by yyyymm;
 yyyymm | count  |  cnt
--------+--------+-------
 202301 | 100000 | 50000
 202302 | 100000 |     1
 202303 | 100000 | 50000
 202304 | 100000 | 50000
 202305 | 100000 | 50000
 202306 | 100000 | 50000
 202307 | 100000 | 50000
 202308 | 100000 | 50000
 202309 | 100000 | 50000
 202310 | 100000 | 50000
 202311 | 100000 | 50000
 202312 | 100000 | 50000
(12 rows)
This is a pretty extreme example, but the point is that every child partition contains rows matching the condition we'll be searching for.
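The article doesn't show the actual INSERT statements; a hypothetical sketch that produces roughly this distribution with generate_series might look like:

```sql
-- 100,000 rows per month; every other row matches col3 = 49 (50,000 matches).
INSERT INTO part_table
SELECT '202301', i, i, CASE WHEN i % 2 = 0 THEN 49 ELSE 0 END
FROM generate_series(1, 100000) AS i;

-- For 202302, only a single row matches.
INSERT INTO part_table
SELECT '202302', i, i, CASE WHEN i = 1 THEN 49 ELSE 0 END
FROM generate_series(1, 100000) AS i;

-- ...repeat the 50/50 pattern for 202303 through 202312.
```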
Let's create one more table:
CREATE TABLE purchase_history(
    purchase_id serial,
    item_id int,
    store_id int,
    purchase_cnt int,
    purchase_date date,
    PRIMARY KEY(purchase_id)
);

CREATE INDEX idx_purchase_history_purchase_date ON purchase_history(purchase_date);
The amount of data we'll be targeting is small — under 200 rows:

db01=# select count(*) from purchase_history
       where purchase_date between '2023-02-01' and '2023-02-15' and item_id = 49;
 count
-------
   156
Now for the main topic.
Consider joining the purchase_history and part_table above. The SQL is as follows.
SELECT *
FROM purchase_history ph
LEFT JOIN part_table pt
    ON  pt.yyyymm = to_char(ph.purchase_date, 'yyyymm')
    AND pt.col3 = ph.item_id
WHERE ph.purchase_date between '2023-02-01' and '2023-02-15'
  AND ph.item_id = 49;
The LEFT JOIN is simply there to pin down the join order.
Since the query has the condition ph.purchase_date between '2023-02-01' and '2023-02-15', intuitively the right-hand side of pt.yyyymm = to_char(ph.purchase_date, 'yyyymm') should work out to 202302, so you'd expect partition pruning to kick in.
So, let's look at the execution plan.
1   Merge Left Join  (cost=123482.47..144553.66 rows=875196 width=39) (actual time=739.149..739.309 rows=156 loops=1)
2     Output: ph.purchase_id, ph.item_id, ph.store_id, ph.purchase_cnt, ph.purchase_date, pt.yyyymm, pt.col1, pt.col2, pt.col3
3     Merge Cond: ((to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text)) = (pt.yyyymm)::text)
4     Join Filter: (pt.col3 = ph.item_id)
5     Buffers: shared hit=5 read=17863, temp read=621 written=2033
6     ->  Sort  (cost=40291.75..40292.54 rows=318 width=20) (actual time=106.316..106.332 rows=156 loops=1)
7           Output: ph.purchase_id, ph.item_id, ph.store_id, ph.purchase_cnt, ph.purchase_date, (to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text))
8           Sort Key: (to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text))
9           Sort Method: quicksort  Memory: 37kB
10          Buffers: shared hit=4 read=13046
11          ->  Bitmap Heap Scan on part_test.purchase_history ph  (cost=418.19..40278.53 rows=318 width=20) (actual time=6.108..105.860 rows=156 loops=1)
12                Output: ph.purchase_id, ph.item_id, ph.store_id, ph.purchase_cnt, ph.purchase_date, to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text)
13                Recheck Cond: ((ph.purchase_date >= '2023-02-01'::date) AND (ph.purchase_date <= '2023-02-15'::date))
14                Filter: (ph.item_id = 49)
15                Rows Removed by Filter: 15516
16                Heap Blocks: exact=13034
17                Buffers: shared hit=4 read=13046
18                ->  Bitmap Index Scan on idx_purchase_history_purchase_date  (cost=0.00..418.11 rows=30968 width=0) (actual time=2.336..2.336 rows=15672 loops=1)
19                      Index Cond: ((ph.purchase_date >= '2023-02-01'::date) AND (ph.purchase_date <= '2023-02-15'::date))
20                      Buffers: shared hit=3 read=13
21    ->  Materialize  (cost=83190.72..85942.91 rows=550438 width=19) (actual time=579.755..615.324 rows=50157 loops=1)
22          Output: pt.yyyymm, pt.col1, pt.col2, pt.col3
23          Buffers: shared hit=1 read=4817, temp read=621 written=2033
24          ->  Sort  (cost=83190.72..84566.82 rows=550438 width=19) (actual time=579.748..597.670 rows=50002 loops=1)
25                Output: pt.yyyymm, pt.col1, pt.col2, pt.col3
26                Sort Key: pt.yyyymm
27                Sort Method: external merge  Disk: 16152kB
28                Buffers: shared hit=1 read=4817, temp read=621 written=2033
29                ->  Append  (cost=0.00..19414.85 rows=550438 width=19) (actual time=0.025..306.028 rows=550001 loops=1)
30                      Buffers: shared hit=1 read=4817
31                      ->  Seq Scan on part_test.part_table_202301 pt_1  (cost=0.00..1887.00 rows=49697 width=19) (actual time=0.024..28.968 rows=50000 loops=1)
32                            Output: pt_1.yyyymm, pt_1.col1, pt_1.col2, pt_1.col3
33                            Filter: (pt_1.col3 = 49)
34                            Rows Removed by Filter: 50000
35                            Buffers: shared read=637
36                      ->  Bitmap Heap Scan on part_test.part_table_202302 pt_2  (cost=4.31..12.00 rows=2 width=19) (actual time=0.055..0.056 rows=1 loops=1)
37                            Output: pt_2.yyyymm, pt_2.col1, pt_2.col2, pt_2.col3
38                            Recheck Cond: (pt_2.col3 = 49)
39                            Heap Blocks: exact=1
40                            Buffers: shared hit=1 read=2
41                            ->  Bitmap Index Scan on part_test_202302_col3_idx  (cost=0.00..4.31 rows=2 width=0) (actual time=0.048..0.048 rows=1 loops=1)
42                                  Index Cond: (pt_2.col3 = 49)
43                                  Buffers: shared read=2
44                      ->  Seq Scan on part_test.part_table_202303 pt_3  (cost=0.00..1887.00 rows=50583 width=19) (actual time=0.017..29.543 rows=50000 loops=1)
45                            Output: pt_3.yyyymm, pt_3.col1, pt_3.col2, pt_3.col3
46                            Filter: (pt_3.col3 = 49)
47                            Rows Removed by Filter: 50000
48                            Buffers: shared read=637
49                      ->  Seq Scan on part_test.part_table_202304 pt_4  (cost=0.00..1887.00 rows=50023 width=19) (actual time=0.015..28.921 rows=50000 loops=1)
50                            Output: pt_4.yyyymm, pt_4.col1, pt_4.col2, pt_4.col3
51                            Filter: (pt_4.col3 = 49)
52                            Rows Removed by Filter: 50000
53                            Buffers: shared read=637
54                      ->  Index Scan using part_test_202305_col3_idx on part_test.part_table_202305 pt_5  (cost=0.29..1378.32 rows=50230 width=19) (actual time=0.025..15.366 rows=50000 loops=1)
55                            Output: pt_5.yyyymm, pt_5.col1, pt_5.col2, pt_5.col3
56                            Index Cond: (pt_5.col3 = 49)
57                            Buffers: shared read=363
58                      ->  Index Scan using part_test_202306_col3_idx on part_test.part_table_202306 pt_6  (cost=0.29..1376.45 rows=50123 width=19) (actual time=0.027..18.216 rows=50000 loops=1)
59                            Output: pt_6.yyyymm, pt_6.col1, pt_6.col2, pt_6.col3
60                            Index Cond: (pt_6.col3 = 49)
61                            Buffers: shared read=363
62                      ->  Index Scan using part_test_202307_col3_idx on part_test.part_table_202307 pt_7  (cost=0.29..1369.32 rows=49830 width=19) (actual time=0.024..15.687 rows=50000 loops=1)
63                            Output: pt_7.yyyymm, pt_7.col1, pt_7.col2, pt_7.col3
64                            Index Cond: (pt_7.col3 = 49)
65                            Buffers: shared read=363
66                      ->  Index Scan using part_test_202308_col3_idx on part_test.part_table_202308 pt_8  (cost=0.29..1373.77 rows=50027 width=19) (actual time=0.029..20.799 rows=50000 loops=1)
67                            Output: pt_8.yyyymm, pt_8.col1, pt_8.col2, pt_8.col3
68                            Index Cond: (pt_8.col3 = 49)
69                            Buffers: shared read=363
70                      ->  Index Scan using part_test_202309_col3_idx on part_test.part_table_202309 pt_9  (cost=0.29..1376.22 rows=50110 width=19) (actual time=0.043..14.840 rows=50000 loops=1)
71                            Output: pt_9.yyyymm, pt_9.col1, pt_9.col2, pt_9.col3
72                            Index Cond: (pt_9.col3 = 49)
73                            Buffers: shared read=363
74                      ->  Index Scan using part_test_202310_col3_idx on part_test.part_table_202310 pt_10  (cost=0.29..1370.02 rows=49870 width=19) (actual time=0.077..23.957 rows=50000 loops=1)
75                            Output: pt_10.yyyymm, pt_10.col1, pt_10.col2, pt_10.col3
76                            Index Cond: (pt_10.col3 = 49)
77                            Buffers: shared read=363
78                      ->  Index Scan using part_test_202311_col3_idx on part_test.part_table_202311 pt_11  (cost=0.29..1372.94 rows=49980 width=19) (actual time=0.022..17.228 rows=50000 loops=1)
79                            Output: pt_11.yyyymm, pt_11.col1, pt_11.col2, pt_11.col3
80                            Index Cond: (pt_11.col3 = 49)
81                            Buffers: shared read=363
82                      ->  Index Scan using part_test_202312_col3_idx on part_test.part_table_202312 pt_12  (cost=0.29..1372.64 rows=49963 width=19) (actual time=0.062..16.716 rows=50000 loops=1)
83                            Output: pt_12.yyyymm, pt_12.col1, pt_12.col2, pt_12.col3
84                            Index Cond: (pt_12.col3 = 49)
85                            Buffers: shared read=363
86  Planning:
87    Buffers: shared hit=89
88  Planning Time: 3.635 ms
89  Execution Time: 743.087 ms
(89 rows)
Looking at the plan, you can see the join is performed with a Merge Join.
A Merge Join first sorts both inputs on the join key and then merges them in key order. Sure enough, Sort nodes appear on lines 6 and 24.
And line 27 shows Sort Method: external merge Disk: 16152kB, which means work_mem was too small and the sort spilled to temporary files on disk.
It's a bit of a detour, but while we're here let's raise work_mem and see whether performance improves.
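For reference, work_mem can be raised for just the current session; the 400MB below is simply the value used in the next experiment:

```sql
SET work_mem = '400MB';  -- session-local; undo with RESET work_mem
```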
db01=# show work_mem;
 work_mem
----------
 400MB

1   Merge Left Join  (cost=112191.47..131886.56 rows=875196 width=39) (actual time=638.178..638.317 rows=156 loops=1)
2     Output: ph.purchase_id, ph.item_id, ph.store_id, ph.purchase_cnt, ph.purchase_date, pt.yyyymm, pt.col1, pt.col2, pt.col3
3     Merge Cond: ((to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text)) = (pt.yyyymm)::text)
4     Join Filter: (pt.col3 = ph.item_id)
5     Buffers: shared hit=5 read=17863
6     ->  Sort  (cost=40291.75..40292.54 rows=318 width=20) (actual time=126.135..126.148 rows=156 loops=1)
7           Output: ph.purchase_id, ph.item_id, ph.store_id, ph.purchase_cnt, ph.purchase_date, (to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text))
8           Sort Key: (to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text))
9           Sort Method: quicksort  Memory: 37kB
10          Buffers: shared hit=4 read=13046
11          ->  Bitmap Heap Scan on part_test.purchase_history ph  (cost=418.19..40278.53 rows=318 width=20) (actual time=11.066..125.489 rows=156 loops=1)
12                Output: ph.purchase_id, ph.item_id, ph.store_id, ph.purchase_cnt, ph.purchase_date, to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text)
13                Recheck Cond: ((ph.purchase_date >= '2023-02-01'::date) AND (ph.purchase_date <= '2023-02-15'::date))
14                Filter: (ph.item_id = 49)
15                Rows Removed by Filter: 15516
16                Heap Blocks: exact=13034
17                Buffers: shared hit=4 read=13046
18                ->  Bitmap Index Scan on idx_purchase_history_purchase_date  (cost=0.00..418.11 rows=30968 width=0) (actual time=2.092..2.092 rows=15672 loops=1)
19                      Index Cond: ((ph.purchase_date >= '2023-02-01'::date) AND (ph.purchase_date <= '2023-02-15'::date))
20                      Buffers: shared hit=3 read=13
21    ->  Sort  (cost=71899.72..73275.82 rows=550438 width=19) (actual time=498.414..502.474 rows=50157 loops=1)
22          Output: pt.yyyymm, pt.col1, pt.col2, pt.col3
23          Sort Key: pt.yyyymm
24          Sort Method: quicksort  Memory: 67545kB
25          Buffers: shared hit=1 read=4817
26          ->  Append  (cost=0.00..19414.85 rows=550438 width=19) (actual time=0.021..291.674 rows=550001 loops=1)
27                Buffers: shared hit=1 read=4817
28                ->  Seq Scan on part_test.part_table_202301 pt_1  (cost=0.00..1887.00 rows=49697 width=19) (actual time=0.020..30.481 rows=50000 loops=1)
29                      Output: pt_1.yyyymm, pt_1.col1, pt_1.col2, pt_1.col3
30                      Filter: (pt_1.col3 = 49)
31                      Rows Removed by Filter: 50000
32                      Buffers: shared read=637
33                ->  Bitmap Heap Scan on part_test.part_table_202302 pt_2  (cost=4.31..12.00 rows=2 width=19) (actual time=0.020..0.021 rows=1 loops=1)
34                      Output: pt_2.yyyymm, pt_2.col1, pt_2.col2, pt_2.col3
35                      Recheck Cond: (pt_2.col3 = 49)
36                      Heap Blocks: exact=1
37                      Buffers: shared hit=1 read=2
38                      ->  Bitmap Index Scan on part_test_202302_col3_idx  (cost=0.00..4.31 rows=2 width=0) (actual time=0.016..0.016 rows=1 loops=1)
39                            Index Cond: (pt_2.col3 = 49)
40                            Buffers: shared read=2
41                ->  Seq Scan on part_test.part_table_202303 pt_3  (cost=0.00..1887.00 rows=50583 width=19) (actual time=0.012..29.291 rows=50000 loops=1)
42                      Output: pt_3.yyyymm, pt_3.col1, pt_3.col2, pt_3.col3
43                      Filter: (pt_3.col3 = 49)
44                      Rows Removed by Filter: 50000
45                      Buffers: shared read=637
46                ->  Seq Scan on part_test.part_table_202304 pt_4  (cost=0.00..1887.00 rows=50023 width=19) (actual time=0.012..28.503 rows=50000 loops=1)
47                      Output: pt_4.yyyymm, pt_4.col1, pt_4.col2, pt_4.col3
48                      Filter: (pt_4.col3 = 49)
49                      Rows Removed by Filter: 50000
50                      Buffers: shared read=637
51                ->  Index Scan using part_test_202305_col3_idx on part_test.part_table_202305 pt_5  (cost=0.29..1378.32 rows=50230 width=19) (actual time=0.026..14.182 rows=50000 loops=1)
52                      Output: pt_5.yyyymm, pt_5.col1, pt_5.col2, pt_5.col3
53                      Index Cond: (pt_5.col3 = 49)
54                      Buffers: shared read=363
55                ->  Index Scan using part_test_202306_col3_idx on part_test.part_table_202306 pt_6  (cost=0.29..1376.45 rows=50123 width=19) (actual time=0.022..16.405 rows=50000 loops=1)
56                      Output: pt_6.yyyymm, pt_6.col1, pt_6.col2, pt_6.col3
57                      Index Cond: (pt_6.col3 = 49)
58                      Buffers: shared read=363
59                ->  Index Scan using part_test_202307_col3_idx on part_test.part_table_202307 pt_7  (cost=0.29..1369.32 rows=49830 width=19) (actual time=0.059..17.381 rows=50000 loops=1)
60                      Output: pt_7.yyyymm, pt_7.col1, pt_7.col2, pt_7.col3
61                      Index Cond: (pt_7.col3 = 49)
62                      Buffers: shared read=363
63                ->  Index Scan using part_test_202308_col3_idx on part_test.part_table_202308 pt_8  (cost=0.29..1373.77 rows=50027 width=19) (actual time=0.023..16.160 rows=50000 loops=1)
64                      Output: pt_8.yyyymm, pt_8.col1, pt_8.col2, pt_8.col3
65                      Index Cond: (pt_8.col3 = 49)
66                      Buffers: shared read=363
67                ->  Index Scan using part_test_202309_col3_idx on part_test.part_table_202309 pt_9  (cost=0.29..1376.22 rows=50110 width=19) (actual time=0.021..14.780 rows=50000 loops=1)
68                      Output: pt_9.yyyymm, pt_9.col1, pt_9.col2, pt_9.col3
69                      Index Cond: (pt_9.col3 = 49)
70                      Buffers: shared read=363
71                ->  Index Scan using part_test_202310_col3_idx on part_test.part_table_202310 pt_10  (cost=0.29..1370.02 rows=49870 width=19) (actual time=0.021..16.445 rows=50000 loops=1)
72                      Output: pt_10.yyyymm, pt_10.col1, pt_10.col2, pt_10.col3
73                      Index Cond: (pt_10.col3 = 49)
74                      Buffers: shared read=363
75                ->  Index Scan using part_test_202311_col3_idx on part_test.part_table_202311 pt_11  (cost=0.29..1372.94 rows=49980 width=19) (actual time=0.030..14.975 rows=50000 loops=1)
76                      Output: pt_11.yyyymm, pt_11.col1, pt_11.col2, pt_11.col3
77                      Index Cond: (pt_11.col3 = 49)
78                      Buffers: shared read=363
79                ->  Index Scan using part_test_202312_col3_idx on part_test.part_table_202312 pt_12  (cost=0.29..1372.64 rows=49963 width=19) (actual time=0.045..16.592 rows=50000 loops=1)
80                      Output: pt_12.yyyymm, pt_12.col1, pt_12.col2, pt_12.col3
81                      Index Cond: (pt_12.col3 = 49)
82                      Buffers: shared read=363
83  Planning:
84    Buffers: shared hit=8
85  Planning Time: 1.669 ms
86  Execution Time: 641.769 ms
That made it roughly 100 ms faster.
The real problem with this SQL, though, is that it searches every partition of part_table. Lines 28–82 are where the child partitions of part_table are actually accessed, and you can see that every single one of them gets scanned.
After all of them have been read, the phase that gathers the results (the Append on line 26) takes about 300 ms. Given a total execution time of about 640 ms, this step accounts for roughly half the query — it's heavy.
Come to think of it, a Merge Join cannot take advantage of partition pruning by the very nature of the join method, and the same goes for a Hash Join: both have to consume the entire inner relation up front (to sort or hash it) before a single join-key value from the outer side is available, so there is nothing to prune with.
Let's actually force a Hash Join with a hint. (work_mem is back at the default 4MB.)
/*+ HashJoin(ph pt) */
SELECT *
FROM purchase_history ph
LEFT JOIN part_table pt
    ON  pt.yyyymm = to_char(ph.purchase_date, 'yyyymm')
    AND pt.col3 = ph.item_id
WHERE ph.purchase_date between '2023-02-01' and '2023-02-15'
  AND ph.item_id = 49;
If you're in an environment where pg_hint_plan isn't enabled, disable enable_nestloop and enable_mergejoin instead:

SET enable_nestloop = off;
SET enable_mergejoin = off;
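Conversely, if the pg_hint_plan extension is installed on the server but not preloaded, it can be loaded for just the current session before running the hinted query (assuming the extension is available):

```sql
LOAD 'pg_hint_plan';
```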
This produced the following plan.
1   Hash Left Join  (cost=28089.62..149636.81 rows=875196 width=39) (actual time=711.587..835.828 rows=156 loops=1)
2     Output: ph.purchase_id, ph.item_id, ph.store_id, ph.purchase_cnt, ph.purchase_date, pt.yyyymm, pt.col1, pt.col2, pt.col3
3     Hash Cond: ((ph.item_id = pt.col3) AND (to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text) = (pt.yyyymm)::text))
4     Buffers: shared hit=6384 read=11484, temp read=4407 written=4407
5     ->  Bitmap Heap Scan on part_test.purchase_history ph  (cost=418.19..40278.53 rows=318 width=20) (actual time=6.848..88.646 rows=156 loops=1)
6           Output: ph.purchase_id, ph.item_id, ph.store_id, ph.purchase_cnt, ph.purchase_date
7           Recheck Cond: ((ph.purchase_date >= '2023-02-01'::date) AND (ph.purchase_date <= '2023-02-15'::date))
8           Filter: (ph.item_id = 49)
9           Rows Removed by Filter: 15516
10          Heap Blocks: exact=13034
11          Buffers: shared hit=1566 read=11484
12          ->  Bitmap Index Scan on idx_purchase_history_purchase_date  (cost=0.00..418.11 rows=30968 width=0) (actual time=2.065..2.066 rows=15672 loops=1)
13                Index Cond: ((ph.purchase_date >= '2023-02-01'::date) AND (ph.purchase_date <= '2023-02-15'::date))
14                Buffers: shared hit=3 read=13
15    ->  Hash  (cost=19414.85..19414.85 rows=550438 width=19) (actual time=526.674..526.680 rows=550001 loops=1)
16          Output: pt.yyyymm, pt.col1, pt.col2, pt.col3
17          Buckets: 65536 (originally 65536)  Batches: 64 (originally 16)  Memory Usage: 3585kB
18          Buffers: shared hit=4818, temp written=2440
19          ->  Append  (cost=0.00..19414.85 rows=550438 width=19) (actual time=0.037..277.738 rows=550001 loops=1)
20                Buffers: shared hit=4818
21                ->  Seq Scan on part_test.part_table_202301 pt_1  (cost=0.00..1887.00 rows=49697 width=19) (actual time=0.035..30.808 rows=50000 loops=1)
22                      Output: pt_1.yyyymm, pt_1.col1, pt_1.col2, pt_1.col3
23                      Filter: (pt_1.col3 = 49)
24                      Rows Removed by Filter: 50000
25                      Buffers: shared hit=637
26                ->  Bitmap Heap Scan on part_test.part_table_202302 pt_2  (cost=4.31..12.00 rows=2 width=19) (actual time=0.055..0.057 rows=1 loops=1)
27                      Output: pt_2.yyyymm, pt_2.col1, pt_2.col2, pt_2.col3
28                      Recheck Cond: (pt_2.col3 = 49)
29                      Heap Blocks: exact=1
30                      Buffers: shared hit=3
31                      ->  Bitmap Index Scan on part_test_202302_col3_idx  (cost=0.00..4.31 rows=2 width=0) (actual time=0.041..0.041 rows=1 loops=1)
32                            Index Cond: (pt_2.col3 = 49)
33                            Buffers: shared hit=2
34                ->  Seq Scan on part_test.part_table_202303 pt_3  (cost=0.00..1887.00 rows=50583 width=19) (actual time=0.054..30.651 rows=50000 loops=1)
35                      Output: pt_3.yyyymm, pt_3.col1, pt_3.col2, pt_3.col3
36                      Filter: (pt_3.col3 = 49)
37                      Rows Removed by Filter: 50000
38                      Buffers: shared hit=637
39                ->  Seq Scan on part_test.part_table_202304 pt_4  (cost=0.00..1887.00 rows=50023 width=19) (actual time=0.013..27.193 rows=50000 loops=1)
40                      Output: pt_4.yyyymm, pt_4.col1, pt_4.col2, pt_4.col3
41                      Filter: (pt_4.col3 = 49)
42                      Rows Removed by Filter: 50000
43                      Buffers: shared hit=637
44                ->  Index Scan using part_test_202305_col3_idx on part_test.part_table_202305 pt_5  (cost=0.29..1378.32 rows=50230 width=19) (actual time=0.013..14.305 rows=50000 loops=1)
45                      Output: pt_5.yyyymm, pt_5.col1, pt_5.col2, pt_5.col3
46                      Index Cond: (pt_5.col3 = 49)
47                      Buffers: shared hit=363
48                ->  Index Scan using part_test_202306_col3_idx on part_test.part_table_202306 pt_6  (cost=0.29..1376.45 rows=50123 width=19) (actual time=0.042..13.395 rows=50000 loops=1)
49                      Output: pt_6.yyyymm, pt_6.col1, pt_6.col2, pt_6.col3
50                      Index Cond: (pt_6.col3 = 49)
51                      Buffers: shared hit=363
52                ->  Index Scan using part_test_202307_col3_idx on part_test.part_table_202307 pt_7  (cost=0.29..1369.32 rows=49830 width=19) (actual time=0.013..13.232 rows=50000 loops=1)
53                      Output: pt_7.yyyymm, pt_7.col1, pt_7.col2, pt_7.col3
54                      Index Cond: (pt_7.col3 = 49)
55                      Buffers: shared hit=363
56                ->  Index Scan using part_test_202308_col3_idx on part_test.part_table_202308 pt_8  (cost=0.29..1373.77 rows=50027 width=19) (actual time=0.010..13.714 rows=50000 loops=1)
57                      Output: pt_8.yyyymm, pt_8.col1, pt_8.col2, pt_8.col3
58                      Index Cond: (pt_8.col3 = 49)
59                      Buffers: shared hit=363
60                ->  Index Scan using part_test_202309_col3_idx on part_test.part_table_202309 pt_9  (cost=0.29..1376.22 rows=50110 width=19) (actual time=0.057..13.422 rows=50000 loops=1)
61                      Output: pt_9.yyyymm, pt_9.col1, pt_9.col2, pt_9.col3
62                      Index Cond: (pt_9.col3 = 49)
63                      Buffers: shared hit=363
64                ->  Index Scan using part_test_202310_col3_idx on part_test.part_table_202310 pt_10  (cost=0.29..1370.02 rows=49870 width=19) (actual time=0.014..13.408 rows=50000 loops=1)
65                      Output: pt_10.yyyymm, pt_10.col1, pt_10.col2, pt_10.col3
66                      Index Cond: (pt_10.col3 = 49)
67                      Buffers: shared hit=363
68                ->  Index Scan using part_test_202311_col3_idx on part_test.part_table_202311 pt_11  (cost=0.29..1372.94 rows=49980 width=19) (actual time=0.067..13.069 rows=50000 loops=1)
69                      Output: pt_11.yyyymm, pt_11.col1, pt_11.col2, pt_11.col3
70                      Index Cond: (pt_11.col3 = 49)
71                      Buffers: shared hit=363
72                ->  Index Scan using part_test_202312_col3_idx on part_test.part_table_202312 pt_12  (cost=0.29..1372.64 rows=49963 width=19) (actual time=0.010..13.347 rows=50000 loops=1)
73                      Output: pt_12.yyyymm, pt_12.col1, pt_12.col2, pt_12.col3
74                      Index Cond: (pt_12.col3 = 49)
75                      Buffers: shared hit=363
76  Planning:
77    Buffers: shared hit=8
78  Planning Time: 1.744 ms
79  Execution Time: 836.125 ms
Looking at lines 21–75, this plan also scans every single partition.
Judging from line 17 (Batches: 64), raising work_mem would likely buy some improvement here too. (I won't pursue that this time.)
The execution time is about 840 ms — slightly slower than the Merge Join before the work_mem increase.
Now, enter the star of the show: let's do the join with a Nested Loop Join (NLJ).
/*+ NestLoop(ph pt) */
SELECT *
FROM purchase_history ph
LEFT JOIN part_table pt
    ON  pt.yyyymm = to_char(ph.purchase_date, 'yyyymm')
    AND pt.col3 = ph.item_id
WHERE ph.purchase_date between '2023-02-01' and '2023-02-15'
  AND ph.item_id = 49;
That produced the following plan.
1   Gather  (cost=1418.48..1264124.61 rows=875196 width=39) (actual time=12.900..46.196 rows=156 loops=1)
2     Output: ph.purchase_id, ph.item_id, ph.store_id, ph.purchase_cnt, ph.purchase_date, pt.yyyymm, pt.col1, pt.col2, pt.col3
3     Workers Planned: 2
4     Workers Launched: 2
5     Buffers: shared hit=13462 read=58
6     ->  Nested Loop Left Join  (cost=418.48..1175605.01 rows=364665 width=39) (actual time=5.691..23.056 rows=52 loops=3)
7           Output: ph.purchase_id, ph.item_id, ph.store_id, ph.purchase_cnt, ph.purchase_date, pt.yyyymm, pt.col1, pt.col2, pt.col3
8           Buffers: shared hit=13462 read=58
9           Worker 0:  actual time=0.701..15.192 rows=20 loops=1
10            Buffers: shared hit=1518 read=1
11          Worker 1:  actual time=5.040..16.542 rows=18 loops=1
12            Buffers: shared hit=1727
13          ->  Parallel Bitmap Heap Scan on part_test.purchase_history ph  (cost=418.19..39962.40 rows=132 width=20) (actual time=5.392..21.900 rows=52 loops=3)
14                Output: ph.purchase_id, ph.item_id, ph.store_id, ph.purchase_cnt, ph.purchase_date
15                Recheck Cond: ((ph.purchase_date >= '2023-02-01'::date) AND (ph.purchase_date <= '2023-02-15'::date))
16                Filter: (ph.item_id = 49)
17                Rows Removed by Filter: 5172
18                Heap Blocks: exact=9904
19                Buffers: shared hit=12995 read=55
20                Worker 0:  actual time=0.376..14.547 rows=20 loops=1
21                  Buffers: shared hit=1458
22                Worker 1:  actual time=4.580..15.862 rows=18 loops=1
23                  Buffers: shared hit=1672
24                ->  Bitmap Index Scan on idx_purchase_history_purchase_date  (cost=0.00..418.11 rows=30968 width=0) (actual time=4.599..4.599 rows=15672 loops=1)
25                      Index Cond: ((ph.purchase_date >= '2023-02-01'::date) AND (ph.purchase_date <= '2023-02-15'::date))
26                      Buffers: shared hit=3 read=13
27          ->  Append  (cost=0.29..6602.59 rows=200076 width=19) (actual time=0.017..0.018 rows=1 loops=156)
28                Buffers: shared hit=467 read=3
29                Worker 0:  actual time=0.025..0.026 rows=1 loops=20
30                  Buffers: shared hit=60 read=1
31                Worker 1:  actual time=0.032..0.033 rows=1 loops=18
32                  Buffers: shared hit=55
33                ->  Index Scan using part_test_202301_col3_idx on part_test.part_table_202301 pt_1  (cost=0.29..6.17 rows=2 width=19) (never executed)
34                      Output: pt_1.yyyymm, pt_1.col1, pt_1.col2, pt_1.col3
35                      Index Cond: ((pt_1.col3 = ph.item_id) AND (pt_1.col3 = 49))
36                      Filter: ((pt_1.yyyymm)::text = to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text))
37                ->  Index Scan using part_test_202302_col3_idx on part_test.part_table_202302 pt_2  (cost=0.29..5.94 rows=1 width=19) (actual time=0.008..0.008 rows=1 loops=156)
38                      Output: pt_2.yyyymm, pt_2.col1, pt_2.col2, pt_2.col3
39                      Index Cond: ((pt_2.col3 = ph.item_id) AND (pt_2.col3 = 49))
40                      Filter: ((pt_2.yyyymm)::text = to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text))
41                      Buffers: shared hit=467 read=3
42                      Worker 0:  actual time=0.006..0.007 rows=1 loops=20
43                        Buffers: shared hit=60 read=1
44                      Worker 1:  actual time=0.008..0.008 rows=1 loops=18
45                        Buffers: shared hit=55
46                ->  Index Scan using part_test_202303_col3_idx on part_test.part_table_202303 pt_3  (cost=0.29..6.52 rows=3 width=19) (never executed)
47                      Output: pt_3.yyyymm, pt_3.col1, pt_3.col2, pt_3.col3
48                      Index Cond: ((pt_3.col3 = ph.item_id) AND (pt_3.col3 = 49))
49                      Filter: ((pt_3.yyyymm)::text = to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text))
50                ->  Index Scan using part_test_202304_col3_idx on part_test.part_table_202304 pt_4  (cost=0.29..6.55 rows=3 width=19) (never executed)
51                      Output: pt_4.yyyymm, pt_4.col1, pt_4.col2, pt_4.col3
52                      Index Cond: ((pt_4.col3 = ph.item_id) AND (pt_4.col3 = 49))
53                      Filter: ((pt_4.yyyymm)::text = to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text))
54                ->  Index Scan using part_test_202305_col3_idx on part_test.part_table_202305 pt_5  (cost=0.29..700.06 rows=25115 width=19) (never executed)
55                      Output: pt_5.yyyymm, pt_5.col1, pt_5.col2, pt_5.col3
56                      Index Cond: ((pt_5.col3 = ph.item_id) AND (pt_5.col3 = 49))
57                      Filter: ((pt_5.yyyymm)::text = to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text))
58                ->  Index Scan using part_test_202306_col3_idx on part_test.part_table_202306 pt_6  (cost=0.29..698.60 rows=25062 width=19) (never executed)
59                      Output: pt_6.yyyymm, pt_6.col1, pt_6.col2, pt_6.col3
60                      Index Cond: ((pt_6.col3 = ph.item_id) AND (pt_6.col3 = 49))
61                      Filter: ((pt_6.yyyymm)::text = to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text))
62                ->  Index Scan using part_test_202307_col3_idx on part_test.part_table_202307 pt_7  (cost=0.29..694.56 rows=24915 width=19) (never executed)
63                      Output: pt_7.yyyymm, pt_7.col1, pt_7.col2, pt_7.col3
64                      Index Cond: ((pt_7.col3 = ph.item_id) AND (pt_7.col3 = 49))
65                      Filter: ((pt_7.yyyymm)::text = to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text))
66                ->  Index Scan using part_test_202308_col3_idx on part_test.part_table_202308 pt_8  (cost=0.29..697.26 rows=25013 width=19) (never executed)
67                      Output: pt_8.yyyymm, pt_8.col1, pt_8.col2, pt_8.col3
68                      Index Cond: ((pt_8.col3 = ph.item_id) AND (pt_8.col3 = 49))
69                      Filter: ((pt_8.yyyymm)::text = to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text))
70                ->  Index Scan using part_test_202309_col3_idx on part_test.part_table_202309 pt_9  (cost=0.29..698.41 rows=25055 width=19) (never executed)
71                      Output: pt_9.yyyymm, pt_9.col1, pt_9.col2, pt_9.col3
72                      Index Cond: ((pt_9.col3 = ph.item_id) AND (pt_9.col3 = 49))
73                      Filter: ((pt_9.yyyymm)::text = to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text))
74                ->  Index Scan using part_test_202310_col3_idx on part_test.part_table_202310 pt_10  (cost=0.29..695.11 rows=24935 width=19) (never executed)
75                      Output: pt_10.yyyymm, pt_10.col1, pt_10.col2, pt_10.col3
76                      Index Cond: ((pt_10.col3 = ph.item_id) AND (pt_10.col3 = 49))
77                      Filter: ((pt_10.yyyymm)::text = to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text))
78                ->  Index Scan using part_test_202311_col3_idx on part_test.part_table_202311 pt_11  (cost=0.29..696.62 rows=24990 width=19) (never executed)
79                      Output: pt_11.yyyymm, pt_11.col1, pt_11.col2, pt_11.col3
80                      Index Cond: ((pt_11.col3 = ph.item_id) AND (pt_11.col3 = 49))
81                      Filter: ((pt_11.yyyymm)::text = to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text))
82                ->  Index Scan using part_test_202312_col3_idx on part_test.part_table_202312 pt_12  (cost=0.29..696.40 rows=24982 width=19) (never executed)
83                      Output: pt_12.yyyymm, pt_12.col1, pt_12.col2, pt_12.col3
84                      Index Cond: ((pt_12.col3 = ph.item_id) AND (pt_12.col3 = 49))
85                      Filter: ((pt_12.yyyymm)::text = to_char((ph.purchase_date)::timestamp with time zone, 'yyyymm'::text))
86  Planning:
87    Buffers: shared hit=8
88  Planning Time: 1.111 ms
89  Execution Time: 46.365 ms
46 ms — fast!! Blazing fast!!
Let's dig into why.
The part to focus on is lines 33–85, where part_table is accessed.
Every child partition other than part_table_202302 — that is, everything outside the search range — is marked (never executed).
That's the optimizer concluding from the conditions that those child partitions never need to be touched.
In other words, the partition pruning that Merge Join and Hash Join could not do by the nature of those join methods just works with an NLJ.
As a result, the Append on line 27 also completes quickly, in 0.018 ms.
One caveat, though: that 0.018 ms is the average execution time per loop iteration.
An NLJ is essentially a for loop, iterating once per row of the outer input — you can tell from loops=156.
So the actual total is about 0.018 ms × 156 ≈ 2.8 ms. Still plenty fast.
Flip that around, though: if a single iteration is slow, or the loop count is extremely high, an NLJ becomes slow.
Even if one iteration takes only 0.01 ms, a million iterations adds up to 10 seconds.
So when using an NLJ, what matters is keeping the loop count small (= reducing the number of outer rows) and keeping each iteration cheap.
For the former, an appropriate index on the outer table is effective.
Here I only indexed purchase_date, but for this query a composite index on (item_id, purchase_date) would probably have worked even better.
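A sketch of that composite index (the index name here is made up):

```sql
CREATE INDEX idx_purchase_history_item_date
    ON purchase_history (item_id, purchase_date);
```

With this, the Bitmap Heap Scan's Filter step on item_id (which threw away 15,516 rows) should no longer be necessary, since both predicates can be evaluated from the index.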
For the latter, the key is having an index on the join condition — which is exactly what the index on col3 provides here.
Even so, it's striking that despite this much difference in execution time, PostgreSQL still chooses the Merge Join. (I got the same result even after VACUUM FULL + ANALYZE.)
Pinning execution plans with hints has its downsides, but in cases like this it's worth using them aggressively.
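One hint-free alternative worth knowing: if the target month is known at the time the SQL is assembled, adding a redundant literal predicate on the partition key lets the planner prune partitions at plan time, regardless of the join method. A hypothetical rewrite (not part of the original experiment):

```sql
SELECT *
FROM purchase_history ph
LEFT JOIN part_table pt
    ON  pt.yyyymm = to_char(ph.purchase_date, 'yyyymm')
    AND pt.col3 = ph.item_id
    AND pt.yyyymm = '202302'  -- literal on the partition key enables plan-time pruning
WHERE ph.purchase_date between '2023-02-01' and '2023-02-15'
  AND ph.item_id = 49;
```

Note that because this is a LEFT JOIN, the literal belongs in the ON clause; putting it in WHERE would filter out the unmatched outer rows.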
As an aside, people often used to say that MySQL joins are slow because it can only do NLJ*1 — but used correctly, an NLJ can be plenty fast too.