Are we really needed a smart statement timeout?

Intro

If we make 'JOB' benchmark results with specific preferences (script):

max_parallel_workers_per_gather = 0
statement_timeout = 600000

we will see result on the graph below (gnuplot template, raw data):

As you can see, we have 22 queries, that can't be executed in reasonable time: 14a.sql, 14c.sql, 19a.sql, 19c.sql, 21a.sql, 21b.sql, 22a.sql, 22b.sql, 22c.sql, 22d.sql, 25a.sql, 25c.sql, 28b.sql, 29a.sql, 29b.sql, 29c.sql, 30c.sql, 31a.sql, 31b.sql, 31c.sql, 7a.sql, 7b.sql.

This Situation looks better with the parallel workers machinery usage, but for clarity we turned it off. What if we had a machinery to learn on a partial query plan, interrupted by a statement_timeout? We should have a chance to recognize situations, when we were interrupted under AQO control and have an option to make one more attempt with, possible, different query plan.

Use learning on partial plans

After enabling the partial learning feature (aqo.learn_query_statement = 'on') we got such a picture for timed out queries:

Test Number	Query	Execution time, s	Interrupts
15	14a.sql	8	1
17	14c.sql	8	1
32	18a.sql	13	0+1
37	19c.sql	18	0+1
46	21a.sql	7	3+1
47	21b.sql	5	1
49	22a.sql	7	3+1+1
51	22c.sql	418	7
52	22d.sql	10	3+2
58	25a.sql	12	1+1
60	25c.sql	16	4
67	28a.sql	306	1
70	29a.sql	14	0+1+1
72	29c.sql	14	0+2+2+2+1
77	30a.sql	13	1+1
78	30b.sql	13	1
79	30c.sql	15	1+3+1
80	31a.sql	14	1+5+2
81	31b.sql	86	1+1+1
82	31c.sql	15	3+5+1
103	7a.sql	9	1+1

A set of timed out queries slightly changed (why?). All queries fit into 10min limit now. Interruptions in the table - number of learnings on timed out queries. Learning process on some queries increased execution time. So, after successful execution we observe interruptions too. In the table we use '+' to show the fact of timed out query after successful execution. Also, for the most part of queries optimizer has found a good solution.

Speedup factor as a result of AQO learning procedure shown at the picture below. PG vs AQO comparison

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Are we really needed a smart statement timeout?

Intro

Use learning on partial plans

Clone this wiki locally