filmov
tv
Webinar recording: How to tune Postgres autovacuum to improve performance and reduce bloat
Показать описание
Find the presentation slide deck and all other shared material below!
On November 16th, 2022, we hosted our webinar on how to tune Postgres autovacuum settings to ensure that VACUUM runs at the right time, and is effective for your workload. We also talked about the two main goals of VACUUM: Preventing table bloat and avoiding Transaction ID and MultiXact wraparound.
In detail, we talked about
1. the physicality of tables
2. The Transaction ID clock
3. All-visible pages and slow queries
4. Understanding autovacuum scheduling
5. Cost delay 101
6. Aggressive and Failsafe VACUUMs
7. Dead Tuples Not Yet Removable
8. VACUUMing in the cloud - RDS, Aurora, Cloud SQL & AlloyDB
9. Estimating and fixing table bloat
We were happy to answer over 50 questions in the Q&A session and the post-webinar follow-up emails. Come join us live for our next webinar!
Articles and material we talked about in the webinar (in chronological order):
- - - - - - - - - - - - - -
Following, we are sharing some of the responses to the questions we could not fit into our live Q&A session with everyone as they might prove to be useful to all of you, not just the original questioner:
- - -
"Does that improve the duration of autovacuum if we change the scale to 0 and set the threshold to 10,000 rows?" - by Prateej G.
Changing the scale factor and threshold does not impact the runtime of a particular VACUUM - it only impacts the scheduling of the VACUUM. To influence the runtime you would need to adjust the cost delay / cost limit settings.
- - -
- - -
"How should we set autovacuum settings for partitioned tables?" - by Mahesh S.
Generally, autovacuum will run on each table partition individually, so its more a question of the workload, rather than the table being a partitioned table. That said, a common pattern is to use partitioning for append-only tables. If you can, make sure you run on Postgres 13 or newer, so you have the insert-based threshold vacuums. On older Postgres versions its otherwise likely you would only get anti-wraparound vacuums on append-only tables.
- - -
- - -
"Is it possible to track/monitor autovacuum frequency on RDS Postgres?" - by Abdul M.
Yes, you can track autovacuum frequency either through the logs, or through the pg_stat_user_tables statistics table. We also have this built-in to pganalyze (which integrates directly with Amazon RDS).
- - -
- - -
"We have a large table with a lot of dead tuples. When this table was not used by others, we ran vacuum full. But dead tuples are still there. What should the problem be?" - by Tuğçe D.
As discussed, you may be running into Dead Tuples Not Yet Removable. I would try a VACUUM VERBOSE (without FULL) and look at the log output to see if these dead tuples are considered "not yet removable". If they are, look for long running transactions, open replication slots, etc.
- - -
- - -
"Are there alternatives to pg_repack that don't require a command line tool (that you would recommend)?" - by Todd B.
You could look into pg_squeeze - its newer (and so not as well tested as pg_repack), but works solely as a Postgres extension, and uses logical replication instead of triggers for keeping a log of changes during the rewrite.
- - -
On November 16th, 2022, we hosted our webinar on how to tune Postgres autovacuum settings to ensure that VACUUM runs at the right time, and is effective for your workload. We also talked about the two main goals of VACUUM: Preventing table bloat and avoiding Transaction ID and MultiXact wraparound.
In detail, we talked about
1. the physicality of tables
2. The Transaction ID clock
3. All-visible pages and slow queries
4. Understanding autovacuum scheduling
5. Cost delay 101
6. Aggressive and Failsafe VACUUMs
7. Dead Tuples Not Yet Removable
8. VACUUMing in the cloud - RDS, Aurora, Cloud SQL & AlloyDB
9. Estimating and fixing table bloat
We were happy to answer over 50 questions in the Q&A session and the post-webinar follow-up emails. Come join us live for our next webinar!
Articles and material we talked about in the webinar (in chronological order):
- - - - - - - - - - - - - -
Following, we are sharing some of the responses to the questions we could not fit into our live Q&A session with everyone as they might prove to be useful to all of you, not just the original questioner:
- - -
"Does that improve the duration of autovacuum if we change the scale to 0 and set the threshold to 10,000 rows?" - by Prateej G.
Changing the scale factor and threshold does not impact the runtime of a particular VACUUM - it only impacts the scheduling of the VACUUM. To influence the runtime you would need to adjust the cost delay / cost limit settings.
- - -
- - -
"How should we set autovacuum settings for partitioned tables?" - by Mahesh S.
Generally, autovacuum will run on each table partition individually, so its more a question of the workload, rather than the table being a partitioned table. That said, a common pattern is to use partitioning for append-only tables. If you can, make sure you run on Postgres 13 or newer, so you have the insert-based threshold vacuums. On older Postgres versions its otherwise likely you would only get anti-wraparound vacuums on append-only tables.
- - -
- - -
"Is it possible to track/monitor autovacuum frequency on RDS Postgres?" - by Abdul M.
Yes, you can track autovacuum frequency either through the logs, or through the pg_stat_user_tables statistics table. We also have this built-in to pganalyze (which integrates directly with Amazon RDS).
- - -
- - -
"We have a large table with a lot of dead tuples. When this table was not used by others, we ran vacuum full. But dead tuples are still there. What should the problem be?" - by Tuğçe D.
As discussed, you may be running into Dead Tuples Not Yet Removable. I would try a VACUUM VERBOSE (without FULL) and look at the log output to see if these dead tuples are considered "not yet removable". If they are, look for long running transactions, open replication slots, etc.
- - -
- - -
"Are there alternatives to pg_repack that don't require a command line tool (that you would recommend)?" - by Todd B.
You could look into pg_squeeze - its newer (and so not as well tested as pg_repack), but works solely as a Postgres extension, and uses logical replication instead of triggers for keeping a log of changes during the rewrite.
- - -
Комментарии