I’ve a ~1 billion row desk referred to as
issues. The issues transition by a number of states (tracked within the
state column), finally ending up in both a
completed rows nearly by no means change, however we have to learn them now and again. The info may be very chilly. Rows in different states change so much, in order that they result in numerous bloat and autovacuums.
We’re utilizing postgres 12, so this looks as if a great case for declarative partitioning. It seems like a small-hot desk for energetic issues is smart, then transfer the rows to bigger chilly partitions as soon as they cease altering.
This is a simplified construction we have thought-about:
CREATE SEQUENCE things_id_seq START WITH 1 INCREMENT BY 1 NO MINVALUE NO MAXVALUE CACHE 1; CREATE TABLE issues ( id integer NOT NULL DEFAULT nextval('things_id_seq'), account_id integer NOT NULL, state textual content NOT NULL, worth integer NOT NULL ) PARTITION BY LIST (state); CREATE TABLE things_default PARTITION OF issues DEFAULT; CREATE TABLE things_finished PARTITION OF issues FOR VALUES IN ('completed') PARTITION BY LIST (account_id); CREATE TABLE things_finished_default PARTITION OF things_finished DEFAULT; CREATE TABLE things_finished_account_id_1 PARTITION OF things_finished FOR VALUES IN (1); CREATE TABLE things_finished_account_id_2 PARTITION OF things_finished FOR VALUES IN (2); CREATE TABLE things_broken PARTITION OF issues FOR VALUES IN ('damaged'); CREATE INDEX things_id ON issues USING btree (id); CREATE INDEX things_state ON issues USING btree (state); CREATE INDEX things_account ON issues USING btree (account_id); CREATE INDEX things_value ON issues USING btree (worth);
Over half of the issues are in a
completed state, in order that chilly partition remains to be fairly massive. To maintain it from rising too massive, we have included sub-partitions for some massive accounts.
Is possible it’s to re-partition if we have to regulate the construction down the monitor, or do we have to get the partitioning scheme proper at the beginning? At present we handle to use practically all schema modifications with no user-visible downtime – is it potential to re-partition with out long-held locks that influence person site visitors?
For instance, what if I wish to add a brand new sub-partition to the things_finished desk as a result of an account has grown fairly massive? This appears to work:
BEGIN; CREATE TABLE things_finished_account_id_3 ( like issues together with all ); INSERT INTO things_finished_account_id_3 SELECT * FROM issues WHERE state = 'completed' and account_id = 3; DELETE FROM issues WHERE state = 'completed' AND account_id = 3; ALTER TABLE things_finished ATTACH PARTITION things_finished_account_id_3 FOR VALUES IN (3); COMMIT;
ALTER TABLE takes out a
AccessExclusiveLock on the
things_finished_default desk, blocking all reads and writes to that partition for the final a part of the transaction.
Perhaps another partitioning scheme can be simpler to re-partition on a dwell database?