Skip to content
代码片段 群组 项目
提交 cf14f419 编辑于 作者: Marius Bobin's avatar Marius Bobin 提交者: Grzegorz Bizon
浏览文件

Add foreign key findings to CI data decay

上级 e43c6d12
No related branches found
No related tags found
无相关合并请求
...@@ -323,9 +323,116 @@ scope block takes an argument). Preloading instance dependent scopes is not ...@@ -323,9 +323,116 @@ scope block takes an argument). Preloading instance dependent scopes is not
supported. supported.
``` ```
We also need to build a proof of concept for removing data on the PostgreSQL ### Foreign keys
side (using foreign keys with `ON DELETE CASCADE`) and removing data through
Rails associations, as this might be an important area of uncertainty. Foreign keys must reference columns that either are a primary key or form a
unique constraint. We can define them using these strategies:
#### Between routing tables sharing partition ID
For relations that are part of the same pipeline hierarchy it is possible to
share the `partition_id` column to define the foreign key constraint:
```plaintext
p_ci_pipelines:
- id
- partition_id
p_ci_builds:
- id
- partition_id
- pipeline_id
```
In this case, `p_ci_builds.partition_id` indicates the partition for the build
and also for the pipeline. We can add a FK on the routing table using:
```sql
ALTER TABLE ONLY p_ci_builds
ADD CONSTRAINT fk_on_pipeline_and_partition
FOREIGN KEY (pipeline_id, partition_id)
REFERENCES p_ci_pipelines(id, partition_id) ON DELETE CASCADE;
```
#### Between routing tables with different partition IDs
It's not possible to reuse the `partition_id` for all relations in the CI domain,
so in this case we'll need to store the value as a different attribute. For
example, when canceling redundant pipelines we store on the old pipeline row
the ID of the new pipeline that cancelled it as `auto_canceled_by_id`:
```plaintext
p_ci_pipelines:
- id
- partition_id
- auto_canceled_by_id
- auto_canceled_by_partition_id
```
In this case we can't ensure that the canceling pipeline is part of the same
hierarchy as the canceled pipelines, so we need an extra attribute to store its
partition, `auto_canceled_by_partition_id`, and the FK becomes:
```sql
ALTER TABLE ONLY p_ci_pipelines
ADD CONSTRAINT fk_cancel_redundant_pieplines
FOREIGN KEY (auto_canceled_by_id, auto_canceled_by_partition_id)
REFERENCES p_ci_pipelines(id, partition_id) ON DELETE SET NULL;
```
#### Between routing tables and regular tables
Not all of the tables in the CI domain will be partitioned, so we'll have routing
tables that will reference non-partitioned tables, for example we reference
`external_pull_requests` from `ci_pipelines`:
```sql
FOREIGN KEY (external_pull_request_id)
REFERENCES external_pull_requests(id)
ON DELETE SET NULL
```
In this case we only need to move the FK definition from the partition level
to the routing table so that new pipeline partitions may use it:
```sql
ALTER TABLE p_ci_pipelines
ADD CONSTRAINT fk_external_request
FOREIGN KEY (external_pull_request_id)
REFERENCES external_pull_requests(id) ON DELETE SET NULL;
```
#### Between regular tables and routing tables
Most of the tables from the CI domain reference at least one table that will be
turned into a routing tables, for example `ci_pipeline_messages` references
`ci_pipelines`. These definitions will need to be updated to use the routing
tables and for this they will need a `partition_id` column:
```plaintext
p_ci_pipelines:
- id
- partition_id
ci_pipeline_messages:
- id
- pipeline_id
- pipeline_partition_id
```
The foreign key can be defined by using:
```sql
ALTER TABLE ci_pipeline_messages ADD CONSTRAINT fk_pipeline_partitioned
FOREIGN KEY (pipeline_id, pipeline_partition_id)
REFERENCES p_ci_pipelines(id, partition_id) ON DELETE CASCADE;
```
The old FK definition will need to be removed, otherwise new inserts in the
`ci_pipeline_messages` with pipeline IDs from non-zero partition will fail with
reference errors.
### Indexes
We [learned](https://gitlab.com/gitlab-org/gitlab/-/issues/360148) that `PostgreSQL` We [learned](https://gitlab.com/gitlab-org/gitlab/-/issues/360148) that `PostgreSQL`
does not allow to create a single index (unique or otherwise) across all partitions of a table. does not allow to create a single index (unique or otherwise) across all partitions of a table.
......
0% 加载中 .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册