Commit cf14f419
Authored 2 years ago by Marius Bobin
Committed by Grzegorz Bizon 2 years ago
Add foreign key findings to CI data decay
Parent: e43c6d12
doc/architecture/blueprints/ci_data_decay/pipeline_partitioning.md
110 additions, 3 deletions
@@ -323,9 +323,116 @@ scope block takes an argument). Preloading instance dependent scopes is not
supported.
```
We also need to build a proof of concept for removing data on the PostgreSQL side (using foreign keys with `ON DELETE CASCADE`) and removing data through Rails associations, as this might be an important area of uncertainty.

### Foreign keys
Foreign keys must reference columns that either are a primary key or form a
unique constraint. We can define them using these strategies:
#### Between routing tables sharing partition ID
For relations that are part of the same pipeline hierarchy, it is possible to
share the `partition_id` column to define the foreign key constraint:
```plaintext
p_ci_pipelines:
  - id
  - partition_id

p_ci_builds:
  - id
  - partition_id
  - pipeline_id
```
In this case, `p_ci_builds.partition_id` indicates the partition for the build
and also for the pipeline. We can add a FK on the routing table using:
```sql
ALTER TABLE ONLY p_ci_builds
    ADD CONSTRAINT fk_on_pipeline_and_partition
    FOREIGN KEY (pipeline_id, partition_id)
    REFERENCES p_ci_pipelines (id, partition_id)
    ON DELETE CASCADE;
```
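The cascade behavior can be tried out locally. The following is a minimal sketch that assumes SQLite (through Python's standard `sqlite3` module) as a stand-in for PostgreSQL, so there is no real partitioning and no `ALTER TABLE ... ADD CONSTRAINT`; the constraint is declared inline instead. The table and column names follow the example above, while the function name `cascade_demo` and the sample rows are made up for illustration. The composite primary key on `p_ci_pipelines` is what satisfies the rule that a foreign key must reference a primary key or unique constraint.

```python
import sqlite3


def cascade_demo():
    """Delete pipeline (1, 100) and return the builds that remain."""
    conn = sqlite3.connect(":memory:")
    # SQLite does not enforce foreign keys unless this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE p_ci_pipelines (
            id INTEGER NOT NULL,
            partition_id INTEGER NOT NULL,
            PRIMARY KEY (id, partition_id)
        );
        CREATE TABLE p_ci_builds (
            id INTEGER NOT NULL,
            partition_id INTEGER NOT NULL,
            pipeline_id INTEGER NOT NULL,
            PRIMARY KEY (id, partition_id),
            -- The build reuses its own partition_id to point at the pipeline.
            CONSTRAINT fk_on_pipeline_and_partition
                FOREIGN KEY (pipeline_id, partition_id)
                REFERENCES p_ci_pipelines (id, partition_id)
                ON DELETE CASCADE
        );
        -- Hypothetical sample data: two pipelines in partition 100.
        INSERT INTO p_ci_pipelines VALUES (1, 100), (2, 100);
        INSERT INTO p_ci_builds VALUES (10, 100, 1), (11, 100, 1), (12, 100, 2);
    """)
    conn.execute("DELETE FROM p_ci_pipelines WHERE id = 1 AND partition_id = 100")
    rows = conn.execute(
        "SELECT id, partition_id, pipeline_id FROM p_ci_builds ORDER BY id"
    ).fetchall()
    conn.close()
    return rows


print(cascade_demo())
```

Only the build that belongs to pipeline 2 should remain after the delete, because the two builds of pipeline 1 are removed by the cascade.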
#### Between routing tables with different partition IDs
It's not possible to reuse the `partition_id` for all relations in the CI domain,
so in this case we'll need to store the value as a different attribute. For
example, when canceling redundant pipelines we store on the old pipeline row
the ID of the new pipeline that canceled it as `auto_canceled_by_id`:
```plaintext
p_ci_pipelines:
  - id
  - partition_id
  - auto_canceled_by_id
  - auto_canceled_by_partition_id
```
In this case we can't ensure that the canceling pipeline is part of the same
hierarchy as the canceled pipelines, so we need an extra attribute to store its
partition, `auto_canceled_by_partition_id`, and the FK becomes:
```sql
ALTER TABLE ONLY p_ci_pipelines
    ADD CONSTRAINT fk_cancel_redundant_pipelines
    FOREIGN KEY (auto_canceled_by_id, auto_canceled_by_partition_id)
    REFERENCES p_ci_pipelines (id, partition_id)
    ON DELETE SET NULL;
```
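As a rough illustration of the self-referencing case, the sketch below again assumes SQLite via Python's `sqlite3` module rather than PostgreSQL, with the foreign key declared inline and left unnamed. The function name `auto_cancel_demo` and the partition values 100 and 101 are hypothetical. It shows that deleting the canceling pipeline clears both referencing columns on the canceled pipeline, which is why those two columns have to be nullable.

```python
import sqlite3


def auto_cancel_demo():
    """Delete the canceling pipeline and return the canceled pipeline's row."""
    conn = sqlite3.connect(":memory:")
    # SQLite does not enforce foreign keys unless this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE p_ci_pipelines (
            id INTEGER NOT NULL,
            partition_id INTEGER NOT NULL,
            auto_canceled_by_id INTEGER,
            auto_canceled_by_partition_id INTEGER,
            PRIMARY KEY (id, partition_id),
            -- The canceling pipeline may live in a different partition,
            -- so its partition is stored in a dedicated column.
            FOREIGN KEY (auto_canceled_by_id, auto_canceled_by_partition_id)
                REFERENCES p_ci_pipelines (id, partition_id)
                ON DELETE SET NULL
        );
        -- Pipeline 2 (partition 101) canceled pipeline 1 (partition 100).
        INSERT INTO p_ci_pipelines VALUES (2, 101, NULL, NULL);
        INSERT INTO p_ci_pipelines VALUES (1, 100, 2, 101);
    """)
    conn.execute("DELETE FROM p_ci_pipelines WHERE id = 2 AND partition_id = 101")
    row = conn.execute(
        "SELECT id, partition_id, auto_canceled_by_id, auto_canceled_by_partition_id "
        "FROM p_ci_pipelines WHERE id = 1"
    ).fetchone()
    conn.close()
    return row


print(auto_cancel_demo())
```

The canceled pipeline survives the delete; only its `auto_canceled_by_id` and `auto_canceled_by_partition_id` values are reset.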
#### Between routing tables and regular tables
Not all of the tables in the CI domain will be partitioned, so some routing
tables will reference non-partitioned tables. For example, we reference
`external_pull_requests` from `ci_pipelines`:
```sql
FOREIGN KEY (external_pull_request_id)
REFERENCES external_pull_requests (id)
ON DELETE SET NULL
```
In this case we only need to move the FK definition from the partition level
to the routing table so that new pipeline partitions may use it:
```sql
ALTER TABLE p_ci_pipelines
    ADD CONSTRAINT fk_external_request
    FOREIGN KEY (external_pull_request_id)
    REFERENCES external_pull_requests (id)
    ON DELETE SET NULL;
```
#### Between regular tables and routing tables
Most of the tables from the CI domain reference at least one table that will be
turned into a routing table. For example, `ci_pipeline_messages` references
`ci_pipelines`. These definitions will need to be updated to use the routing
tables, and for this they will need a `partition_id` column:
```plaintext
p_ci_pipelines:
  - id
  - partition_id

ci_pipeline_messages:
  - id
  - pipeline_id
  - pipeline_partition_id
```
The foreign key can be defined by using:
```sql
ALTER TABLE ci_pipeline_messages
    ADD CONSTRAINT fk_pipeline_partitioned
    FOREIGN KEY (pipeline_id, pipeline_partition_id)
    REFERENCES p_ci_pipelines (id, partition_id)
    ON DELETE CASCADE;
```
The old FK definition will need to be removed, otherwise new inserts into
`ci_pipeline_messages` with pipeline IDs from a non-zero partition will fail
with reference errors.
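The failure mode described above can be approximated with a small simulation. This sketch assumes SQLite via Python's `sqlite3` module, which has no table partitioning, so the zero partition is modeled as a separate `ci_pipelines` table holding only the legacy row, while `p_ci_pipelines` holds every row. The helper name `insert_message`, the `keep_old_fk` flag, and the sample IDs are hypothetical. With the old single-column FK still in place, inserting a message for a pipeline that exists only in partition 1 is rejected; once the old FK is gone, the same insert passes the composite FK check.

```python
import sqlite3

# Simulated schema: ci_pipelines stands in for the zero partition only.
SCHEMA = """
    CREATE TABLE ci_pipelines (id INTEGER PRIMARY KEY);
    CREATE TABLE p_ci_pipelines (
        id INTEGER NOT NULL,
        partition_id INTEGER NOT NULL,
        PRIMARY KEY (id, partition_id)
    );
    CREATE TABLE ci_pipeline_messages (
        id INTEGER PRIMARY KEY,
        pipeline_id INTEGER NOT NULL,
        pipeline_partition_id INTEGER NOT NULL,
        {old_fk}
        FOREIGN KEY (pipeline_id, pipeline_partition_id)
            REFERENCES p_ci_pipelines (id, partition_id)
            ON DELETE CASCADE
    );
    INSERT INTO ci_pipelines VALUES (1);
    INSERT INTO p_ci_pipelines VALUES (1, 0), (2, 1);
"""

# The old definition that only knows about the zero partition.
OLD_FK = "FOREIGN KEY (pipeline_id) REFERENCES ci_pipelines (id) ON DELETE CASCADE,"


def insert_message(keep_old_fk):
    """Try to insert a message for pipeline 2 in partition 1; return success."""
    conn = sqlite3.connect(":memory:")
    # SQLite does not enforce foreign keys unless this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA.format(old_fk=OLD_FK if keep_old_fk else ""))
    try:
        conn.execute("INSERT INTO ci_pipeline_messages VALUES (1, 2, 1)")
        return True
    except sqlite3.IntegrityError:
        # Raised when any foreign key on the table cannot be satisfied.
        return False
    finally:
        conn.close()


print(insert_message(keep_old_fk=True), insert_message(keep_old_fk=False))
```

In PostgreSQL the mechanics differ, since the zero partition is attached to the routing table rather than being a separate table, but the outcome is the same: a foreign key that targets a single partition cannot validate rows stored in another one.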
### Indexes
We [learned](https://gitlab.com/gitlab-org/gitlab/-/issues/360148) that `PostgreSQL`
does not allow creating a single index (unique or otherwise) across all partitions of a table.