Unverified commit 7c063127, authored by Igor, committed by GitLab

Fix assets caching in scheduled cache-assets:production job

One of the optimizations for the build and deploy process is to cache
assets as a generic package that can then be consumed by the build
process.

Assets in this context are the frontend assets built by the
gitlab:assets:compile rake task, which calls out to yarn. We compute a
cached-assets-hash over all frontend files. If none of these source
files changed, the build can reuse the previously compiled assets and
save approximately 40 minutes of build time.
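To illustrate the idea of a content hash over source files, here is a
minimal sketch. It is not the gitlab:assets:hash_sum implementation; the
function name `compute_assets_hash` is hypothetical, and the sketch
assumes GNU findutils and coreutils (`find -print0`, `sort -z`,
`sha256sum`).

```shell
# Hypothetical helper: print one digest covering every file under a directory.
# Paths are sorted so the result does not depend on filesystem order.
compute_assets_hash() {
  (
    cd "$1" || exit 1
    find . -type f -print0 \
      | LC_ALL=C sort -z \
      | xargs -0 sha256sum \
      | sha256sum \
      | cut -d ' ' -f 1
  )
}
```

Calling `compute_assets_hash app/assets` twice on unchanged files prints
the same digest; editing, adding, or renaming any file changes it, which
is the property the cache key relies on.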

This process is intended to work via a scheduled pipeline on
gitlab-org/gitlab that runs every 2 hours. It computes the
cached-assets-hash; if no package exists for that hash, it builds an
assets package and publishes it to the package registry on
gitlab-org/gitlab.

This logic was introduced by https://gitlab.com/gitlab-org/gitlab/-/merge_requests/96297. It was most recently updated by https://gitlab.com/gitlab-org/gitlab/-/merge_requests/179950.

That MR introduced a subtle bug: by changing the order of setting
`$GITLAB_ASSETS_HASH` and including
`scripts/gitlab_component_helpers.sh`, that helper library is no longer
able to consume `$GITLAB_ASSETS_HASH` and instead defaults to the
string `"NO_HASH"`.

There is no logic to fail when no hash is supplied, so we compute a
package URL containing the string `NO_HASH`. The job then publishes a
package to that URL, and on every later run it skips re-compiling
assets, because a package is already present under `NO_HASH`.
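The ordering problem can be reproduced in isolation. The sketch below
uses a throwaway stand-in for the helper library; the variable
`ASSETS_PACKAGE`, the hash value `abc123`, and the `${VAR:-NO_HASH}`
fallback are assumptions made for illustration, not the actual contents
of `scripts/gitlab_component_helpers.sh`.

```shell
# Hypothetical stand-in for the helper library: it derives a package name
# at load time and falls back to NO_HASH when the variable is unset.
helper="$(mktemp)"
cat > "$helper" <<'EOF'
ASSETS_PACKAGE="assets-production-ee-${GITLAB_ASSETS_HASH:-NO_HASH}"
EOF

# Buggy order: load the helper first, set the hash afterwards.
# (`.` is the POSIX spelling of `source`.)
buggy_package="$(
  unset GITLAB_ASSETS_HASH
  . "$helper"
  GITLAB_ASSETS_HASH="abc123"
  echo "$ASSETS_PACKAGE"
)"

# Fixed order: set the hash first, then load the helper.
fixed_package="$(
  GITLAB_ASSETS_HASH="abc123"
  . "$helper"
  echo "$ASSETS_PACKAGE"
)"

rm -f "$helper"
echo "buggy: $buggy_package"   # buggy: assets-production-ee-NO_HASH
echo "fixed: $fixed_package"   # fixed: assets-production-ee-abc123
```

Any value the helper computes while it is being loaded is frozen at that
moment, so assigning the variable afterwards has no effect on it.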

The current cached assets package is 9 days old:

```
$ curl -I https://gitlab.com/api/v4/projects/278964/packages/generic/assets/production-ee-NO_HASH/assets-production-ee-NO_HASH-v2.tar.gz

last-modified: Wed, 05 Feb 2025 22:06:11 GMT
```

The saving grace is that this bug was only introduced for the scheduled
job, and not for the jobs consuming that cache. Thus we avoid building
and deploying omnibus packages or CNG images which contain a stale
cache. We got lucky here.

The only real consequence is that we no longer get any cache hits, so
the build process will always need to rebuild assets, even if none
changed. This was surfaced as part of
https://gitlab.com/gitlab-com/gl-infra/production/-/issues/19280.

This patch fixes the bug by restoring the original order. The
cache-assets:production job can then produce valid assets cache
packages again, which speeds up builds and deploys in cases where no
assets were changed. That is crucial for rolling forward urgent fixes,
as it cuts 40m from time-to-production.

Additional measures we should consider for more safety:

- Check for NO_HASH and bail out.
- After downloading an assets archive, validate the contained
  cached-assets-hash against the one from the filesystem.
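
The two measures above could look roughly like the following sketch.
Both function names are hypothetical and nothing here exists in the
repository; the only detail taken from the job itself is that the hash
is written to `cached-assets-hash.txt`.

```shell
# Guard 1 (hypothetical): refuse to continue without a real hash.
require_assets_hash() {
  case "${1:-}" in
    ""|NO_HASH)
      echo "ERROR: assets hash is missing or NO_HASH, refusing to continue" >&2
      return 1
      ;;
  esac
}

# Guard 2 (hypothetical): compare the hash recorded inside an unpacked
# archive with the hash computed from the local checkout.
validate_cached_assets_hash() {
  hash_file="$1"
  expected_hash="$2"
  if [ ! -f "$hash_file" ]; then
    echo "ERROR: ${hash_file} not found" >&2
    return 1
  fi
  archived_hash="$(cat "$hash_file")"
  if [ "$archived_hash" != "$expected_hash" ]; then
    echo "ERROR: archive hash ${archived_hash} does not match ${expected_hash}" >&2
    return 1
  fi
}
```

A job could call `require_assets_hash "$GITLAB_ASSETS_HASH" || exit 1`
before computing any package URL, and
`validate_cached_assets_hash cached-assets-hash.txt "$GITLAB_ASSETS_HASH"`
after extracting a downloaded archive.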

Parent aff8c129
@@ -38,8 +38,11 @@ cache-workhorse:
 - |
 function cache_assets() {
 yarn_install_script
-source scripts/gitlab_component_helpers.sh
+# GITLAB_ASSETS_HASH must be defined before loading scripts/gitlab_component_helpers.sh
 export GITLAB_ASSETS_HASH=$(bundle exec rake gitlab:assets:hash_sum)
+source scripts/gitlab_component_helpers.sh
 gitlab_assets_archive_doesnt_exist || { echoinfo "INFO: Exiting early as package exists."; exit 0; }
 assets_compile_script
 echo -n "${GITLAB_ASSETS_HASH}" > "cached-assets-hash.txt"
......