diff --git a/doc/ci/caching/index.md b/doc/ci/caching/index.md index 3c9a796a9f2a5637e09bd41581e93ec66ed0d67b..136c6c282dfab6453e4ef26658f6de4e561acccf 100644 --- a/doc/ci/caching/index.md +++ b/doc/ci/caching/index.md @@ -57,55 +57,69 @@ For runners to work with caches efficiently, you must do one of the following: - Use multiple runners that have [distributed caching](https://docs.gitlab.com/runner/configuration/autoscale.html#distributed-runners-caching), where the cache is stored in S3 buckets. Shared runners on GitLab.com behave this way. These runners can be in autoscale mode, - but they don't have to be. + but they don't have to be. - Use multiple runners with the same architecture and have these runners share a common network-mounted directory to store the cache. This directory should use NFS or something similar. - These runners must be in autoscale mode. + These runners must be in autoscale mode. -### Share caches between jobs in the same branch - -To have jobs for each branch use the same cache, define a cache with the `key: ${CI_COMMIT_REF_SLUG}`: - -```yaml -cache: - key: ${CI_COMMIT_REF_SLUG} -``` +## Use multiple caches -This configuration prevents you from accidentally overwriting the cache. However, the -first pipeline for a merge request is slow. The next time a commit is pushed to the branch, the -cache is re-used and jobs run faster. +> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/32814) in GitLab 13.10. +> - [Feature Flag removed](https://gitlab.com/gitlab-org/gitlab/-/issues/321877), in GitLab 13.12. -To enable per-job and per-branch caching: +You can have a maximum of four caches: ```yaml -cache: - key: "$CI_JOB_NAME-$CI_COMMIT_REF_SLUG" +test-job: + stage: build + cache: + - key: + files: + - Gemfile.lock + paths: + - vendor/ruby + - key: + files: + - yarn.lock + paths: + - .yarn-cache/ + script: + - bundle install --path=vendor + - yarn install --cache-folder .yarn-cache + - echo Run tests... 
``` -To enable per-stage and per-branch caching: +If multiple caches are combined with a [Fallback cache key](#fallback-cache-key), +the fallback cache is fetched every time a cache is not found. -```yaml -cache: - key: "$CI_JOB_STAGE-$CI_COMMIT_REF_SLUG" -``` +## Fallback cache key -### Share caches across jobs in different branches +> [Introduced](https://gitlab.com/gitlab-org/gitlab-runner/-/merge_requests/1534) in GitLab Runner 13.4. -To share a cache across all branches and all jobs, use the same key for everything: +You can use the `$CI_COMMIT_REF_SLUG` [predefined variable](../variables/predefined_variables.md) +to specify your [`cache:key`](../yaml/README.md#cachekey). For example, if your +`$CI_COMMIT_REF_SLUG` is `test` you can set a job to download cache that's tagged with `test`. -```yaml -cache: - key: one-key-to-rule-them-all -``` +If a cache with this tag is not found, you can use `CACHE_FALLBACK_KEY` to +specify a cache to use when none exists. -To share caches between branches, but have a unique cache for each job: +In the following example, if the `$CI_COMMIT_REF_SLUG` is not found, the job uses the key defined +by the `CACHE_FALLBACK_KEY` variable: ```yaml -cache: - key: ${CI_JOB_NAME} +variables: + CACHE_FALLBACK_KEY: fallback-key + +job1: + script: + - echo + cache: + key: "$CI_COMMIT_REF_SLUG" + paths: + - binaries/ ``` -### Disable cache for specific jobs +## Disable cache for specific jobs If you have defined the cache globally, it means that each job uses the same definition. You can override this behavior per-job, and if you want to @@ -116,7 +130,7 @@ job: cache: {} ``` -### Inherit global configuration, but override specific settings per job +## Inherit global configuration, but override specific settings per job You can override cache settings without overwriting the global cache by using [anchors](../yaml/README.md#anchors). 
For example, if you want to override the @@ -124,7 +138,7 @@ You can override cache settings without overwriting the global cache by using ```yaml cache: &global_cache - key: ${CI_COMMIT_REF_SLUG} + key: $CI_COMMIT_REF_SLUG paths: - node_modules/ - public/ @@ -150,6 +164,49 @@ PHP packages, Ruby gems, Python libraries, and others can all be cached. For more examples, check out our [GitLab CI/CD templates](https://gitlab.com/gitlab-org/gitlab/-/tree/master/lib/gitlab/ci/templates). +### Share caches between jobs in the same branch + +To have jobs for each branch use the same cache, define a cache with the `key: $CI_COMMIT_REF_SLUG`: + +```yaml +cache: + key: $CI_COMMIT_REF_SLUG +``` + +This configuration prevents you from accidentally overwriting the cache. However, the +first pipeline for a merge request is slow. The next time a commit is pushed to the branch, the +cache is re-used and jobs run faster. + +To enable per-job and per-branch caching: + +```yaml +cache: + key: "$CI_JOB_NAME-$CI_COMMIT_REF_SLUG" +``` + +To enable per-stage and per-branch caching: + +```yaml +cache: + key: "$CI_JOB_STAGE-$CI_COMMIT_REF_SLUG" +``` + +### Share caches across jobs in different branches + +To share a cache across all branches and all jobs, use the same key for everything: + +```yaml +cache: + key: one-key-to-rule-them-all +``` + +To share caches between branches, but have a unique cache for each job: + +```yaml +cache: + key: $CI_JOB_NAME +``` + ### Cache Node.js dependencies If your project is using [npm](https://www.npmjs.com/) to install the Node.js @@ -166,7 +223,7 @@ image: node:latest # Cache modules in between jobs cache: - key: ${CI_COMMIT_REF_SLUG} + key: $CI_COMMIT_REF_SLUG paths: - .npm/ @@ -193,7 +250,7 @@ image: php:7.2 # Cache libraries in between jobs cache: - key: ${CI_COMMIT_REF_SLUG} + key: $CI_COMMIT_REF_SLUG paths: - vendor/ @@ -262,7 +319,7 @@ image: ruby:2.6 # Cache gems in between builds cache: - key: ${CI_COMMIT_REF_SLUG} + key: $CI_COMMIT_REF_SLUG 
paths: - vendor/ruby @@ -287,7 +344,7 @@ cache: key: files: - Gemfile.lock - prefix: ${CI_JOB_NAME} + prefix: $CI_JOB_NAME paths: - vendor/ruby diff --git a/doc/ci/yaml/README.md b/doc/ci/yaml/README.md index 8e9cf00b16093ae250c2aa133a6642b4361efed0..b01fcaa5bc3010e7ce0d0b884a39112a2d1115d1 100644 --- a/doc/ci/yaml/README.md +++ b/doc/ci/yaml/README.md @@ -2351,250 +2351,215 @@ as Review Apps. You can see an example that uses Review Apps at Use `cache` to specify a list of files and directories to cache between jobs. You can only use paths that are in the local working copy. -If `cache` is defined outside the scope of jobs, it's set -globally and all jobs use that configuration. - Caching is shared between pipelines and jobs. Caches are restored before [artifacts](#artifacts). -Read how caching works and find out some good practices in the -[caching dependencies documentation](../caching/index.md). +Learn more about caches in [Caching in GitLab CI/CD](../caching/index.md). #### `cache:paths` -Use the `paths` directive to choose which files or directories to cache. Paths -are relative to the project directory (`$CI_PROJECT_DIR`) and can't directly link outside it. -You can use Wildcards that use [glob](https://en.wikipedia.org/wiki/Glob_(programming)) -patterns and: +Use the `cache:paths` keyword to choose which files or directories to cache. + +**Keyword type**: Job-specific. You can use it only as part of a job. + +**Possible inputs**: An array of paths relative to the project directory (`$CI_PROJECT_DIR`). +You can use wildcards that use [glob](https://en.wikipedia.org/wiki/Glob_(programming)) +patterns: - In [GitLab Runner 13.0](https://gitlab.com/gitlab-org/gitlab-runner/-/issues/2620) and later, [`doublestar.Glob`](https://pkg.go.dev/github.com/bmatcuk/doublestar@v1.2.2?tab=doc#Match). - In GitLab Runner 12.10 and earlier, [`filepath.Match`](https://pkg.go.dev/path/filepath#Match). 
+**Example of `cache:paths`**: + Cache all files in `binaries` that end in `.apk` and the `.config` file: ```yaml rspec: - script: test + script: + - echo "This job uses a cache." cache: + key: binaries-cache paths: - binaries/*.apk - .config ``` -Locally defined cache overrides globally defined options. The following `rspec` -job caches only `binaries/`: - -```yaml -cache: - paths: - - my/files - -rspec: - script: test - cache: - key: rspec - paths: - - binaries/ -``` +**Related topics**: -The cache is shared between jobs, so if you're using different -paths for different jobs, you should also set a different `cache:key`. -Otherwise cache content can be overwritten. +- See the [common `cache` use cases](../caching/index.md#common-use-cases) for more + `cache:paths` examples. #### `cache:key` -The `key` keyword defines the affinity of caching between jobs. -You can have a single cache for all jobs, cache per-job, cache per-branch, -or any other way that fits your workflow. You can fine tune caching, -including caching data between different jobs or even different branches. - -The `cache:key` variable can use any of the -[predefined variables](../variables/README.md). The default key, if not -set, is just literal `default`, which means everything is shared between -pipelines and jobs by default. - -For example, to enable per-branch caching: - -```yaml -cache: - key: "$CI_COMMIT_REF_SLUG" - paths: - - binaries/ -``` - -If you use **Windows Batch** to run your shell scripts you need to replace -`$` with `%`: +Use the `cache:key` keyword to give each cache a unique identifying key. All jobs +that use the same cache key use the same cache, including in different pipelines. -```yaml -cache: - key: "%CI_COMMIT_REF_SLUG%" - paths: - - binaries/ -``` +If not set, the default key is `default`. All jobs with the `cache:` keyword but +no `cache:key` share the `default` cache. -The `cache:key` variable can't contain the `/` character, or the equivalent -URI-encoded `%2F`. 
A value made only of dots (`.`, `%2E`) is also forbidden. - -You can specify a [fallback cache key](#fallback-cache-key) to use if the specified `cache:key` is not found. +**Keyword type**: Job-specific. You can use it only as part of a job. -##### Multiple caches +**Possible inputs**: -> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/32814) in GitLab 13.10. -> - [Feature Flag removed](https://gitlab.com/gitlab-org/gitlab/-/issues/321877), in GitLab 13.12. +- A string. +- A [predefined variable](../variables/README.md). +- A combination of both. -You can have a maximum of four caches: +**Example of `cache:key`**: ```yaml -test-job: - stage: build - cache: - - key: - files: - - Gemfile.lock - paths: - - vendor/ruby - - key: - files: - - yarn.lock - paths: - - .yarn-cache/ +cache-job: script: - - bundle install --path=vendor - - yarn install --cache-folder .yarn-cache - - echo Run tests... + - echo "This job uses a cache." + cache: + key: binaries-cache-$CI_COMMIT_REF_SLUG + paths: + - binaries/ ``` -If multiple caches are combined with a [Fallback cache key](#fallback-cache-key), -the fallback is fetched multiple times if multiple caches are not found. - -#### Fallback cache key - -> [Introduced](https://gitlab.com/gitlab-org/gitlab-runner/-/merge_requests/1534) in GitLab Runner 13.4. +**Additional details**: -You can use the `$CI_COMMIT_REF_SLUG` [variable](#variables) to specify your [`cache:key`](#cachekey). -For example, if your `$CI_COMMIT_REF_SLUG` is `test` you can set a job -to download cache that's tagged with `test`. +- If you use **Windows Batch** to run your shell scripts, you need to replace + `$` with `%`. For example: `key: "%CI_COMMIT_REF_SLUG%"` +- The `cache:key` value can't contain: -If a cache with this tag is not found, you can use `CACHE_FALLBACK_KEY` to -specify a cache to use when none exists. + - The `/` character, or the equivalent URI-encoded `%2F`. + - Only the `.` character (any number), or the equivalent URI-encoded `%2E`. 
-In the following example, if the `$CI_COMMIT_REF_SLUG` is not found, the job uses the key defined -by the `CACHE_FALLBACK_KEY` variable: +- The cache is shared between jobs, so if you're using different + paths for different jobs, you should also set a different `cache:key`. + Otherwise cache content can be overwritten. -```yaml -variables: - CACHE_FALLBACK_KEY: fallback-key +**Related topics**: -cache: - key: "$CI_COMMIT_REF_SLUG" - paths: - - binaries/ -``` +- You can specify a [fallback cache key](../caching/index.md#fallback-cache-key) + to use if the specified `cache:key` is not found. +- You can [use multiple cache keys](../caching/index.md#use-multiple-caches) in a single job. +- See the [common `cache` use cases](../caching/index.md#common-use-cases) for more + `cache:key` examples. ##### `cache:key:files` > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/18986) in GitLab v12.5. -The `cache:key:files` keyword extends the `cache:key` functionality by making it easier -to reuse some caches, and rebuild them less often, which speeds up subsequent pipeline -runs. +Use the `cache:key:files` keyword to generate a new key when one or two specific files +change. `cache:key:files` lets you reuse some caches, and rebuild them less often, +which speeds up subsequent pipeline runs. -When you include `cache:key:files`, you must also list the project files that are used to generate the key, up to a maximum of two files. -The cache `key` is a SHA checksum computed from the most recent commits (up to two, if two files are listed) -that changed the given files. If neither file is changed in any commits, -the fallback key is `default`. +**Keyword type**: Job-specific. You can use it only as part of a job. + +**Possible inputs**: An array of one or two file paths. + +**Example of `cache:key:files`**: ```yaml -cache: - key: - files: - - Gemfile.lock - - package.json - paths: - - vendor/ruby - - node_modules +cache-job: + script: + - echo "This job uses a cache." 
+ cache: + key: + files: + - Gemfile.lock + - package.json + paths: + - vendor/ruby + - node_modules ``` -This example creates a cache for Ruby and Node.js dependencies that -is tied to current versions of the `Gemfile.lock` and `package.json` files. Whenever one of +This example creates a cache for Ruby and Node.js dependencies. The cache +is tied to the current versions of the `Gemfile.lock` and `package.json` files. When one of these files changes, a new cache key is computed and a new cache is created. Any future job runs that use the same `Gemfile.lock` and `package.json` with `cache:key:files` use the new cache, instead of rebuilding the dependencies. +**Additional details**: The cache `key` is a SHA computed from the most recent commits +that changed each listed file. If neither file is changed in any commits, the +fallback key is `default`. + ##### `cache:key:prefix` > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/18986) in GitLab v12.5. -When you want to combine a prefix with the SHA computed for `cache:key:files`, -use the `prefix` keyword with `key:files`. -For example, if you add a `prefix` of `test`, the resulting key is: `test-feef9576d21ee9b6a32e30c5c79d0a0ceb68d1e5`. -If neither file is changed in any commits, the prefix is added to `default`, so the -key in the example would be `test-default`. +Use `cache:key:prefix` to combine a prefix with the SHA computed for [`cache:key:files`](#cachekeyfiles). -Like `cache:key`, `prefix` can use any of the [predefined variables](../variables/README.md), -but cannot include: +**Keyword type**: Job-specific. You can use it only as part of a job. -- the `/` character (or the equivalent URI-encoded `%2F`) -- a value made only of `.` (or the equivalent URI-encoded `%2E`) +**Possible inputs**: -```yaml -cache: - key: - files: - - Gemfile.lock - prefix: ${CI_JOB_NAME} - paths: - - vendor/ruby +- A string. +- A [predefined variable](../variables/README.md). +- A combination of both. 
+**Example of `cache:key:prefix`**: + +```yaml rspec: script: - - bundle exec rspec + - echo "This rspec job uses a cache." + cache: + key: + files: + - Gemfile.lock + prefix: $CI_JOB_NAME + paths: + - vendor/ruby ``` -For example, adding a `prefix` of `$CI_JOB_NAME` -causes the key to look like: `rspec-feef9576d21ee9b6a32e30c5c79d0a0ceb68d1e5` and -the job cache is shared across different branches. If a branch changes -`Gemfile.lock`, that branch has a new SHA checksum for `cache:key:files`. A new cache key -is generated, and a new cache is created for that key. -If `Gemfile.lock` is not found, the prefix is added to -`default`, so the key in the example would be `rspec-default`. +In this example, the `prefix` of `$CI_JOB_NAME` causes the key to look like `rspec-feef9576d21ee9b6a32e30c5c79d0a0ceb68d1e5`. +If a branch changes `Gemfile.lock`, that branch has a new SHA checksum for `cache:key:files`. +A new cache key is generated, and a new cache is created for that key. If `Gemfile.lock` +is not found, the prefix is added to `default`, so the key in the example would be `rspec-default`. + +**Additional details**: If no file in `cache:key:files` is changed in any commits, +the prefix is added to the `default` key. #### `cache:untracked` -Set `untracked: true` to cache all files that are untracked in your Git -repository: +Use `untracked: true` to cache all files that are untracked in your Git repository. -```yaml -rspec: - script: test - cache: - untracked: true -``` +**Keyword type**: Job-specific. You can use it only as part of a job. -Cache all Git untracked files and files in `binaries`: +**Possible inputs**: `true` or `false` (default). + +**Example of `cache:untracked`**: ```yaml rspec: script: test cache: untracked: true - paths: - - binaries/ ``` +**Additional details**: + +- You can combine `cache:untracked` with `cache:paths` to cache all untracked files + as well as files in the configured paths. 
For example: + + ```yaml + rspec: + script: test + cache: + untracked: true + paths: + - binaries/ + ``` + #### `cache:when` > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/18969) in GitLab 13.5 and GitLab Runner v13.5.0. -`cache:when` defines when to save the cache, based on the status of the job. You can -set `cache:when` to: +Use `cache:when` to define when to save the cache, based on the status of the job. + +**Keyword type**: Job-specific. You can use it only as part of a job. + +**Possible inputs**: - `on_success` (default): Save the cache only when the job succeeds. - `on_failure`: Save the cache only when the job fails. - `always`: Always save the cache. -For example, to store a cache whether or not the job fails or succeeds: +**Example of `cache:when`**: ```yaml rspec: @@ -2605,32 +2570,47 @@ rspec: when: 'always' ``` +This example stores the cache whether the job fails or succeeds. + #### `cache:policy` -The default behavior of a caching job is to download the files at the start of -execution, and to re-upload them at the end. Any changes made by the -job are persisted for future runs. This behavior is known as the `pull-push` cache -policy. +To change the upload and download behavior of a cache, use the `cache:policy` keyword. +By default, the job downloads the cache when the job starts, and uploads changes +to the cache when the job ends. This is the `pull-push` policy. -If you know the job does not alter the cached files, you can skip the upload step -by setting `policy: pull` in the job specification. You can add an ordinary cache -job at an earlier stage to ensure the cache is updated from time to time: +To set a job to only download the cache when the job starts, but never upload changes +when the job finishes, use `cache:policy:pull`. -```yaml -stages: - - setup - - test +To set a job to only upload a cache when the job finishes, but never download the +cache when the job starts, use `cache:policy:push`. 
+ +Use the `pull` policy when you have many jobs executing in parallel that use the same cache. +This policy speeds up job execution and reduces load on the cache server. You can +use a job with the `push` policy to build the cache. + +**Keyword type**: Job-specific. You can use it only as part of a job. + +**Possible inputs**: + +- `pull` +- `push` +- `pull-push` (default) -prepare: - stage: setup +**Example of `cache:policy`**: + +```yaml +prepare-dependencies-job: + stage: build cache: key: gems paths: - vendor/bundle + policy: push script: - - bundle install --deployment + - echo "This job only downloads dependencies and builds the cache." + - echo "Downloading dependencies..." -rspec: +faster-test-job: stage: test cache: key: gems @@ -2638,16 +2618,10 @@ rspec: - vendor/bundle policy: pull script: - - bundle exec rspec ... + - echo "This job script uses the cache, but does not update it." + - echo "Running tests..." ``` -Use the `pull` policy when you have many jobs executing in parallel that use caches. This -policy speeds up job execution and reduces load on the cache server. - -If you have a job that unconditionally recreates the cache without -referring to its previous contents, you can skip the download step. -To do so, add `policy: push` to the job. - ### `artifacts` Use `artifacts` to specify a list of files and directories that are diff --git a/doc/development/pipelines.md b/doc/development/pipelines.md index 0dc1481f542e60b0df37acaed0e8218efbe2c264..30a92181a7dba4dea1393a539ab7f2479a9ead84 100644 --- a/doc/development/pipelines.md +++ b/doc/development/pipelines.md @@ -559,7 +559,7 @@ request, be sure to start the `dont-interrupt-me` job before pushing. - `.qa-cache` - `.yarn-cache` - `.assets-compile-cache` (the key includes `${NODE_ENV}` so it's actually two different caches). -1. These cache definitions are composed of [multiple atomic caches](../ci/yaml/README.md#multiple-caches). +1. 
These cache definitions are composed of [multiple atomic caches](../ci/caching/index.md#use-multiple-caches). 1. Only 6 specific jobs, running in 2-hourly scheduled pipelines, are pushing (i.e. updating) to the caches: - `update-setup-test-env-cache`, defined in [`.gitlab/ci/rails.gitlab-ci.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/.gitlab/ci/rails.gitlab-ci.yml). - `update-gitaly-binaries-cache`, defined in [`.gitlab/ci/rails.gitlab-ci.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/.gitlab/ci/rails.gitlab-ci.yml).