Skip to content
代码片段 群组 项目
未验证 提交 ce5c0466 编辑于 作者: Vishwa Bhat's avatar Vishwa Bhat 提交者: GitLab
浏览文件

Add Gem Scanning Interface details in the Secret Detection blueprint

上级 7534506f
No related branches found
No related tags found
无相关合并请求
......@@ -290,6 +290,21 @@ sequenceDiagram
Rails->>User: accepted
```
#### Gem Scanning Interface
For the Phase1, we use the private [Secret Detection Ruby Gem](https://gitlab.com/gitlab-org/gitlab/-/tree/5dfcf7431bfff25519c05a7e66c0cbb8d7b362be/gems/gitlab-secret_detection) that is invoked by the [Secrets Push Check](https://gitlab.com/gitlab-org/gitlab/-/blob/5dfcf7431bfff25519c05a7e66c0cbb8d7b362be/ee/lib/gitlab/checks/secrets_check.rb) on the GitLab Rails platform.
The private SD gem offers the following support in addition to running scan on multiple blobs:
- Configurable Timeout on the entire scan-level and on each blob level.
- Ability to run the scan within subprocess instead of the main process. The number of processes spawned per request is capped to [`5`](https://gitlab.com/gitlab-org/gitlab/-/blob/5dfcf7431bfff25519c05a7e66c0cbb8d7b362be/gems/gitlab-secret_detection/lib/gitlab/secret_detection/scan.rb#L29).
The Ruleset file referred during the Pre-receive Secret Detection scan is
located [here](https://gitlab.com/gitlab-org/gitlab/-/blob/2da1c72dbc9df4d9130262c6b79ea785b6bb14ac/gems/gitlab-secret_detection/lib/gitleaks.toml).
More details about the Gem can be found in the [README](https://gitlab.com/gitlab-org/gitlab/-/blob/master/gems/gitlab-secret_detection/README.md) file.
### Phase 2 - Standalone pre-receive service
The critical paths as outlined under [goals above](#goals) cover two major object
......
# Gitlab::SecretDetection
The gitlab-secret_detection gem performs keyword and regex matching on git blobs that may include secrets. The gem accepts one or more git blobs, matches them against a defined ruleset of regular expressions, and returns scan results.
##### Scan parameters
The method for triggering the scan (
i.e.,[`Gitlab::SecretDetection.secrets_scan`](https://gitlab.com/gitlab-org/gitlab/-/blob/5dfcf7431bfff25519c05a7e66c0cbb8d7b362be/gems/gitlab-secret_detection/lib/gitlab/secret_detection/scan.rb#L75))
accepts the following parameters:
| Parameter | Type | Required | Default | Description |
|----------------|---------|----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `blobs` | Array | Yes | NA | Array of blobs with each blob to have `id` and `data` properties. `id` represents the uniqueness of the blob in the given array and `data` is the content of the blob to scan. |
| `timeout` | Number | No | [`60s`](https://gitlab.com/gitlab-org/gitlab/-/blob/5dfcf7431bfff25519c05a7e66c0cbb8d7b362be/gems/gitlab-secret_detection/lib/gitlab/secret_detection/scan.rb#L22) | The maximum duration allowed for the scan to run on a commit request comprising multiple blobs. If the specified timeout elapses, the scan is automatically terminated. The timeout duration is specified in seconds but can also accept floating-point values to denote smaller units. For instance, use `0.5` to represent `500ms`. |
| `blob_timeout` | Number | No | [`5s`](https://gitlab.com/gitlab-org/gitlab/-/blob/5dfcf7431bfff25519c05a7e66c0cbb8d7b362be/gems/gitlab-secret_detection/lib/gitlab/secret_detection/scan.rb#L24) | The maximum duration allowed for the scan to run on an individual blob. Upon expiration of the specified timeout, the scan is interrupted for the current blob and advances to the next blob in the request. The timeout duration is specified in seconds but can also accept floating-point values to denote smaller units. For instance, use `0.5` to represent `500ms`. |
| `subprocess` | Boolean | No | [`true`](https://gitlab.com/gitlab-org/gitlab/-/blob/5dfcf7431bfff25519c05a7e66c0cbb8d7b362be/gems/gitlab-secret_detection/lib/gitlab/secret_detection/scan.rb#L34) | Runs the scan operation within a subprocess rather than the main process. This design aims to mitigate memory overconsumption issues that may arise from scanning multiple large blobs within a single subprocess. Check [here](https://docs.gitlab.com/ee/architecture/blueprints/secret_detection/decisions/002_run_scan_within_subprocess.html) for more details. |
##### Scan Constraints
| Name | Value | Description |
|---------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `MAX_PROCS_PER_REQUEST` | [`5`](https://gitlab.com/gitlab-org/gitlab/-/blob/5dfcf7431bfff25519c05a7e66c0cbb8d7b362be/gems/gitlab-secret_detection/lib/gitlab/secret_detection/scan.rb#L29) | The maximum number of processes spawned per commit request. |
| `MIN_CHUNK_SIZE_PER_PROC_BYTES` | [`2MiB`](https://gitlab.com/gitlab-org/gitlab/-/blob/5dfcf7431bfff25519c05a7e66c0cbb8d7b362be/gems/gitlab-secret_detection/lib/gitlab/secret_detection/scan.rb#L32) | The minimum cumulative size of blobs necessary to initiate the creation of a new subprocess, where the scan will be executed within that dedicated process. |
##### Ruleset Source
The Ruleset file referenced for running the Pre-receive Secret Detection is
located [here](https://gitlab.com/gitlab-org/gitlab/-/blob/2da1c72dbc9df4d9130262c6b79ea785b6bb14ac/gems/gitlab-secret_detection/lib/gitleaks.toml).
0% 加载中 .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册