Commit 1f7798fb, authored by Adam Hegyi

Update VSA development docs
Parent 53b2676b
@@ -6,33 +6,86 @@ info: To determine the technical writer assigned to the Stage/Group associated w
 # Value stream analytics development guide
-Value stream analytics calculates the time between two arbitrary events recorded on domain objects and provides aggregated statistics about the duration.
-For information on how to configure value stream analytics in GitLab, see our [analytics documentation](../user/analytics/value_stream_analytics.md).
-## Stage
-During development, events occur that move issues and merge requests through different stages of progress until they are considered finished. These stages can be expressed with the `Stage` model.
-Example stage:
-- Name: Development
-- Start event: Issue created
-- End event: Issue first mentioned in commit
-- Parent: `Group: gitlab-org`
+For information on how to configure value stream analytics (VSA) in GitLab, see our [analytics documentation](../user/analytics/value_stream_analytics.md).
+## How does Value Stream Analytics work?
+Value Stream Analytics calculates the duration between two timestamp columns or timestamp
+expressions and runs various aggregations on the data.
+For example:
+- Duration between the Merge Request creation time and Merge Request merge time.
+- Duration between the Issue creation time and Issue close time.
+This duration is exposed in various ways:
+- Aggregation: median, average
+- Listing: list the duration for individual Merge Request and Issue records
+Apart from the durations, we expose the record count within a stage.
+## Feature availability
+- Group level (licensed): Requires Ultimate or Premium subscription. This version is the most
+  feature-rich.
+- Project level (licensed): We are continually adding features to project level VSA to bring it in line with group level VSA.
+- Project level (FOSS): Keep it as is.
+|Feature|Group level (licensed)|Project level (licensed)|Project level (FOSS)|
+|-|-|-|-|
+|Create custom value streams|Yes|No, only one value stream (default) is present with the default stages|No, only one value stream (default) is present with the default stages|
+|Create custom stages|Yes|No|No|
+|Filtering (author, label, milestone, etc.)|Yes|Yes|Yes|
+|Stage time chart|Yes|No|No|
+|Total time chart|Yes|No|No|
+|Task by type chart|Yes|No|No|
+|DORA Metrics|Yes|Yes|No|
+|Cycle time and lead time summary (Key metrics)|Yes|Yes|No|
+|New issues, commits and deploys (Key metrics)|Yes, excluding commits|Yes|Yes|
+|Uses aggregated backend|Yes|No|No|
+|Date filter behavior|Filters items [finished within the date range](https://gitlab.com/groups/gitlab-org/-/epics/6046)|Filters items by creation date.|Filters items by creation date.|
+|Authorization|At least reporter|At least reporter|Can be public.|
+## VSA core domain objects
+### Stages
+A stage represents an event pair (start and end events) with additional metadata, such as the name
+of the stage. Stages are configurable by the user within the pairing rules defined in the backend.
+**Example stage: Code Review**
+- Start event identifier: Merge request creation time.
+- Start event column: uses the `merge_requests.created_at` timestamp column.
+- End event identifier: Merge request merge time.
+- End event column: uses the `merge_request_metrics.merged_at` timestamp column.
+- Stage event hash ID: a calculated hash for the pair of start and end event identifiers.
+  - If two stages have the same configuration of start and end events, then their stage event hash
+    IDs are identical.
+  - The stage event hash ID is later used to store the aggregated data in partitioned database tables.
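The stage event hash ID property described above (identical event pairs yield identical hashes) can be illustrated with a minimal Ruby sketch. The `stage_event_hash` helper, the event identifier symbols, and the choice of SHA-256 are assumptions made for illustration; they are not taken from the GitLab codebase.

```ruby
require "digest"

# Hypothetical helper: derives a stable hash from a pair of event identifiers.
# The hashing scheme (SHA-256 over "start-end") is an assumption, not GitLab's.
def stage_event_hash(start_event_identifier, end_event_identifier)
  Digest::SHA256.hexdigest("#{start_event_identifier}-#{end_event_identifier}")
end

# Two stages configured with the same event pair share one hash ...
code_review   = stage_event_hash(:merge_request_created, :merge_request_merged)
custom_review = stage_event_hash(:merge_request_created, :merge_request_merged)

# ... while a stage with a different event pair gets a different one.
issue_stage = stage_event_hash(:issue_created, :issue_closed)

puts code_review == custom_review # same event pair
puts code_review == issue_stage   # different event pair
```

Because the hash depends only on the event pair, data aggregated for one stage could be reused by any other stage with the same configuration.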
+Historically, value stream analytics defined [7 stages](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/analytics/cycle_analytics/default_stages.rb)
+which are always available to the end-users regardless of the subscription.
+### Value streams
+Value streams are container objects for the stages. There can be multiple value streams per group
+focusing on different aspects of the DevOps lifecycle.
 ### Events
 Events are the smallest building blocks of the value stream analytics feature. A stage consists of two events:
-- Start
-- End
+- Start event
+- End event
 These events play a key role in the duration calculation.
 Formula: `duration = end_event_time - start_event_time`
-To make the duration calculation flexible, each `Event` is implemented as a separate class. They're responsible for defining a timestamp expression that is used in the calculation query.
+To make the duration calculation flexible, each `Event` is implemented as a separate class.
+They're responsible for defining a timestamp expression that is used in the calculation query.
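The duration formula and the median aggregation mentioned above can be sketched in plain Ruby. This only illustrates the arithmetic: in VSA the calculation is part of a database query, and the helper names and sample timestamps below are hypothetical.

```ruby
# duration = end_event_time - start_event_time (in seconds for Time objects)
def duration_in_seconds(start_event_time, end_event_time)
  end_event_time - start_event_time
end

# Median of a list of numbers; returns nil for an empty list.
def median(values)
  return nil if values.empty?

  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Hypothetical sample: [merge request created at, merge request merged at]
records = [
  [Time.utc(2022, 1, 1), Time.utc(2022, 1, 2)],
  [Time.utc(2022, 1, 1), Time.utc(2022, 1, 4)],
  [Time.utc(2022, 1, 1), Time.utc(2022, 1, 3)]
]

durations = records.map { |start_time, end_time| duration_in_seconds(start_time, end_time) }
puts median(durations) / 86_400 # median duration in days
```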
 #### Implementing an `Event` class
@@ -81,7 +134,7 @@ def apply_query_customization(query)
 end
 ```
-### Validating start and end events
+#### Validating start and end events
 Some start/end event pairs are not "compatible" with each other. For example:
@@ -171,23 +224,7 @@ graph LR;
 MergeRequestLabelRemoved --> MergeRequestLabelRemoved;
 ```
-### Parent
-Teams and organizations might define their own way of building software, thus stages can be completely different. For each stage, a parent object needs to be defined.
-Currently supported parents:
-- `Project`
-- `Group`
-#### How parent relationship it work
-1. User navigates to the value stream analytics page.
-1. User selects a group.
-1. Backend loads the defined stages for the selected group.
-1. Additions and modifications to the stages are persisted within the selected group only.
-### Default stages
+## Default stages
 The [original implementation](https://gitlab.com/gitlab-org/gitlab/-/issues/847) of value stream analytics defined 7 stages. These stages are always available for each parent, however altering these stages is not possible.
@@ -209,31 +246,15 @@ The reason for this was that we'd like to add the abilities to hide and order st
 For a new calculation or a query, implement it as a new method call in the `DataCollector` class.
-## Database query
-Structure of the database query:
-```sql
-SELECT (customized by: Median or RecordsFetcher or DataForDurationChart)
-FROM OBJECT_TYPE (Issue or MergeRequest)
-INNER JOIN (several JOIN statements, depending on the events)
-WHERE
-(Filter by the PARENT model, example: filter Issues from Project A)
-(Date range filter based on the OBJECT_TYPE.created_at)
-(Check if the START_EVENT is earlier than END_EVENT, preventing negative duration)
-```
-Structure of the `SELECT` statement for `Median`:
-```sql
-SELECT (calculate median from START_EVENT_TIME-END_EVENT_TIME)
-```
-Structure of the `SELECT` statement for `DataForDurationChart`:
-```sql
-SELECT (START_EVENT_TIME-END_EVENT_TIME) as duration, END_EVENT.timestamp
-```
+To support the aggregated value stream analytics backend, these classes were reimplemented within the [`Aggregated`](https://gitlab.com/gitlab-org/gitlab/-/tree/master/lib/gitlab/analytics/cycle_analytics/aggregated) namespace.
+### Database query backend
+VSA supports two backends: [aggregated](value_stream_analytics/value_stream_analytics_aggregated_backend.md) and "live". The live query backend is
+considered legacy and will be phased out at some point.
+- "live": uses the standard `IssuableFinders`.
+- aggregated: queries data from pre-aggregated database tables.
 ## High-level overview
@@ -252,3 +273,31 @@ Writing a test case for a stage using a new `Event` can be challenging since dat
 - Different parents: `Group` or `Project`
 - Different calculations: `Median`, `RecordsFetcher` or `DataForDurationChart`
+The VSA frontend is tested extensively on two different levels (integration, unit):
+- End-to-end integration tests using a real backend via Capybara and RSpec.
+- Jest frontend tests with pre-generated data fixtures.
+## Development setup and testing
+Running Value Stream Analytics can be done via the [GDK](https://gitlab.com/gitlab-org/gitlab-development-kit). By default, you'll be able to view the project-level (FOSS) version of the feature.
+If your GDK is up and running, you can run the seed script to generate some data:
+```shell
+SEED_CYCLE_ANALYTICS=true SEED_VSA=true FILTER=cycle_analytics rake db:seed_fu
+```
+The data generator script creates a new group and a new project with issue and merge request
+data (see the output of the script). To view the group-level version of the feature, you
+need to request a license for your GDK instance.
+After this step, you can access the group-level value stream analytics page where you can create
+value streams and stages. Data aggregation might be delayed, so you might not see the
+data right after creating a stage. To speed up this process, you can run the following command
+in your Rails console (`rails c`):
+```ruby
+Analytics::CycleAnalytics::ReaggregationWorker.new.perform
+```
@@ -21,8 +21,7 @@ Value Stream Analytics (VSA).
 ## Current Status
-As of 14.8 the aggregated VSA backend is used only in the `gitlab-org` group, for testing purposes
-. We plan to gradually roll it out in the next major release (15.0) for the rest of the groups.
+The aggregated backend has been used by default at the group level since GitLab 15.0.
 ## Motivation
@@ -53,44 +52,6 @@ database with a minimal development effort.
 used by the system.
 - Example: `MIN(issues.created_at, issues.updated_at)`
-## How does Value Stream Analytics work?
-Value Stream Analytics calculates the duration between two timestamp columns or timestamp
-expressions and runs various aggregations on the data.
-Examples:
-- Duration between the Merge Request creation time and Merge Request merge time.
-- Duration between the Issue creation time and Issue close time.
-This duration is exposed in various ways:
-- Aggregation: median, average
-- Listing: list the duration for individual Merge Request and Issue records
-Apart from the durations, we expose the record count within a stage.
-### Stages
-A stage represents an event pair (start and end events) with additional metadata, such as the name
-of the stage. Stages are configurable by the user within the pairing rules defined in the backend.
-**Example stage: Code Review**
-- Start event identifier: Merge Request creation time
-- Start event column: uses the `merge_requests.created_at` timestamp column.
-- End event identifier: Merge Request merge time
-- End event column: uses the `merge_request_metrics.merged_at` timestamp column.
-- Stage event hash ID: a calculated hash for the pair of start and end event identifiers.
-  - If two stages have the same configuration of start and end events, then their stage event hash
-    IDs are identical.
-  - The stage event hash ID is later used to store the aggregated data in partitioned database tables.
-### Value streams
-Value streams are container objects for the stages. There can be multiple value streams per group
-or project focusing on different aspects of the Dev Ops lifecycle.
 ### Example configuration
 ![vsa object hierarchy example](img/object_hierarchy_example_V14_10.png)