Unverified commit 4d1a85ce, authored by Sebastian Rehm, committed by GitLab

Change sisense references for internal analytics to tableau

Parent 5ebc4e9f
@@ -64,28 +64,48 @@ On our SaaS instance both individual events and pre-computed metrics are availab
Additionally for SaaS page views are automatically instrumented.
For self-managed only the metrics instrumented on the version installed on the instance are available.
+### Events
+Events are collected in real-time but processed in an asynchronous manner.
+In general events are available in the data warehouse at the latest 48 hours after being fired but can already be available earlier.
+### Metrics
+Metrics are being computed and sent once per week for every instance. On GitLab.com this happens on Sunday and newest values become available throughout Monday.
+On self-managed this depends on the particular instance. In general, only the metrics instrumented for the installed GitLab version will be sent.
## Data discovery
-The data visualization tools [Sisense](https://about.gitlab.com/handbook/business-technology/data-team/platform/sisensecdt/) and [Tableau](https://about.gitlab.com/handbook/business-technology/data-team/platform/tableau/),
-which have access to our Data Warehouse, can be used to query the internal analytics data.
+Event and metrics data is ultimately stored in our [Snowflake data warehouse](https://handbook.gitlab.com/handbook/business-technology/data-team/platform/snowflake/).
+It can either be accessed directly via SQL in Snowflake for [ad-hoc analyses](https://handbook.gitlab.com/handbook/business-technology/data-team/platform/#snowflake-analyst) or visualized in our data visualization tool
+[Tableau](https://about.gitlab.com/handbook/business-technology/data-team/platform/tableau/), which has access to Snowflake.
+Both platforms need an access request ([Snowflake](https://handbook.gitlab.com/handbook/business-technology/data-team/platform/#warehouse-access), [Tableau](https://handbook.gitlab.com/handbook/business-technology/data-team/platform/tableau/#tableau-online-access)).
-### Querying metrics
+### Tableau
-The following example query returns all values reported for `count_distinct_user_id_from_feature_used_7d` within the last six months and the according `instance_id`:
+Tableau is a data visualization platform and allows building dashboards and GUI based discovery of events and metrics.
+This method of discovery is most suited for users who are familiar with business intelligence tooling, basic verifications
+and for creating persisted, shareable dashboards and visualizations.
+Access to Tableau requires an [access request](https://handbook.gitlab.com/handbook/business-technology/data-team/platform/tableau/#tableau-online-access).
-```sql
-SELECT
-date_trunc('week', ping_created_at),
-dim_instance_id,
-metric_value
-FROM common.fct_ping_instance_metric_rolling_6_months --model limited to last 6 months for performance
-WHERE metrics_path = 'counts.users_visiting_dashboard_weekly' --set to metric of interest
-ORDER BY ping_created_at DESC
-```
+#### Checking events
-For a list of other metrics tables refer to the [Data Models Cheat Sheet](https://handbook.gitlab.com/handbook/product/product-analysis/data-model-cheat-sheet/#commonly-used-data-models).
+Visit the [Snowplow event exploration dashboard](https://10az.online.tableau.com/#/site/gitlab/views/SnowplowEventExplorationLast30Days/SnowplowEventExplorationLast30D?:iid=1).
+This dashboard shows you event counts as well as the most fired events.
+You can scroll down to the "Structured Events Firing in Production Last 30 Days" chart and filter for your specific event action. The filter only works with exact names.
+#### Checking metrics
+You can visit the [Metrics exploration dashboard](https://10az.online.tableau.com/#/site/gitlab/views/PDServicePingExplorationDashboard/MetricsExploration).
+On the side there is a filter for metric path which is the `key_path` of your metric and a filter for the installation ID including guidance on how to filter for GitLab.com.
-### Querying events
+### Snowflake
+Snowflake allows direct querying of relevant tables in the warehouse within their UI with the [Snowflake SQL dialect](https://docs.snowflake.com/en/sql-reference-commands).
+This method of discovery is most suited to users who are familiar with SQL and for quick and flexible checks whether data is correctly propagated.
+Access to Snowflake requires an [access request](https://handbook.gitlab.com/handbook/business-technology/data-team/platform/#warehouse-access).
+#### Querying events
The following example query returns the number of daily event occurrences for the `feature_used` event.
@@ -100,7 +120,23 @@ AND app_id='gitlab' -- use gitlab for production events and gitlab-staging for e
GROUP BY 1 ORDER BY 1 desc
```
-For a list of other event tables refer to the [Data Models Cheat Sheet](https://handbook.gitlab.com/handbook/product/product-analysis/data-model-cheat-sheet/#commonly-used-data-models-2).
+For a list of other metrics tables refer to the [Data Models Cheat Sheet](https://handbook.gitlab.com/handbook/product/product-analysis/data-model-cheat-sheet/#commonly-used-data-models).
+#### Querying metrics
+The following example query returns all values reported for `count_distinct_user_id_from_feature_used_7d` within the last six months and the according `instance_id`:
+```sql
+SELECT
+date_trunc('week', ping_created_at),
+dim_instance_id,
+metric_value
+FROM common.fct_ping_instance_metric_rolling_6_months --model limited to last 6 months for performance
+WHERE metrics_path = 'counts.users_visiting_dashboard_weekly' --set to metric of interest
+ORDER BY ping_created_at DESC
+```
+For a list of other metrics tables refer to the [Data Models Cheat Sheet](https://about.gitlab.com/handbook/product/product-analysis/data-model-cheat-sheet/#commonly-used-data-models).
## Data flow
@@ -131,8 +167,8 @@ flowchart LR;
end
end
snowplow[\Snowplow Pipeline\]
-snowflake[(Data Warehouse)]
+snowflake[(Snowflake Data Warehouse)]
-vis[Dashboards in Sisense/Tableau]
+vis[Dashboards in Tableau]
```
## Data Privacy
......
@@ -162,7 +162,3 @@ To use a metric definition to manage [performance indicator](https://about.gitla
[Metrics Dictionary is a separate application](https://gitlab.com/gitlab-org/analytics-section/analytics-instrumentation/metric-dictionary).
All metrics available in Service Ping are in the [Metrics Dictionary](https://metrics.gitlab.com/).
-### Copy query to clipboard
-To check if a metric has data in Sisense, use the copy query to clipboard feature. This copies a query that's ready to use in Sisense. The query gets the last five service ping data for GitLab.com for a given metric. For information about how to check if a Service Ping metric has data in Sisense, see this [demo](https://www.youtube.com/watch?v=n4o65ivta48).
@@ -33,12 +33,12 @@ Currently, the [Metrics Dictionary](https://metrics.gitlab.com/) is built automa
## Remove a metric
WARNING:
-If a metric is not used in Sisense or any other system after 6 months, the
+If a metric is not used in Tableau or any other system after 6 months, the
Analytics Instrumentation team marks it as inactive and assigns it to the group owner for review.
We are working on automating this process. See [this epic](https://gitlab.com/groups/gitlab-org/-/epics/8988) for details.
-Analytics Instrumentation removes metrics from Service Ping if they are not used in any Sisense dashboard.
+Analytics Instrumentation removes metrics from Service Ping if they are not used in any Tableau dashboard.
For an example of the metric removal process, see this [example issue](https://gitlab.com/gitlab-org/gitlab/-/issues/388236).
......
@@ -43,7 +43,7 @@ We use the following terminology to describe the Service Ping components:
## Service Ping request flow
-The following example shows a basic request/response flow between a GitLab instance, the Versions Application, the License Application, Salesforce, the GitLab S3 Bucket, the GitLab Snowflake Data Warehouse, and Sisense:
+The following example shows a basic request/response flow between a GitLab instance, the Versions Application, the License Application, Salesforce, the GitLab S3 Bucket, the GitLab Snowflake Data Warehouse, and Tableau:
```mermaid
sequenceDiagram
@@ -53,7 +53,7 @@ sequenceDiagram
participant Salesforce
participant S3 Bucket
participant Snowflake DW
-participant Sisense Dashboards
+participant Tableau Dashboards
GitLab Instance->>Versions Application: Send Service Ping
loop Process usage data
Versions Application->>Versions Application: Parse usage data
@@ -70,7 +70,7 @@ sequenceDiagram
Versions Application->>S3 Bucket: Export Versions database
S3 Bucket->>Snowflake DW: Import data
Snowflake DW->>Snowflake DW: Transform data using dbt
-Snowflake DW->>Sisense Dashboards: Data available for querying
+Snowflake DW->>Tableau Dashboards: Data available for querying
Versions Application->>GitLab Instance: DevOps Score (Conversational Development Index)
```
......