diff --git a/doc/architecture/blueprints/clickhouse_usage/index.md b/doc/architecture/blueprints/clickhouse_usage/index.md index 610ff81f571fdbe2e733710f70afdfe26e0dd1dd..8a5530313e56a23e20486dbc3d6987be09f166ff 100644 --- a/doc/architecture/blueprints/clickhouse_usage/index.md +++ b/doc/architecture/blueprints/clickhouse_usage/index.md @@ -2,7 +2,7 @@ status: proposed creation-date: "2023-02-02" authors: [ "@nhxnguyen" ] -coach: "@grzesiek" +coach: "@grzesiek" approvers: [ "@dorrino", "@nhxnguyen" ] owning-stage: "~devops::data_stores" participating-stages: ["~section::ops", "~section::dev"] @@ -52,3 +52,7 @@ Note that we are still formulating proposals and will update the blueprint accor ## Best Practices Best practices and guidelines for developing performant and scalable features using ClickHouse are located in the [ClickHouse developer documentation](../../../development/database/clickhouse/index.md). + +## Cost and maintenance analysis + +ClickHouse components cost and maintenance analysis is located in the [ClickHouse Self-Managed component costs and maintenance requirements](self_managed_costs_and_requirements/index.md). diff --git a/doc/architecture/blueprints/clickhouse_usage/self_managed_costs_and_requirements/index.md b/doc/architecture/blueprints/clickhouse_usage/self_managed_costs_and_requirements/index.md new file mode 100644 index 0000000000000000000000000000000000000000..d8c9c0b25d5ee22339475a0fc43f8f51359ffcb0 --- /dev/null +++ b/doc/architecture/blueprints/clickhouse_usage/self_managed_costs_and_requirements/index.md @@ -0,0 +1,65 @@ +--- +status: proposed +creation-date: "2023-04-04" +authors: [ "@niskhakova", "@dmakovey" ] +coach: "@grzesiek" +approvers: [ "@dorrino", "@nhxnguyen" ] +owning-stage: "~workinggroup::clickhouse" +participating-stages: ["~section::enablement"] +--- + +# ClickHouse Self-Managed component costs and maintenance requirements + +## Summary + +[ClickHouse](https://clickhouse.com/) requires additional cost and maintenance for self-managed customers: + +- **Resource allocation cost**: ClickHouse requires a considerable amount of resources to run optimally. + - [Minimum cost estimation](#minimum-self-managed-component-costs) shows that setting up ClickHouse can be applicable only for very large Reference Architectures: 25k and up. +- **High availability**: ClickHouse SaaS supports HA. No documented HA configuration for self-managed at the moment. +- **Geo setups**: Sync and replication complexity for GitLab Geo setups. +- **Upgrades**: An additional database to maintain and upgrade along with existing Postgres database. This also includes compatibility issues of mapping GitLab version to ClickHouse version and keeping them up-to-date. +- **Backup and restore:** Self-managed customers need to have an engineer who is familiar with backup strategies and disaster recovery process in ClickHouse or switch to ClickHouse SaaS. +- **Monitoring**: ClickHouse can use Prometheus, additional component to monitor and troubleshoot. +- **Limitations**: Azure object storage is not supported. GitLab does not have the documentation or support expertise to assist customers with deployment and operation of self-managed ClickHouse. +- **ClickHouse SaaS**: Customers using a self-managed GitLab instance with regulatory or compliance requirements, or latency concerns likely cannot use ClickHouse SaaS. + +### Minimum self-managed component costs + +Based on [ClickHouse spec requirements](https://gitlab.com/gitlab-com/www-gitlab-com/-/issues/14384#note_1307456092) analysis +and collaborating with ClickHouse team, we identified the following minimal configurations for ClickHouse self-managed: + +1. ClickHouse High Availability (HA) + - ClickHouse - 2 machines with >=16-cores, >=64 GB RAM, SSD, 10 GB Internet. Each machine also runs Keeper. + - [Keeper](https://clickhouse.com/docs/en/guides/sre/keeper/clickhouse-keeper) - 1 machine with 2 CPU, 4 GB of RAM, SSD with high IOPS +1. ClickHouse non-HA + - ClickHouse - 1 machine with >=16-cores, >=64 GB RAM, SSD, 10 GB Internet. + +The following [cost table](https://gitlab.com/gitlab-com/www-gitlab-com/-/issues/14384#note_1324085466) was compiled using the machine CPU and memory requirements for ClickHouse, and comparing them to the +GitLab Reference Architecture sizes and [costs](../../../../administration/reference_architectures/index.md#cost-to-run) from the GCP calculator. + +| Reference Architecture | ClickHouse type | ClickHouse cost / (GitLab cost + ClickHouse cost) | +|-------------|-----------------|-----------------------------------| +| [1k - non HA](https://cloud.google.com/products/calculator#id=a6d6a94a-c7dc-4c22-85c4-7c5747f272ed) | [non-HA](https://cloud.google.com/products/calculator#id=9af5359e-b155-451c-b090-5f0879bb591e) | 78.01% | +| [2k - non HA](https://cloud.google.com/products/calculator#id=0d3aff1f-ea3d-43f9-aa59-df49d27c35ca) | [non-HA](https://cloud.google.com/products/calculator#id=9af5359e-b155-451c-b090-5f0879bb591e) | 44.50% | +| [3k - HA](https://cloud.google.com/products/calculator/#id=15fc2bd9-5b1c-479d-bc46-d5ce096b8107) | [HA](https://cloud.google.com/products/calculator#id=9909f5af-d41a-4da2-b8cc-a0347702a823) | 37.87% | +| [5k - HA](https://cloud.google.com/products/calculator/#id=9a798136-53f2-4c35-be43-8e1e975a6663) | [HA](https://cloud.google.com/products/calculator#id=9909f5af-d41a-4da2-b8cc-a0347702a823) | 30.92% | +| [10k - HA](https://cloud.google.com/products/calculator#id=cbe61840-31a1-487f-88fa-631251c2fde5) | [HA](https://cloud.google.com/products/calculator#id=9909f5af-d41a-4da2-b8cc-a0347702a823) | 20.47% | +| [25k - HA](https://cloud.google.com/products/calculator#id=b4b8b587-508a-4433-adc8-dc506bbe924f) | [HA](https://cloud.google.com/products/calculator#id=9909f5af-d41a-4da2-b8cc-a0347702a823) | 14.30% | +| [50k - HA](https://cloud.google.com/products/calculator/#id=48b4d817-d6cd-44b8-b069-0ba9a5d123ea) | [HA](https://cloud.google.com/products/calculator#id=9909f5af-d41a-4da2-b8cc-a0347702a823) | 8.16% | + +NOTE: +The ClickHouse Self-Managed component evaluation is the minimum estimation for the costs +with a simplified architecture. + +The following components increase the cost, and were not considered in the minimum calculation: + +- Disk size - depends on data size, hard to estimate. +- Disk types - ClickHouse recommends [fast SSDs](https://clickhouse.com/docs/ru/operations/tips#storage-subsystem). +- Network usage - ClickHouse recommends using [10 GB network, if possible](https://clickhouse.com/docs/en/operations/tips#network). +- For HA we sum minimum cost across all reference architectures from 3k to 50k users, but HA specs tend to increase with user count. + +### Resources + +- [Research and understand component costs and maintenance requirements of running a ClickHouse instance with GitLab](https://gitlab.com/gitlab-com/www-gitlab-com/-/issues/14384) +- [ClickHouse for Error Tracking on GitLab.com](https://gitlab.com/gitlab-com/gl-infra/readiness/-/blob/master/library/database/clickhouse/index.md)