diff --git a/doc/administration/geo/index.md b/doc/administration/geo/index.md index a78b51c5d0d85aebac39b8695b1e5f446497736f..f709bac02c0ab2ad2fe4446838ac79ea7328d870 100644 --- a/doc/administration/geo/index.md +++ b/doc/administration/geo/index.md @@ -18,10 +18,12 @@ Geo undergoes significant changes from release to release. Upgrades are supported and [documented](#upgrading-geo), but you should ensure that you're using the right version of the documentation for your installation. -Fetching large repositories can take a long time for teams located far from a single GitLab instance. +Fetching large repositories can take a long time for teams and runners located far from a single GitLab instance. -Geo provides local, read-only sites of your GitLab instances. This can reduce the time it takes -to clone and fetch large repositories, speeding up development. +Geo provides local caches that can be placed geographically close to remote teams which can serve read requests. This can reduce the time it takes +to clone and fetch large repositories, speeding up development and increasing the productivity of your remote teams. + +Geo secondary sites transparently proxy write requests to the primary site. All Geo sites can be configured to respond to a single GitLab URL, to deliver a consistent, seamless, and comprehensive experience whichever site the user lands on. To make sure you're using the right version of the documentation, go to [the Geo page on GitLab.com](https://gitlab.com/gitlab-org/gitlab/-/blob/master/doc/administration/geo/index.md) and choose the appropriate release from the **Switch branch/tag** dropdown list. For example, [`v13.7.6-ee`](https://gitlab.com/gitlab-org/gitlab/-/blob/v13.7.6-ee/doc/administration/geo/index.md). @@ -46,9 +48,8 @@ In addition, it: Geo provides: -- Read-only **secondary** sites: Maintain one **primary** GitLab site while still enabling read-only **secondary** sites for each of your distributed teams. +- A complete GitLab experience on **Secondary** sites: Maintain one **primary** GitLab site while enabling **secondary** sites with full read and write and UI experience for each of your distributed teams. - Authentication system hooks: **Secondary** sites receive all authentication data (like user accounts and logins) from the **primary** instance. -- An intuitive UI: **Secondary** sites use the same web interface your team has grown accustomed to. In addition, there are visual notifications that block write operations and make it clear that a user is on a **secondary** sites. ### Gitaly Cluster @@ -64,7 +65,7 @@ Your Geo instance can be used for cloning and fetching projects, in addition to When Geo is enabled, the: - Original instance is known as the **primary** site. -- Replicated read-only sites are known as **secondary** sites. +- Replicating sites are known as **secondary** sites. Keep in mind that: @@ -86,7 +87,7 @@ In this diagram: - There is the **primary** site and the details of one **secondary** site. - Writes to the database can only be performed on the **primary** site. A **secondary** site receives database - updates via PostgreSQL streaming replication. + updates by using [PostgreSQL streaming replication](https://wiki.postgresql.org/wiki/Streaming_Replication). - If present, the [LDAP server](#ldap) should be configured to replicate for [Disaster Recovery](disaster_recovery/index.md) scenarios. - A **secondary** site performs different type of synchronizations against the **primary** site, using a special authorization protected by JWT: @@ -97,6 +98,7 @@ From the perspective of a user performing Git operations: - The **primary** site behaves as a full read-write GitLab instance. - **Secondary** sites are read-only but proxy Git push operations to the **primary** site. This makes **secondary** sites appear to support push operations themselves. +- **Secondary** sites proxy web UI requests to the primary. This makes the **secondary** sites appear to support full UI read/write operations. To simplify the diagram, some necessary components are omitted. @@ -106,7 +108,7 @@ To simplify the diagram, some necessary components are omitted. A **secondary** site needs two different PostgreSQL databases: - A read-only database instance that streams data from the main GitLab database. -- [Another database instance](#geo-tracking-database) used internally by the **secondary** site to record what data has been replicated. +- A [read/write database instance(tracking database)](#geo-tracking-database) used internally by the **secondary** site to record what data has been replicated. In **secondary** sites, there is an additional daemon: [Geo Log Cursor](#geo-log-cursor). @@ -120,7 +122,10 @@ The following are required to run Geo: - [CentOS](https://www.centos.org) 7.4 or later - [Ubuntu](https://ubuntu.com) 16.04 or later - [Supported PostgreSQL versions](https://about.gitlab.com/handbook/engineering/development/enablement/data_stores/database/postgresql-upgrade-cadence.html) for your GitLab releases with [Streaming Replication](https://wiki.postgresql.org/wiki/Streaming_Replication). - - Note,[PostgreSQL 12 is deprecated](../../update/deprecations.md#postgresql-12-deprecated) and is removed in GitLab 16.0. + +NOTE: +[PostgreSQL Logical replication](https://www.postgresql.org/docs/current/logical-replication.html) is not supported. + - All sites must run [the same PostgreSQL versions](setup/database.md#postgresql-replication). - Where possible, you should also use the same operating system version on all Geo sites. If using different operating system versions between Geo sites, you @@ -171,7 +176,7 @@ To update the internal URL of the primary Geo site: ### Geo Tracking Database -The tracking database instance is used as metadata to control what needs to be updated on the disk of the local instance. For example: +The tracking database instance is used as metadata to control what needs to be updated on the local instance. For example: - Download new assets. - Fetch new LFS Objects. @@ -206,11 +211,12 @@ This list of limitations only reflects the latest version of GitLab. If you are - GitLab Runners cannot register with a **secondary** site. Support for this is [planned for the future](https://gitlab.com/gitlab-org/gitlab/-/issues/3294). - [Selective synchronization](replication/configuration.md#selective-synchronization) only limits what repositories and files are replicated. The entire PostgreSQL data is still replicated. Selective synchronization is not built to accommodate compliance / export control use cases. - [Pages access control](../../user/project/pages/pages_access_control.md) doesn't work on secondaries. See [GitLab issue #9336](https://gitlab.com/gitlab-org/gitlab/-/issues/9336) for details. -- [Disaster recovery](disaster_recovery/index.md) for multi-secondary sites causes downtime due to the complete re-synchronization and re-configuration of all non-promoted secondaries. +- [Disaster recovery](disaster_recovery/index.md) for deployments that have multiple secondary sites causes downtime due to the need to perform complete re-synchronization and re-configuration of all non-promoted secondaries to follow the new primary site. - For Git over SSH, to make the project clone URL display correctly regardless of which site you are browsing, secondary sites must use the same port as the primary. [GitLab issue #339262](https://gitlab.com/gitlab-org/gitlab/-/issues/339262) proposes to remove this limitation. - Git push over SSH against a secondary site does not work for pushes over 1.86 GB. [GitLab issue #413109](https://gitlab.com/gitlab-org/gitlab/-/issues/413109) tracks this bug. - Backups [cannot be run on secondaries](replication/troubleshooting.md#message-error-canceling-statement-due-to-conflict-with-recovery). - Git clone and fetch requests with option `--depth` over SSH against a secondary site does not work and hangs indefinitely if the secondary site is not up to date at the time the request is initiated. For more information, see [issue 391980](https://gitlab.com/gitlab-org/gitlab/-/issues/391980). +- Git push with options over SSH against a secondary site does not work and terminates the connection. For more information, see [issue 417186](https://gitlab.com/gitlab-org/gitlab/-/issues/417186). ### Limitations on replication/verification @@ -236,12 +242,6 @@ For information on how to update your Geo sites to the latest GitLab version, se > - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/35913) in GitLab 13.2. -WARNING: -In GitLab 13.2 and 13.3, promoting a secondary site to a primary while the -secondary is paused fails. Do not pause replication before promoting a -secondary. If the site is paused, be sure to resume before promoting. This -issue has been fixed in GitLab 13.4 and later. - WARNING: Pausing and resuming of replication is only supported for Geo installations using a Linux package-managed database. External databases are not supported.