From 339b0ed2d59261dfee18f6d0a09838d371d64517 Mon Sep 17 00:00:00 2001
From: Madelein van Niekerk <mvanniekerk@gitlab.com>
Date: Tue, 29 Aug 2023 23:14:00 +0000
Subject: [PATCH] Update developer documentation for User Caching during GitHub
 Import

---
 doc/development/github_importer.md | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)
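
Reviewer note (ignored when the patch is applied): the per-project conditional
recheck described in this change can be sketched roughly as below. This is a
minimal illustration, not the importer's actual code. The in-memory dict stands
in for Redis, and the key names, the `lookup_email` function, and the
`fetch(username, etag)` callback returning `(status, email, etag)` are all
hypothetical.

```python
# Hypothetical in-memory stand-in for Redis; key names are illustrative only.
cache = {}


def lookup_email(username, project_id, fetch):
    """Return the cached or fetched email for a GitHub username, or None.

    `fetch(username, etag)` is an assumed callback that performs the API call
    (sending the ETag as a conditional header when one is given) and returns
    a `(status, email, etag)` tuple.
    """
    email_key = f"email-for/{username}"
    etag_key = f"etag-for/{username}"
    checked_key = f"email-checked/{project_id}/{username}"

    cached = cache.get(email_key)
    if cached:
        # Positive lookup: use the cached value, no API call needed.
        return cached
    if email_key in cache and cache.get(checked_key):
        # Negative lookup already rechecked for this project: do not retry.
        return None

    status, email, etag = fetch(username, cache.get(etag_key))
    cache[checked_key] = True  # recheck at most once per project
    if status == 304:
        # Not modified: the cached negative result is still valid.
        return None

    cache[email_key] = email or ""
    if etag:
        cache[etag_key] = etag
    return email or None
```

With this shape, a user whose email was empty during one import is rechecked
once when another project is imported, and an unchanged profile costs only a
conditional request.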

diff --git a/doc/development/github_importer.md b/doc/development/github_importer.md
index d38be071f3965..2fcf1eb1c99ce 100644
--- a/doc/development/github_importer.md
+++ b/doc/development/github_importer.md
@@ -243,11 +243,13 @@ To avoid mismatching users, the search by GitHub user ID is not done when import
 Enterprise.
 
 Because this process is quite expensive we cache the result of these lookups in
-Redis. For every user looked up we store three keys:
+Redis. For every user looked up we store five keys:
 
 - A Redis key mapping GitHub usernames to their Email addresses.
 - A Redis key mapping a GitHub Email addresses to a GitLab user ID.
 - A Redis key mapping a GitHub user ID to GitLab user ID.
+- A Redis key mapping a GitHub username to an ETag header.
+- A Redis key indicating whether an email lookup has been done for a project.
 
 We cache two types of lookups:
 
@@ -260,9 +262,12 @@ The expiration time of these keys is 24 hours. When retrieving the cache of a
 positive lookup, we refresh the TTL automatically. The TTL of false lookups is
 never refreshed.
 
+If an email lookup returns an empty or negative result, a [Conditional Request](https://docs.github.com/en/rest/overview/resources-in-the-rest-api#conditional-requests) is made with the cached ETag in the header, at most once for every project.
+Conditional Requests that return `304 Not Modified` do not count towards the GitHub API rate limit.
+
 Because of this caching layer, it's possible newly registered GitLab accounts
 aren't linked to their corresponding GitHub accounts. This, however, is resolved
-after the cached keys expire.
+after the cached keys expire or when a new project is imported.
 
 The user cache lookup is shared across projects. This means that the greater the number of
 projects that are imported, fewer GitHub API calls are needed.
-- 
GitLab