Commit 87d1a409, authored by Vasilii Iakliushin

Cleanup "use_lock_for_update_repository_storage" feature flag

Contributes to https://gitlab.com/gitlab-org/gitlab/-/issues/431198

**Problem**

It's possible to run multiple migration workers for the same
project/snippet/group simultaneously.
A worker can also be killed and rescheduled by a Sidekiq interrupt signal,
which leaves the migration in an inconsistent state.

**Solution**

Use an exclusive lock in storage migration workers. The exclusive lease
key includes a project/snippet/group id to prevent simultaneous updates.
The key value is a Sidekiq worker jid to track the owner of the update.
This setup should handle the following situations:

- A worker tries to migrate a repository that already has a migration
  in progress (result: the job is marked as failed).
- A worker started a migration but was interrupted and
  rescheduled (result: the job is marked as failed and the lock is released).
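
The two situations above can be sketched in plain Ruby. This is a minimal illustration, not the code from this commit: `FakeLease` is a hypothetical in-memory stand-in for `Gitlab::ExclusiveLease` (no Redis, no timeout), and `run_migration` is a hypothetical helper that returns `:finished` or `:failed` instead of updating a storage move record.

```ruby
# Hypothetical in-memory stand-in for a Redis-backed exclusive lease.
# Assumption: only the calls used in this commit are modelled, without a timeout.
class FakeLease
  STORE = {} # lease key => uuid of the current owner

  # Returns the uuid that currently holds the key, or nil when the key is free.
  def self.get_uuid(key)
    STORE[key]
  end

  def initialize(key, uuid:)
    @key = key
    @uuid = uuid
  end

  # Returns the uuid when the lease was obtained, false when the key is taken.
  def try_obtain
    return false if STORE.key?(@key)

    STORE[@key] = @uuid
  end

  # Releases the lease only if this uuid still owns it.
  def cancel
    STORE.delete(@key) if STORE[@key] == @uuid
  end
end

# Simplified worker flow (hypothetical helper). The block stands for the
# actual repository storage update.
def run_migration(container_id, jid)
  lease_key = ['update_repository_storage_worker', container_id].join(':')
  lease = FakeLease.new(lease_key, uuid: jid)

  if lease.try_obtain
    begin
      yield if block_given?
      :finished
    ensure
      lease.cancel # always release the lock once the update is done
    end
  else
    # The lock is held. If it is held under our own jid, this job is a
    # rescheduled copy of an interrupted worker, so release the stale lock.
    lease.cancel if FakeLease.get_uuid(lease_key) == jid
    :failed
  end
end
```

In this sketch, a second job with a different jid fails and leaves the lock untouched, while a rescheduled job with the same jid fails once, releases the lock, and lets a later attempt proceed.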

Changelog: changed
Parent dff22bf8
@@ -38,36 +38,32 @@ def perform(*args)
     container_id ||= repository_storage_move.container_id
-    if Feature.enabled?(:use_lock_for_update_repository_storage)
-      # Use exclusive lock to prevent multiple storage migrations at the same time
-      #
-      # Note: instead of using a randomly generated `uuid`, we provide a worker jid value.
-      # That will allow to track a worker that requested a lease.
-      lease_key = [self.class.name.underscore, container_id].join(':')
-      exclusive_lease = Gitlab::ExclusiveLease.new(lease_key, uuid: jid, timeout: LEASE_TIMEOUT)
-      lease = exclusive_lease.try_obtain
-      if lease
-        begin
-          update_repository_storage(repository_storage_move)
-        ensure
-          exclusive_lease.cancel
-        end
-      else
-        # If there is an ungoing storage migration, then the current one should be marked as failed
-        repository_storage_move.do_fail!
+    # Use exclusive lock to prevent multiple storage migrations at the same time
+    #
+    # Note: instead of using a randomly generated `uuid`, we provide a worker jid value.
+    # That will allow to track a worker that requested a lease.
+    lease_key = [self.class.name.underscore, container_id].join(':')
+    exclusive_lease = Gitlab::ExclusiveLease.new(lease_key, uuid: jid, timeout: LEASE_TIMEOUT)
+    lease = exclusive_lease.try_obtain
-        # A special case
-        # Sidekiq can receive an interrupt signal during the processing.
-        # It kills existing workers and reschedules their jobs using the same jid.
-        # But it can cause a situation when the migration is only half complete (see https://gitlab.com/gitlab-org/gitlab/-/issues/429049#note_1635650597)
-        #
-        # Here we detect this case and release the lock.
-        uuid = Gitlab::ExclusiveLease.get_uuid(lease_key)
-        exclusive_lease.cancel if uuid == jid
+    if lease
+      begin
+        update_repository_storage(repository_storage_move)
+      ensure
+        exclusive_lease.cancel
       end
     else
-      update_repository_storage(repository_storage_move)
+      # If there is an ungoing storage migration, then the current one should be marked as failed
+      repository_storage_move.do_fail!
+      # A special case
+      # Sidekiq can receive an interrupt signal during the processing.
+      # It kills existing workers and reschedules their jobs using the same jid.
+      # But it can cause a situation when the migration is only half complete (see https://gitlab.com/gitlab-org/gitlab/-/issues/429049#note_1635650597)
+      #
+      # Here we detect this case and release the lock.
+      uuid = Gitlab::ExclusiveLease.get_uuid(lease_key)
+      exclusive_lease.cancel if uuid == jid
     end
   end
......
----
-name: use_lock_for_update_repository_storage
-introduced_by_url: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/136169
-rollout_issue_url: https://gitlab.com/gitlab-org/gitlab/-/issues/431198
-milestone: '16.6'
-type: development
-group: group::source code
-default_enabled: false
@@ -96,18 +96,6 @@
           expect(repository_storage_move.reload).to be_failed
         end
       end
-      context 'when feature flag "use_lock_for_update_repository_storage" is disabled' do
-        before do
-          stub_feature_flags(use_lock_for_update_repository_storage: false)
-        end
-        it 'ignores lock and calls the update repository storage service' do
-          expect(service).to receive(:execute)
-          subject
-        end
-      end
     end
   end
 end
@@ -172,18 +160,6 @@
           expect(repository_storage_move.reload).to be_failed
         end
       end
-      context 'when feature flag "use_lock_for_update_repository_storage" is disabled' do
-        before do
-          stub_feature_flags(use_lock_for_update_repository_storage: false)
-        end
-        it 'ignores lock and calls the update repository storage service' do
-          expect(service).to receive(:execute)
-          subject
-        end
-      end
     end
   end
 end
......