GitLab JH Helm chart: errors when backing up large files from S3

Summary

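  • The customer frequently hits errors when backing up large files stored in S3 with the GitLab JH Helm chart:
    • The customer's GitLab JH Helm chart version is v14.1.0
    • Errors occur more often when the LFS files are larger than 50 GB
      • Total size of the files in S3 is about 200 GB
      • The k8s node has more than 1 TB of storage
    • The customer's GitLab instance and S3 bucket are both in the Singapore region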

Steps to reproduce

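  • Configure the Helm chart to use S3 object storage
  • Store large files in S3 (for example, larger than 50 GB)
    • Make sure the k8s node has enough local storage
  • Run the Helm chart backup command to back up the files in S3 to local storage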
    • kubectl exec <Toolbox pod name> -it -- backup-utility
  • Changing the following two parameters in the .s3cfg file inside the Toolbox pod does not help; the backup still fails:
    • multipart_chunk_size_mb = 256
    • socket_timeout = 3000000
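
To rule out a typo in the override, the edited .s3cfg can be parsed and the two values read back. This is a minimal sketch, assuming the file is INI-style with a `[default]` section, as s3cmd configuration files normally are. The sample text and the helper name `read_s3cfg_overrides` are hypothetical and not part of the chart or of s3cmd.

```python
import configparser

# Hypothetical sample of the edited .s3cfg; only the two keys from this report are shown.
SAMPLE_S3CFG = """\
[default]
multipart_chunk_size_mb = 256
socket_timeout = 3000000
"""


def read_s3cfg_overrides(text, keys=("multipart_chunk_size_mb", "socket_timeout")):
    """Return the integer values of the given keys from the [default] section."""
    # Interpolation is disabled so that '%' characters in other values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {key: parser.getint("default", key) for key in keys}


if __name__ == "__main__":
    print(read_s3cfg_overrides(SAMPLE_S3CFG))
```

In practice the text would come from the .s3cfg file inside the Toolbox pod rather than from the sample string.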

Example Project

What is the current bug behavior?

backup-utility aborts while s3cmd syncs the LFS bucket to /srv/gitlab/tmp/lfs/: os.lchown on a downloaded file raises PermissionError: [Errno 1] Operation not permitted (see the log below).

What is the expected correct behavior?

The backup of the objects stored in S3 completes without errors, including for large LFS files.

Relevant logs and/or screenshots

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
An unexpected error has occurred.
Please try reproducing the error using
the latest s3cmd code from the git master
branch found at:
https://github.com/s3tools/s3cmd
and have a look at the known issues list:
https://github.com/s3tools/s3cmd/wiki/Common-known-issues-and-their-solutions
If the error persists, please report the
following lines (removing any private
info as necessary) to:
s3tools-bugs@lists.sourceforge.net
 
 
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 
Invoked as: /usr/local/bin/s3cmd --stop-on-error --delete-removed sync s3://gitlab-lfs-20210714/ /srv/gitlab/tmp/lfs/
Problem: <class 'PermissionError'>: [Errno 1] Operation not permitted: b'/srv/gitlab/tmp/lfs/12/45/a5754ef5c1d0ecefcdcd8caedd0b5d15226f255aa9174570cfebae8e1679'
S3cmd: 2.1.0
python: 3.8.9 (default, Apr 21 2021, 05:01:26)
[GCC 8.3.0]
environment LANG=C.UTF-8
 
Traceback (most recent call last):
  File "/usr/local/bin/s3cmd", line 3121, in <module>
    rc = main()
  File "/usr/local/bin/s3cmd", line 3030, in main
    rc = cmd_func(args)
  File "/usr/local/bin/s3cmd", line 1900, in cmd_sync
    return cmd_sync_remote2local(args)
  File "/usr/local/bin/s3cmd", line 1488, in cmd_sync_remote2local
    ret, seq, size_transferred = _download(remote_list, seq, remote_count + update_count, size_transferred, dir_cache)
  File "/usr/local/bin/s3cmd", line 1454, in _download
    raise e
  File "/usr/local/bin/s3cmd", line 1441, in _download
    os.lchown(deunicodise(dst_file),uid,gid)
PermissionError: [Errno 1] Operation not permitted: b'/srv/gitlab/tmp/lfs/12/45/a5754ef5c1d0ecefcdcd8caedd0b5d15226f255aa9174570cfebae8e1679'
 
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
An unexpected error has occurred.
Please try reproducing the error using
the latest s3cmd code from the git master
branch found at:
https://github.com/s3tools/s3cmd
and have a look at the known issues list:
https://github.com/s3tools/s3cmd/wiki/Common-known-issues-and-their-solutions
If the error persists, please report the
above lines (removing any private
info as necessary) to:
s3tools-bugs@lists.sourceforge.net
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
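
The traceback ends in `os.lchown` on the destination file, so the failing step is setting the file's owner rather than reading from S3. The sketch below is not s3cmd's code; it only illustrates how that call behaves. A process can re-apply its own uid/gid to a file it owns, while an unprivileged process that tries to assign a different owner typically gets `PermissionError: [Errno 1]`. The helper name `apply_owner` is hypothetical, and whether a uid/gid mismatch is the cause in this report is an assumption.

```python
import os
import tempfile


def apply_owner(path, uid, gid):
    """Try to set the owner of path; return None on success, or the errno on failure."""
    try:
        os.lchown(path, uid, gid)
        return None
    except PermissionError as exc:
        return exc.errno


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        # Re-applying the current effective uid/gid is permitted.
        print("own uid/gid:", apply_owner(tmp.name, os.geteuid(), os.getegid()))
        # Assigning uid/gid 0 from a non-root process typically fails with errno 1 (EPERM).
        print("uid/gid 0:", apply_owner(tmp.name, 0, 0))
```

Running this as the same user as the Toolbox process against a file under /srv/gitlab/tmp/lfs/ would show whether that user is allowed to change ownership there.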

Output of checks

Results of GitLab environment info

(For installations with omnibus-gitlab package run and paste the output of:
`sudo gitlab-rake gitlab:env:info`)

(For installations from source run and paste the output of:
`sudo -u git -H bundle exec rake gitlab:env:info RAILS_ENV=production`)

Results of GitLab application Check


(For installations with omnibus-gitlab package run and paste the output of:
`sudo gitlab-rake gitlab:check SANITIZE=true`)

(For installations from source run and paste the output of:
`sudo -u git -H bundle exec rake gitlab:check RAILS_ENV=production SANITIZE=true`)

(we will only investigate if the tests are passing)

Possible fixes

Edited by 胡鹏