Unverified commit b13b3bfa, authored by ddavison

Add bulk data seed file

Add bulk data seed and test step
Use two jobs
Create Data Seeder docker files and docs
Parent 71a429cb
@@ -45,6 +45,60 @@ db:rollback single-db:
    - .single-db
    - .rails:rules:single-db

db:migrate:multi-version-upgrade-1:
  stage: test
  image: ${REGISTRY_HOST}/${REGISTRY_GROUP}/gitlab-build-images/debian-bullseye-ruby-${RUBY_VERSION}:bundler-2.3-docker-${DOCKER_VERSION}
  extends:
    - .db-job-base
    - .use-docker-in-docker
  variables:
    UPGRADE_STOP: 16.3.6-ee
    UPGRADE_STOP_IMAGE: gitlab/gitlab-ee:${UPGRADE_STOP}.0
    UPGRADE_STOP_TAG: v${UPGRADE_STOP}
  before_script:
    # pull, seed, and export data from previous Upgrade Stop
    - docker pull "${UPGRADE_STOP_IMAGE}"
    - |
      docker run \
        -d \
        -v ./scripts/data_seeder:/opt/gitlab/embedded/service/gitlab-rails/scripts/data_seeder \
        -v ./ee/db/seeds/data_seeder:/opt/gitlab/embedded/service/gitlab-rails/ee/db/seeds/data_seeder \
        -v ./ee/lib/tasks/gitlab/seed:/opt/gitlab/embedded/service/gitlab-rails/ee/lib/tasks/gitlab/seed \
        --name gitlab \
        "${UPGRADE_STOP_IMAGE}"
    - docker exec gitlab bash -c "cd /opt/gitlab/embedded/service/gitlab-rails; REF='${UPGRADE_STOP_TAG}' . scripts/data_seeder/test_resources.sh"
    - |
      docker exec gitlab bash -c "cd /opt/gitlab/embedded/service/gitlab-rails; echo \"gem 'gitlab-rspec', path: 'gems/gitlab-rspec'\" >> Gemfile"
    - docker exec gitlab bash -c "cd /opt/gitlab/embedded/service/gitlab-rails; ruby scripts/data_seeder/globalize_gems.rb; bundle install"
    - docker exec gitlab bash -c "gitlab-ctl reconfigure"
    - docker exec gitlab gitlab-rake "ee:gitlab:seed:data_seeder[bulk_data.rb]"
    # dump
    - docker exec gitlab bash -c "mkdir /tmp/xfer; chown gitlab-psql /tmp/xfer"
    - |
      docker exec gitlab bash -c " \
        runuser -l gitlab-psql -c \"pg_dump -U gitlab-psql -h '/var/opt/gitlab/postgresql' gitlabhq_production | gzip > /tmp/xfer/gitlabhq_production.gz\" \
      "
  script:
    - docker cp gitlab:/tmp/xfer/gitlabhq_production.gz .
  artifacts:
    paths: ["gitlabhq_production.gz"]
    expire_in: 3d
  when: manual
  allow_failure: true

db:migrate:multi-version-upgrade-2:
  stage: test
  extends:
    - .db-job-base
  script:
    - gunzip gitlabhq_production.gz
    - bundle exec rake db:drop db:create
    - apt-get update -qq && apt-get install -y -qq postgresql
    - psql -h postgres -U postgres -d gitlabhq_test < gitlabhq_production
    - bundle exec rake gitlab:db:configure
  needs: ["db:migrate:multi-version-upgrade-1"]

db:migrate:reset:
  extends: .db-job-base
  script:
......
@@ -14,7 +14,76 @@ FactoryBot already reflects the change.
## Docker Setup
-See [Data Seeder Docker Demo](https://gitlab.com/-/snippets/2390362)
+### Prerequisites
- Docker installation
### Steps
#### Run a GitLab container
Run and wait for the container to start. The container has started completely when you see the login page at `http://localhost:8080`.
##### With GDK
```shell
$ docker run \
-d \
-v ./scripts/data_seeder:/opt/gitlab/embedded/service/gitlab-rails/scripts/data_seeder \
-v ./ee/db/seeds/data_seeder:/opt/gitlab/embedded/service/gitlab-rails/ee/db/seeds/data_seeder \
-v ./ee/lib/tasks/gitlab/seed:/opt/gitlab/embedded/service/gitlab-rails/ee/lib/tasks/gitlab/seed \
--name gitlab \
gitlab/gitlab-ee:16.7.0-ee.0
```
##### Without GDK
```shell
$ docker run \
--name gitlab \
-d \
gitlab/gitlab-ee:16.7.0-ee.0
```
### Get the root password
If you need to fetch the password for the GitLab instance that was spun up, execute the following command and use the password given by the output:
```shell
$ docker exec gitlab cat /etc/gitlab/initial_root_password
5iveL!fe
```
_If you receive `cat: /etc/gitlab/initial_root_password: No such file or directory`, wait a bit for GitLab to boot and try again._
You can then sign in at `http://localhost:8080/users/sign_in` using the credentials: `root / <Password taken from initial_root_password>`
### Import the test resources
The Data Seeder relies on GitLab test resources, but the GitLab Docker container is meant to be slim and does not ship with them by default.
By default, the script checks out `master`, the default GitLab branch, so you get whatever is latest. To check out a different ref, set the `REF` environment variable.
Execute the following command to provide test resources (namely, FactoryBot factories) for the Seeder to use.
```shell
$ docker exec gitlab bash -c "wget -O - https://gitlab.com/gitlab-org/gitlab/-/raw/master/scripts/data_seeder/test_resources.sh | bash"
# OR check out a specific ref
$ docker exec gitlab bash -c "wget -O - https://gitlab.com/gitlab-org/gitlab/-/raw/master/scripts/data_seeder/test_resources.sh | REF=v16.7.0-ee bash"
```
### Seed the data
**IMPORTANT**: This step should not be executed until the container has started completely and you are able to see the login page at `http://localhost:8080`.
```shell
$ docker exec -it gitlab bash -c "cd /opt/gitlab/embedded/service/gitlab-rails; wget -O - https://gitlab.com/gitlab-org/gitlab/-/raw/master/scripts/data_seeder/globalize_gems.rb | ruby; bundle install"
Fetching gems...
$ docker exec -it gitlab gitlab-rake "ee:gitlab:seed:data_seeder[beautiful_data.rb]"
Seeding data for Administrator
..........................
```
## GDK Setup
@@ -31,20 +100,17 @@ ci: migrated
### Run
-The `ee:gitlab:seed:data_seeder` Rake task takes two arguments. `:name` and `:namespace_id`.
+The `ee:gitlab:seed:data_seeder` Rake task takes one argument, `:file`.
```shell
-$ bundle exec rake "ee:gitlab:seed:data_seeder[data_seeder,1]"
+$ bundle exec rake "ee:gitlab:seed:data_seeder[beautiful_data.rb]"
-Seeding Data for Administrator
+Seeding data for Administrator
....
```
-#### `:name`
+#### `:file`
-Where `:name` is the filename. (This name reflects relative `.rb`, `.yml`, or `.json` files located in `ee/db/seeds/data_seeder`, or absolute paths to seed files.)
+Where `:file` is the path to the seed file: either a `.rb`, `.yml`, or `.json` file relative to `ee/db/seeds/data_seeder`, or an absolute path.
-#### `:namespace_id`
-Where `:namespace_id` is the ID of the User or Group Namespace
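
As a rough illustration of how a single `:file` argument can cover both relative and absolute paths, the sketch below joins the argument onto the seed directory with `Pathname#join`, which returns the argument unchanged when it is already absolute. `SEED_DIR` and `resolve_seed_file` are hypothetical names for this example; the Rake task itself builds the path from `Rails.root`.

```ruby
require 'pathname'

# Hypothetical checkout location; the real task uses Rails.root.
SEED_DIR = Pathname.new('/srv/gitlab/ee/db/seeds/data_seeder')

# Relative names resolve under SEED_DIR; absolute paths pass through,
# because Pathname#join returns an absolute argument as-is.
def resolve_seed_file(file)
  SEED_DIR.join(file)
end

relative = resolve_seed_file('beautiful_data.rb')
absolute = resolve_seed_file('/tmp/my_seed.yml')

puts relative
puts absolute
```

Because the join handles both cases, the task does not need a separate branch for absolute paths.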
## Develop
@@ -64,7 +130,7 @@ The Data Seeder uses FactoryBot definitions from `spec/factories` which ...
Factories reside in `spec/factories/*` and are fixtures for Rails models found in `app/models/*`. For example, for a model named `app/models/issue.rb`, the factory will
be named `spec/factories/issues.rb`. For a model named `app/models/project.rb`, the factory will be named `spec/factories/projects.rb`.
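
The naming convention can be sketched as a small path mapping. This is an illustration only: `factory_path_for` is a hypothetical helper, and it pluralizes by appending "s", an assumption that holds for `issue` and `project` but not for irregular nouns.

```ruby
# Maps a model file to the factory file expected to hold its fixture.
# Pluralization is naive (append "s"); this is an assumption of the sketch.
def factory_path_for(model_path)
  model_name = File.basename(model_path, '.rb')
  File.join('spec/factories', "#{model_name}s.rb")
end

puts factory_path_for('app/models/issue.rb')
puts factory_path_for('app/models/project.rb')
```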
-There are three parsers that the GitLab Data Seeder supports. Ruby, YAML, and JSON.
+The GitLab Data Seeder currently supports three parsers: Ruby, YAML, and JSON.
### Ruby
@@ -74,8 +140,9 @@ The `DataSeeder` class contains the following instance variables defined upon se
- `@seed_file` - The `File` object.
- `@owner` - The owner of the seed data.
-- `@name` - The name of the seed. This is the seed filename without the extension.
+- `@name` - The name of the seed. This is the seed file name without the extension.
- `@group` - The root group that all seeded data is created under.
- `@logger` - The logger object to log output. Logging output may be found in `log/data_seeder.log`.
```ruby
# frozen_string_literal: true
@@ -83,6 +150,8 @@ The `DataSeeder` class contains the following instance variables defined upon se
class DataSeeder
  def seed
    my_group = create(:group, name: 'My Group', path: 'my-group-path', parent: @group)
    @logger.info "Created #{my_group.name}" #=> Created My Group

    my_project = create(:project, :public, name: 'My Project', namespace: my_group, creator: @owner)
  end
end
@@ -130,6 +199,25 @@ The JSON Parser allows you to house seed files in JSON format.
}
```
### Logging
When running the Data Seeder, the default level of logging is set to "information".
You can override the logging level by specifying `GITLAB_LOG_LEVEL=<level>`.
```shell
$ GITLAB_LOG_LEVEL=debug bundle exec rake "ee:gitlab:seed:data_seeder[beautiful_data.rb]"
Seeding data for Administrator
......
$ GITLAB_LOG_LEVEL=warn bundle exec rake "ee:gitlab:seed:data_seeder[beautiful_data.rb]"
Seeding data for Administrator
......
$ GITLAB_LOG_LEVEL=error bundle exec rake "ee:gitlab:seed:data_seeder[beautiful_data.rb]"
......
```
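
The effect of the level override can be reproduced with Ruby's standard `Logger`. This is a minimal sketch, not the Data Seeder's own logger: `build_logger` is a hypothetical helper, and the environment is passed in as a hash so the example does not depend on the real `GITLAB_LOG_LEVEL`.

```ruby
require 'logger'
require 'stringio'

# Hypothetical helper: reads the level from an env-like hash and
# falls back to "info" when GITLAB_LOG_LEVEL is not set.
def build_logger(io, env = ENV)
  Logger.new(io, level: env.fetch('GITLAB_LOG_LEVEL', 'info'))
end

output = StringIO.new
logger = build_logger(output, { 'GITLAB_LOG_LEVEL' => 'warn' })

logger.info('Created My Group')    # below the "warn" threshold, dropped
logger.warn('Something looks off') # at the threshold, written

puts output.string
```

With the level at `warn`, only the second message reaches the output, which mirrors how a stricter `GITLAB_LOG_LEVEL` trims what lands in the log.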
### Taxonomy of a Factory
Factories consist of three main parts - the **Name** of the factory, the **Traits** and the **Attributes**.
......
# frozen_string_literal: true

require 'ffaker'

class DataSeeder
  # @example bundle exec rake "ee:gitlab:seed:data_seeder[bulk_data.rb]"
  # @example GITLAB_LOG_LEVEL=debug bundle exec rake "ee:gitlab:seed:data_seeder[bulk_data.rb]"
  def seed
    build_super_group_labels
    build_subgroups
  end

  private

  def uuid
    SecureRandom.uuid
  end

  # Generate a random number
  # @return [Integer] random number
  def random_number
    rand(1..3)
  end

  def random_future_date
    random_number.days.from_now
  end

  def random_past_date
    random_number.days.ago
  end

  def random_text
    FFaker::Lorem.paragraph
  end

  # @return [Array<Symbol>] random traits
  def random_traits_for(factory)
    FactoryBot.factories.find(factory).defined_traits.map(&:name).sample(rand(0..2)).map(&:to_sym)
  end

  # Build Group Labels in the Supergroup
  def build_super_group_labels
    random_number.times do
      build(:group_label, group: @group, title: uuid, &:save)
    end
  end

  # Build subgroups in the Supergroup
  def build_subgroups
    random_number.times do
      build(:group, name: uuid, parent: @group) do |subgroup|
        next unless subgroup.save

        build_group_labels(subgroup)
        build_milestones(subgroup)
        build_epics(subgroup)
        build_projects(subgroup)
      end
    end
  end

  # Build Group Labels for a Group
  # @param [Group] group
  def build_group_labels(group)
    build(:group_label, group: group, title: uuid, &:save)
  end

  # Build Milestones for a Group
  # @param [Group] group
  def build_milestones(group)
    build(:milestone, :on_group, *random_traits_for(:milestone), title: uuid, group: group, &:save)
  end

  # Build Epics for a Group
  # @param [Group] group
  def build_epics(group)
    random_number.times do
      build(:epic, *random_traits_for(:epic), group: group, author: @owner, &:save)
    end
  end

  # Build Projects
  # @param [Group] subgroup
  def build_projects(subgroup)
    random_number.times do
      build(:project, *random_traits_for(:project), path: uuid, group: subgroup) do |project|
        project.description = random_text
        next unless project.save

        build_project_labels(project)
        build_issues(project)
        build_merge_requests(project)
      end
    end
  end

  # Build Project Labels
  # @param [Project] project
  def build_project_labels(project)
    build(:label, project: project, title: uuid) do |label|
      label.description = random_text
      label.save
    end
  end

  # Build Issues for a Project
  # @param [Project] project
  def build_issues(project)
    random_number.times do
      build(:issue, *random_traits_for(:issue), project: project, author: @owner) do |issue|
        issue.description = random_text
        issue.due_date = random_future_date
        next unless issue.save

        # Assign random Super Group Labels to issues
        issue.labels << @group.labels.sample(rand(0..@group.labels.count))
        # Assign random Group Labels to issues
        issue.labels << project.group.labels.sample(rand(0..project.group.labels.count))
        # Assign random Project Labels to issues
        issue.labels << project.labels.sample(rand(0..project.labels.count))

        assign_random_weight(issue)
        assign_random_milestone(issue)

        # Notes
        random_number.times do
          create(:note, noteable: issue, project: project)
        end
      end
    end
  end

  # Build Merge Requests for a Project
  # @param [Project] project
  def build_merge_requests(project)
    random_number.times do
      build(:merge_request, *random_traits_for(:merge_request), source_project: project,
        author: @owner) do |merge_request|
        merge_request.assignee = @owner
        # Assign random Super group labels
        merge_request.labels << @group.labels.sample(rand(0..@group.labels.count))
        # Assign random Group labels
        merge_request.labels << project.group.labels.sample(rand(0..project.group.labels.count))
        # Assign random Project labels
        merge_request.labels << project.labels.sample(rand(0..project.labels.count))
        merge_request.description = random_text
        merge_request.save
      end
    end
  end

  # Assign a random Weight to an Issue
  # @param [Issue] issue
  def assign_random_weight(issue)
    create(
      :resource_weight_event,
      issue: issue,
      user: @owner,
      weight: random_number,
      created_at: random_past_date
    )
  end

  # Assign a random Milestone to an Issue
  # @param [Issue] issue
  def assign_random_milestone(issue)
    create(
      :resource_milestone_event,
      issue: issue,
      milestone: issue.project.group.milestones.sample,
      created_at: random_past_date,
      action: 'add'
    )
  end
end
# frozen_string_literal: true

require 'ostruct'
require 'rspec'
require Rails.root.join('spec/support/helpers/stub_method_calls')
require Rails.root.join('spec/support/factory_bot')

module Gitlab
  module DataSeeder
@@ -8,6 +11,14 @@ class << self
      # Seed test data using GitLab Data Seeder
      # @param [String] seed_file the full-path of the seed file to load (.yml, .rb)
      def seed(owner, seed_file)
        FactoryBot.define do
          after(:create) do |resource|
            Gitlab::DataSeeder::Logger.info(seeding: "#<#{resource.class.name}>")
            print '.'
          end
        end

        case File.basename(seed_file)
        when /\.y(a)?ml(\.erb)?/
          Parsers::Yaml.new(seed_file, owner).parse
@@ -219,6 +230,7 @@ def initialize(seed_file, owner)
          @seed_file = seed_file
          @owner = owner
          @name = File.basename(@seed_file, '.rb')
          @logger = Gitlab::DataSeeder::Logger.build

          # create the seeded group with a path that is hyphenated and random
          @group = FactoryBot.create(:group, name: @name,
@@ -236,11 +248,22 @@ def parse
            seeder.instance_variable_set(:@owner, @owner)
            seeder.instance_variable_set(:@name, @name)
            seeder.instance_variable_set(:@group, @group)
            seeder.instance_variable_set(:@logger, @logger)

            seeder.seed
          end
        end
      end
    end

    class Logger < Gitlab::Logger
      def self.file_name_noext
        'data_seeder'
      end

      def self.log_level(fallback: ::Logger::INFO)
        ENV.fetch('GITLAB_LOG_LEVEL', fallback)
      end
    end
  end
end
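
The `case` on the seed file's base name in `seed` picks a parser from the file extension. Only the YAML branch is visible in this hunk, so the sketch below is an assumption about the overall shape: `parser_for`, the `.rb` and `.json` patterns, and the returned symbols are illustrative, chosen to match the three parsers the docs list.

```ruby
# Returns a symbol naming the parser for a seed file, based on its extension.
# The YAML pattern follows the one in the hunk above, anchored to the end of
# the name for strictness; the other two branches are hypothetical.
def parser_for(seed_file)
  case File.basename(seed_file)
  when /\.ya?ml(\.erb)?\z/ then :yaml
  when /\.json(\.erb)?\z/  then :json
  when /\.rb\z/            then :ruby
  else raise ArgumentError, "Unsupported seed file: #{seed_file}"
  end
end

puts parser_for('bulk_data.rb')
puts parser_for('ee/db/seeds/data_seeder/example.yml.erb')
```

Anchoring the patterns with `\z` keeps a name such as `notes.rb.txt` from being treated as a Ruby seed, which an unanchored match would allow.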
@@ -4,19 +4,19 @@ namespace :ee do
  namespace :gitlab do
    namespace :seed do
      # @example
-      # $ rake "ee:gitlab:seed:data_seeder[path/to/seed/file,12345]"
+      # $ rake "ee:gitlab:seed:data_seeder[path/to/seed/file(.rb,.yml,.json)]"
-      desc 'Seed test data for a given namespace'
+      desc 'Seed data using GitLab Data Seeder'
-      task :data_seeder, [:co, :namespace_id] => :environment do |_, argv|
+      task :data_seeder, [:file] => :environment do |_, argv|
-        require 'factory_bot'
-        require Rails.root.join('ee/db/seeds/data_seeder/data_seeder.rb')
+        require Rails.root.join('ee/db/seeds/data_seeder/data_seeder')
-        seed_file = Rails.root.join('ee/db/seeds/data_seeder', argv[:co])
+        seed_file = Rails.root.join('ee/db/seeds/data_seeder', argv[:file])
        raise "Seed file `#{seed_file}` does not exist" unless File.exist?(seed_file)
-        puts "Seeding demo data for #{Namespace.find(argv[:namespace_id]).name}"
+        admin = User.admins.first
+        puts "Seeding data for #{admin.name}"
-        Gitlab::DataSeeder.seed(User.admins.first, seed_file.to_s)
+        Gitlab::DataSeeder.seed(admin, seed_file.to_s)
      end
    end
  end
......
@@ -38,7 +38,7 @@ def seed
    end

    it 'prints a seeding statement' do
-      expect { run_rake }.to output(/Seeding demo data/).to_stdout
+      expect { run_rake }.to output(/Seeding data/).to_stdout
    end

    it 'prints a done statement' do
......
# frozen_string_literal: true

# This script ...
#   - Opens a Gemfile
#   - Copies the line that contains a specific gem and its version
#   - Pastes the copied lines to EOF
#
# ... to pull the gems out of their defined groups (like :development, :test, etc.)
# @note Duplicate entries will be created which will cause Bundler warnings, but this is expected.
# @usage ruby globalize_gems.rb

GEMS_TO_FIND = %w[factory_bot_rails ffaker parallel].freeze

File.open('Gemfile', 'a+') do |file|
  lines_added = []

  file.each_line do |line|
    next unless line.match?(/gem ['"]#{Regexp.union(GEMS_TO_FIND)}["']/)

    lines_added << line
    puts line
  end

  lines_added.each { |ln| file.write(ln) }
end
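
To see what the script above does without touching a real Gemfile, the same copy-to-EOF step can be run against a temporary file. This is a sketch under assumptions: `globalize_gems` is a hypothetical wrapper around the same matching logic, and the Gemfile contents are invented for the demo.

```ruby
require 'tempfile'

GEMS = %w[factory_bot_rails ffaker parallel].freeze

# Appends a copy of every matching `gem` line to the end of the Gemfile
# and returns the copied lines.
def globalize_gems(gemfile_path, gems)
  pattern = /gem ['"]#{Regexp.union(gems)}["']/
  matches = File.readlines(gemfile_path).select { |line| line.match?(pattern) }
  File.open(gemfile_path, 'a') { |file| matches.each { |line| file.write(line) } }
  matches
end

# Hypothetical Gemfile contents, written to a temp file for the demo.
added = Tempfile.create('Gemfile') do |gemfile|
  gemfile.write(<<~GEMFILE)
    gem 'rails'

    group :test do
      gem 'ffaker'
      gem 'parallel_tests'
    end
  GEMFILE
  gemfile.flush

  globalize_gems(gemfile.path, GEMS)
end

puts added
```

The closing quote in the pattern is what keeps `parallel_tests` from matching `parallel`. The copied line keeps its leading indentation, as it does in the original script.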
#!/usr/bin/env bash
# this script ...
# - sparse clones the gitlab repo (with a specific ref) only targeting the spec and ee/spec directories.
# - moves the spec and ee/spec directories to the gitlab-rails service directory within Docker.
set -euo pipefail
ref=${REF:-master}
tmp=$(mktemp -d)
git clone --single-branch --branch "$ref" https://gitlab.com/gitlab-org/gitlab.git --no-checkout --depth 1 "${tmp}"
cd "${tmp}"
git sparse-checkout init --cone; git sparse-checkout add spec ee/spec; git checkout
echo "Checked out ${ref}"
mv spec /opt/gitlab/embedded/service/gitlab-rails; mv ee/spec /opt/gitlab/embedded/service/gitlab-rails/ee