spec/support/tmpdir.rb · 495b15c327d9d5f44abdc34366aac92f7d39ad77 · gitlab-cn / GitLab

1 year ago

The rake task `gettext:regenerate` takes around 50 seconds on my machine
to extract _all_ externalized strings from ruby, haml, erb, js and vue
sources. It is implemented in a blocking way and therefore parsing
roughly 20000 files takes a long time.

We are introducing a new tooling script `tooling/bin/gettext_extractor`
which is 3x faster by making the following improvements:

1. We parallelize the extraction of ruby, haml and erb with the
`parallel` gem.
2. Instead of passing files through a parser stack and checking which
files a parser can parse, we directly call the parser for each file
type. For example, the original implementation checked every file to
see whether it is a glade file (whatever that is), which took a long
time.
3. js and vue files are still parsed by a shell-out to the pre-existing
node script: `scripts/frontend/extract_gettext_all.js`

This new parser is now used under the hood for the rake tasks:
`gettext:regenerate` and `gettext:update_check`.

There is still room for improvement, and we should look into the
following ideas:

1. We currently scan `ee/spec`, which we probably should not scan. We
still scan it for now, in order to have parity in results.
2. The shell-out to node can be changed to stream the data, rather than
blocking until all frontend files are scanned. However, initial tests
have not shown any performance improvements from that.
3. The `HamlParser` probably could use `GetText::RubyParser` instead of
`RubyGettextExtractor` under the hood.
4. We can likely improve parsing performance _a lot_ by adding guard
checks to see whether a file actually contains the literal names of
the gettext methods (e.g. `_(`, `n_(` or `N_(`). We are already doing
that in the frontend script and improved performance by at least 20
percent there: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/115561#note_1324725274
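The guard check from idea 4 might look something like the sketch below. It assumes a plain substring test is good enough as a pre-filter; the `GETTEXT_MARKERS` constant and both helper names are hypothetical and do not come from the existing scripts.

```ruby
# Literal method-call prefixes named in the commit message. Note that
# `n_(` and `N_(` both contain `_(`, so the first entry alone would
# already match them; the full list is kept for readability.
GETTEXT_MARKERS = ['_(', 'n_(', 'N_('].freeze

# Cheap pre-filter: true if the source could contain a gettext call.
# False positives are fine (the real parser decides), false negatives
# are not, which is why this only tests for literal substrings.
def may_contain_gettext?(source)
  GETTEXT_MARKERS.any? { |marker| source.include?(marker) }
end

# sources: Hash of path => file content. Returns only the entries that
# are worth handing to the (much slower) parser.
def files_worth_parsing(sources)
  sources.select { |_path, source| may_contain_gettext?(source) }
end

candidates = files_worth_parsing(
  'a.rb' => "puts _('Hello')",
  'b.rb' => "puts 'no translations here'"
)
puts candidates.keys.inspect
```

The check can only skip work, never change results, as long as every externalized string goes through one of the listed method names.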

9d3f4829


Faster gettext extractor

Authored by Lukas 'Eipi' Eipert 1 year ago

