    Faster gettext extractor · 9d3f4829
    Authored by Lukas 'Eipi' Eipert
    The rake task `gettext:regenerate` takes around 50 seconds on my machine
    to extract _all_ externalized strings from ruby, haml, erb, js and vue
    sources. It is implemented in a blocking way and therefore parsing
    roughly 20000 files takes a long time.
    
    We are introducing a new tooling script `tooling/bin/gettext_extractor`
    which is 3x faster by making the following improvements:
    
    1. We parallelize the extraction of ruby, haml and erb with the
       `parallel` gem.
    2. Instead of passing files through a parser stack and checking which
       files a parser can parse, we directly call the parser for each file
       type. For example, the original implementation checked every file
       for whether it is a glade file (whatever that is), which took a
       long time.
    3. js and vue files are still parsed by a shell-out to the pre-existing
       node script: `scripts/frontend/extract_gettext_all.js`
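
The first two improvements can be illustrated with a minimal sketch. It is not the code in `tooling/bin/gettext_extractor`: `TOY_PARSER`, `PARSERS`, `extract_from` and `extract_all` are hypothetical names, and the toy parser only understands the simplest `_('...')` form. The point is the lookup by file extension, which replaces asking every parser in a stack whether it can handle a file.

```ruby
# Toy stand-in for the real ruby/haml/erb parsers (an assumption for this
# sketch): it only recognises the simplest `_('...')` form and returns the
# quoted msgids.
SIMPLE_CALL = /(?<!\w)_\(\s*(['"])(.*?)\1\s*\)/
TOY_PARSER = ->(content) { content.scan(SIMPLE_CALL).map { |_quote, msgid| msgid } }

# Hypothetical dispatch table: one parser per file extension, looked up
# directly instead of probing each parser in turn.
PARSERS = {
  '.rb'   => TOY_PARSER,
  '.haml' => TOY_PARSER,
  '.erb'  => TOY_PARSER
}.freeze

# Returns the msgids found in `content`, or [] for unknown file types.
def extract_from(path, content)
  parser = PARSERS[File.extname(path)]
  parser ? parser.call(content) : []
end

# Sequential stand-in so the sketch runs without extra gems. With the
# `parallel` gem installed, `paths.map` could be swapped for
# `Parallel.map(paths, in_processes: n)` to spread files across workers.
def extract_all(paths)
  paths.map { |path| extract_from(path, File.read(path)) }.flatten.uniq
end
```

Because each file is handled independently and only plain arrays of strings come back, the per-file work is easy to hand to worker processes.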
    
    This new parser is now used under the hood for the rake tasks:
    `gettext:regenerate` and `gettext:update_check`.
    
    There is still room for improvement, and we should look into the
    following ideas:
    
    1. We currently scan `ee/spec`, which we probably should not scan. We
       still scan it for now, in order to have parity in results.
    2. The shell-out to node can be changed to stream the data, rather than
       blocking until all frontend files are scanned. However, initial tests
       have not shown any performance improvements from that.
    3. The `HamlParser` probably could use `GetText::RubyParser` instead of
       `RubyGettextExtractor` under the hood.
    4. We likely can improve parsing performance _a lot_ by adding guard
       checks to see whether a file actually contains the literal names of
       the gettext methods, e.g. `_(`, `n_(` or `N_(`. We are already doing
       that in the frontend script and improved performance by at least 20
       percent there: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/115561#note_1324725274
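
The guard check from idea 4 could look roughly like the sketch below. It is only an illustration: `may_contain_gettext?` and `extract_with_guard` are hypothetical names, and the check assumes that every gettext helper call contains the literal `_(`, which holds for `_(`, `n_(` and `N_(` but not for calls written without parentheses.

```ruby
# Cheap pre-check: a plain substring search is far less work than a full
# parse. It is deliberately conservative, so false positives are possible
# and simply fall through to the parser.
def may_contain_gettext?(content)
  content.include?('_(')
end

# Runs `parser` (any callable taking file content and returning msgids)
# only when the guard says the file might contain externalized strings.
def extract_with_guard(content, parser)
  return [] unless may_contain_gettext?(content)

  parser.call(content)
end
```

The guard only pays off if most files contain no gettext calls at all, so it would need to be measured against the real file set.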