Unverified commit a65928cf, authored by Nick Thomas

Add a bulk processor for ES incremental updates

Currently, we store bookkeeping information for the Elasticsearch index
in Sidekiq jobs. There are four types of information:

* Backfill indexing for repositories
* Backfill indexing for database records
* Incremental indexing for repositories
* Incremental indexing for database records

The first three use Elasticsearch bulk requests when indexing; the last
does not.

This commit introduces a system that uses bulk requests when indexing
incremental changes to database records. This is done by adding the
bookkeeping information to a Redis ZSET, rather than enqueuing a Sidekiq
job for each change. A Sidekiq cron worker takes batches from the ZSET
and submits them to Elasticsearch via the bulk API.

This reduces the responsiveness of indexing slightly, but also reduces
the cost of indexing, both in terms of the load on Elasticsearch, and
the size of the bookkeeping information.

Since we're using a ZSET, we also get deduplication of work for free.
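The ZSET-based flow described above can be sketched as follows. This is a minimal in-memory stand-in for the Redis sorted set, not GitLab's actual implementation: `BulkIndexQueue`, `track`, and `pop_batch` are illustrative names, and a real worker would use Redis commands (ZADD, ZRANGEBYSCORE, ZREM) and the Elasticsearch bulk endpoint. The key property it demonstrates is deduplication: re-adding a member only updates its score, so repeated changes to the same record collapse into one unit of work.

```python
import itertools

class BulkIndexQueue:
    """In-memory sketch of the Redis ZSET bookkeeping.

    Members are "Class record_id operation" strings; the score is a
    monotonically increasing enqueue counter (standing in for a
    timestamp). Like Redis ZADD, re-adding an existing member just
    refreshes its score, which deduplicates repeated changes to the
    same record for free.
    """

    def __init__(self):
        self._zset = {}                  # member -> score
        self._clock = itertools.count()  # deterministic stand-in for time

    def track(self, klass, record_id, op="index"):
        # Record an incremental change; duplicates collapse onto one member.
        self._zset[f"{klass} {record_id} {op}"] = next(self._clock)

    def pop_batch(self, limit):
        # Take the oldest `limit` members, as the cron worker would before
        # issuing a single Elasticsearch bulk request for the whole batch.
        batch = sorted(self._zset, key=self._zset.get)[:limit]
        for member in batch:
            del self._zset[member]
        return batch
```

For example, tracking the same issue twice and a note once yields a batch of only two members, matching the deduplication claim above:

```python
queue = BulkIndexQueue()
queue.track("Issue", 1)
queue.track("Note", 2)
queue.track("Issue", 1)      # duplicate change, collapses
batch = queue.pop_batch(10)  # two members, not three
```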
Parent: a26c6fae
941 additions, 19 deletions