- October 14, 2020
- October 07, 2020
- September 23, 2020
-
-
Authored by Rohit Shekhar
-
Authored by Jason Gustafson
This is the core Raft implementation specified by KIP-595: https://cwiki.apache.org/confluence/display/KAFKA/KIP-595%3A+A+Raft+Protocol+for+the+Metadata+Quorum. We have created a separate "raft" module where most of the logic resides. The new APIs introduced in this patch to support Raft election are disabled in the server until the integration with the controller is complete. Until then, there is a standalone server which can be used for testing the performance of the Raft implementation. See `raft/README.md` for details. Reviewers: Guozhang Wang <wangguoz@gmail.com>, Boyang Chen <boyang@confluent.io>
Co-authored-by: Boyang Chen <boyang@confluent.io>
Co-authored-by: Guozhang Wang <wangguoz@gmail.com>
-
Authored by Bruno Cadonna
Currently, when a task directory is cleaned manually, the message for the state dir cleaner is logged instead of the message for the manual cleanup. This is because the code checks the elapsed time since the last update before checking whether the cleanup is a manual call. This commit changes the order of the checks. Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, Matthias J. Sax <mjsax@apache.org>, Walker Carlson <wcarlson@confluent.io>, John Roesler <vvcephei@apache.org>
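The fix is purely about check ordering. A minimal sketch of the idea (the class and method names are hypothetical, not the actual Streams state-directory code):

```java
// Sketch of the check-ordering fix described above. Names are
// hypothetical; only the ordering idea comes from the commit message:
// consult the "manual cleanup" flag before the elapsed-time guard, so
// a manual call is never mislabeled as the periodic cleaner.
public class CleanupLogDemo {
    static String cleanupMessage(boolean manualCall, long elapsedMs, long cleanupDelayMs) {
        if (manualCall) {
            return "Deleting state directory (manual user call)";
        }
        if (elapsedMs >= cleanupDelayMs) {
            return "Deleting obsolete state directory (state dir cleaner)";
        }
        return "Nothing to clean up";
    }

    public static void main(String[] args) {
        // A manual call now logs the manual message even when the
        // elapsed time also exceeds the cleaner delay.
        System.out.println(cleanupMessage(true, 100_000, 60_000));
    }
}
```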
-
- September 22, 2020
-
-
Authored by A. Sophie Blee-Goldman
Add tests to bound the performance of the various Streams task assignors when making assignments over large clusters/tasks. Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <vvcephei@apache.org>
-
Authored by Luke Chen
Reviewer: Matthias J. Sax <matthias@confluent.io>
-
Authored by Ismael Juma
`forKeyValue` invokes `foreachEntry` in Scala 2.13 and falls back to `foreach` in Scala 2.12. This change requires a newer version of scala-collection-compat, so update it to the latest version (2.2.0). Finally, included a minor clean-up in `GetOffsetShell` to use `toArray` before `sortBy` since it's more efficient. Reviewers: Jason Gustafson <jason@confluent.io>, David Jacot <djacot@confluent.io>, José Armando García Sancio <jsancio@users.noreply.github.com>, Chia-Ping Tsai <chia7712@gmail.com>
-
Authored by Luke Chen
Fix `currentStateTimeStamp` not being set in `GROUP_METADATA_VALUE_SCHEMA_V3`, and do a small refactor to use `GROUP_VALUE_SCHEMAS.size - 1` in place of the hard-coded default max version number. Also add a test for it. Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>
-
- September 21, 2020
-
-
Authored by Chia-Ping Tsai
There are no checks on the header key, so instantiating the key (bytes to string) is unnecessary. One implication is that conversion failures will be detected a bit later, but this is consistent with how we handle the header value.

**JMH RESULT**
1. ops: +12%
2. The memory-usage improvement is very small, as the cost of creating the extra ```ByteBuffer``` is almost the same as the byte array copy (used to construct the ```String```). Using a large key results in a better improvement, but I don't think large keys are the common case.

**BEFORE**
```
Benchmark (bufferSupplierStr) (bytes) (compressionType) (headerKeySize) (maxBatchSize) (maxHeaderSize) (messageSize) (messageVersion) Mode Cnt Score Error Units
RecordBatchIterationBenchmark.measureValidation NO_CACHING RANDOM NONE 10 200 5 1000 2 thrpt 15 2035938.174 ± 1653.566 ops/s
RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm NO_CACHING RANDOM NONE 10 200 5 1000 2 thrpt 15 2040.000 ± 0.001 B/op
```
```
Benchmark (bufferSupplierStr) (bytes) (compressionType) (headerKeySize) (maxBatchSize) (maxHeaderSize) (messageSize) (messageVersion) Mode Cnt Score Error Units
RecordBatchIterationBenchmark.measureValidation NO_CACHING RANDOM NONE 30 200 5 1000 2 thrpt 15 1979193.376 ± 1239.286 ops/s
RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm NO_CACHING RANDOM NONE 30 200 5 1000 2 thrpt 15 2120.000 ± 0.001 B/op
```

**AFTER**
```
Benchmark (bufferSupplierStr) (bytes) (compressionType) (headerKeySize) (maxBatchSize) (maxHeaderSize) (messageSize) (messageVersion) Mode Cnt Score Error Units
RecordBatchIterationBenchmark.measureValidation NO_CACHING RANDOM NONE 10 200 5 1000 2 thrpt 15 2289115.973 ± 2661.856 ops/s
RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm NO_CACHING RANDOM NONE 10 200 5 1000 2 thrpt 15 2032.000 ± 0.001 B/op
```
```
Benchmark (bufferSupplierStr) (bytes) (compressionType) (headerKeySize) (maxBatchSize) (maxHeaderSize) (messageSize) (messageVersion) Mode Cnt Score Error Units
RecordBatchIterationBenchmark.measureValidation NO_CACHING RANDOM NONE 30 200 5 1000 2 thrpt 15 2222625.706 ± 908.358 ops/s
RecordBatchIterationBenchmark.measureValidation:·gc.alloc.rate.norm NO_CACHING RANDOM NONE 30 200 5 1000 2 thrpt 15 2040.000 ± 0.001 B/op
```
Reviewers: Ismael Juma <ismael@juma.me.uk>
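The optimization above can be sketched as follows: hold the raw key bytes and decode to `String` only on first access. This is an illustrative class, not Kafka's actual header implementation:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Sketch of the lazy header-key idea: keep the key as raw bytes and
// only build the String when (and if) it is asked for. Conversion
// failures are therefore detected later, on first access, which is
// the trade-off the commit message describes.
public class LazyHeaderKeyDemo {
    private final ByteBuffer keyBuffer;
    private String key; // materialized lazily

    public LazyHeaderKeyDemo(ByteBuffer keyBuffer) {
        this.keyBuffer = keyBuffer;
    }

    public String key() {
        if (key == null) {
            // Decode happens here, not eagerly while iterating the batch.
            key = StandardCharsets.UTF_8.decode(keyBuffer.duplicate()).toString();
        }
        return key;
    }

    public static void main(String[] args) {
        LazyHeaderKeyDemo h = new LazyHeaderKeyDemo(
            ByteBuffer.wrap("trace-id".getBytes(StandardCharsets.UTF_8)));
        System.out.println(h.key());
    }
}
```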
-
Authored by Ismael Juma
We have been testing with the Java 15 release candidate for a few weeks and it has now been declared final. Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, David Jacot <djacot@confluent.io>, Lee Dongjin <dongjin@apache.org>
-
- September 20, 2020
-
-
Authored by A. Sophie Blee-Goldman
Reviewers: Guozhang Wang <wangguoz@gmail.com>
-
- September 19, 2020
-
-
Authored by Zach Zhang
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
-
Authored by Tom Bentley
Currently the docs have HTML ids for each config key. That doesn't work correctly for config keys like bootstrap.servers, which occur across the producer, consumer, and admin configs: we generate duplicate ids. So arrange for each config to prefix the ids it generates with the HTML id of its section heading. Reviewers: Mickael Maison <mickael.maison@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
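The id scheme described above can be sketched like this (the helper and the section ids are illustrative; the actual doc generator's names may differ):

```java
// Sketch of the id-prefixing fix: instead of emitting the same
// id="bootstrap.servers" anchor in every config section, prefix the
// anchor with the section heading's HTML id so each occurrence is
// unique on the page.
public class ConfigAnchorDemo {
    static String anchorId(String sectionId, String configName) {
        return sectionId + "_" + configName;
    }

    public static void main(String[] args) {
        // The same config key now gets distinct ids per section.
        System.out.println(anchorId("producerconfigs", "bootstrap.servers"));
        System.out.println(anchorId("consumerconfigs", "bootstrap.servers"));
    }
}
```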
-
Authored by Luke Chen
In the test, we first remove 1 member from the group, and then remove the other 2 members, and it sometimes failed at the second member-count assertion. After investigation, I found it's because auto commit is enabled for the consumers (the default setting), and the removed consumer's offset commit gets the `UNKNOWN_MEMBER_ID` error, which then makes the member rejoin (see ConsumerCoordinator#OffsetCommitResponseHandler). That's why, after removing the second batch of members, the member list is sometimes not empty. I set the consumer config to disable auto commit to fix this issue. Thanks. Author: Luke Chen <showuon@gmail.com> Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com> Closes #9062 from showuon/KAFKA-8098
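The fix boils down to one consumer config. A minimal sketch (only the property is shown; the surrounding test harness is omitted, and the property key is the standard consumer config name):

```java
import java.util.Properties;

// Sketch of the test fix described above: disable auto commit so a
// removed consumer never sends an offset commit that fails with
// UNKNOWN_MEMBER_ID and triggers a rejoin.
public class NoAutoCommitConfigDemo {
    static Properties consumerProps() {
        Properties props = new Properties();
        // Default is "true"; the test flips it off explicitly.
        props.setProperty("enable.auto.commit", "false");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(consumerProps().getProperty("enable.auto.commit"));
    }
}
```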
-
- September 18, 2020
-
-
Authored by Bruno Cadonna
Reviewers: John Roesler <vvcephei@apache.org>
-
Authored by Jason Gustafson
This patch fixes a couple problems with the use of the `StructRegistry`. First, it fixes registration so that it is consistently based on the typename of the struct. Previously structs were registered under the field name which meant that fields which referred to common structs resulted in multiple entries. Second, the patch fixes `SchemaGenerator` so that common structs are considered first. Reviewers: Colin P. McCabe <cmccabe@apache.org>
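The registration fix can be sketched with a plain map keyed by type name (illustrative code, not the actual `StructRegistry`):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the fix described above: key common structs by their type
// name rather than by the field name that references them, so two
// fields referring to the same struct share a single entry instead of
// producing duplicates.
public class StructRegistryDemo {
    private final Map<String, String> commonStructs = new LinkedHashMap<>();

    void register(String fieldName, String typeName, String schema) {
        // Before the fix this effectively used fieldName as the key,
        // yielding one duplicate entry per referencing field.
        commonStructs.putIfAbsent(typeName, schema);
    }

    int size() {
        return commonStructs.size();
    }

    public static void main(String[] args) {
        StructRegistryDemo registry = new StructRegistryDemo();
        registry.register("currentLeader", "LeaderIdAndEpoch", "{...}");
        registry.register("preferredLeader", "LeaderIdAndEpoch", "{...}");
        System.out.println(registry.size()); // one entry, not two
    }
}
```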
-
Authored by Justine Olshan
Reviewers: David Jacot <david.jacot@gmail.com>, Boyang Chen <boyang@confluent.io>, David Arthur <mumrah@gmail.com>
-
- September 17, 2020
-
-
Authored by David Jacot
-
Authored by Jason Gustafson
This patch changes the Fetch response schema to include both the diverging epoch and its end offset rather than just the offset. This allows for more accurate truncation on the follower. This is the schema that was originally specified in KIP-595, but we altered it during the discussion. Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
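A rough sketch of why carrying the end offset helps (illustrative logic only, not Kafka's actual follower code):

```java
// Sketch of the truncation decision enabled by the schema change
// above: given the diverging epoch's end offset from the leader and
// the follower's own log end offset, the follower can pick a precise
// truncation point instead of guessing from the offset alone.
public class DivergingEpochDemo {
    static long truncationOffset(long divergingEndOffset, long followerLogEndOffset) {
        // Keep whichever prefix both logs agree on: everything past
        // the diverging epoch's end offset (or past our own log end)
        // is not part of the leader's log at that epoch.
        return Math.min(divergingEndOffset, followerLogEndOffset);
    }

    public static void main(String[] args) {
        System.out.println(truncationOffset(120, 150)); // follower is ahead of the divergence
        System.out.println(truncationOffset(120, 100)); // follower is behind it
    }
}
```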
-
- September 16, 2020
-
-
Authored by Bruno Cadonna
Reviewers: Walker Carlson <wcarlson@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
-
Authored by Jason Gustafson
This patch bumps the `Fetch` protocol as specified by KIP-595: https://cwiki.apache.org/confluence/display/KAFKA/KIP-595%3A+A+Raft+Protocol+for+the+Metadata+Quorum. The main differences are the following:
- Truncation detection
- Leader discovery through the response
- Flexible version support

The most notable change is truncation detection. This patch adds logic in the request handling path to detect truncation, but it does not change the replica fetchers to make use of this capability. This will be done separately. Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
-
Authored by Bruno Cadonna
The test StreamsBrokerBounceTest.test_all_brokers_bounce() fails on 2.5 because in the last stage of the test there is only one broker left, and the offset commit cannot succeed because min.insync.replicas of __consumer_offsets is set to 2 and acks is set to all. This causes a timeout and extends the closing of the Kafka Streams client beyond the duration passed to the close method. This especially affects the 2.5 branch, since there Kafka Streams commits offsets for each task, i.e., close() needs to wait for the timeout for each task. In 2.6 and trunk the offset commit is done per thread, so close() only needs to wait for one timeout per stream thread. I opened this PR on trunk, since the test could also become flaky on trunk and we want to avoid diverging system tests across branches. A more complete solution would be to improve the test by defining better success criteria. Reviewers: Guozhang Wang <wangguoz@gmail.com>
-
- September 15, 2020
-
-
Authored by David Arthur
-
Authored by Ron Dagostino
Reviewers: Colin P. McCabe <cmccabe@apache.org>
-
Authored by Chia-Ping Tsai
`openjdk:8` includes `git` by default, but `openjdk:11` does not. Install `git` explicitly to make it easier to test with newer openjdk versions. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
-
Authored by David Jacot
This PR fixes two issues introduced by #9114:
- When the metric was switched from Rate to TokenBucket in the ControllerMutationQuotaManager, the metrics were mixed up. That broke the quota update path.
- When a quota is updated, the ClientQuotaManager updates the MetricConfig of the KafkaMetric. That update was not reflected in the Sensor, so the Sensor was still using the MetricConfig it was created with.
Reviewers: Anna Povzner <anna@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
-
- September 14, 2020
-
-
Authored by Luke Chen
In KIP-113, we support replica movement between log directories. But when the directory changes, we forgot to remove the topicPartition offset data from the old directory, which causes more than one checkpoint copy to remain in the logs for the altered topicPartition. It can also get the LogCleaner stuck, since it is possible to keep reading the old topicPartition offset data from the old checkpoint file. I added one more parameter, topicPartitionToBeRemoved, to the updateCheckpoints() method. So, if the update parameter is None (as before), we do the remove action to delete the topicPartitionToBeRemoved data from the dir; otherwise, we update the data as before. Reviewers: Jun Rao <junrao@gmail.com>
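The new parameter's behavior can be sketched with a simplified in-memory checkpoint map (types are simplified to String/Long; the real code works on checkpoint files per log directory):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Sketch of the updateCheckpoints() change described above: when no
// update is supplied, remove the given partition's entry from the old
// directory's checkpoint instead of leaving a stale copy behind.
public class CheckpointUpdateDemo {
    static void updateCheckpoints(Map<String, Long> checkpoint,
                                  Optional<Map.Entry<String, Long>> update,
                                  String topicPartitionToBeRemoved) {
        if (update.isPresent()) {
            // Normal path: write the new offset, as before.
            checkpoint.put(update.get().getKey(), update.get().getValue());
        } else {
            // None means the partition moved away: drop its stale entry.
            checkpoint.remove(topicPartitionToBeRemoved);
        }
    }

    public static void main(String[] args) {
        Map<String, Long> oldDirCheckpoint = new HashMap<>();
        oldDirCheckpoint.put("topic-0", 42L);
        updateCheckpoints(oldDirCheckpoint, Optional.empty(), "topic-0");
        System.out.println(oldDirCheckpoint.containsKey("topic-0"));
    }
}
```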
-
Authored by Luke Chen
Currently, the source references are all pointing to the 1.0 version of the code, which is obviously wrong. Update them to the current dotVersion. Reviewers: John Roesler <vvcephei@apache.org>
-
Authored by Ismael Juma
The final release is now out: https://junit.org/junit5/docs/5.7.0/release-notes/index.html Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
-
- September 12, 2020
-
-
Authored by khairy
Reviewers: Boyang Chen <boyang@confluent.io>, Bill Bejeck <bbejeck@apache.com>
-
Authored by khairy
Reviewers: Bill Bejeck <bbejeck@apache.org>
-
Authored by khairy
Reviewers: Bill Bejeck <bbejeck@apache.org>
-
Authored by leah
Add a backwardFetch call to the window store for sliding window processing. While the implementation works with the forward call to the window store, using backwardFetch allows for the iterator to be closed earlier, making implementation more efficient. Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, John Roesler <vvcephei@apache.org>
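The early-close benefit of backward iteration can be sketched with a `TreeMap` stand-in (this is not the actual Kafka Streams window store API):

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch of the backward-iteration benefit described above: when the
// caller only needs the latest entries in a time range, iterating
// newest-to-oldest lets it stop (and close the iterator) after the
// first hit, instead of walking the whole range forward.
public class BackwardFetchDemo {
    static Long latestAtOrBefore(TreeMap<Long, Long> store, long timeTo) {
        // Backward scan: the first entry we see is the answer.
        for (Map.Entry<Long, Long> e : store.headMap(timeTo, true).descendingMap().entrySet()) {
            return e.getValue(); // stop immediately, no further iteration
        }
        return null; // nothing at or before timeTo
    }

    public static void main(String[] args) {
        TreeMap<Long, Long> store = new TreeMap<>();
        store.put(10L, 100L);
        store.put(20L, 200L);
        store.put(30L, 300L);
        System.out.println(latestAtOrBefore(store, 25L)); // newest entry at or before t=25
    }
}
```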
-
- September 11, 2020
-
-
Authored by leah
Add necessary documentation for KIP-450, adding sliding window aggregations to KStreams Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
-
Authored by Ismael Juma
This change sets the groundwork for migrating other modules incrementally. Main changes:
- Replace `junit` 4.13 with `junit-jupiter` and `junit-vintage` 5.7.0-RC1.
- All modules except for `tools` depend on `junit-vintage`.
- `tools` depends on `junit-jupiter`.
- Convert `tools` tests to JUnit 5.
- Update `PushHttpMetricsReporterTest` to use `mockito` instead of `powermock` and `easymock` (powermock doesn't seem to work well with JUnit 5 and we don't need it since mockito can mock static methods).
- Update `mockito` to 3.5.7.
- Update `TestUtils` to use JUnit 5 assertions since `tools` depends on it.

Unrelated clean-ups:
- Remove `unit` from package names in a few `core` tests.
- Replace `try/catch/fail` with `assertThrows` in a number of places.
- Tag `CoordinatorTest` as integration test.
- Remove unnecessary type parameters when invoking methods and constructors.

Tested with IntelliJ and gradle. Verified that the following commands work as expected:
* ./gradlew tools:unitTest
* ./gradlew tools:integrationTest
* ./gradlew tools:test
* ./gradlew core:unitTest
* ./gradlew core:integrationTest
* ./gradlew clients:test

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
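One of the clean-ups above replaces the try/catch/fail idiom with `assertThrows`. A dependency-free sketch of the pattern (the helper below imitates JUnit 5's behavior; real tests would use `org.junit.jupiter.api.Assertions.assertThrows`):

```java
// Dependency-free sketch of the try/catch/fail -> assertThrows
// clean-up mentioned above. The helper imitates JUnit 5's assertThrows:
// it fails if nothing is thrown or the wrong type is thrown, and
// returns the caught exception for further checks.
public class AssertThrowsDemo {
    static <T extends Throwable> T assertThrows(Class<T> expected, Runnable code) {
        try {
            code.run();
        } catch (Throwable t) {
            if (expected.isInstance(t)) {
                return expected.cast(t);
            }
            throw new AssertionError("Unexpected exception type: " + t);
        }
        throw new AssertionError("Expected " + expected.getSimpleName() + " but nothing was thrown");
    }

    public static void main(String[] args) {
        // Old style: try { Integer.parseInt("x"); fail(); } catch (NumberFormatException e) { }
        // New style: one call, and the exception is available for inspection.
        NumberFormatException e =
            assertThrows(NumberFormatException.class, () -> Integer.parseInt("x"));
        System.out.println(e.getClass().getSimpleName());
    }
}
```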
-
Authored by Guozhang Wang
1. Split the consumer coordinator's REBALANCING state into PREPARING_REBALANCE and COMPLETING_REBALANCE. The first is when the join group request is sent, and the second is after the join group response is received. During the first state we should still not send heartbeats, since they share the same socket with the join group request and the group coordinator has disabled the timeout; however, when we transition to the second state we should start sending heartbeats in case the leader's assignment takes a long time. This also fixes KAFKA-10122.
2. When deciding coordinator#timeToNextPoll, do not count in timeToNextHeartbeat if the state is UNJOINED or PREPARING_REBALANCE, since heartbeats are disabled then and hence their timer would not be updated.
3. On the broker side, allow heartbeats received during PREPARING_REBALANCE and return the NONE error code instead of REBALANCE_IN_PROGRESS. However, on the client side, we still need to ignore REBALANCE_IN_PROGRESS if the state is COMPLETING_REBALANCE, in case it is talking to an old-versioned broker.
4. Piggy-back a log4j improvement on the broker coordinator for the rebalance-triggering reason, as I found it a bit blurred during the investigation. Also subsumes #9038 with log4j improvements.

The tricky part of allowing heartbeats during COMPLETING_REBALANCE is twofold: 1) before the sync-group response is received, a heartbeat response may have reset the generation; also, after the sync-group response but before the callback is triggered, a heartbeat response can still reset the generation, so we need to handle both cases by checking the generation / state. 2) with the heartbeat thread enabled, the sync-group request may be sent by the heartbeat thread even if the caller thread has not called poll yet. Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Boyang Chen <boyang@confluent.io>, John Roesler <john@confluent.io>
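The heartbeat rule in points 1 and 2 can be sketched as a simple predicate over the coordinator states (the enum is illustrative shorthand for the client-side states described above, not the actual AbstractCoordinator code):

```java
// Sketch of the heartbeat rule described above: heartbeats are
// disabled while UNJOINED or PREPARING_REBALANCE (the JoinGroup
// request is in flight on the same socket) and enabled once the
// coordinator reaches COMPLETING_REBALANCE or STABLE, so a slow
// leader assignment does not get the member kicked out.
public class CoordinatorStateDemo {
    enum State { UNJOINED, PREPARING_REBALANCE, COMPLETING_REBALANCE, STABLE }

    static boolean shouldSendHeartbeat(State state) {
        return state == State.COMPLETING_REBALANCE || state == State.STABLE;
    }

    public static void main(String[] args) {
        for (State s : State.values()) {
            System.out.println(s + " -> heartbeat=" + shouldSendHeartbeat(s));
        }
    }
}
```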
-
Authored by John Roesler
Add debug logs to see when Streams calls poll, process, commit, etc. Reviewers: Walker Carlson <wcarlson@confluent.io>, Guozhang Wang <guozhang@apache.org>
-
Authored by David Jacot
Fixes flakiness in `KafkaAdminClientTest` as a result of #8864. Addresses the following flaky tests: - testAlterReplicaLogDirsPartialFailure - testDescribeLogDirsPartialFailure - testMetadataRetries Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jason Gustafson <jason@confluent.io>
-
Authored by Jason Gustafson
The schema specification allows a struct type name to differ from the field name. This works with the generated `Message` classes, but not with the generated JSON converter. The patch fixes the problem, which is that the type name is getting replaced with the field name when the struct is registered in the `StructRegistry`. Reviewers: Colin P. McCabe <cmccabe@apache.org>
-