提交 · v6.1.0-beta200724211454 · Archie Kelly / kafka

7月 25, 2020
- Set Confluent to 6.1.0-beta200724211454, Kafka to 6.1.0-beta200724211454. · d63f0d19
  由 Confluent Jenkins Bot 创作于 4年前
  
  v6.1.0-beta200724211454
  
  d63f0d19
7月 22, 2020
- Merge branch 'master' of https://github.com/confluentinc/kafka into trunk · cd74e9f5
  由 Confluent Jenkins Bot 创作于 4年前
  
  cd74e9f5
7月 21, 2020

MINOR; Move quota integration tests to using the new quota API. (#8954) · b63fecb6
由 David Jacot 创作于 4年前
```
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
```
b63fecb6

KAFKA-9432:(follow-up) Set `configKeys` to null in `describeConfigs()` to make... · c38825ab

由 Manikumar Reddy 创作于 4年前

KAFKA-9432:(follow-up) Set `configKeys` to null in `describeConfigs()` to make it backward compatible with older Kafka versions.

- After #8312, older brokers are returning empty configs,  with latest `adminClient.describeConfigs`.  Old brokers  are receiving empty configNames in `AdminManageer.describeConfigs()` method. Older brokers does not handle empty configKeys. Due to this old brokers are filtering all the configs.
- Update ClientCompatibilityTest to verify describe configs
- Add test case to test describe configs with empty configuration Keys

Author: Manikumar Reddy <manikumar.reddy@gmail.com>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #9046 from omkreddy/KAFKA-9432

c38825ab

KAFKA-10279; Allow dynamic update of certificates with additional SubjectAltNames (#9044) · 6162a153
由 Rajini Sivaram 创作于 4年前
```
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
```
6162a153

KAFKA-10189: reset event queue time histogram when queue is empty (#8935) · f77e250b

由 Jeff Kim 创作于 4年前

add a timeout for event queue time histogram;
reset eventQueueTimeHist when the controller event queue is empty;

Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>, Manikumar Reddy <manikumar.reddy@gmail.com>, Igor Soarez <i@soarez.me>, Jun Rao <junrao@gmail.com>

f77e250b

KAFKA-9161: add docs for KIP-441 and KIP-613 and other configs that need fixing (#9027) · 9b30276e

由 A. Sophie Blee-Goldman 创作于 4年前

Add docs for KIP-441 and KIP-613.
Fixed some miscellaneous unrelated issues in the docs:
* Adds some missing configs to the Streams config docs: max.task.idle.ms,topology.optimization, default.windowed.key.serde.inner.class, and default.windowed.value.serde.inner.class
* Defines the previously-undefined default windowed serde class configs, including choosing a default (null) and giving them a doc string, so the yshould nwo show up in the auto-generated general Kafka config docs
* Adds a note to warn users about the rocksDB bug that prevents setting a strict capacity limit and counting write buffer memory against the block cache

Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <vvcephei@apache.org>

9b30276e

Merge branch 'trunk' of github.com:apache/kafka into trunk-sync-20-july · bcb9f6a1
由 David Mao 创作于 4年前

bcb9f6a1

7月 20, 2020

KAFKA-10295: Wait for connector recovery in test_bounce (#9043) · f4944ee4
由 Greg Harris 创作于 4年前
```
Signed-off-by: Greg Harris <gregh@confluent.io>
```
f4944ee4

KAFKA-10286: Connect system tests should wait for workers to join group (#9040) · 5a2a7c63

由 Greg Harris 创作于 4年前

Currently, the system tests `connect_distributed_test` and `connect_rest_test` only wait for the REST api to come up.
The startup of the worker includes an asynchronous process for joining the worker group and syncing with other workers.
There are some situations in which this sync takes an unusually long time, and the test continues without all workers up.
This leads to flakey test failures, as worker joins are not given sufficient time to timeout and retry without waiting explicitly.

This changes the `ConnectDistributedTest` to wait for the Joined group message to be printed to the logs before continuing with tests. I've activated this behavior by default, as it's a superset of the checks that were performed by default before.

This log message is present in every version of DistributedHerder that I could find, in slightly different forms, but always with `Joined group` at the beginning of the log message. This change should be safe to backport to any branch.

Signed-off-by: Greg Harris <gregh@confluent.io>
Author: Greg Harris <gregh@confluent.io>
Reviewer: Randall Hauch <rhauch@gmail.com>

5a2a7c63

MINOR: Enable broker/client compatibility tests for 2.5.0 release · b02fa534

由 Manikumar Reddy 创作于 4年前

- Add missing broker/client compatibility tests for 2.5.0 release

Author: Manikumar Reddy <manikumar.reddy@gmail.com>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #9041 from omkreddy/compat

b02fa534

7月 19, 2020
- MINOR: Fixed some resource leaks. (#8922) · cd6850b4
  由 Leonard Ge 创作于 4年前
  
  Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
  cd6850b4
7月 18, 2020

MINOR: Improved code quality for various files. (#9037) · b988de28
由 Leonard Ge 创作于 4年前
```
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
```
b988de28

MINOR: log4j migration to confluent repackaged version (#362) · e5d9b925

由 Nitesh Mor 创作于 4年前

Context: log4j v1 has reached end of life many years ago, and is affected by CVE-2019-17571
Confluent repackaged version of log4j fixes the security vulnerabilities.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Jeff Kim <jeff.kim@confluent.io>

e5d9b925

ST-3402: Fix to get artifactory settings for preview release builds (#364) · beb3fd9e
由 elismaga 创作于 4年前

beb3fd9e

KAFKA-10223; Use NOT_LEADER_OR_FOLLOWER instead of non-retriable... · 9c8f75c4

由 Rajini Sivaram 创作于 4年前

KAFKA-10223; Use NOT_LEADER_OR_FOLLOWER instead of non-retriable REPLICA_NOT_AVAILABLE for consumers (#8979)

Brokers currently return NOT_LEADER_FOR_PARTITION to producers and REPLICA_NOT_AVAILABLE to consumers if a replica is not available on the broker during reassignments. Non-Java clients treat REPLICA_NOT_AVAILABLE as a non-retriable exception, Java consumers handle this error by explicitly matching the error code even though it is not an InvalidMetadataException. This PR renames NOT_LEADER_FOR_PARTITION to NOT_LEADER_OR_FOLLOWER and uses the same error for producers and consumers. This is compatible with both Java and non-Java clients since all clients handle this error code (6) as retriable exception. The PR also makes ReplicaNotAvailableException a subclass of InvalidMetadataException.
    - ALTER_REPLICA_LOG_DIRS continues to return REPLICA_NOT_AVAILABLE. Retained this for compatibility since this request never returned NOT_LEADER_FOR_PARTITION earlier. 
   -  MetadataRequest version 0 also returns REPLICA_NOT_AVAILABLE as topic-level error code for compatibility. Newer versions filter these out and return Errors.NONE, so didn't change this.
   - Partition responses in MetadataRequest return REPLICA_NOT_AVAILABLE to indicate that one of the replicas is not available. Did not change this since NOT_LEADER_FOR_PARTITION is not suitable in this case.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>, Bob Barrett <bob.barrett@confluent.io>

9c8f75c4

MINOR: Improve log4j for per-consumer assignment (#8997) · 715df0d2

由 Guozhang Wang 创作于 4年前

Add log4j entry summarizing the assignment (previous owned and assigned) at the consumer level.

Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Boyang Chen <boyang@confluent.io>

715df0d2

MINOR: Fix flaky system test assertion after static member fencing (#9033) · 6d2c7802

由 Jason Gustafson 创作于 4年前

The test case `OffsetValidationTest.test_fencing_static_consumer` fails periodically due to this error:
```
Traceback (most recent call last):
  File "/home/jenkins/workspace/system-test-kafka_2.6/kafka/venv/lib/python2.7/site-packages/ducktape-0.7.8-py2.7.egg/ducktape/tests/runner_client.py", line 134, in run
    data = self.run_test()
  File "/home/jenkins/workspace/system-test-kafka_2.6/kafka/venv/lib/python2.7/site-packages/ducktape-0.7.8-py2.7.egg/ducktape/tests/runner_client.py", line 192, in run_test
    return self.test_context.function(self.test)
  File "/home/jenkins/workspace/system-test-kafka_2.6/kafka/venv/lib/python2.7/site-packages/ducktape-0.7.8-py2.7.egg/ducktape/mark/_mark.py", line 429, in wrapper
    return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
  File "/home/jenkins/workspace/system-test-kafka_2.6/kafka/tests/kafkatest/tests/client/consumer_test.py", line 257, in test_fencing_static_consumer
    assert len(consumer.dead_nodes()) == num_conflict_consumers
AssertionError
```
When a consumer stops, there is some latency between when the shutdown is observed by the service and when the node is added to the dead nodes. This patch fixes the problem by giving some time for the assertion to be satisfied.

Reviewers: Boyang Chen <boyang@confluent.io>

6d2c7802

7月 17, 2020
- MINOR: Filter out quota configs for ConfigCommand using --bootstrap-server (#9030) · 99c64822
  由 Rajini Sivaram 创作于 4年前
  
  Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, David Jacot <djacot@confluent.io>, Ron Dagostino <rdagostino@confluent.io>
  99c64822
- Merge branch 'apache/kafka/trunk' into 'master'. · 11baf82d
  由 Brian Byrne 创作于 4年前
  
  11baf82d
- KAFKA-10174: Prefer --bootstrap-server for configs command in ducker tests (#8948) · 796fae25
  由 vinoth chandar 创作于 4年前
  
  Reviewers: Colin P. McCabe <cmccabe@apache.org>
  796fae25
7月 16, 2020

KAFKA-9666; Don't increase producer epoch when trying to fence if the log append fails (#8239) · 2eeae09c

由 Bob Barrett 创作于 4年前

When fencing producers, we currently blindly bump the epoch by 1 and write an abort marker to the transaction log. If the log is unavailable (for example, because the number of in-sync replicas is less than min.in.sync.replicas), we will roll back the attempted write of the abort marker, but still increment the epoch in the transaction metadata cache. During periods of prolonged log unavailability, producer retires of InitProducerId calls can cause the epoch to be increased to the point of exhaustion, at which point further InitProducerId calls fail because the producer can no longer be fenced. With this patch, we track whenever we have failed to write the bumped epoch, and when that has happened, we don't bump the epoch any further when attempting to fence. This is safe because the in-memory epoch is still causes old producers to be fenced.

Reviewers: Guozhang Wang <wangguoz@gmail.com>, Boyang Chen <boyang@confluent.io>, Jason Gustafson <jason@confluent.io>

2eeae09c

KAFKA-10257 system test kafkatest.tests.core.security_rolling_upgrade_test fails (#9021) · 598a0d16

由 Chia-Ping Tsai 创作于 4年前

security_rolling_upgrade_test may change the security listener and then restart Kafka servers. has_sasl and has_ssl get out-of-date due to cached _security_config. This PR offers a simple fix that we always check the changes of port mapping and then update the sasl/ssl flag.

Reviewers:  Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>

598a0d16

MINOR: update required MacOS version (#9025) · 934033bd

由 Matthias J. Sax 创作于 4年前

Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Boyang Chen <boyang@confluent.io>, John Roesler <john@confluent.io>

934033bd

MINOR: Add ApiMessageTypeGenerator (#9002) · e5335bcb

由 Colin Patrick McCabe 创作于 4年前

Previously, we had some code hard-coded to generate message type classes
for RPCs.  We might want to generate message type classes for other
things as well, so make it more generic.

Reviewers: Boyang Chen <boyang@confluent.io>

e5335bcb

7月 15, 2020

ST-3402: Refactored Jenkinsfile to get secrets from Vault instead of Jenkins... · 0cc0279d
由 elismaga 创作于 4年前
```
ST-3402: Refactored Jenkinsfile to get secrets from Vault instead of Jenkins credential store (#361)
```
0cc0279d

MINOR: Remove call to Exit.exit() to prevent infinite recursion in Connect... · f699bd98

由 Arjun Satish 创作于 4年前

MINOR: Remove call to Exit.exit() to prevent infinite recursion in Connect integration tests (#9015)

If we call org.apache.kafka.common.utils.Exit#exit(int code) with code=0, the current implementation will go into an infinite recursion and kill the VM with a stack overflow error. This happens only in integration tests because of the overrides of shutdown procedures and this commit addresses this issue by removing the redundant call to Exit#exit.

Reviewers: Lucas Bradstreet <lucas@confluent.io>, Konstantine Karantasis <k.karantasis@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>

f699bd98

7月 14, 2020

KAFKA-10192: Wait for REST API to become available before testing blocked connectors (#8928) · cf69bdc7

由 Chris Egerton 创作于 4年前

The `testBlockInConnectorStop` test is failing semi-frequently on Jenkins. It's difficult to verify the cause without complete logs and I'm unable to reproduce locally, but I suspect the cause may be that the Connect worker hasn't completed startup yet by the time the test begins and so the initial REST request to create a connector times out with a 500 error. This isn't an issue for normal tests but we artificially reduce the REST request timeout for these tests as some requests are meant to exhaust that timeout.

The changes here use a small hack to verify that the worker has started and is ready to handle all types of REST requests before tests start by querying the REST API for a non-existent connector.

Reviewers: Boyang Chan <boyang@confluent.io>, Konstantine Karantasis <k.karantasis@gmail.com>

cf69bdc7

KAFKA-10240: Suppress WakeupExceptions during sink task shutdown (#9003) · 4cd6dfc8

由 Chris Egerton 创作于 4年前

A benign `WakeupException` can be thrown by a sink task's consumer if the task is scheduled for shutdown by the worker. This is caught and handled gracefully if the exception is thrown when calling `poll` on the consumer, but not if calling `commitSync`, which is invoked by a task during shutdown and also when its partition assignment is updated.

If thrown during a partition assignment update, the `WakeupException` is caught and handled gracefully as part of the task's `iteration` loop. If thrown during shutdown, however, it is not caught and instead leads to the misleading log message "Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted.".

These changes catch the `WakeupException` during shutdown and handle it gracefully with a `TRACE`-level log message.

A unit test is added to verify this behavior by simulating a thrown `WakeupException` during `Consumer::commitSync`, running through the `WorkerSinkTask::execute` method, and confirming that it does not throw a `WakeupException` itself.

Reviewers: Greg Harris <gregh@confluent.io>, Nigel Liang <nigel@nigelliang.com>, Konstantine Karantasis <k.karantasis@gmail.com>

4cd6dfc8

KAFKA-10002; Improve performances of StopReplicaRequest with large number of... · 43d43e6c

由 David Jacot 创作于 4年前

KAFKA-10002; Improve performances of StopReplicaRequest with large number of partitions to be deleted (#8672)

Update checkpoint files once for all deleted partitions instead of updating them for each deleted partitions. With this, a stop replica requests with 2000 partitions to be deleted takes ~2 secs instead of ~40 secs previously.

Refactor the checkpointing methods to not compute the logsByDir all the time. It is now reused as much as possible.

Refactor the exception handling. Some checkpointing methods were handling IOException but the underlying write process already catches them and throws KafkaStorageException instead.

Reduce the logging in the log cleaner manager. It does not log anymore when a partition is deleted as it is not a useful information.

Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>

43d43e6c

KAFKA-10044 Deprecate ConsumerConfig#addDeserializerToConfig and Prod… (#9013) · ffdec02e

由 Chia-Ping Tsai 创作于 4年前

deprecate ConsumerConfig#addDeserializerToConfig and ProducerConfig#addSerializerToConfig.

Create internal use cases instead: appendDeserializerToConfig and appendSerializerToConfig

Reviewers: Boyang Chen <boyang@confluent.io>

ffdec02e

KAFKA-6453: Document how timestamps are computed for aggregations and joins (#9009) · d945b287
由 Jim Galasyn 创作于 4年前
```
Reviewer: Matthias J. Sax <matthias@confluent.io>
```
d945b287

7月 13, 2020

MINOR; Provide static constants with lowest and highest supported versions... · c0bcabcc

由 David Jacot 创作于 4年前

MINOR; Provide static constants with lowest and highest supported versions alongside with the schema (#8916)

Reviewers: Colin P. McCabe <cmccabe@apache.org>

c0bcabcc

MINOR; KafkaAdminClient#describeLogDirs should not fail all the futures when... · f40aa33d

由 David Jacot 创作于 4年前

MINOR; KafkaAdminClient#describeLogDirs should not fail all the futures when only one call fails (#8998)

Reviewers: Colin P. McCabe <cmccabe@apache.org>

f40aa33d

7月 12, 2020
- KAFKA-10247: Correctly reset state when task is corrupted (#8994) · cec5f377
  由 John Roesler 创作于 4年前
  
  Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Matthias J. Sax <matthias@confluent.io>
  cec5f377
- KAFKA-10262: Ensure that creating task directory is thread safe (#9010) · f2db8d53
  由 Matthias J. Sax 创作于 4年前
  
  Reviewers: A. Sophie Blee-Goldman <sohpie@confluent.io>, John Roesler <john@confluent.io>
  f2db8d53
7月 11, 2020
- KAFKA-10263: Do not assign standby for revoking stateless tasks (#9005) · f209b3c5
  由 Guozhang Wang 创作于 4年前
  
  Also piggy-back a small fix to use TreeMap other than HashMap to preserve iteration ordering. Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, John Roesler <vvcephei@apache.org>
  f209b3c5
- Merge remote-tracking branch 'ak/trunk' into ccs_merge_from_ak_trunk_2020_07_10 · 69d04e3b
  由 Steve Rodrigues 创作于 4年前
  
  69d04e3b
7月 10, 2020

KAFKA-10249: don't try to read un-checkpointed offsets of in-memory stores (#8996) · 888f83c2

由 A. Sophie Blee-Goldman 创作于 4年前

Fixes an asymmetry in which we avoid writing checkpoints for non-persistent stores, but still expected to read them, resulting in a spurious TaskCorruptedException.

Reviewers: Matthias J. Sax <mjsax@apache.org>, John Roesler <vvcephei@apache.org>

888f83c2

MINOR: Restore stream-table duality description (#8995) · 7e668482
由 Jim Galasyn 创作于 4年前
```
Reviewers: Matthias J. Sax <matthias@confluent.io>
```
7e668482