Commits · v0.4415.0-7.4.0-0-ce · Archie Kelly / kafka

Dec 20, 2022

chore: minor version bump v0.4415.0-7.4.0-0-ce [ci skip] · 10605ddd
ConfluentSemaphore authored 2 years ago

v0.4415.0-7.4.0-0-ce

10605ddd

cherry-pick: KAFKA-14379: Consumer should refresh preferred read replica on update metadata (#8324) · 9cefc619

Artem Livshits authored 2 years ago

This cherry pick contains 2 commits:

- https://github.com/apache/kafka/commit/7d1e37ea7eed97a69f82c4013256ca81d035d2ec
- https://github.com/apache/kafka/commit/43f39c2e602bc718609ba34ad15087810f76d382

There were some conflicts in unit tests because Kafka doesn't have
observers.

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)

### Merge requirements

**Branch protections have been put into place which will prevent PRs
from being merged to master when the build is failing.**

Please review the build in jenkins. If your own change is to blame,
please fix as required.
Be careful to not simply retry the build if you suspect your change has
made tests flakier or added flaky tests.

If you suspect another change is the cause of failures or you're unclear
about the cause, please read

https://confluentinc.atlassian.net/wiki/spaces/KAFKA/pages/2719875296/ce-kafka+build+stability+unblocking+PR+merges.

9cefc619

chore: minor version bump v0.4414.0-7.4.0-0-ce [ci skip] · 344e7754
ConfluentSemaphore authored 2 years ago

v0.4414.0-7.4.0-0-ce

344e7754

MINOR: remove stray log reference on stop replica (#8320) · 7c8cdf0a

Yang Yu authored 2 years ago

When handling a stop replica request, we should also remove the stray
log reference, since the underlying log dir will be deleted.

7c8cdf0a

Dec 19, 2022
- chore: minor version bump v0.4413.0-7.4.0-0-ce [ci skip] · bc7561e9
  ConfluentSemaphore authored 2 years ago
  
  v0.4413.0-7.4.0-0-ce
  
  bc7561e9
- KAFKA-14367; Add `DeleteGroups` to the new `GroupCoordinator` interface (#12858) · 3400ecff
  David Jacot authored 2 years ago
  
  This patch adds `deleteGroups` to the new `GroupCoordinator` interface and updates `KafkaApis` to use it. Reviewers: Omnia G H Ibrahim <o.g.h.ibrahim@gmail.com>, Justine Olshan <jolshan@confluent.io>, Jason Gustafson <jason@confluent.io> (cherry picked from commit f8556fe7)
  3400ecff
- KAFKA-14367; Add `DescribeGroups` to the new `GroupCoordinator` interface (#12855) · f36131d5
  David Jacot authored 2 years ago
  
  This patch adds `describeGroups` to the new `GroupCoordinator` interface and updates `KafkaApis` to use it. Reviewers: Justine Olshan <jolshan@confluent.io>, Jason Gustafson <jason@confluent.io> (cherry picked from commit 4a9c0fa4)
  f36131d5
- MINOR: Small refactor in KafkaApis.handleHeartbeatRequest (#12978) · 0c3daef6
  David Jacot authored 2 years ago
  
  This is a small follow-up to https://github.com/apache/kafka/pull/12848. Reviewers: Justine Olshan <jolshan@confluent.io>, Jason Gustafson <jason@confluent.io> (cherry picked from commit 2935a520)
  0c3daef6
- MINOR: Small refactor in DescribeGroupsResponse (#12970) · 56cf744a
  David Jacot authored 2 years ago
  
  This patch does a few cleanups:
  * It removes `DescribeGroupsResponse.fromError` and pushes its logic to `DescribeGroupsRequest.getErrorResponse` to be consistent with how we implemented the other requests/responses.
  * It renames `DescribedGroup.forError` to `DescribedGroup.groupError`.

  The patch relies on existing tests. Reviewers: Mickael Maison <mickael.maison@gmail.com> (cherry picked from commit f9a09fdd)
  56cf744a
- chore: minor version bump v0.4412.0-7.4.0-0-ce [ci skip] · 2e87ff03
  ConfluentSemaphore authored 2 years ago
  
  v0.4412.0-7.4.0-0-ce
  
  2e87ff03
- Merge branch 'master' of github.com:confluentinc/ce-kafka into sync-upstream-30-nov-2022 · c31a2a7a
  Rittika Adhikari authored 2 years ago
  
  c31a2a7a
- chore: minor version bump v0.4411.0-7.4.0-0-ce [ci skip] · b693289b
  ConfluentSemaphore authored 2 years ago
  
  v0.4411.0-7.4.0-0-ce
  
  b693289b
- chore: delete project_onprem.yml · faed99dd
  Confluent Jenkins Bot authored 2 years ago
  
  faed99dd
- Merge remote-tracking branch 'origin' into sync-upstream-30-nov-2022 · 34a4cf05
  Rittika Adhikari authored 2 years ago
  
  34a4cf05
Dec 18, 2022
- chore: minor version bump v0.4410.0-7.4.0-0-ce [ci skip] · c7495a16
  ConfluentSemaphore authored 2 years ago
  
  v0.4410.0-7.4.0-0-ce
  
  c7495a16
- [KENGINE-304] Avoid choosing degraded replica as preferred read replica (#8263) · 1212ce0f
  Calvin Liu authored 2 years ago
  
  When the replica manager decides a preferred replica, we should avoid choosing a degraded replica. https://confluentinc.atlassian.net/browse/KENGINE-304
  1212ce0f
- chore: minor version bump v0.4409.0-7.4.0-0-ce [ci skip] · ab7ad06a
  ConfluentSemaphore authored 2 years ago
  
  v0.4409.0-7.4.0-0-ce
  
  ab7ad06a
- chore: delete project_onprem.yml · e7362cbc
  Confluent Jenkins Bot authored 2 years ago
  
  e7362cbc
Dec 17, 2022

chore: minor version bump v0.4408.0-7.4.0-0-ce [ci skip] · 3071d37d
ConfluentSemaphore authored 2 years ago

v0.4408.0-7.4.0-0-ce

3071d37d

KCFUN-602: Implement the basic producer id throttle mechanism (#8109) · 693caf1f

Anastasia Vela authored 2 years ago

This PR adds the implementation for producer id throttling. The feature
is gated by a feature flag, which defaults to disabled.
Throttling is done in the ProduceRequest pipeline before appending to
the log. There is a check to ensure the producer id count is below the
throttling threshold. If it is below the threshold, we append to the log
as normal. If it is above the threshold, we send the error response
REQUEST_TIME_OUT back to the client and throttle the client for some time
period by muting the channel. Once the throttle time has elapsed, the
messages are processed again, checking that the quota has not been
exceeded.
Recording to the count/rate metrics is done in the ProducerStateManager
so that expired producer ids are tracked as well: as soon as
producer ids expire, the count decrements, and when a producer id is
inserted into the map, the count increments. This means the
ProducerIdQuotaManager needed to be ported into the
ProducerStateManager, which is done in this PR.
This PR also adds the following configs:
- `confluent.producer.id.throttle.enable` to gate whether we want
producer id throttling to occur. Defaults to not enabled.
- `confluent.producer.id.quota.manager.enable` to gate whether the quota
manager will be initialized or not. Defaults to not enabled.
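
The check-then-throttle flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class, the method names, and the `maxProducerIds` and `throttleMs` parameters are hypothetical, and treating a count equal to the threshold as "over" is an assumption, since the description does not say what happens at the boundary.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from ce-kafka.
class ProducerIdThrottleSketch {
    enum Outcome { APPENDED, THROTTLED }

    private final boolean throttleEnabled; // stands in for confluent.producer.id.throttle.enable
    private final int maxProducerIds;      // assumed throttling threshold
    private final long throttleMs;         // assumed mute duration for the channel
    private final Map<Long, Long> lastSeenMs = new HashMap<>(); // producer id -> last append time
    private long mutedUntilMs = 0L;

    ProducerIdThrottleSketch(boolean throttleEnabled, int maxProducerIds, long throttleMs) {
        this.throttleEnabled = throttleEnabled;
        this.maxProducerIds = maxProducerIds;
        this.throttleMs = throttleMs;
    }

    // Number of producer ids currently tracked (the "count" metric in the description).
    int activeProducerIds() {
        return lastSeenMs.size();
    }

    // Dropping expired ids decrements the count immediately.
    void expire(long nowMs, long expirationMs) {
        lastSeenMs.values().removeIf(ts -> nowMs - ts >= expirationMs);
    }

    // Check the count before appending; throttle instead of appending when over the threshold.
    Outcome handleProduce(long producerId, long nowMs) {
        if (throttleEnabled) {
            if (nowMs < mutedUntilMs) {
                return Outcome.THROTTLED; // channel is still muted
            }
            if (lastSeenMs.size() >= maxProducerIds) { // boundary handling is an assumption
                mutedUntilMs = nowMs + throttleMs;
                return Outcome.THROTTLED;
            }
        }
        lastSeenMs.put(producerId, nowMs); // inserting into the map increments the count
        return Outcome.APPENDED;
    }
}
```

With the flag disabled, `handleProduce` always appends, which mirrors the default described above.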

[JIRA](https://confluentinc.atlassian.net/browse/KCFUN-602)

The following tests were added:
- MultiTenantQuotaIntegrationTest.java - integration tests to: 1) test
the pipeline updates the metric accordingly and throttles as expected,
and 2) test that the rate quota is dynamic
- DynamicBrokerConfigTest - test that the producer id quota rate is
dynamic
- KafkaProducerTest.java - test client's behavior when encountering the
REQUEST_TIME_OUT exception

693caf1f

chore: minor version bump v0.4407.0-7.4.0-0-ce [ci skip] · c545db8c
ConfluentSemaphore authored 2 years ago

v0.4407.0-7.4.0-0-ce

c545db8c
chore: delete project_onprem.yml · 1749e779
Confluent Jenkins Bot authored 2 years ago

1749e779
chore: minor version bump v0.4406.0-7.4.0-0-ce [ci skip] · 8ac7fbf4
ConfluentSemaphore authored 2 years ago

v0.4406.0-7.4.0-0-ce

8ac7fbf4

[KCFUN-689] Validate Server certs in ConfluentTrustManager (#8162) · 7e7dac66

Zhongyin Zhang authored 2 years ago

### About
The ConfluentTrustManager currently only supports client cert validation.
In order to support inter-broker SSL, we need to add the ability for it
to validate server certs, and make the behavior configurable so that it
can be used as needed depending on the use case.

### Major changes in this PR
Add server verification in ConfluentTrustManager
Make Confluent Host Suffix configurable

The [engineering one-pager](https://confluentinc.atlassian.net/wiki/spaces/K/pages/2936971376/Using+a+Custom+TrustManager+for+Internal+TLS)
describes this change in detail.

[Jira](https://confluentinc.atlassian.net/browse/KCFUN-689)

### Testing

- The Confluent domain suffix is configurable
- The Host name validation is disabled under client mode
- Validate the inter broker ssl handshake

7e7dac66

chore: minor version bump v0.4405.0-7.4.0-0-ce [ci skip] · 18830b2f
ConfluentSemaphore authored 2 years ago

v0.4405.0-7.4.0-0-ce

18830b2f
KGLOBAL-2442: Disallow cluster link deletion when mirror topics are in... · 0ac040dd
Sanjana Kaundinya authored 2 years ago
```
KGLOBAL-2442: Disallow cluster link deletion when mirror topics are in PENDING_STOPPED state (#8264)
```
0ac040dd
chore: minor version bump v0.4404.0-7.4.0-0-ce [ci skip] · d6435564
ConfluentSemaphore authored 2 years ago

v0.4404.0-7.4.0-0-ce

d6435564

KSTORAGE-2577: Dump headers of tier state files using DumpTierPartitionState tool (#8295) · 8e1c4736

Kowshik Prakasam authored 2 years ago

Modified the `DumpTierPartitionState` tool so that it can print (to standard
out) the headers (in JSON format) of all checkpointed tier state files
under a provided root log directory.

**Test:**
Built the tool and ran it to test 2 cases:

1. Ran the tool against all tier state files from one of the Kafka
brokers in tier soak. The tool worked fine, and the `jq` command was able to
prettify its output, meaning that the JSON was valid. Also tried
introducing a few errors, and the tool still worked fine, printing the
errors correctly to stderr while still printing the JSON output for the
remaining valid partitions.

2. As a regression test, ran the tool against a single user partition's
log directory. The tool behaved just as it did prior to this PR and
printed all contents of the tier state file.

8e1c4736

KGLOBAL-1852: Reject cluster link request on Confluent Cloud unauthen… (#7747) · bbc789a1

chern authored 2 years ago

…ticated listener

Customers can set the cluster link bootstrap server to an unauthenticated
listener of the source cluster. There is network connectivity if the
source and destination clusters are in the same network. This is bad
because, through a cluster link, customers can access another cluster
without authentication on Confluent Cloud. To mitigate this, we disallow
any bootstrap server that has a localhost or site-local address plus one of
a list of unauthenticated ports. The downside is that we have to keep
updating the IP address ranges and unauthenticated ports used for Confluent Cloud.
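
As an illustration of the mitigation described above (not the permanent fix), a check of this kind could look like the following. This is a hedged sketch only: the class name, the method name, and the port values are hypothetical, and reading the rule as "internal address AND unauthenticated port" is an assumption based on the wording above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Set;

// Hypothetical sketch; not the ce-kafka implementation.
class BootstrapServerCheckSketch {
    // Placeholder ports for illustration; the real list is not given in the description.
    static final Set<Integer> UNAUTHENTICATED_PORTS = Set.of(9000, 9001);

    // Returns true when the bootstrap server should be rejected.
    static boolean isDisallowed(String host, int port) {
        InetAddress addr;
        try {
            // For literal IP addresses this only validates the format; no DNS lookup is made.
            addr = InetAddress.getByName(host);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Unresolvable host: " + host, e);
        }
        boolean internal = addr.isLoopbackAddress() || addr.isSiteLocalAddress();
        return internal && UNAUTHENTICATED_PORTS.contains(port);
    }
}
```

The sketch also shows the maintenance burden mentioned above: both the address test and the port set have to track whatever Confluent Cloud actually uses.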

To solve this problem permanently, Confluent Cloud brokers should reject
cluster linking requests on unauthenticated listener. The code is
stricter as it only allows SASL_SSL, which is the only option for cloud
currently. The change introduces ConfluentCloudBrokerInterceptor which
will be used by unauthenticated listeners on Confluent Cloud.

Once the destination cluster detects such a scenario, it fails the
cluster link.

bbc789a1

chore: minor version bump v0.4402.0-7.4.0-0-ce [ci skip] · 12a1b28e
ConfluentSemaphore authored 2 years ago

v0.4402.0-7.4.0-0-ce

12a1b28e

Dec 16, 2022

"Added describe permission to DataSteward and DataDiscovery" (#8290) · 5f5952e0
Ashish Malgawa authored 2 years ago
```
For Catalog RBAC, DS and DD need Describe permission on the topic; they
also need permission to see the lineage.
```
5f5952e0
chore: minor version bump v0.4401.0-7.4.0-0-ce [ci skip] · 218eb3e9
ConfluentSemaphore authored 2 years ago

v0.4401.0-7.4.0-0-ce

218eb3e9
chore: delete project_onprem.yml · fe9adfb3
Confluent Jenkins Bot authored 2 years ago

fe9adfb3
chore: update repo semaphore project · 7de408f1
Confluent Jenkins Bot authored 2 years ago

7de408f1
chore: minor version bump v0.4400.0-7.4.0-0-ce [ci skip] · ee4f9c44
ConfluentSemaphore authored 2 years ago

v0.4400.0-7.4.0-0-ce

ee4f9c44

[KPERF-457] Slowlog Sampling (#8128) · 3e0c47f5

yuyli authored 2 years ago

Today, we log PRODUCE/FETCH and FOLLOWER FETCH requests with latencies
slower than P99. On clusters with a very high request rate (3-4k), this
results in a significant number of logs/events per second. Instead, we
can add sampling to slow logs, which ensures that we log a fixed
number (in our case 48) of slow log requests per minute.
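
One simple way to cap slow-log output at a fixed number per minute is a fixed-window counter, sketched below. This is an assumption-laden illustration rather than the design from this PR: the class and method names are hypothetical, and it logs the first N slow requests in each minute instead of sampling uniformly, which may differ from what the wiki page describes.

```java
// Hypothetical sketch; not the ce-kafka implementation.
class SlowLogSamplerSketch {
    private final int maxPerMinute;   // e.g. 48, as mentioned in the description
    private long currentWindow = -1L; // index of the minute currently being counted
    private int loggedInWindow = 0;

    SlowLogSamplerSketch(int maxPerMinute) {
        this.maxPerMinute = maxPerMinute;
    }

    // Returns true when a request is slow AND the per-minute budget is not yet used up.
    boolean shouldLog(long latencyMs, long thresholdMs, long nowMs) {
        if (latencyMs <= thresholdMs) {
            return false; // not slower than the threshold, never logged
        }
        long window = nowMs / 60_000L;
        if (window != currentWindow) { // a new minute started: reset the budget
            currentWindow = window;
            loggedInWindow = 0;
        }
        if (loggedInWindow >= maxPerMinute) {
            return false; // budget exhausted for this minute
        }
        loggedInWindow++;
        return true;
    }
}
```

Whatever the request rate, the number of slow-log entries in any one minute window is then bounded by `maxPerMinute`.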

We also update how the slowLog threshold is calculated in this PR. More
details can be found in [this wiki
page](https://confluentinc.atlassian.net/wiki/spaces/CNKAF/pages/2890301648/Slow+log+sampling)


**Here are the manual test results from logging into the broker to update
the dynamic config**

1. update the `SLOW_LOG_THRESHOLD_OVERRIDE` (updated successfully)
<img width="1035" alt="Screen Shot 2022-12-02 at 1 11 44 PM"
src="https://user-images.githubusercontent.com/112504334/205387000-7c2d2a6d-1c01-480d-a80b-3ce02635cad9.png">
2. update the `MIN_P99_SLOW_LOG_THRESHOLD` (updated successfully)
<img width="1008" alt="Screen Shot 2022-12-02 at 1 11 58 PM"
src="https://user-images.githubusercontent.com/112504334/205387130-871cf394-2048-4868-9814-9d5ef958f397.png">


### Committer Checklist (excluded from commit message)
- [x] Verify design and implementation 
- [x] Verify test coverage and CI build status
- [x] Verify documentation (including upgrade notes)

3e0c47f5

Dec 15, 2022

chore: minor version bump v0.4399.0-7.4.0-0-ce [ci skip] · 97bb4d41
ConfluentSemaphore authored 2 years ago

v0.4399.0-7.4.0-0-ce

97bb4d41

MINOR: Cherry-pick "Increase timeout, correct error message returned for... · 38f89062

Stanislav Kozlovski authored 2 years ago

MINOR: Cherry-pick "Increase timeout, correct error message returned for addBroker test" (#7320) (#8306)

This patch cherry-picks 86b5fe7, originally committed only to the 7.3.x
branch. The commit increases the allowed timeout for the addBroker
system test and conditionally gives it a greater timeout if the test is
doing a rolling restart of every broker. At the time, inspection of
the tests showed that the given 300s timeout was
insufficient. The timeout is now bumped to 900s when the brokers are
being restarted and to 450s otherwise.
Additionally, a minor bug is fixed in the error message log, which
previously wouldn't log the last addition status seen.

Co-authored-by: Aishwarya Gune <aishwarya@confluent.io>

38f89062

chore: minor version bump v0.4398.0-7.4.0-0-ce [ci skip] · 43c7d2cd
ConfluentSemaphore authored 2 years ago

v0.4398.0-7.4.0-0-ce

43c7d2cd
KAFKALESS-1124: Size metric for SbcEventQueue (#7299) · 887e1820
Daniel Gospodinow authored 2 years ago
```
SBC event queue size metric
```
887e1820