- Dec 20, 2022
-
-
Authored by Artem Livshits
This cherry-pick contains 2 commits:
- https://github.com/apache/kafka/commit/7d1e37ea7eed97a69f82c4013256ca81d035d2ec
- https://github.com/apache/kafka/commit/43f39c2e602bc718609ba34ad15087810f76d382

There were some conflicts in unit tests because Kafka doesn't have observers.

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)

### Merge requirements
**Branch protections have been put into place which will prevent PRs from being merged to master when the build is failing.**

Please review the build in Jenkins. If your own change is to blame, please fix as required. Be careful not to simply retry the build if you suspect your change has made tests flakier or added flaky tests. If you suspect another change is the cause of failures or you're unclear about the cause, please read https://confluentinc.atlassian.net/wiki/spaces/KAFKA/pages/2719875296/ce-kafka+build+stability+unblocking+PR+merges.
-
Authored by Yang Yu
When handling a stop replica request, we should also remove the stray log reference, since the underlying log dir will be deleted.
- Dec 19, 2022
-
-
Authored by David Jacot
This patch adds `deleteGroups` to the new `GroupCoordinator` interface and updates `KafkaApis` to use it. Reviewers: Omnia G H Ibrahim <o.g.h.ibrahim@gmail.com>, Justine Olshan <jolshan@confluent.io>, Jason Gustafson <jason@confluent.io> (cherry picked from commit f8556fe7)
-
Authored by David Jacot
This patch adds `describeGroups` to the new `GroupCoordinator` interface and updates `KafkaApis` to use it. Reviewers: Justine Olshan <jolshan@confluent.io>, Jason Gustafson <jason@confluent.io> (cherry picked from commit 4a9c0fa4)
-
Authored by David Jacot
This is a small follow-up to https://github.com/apache/kafka/pull/12848. Reviewers: Justine Olshan <jolshan@confluent.io>, Jason Gustafson <jason@confluent.io> (cherry picked from commit 2935a520)
-
Authored by David Jacot
This patch does a few cleanups:
* It removes `DescribeGroupsResponse.fromError` and pushes its logic to `DescribeGroupsRequest.getErrorResponse` to be consistent with how we implemented the other requests/responses.
* It renames `DescribedGroup.forError` to `DescribedGroup.groupError`.

The patch relies on existing tests.

Reviewers: Mickael Maison <mickael.maison@gmail.com> (cherry picked from commit f9a09fdd)
-
Authored by Rittika Adhikari
-
Authored by Confluent Jenkins Bot
-
Authored by Rittika Adhikari
- Dec 18, 2022
-
-
Authored by Calvin Liu
When the replica manager selects a preferred replica, it should avoid choosing a degraded replica. https://confluentinc.atlassian.net/browse/KENGINE-304
-
Authored by Confluent Jenkins Bot
- Dec 17, 2022
-
-
Authored by Anastasia Vela
This PR adds the implementation for producer ID throttling. The feature is gated by a feature flag, which defaults to disabled. Throttling is done in the ProduceRequest pipeline before appending to the log: a check ensures that the producer ID count is below the throttling threshold. If it is below the threshold, we append to the log as usual. If it is above the threshold, we send the error response REQUEST_TIME_OUT back to the client and throttle it for some time period by muting the channel. Once the throttle time has elapsed, the messages are processed again, checking that the quota has not been exceeded.

Recording to the count/rate metrics is done in the ProducerStateManager so that expired producer IDs are tracked as well: as soon as a producer ID expires, the count decrements, and when a producer ID is inserted into the map, the count increments. This means the ProducerIdQuotaManager needed to be ported into the ProducerStateManager, which is done in this PR.

This PR also adds the following configs:
- `confluent.producer.id.throttle.enable` to gate whether producer ID throttling occurs. Defaults to disabled.
- `confluent.producer.id.quota.manager.enable` to gate whether the quota manager is initialized. Defaults to disabled.

[JIRA](https://confluentinc.atlassian.net/browse/KCFUN-602)

The following tests were added:
- MultiTenantQuotaIntegrationTest.java - integration tests to 1) test that the pipeline updates the metric accordingly and throttles as expected, and 2) test that the rate quota is dynamic
- DynamicBrokerConfigTest - test that the producer ID quota rate is dynamic
- KafkaProducerTest.java - test the client's behavior when encountering the REQUEST_TIME_OUT exception
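The threshold check described in this commit can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the broker's code: the class `ProducerIdThrottleSketch`, its fields, and its method names are hypothetical, and treating a count equal to the threshold as "over" is an assumption, since the description only distinguishes below from above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a producer-ID count check; not the actual broker code. */
final class ProducerIdThrottleSketch {
    private final boolean throttleEnabled; // stands in for confluent.producer.id.throttle.enable
    private final int maxProducerIds;      // assumed throttling threshold
    private final long throttleMs;         // assumed channel mute duration
    // producerId -> timestamp of the last append, so idle IDs can expire
    private final Map<Long, Long> lastSeenMs = new HashMap<>();

    ProducerIdThrottleSketch(boolean throttleEnabled, int maxProducerIds, long throttleMs) {
        this.throttleEnabled = throttleEnabled;
        this.maxProducerIds = maxProducerIds;
        this.throttleMs = throttleMs;
    }

    /**
     * Returns 0 if the batch may be appended (and records the producer ID),
     * otherwise the number of milliseconds the caller should mute the channel.
     */
    long throttleTimeMs(long producerId, long nowMs) {
        // Assumption: a count at or above the threshold is throttled.
        if (throttleEnabled && lastSeenMs.size() >= maxProducerIds) {
            return throttleMs;
        }
        lastSeenMs.put(producerId, nowMs); // inserting into the map increments the count
        return 0L;
    }

    /** Drops producer IDs idle for longer than expirationMs, decrementing the count. */
    void expire(long nowMs, long expirationMs) {
        lastSeenMs.values().removeIf(ts -> nowMs - ts > expirationMs);
    }

    int activeProducerIds() {
        return lastSeenMs.size();
    }
}
```

A caller would send the error response and mute the channel whenever `throttleTimeMs` returns a non-zero value, then retry the check once that time has passed.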
-
Authored by Confluent Jenkins Bot
-
Authored by Zhongyin Zhang
### About
The ConfluentTrustManager currently only supports client cert validation. In order to support inter-broker SSL, we need to add the ability for it to validate server certs, and make the behavior configurable so that it can be used as needed depending on the use case.

### Major changes in this PR
- Add server verification in ConfluentTrustManager
- Make Confluent Host Suffix configurable

The [engineering one-pager](https://confluentinc.atlassian.net/wiki/spaces/K/pages/2936971376/Using+a+Custom+TrustManager+for+Internal+TLS) describes this change in detail. [Jira](https://confluentinc.atlassian.net/browse/KCFUN-689)

### Testing
- The Confluent domain suffix is configurable
- Host name validation is disabled in client mode
- Validate the inter-broker SSL handshake
-
Authored by Sanjana Kaundinya
KGLOBAL-2442: Disallow cluster link deletion when mirror topics are in PENDING_STOPPED state (#8264)
-
Authored by Kowshik Prakasam
Modified the `DumpTierPartitionState` tool so that it can print (to standard out) the headers (in JSON format) of all checkpointed tier state files under a provided root log directory.

**Test:** Built the tool and ran it to test 2 cases:
1. Ran the tool against all tier state files from one of the Kafka brokers in tier soak. The tool worked fine, and the `jq` command was able to prettify its output, meaning that the JSON was valid. Also tried introducing a few errors; the tool still worked fine, printing the errors correctly to stderr while still printing the JSON output for the remaining valid partitions.
2. As a regression test, ran the tool against a single user partition's log directory. The tool behaved just as it did prior to this PR and printed all contents of the tier state file.
-
Authored by chern
…ticated listener

Customers can set the cluster link bootstrap server to an unauthenticated listener of the source cluster. There is network connectivity if the source and destination clusters are in the same network. This is bad because, through a cluster link, customers can access another cluster without authentication on Confluent Cloud.

To mitigate this, we disallow bootstrap servers that have a localhost or site-local address plus a port from the list of unauthenticated ports. The downside is that we have to keep updating the IP address ranges and unauthenticated ports used for Confluent Cloud.

To solve this problem permanently, Confluent Cloud brokers should reject cluster linking requests on unauthenticated listeners. The code is stricter, as it only allows SASL_SSL, which is currently the only option for cloud. The change introduces ConfluentCloudBrokerInterceptor, which will be used by unauthenticated listeners on Confluent Cloud. When the destination cluster detects this scenario, it fails the cluster link.
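The mitigation described in this commit (rejecting a bootstrap server that combines a localhost or site-local address with an unauthenticated port) might look something like the sketch below. It is only an illustration: the class `BootstrapServerCheckSketch` and the port value in the usage note are hypothetical, and reading the rule as "internal address AND listed port" is an interpretation of the commit message. The `InetAddress` methods used are standard JDK calls.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Set;

/** Hypothetical sketch of the bootstrap-server check; not the actual cluster link code. */
final class BootstrapServerCheckSketch {
    private final Set<Integer> unauthenticatedPorts; // assumed to be supplied by configuration

    BootstrapServerCheckSketch(Set<Integer> unauthenticatedPorts) {
        this.unauthenticatedPorts = unauthenticatedPorts;
    }

    /** Returns true if the host resolves to a loopback or site-local address on a listed port. */
    boolean isDisallowed(String host, int port) {
        InetAddress address;
        try {
            // For an IP literal this only validates the format; a host name triggers a lookup.
            address = InetAddress.getByName(host);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Cannot resolve bootstrap host: " + host, e);
        }
        boolean internal = address.isLoopbackAddress() || address.isSiteLocalAddress();
        return internal && unauthenticatedPorts.contains(port);
    }
}
```

For example, with a hypothetical unauthenticated port 9071, `new BootstrapServerCheckSketch(Set.of(9071)).isDisallowed("10.0.0.5", 9071)` is rejected, while the same address on another port is not. As the commit notes, a list-based check like this has to be kept in sync with the address ranges and ports in use, which is why rejecting on the broker side is the more durable fix.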
- Dec 16, 2022
-
-
Authored by Ashish Malgawa
For Catalog RBAC, DS and DD need Describe permission on the topic; they also need permission to see the lineage.
-
Authored by Confluent Jenkins Bot
-
Authored by Confluent Jenkins Bot
-
Authored by yuyli
Today, we log PRODUCE/FETCH and FOLLOWER FETCH requests with latencies slower than P99. On clusters with a very high request rate (3-4k), this results in a significant number of logs/events per second. Instead, we can add sampling to slow logs, which ensures that we log a fixed number (in our case 48) of slow requests per minute. This PR also updates how the slowLog threshold is calculated. More details can be found in [this wiki page](https://confluentinc.atlassian.net/wiki/spaces/CNKAF/pages/2890301648/Slow+log+sampling).

**Here are the manual test results from logging into the broker to update the dynamic config**
1. Update `SLOW_LOG_THRESHOLD_OVERRIDE` (updated successfully) <img width="1035" alt="Screen Shot 2022-12-02 at 1 11 44 PM" src="https://user-images.githubusercontent.com/112504334/205387000-7c2d2a6d-1c01-480d-a80b-3ce02635cad9.png">
2. Update `MIN_P99_SLOW_LOG_THRESHOLD` (updated successfully) <img width="1008" alt="Screen Shot 2022-12-02 at 1 11 58 PM" src="https://user-images.githubusercontent.com/112504334/205387130-871cf394-2048-4868-9814-9d5ef958f397.png">

### Committer Checklist (excluded from commit message)
- [x] Verify design and implementation
- [x] Verify test coverage and CI build status
- [x] Verify documentation (including upgrade notes)
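One simple way to bound slow-log volume to a fixed number of entries per minute is a fixed-window cap, sketched below. The commit does not describe its sampling strategy beyond the per-minute limit, so this is a swapped-in technique for illustration only: the class `SlowLogSamplerSketch`, its method, and its parameters are hypothetical, and it logs the first N slow requests in each window rather than sampling uniformly.

```java
/** Hypothetical fixed-window cap on slow-log entries; not the sampler used in the PR. */
final class SlowLogSamplerSketch {
    private static final long WINDOW_MS = 60_000L; // one minute

    private final int maxPerWindow; // e.g. 48, the figure quoted in the commit message
    private long windowStartMs;
    private int loggedInWindow = 0;

    SlowLogSamplerSketch(int maxPerWindow, long startMs) {
        this.maxPerWindow = maxPerWindow;
        this.windowStartMs = startMs;
    }

    /** Returns true if this request is slow and the current window still has budget. */
    synchronized boolean shouldLog(long latencyMs, long thresholdMs, long nowMs) {
        if (latencyMs < thresholdMs) {
            return false; // not a slow request
        }
        if (nowMs - windowStartMs >= WINDOW_MS) {
            // A new minute has started: reset the budget.
            windowStartMs = nowMs;
            loggedInWindow = 0;
        }
        if (loggedInWindow >= maxPerWindow) {
            return false; // budget for this minute is exhausted
        }
        loggedInWindow++;
        return true;
    }
}
```

With a cap like this, the log volume stays constant regardless of request rate, at the cost of biasing toward requests that arrive early in each window.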
-
- Dec 15, 2022
-
-
Authored by Stanislav Kozlovski
MINOR: Cherry-pick "Increase timeout, correct error message returned for addBroker test" (#7320) (#8306)

This patch cherry-picks 86b5fe7, originally committed only to the 7.3.x branch. The commit increases the allowed timeout for the addBroker system test and conditionally gives it a greater timeout if the test is doing a rolling restart of every broker. At the time, inspection of the tests showed that the given 300s timeout was insufficient. The timeout is now bumped to 900s when the brokers are being restarted and 450s otherwise. Additionally, a minor bug is fixed in the error message log, which previously wouldn't log the last addition status seen.

Co-authored-by: Aishwarya Gune <aishwarya@confluent.io>
-
Authored by Daniel Gospodinow
SBC event queue size metric