- 9月 27, 2022
-
-
由 Stanislav Kozlovski 创作于
The aggregate method in RawMetricValues was a bit too complex, this refactors it into a smaller method which focuses on deciding the value/extrapolation method for a particular metric.
-
由 ssumit33 创作于
- Reducing the build pipeline timeout from 3 to 2 hours so to reduce unnecessary waiting and building up of jenkins build queue.
-
由 Feng Min 创作于
-
由 Ashish Malgawa 创作于
* Added control Plane role bindings * Added operator control Plane role SR bindings * Updated unit tests * Added Ability to see other users and service account to ResourceOwner
-
由 Feng Min 创作于
-
由 Cong Ding 创作于
We have seen partial upload in GCP because of the following events in order: 1. putSegment function is called and uploads the segment to GCS 2. The upload in event 1 throws an exception and only partial data has been uploaded 3. WriteChannel.close() is called and run in the background 4. The tier archiver state machine retries upload and calls putSegment 5. Event 4 succeeded and UploadComplete metadata is written 6. The close function in line 3 finished and overwrites the file uploaded in events 4 and 5 7. Future reads see partial segment uploaded in event 6 The fix in this function uses Storage.BlobWriteOption.doesNotExist() flag to address the above issue because event 6 will fail the doesNotExist check. In another situation, if event 6 happens before event 5, event 6 will succeed while event 5 will fail and throw an exception. Note that the success of event 6 will not trigger an UploadComplete metadata. The failure of event 5 will trigger a call of storage.delete which deletes the file created in event 6 then the state machine will retry again with the doesNotExist flag.
-
由 Colin P. McCabe 创作于
Confluent's CreateTopicPolicy tries to limit the number of partitions that can be created per tenant and also globally per cluster. Unfortunately, prior to this PR, those limits could be sidestepped when in KRaft mode by creating a batch with several operations inside of it. The limit would be applied as if each element in the batch were the only operation that was pending. So, for example, if the tenant had 5 partitions left, they could create a batch creating topics foo, bar, and baz, each with 5 partitions. This would let them exceed the limit by 10 partitions in this example. This PR fixes the bug by adding "pending" update state, which is updated as the quorum controller works its way through the batch. This pending state tracks how many partitions per tenant (and so forth) were created by operations earlier in the batch, so that they can be counted against the current operation being processed. The reason for add the concept of pending state rather than simply updating the hard state immediately is that topic and partition creations are not durable until the metadata has been committed to the log. So updating the hard state immediately could allow them to get out of sync. The pending state used here avoids this problem, since it is cleared at the beginning of each createTopics or createPartitions operation. This PR also fixes a performance regression introduced by KCFUN-386, which added some code that iterates over every partition in the cluster owned by a tenant, on each create operation. We need to avoid O(num_partitions) operations. In QuorumController#renounce, fix a divergence from AK that arose from an earlier cherry-pick. Co-authored-by: Sanjana Kaundinya <skaundinya@confluent.io> Co-authored-by: Ron Dagostino <rdagostino@confluent.io>
-
由 Sanjana Kaundinya 创作于
-
由 Feng Min 创作于
-
由 David Arthur 创作于
-
由 Niket 创作于
This commit adds KRaft monitoring related metrics to the Kafka docs (docs/ops.html). Reviewers: Jason Gustafson <jason@confluent.io>, Luke Chen <showuon@gmail.com>
-
由 Colin Patrick McCabe 创作于
Previously, BrokerRegistration#toString sould throw an exception, terminating metadata replay, because the sorted() method is used on an entry set rather than a key set. Reviewers: David Arthur <mumrah@gmail.com>
-
由 dengziming 创作于
This test was removed in #11667 since UpdateFeatures is not properly handled in KRaft mode, now we can bring it back since UpdateFeatures is properly handled after #12036. Reviewers: Luke Chen <showuon@gmail.com>
-
The boostrap.checkpoint files should include a control record batch for the SnapshotHeaderRecord at the start of the file. It should also include a control record batch for the SnapshotFooterRecord at the end of the file. The snapshot header record is important because it versions the rest of the bootstrap file. Reviewers: David Arthur <mumrah@gmail.com>
-
由 Colin Patrick McCabe 创作于
Reviewers: David Arthur <mumrah@gmail.com>
-
由 Colin Patrick McCabe 创作于
kafka-features.sh must exit with a non-zero error code on error. We must do this in order to catch regressions like KAFKA-13990. Reviewers: David Arthur <mumrah@gmail.com>
-
由 Colin Patrick McCabe 创作于
This PR adds support to kafka-features.sh for the --metadata flag, as specified in KIP-778. This flag makes it possible to upgrade to a new metadata version without consulting a table mapping version names to short integers. Change --feature to use a key=value format. FeatureCommandTest.scala: make most tests here true unit tests (that don't start brokers) in order to improve test run time, and allow us to test more cases. For the integration test part, test both KRaft and ZK-based clusters. Add support for mocking feature operations in MockAdminClient.java. upgrade.html: add a section describing how the metadata.version should be upgraded in KRaft clusters. Add kraft_upgrade_test.py to test upgrades between KRaft versions. Reviewers: David Arthur <mumrah@gmail.com>, dengziming <dengziming1993@gmail.com>, José Armando García Sancio <jsancio@gmail.com> Conflicts: fix conflict in upgrade.html
-
由 Colin Patrick McCabe 创作于
The main changes here are ensuring that we always have a metadata.version record in the log, making ˘sure that the bootstrap file can be used for records other than the metadata.version record (for example, we will want to put SCRAM initialization records there), and fixing some bugs. If no feature level record is in the log and the IBP is less than 3.3IV0, then we assume the minimum KRaft version for all records in the log. Fix some issues related to initializing new clusters. If there are no records in the log at all, then insert the bootstrap records in a single batch. If there are records, but no metadata version, process the existing records as though they were metadata.version 3.3IV0 and then append a metadata version record setting version 3.3IV0. Previously, we were not clearly distinguishing between the case where the metadata log was empty, and the case where we just needed to add a metadata.version record. Refactor BootstrapMetadata into an immutable class which contains a 3-tuple of metadata version, record list, and source. The source field is used to log where the bootstrap metadata was obtained from. This could be a bootstrap file, the static configuration, or just the software defaults. Move the logic for reading and writing bootstrap files into BootstrapDirectory.java. Add LogReplayTracker, which tracks whether the log is empty. Fix a bug in FeatureControlManager where it was possible to use a "downgrade" operation to transition to a newer version. Do not store whether we have seen a metadata version or not in FeatureControlManager, since that is now handled by LogReplayTracker. Introduce BatchFileReader, which is a simple way of reading a file containing batches of snapshots that does not require spawning a thread. Rename SnapshotFileWriter to BatchFileWriter to be consistent, and to reflect the fact that bootstrap files aren't snapshots. QuorumController#processBrokerHeartbeat: add an explanatory comment. Reviewers: David Arthur <mumrah@gmail.com>, Jason Gustafson <jason@confluent.io> Merge notes: - re-enable KRaftClusterTest#testUpdateMetadataVersion. - Confluent brokers now register that they support both metadata.version and confluent.metadata.version. Without this, rolling upgrade from AK is not possible. - Storage tool now supports only one version number parsing pathway. Confluent versions begin with "CP-". - Renamed MetadataVersion.CONFLUENT_METATA_VERSION to MetadataVersion.CONFLUENT_FEATURE_NAME to be consistent with MetadataVersion.FEATURE_NAME in AK. - Changed the way the controller calls controllerMetrics.recordControllerLoadTime after activation a bit. - Added MetadataVersionTest.testFromConfluentVersionString - Add support for mocking old versions in BootstrapMetadata, since at least one Confluent test needs it.
-
由 Feng Min 创作于
-
由 Feng Min 创作于
-
由 andymg3 创作于
- 9月 24, 2022
-
-
由 Aishwarya Gune 创作于
* Remove version from RawMetricType and its usages
-
由 Feng Min 创作于
Notable conflicts. * KAFKA-13850( https://github.com/confluentinc/kafka/commit/19581effbf9265db318f855cc37f6b0526c1b544) and KMETA-290(https://github.com/confluentinc/ce-kafka/commit/53cabfd3a80f3533e49489ee31fd3879b7d52454) are not compatible. KAFKA-13850 ends up mostly dropped. * A bunch of upstream connect cleanup conflict with our own non-ak files in ce-kafka Conflicts to be solved: both modified: clients/src/main/java/org/apache/kafka/common/network/NetworkReceive.java both modified: clients/src/main/java/org/apache/kafka/common/requests/AllocateProducerIdsResponse.java both modified: clients/src/main/java/org/apache/kafka/common/requests/DescribeQuorumResponse.java both modified: clients/src/test/java/org/apache/kafka/clients/admin/KafkaAdminClientTest.java both modified: connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerConfig.java both modified: connect/runtime/src/main/java/org/apache/kafka/connect/storage/KafkaConfigBackingStore.java both modified: connect/runtime/src/test/java/org/apache/kafka/connect/runtime/AbstractHerderTest.java both modified: connect/runtime/src/test/java/org/apache/kafka/connect/util/ConnectUtilsTest.java both modified: core/src/main/java/kafka/server/builders/LogManagerBuilder.java both modified: core/src/main/scala/kafka/controller/KafkaController.scala both modified: core/src/main/scala/kafka/log/LogManager.scala both modified: core/src/main/scala/kafka/raft/RaftManager.scala both modified: core/src/main/scala/kafka/server/BrokerServer.scala both modified: core/src/main/scala/kafka/server/ControllerApis.scala both modified: core/src/main/scala/kafka/server/ControllerServer.scala both modified: core/src/main/scala/kafka/server/DynamicBrokerConfig.scala both modified: core/src/main/scala/kafka/server/KafkaConfig.scala both modified: core/src/main/scala/kafka/server/KafkaServer.scala both modified: core/src/main/scala/kafka/server/ReplicationQuotaManager.scala both modified: core/src/main/scala/kafka/server/Server.scala both modified: core/src/main/scala/kafka/tools/DumpLogSegments.scala both modified: core/src/main/scala/kafka/tools/StorageTool.scala both modified: core/src/main/scala/kafka/utils/KafkaScheduler.scala both modified: core/src/test/java/kafka/test/ClusterInstance.java both modified: core/src/test/java/kafka/test/junit/RaftClusterInvocationContext.java both modified: core/src/test/java/kafka/testkit/KafkaClusterTestKit.java both added: core/src/test/scala/integration/kafka/api/ProducerIdExpirationTest.scala both modified: core/src/test/scala/integration/kafka/api/TransactionsExpirationTest.scala both modified: core/src/test/scala/integration/kafka/network/DynamicConnectionQuotaTest.scala both modified: core/src/test/scala/integration/kafka/server/DynamicBrokerReconfigurationTest.scala both modified: core/src/test/scala/integration/kafka/server/KRaftClusterTest.scala both modified: core/src/test/scala/integration/kafka/server/MetadataVersionIntegrationTest.scala both modified: core/src/test/scala/integration/kafka/server/QuorumTestHarness.scala both modified: core/src/test/scala/other/kafka/StressTestLog.scala both modified: core/src/test/scala/other/kafka/TestLinearWriteSpeed.scala both modified: core/src/test/scala/unit/kafka/admin/AddPartitionsTest.scala both added: core/src/test/scala/unit/kafka/admin/MetadataQuorumCommandTest.scala both modified: core/src/test/scala/unit/kafka/controller/ControllerContextTest.scala both modified: core/src/test/scala/unit/kafka/coordinator/group/GroupCoordinatorTest.scala both modified: core/src/test/scala/unit/kafka/coordinator/group/GroupMetadataManagerTest.scala both modified: core/src/test/scala/unit/kafka/integration/KafkaServerTestHarness.scala both modified: core/src/test/scala/unit/kafka/integration/UncleanLeaderElectionTest.scala both modified: core/src/test/scala/unit/kafka/log/AbstractLogCleanerIntegrationTest.scala both modified: core/src/test/scala/unit/kafka/log/BrokerCompressionTest.scala both modified: core/src/test/scala/unit/kafka/log/LocalLogTest.scala both modified: core/src/test/scala/unit/kafka/log/LogCleanerManagerTest.scala both modified: core/src/test/scala/unit/kafka/log/LogCleanerTest.scala both modified: core/src/test/scala/unit/kafka/log/LogConcurrencyTest.scala both modified: core/src/test/scala/unit/kafka/log/LogLoaderTest.scala both modified: core/src/test/scala/unit/kafka/log/LogManagerTest.scala both modified: core/src/test/scala/unit/kafka/log/LogTestUtils.scala deleted by us: core/src/test/scala/unit/kafka/log/UnifiedLogTest.scala both modified: core/src/test/scala/unit/kafka/network/SocketServerTest.scala both added: core/src/test/scala/unit/kafka/server/AllocateProducerIdsRequestTest.scala both modified: core/src/test/scala/unit/kafka/server/ControllerApisTest.scala both modified: core/src/test/scala/unit/kafka/server/DynamicBrokerConfigTest.scala both modified: core/src/test/scala/unit/kafka/server/KafkaRaftServerTest.scala both modified: core/src/test/scala/unit/kafka/server/ReplicationQuotasTest.scala both modified: core/src/test/scala/unit/kafka/server/metadata/BrokerMetadataListenerTest.scala both modified: core/src/test/scala/unit/kafka/tools/DumpLogSegmentsTest.scala both modified: core/src/test/scala/unit/kafka/tools/StorageToolTest.scala both modified: core/src/test/scala/unit/kafka/utils/SchedulerTest.scala both modified: core/src/test/scala/unit/kafka/utils/TestUtils.scala both modified: docs/upgrade.html deleted by them: metadata/src/main/java/org/apache/kafka/controller/BootstrapMetadata.java both modified: metadata/src/main/java/org/apache/kafka/controller/ClusterControlManager.java both modified: metadata/src/main/java/org/apache/kafka/controller/FeatureControlManager.java both modified: metadata/src/main/java/org/apache/kafka/controller/QuorumController.java both modified: metadata/src/main/java/org/apache/kafka/image/FeaturesImage.java both modified: metadata/src/test/java/org/apache/kafka/controller/ClusterControlManagerTest.java both modified: metadata/src/test/java/org/apache/kafka/controller/FeatureControlManagerTest.java both modified: metadata/src/test/java/org/apache/kafka/controller/QuorumControllerTest.java both modified: metadata/src/test/java/org/apache/kafka/controller/ReplicationControlManagerTest.java both added: retry_zinc both modified: server-common/src/test/java/org/apache/kafka/server/common/MetadataVersionTest.java both modified: shell/src/main/java/org/apache/kafka/shell/MetadataNodeManager.java both modified: shell/src/test/java/org/apache/kafka/shell/MetadataNodeManagerTest.java both modified: tests/kafkatest/services/kafka/kafka.py
-
由 Akhilesh C 创作于
KAFKA-14214: Introduce read-write lock to StandardAuthorizer for consistent ACL reads. (#12628) (#7535) Fixes an issue with StandardAuthorizer#authorize that allowed inconsistent results. The underlying concurrent data structure (ConcurrentSkipListMap) had weak consistency guarantees. This meant that a concurrent update to the authorizer data could result in the authorize function processing ACL updates out of order. This patch replaces the concurrent data structures with regular non-thread safe equivalents and uses a read/write lock for thread safety and strong consistency. Reviewers: David Arthur <mumrah@gmail.com>, Jason Gustafson <jason@confluent.io>, Colin P. McCabe <cmccabe@apache.org>, Luke Chen <showuon@gmail.com> Conflicts: metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAuthorizer.java metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAuthorizerData.java Changes to be committed: modified: metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAcl.java modified: metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAuthorizer.java modified: metadata/src/main/java/org/apache/kafka/metadata/authorizer/StandardAuthorizerData.java
-
由 Eric Sirianni 创作于
Add new TelemetrySubmitter role. This role grants privilege to submit metrics to the Confluent Telemetry Receiver (https://collector.telemetry.confluent.cloud). The role will be used by BYOC automation as part of provisioning a Service Account & API Key that will be injected into BYOC clusters TelemetryReporter configuration.
-
由 Stanislav Kozlovski 创作于
-
由 Stanislav Kozlovski 创作于
MINOR: Simplify SBC's Entity interface and the rest of the metric (aggregator,sample) interfaces (#7521) This patch does one simple thing - it simplifies the generic configurable Entity<G> interface to an Entity. The reasoning is that all of our entities (including the test ones) use String, and for 2+ years we have never had reason to change this. Even recently while we've been extending the entities by adding a ReplicaEntity, we still use a String. It is unlikely we will need to use another type anytime soon, and the simplification benefits are high. By doing that, a lot of other valuable downstream simplifications happen - MetricSampleAggregator, AggregationOptions, MetricSample, MetricSampleCompleteness, MetricAggregationResult - all used a complex <G, E> generic interface, where G was always String.
- 9月 23, 2022
-
-
由 gbadoni 创作于
* Added metrics to monitor bad timestamps * format change * format changes * Incorporating Matt's review comment * format * Converted to Option