WIP MINOR: Copy over apache/kafka/3.6 docs into master #586

Open · wants to merge 1 commit into base: asf-site
12 changes: 5 additions & 7 deletions 36/documentation.html
@@ -33,7 +33,7 @@
<!--//#include virtual="../includes/_docs_banner.htm" -->

<h1>Documentation</h1>
<h3>Kafka 3.6 Documentation</h3>
<h3>Kafka 3.4 Documentation</h3>
Contributor Author:
this is stale

Contributor Author:
apache/kafka@4302653 is missing in AK/3.6.x

Member @mimaison (Feb 22, 2024):
This does not seem right. This is the 3.6 documentation so it should be Kafka 3.6.

Prior releases: <a href="/07/documentation.html">0.7.x</a>,
<a href="/08/documentation.html">0.8.0</a>,
<a href="/081/documentation.html">0.8.1.X</a>,
@@ -54,12 +54,10 @@ <h3>Kafka 3.6 Documentation</h3>
<a href="/26/documentation.html">2.6.X</a>,
<a href="/27/documentation.html">2.7.X</a>,
<a href="/28/documentation.html">2.8.X</a>,
<a href="/30/documentation.html">3.0.X</a>,
<a href="/31/documentation.html">3.1.X</a>,
<a href="/32/documentation.html">3.2.X</a>,
<a href="/33/documentation.html">3.3.X</a>,
<a href="/34/documentation.html">3.4.X</a>,
<a href="/35/documentation.html">3.5.X</a>.
<a href="/30/documentation.html">3.0.X</a>.
<a href="/31/documentation.html">3.1.X</a>.
<a href="/32/documentation.html">3.2.X</a>.
<a href="/33/documentation.html">3.3.X</a>.
Member:

This is the 3.6 docs, so it should point to all previous releases including 3.4 and 3.5. Why are we removing them?

Contributor Author:

Yes, I don't intend to remove them - I captured it in https://github.com/apache/kafka-site/pull/586/files#r1499488695 - that commit is missing.


<h2 class="anchor-heading"><a id="gettingStarted" class="anchor-link"></a><a href="#gettingStarted">1. Getting Started</a></h2>
<h3 class="anchor-heading"><a id="introduction" class="anchor-link"></a><a href="#introduction">1.1 Introduction</a></h3>
4 changes: 2 additions & 2 deletions 36/generated/connect_metrics.html
@@ -1,5 +1,5 @@
[2023-09-15 00:40:42,725] INFO Metrics scheduler closed (org.apache.kafka.common.metrics.Metrics:693)
[2023-09-15 00:40:42,729] INFO Metrics reporters closed (org.apache.kafka.common.metrics.Metrics:703)
[2024-02-22 11:02:50,169] INFO Metrics scheduler closed (org.apache.kafka.common.metrics.Metrics:693)
[2024-02-22 11:02:50,170] INFO Metrics reporters closed (org.apache.kafka.common.metrics.Metrics:703)
<table class="data-table"><tbody>
<tr>
<td colspan=3 class="mbeanName" style="background-color:#ccc; font-weight: bold;">kafka.connect:type=connect-worker-metrics</td></tr>
2 changes: 1 addition & 1 deletion 36/generated/connect_rest.yaml
@@ -8,7 +8,7 @@ info:
name: Apache 2.0
url: https://www.apache.org/licenses/LICENSE-2.0.html
title: Kafka Connect REST API
version: 3.6.1
version: 3.6.2-SNAPSHOT
Contributor Author:

shouldn't be snapshot

Member:

Again we don't want this change. The docs should cover the last released version for 3.6, hence 3.6.1.

paths:
/:
get:
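Side note for reviewers: an easy way to confirm which version a running Connect worker actually reports is the root endpoint of this same REST API. A minimal sketch; the worker address is an assumption, adjust to your environment:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ConnectVersionCheck {
    public static void main(String[] args) throws Exception {
        // GET / on a Connect worker returns its version, commit, and Kafka cluster id.
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8083/"))  // assumed worker address
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        // For a 3.6.1 worker this should print something like:
        // {"version":"3.6.1","commit":"...","kafka_cluster_id":"..."}
        System.out.println(response.body());
    }
}
```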
8 changes: 4 additions & 4 deletions 36/generated/streams_config.html
@@ -34,7 +34,7 @@ <h4><a id="state.dir"></a><a id="streamsconfigs_state.dir" href="#streamsconfigs
<p>Directory location for state store. This path must be unique for each streams instance sharing the same underlying filesystem.</p>
<table><tbody>
<tr><th>Type:</th><td>string</td></tr>
<tr><th>Default:</th><td>/var/folders/1w/r49gc42j1ml6ddw0lhlvt9pw0000gn/T//kafka-streams</td></tr>
<tr><th>Default:</th><td>/var/folders/z6/tv_ggjzd3v3b5vl2jy2bscph0000gp/T//kafka-streams</td></tr>
<tr><th>Valid Values:</th><td></td></tr>
<tr><th>Importance:</th><td>high</td></tr>
</tbody></table>
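The machine-specific temp path in this default is why the generated doc churns whenever it is rebuilt on a different machine. Deployments normally pin state.dir explicitly; a minimal sketch, where the directory shown is an arbitrary example:

```java
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");    // example app id
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
// Pin the state directory instead of relying on the java.io.tmpdir-based default.
// The path must be unique for each instance sharing the same filesystem.
props.put(StreamsConfig.STATE_DIR_CONFIG, "/var/lib/kafka-streams");
```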
@@ -285,7 +285,7 @@ <h4><a id="topology.optimization"></a><a id="streamsconfigs_topology.optimizatio
<table><tbody>
<tr><th>Type:</th><td>string</td></tr>
<tr><th>Default:</th><td>none</td></tr>
<tr><th>Valid Values:</th><td>org.apache.kafka.streams.StreamsConfig$$Lambda$17/0x0000000800094840@5f341870</td></tr>
<tr><th>Valid Values:</th><td>org.apache.kafka.streams.StreamsConfig$$Lambda$21/0x0000000800084000@59ec2012</td></tr>
<tr><th>Importance:</th><td>medium</td></tr>
</tbody></table>
</li>
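The lambda toString in "Valid Values" above is a doc-generation artifact (note it appears on both sides of this diff); the accepted values are simply "none", "all", or a list of specific optimizations. A sketch of setting it, assuming nothing beyond the public StreamsConfig constants:

```java
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

Properties props = new Properties();
// StreamsConfig.OPTIMIZE is the "all" constant; StreamsConfig.NO_OPTIMIZATION is "none".
props.put(StreamsConfig.TOPOLOGY_OPTIMIZATION_CONFIG, StreamsConfig.OPTIMIZE);
```

Note that the same properties must also be passed to StreamsBuilder#build(Properties) for the optimizations to actually be applied to the topology.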
@@ -541,11 +541,11 @@ <h4><a id="state.cleanup.delay.ms"></a><a id="streamsconfigs_state.cleanup.delay
</li>
<li>
<h4><a id="upgrade.from"></a><a id="streamsconfigs_upgrade.from" href="#streamsconfigs_upgrade.from">upgrade.from</a></h4>
<p>Allows upgrading in a backward compatible way. This is needed when upgrading from [0.10.0, 1.1] to 2.0+, or when upgrading from [2.0, 2.3] to 2.4+. When upgrading from 3.3 to a newer version it is not required to specify this config. Default is `null`. Accepted values are "0.10.0", "0.10.1", "0.10.2", "0.11.0", "1.0", "1.1", "2.0", "2.1", "2.2", "2.3", "2.4", "2.5", "2.6", "2.7", "2.8", "3.0", "3.1", "3.2", "3.3", "3.4" (for upgrading from the corresponding old version).</p>
<p>Allows upgrading in a backward compatible way. This is needed when upgrading from [0.10.0, 1.1] to 2.0+, or when upgrading from [2.0, 2.3] to 2.4+. When upgrading from 3.3 to a newer version it is not required to specify this config. Default is `null`. Accepted values are "0.10.0", "0.10.1", "0.10.2", "0.11.0", "1.0", "1.1", "2.0", "2.1", "2.2", "2.3", "2.4", "2.5", "2.6", "2.7", "2.8", "3.0", "3.1", "3.2", "3.3", "3.4", "3.5" (for upgrading from the corresponding old version).</p>
<table><tbody>
<tr><th>Type:</th><td>string</td></tr>
<tr><th>Default:</th><td>null</td></tr>
<tr><th>Valid Values:</th><td>[null, 0.10.0, 0.10.1, 0.10.2, 0.11.0, 1.0, 1.1, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 3.0, 3.1, 3.2, 3.3, 3.4]</td></tr>
<tr><th>Valid Values:</th><td>[null, 0.10.0, 0.10.1, 0.10.2, 0.11.0, 1.0, 1.1, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5]</td></tr>
<tr><th>Importance:</th><td>low</td></tr>
</tbody></table>
</li>
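For illustration, upgrade.from is set only for the first rolling bounce of an upgrade and removed again in the second; a sketch, assuming a hypothetical upgrade from a 3.4 deployment:

```java
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

Properties props = new Properties();
// First rolling bounce: the new binaries keep speaking the old (3.4) protocol.
props.put(StreamsConfig.UPGRADE_FROM_CONFIG, StreamsConfig.UPGRADE_FROM_34);
// Second rolling bounce: remove upgrade.from (leave it at the default null)
// so all instances switch to the new protocol.
```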
2 changes: 1 addition & 1 deletion 36/js/templateData.js
@@ -19,6 +19,6 @@ limitations under the License.
var context={
"version": "36",
"dotVersion": "3.6",
"fullDotVersion": "3.6.1",
"fullDotVersion": "3.6.2-SNAPSHOT",
Contributor Author:

shouldn't be snapshot, I should check out 3.6.1 and build from there I think

Member:

Yes we want this to stay 3.6.1

"scalaVersion": "2.13"
};
90 changes: 11 additions & 79 deletions 36/ops.html
@@ -1453,8 +1453,8 @@ <h4 class="anchor-heading"><a id="remote_jmx" class="anchor-link"></a><a href="#
</tr>
<tr>
<td>Byte in rate from other brokers</td>
<td>kafka.server:type=BrokerTopicMetrics,name=ReplicationBytesInPerSec</td>
<td>Byte in (from the other brokers) rate across all topics.</td>
<td>kafka.server:type=BrokerTopicMetrics,name=ReplicationBytesInPerSec,topic=([-.\w]+)</td>
<td>Byte in (from the other brokers) rate per topic. Omitting 'topic=(...)' will yield the all-topic rate.</td>
</tr>
<tr>
<td>Controller Request rate from Broker</td>
@@ -1537,8 +1537,8 @@ <h4 class="anchor-heading"><a id="remote_jmx" class="anchor-link"></a><a href="#
</tr>
<tr>
<td>Byte out rate to other brokers</td>
<td>kafka.server:type=BrokerTopicMetrics,name=ReplicationBytesOutPerSec</td>
<td>Byte out (to the other brokers) rate across all topics</td>
<td>kafka.server:type=BrokerTopicMetrics,name=ReplicationBytesOutPerSec,topic=([-.\w]+)</td>
<td>Byte out (to the other brokers) rate per topic. Omitting 'topic=(...)' will yield the all-topic rate.</td>
</tr>
<tr>
<td>Rejected byte rate</td>
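To make the per-topic vs. all-topic distinction above concrete: a hedged sketch of polling both MBean variants over JMX, assuming a broker with remote JMX enabled on localhost:9999 and a topic named my-topic (both are assumptions; the per-topic form follows the table above):

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ReplicationBytesInProbe {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi"); // assumed JMX port
        try (JMXConnector jmxc = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = jmxc.getMBeanServerConnection();
            // Without the topic tag: the aggregate all-topic replication byte-in rate.
            ObjectName allTopics = new ObjectName(
                    "kafka.server:type=BrokerTopicMetrics,name=ReplicationBytesInPerSec");
            // With the topic tag: the rate for a single topic.
            ObjectName oneTopic = new ObjectName(
                    "kafka.server:type=BrokerTopicMetrics,name=ReplicationBytesInPerSec,topic=my-topic");
            System.out.println("all topics: " + conn.getAttribute(allTopics, "OneMinuteRate"));
            System.out.println("my-topic:   " + conn.getAttribute(oneTopic, "OneMinuteRate"));
        }
    }
}
```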
@@ -3984,95 +3984,27 @@ <h5 class="anchor-heading"><a id="tiered_storage_config_topic" class="anchor-lin
If unset, the value in <code>retention.ms</code> and <code>retention.bytes</code> will be used.
</p>

<h4 class="anchor-heading"><a id="tiered_storage_config_ex" class="anchor-link"></a><a href="#tiered_storage_config_ex">Quick Start Example</a></h4>

<p>Apache Kafka doesn't provide an out-of-the-box RemoteStorageManager implementation. To preview the tiered storage
feature, the <a href="https://github.com/apache/kafka/blob/trunk/storage/src/test/java/org/apache/kafka/server/log/remote/storage/LocalTieredStorage.java">LocalTieredStorage</a>
implementation written for integration tests can be used; it creates a temporary directory in local storage to simulate remote storage.
</p>

<p>To use the `LocalTieredStorage`, the test library needs to be built locally:</p>
<pre># please checkout to the specific version tag you're using before building it
# ex: `git checkout 3.6.1`
./gradlew clean :storage:testJar</pre>
<p>After a successful build, there should be a `kafka-storage-x.x.x-test.jar` file under `storage/build/libs`.
Next, set the broker-side configurations to enable the tiered storage feature.</p>
<h4 class="anchor-heading"><a id="tiered_storage_config_ex" class="anchor-link"></a><a href="#tiered_storage_config_ex">Configurations Example</a></h4>
Contributor Author:

81773f6 was not done in AK 3.6


<p>Here is a sample configuration to enable the tiered storage feature on the broker side:
<pre>
# Sample Zookeeper/Kraft broker server.properties listening on PLAINTEXT://:9092
remote.log.storage.system.enable=true

# Setting the listener for the clients in RemoteLogMetadataManager to talk to the brokers.
# Please provide the implementation for remoteStorageManager. This is the mandatory configuration for tiered storage.
# remote.log.storage.manager.class.name=org.apache.kafka.server.log.remote.storage.NoOpRemoteStorageManager
# Using the "PLAINTEXT" listener for the clients in RemoteLogMetadataManager to talk to the brokers.
remote.log.metadata.manager.listener.name=PLAINTEXT

# Please provide the implementation info for remoteStorageManager.
# This is the mandatory configuration for tiered storage.
# Here, we use the `LocalTieredStorage` built above.
remote.log.storage.manager.class.name=org.apache.kafka.server.log.remote.storage.LocalTieredStorage
remote.log.storage.manager.class.path=/PATH/TO/kafka-storage-x.x.x-test.jar

# These 2 prefix are default values, but customizable
remote.log.storage.manager.impl.prefix=rsm.config.
remote.log.metadata.manager.impl.prefix=rlmm.config.

# Configure the directory used for `LocalTieredStorage`
# Note: please make sure the brokers have access to this directory
rsm.config.dir=/tmp/kafka-remote-storage

# This needs to be changed if number of brokers in the cluster is more than 1
rlmm.config.remote.log.metadata.topic.replication.factor=1

# Try to speed up the log retention check interval for testing
log.retention.check.interval.ms=1000
</pre>
</p>

<p>Follow the <a href="#quickstart_startserver">quick start guide</a> to start up the Kafka environment.
Then, create a topic with tiered storage enabled with the following configs:

<pre>
# remote.storage.enable=true -> enables tiered storage on the topic
# local.retention.ms=1000 -> The number of milliseconds to keep the local log segment before it gets deleted.
# Note that a local log segment is eligible for deletion only after it gets uploaded to remote.
# retention.ms=3600000 -> when segments exceed this time, the segments in remote storage will be deleted
# segment.bytes=1048576 -> for test only, to speed up the log segment rolling interval
# file.delete.delay.ms=10000 -> for test only, to speed up the local-log segment file delete delay

bin/kafka-topics.sh --create --topic tieredTopic --bootstrap-server localhost:9092 \
--config remote.storage.enable=true --config local.retention.ms=1000 --config retention.ms=3600000 \
--config segment.bytes=1048576 --config file.delete.delay.ms=1000
<p>After the broker is started, create a topic with tiered storage enabled and a small log retention time to try this feature:
<pre>bin/kafka-topics.sh --create --topic tieredTopic --bootstrap-server localhost:9092 --config remote.storage.enable=true --config local.retention.ms=1000
</pre>
</p>
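Equivalently, the topic can be created programmatically; a sketch using the Admin client, with the topic name and configs copied from the command above (the partition and replication counts are assumptions for a single-broker test):

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTieredTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            NewTopic topic = new NewTopic("tieredTopic", 1, (short) 1)
                    .configs(Map.of(
                            "remote.storage.enable", "true", // enable tiered storage on the topic
                            "local.retention.ms", "1000"));  // keep local segments only briefly
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```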

<p>Try to send messages to the `tieredTopic` topic to roll the log segment:</p>
Contributor Author:

Seems like this should be kept?

Contributor Author:

Need to find the commit that introduced this

Member:

Yes I'd keep the larger section we currently have in the docs.


<pre>
bin/kafka-producer-perf-test.sh --topic tieredTopic --num-records 1200 --record-size 1024 --throughput -1 --producer-props bootstrap.servers=localhost:9092
</pre>

<p>Then, after the active segment is rolled, the old segment should be moved to remote storage and deleted locally.
This can be verified by checking the remote log directory configured above. For example:
</p>

<pre> > ls /tmp/kafka-remote-storage/kafka-tiered-storage/tieredTopic-0-jF8s79t9SrG_PNqlwv7bAA
00000000000000000000-knnxbs3FSRyKdPcSAOQC-w.index
00000000000000000000-knnxbs3FSRyKdPcSAOQC-w.snapshot
00000000000000000000-knnxbs3FSRyKdPcSAOQC-w.leader_epoch_checkpoint
00000000000000000000-knnxbs3FSRyKdPcSAOQC-w.timeindex
00000000000000000000-knnxbs3FSRyKdPcSAOQC-w.log
</pre>

<p>Lastly, we can try to consume some data from the beginning and print the offset, to make sure it successfully fetches offset 0 from remote storage.</p>

<pre>bin/kafka-console-consumer.sh --topic tieredTopic --from-beginning --max-messages 1 --bootstrap-server localhost:9092 --property print.offset=true</pre>

<p>Please note, if you want to disable tiered storage at the cluster level, you should delete the tiered-storage-enabled topics explicitly.
Attempting to disable tiered storage at the cluster level without first deleting them will result in an exception during startup.</p>

<pre>bin/kafka-topics.sh --delete --topic tieredTopic --bootstrap-server localhost:9092</pre>

<p>After topics are deleted, you're safe to set <code>remote.log.storage.system.enable=false</code> in the broker configuration.</p>

<h4 class="anchor-heading"><a id="tiered_storage_limitation" class="anchor-link"></a><a href="#tiered_storage_limitation">Limitations</a></h4>

<p>While the early access release of Tiered Storage offers the opportunity to try out this new feature, it is important to be aware of the following limitations:
16 changes: 16 additions & 0 deletions 36/streams/developer-guide/dsl-api.html
@@ -2818,6 +2818,7 @@ <h5><a class="toc-backref" href="#id34">KTable-KTable Foreign-Key
(leftValue, rightValue) -&gt; &quot;left=&quot; + leftValue + &quot;, right=&quot; + rightValue, /* ValueJoiner */
Joined.keySerde(Serdes.String()) /* key */
.withValueSerde(Serdes.Long()) /* left value */
.withGracePeriod(Duration.ZERO) /* grace period */
);

// Java 7 example
@@ -2830,6 +2831,7 @@ <h5><a class="toc-backref" href="#id34">KTable-KTable Foreign-Key
},
Joined.keySerde(Serdes.String()) /* key */
.withValueSerde(Serdes.Long()) /* left value */
.withGracePeriod(Duration.ZERO) /* grace period */
);</code></pre>
<p>Detailed behavior:</p>
<ul>
@@ -2849,6 +2851,12 @@ <h5><a class="toc-backref" href="#id34">KTable-KTable Foreign-Key
<li>When the table is <a class="reference internal" href="#versioned-state-stores"><span class="std std-ref">versioned</span></a>,
the table record to join with is determined by performing a timestamped lookup, i.e., the table record which is joined will be the latest-by-timestamp record with timestamp
less than or equal to the stream record timestamp. If the stream record timestamp is older than the table's history retention, then the record is dropped.</li>
<li>To use the grace period, the table needs to be <a class="reference internal" href="#versioned-state-stores"><span class="std std-ref">versioned</span></a> (a sketch of materializing a versioned table follows below).
This causes the stream to buffer records for the specified grace period before trying to find a matching record with the right timestamp in the table.
The grace period matters when a table record has a timestamp less than or equal to the stream record's timestamp but arrives after the stream record:
if the table record arrives within the grace period, the join still occurs; if it does not arrive before the grace period expires, the join proceeds as normal.
</li>
</ul>
<p class="last">See the semantics overview at the bottom of this section for a detailed description.</p>
</td>
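Since the grace period only works against a versioned table, here is a minimal sketch of materializing one (the topic, store name, and retention are placeholders); the history retention must be at least as large as the grace period used in the join:

```java
import java.time.Duration;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.Stores;

StreamsBuilder builder = new StreamsBuilder();
// Materialize the table as a versioned store so the stream-table join can
// look up the record with the right timestamp within the grace period.
KTable<String, Long> table = builder.table(
        "table-topic",                                      // placeholder topic
        Consumed.with(Serdes.String(), Serdes.Long()),
        Materialized.as(
                Stores.persistentVersionedKeyValueStore(
                        "versioned-store",                  // placeholder store name
                        Duration.ofMinutes(10))));          // history retention >= grace period
```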
@@ -2872,6 +2880,7 @@ <h5><a class="toc-backref" href="#id34">KTable-KTable Foreign-Key
(leftValue, rightValue) -&gt; &quot;left=&quot; + leftValue + &quot;, right=&quot; + rightValue, /* ValueJoiner */
Joined.keySerde(Serdes.String()) /* key */
.withValueSerde(Serdes.Long()) /* left value */
.withGracePeriod(Duration.ZERO) /* grace period */
);

// Java 7 example
@@ -2884,6 +2893,7 @@ <h5><a class="toc-backref" href="#id34">KTable-KTable Foreign-Key
},
Joined.keySerde(Serdes.String()) /* key */
.withValueSerde(Serdes.Long()) /* left value */
.withGracePeriod(Duration.ZERO) /* grace period */
);</code></pre>
<p>Detailed behavior:</p>
<ul>
@@ -2906,6 +2916,12 @@ <h5><a class="toc-backref" href="#id34">KTable-KTable Foreign-Key
<li>When the table is <a class="reference internal" href="#versioned-state-stores"><span class="std std-ref">versioned</span></a>,
the table record to join with is determined by performing a timestamped lookup, i.e., the table record which is joined will be the latest-by-timestamp record with timestamp
less than or equal to the stream record timestamp. If the stream record timestamp is older than the table's history retention, then the record that is joined will be <code class="docutils literal"><span class="pre">null</span></code>.</li>
<li>To use the grace period, the table needs to be <a class="reference internal" href="#versioned-state-stores"><span class="std std-ref">versioned</span></a> (see the sketch above).
This causes the stream to buffer records for the specified grace period before trying to find a matching record with the right timestamp in the table.
The grace period matters when a table record has a timestamp less than or equal to the stream record's timestamp but arrives after the stream record:
if the table record arrives within the grace period, the join still occurs; if it does not arrive before the grace period expires, the join proceeds as normal.
</li>
</ul>
<p class="last">See the semantics overview at the bottom of this section for a detailed description.</p>
</td>