Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge FreeBSD 2024-03-08 #2176

Merged
merged 207 commits into from
Aug 3, 2024
Merged

Conversation

bsdjhb
Copy link
Collaborator

@bsdjhb bsdjhb commented Aug 3, 2024

PR for CI

robn and others added 30 commits February 15, 2024 11:44
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: iXsystems, Inc.
Closes #15887
I think I can say with some confidence that anyone making a new storage
type in 2023 is doing their own thing with compression, not this.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: iXsystems, Inc.
Closes #15887
Always do them on the heap, and when we know how much we need, only that
much.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: iXsystems, Inc.
Closes #15887
We're about to have different kinds of things that we'll compare on key,
so generalise this function to support that.

(It actually worked fine because of the way the casts work out, but it
requires the key to be at the start of the object so the cast through
ddt_entry_t works, and even then it reads strangely for anything that's
not a ddt_entry_t).

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: iXsystems, Inc.
Closes #15887
We want to add other kinds of dedup-related objects and keep stats for
them. This makes those functions easier to use from outside ddt.c.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: iXsystems, Inc.
Closes #15887
It was a weird and confusing name, because it wasn't actually returning
the number of DVAs in the entry (as in, in the value/phys part) but the
maximum number of possible DVAs in a BP generated from the entry, based
on the encrypt bit in the key. This is unlike the similarly named
BP_GET_NDVAS, which really does return the number of DVAs.

Since its only used in this one place, and for a specific purpose, it
seemed more sensible to just write it in-place and remove the name.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: iXsystems, Inc.
Closes #15887
Just to make it easier to know which bits to pay attention to.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: iXsystems, Inc.
Closes #15887
Mostly for consistency, so the reader is less likely to wonder why these
things look different.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: iXsystems, Inc.
Closes #15887
Things get confused when there's more than one name for a thing.

Note that we don't do this for ddt_object_t, ddt_histogram_t and
ddt_stat_t because they're part of the public ZFS interface.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: iXsystems, Inc.
Closes #15887
ddt_get_dedup_histogram() was actually checking it, just in an extremely
cursed way. ddt_get_dedup_object_stats() wasn't, but wasn't being called
from a dangerous place so no one noticed.

These checks are necessary, because spa_ddt[] is not populated until
spa_load(), but the spa can exist before that, while being created, and
as vdevs and metaslabs are initialised the space accounting functions
will be called to update pool space counts.

Probably the whole create path doesn't need to go asking for space
accounting from metadata subsystems until after the pool is created.
This will at least catch misuse.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: iXsystems, Inc.
Closes #15887
Store objects store keys and values, so have them take those types and
nothing more. This way, they don't need to be concerned about the "kind"
of entry being operated on; the dispatch layer can take care of the
appropriate conversions.

This adds a "contains" op to see if a particular entry exists without
loading it, which makes a couple of things easier to do; in particular,
it allows us to avoid an allocation in ddt_class_contains().

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: iXsystems, Inc.
Closes #15887
Nothing uses it.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: iXsystems, Inc.
Closes #15887
Only a single bit is needed to track entry state, and definitely not two
whole bytes. Some light refactoring in ddt_lookup() is needed to support
this, but it reads a lot better now.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: iXsystems, Inc.
Closes #15887
Most values in zio_checksum can never be used for dedup, partly because
the dedup= property only offers a limited list, but also some values (eg
ZIO_CHECKSUM_OFF) aren't real and will never be seen.

A true flag would be better than a hardcoded list, but thats more
cleanup elsewhere than I want to do right now.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: iXsystems, Inc.
Closes #15887
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: iXsystems, Inc.
Closes #15887
Skip cross filesystem block cloning tests on FreeBSD if running
less than version 14.0.  Cross filesystem copy_file_range() was
added in FreeBSD 14.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #15901
On Linux block devices used for vdevs will by partitioned.  The block
device must be large enough for an 64M partition starting at offset
of 2048 sectors (part1), and a second 64M reserved partition at the
end of the device (part9).

This commit adds a capacity check when creating the GPT label to
immediately detect a device which is too small.  With the existing
code this would be caught slightly latter when attempting to use
the partition.  Catching it sooner let's us print a more useful error.

Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #15898
Follow up to 99495ba which
accidentally introduce this regression.

Reviewed by: Brian Behlendorf <[email protected]>
Signed-off-by: Quartz <[email protected]>
Closes #15907
-Wformat-truncation looks for places where the return code of snprintf()
is unchecked and the provided buffer might be too short. This is based
on a heuristic that can change between compiler versions.

It has been seen to get this wrong in ddt_object_name(), leading to
DDT_NAMELEN being increased somewhat arbitrarily.

There's no good reason to have this warning enabled, so here we disable
it everywhere. Truncation may be undesirable, but snprintf() is
guaranteed to emit a trailing null, so at worst we get a short string,
not a buffer overrun.

Reviewed by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #15908
This is the buffer size passed to ddt_object_name(), to expand the
DMU_POOL_DDT format. That format inserts the table checksum, class and
type names, which as I write this are max 6, 9 and 3, respectively.

Reviewed by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #15908
It is possible that on-disk filesystem format causes allocation of
buffers of size larger than maxbcachebuf.  Currently, getblkx() and
indirectly bufkva_alloc() panic in that situation.

It is more useful to return an error instead, allowing the system to
continue running.

PR:	277414
Reported by:	Robert Morris <[email protected]>
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
This function is used by netlink(9) only.  The netlink(9) taskqueue thread
runs in the vnet of the socket whose request the thread is processing
right now.  This is a correct vnet and resetting it to vnet0 is incorrect.
If the function is to be used by any other caller in addition to
netlink(9), it would be caller's responsiblity to provide correct vnet(9).

Reviewed by:		melifaro, dchagin
Differential Revision:	https://reviews.freebsd.org/D44191
PR:			277286
Reviewed by:	emaste, kevans
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D44195
similar to Apple _POSIX_SPAWN_DISABLE_ASLR

Reviewed by:	emaste, kevans
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D44195
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Correct skb_queue_tail to queue the buffer at the tail of the skbuff.
The skbuff is a circular doubly-linked list, and we call with a pointer
to the head of the list.  Thus queueing before the head gives us a
queueing at the tail.

As a motivating factor, the current behaviour (queueing at the head) was
causing frequent kernel panics from my RTL8822BE wireless card, which
uses the rtw88 driver.  Interrupts can cause buffers to be added to the
rtwdev c2h_queue while the queue is being drained in rtw_c2h_work.
Queueing at the head would leave the nascent entry in the linked list
pointing to the old, now freed, memory for the buffer being processed.
When rtw_c2h_work is next called, we try reading this and so panic.

Reviewed by:	emaste, bz
MFC after:	1 week
Differential Revision: https://reviews.freebsd.org/D44192
hostap MLME uses Linux data structures and definitions not available
in FreeBSD. The ability for hostapd to select the frequency (channel)
depends Linux MLME, though strictly it's not required. Work around the
Linux MLME requirement to configure device frequency.

The detailed description is: hostapd will only set the channel (frequency)
when Linux MLME is configured. Enabling NEED_AP_MLME will result in
numerous build errors due do Linux data structures and definitions not
available under FreeBSD. The code to set the frequency from the selected
channel is only within the NEED_AP_MLME code path because without MLME,
hostapd_get_hw_features() is an inline that always returns -1 whereas with
MLME hostapd_get_hw_features() will obtain hardware features from the
kernel. Until such time we simply set the frequency as configured.

PR:		276375
MFC after:	1 month
On this platform early console access is possible via SBI. Follow recent
changes to EARLY_PRINTF option and give it a named constant.

Update the commented option in GENERIC so that it compiles.

Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D44100
cp890582 and others added 28 commits August 3, 2024 12:56
…` Command

The 1G speed on DAC medium is incorrectly labeled as 1000baseT, it
should be 1000baseCX. Updated the label accordingly.

Reviewed by:            imp
Approved by:            imp
Differential revision:  https://reviews.freebsd.org/D42952
Enabled User Configuration of 1G Speed on Wh+ SFP28 Port with AOC
cable.

Reviewed by:            imp
Approved by:            imp
Differential revision:  https://reviews.freebsd.org/D42953
Increasing the maximum configurable MTU from 9000 to 9600 to
align with the firmware's capability of handling an MTU up to 9600.

Reviewed by:            imp
Approved by:            imp
Differential revision:  https://reviews.freebsd.org/D42954
Update Firmware Header to Latest Version 1.10.2.136.

Reviewed by:            imp
Approved by:            imp
Differential revision:  https://reviews.freebsd.org/D42955
…HOR Controller

The newly added port extended hardware statistics are now accessible to
users through the sysctl interface. Also, Few obsolete stats are removed
and few stats are renamed.

Reviewed by:            imp
Approved by:            imp
Differential revision:  https://reviews.freebsd.org/D42957
This update enables the display of pluggable module information
to users via the ifconfig utility.

Reviewed by:            imp
Approved by:            imp
Differential revision:  https://reviews.freebsd.org/D42958
Add support for 50G, 100G and 200G PAM4 support

Reviewed by:            imp
Approved by:            imp
Differential revision:  https://reviews.freebsd.org/D42959
The firmware lacks support for manually setting 1G and 10G baseT speeds.
However, the driver can enable auto speed masks to achieve automatic configuration
at these speeds.

Reviewed by:            imp
Approved by:            imp
Differential revision:  https://reviews.freebsd.org/D42960
if_bnxt: Integrate AOC Cable Support into Current 40G PHY Speed

Reviewed by:            imp
Approved by:            imp
Differential revision:  https://reviews.freebsd.org/D42956
It's unclear to me that any of these symbols ever existed.  The ones
I've spot checked are only mentioned in the initial Citrus iconv import
(commit ad30f8e) and this code hasn't changed much over time.

Reviewed by:	kevans
Differential Revision:	https://reviews.freebsd.org/D44183
These symbols aren't present on i386 so don't try to expose them.

Given the structure of quad/Makefile.inc, it might make more sense to
have per-arch symbol maps here, but this is sufficent to build with
WITHOUT_UNDEFINED_VERSION on i386.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D44243
The assembly implementation was removed in 2006 (commit 3c03c70).

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D44242
I'd forgotten that we have to adjust the stderr tests from
upstream. Remove the OK files. Also remove system-status.*.  These
restore the fixes I made in 517e52b which were lost when I imported
the last version of awk.

Also, force LANG to be C.UTF-8 when testing to ensure that stray lang
settings don't fail tests.

Sponsored by:		Netflix
These files were bogusly added when I imported awk 2nd edition.

Sponsored by:		Netflix
When doing mbuf queueing, the packet filter hooks in ether_demux(),
ip_input(), and ip6_input() are by-passed. This means that the packet
filters don't process incoming packets, which might result in
connection failures. For example bypassing the TCP sequence number
validation will result in dropping valid packets.
Please note that this patch is only disabling mbuf queueing, not LRO.

Reported by:		Herbert J. Skuhra
Reviewed by:		glebius, rrs, rscheff
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D43769
Add the IP, UDP, and TCP receive static probes to the code path,
which avoids if_input.

Reviewed by:		rrs, markj
MFC after:		1 week`
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D43727
The msi address contains apic id. The code in vmbus_pcib_map_msi()
treats it as cpu id, which could cause mis-configuration of msix
IRQs, leading to missing interrupts for SRIOV devices. This happens
when apic id is not the same as cpu id on certain large VM sizes
with multiple numa domains in Azure. Fix this issue by correctly
mapping apic ids to cpu ids.

On vPCI version before 1.4, it only supports up to 64 vcpus
for msi/msix interrupt. This change also adds a check and returns
error if the vcpu_id is greater than 63.

Reported by:	NetApp
Tested by:	whu
MFC after:	1 week
This is standard practice for clock drivers that register clocks
dynamically. Nothing else uses the CLK_DEBUG macro.

The result is that the name and frequency of the fixed clock is printed
for a verbose boot, which may aid in debugging.

Reviewed by:	manu
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D44265
We may attach several of these devices, but there is no meaningful
information added to dmesg. For example:

  ofwbus0: <Open Firmware Device Tree>
  clk_fixed0: <Fixed clock> on ofwbus0
  clk_fixed1: <Fixed clock> on ofwbus0
  clk_fixed2: <Fixed clock> on ofwbus0
  clk_fixed3: <Fixed clock> on ofwbus0
  clk_fixed4: <Fixed clock> on ofwbus0
  clk_fixed5: <Fixed clock> on ofwbus0
  clk_fixed6: <Fixed clock> on ofwbus0
  clk_fixed7: <Fixed clock> on ofwbus0
  clk_fixed8: <Fixed clock> on ofwbus0
  clk_fixed9: <Fixed clock> on ofwbus0
  clk_fixed10: <Fixed clock> on ofwbus0
  clk_fixed11: <Fixed clock> on ofwbus0

To reduce this noise, quiet the devices for by default. For verbose
boot, the message will be emitted.

Reviewed by:	manu
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D44266
If the call to clknode_get_freq() returns an error (unlikely), report
this, rather than printing the error code as the clock frequency.

If the clock has no parent (e.g. a fixed reference clock), print "none"
rather than "(NULL)(-1)". This is a more human-legible presentation of the
same information.

Reviewed by:	manu
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D44267
Trying to probe+attach the child device at the point it is added comes
before the syscon handle is set up (if relevant). It will therefore be
unavailable to the attach method which is expecting it, and the first
attempt to attach the device will fail.

Just rely on the call to bus_generic_attach() at the end of the function
to perform probe+attach of dev's children.

Reviewed by:	manu
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D44268
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
to allow overwrite global default if needed.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
crunchgen generates a foo.lo for each binary it will end up crunching
into the final product.  While they have a dependency on the libs that
are used to link them, nothing will force relinking if the set of libs
needed to link them is changed.  Because of this, incremental builds may
not be possible if one builds a version of, e.g., rescue/ with a broken
set of libs specified for a project -- a subsequent fix won't be rolled
in cleanly, it will require purging the rescue/ objdir.

This is a bit crude, but the foo.mk we generate doesn't actually get
regenerated all that often in practice, so a spurious relink for the
vast majority of crunched objects won't actually happen all that often.

Reviewed by:	bapt, emaste, imp
Differential Revision:	https://reviews.freebsd.org/D43869
@bsdjhb bsdjhb merged commit cfb1786 into CTSRD-CHERI:dev Aug 3, 2024
29 checks passed
@bsdjhb bsdjhb deleted the merge-freebsd-20240308 branch August 3, 2024 19:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.