Commit Graph

1455 Commits

Author SHA1 Message Date
Alexander Motin 026fe79646 Speculative prefetch for reordered requests
Before this change speculative prefetcher was able to detect a stream
only if all of its accesses are perfectly sequential.  It was easy to
implement and is perfectly fine for single-threaded applications.
Unfortunately multi-threaded network servers, such as iSCSI, SMB or
NFS usually have plenty of threads and may often reorder requests,
preventing successful speculation and prefetch.

This change allows speculative prefetcher to detect streams even if
requests are reordered by introducing a list of 9 non-contiguous
ranges up to 16MB ahead of current stream position and filling the
gaps as more requests arrive.  It also allows stream to proceed
even with holes up to a certain configurable threshold (25%).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #16022
2024-04-19 10:13:38 -07:00
Bi11 a4978d2605 zdb: Fix false leak report for BRT objects
Fix a misreport in 'zdb -d' where it falsely marked
BRT objects as leaked.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Yuxin Wang <yuxinwang9999@gmail.com>
Closes #15882
2024-02-12 17:03:17 -08:00
Dex Wood a6f6c881ff Add Ntfy notification support to ZED
This commit adds the zed_notify_ntfy() function and hooks it
into zed_notify(). This will allow ZED to send notifications
to ntfy.sh or a self-hosted Ntfy service, which can be received
on a desktop or mobile device. It is configured with ZED_NTFY_TOPIC,
ZED_NTFY_URL, and ZED_NTFY_ACCESS_TOKEN variables in zed.rc.

Reviewed-by: @classabbyamp
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Dex Wood <slash2314@gmail.com>
Closes #15584
2024-02-12 14:32:11 -08:00
Rob N 36116b4612 zfs list: add '-t fs' and '-t vol' options (#15883)
Because "filesystem" and "volume" are just too long!

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #15864
(cherry picked from commit a5a725440b)
2024-02-12 14:04:27 -08:00
Cameron Harr 40e20d808c Add 'zpool status -e' flag to see unhealthy vdevs
When very large pools are present, it can be laborious to find
reasons for why a pool is degraded and/or where an unhealthy vdev
is. This option filters out vdevs that are ONLINE and with no errors
to make it easier to see where the issues are. Root and parents of
unhealthy vdevs will always be printed.

Testing:
ZFS errors and drive failures for multiple vdevs were simulated with
zinject.

Sample vdev listings with '-e' option
- All vdevs healthy
    NAME        STATE     READ WRITE CKSUM
    iron5       ONLINE       0     0     0

- ZFS errors
    NAME        STATE     READ WRITE CKSUM
    iron5       ONLINE       0     0     0
      raidz2-5  ONLINE       1     0     0
        L23     ONLINE       1     0     0
        L24     ONLINE       1     0     0
        L37     ONLINE       1     0     0

- Vdev faulted
    NAME        STATE     READ WRITE CKSUM
    iron5       DEGRADED     0     0     0
      raidz2-6  DEGRADED     0     0     0
        L67     FAULTED      0     0     0  too many errors

- Vdev faults and data errors
    NAME        STATE     READ WRITE CKSUM
    iron5       DEGRADED     0     0     0
      raidz2-1  DEGRADED     0     0     0
        L2      FAULTED      0     0     0  too many errors
      raidz2-5  ONLINE       1     0     0
        L23     ONLINE       1     0     0
        L24     ONLINE       1     0     0
        L37     ONLINE       1     0     0
      raidz2-6  DEGRADED     0     0     0
        L67     FAULTED      0     0     0  too many errors

- Vdev missing
    NAME        STATE     READ WRITE CKSUM
    iron5       DEGRADED     0     0     0
      raidz2-6  DEGRADED     0     0     0
        L67     UNAVAIL      3     1     0

- Slow devices when -s provided with -e
    NAME        STATE     READ WRITE CKSUM  SLOW
    iron5       DEGRADED     0     0     0     -
      raidz2-5  DEGRADED     0     0     0     -
        L10     FAULTED      0     0     0     0  external device fault
        L51     ONLINE       0     0     0    14

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Cameron Harr <harr1@llnl.gov>
Closes #15769
2024-02-08 15:22:16 -08:00
Mauricio Faria de Oliveira 9bb8d26bd5 zed: fix typo in variable ZED_POWER_OFF_ENCLO*US*RE_SLOT_ON_FAULT
Replace ENCLO_US_RE with ENCLO_SU_RE in the name of the variable.

Note this changes the user-visible string in zed.rc, thus might
break current users with the wrong string, but it's ~2 months
since zfs-2.2.0 tag is out, thus should not be widespread yet.

Mechanical change:

    $ grep -rl ZED_POWER_OFF_ENCLOUSRE_SLOT_ON_FAULT
    cmd/zed/zed.d/zed.rc
    cmd/zed/zed.d/statechange-slot_off.sh

    $ sed -i 's/ZED_POWER_OFF_ENCLOUSRE_SLOT_ON_FAULT/<linebreak>
                ZED_POWER_OFF_ENCLOSURE_SLOT_ON_FAULT/g' \
      cmd/zed/zed.d/zed.rc \
      cmd/zed/zed.d/statechange-slot_off.sh

    $ grep -rl ZED_POWER_OFF_ENCLOUSRE_SLOT_ON_FAULT
    $

Fixes 11fbcacf37
("zed: Add zedlet to power off slot when drive is faulted")

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Closes #15651
2024-02-08 15:22:16 -08:00
Mark Johnston 9ef15845f5 Fix the FreeBSD userspace build (#15716)
- Mark some parameters to zpool_power*() as unused.
- Add a stub zpool_disk_wait().

Fixes: a9520e6e5 ("zpool: Add slot power control, print power status")

Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2024-01-30 13:33:36 -08:00
Tony Hutter 69142125d7 zpool: Add slot power control, print power status
Add `zpool` flags to control the slot power to drives.  This assumes
your SAS or NVMe enclosure supports slot power control via sysfs.

The new `--power` flag is added to `zpool offline|online|clear`:

    zpool offline --power <pool> <device>    Turn off device slot power
    zpool online --power <pool> <device>     Turn on device slot power
    zpool clear --power <pool> [device]      Turn on device slot power

If the ZPOOL_AUTO_POWER_ON_SLOT env var is set, then the '--power'
option is automatically implied for `zpool online` and `zpool clear`
and does not need to be passed.

zpool status also gets a --power option to print the slot power status.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mart Frauenlob <AllKind@fastest.cc>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #15662
2024-01-29 15:12:06 -08:00
Tony Hutter 59112ca27d zed: misc vdev_enc_sysfs_path fixes
There have been rare cases where the VDEV_ENC_SYSFS_PATH value that zed
gets passed is stale.  To mitigate this, dynamically check the sysfs
path at the time of zed event processing, and use the dynamic value if
possible.  Note that there will be other times when we can not
dynamically detect the sysfs path (like if a disk disappears) and have
to rely on the old value for things like turning on the fault LED.  That
is to say, we can't just blindly use the dynamic path in every case.

Also:
	- Add enclosure sysfs entry when running 'zpool add'
	- Fix 'slot' and 'enc' zpool.d scripts for nvme

Reviewed-by: Don Brady <dev.fs.zfs@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #15462
2024-01-29 15:11:56 -08:00
Rob N 0606ce2055 zpool wait: print timestamp before the header
list, status and iostat all display the -T timestamp before the header,
but wait showed it after. Make it be like the others.

Reported-by: Kyle Evans <kevans@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #15825
2024-01-29 14:53:29 -08:00
Stefan Lendl 4db88c37cc fix(mount): do not truncate shares not zfs mount
When running zfs share -a resetting the exports.d/zfs.exports makes
sense the get a clean state.
Truncating was also called with zfs mount which would not populate the
file again.
Add test to verify shares persist after mount -a.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Stefan Lendl <s.lendl@proxmox.com>
Closes #15607 
Closes #15660
2024-01-29 14:53:29 -08:00
Rich Ercolani 7bccf98a73 Make zdb -R scale less poorly
zdb -R with :d tries to use gzip decompression 9 times per size.
There's absolutely no reason for that, they're all the same
decompressor.

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #15726
2024-01-29 14:53:29 -08:00
Rich Ercolani 4d4972ed98 Stop wasting time on malloc in snprintf_zstd_header
Profiling zdb -vvvvv on datasets with a lot of zstd blocks, we find
ourselves spending quite a lot of time on malloc/free, because we
allocate a 16M abd each call, and never free it, so we're leaking
16M per call as well.

This seems sub-optimal. So let's just keep the buffer around and
reuse it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #15721
2024-01-29 14:53:29 -08:00
Alexander Motin dea2d3c6cd zdb: Dump encrypted write and clone ZIL records
Block pointers are not encrypted in TX_WRITE and TX_CLONE_RANGE
records, so we can dump them, that may be useful for debugging.

Related to #15543.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #15629
2024-01-08 16:11:39 -08:00
Martin Matuška eb34de04d7 zdb: fix printf() length for uint64_t devid
Bug introduced in 213d682967.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Warner Losh <imp@FreeBSD.org>
Signed-off-by: Martin Matuska <mm@FreeBSD.org>
Closes #15606
2023-11-29 13:08:25 -08:00
Akash B e4985bf5a1 zdb: Fix zdb '-O|-r' options with -e/exported zpool
zdb with '-e' or exported zpool doesn't work along with
'-O' and '-r' options as we process them before '-e' has
been processed.

Below errors are seen:

~> zdb -e pool-mds65/mdt65 -O oi.9/0x200000009:0x0:0x0
failed to hold dataset 'pool-mds65/mdt65': No such file or directory

~> zdb -e pool-oss0/ost0 -r file1 /tmp/filecopy1 -p.
failed to hold dataset 'pool-oss0/ost0': No such file or directory
zdb: internal error: No such file or directory

We need to make sure to process '-O|-r' options after the
'-e' option has been processed, which imports the pool to
the namespace if it's not in the cachefile.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Akash B <akash-b@hpe.com>
Closes #15532
2023-11-28 12:56:43 -08:00
Rob Norris e96675a7b1 zdb: show BRT statistics and dump its contents
Same idea as the dedup stats, but for block cloning.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Kay Pedersen <mail@mkwg.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #15541
2023-11-28 12:56:43 -08:00
Umer Saleem f1659cc782 ZTS: Test for all known zpool feature sets
zpool_create_features_007_pos only tested for compat-2020 feature
set. It would be useful to test for all known features sets. If
any additional feature is found enabled that is not present in
compatibility list or feature set, it should be caught and
reported earlier.

This commit also removes encryption from openzfsonosx-1.8.1
compatibility list. Encryption enables bookmark_v2, since it is
a dependency of encryption, but not listed in openzfsonoxx-1.8.1
compatibility list.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #15505
2023-11-16 14:23:03 -08:00
Umer Saleem f0ffcc3adc Remove obsolete_counts from grub2 compatibility list
PR#15459 add all read-only compatible zpool features to grub2
compatibility list. 'obsolete_counts' is a read-only features that
depends on 'device_removal' feature which is not read-only and
is marked as ZFEATURE_FLAG_MOS. Creating a pool with grub2
compatibility enables 'device_removal' feature as well, which is
not desired.

This commit removes the 'obsolete_counts' feature from
grub2 compatibility list, as GRUB only supports read-only
compatible features.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #15499
2023-11-16 14:23:03 -08:00
Umer Saleem 1c7048357d Add all read-only compatible zpool features to grub2 compatibility
GRUB opens the boot pool in read-only mode. All read-only
compatible features for zpool can be enabled and added to
grub2 compatibility, as GRUB does not open the boot-pool
for write.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #15459
2023-11-16 14:23:03 -08:00
ofthesun9 86c3ed40e1 "ARC prefetch metadata accesses:" appears twice in the output.
The first occurrence should be "ARC prefetch data accesses:"

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: ofthesun9 <olivier@ofthesun.net>
Closes #15427
2023-11-06 16:47:14 -08:00
Don Brady 0bcd1151f0 Fix ZED auto-replace for VDEVs using by-id paths
The change is simple -- restore the original code so that the VDEV 
path is updated when using by-id paths.  The more challenging part 
was to devise a second ZTS test, that would test auto-replace for 
'by-id' and help prevent a future regression.

With that new test, we can now do an A|B test with , and without, 
the fix to confirm that auto-replace for by-id paths works. The 
existing auto-replace test, functional/fault/auto_replace_001_pos, 
will confirm that we didn't break auto-replace for 'by-vdev' paths.

In the original functional/fault/auto_replace_001_pos test, the disk 
wipe (using dd) was not effective in removing the partitioning since 
the kernel was never informed of the wipe.

Added a call to wipefs(8) so that the kernel is informed and ZED will 
re-partition the device.
    
Added a validation step that the re-partitioning occurred by
confirming  that the GPT partition UUID changes.

Sponsored-By: OpenDrives Inc.
Sponsored-By: Klara Inc.
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Don Brady <don.brady@klarasystems.com>
Closes #15363
2023-11-06 16:45:14 -08:00
Tony Hutter 78fd79eacd Add zfs_prepare_disk script for disk firmware install
Have libzfs call a special `zfs_prepare_disk` script before a disk is
included into the pool.  The user can edit this script to add things
like a disk firmware update or a disk health check.  Use of the script
is totally optional. See the zfs_prepare_disk manpage for full details.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #15243
2023-11-06 16:45:07 -08:00
Rob Norris 33d7c2d165 import: require force when cachefile hostid doesn't match on-disk
Previously, if a cachefile is passed to zpool import, the cached config
is mostly offered as-is to ZFS_IOC_POOL_TRYIMPORT->spa_tryimport(), and
the results are taken as the canonical pool config and handed back to
ZFS_IOC_POOL_IMPORT.

In the course of its operation, spa_load() will inspect the pool and
build a new config from what it finds on disk. However, it then
regenerates a new config ready to import, and so rightly sets the hostid
and hostname for the local host in the config it returns.

Because of this, the "require force" checks always decide the pool is
exported and last touched by the local host, even if this is not true,
which is possible in a HA environment when MMP is not enabled. The pool
may be imported on another head, but the import checks still pass here,
so the pool ends up imported on both.

(This doesn't happen when a cachefile isn't used, because the pool
config is discovered in userspace in zpool_find_import(), and that does
find the on-disk hostid and hostname correctly).

Since the systemd zfs-import-cache.service unit uses cachefile imports,
this can lead to a system returning after a crash with a "valid"
cachefile on disk and automatically, quietly, importing a pool that has
already been taken up by a secondary head.

This commit causes the on-disk hostid and hostname to be included in the
ZPOOL_CONFIG_LOAD_INFO item in the returned config, and then changes the
"force" checks for zpool import to use them if present.

This method should give no change in behaviour for old userspace on new
kernels (they won't know to look for the new config items) and for new
userspace on old kernels (the won't find the new config items).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Closes #15290
2023-10-07 09:08:20 -07:00
Umer Saleem 8015e2ea66 Add '-u' - nomount flag for zfs set
This commit adds '-u' flag for zfs set operation. With this flag,
mountpoint, sharenfs and sharesmb properties can be updated
without actually mounting or sharing the dataset.

Previously, if dataset was unmounted, and mountpoint property was
updated, dataset was not mounted after the update. This behavior
is changed in #15240. We mount the dataset whenever mountpoint
property is updated, regardless if it's mounted or not.

To provide the user with option to keep the dataset unmounted and
still update the mountpoint without mounting the dataset, '-u'
flag can be used.

If any of mountpoint, sharenfs or sharesmb properties are updated
with '-u' flag, the property is set to desired value but the
operation to (re/un)mount and/or (re/un)share the dataset is not
performed and dataset remains as it was before.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #15322
2023-10-03 15:41:46 -07:00
Umer Saleem c53bc3837c Improve the handling of sharesmb,sharenfs properties
For sharesmb and sharenfs properties, the status of setting the
property is tied with whether we succeed to share the dataset or
not. In case sharing the dataset is not successful, this is
treated as overall failure of setting the property. In this case,
if we check the property after the failure, it is set to on.

This commit updates this behavior and the status of setting the
share properties is not returned as failure, when we fail to
share the dataset.

For sharenfs property, if access list is provided, the syntax
errors in access list/host adresses are not validated until after
setting the property during postfix phase while trying to
share the dataset. This is not correct, since the property has
already been set when we reach there.

Syntax errors in access list/host addresses are validated while
validating the property list, before setting the property and
failure is returned to user in this case when there are errors
in access list.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #15240
2023-10-03 15:41:46 -07:00
Umer Saleem e9dc31c74e Update the behavior of mountpoint property
There are some inconsistencies in the handling of mountpoint
property. This commit updates the behavior and makes it
consistent.

If mountpoint property is set when dataset is unmounted, this
would update the mountpoint property. The mountpoint could be
valid or invalid in this case. Setting the mountpoint property
would result in success in this case. Dataset would still be
unmounted here.

On the other hand, if dataset is mounted and mountpoint
property is updated to something invalid where mount cannot be
successful, for example, setting the mountpoint inside a readonly
directory. This would unmount the dataset, set the mountpoint
property to requested value and tries to mount the dataset. The
mount operation returns error and this error is treated as
overall failure of setting the property while the property is
actually set.

To make the behavior consistent in case dataset is mounted or
unmounted, we should try to mount the dataset whenever mountpoint
property is updated. This would result in mounting the datasets
if canmount property is set to on, regardless if the dataset was
previously unmounted.

The failure in mount operation while setting the mountpoint
property should not be treated as failure, since the property is
actually set now to user requested value.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #15240
2023-10-03 15:41:46 -07:00
George Amanakis 608741d062 Report ashift of L2ARC devices in zdb
Commit 8af1104f does not actually store the ashift of cache devices in
their label. However, in order to facilitate reporting the ashift
through zdb, we enable this in the present commit. We also document
how the retrieval of the ashift is done.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #15331
2023-10-03 09:06:07 -07:00
Paul Dagnelie 0ce1b2ca19 Invoke zdb by guid to avoid import errors
The problem that was occurring is basically that a device was removed 
by ztest and replaced with another device. It was then reguided. The 
import then failed because there were two possible imports with the 
same name; one with the new guid, and one with the old. This can 
happen because the label writes from the device removal/replacement 
can be subject to ztest's error injection. 

The other ways to fix this would be to change the error injection to 
not trigger on removals (which may not be technically feasible), or 
to change the import code to not report configurations that are so 
short on devices (which would potentially have unpleasant end-user 
effects when trying to recover from data losses/device configuration 
issues).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #15298
2023-09-28 14:28:21 -07:00
Alexander Motin 0aabd6b482 ZIL: Avoid dbuf_read() in ztest_get_data()
While working on similar patches for zfs and zvol in #15153 I've
forgot about ztest.  Update it also so that we test the same code
paths as use in production.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #15301
2023-09-28 14:28:21 -07:00
Rob N 3af63683fe cmd: add 'help' subcommand to zpool and zfs
'program help subcommand' is a reasonably common pattern for
multifunction command-line programs. This commit adds support for that
style to the zpool and zfs commands.

When run as 'zpool help [<topic>]' or 'zfs help [<topic>]', executes the
'man' program on the PATH with the most likely manpage name for the
requested topic: "zpool-<topic>" or "zfs-<topic>" for subcommands, or
"zpool<topic>" or "zfs<topic>" for the "concepts" and "props" topics.
If no topic is supplied, uses the top "zpool" or "zfs" pages.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Kay Pedersen <mail@mkwg.de>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #15288
2023-09-22 16:13:20 -07:00
Paul Dagnelie 9aa1a2878e Fix incorrect expected error in ztest
There is an occasional ztest failure that looks like ztest: attach 
(/var/tmp/zloop-run/ztest.13a 570425344, draid1-1-0 532152320, 1) 
returned 22, expected 95. This is because the value that we return 
is EINVAL, but expected_error is set incorrectly.

Change the expected_error value to match both the comment and the 
actual error value.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #15295
2023-09-22 16:13:20 -07:00
Tony Hutter 54c6fbd378 zed: Allow autoreplace and fault LEDs for removed vdevs
Allow zed to autoreplace vdevs marked as REMOVED.  Also update
statechange-led zedlet to toggle fault LEDs for REMOVED vdevs.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #15281
2023-09-19 08:52:06 -07:00
Umer Saleem 32949f2560 Relax error reporting in zpool import and zpool split
For zpool import and zpool split, zpool_enable_datasets is called
to mount and share all datasets in a pool. If there is an error
while mounting or sharing any dataset in the pool, the status of
import or split is reported as failure. However, the changes do
show up in zpool list.

This commit updates the error reporting in zpool import and zpool
split path. More descriptive messages are shown to user in case
there is an error during mount or share. Errors in mount or share
do not effect the overall status of zpool import and zpool split.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #15216
2023-09-02 10:30:38 -07:00
Serapheim Dimitropoulos 63159e5bda checkstyle: fix action failures
Reviewed-by: Don Brady <dev.fs.zfs@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #15220
2023-09-01 09:33:33 -07:00
Brian Behlendorf e99e684b33 zed: update zed.d/statechange-slot_off.sh
The statechange-slot_off.sh zedlet which was added in #15200
needed to be installed so it's included by the packages.

Additional testing has also shown that multiple retries are
often needed for the script to operate reliably.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #15210
2023-08-27 08:25:42 -07:00
Tony Hutter d19304ffee zed: Add zedlet to power off slot when drive is faulted
If ZED_POWER_OFF_ENCLOUSRE_SLOT_ON_FAULT is enabled in zed.rc, then
power off the drive's slot in the enclosure if it becomes FAULTED.
This can help silence misbehaving drives.  This assumes your drive
enclosure fully supports slot power control via sysfs.

Reviewed-by: @AllKind
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #15200
2023-08-25 13:33:40 -07:00
Alexander Motin df8c9f351d ZIL: Second attempt to reduce scope of zl_issuer_lock.
The previous patch #14841 appeared to have significant flaw, causing
deadlocks if zl_get_data callback got blocked waiting for TXG sync.  I
already handled some of such cases in the original patch, but issue
 #14982 shown cases that were impossible to solve in that design.

This patch fixes the problem by postponing log blocks allocation till
the very end, just before the zios issue, leaving nothing blocking after
that point to cause deadlocks.  Before that point though any sleeps are
now allowed, not causing sync thread blockage.  This require slightly
more complicated lwb state machine to allocate blocks and issue zios
in proper order.  But with removal of special early issue workarounds
the new code is much cleaner now, and should even be more efficient.

Since this patch uses null zios between write, I've found that null
zios do not wait for logical children ready status in zio_ready(),
that makes parent write to proceed prematurely, producing incorrect
log blocks.  Added ZIO_CHILD_LOGICAL_BIT to zio_wait_for_children()
fixes it.

Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: George Wilson <george.wilson@delphix.com>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #15122
2023-08-25 11:58:44 -07:00
Rob N 12f2b1f65e zdb: include cloned blocks in block statistics
This gives `zdb -b` support for clone blocks.

Previously, it didn't know what clones were, so would count their space
allocation multiple times and then report leaked space (or, in debug,
would assert trying to claim blocks a second time).

This commit fixes those bugs, and reports the number of clones and the
space "used" (saved) by them.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Kay Pedersen <mail@mkwg.de>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Sponsored-By: OpenDrives Inc.
Sponsored-By: Klara Inc.
Closes #15123
2023-08-02 08:52:40 -07:00
Brian Behlendorf b9aa32ff39 zed: Reduce log noise for large JBODs
For large JBODs the log message "zfs_iter_vdev: no match" can
account for the bulk of the log messages (over 70%).  Since this
message is purely informational and not that useful we remove it.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #15086
Closes #15094
2023-07-26 08:46:58 -07:00
Tony Hutter 8c81c0b05d zed: Fix zed ASSERT on slot power cycle
We would see zed assert on one of our systems if we powered off a
slot.  Further examination showed zfs_retire_recv() was reporting
a GUID of 0, which in turn would return a NULL nvlist.  Add
in a check for a zero GUID.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #15084
2023-07-21 16:35:12 -07:00
Alexander Motin fa7b2390d4
Do not report bytes skipped by scan as issued.
Scan process may skip blocks based on their birth time, DVA, etc.
Traditionally those blocks were accounted as issued, that caused
reporting of hugely over-inflated numbers, having nothing to do
with actual disk I/O.  This change utilizes never used field in
struct dsl_scan_phys to account such skipped bytes, allowing to
report how much data were actually scrubbed/resilvered and what
is the actual I/O speed.  While formally it is an on-disk format
change, it should be compatible both ways, so should not need a
feature flag.

This should partially address the same issue as c85ac731a0, but
from a different perspective, complementing it.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
Signed-off-by:  Alexander Motin <mav@FreeBSD.org>
Sponsored by:   iXsystems, Inc.
Closes #15007
2023-06-30 08:47:13 -07:00
Arshad Hussain 6052060c13
Don't use hard-coded 'size' value in snprintf()
This patch changes the passing of "size" to snprintf
from hard-coded (openended) to sizeof(errbuf). This
is bringing to standard with rest of the code where-
ever 'errbuf' is used.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Closes #15003
2023-06-30 08:37:26 -07:00
Mateusz Piotrowski 62ace21a14
zdb: Add missing poolname to -C synopsis
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Mateusz Piotrowski <0mp@FreeBSD.org>
Sponsored-by: Klara Inc.
Closes #15014
2023-06-29 10:54:43 -07:00
Alexander Motin 70ea484e3e
Finally drop long disabled vdev cache.
It was a vdev level read cache, designed to aggregate many small
reads by speculatively issuing bigger reads instead and caching
the result.  But since it has almost no idea about what is going
on with exception of ZIO_FLAG_DONT_CACHE flag set by higher layers,
it was found to make more harm than good, for which reason it was
disabled for the past 12 years.  These days we have much better
instruments to enlarge the I/Os, such as speculative and prescient
prefetches, I/O scheduler, I/O aggregation etc.

Besides just the dead code removal this removes one extra mutex
lock/unlock per write inside vdev_cache_write(), not otherwise
disabled and trying to do some work.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #14953
2023-06-09 12:40:55 -07:00
Alexander Motin 90ccfd426d
Improve l2arc reporting in arc_summary.
- Do not report L2ARC as FAULTED in presence of in-flight writes.
- Report read and write I/Os, bytes and errors.
- Remove few numbers not important to average user.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #12304 
Closes #14946
2023-06-09 10:14:05 -07:00
Alexander Motin b3ad3f48d9
Use list_remove_head() where possible.
... instead of list_head() + list_remove().  On FreeBSD the list
functions are not inlined, so in addition to more compact code
this also saves another function call.

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #14955
2023-06-09 10:12:52 -07:00
Rob Norris 8653f1de48 zdb: add -B option to generate backup stream
This is more-or-less like `zfs send`, but specifying the snapshot by its
objset id for situations where it can't be referenced any other way.

Sponsored-By: Klara, Inc.
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: WHR <msl0000023508@gmail.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #14642
2023-06-05 11:54:42 -07:00
Mike Swanson 365bae0eab
Add compatibility symlinks for FreeBSD 12.{3,4} and 13.{0,1,2}
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mike Swanson <mikeonthecomputer@gmail.com>
Closes #14902
2023-05-26 15:37:15 -07:00
Colm d3e0138a3d
Adding new read-only compatible zpool features to compatibility.d/grub2
GRUB2 is compatible with all "read-only compatible" features,
so it is safe to add new features of this type to the grub2
compatibility list. We generally want to include all compatible
features, to minimize the differences between grub2-compatible
pools and no-compatibility pools.

Adding new properties `livelist` and `zpool_checkpoint` accordingly.

Also adding them to the man page which references this file as an
example, for consistency.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Colm Buckley <colm@tuatha.org>
Closes #14893
2023-05-26 10:04:19 -07:00