1. The theory (zpool/zfs/du/ls – different tools for different results)
(main source of information -> here)
zpool: shows the total bytes of storage available in the pool (physical disk capacity).
How many bytes are in use on the storage device? How many unallocated bytes are there?
Use case: you want to upgrade your storage to get more room.
zfs: shows the total bytes of storage available to the filesystem, i.e. disk space minus the ZFS redundancy and metadata overhead (the usable space available).
If I have to ship this filesystem to another box (uncompressed and without deduplication) how many bytes is that?
Use case: you need to know whether accounting or engineering is using more space.
du: shows the total bytes of storage space used by a directory, after compression and deduplication are taken into account.
How many bytes are used to store the contents of the files in this directory?
Use case: you look at a sparse or compressed file and want to know how many bytes are actually allocated for it.
ls -l: shows the length of a file in bytes (its addressable size), regardless of compression, dedupe, thin-provisioning, sparseness, etc.
How many bytes are addressable in this file?
Use case: you plan to email someone a file and want to know if it will fit within the 10MB quota.
df: use the zpool / zfs commands instead to identify available pool space and available file system space. "df" doesn't understand descendant filesystems or whether snapshots exist, and it is not deduplication-aware.
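To see the difference on a live system, here is a minimal sketch (the pool name tank is just an example; mkfile -n creates a sparse file on Solaris, and the exact figures will differ on your machine):

# zpool list tank                     # raw pool capacity: SIZE / ALLOC / FREE
# zfs list tank                       # usable space: USED / AVAIL, redundancy overhead excluded
# mkfile -n 100m /tank/sparsefile     # a 100MB sparse file
# ls -l /tank/sparsefile              # ~104857600 bytes: the addressable length of the file
# du -h /tank/sparsefile              # only the handful of blocks actually allocated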
2. Let’s see some practical situations
a) Mirror, 2 disks x 500m
# mkfile 500m /dev/dsk/diskA
# mkfile 500m /dev/dsk/diskB
# zpool create datapool_mirror mirror diskA diskB
# zpool list datapool_mirror
NAME              SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
datapool_mirror   492M    91K   492M   0%  1.00x  ONLINE  -
# zfs list datapool_mirror
NAME              USED  AVAIL  REFER  MOUNTPOINT
datapool_mirror    91K   460M    31K  /datapool_mirror
Mirroring 2+ devices means the data is replicated in an identical fashion across all components of a mirror. A mirror with N disks of size X can hold X bytes and can withstand (N-1) devices failing before data integrity is compromised. (zpool(1M))
Since the disks are mirrored we have 492M of disk space, and 460M is the amount of data you can actually store on them.
But why does "zpool list" show 492M free while "zfs list" shows only 460M available?
Short answer: internal accounting (and, for raidz pools, differences in the raidz configuration).
Long answer: The physical space can be different from the total amount of space that any contained datasets can actually use. The amount of space used in a raidz configuration depends on the characteristics of the data being written. In addition, ZFS reserves some space for internal accounting that the zfs command takes into account, but the zpool command does not. For non-full pools of a reasonable size, these effects should be invisible. For small pools, or pools that are close to being completely full, these discrepancies may become more noticeable.
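If you prefer properties to the list output, the same numbers can be read directly from the pool and the dataset (shown here for the mirror pool created above; the values should match the listings):

# zpool get size,allocated,free datapool_mirror    # raw pool capacity, as in "zpool list"
# zfs get used,available datapool_mirror           # usable space, as in "zfs list"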
b) Raid-Z, 4 disks x 500m
# mkfile 500m /dev/dsk/disk1
# mkfile 500m /dev/dsk/disk2
# mkfile 500m /dev/dsk/disk3
# mkfile 500m /dev/dsk/disk4
# zpool create datapool raidz disk1 disk2 disk3 disk4
Check our Diskpools article for further explanations on what Raid-Z means.
This is a single-parity configuration (one disk can fail), also called 3+1 (3 data disks + 1 parity disk).
Now let’s take a look at the space usage.
# zpool list datapool
NAME       SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
datapool  1.92G   163K  1.92G   0%  1.00x  ONLINE  -
The 1.92G SIZE is the disk space you have.
# zfs list datapool
NAME      USED  AVAIL  REFER  MOUNTPOINT
datapool  122K  1.41G  43.4K  /datapool
The 1.41G AVAIL is how much you can store.
Basically zpool list shows you how much disk space you have and zfs list (or df) shows you how much you can store.
A raidz group with N disks of size X with P parity disks can hold approximately (N-P)*X bytes and can withstand P device(s) failing before data integrity is compromised. (zpool(1M))
In this case: N=4, X=500M, P=1 => (4-1)*500M = 1.5G; factoring in the filesystem overhead we arrive at the 1.41G reported above.
c) Raid-Z2, 8 disks x 502.11G
# zpool list tank
NAME   SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
tank  3.62T  1.85T  1.78T  50%  1.00x  ONLINE  -
N=8, X=502.11G, P=2
(N-P)*X ≈ 3T of usable space; the 3.62T reported by zpool list is about 620G (~17%) more, because for raidz pools zpool list counts the raw size including the parity space.
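To double-check the usable figure from the shell (plain arithmetic, nothing ZFS-specific):

# echo "scale=2; (8-2)*502.11/1024" | bc    # (N-P)*X, converted from G to T
2.94

which is the ~3T used above.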
3. The REFER field
(main source of information -> here)
REFER identifies the amount of data accessible by the dataset (which could be shared with other datasets in the pool). A snapshot (and, initially, a clone) will refer to the same data as its source.
USED identifies the amount of space consumed by the dataset and its descendants. It takes into account the reservation of any descendant datasets.
# zfs list test1
NAME    USED  AVAIL  REFER  MOUNTPOINT
test1  82.3G   329G  61.4G  /test1
# zfs list -t snapshot -r test1
NAME              USED  AVAIL  REFER  MOUNTPOINT
test1@snapshot1  20.9G      -  74.5G  -
test1@snapshot2  25.7M      -  61.3G  -
Here we can see that the dataset test1 refers to 61.4G. It also has 2 snapshots, using 20.9G and 25.7M respectively. Total: 82.3G.
Deleting snapshot1 would free the 20.9G that only it references (some previously shared space may then show up as unique to snapshot2).
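Another convenient view of the same accounting is zfs list -o space (also listed in the notes at the end of this page), which splits USED into the parts consumed by snapshots, by the dataset itself, by refreservations and by children. Only the column layout is shown here, since the exact figures depend on the pool:

# zfs list -o space test1
NAME  AVAIL  USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD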
A good way of finding out how much space the snapshots use is this script:
$ ./snapspace.sh mainpool/storage
SNAPSHOT OLDREFS UNIQUE UNIQUE%
zfs-auto-snap_monthly-2011-11-14-18h59 34.67G 11.0G 31%
zfs-auto-snap_monthly-2011-12-14-18h59 33.41G 8.95G 26%
zfs-auto-snap_monthly-2012-02-09-16h05 123.75G 70.3G 56%
OLDREFS is how much of the referenced space of the snapshot is not referenced by the current filesystem. UNIQUE is the amount that is referenced by that snapshot and nothing else (the used property). Finally, I thought it would be nice to have the unique amount as a percentage of OLDREFS, hence UNIQUE%.
The script:
#!/bin/bash
# For each snapshot of the given filesystem, print:
#   OLDREFS - referenced space of the snapshot that the current filesystem no longer references
#   UNIQUE  - space referenced by that snapshot and nothing else (the "used" property)
#   UNIQUE% - UNIQUE as a percentage of OLDREFS

if (($# < 1))
then
    echo "usage: $0 <filesystem>"
    exit 1
fi

if [[ $1 == *@* || $1 == /* ]]
then
    echo "Snapshots and paths are not supported"
    echo "usage: $0 <filesystem>"
    exit 1
fi

echo -e "SNAPSHOT OLDREFS \tUNIQUE\tUNIQUE%"

# Space currently referenced by the live filesystem, in raw bytes (-p)
fullref=`zfs get -Hp referenced "$1" | awk '{print $3}'`

for snap in `zfs list -Hd 1 -t snapshot -o name "$1" | cut -f2- -d@`
do
    # referenced = data visible in the snapshot; written@snap = data written since the snapshot was taken
    snapref=`zfs get -Hp referenced "$1"@"$snap" | awk '{print $3}'`
    snapwritten=`zfs get -Hp written@"$snap" "$1" | awk '{print $3}'`
    # Snapshot data that is no longer referenced by the current filesystem
    olddata=$((snapref + snapwritten - fullref))
    snapused=`zfs get -H used "$1"@"$snap" | awk '{print $3}'`
    snapusedraw=`zfs get -Hp used "$1"@"$snap" | awk '{print $3}'`

    # Scale olddata to a human-readable suffix (K/M/G/T/P)
    suffix=""
    divisor="1"
    testnum=$olddata
    if ((testnum > 1024))
    then
        suffix="K"
        divisor="1024"
        testnum=`echo "$olddata/$divisor" | bc`
    fi
    if ((testnum > 1024))
    then
        suffix="M"
        divisor="(1024*1024)"
        testnum=`echo "$olddata/$divisor" | bc`
    fi
    if ((testnum > 1024))
    then
        suffix="G"
        divisor="(1024*1024*1024)"
        testnum=`echo "$olddata/$divisor" | bc`
    fi
    if ((testnum > 1024))
    then
        suffix="T"
        divisor="(1024*1024*1024*1024)"
        testnum=`echo "$olddata/$divisor" | bc`
    fi
    if ((testnum > 1024))
    then
        suffix="P"
        divisor="(1024*1024*1024*1024*1024)"
    fi
    displaydata=`echo "scale = 2; $olddata/$divisor" | bc -l`

    if ((olddata > 0))
    then
        displaypercent=`echo "100*$snapusedraw/$olddata" | bc`
    else
        displaypercent=0
    fi

    # Pad the snapshot name so the columns line up
    chars=`echo "$snap" | wc -m | awk '{print $1}'`
    spacing=""
    while ((++chars < 44))
    do
        spacing="$spacing "
    done
    echo -e "$snap $spacing $displaydata$suffix \t$snapused\t$displaypercent%"
done
4. Freeing space
Because of ZFS snapshots, removing a file from a full file system will sometimes not free the expected space. To actually free it, you need to remove all the snapshots that reference the file.
From http://docs.oracle.com/cd/E23823_01/html/819-5461/gbciq.html:
When a snapshot is created, its disk space is initially shared between the snapshot and the file system, and possibly with previous snapshots. As the file system changes, disk space that was previously shared becomes unique to the snapshot, and thus is counted in the snapshot’s used property. Additionally, deleting snapshots can increase the amount of disk space unique to (and thus used by) other snapshots.
Note: As a result, deleting a file can actually consume more disk space, because the file is kept in the snapshot and a new version of the directory needs to be created.
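In practice, the workflow looks roughly like this (dataset and snapshot names are just examples):

# zfs list -t snapshot -r tank/data              # find the snapshots that may still reference the file
# zfs destroy tank/data@zfs-auto-snap_2012-01    # destroy them one by one...
# zfs list -o space tank/data                    # ...and watch USEDSNAP shrink and AVAIL grow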
In the making
- zfs list -o space
- setting reservation and refreservation (see the sketch below)
- http://docs.oracle.com/cd/E19253-01/819-5461/gbdbb/index.html
- snapshots: how much they occupy http://lildude.co.uk/zfs-cheatsheet
- quota: how a dataset with a quota has AVAIL < total AVAIL, because it can't grow to fill the pool while others can
- http://docs.oracle.com/cd/E19082-01/817-2271/gazud/index.html
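For the reservation/refreservation item above, a minimal sketch of the commands involved (the dataset name tank/projects is just an example):

# zfs set reservation=10G tank/projects       # space guaranteed to the dataset and all of its descendants
# zfs set refreservation=10G tank/projects    # space guaranteed to the dataset's own data, excluding snapshots and descendants
# zfs get reservation,refreservation tank/projects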