What is RAIDZ
RAIDZ (also written RAID-Z) is a technology developed by Sun Microsystems for combining data storage devices into a single storage pool.
The technology has many features in common with regular RAID;
however, it is tightly bound to the ZFS filesystem, which is the only filesystem that can be used on RAIDZ volumes.
RAIDZ vs RAID
Although RAIDZ is broadly similar to regular RAID,
there are still significant differences. The main difference is that RAIDZ does not have a fixed block size: it can be different in each striped row.
Other differences mainly stem from the fact that the technology is used exclusively with the ZFS filesystem,
which in turn has features such as copy-on-write that completely eliminate the RAID write hole vulnerability.
In addition, ZFS verifies each metadata block with a checksum, which conventional RAID systems cannot do by design:
they simply do not know which blocks are used for metadata.
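The vulnerability that copy-on-write avoids is commonly called the write hole: an in-place update that is interrupted between the data write and the parity write leaves a stripe whose parity no longer matches its data. The following is a minimal Python sketch of that idea, not a model of the real ZFS on-disk format. The dictionary layout, the names `xor_parity` and `parity_is_consistent`, and the two-data-block stripe are all assumptions made for illustration.

```python
from functools import reduce

def xor_parity(blocks):
    """XOR equal-length byte blocks together, as single-parity RAID does."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def parity_is_consistent(stripe):
    """A stripe is consistent when its parity equals the XOR of its data blocks."""
    return xor_parity(stripe["data"]) == stripe["parity"]

# One stripe with two data blocks and one parity block (hypothetical layout).
old = {"data": [b"AAAA", b"BBBB"]}
old["parity"] = xor_parity(old["data"])

# In-place update: the first data block is overwritten, then the system
# crashes before the parity block is rewritten.
in_place = {"data": [b"XXXX", b"BBBB"], "parity": old["parity"]}

# Copy-on-write: the new stripe is written to free space. The live pointer
# keeps referencing the old stripe until the new one is complete, so a crash
# at the same moment leaves the old, still consistent stripe in use.
live = old
unfinished_new = {"data": [b"XXXX", b"BBBB"]}  # crash before parity was written

print(parity_is_consistent(in_place))  # False
print(parity_is_consistent(live))      # True
```

The in-place stripe would silently return wrong data if a disk later failed and had to be rebuilt from parity, while the copy-on-write case simply loses the unfinished update.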
Conjunction with ZFS
Unlike traditional RAID layouts, all RAIDZ layouts are designed to be used with the ZFS filesystem. On the one hand,
this narrows the scope of these arrays; on the other hand, it allows creating data storage systems that are more diverse
and more interesting in terms of performance and reliability.
- Copy-on-write. ZFS is a copy-on-write filesystem, meaning that it creates a new copy of a metadata record rather than modifying the current one.
  This behavior results in a certain disk space overhead even for a non-redundant layout like a striped ZFS pool.
- Metadata redundancy. All metadata records are stored in at least two copies regardless of the array layout used,
  which again leads to disk space overhead compared to a regular RAID formatted with a non-copy-on-write filesystem like NTFS.
- Metadata checksumming. ZFS checksums all metadata by default,
  which allows it to detect data corruption and automatically repair metadata, if possible.
  With checksumming and metadata redundancy, you get a certain level of protection
  even in non-redundant layouts like a striped ZFS pool.
- Compression. Since the ZFS filesystem produces many metadata records due to both copy-on-write and the mandatory redundancy for metadata,
  it faces a disk space overhead problem, which is partially solved by the built-in compression.
  Even so, ZFS disk arrays have considerably more disk space overhead than regular RAIDs.
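How checksumming and redundant copies work together can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `checksum` and `read_with_repair` are hypothetical, SHA-256 is used purely for convenience, and the copies are modeled as in-memory `bytearray` objects rather than on-disk blocks.

```python
import hashlib

def checksum(payload: bytes) -> str:
    """Checksum of a record (SHA-256 chosen only for this illustration)."""
    return hashlib.sha256(payload).hexdigest()

def read_with_repair(copies, expected):
    """Return the first copy matching the expected checksum and
    overwrite every mismatching copy with that good content."""
    good = next((bytes(c) for c in copies if checksum(bytes(c)) == expected), None)
    if good is None:
        raise IOError("all copies are corrupted; nothing to repair from")
    for c in copies:
        if checksum(bytes(c)) != expected:
            c[:] = good  # repair the damaged copy in place
    return good

record = b"metadata record"
expected = checksum(record)            # stored separately from the copies
copies = [bytearray(record), bytearray(record)]
copies[0][0] ^= 0xFF                   # simulate corruption of the first copy

data = read_with_repair(copies, expected)
print(data == record)                  # True
print(bytes(copies[0]) == record)      # True, the bad copy was repaired
```

If every copy fails verification, the sketch raises an error instead of returning data, which mirrors the "if possible" caveat above: corruption is always detected, but repair needs at least one intact copy.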
Blocks
- Block size. While traditional RAIDs operate with a block size set when the RAID is created,
  RAIDZ arrays use striping with an arbitrary block size, which is specified for each row of data blocks independently.
  Moreover, striping is used even for mirror pools, which are traditionally not striped.
- Block placement pattern. There are no specific data and parity block placement patterns in ZFS RAIDs: no left and right layouts,
  no synchronous or asynchronous arrays as in a typical RAID5 or RAID6.
  The ZFS driver operates at the row level, choosing a block placement pattern based on the free blocks in the system
  and the size of the data to be written in a particular row.
- Columns. While data blocks in traditional RAIDs are placed according to a specific pattern, with the number of columns
  being one of the main parameters, ZFS-based RAIDs use an arbitrary number of columns, which is set for each data row independently.
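The variable row width described in the list above can be made concrete with a small calculation. The Python sketch below is a simplified model, not the actual ZFS allocator: the function name `raidz_layout`, the 4096-byte sector size, and the decision to ignore any padding and the physical placement of sectors are all assumptions made for illustration.

```python
def raidz_layout(block_bytes, disks, parity, sector_bytes=4096):
    """Estimate how one logical block spreads over rows of a RAIDZ group.

    Simplified model: each row holds `parity` parity sectors plus up to
    `disks - parity` data sectors; padding and on-disk offsets are ignored.
    """
    data_per_row = disks - parity
    data_sectors = (block_bytes + sector_bytes - 1) // sector_bytes  # ceiling division
    rows = (data_sectors + data_per_row - 1) // data_per_row

    row_widths = []
    remaining = data_sectors
    for _ in range(rows):
        in_row = min(remaining, data_per_row)
        row_widths.append(in_row + parity)  # columns actually used by this row
        remaining -= in_row

    return {"data": data_sectors, "parity": rows * parity, "row_widths": row_widths}

# Five disks, single parity: the number of columns depends on the block size.
print(raidz_layout(8 * 1024, disks=5, parity=1))
# {'data': 2, 'parity': 1, 'row_widths': [3]}
print(raidz_layout(24 * 1024, disks=5, parity=1))
# {'data': 6, 'parity': 2, 'row_widths': [5, 3]}
```

In this model a small block occupies only three of the five columns, and a larger block ends with a narrower final row, which is the per-row variation that a fixed-geometry RAID5 or RAID6 never has.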
Nested RAIDz levels
The ZFS environment allows creating almost any combination of RAID layouts, many of which you cannot
get with traditional RAID levels. For example, you can create 10-way mirrors,
triple-parity arrays, or several RAIDZ2 groups that are then combined into a striped pool.
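To compare such combinations, a rough estimate of usable capacity and guaranteed fault tolerance can help. The Python sketch below models only a stripe over several groups. The `Group` class, the `striped_pool` function, and the equal-sized-disk assumption are hypothetical, and the estimate ignores the metadata and copy-on-write overhead discussed earlier.

```python
from dataclasses import dataclass

@dataclass
class Group:
    """One redundancy group: an n-way mirror or a RAIDZ group."""
    disks: int
    parity: int = 0        # 1, 2 or 3 for RAIDZ1, RAIDZ2, RAIDZ3
    mirror: bool = False

    def usable_disks(self):
        # A mirror stores one disk's worth of data; RAIDZ loses `parity` disks.
        return 1 if self.mirror else self.disks - self.parity

    def tolerated_failures(self):
        # A mirror survives the loss of all but one disk; RAIDZ survives `parity` losses.
        return self.disks - 1 if self.mirror else self.parity

def striped_pool(groups, disk_tb):
    """Return (usable TB, disk failures survivable in the worst case)."""
    usable = sum(g.usable_disks() for g in groups) * disk_tb
    # Worst case: all failures hit the weakest group, and losing any group loses the stripe.
    tolerated = min(g.tolerated_failures() for g in groups)
    return usable, tolerated

print(striped_pool([Group(6, parity=2), Group(6, parity=2)], disk_tb=4.0))  # (32.0, 2)
print(striped_pool([Group(10, mirror=True)], disk_tb=4.0))                  # (4.0, 9)
```

The two examples show the trade-off: two striped RAIDZ2 groups of six 4 TB disks give about 32 TB while surviving any two failures, whereas a 10-way mirror gives one disk's capacity but survives nine failures.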
Complexity
The main drawback of ZFS RAIDZ layouts is their sheer complexity.
With a traditional RAID, you have a single configuration with specific parameters that are the same for the whole disk set;
with RAIDZ, you deal with a specific configuration for literally each data row. All this makes ZFS recovery extremely complex.