Author: Gergő Fándly
Date: 2024-11-28
Recently, I decided to get a big upgrade for my home NAS, so I had to set up a RAID array once again. Since I’m not doing this on a daily basis, I tend to forget the commands, so I have a document with all the instructions. I think that it might be helpful for some of you, so I polished it up a bit.
I’m going to cover the following aspects of the process:
- choosing the drives
- choosing, cooling, and flashing an HBA
- picking the RAID technology (ZFS vs. BTRFS vs. mdadm)
- creating the array with mdadm
- extending the array later on
For obvious reasons, you need multiple drives to be able to create a RAID array. The minimum number of drives depends on the desired RAID level (more on that later).
If you’re like me and you’re setting up your NAS to serve your home or your small business, you probably don’t want to go with enterprise-grade hard drives. In fact, even some enterprises don’t use enterprise-grade HDDs. One such renowned company is Backblaze, which opted to use consumer-grade drives in their data centers. They even publish regular reports detailing the reliability of their drives. That data is a great source to find the most reliable drives that stand well even in the demanding environment of a datacenter.
As for me, I tend to prefer Seagate drives, but this mostly comes down to the fact that in my region they have the broadest product range and suppliers usually maintain a reasonable stock of these drives.
As for drive sizes, it highly depends on your specific requirements, but I would recommend against buying anything with a capacity of less than 2TB. If you want the lowest $/GB ratio, you should buy 8TB drives, but as a balance between $/GB and total price, 4TB drives might be the best choice.
And now the big question: “Should I buy drives made specifically for NAS?” – And the answer would be “yes, if you can”. Those drives are usually not much more expensive than general-purpose HDDs, but (at least theoretically) they are built much sturdier and often have higher performance. In my case, the price difference between Seagate Barracuda (general-purpose) and IronWolf (NAS specific) was about 15%, which was totally worth it.
I am using 4TB 5400RPM Seagate IronWolf (ST4000VN006) drives in my system.
You might ask why you would need an HBA (Host Bus Adapter). Well, most motherboards come with only 4 or 6 SATA ports, and instead of buying a much more expensive motherboard with more ports, you could buy an HBA for much less. When I originally built my NAS I bought the cheapest motherboard, since I don’t need any of that premium on-board audio or 4 “extreme durability” x16 PCIe slots. It only has 4 SATA3 ports, but the price of the motherboard and the HBA combined is less than the price of a motherboard with 8 ports. And the combination supports up to 20 drives (16 through the HBA and 4 through the motherboard).
What differentiates a RAID card from an HBA is that a RAID card does all the RAID operations on-board, while an HBA simply acts as additional SATA ports for your motherboard.
Based on ServeTheHome’s recommendation your best pick would be an LSI card with the SAS 3200 or 3008 chip. The cards with the newer 3200 chip (LSI SAS 9305-16i for example) tend to be a bit more expensive.
I would recommend choosing the LSI SAS 9305-16i if you can find it for a good price or the LSI SAS 9300-8i card. You can also choose the 9300-16i, but be aware that they are in fact two separate 9300-8i cards placed on a single board (this is very important when updating the firmware).
Both of these cards come with SFF-8643 connectors. Each of these ports can be split into 4 SATA connectors. If you’re buying the card from eBay, there is frequently an option to add these cables for a few bucks.
Which leads us to my next point: where to buy these? From eBay. Look for refurbished ones.
First of all: these cards are not passively cooled. They are intended for use in a server chassis where you have high static pressure fans at the front of the case. You might be able to run them without active cooling, but at extremely high temperatures (after about 5 minutes it will literally burn your finger). If you have a 3D printer, you can print a shroud that clips onto the heatsink, but if you don’t have one, you can use cable ties to secure a 120mm fan onto the heatsink. Just make sure in that case that the fan is blowing toward the card. I have this setup in my home system and the temps are at about 45 degrees Celsius, which is more than fine.
You might even try to repaste the card, but it is too much effort in my opinion for those extra 3-4 degrees.
When installing the card you may notice that some of them have a PCIe power connector. My advice would be to connect them. But if you don’t have a spare cable coming from your PSU, that should be fine as well. At the company where I work, we have multiple 9300-16i cards and none of them have their power connector plugged in, yet they’ve been working fine for the last several years.
Once you’ve installed the card the first thing you should do is to update its firmware. The cards absolutely must be flashed in IT (initiator target) mode, and even if they are already in IT mode, the firmware is likely an ancient one.
For the 9305 or later cards you can download the latest firmware from Broadcom’s site; for the 9300 cards you have to get the firmware upgrade from TrueNAS, since the 16.00.12.00 firmware is not officially published.
Once you have the latest firmware, you have to get the tool used for flashing it: sas3flash. It’s already bundled in TrueNAS, but if you’re running Debian for example, you can get it from here. You should unzip it and find the binary relevant to your system (linux_x64 for example).
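Before flashing anything, it’s worth checking which controllers the tool sees and what firmware they are currently running. A quick check, assuming you unpacked sas3flash into your current directory:

sudo ./sas3flash -listall
sudo ./sas3flash -c 0 -list # detailed info about the first controller

The first command should show your card(s) along with the current firmware version and whether the image is IT or IR.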
Once you have both the required firmware and the flash utility, you can flash the firmware:
sas3flash -o -f firmware_file.bin
Remember when I said that the 9300-16i card is actually two cards? This command only flashes one of them. If you have a 9300-16i use the following command to flash both of the controllers:
sas3flash -o -fwall firmware_file.bin
Now just reboot your system and you can start connecting your drives to it.
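To confirm that the card came back up with the new firmware and that the kernel picked it up, a couple of quick checks help (the exact wording of the output will vary):

lspci | grep -i -e lsi -e broadcom
sudo dmesg | grep -i mpt3sas

The mpt3sas driver is what Linux uses for these SAS3 cards, so its messages should include the firmware version and report the drives as you connect them.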
Everyone is recommending ZFS. You should go with ZFS then. Right?
Well… Yes. If you know what you’re doing, you should absolutely use ZFS. It is by far the most robust solution out there. I’m using it as well at work. But for home, I would not. At least not for the foreseeable future.
Most articles state that the problem is that ZFS requires 1GB of RAM for each TB of raw storage. For eight 4TB drives this means 32GB of RAM just for ZFS. In my opinion, this is not a problem: you can get two 16GB sticks extremely cheap, especially if you’re buying them second hand.
The real problem in my opinion is the lack of flexibility in ZFS. Once you create a vdev, you cannot add drives to it. If you want to add more drives, you have to create a new vdev and add it to your pool.
Let’s say you have four 4TB drives in RAID Z1 (RAID 5) and your storage starts to run out. You might want to add a single drive, but this is not possible. If you added a single drive on its own, it would become a single point of failure. So the only logical option is to add another four drives, also configured in RAID Z1. As a result, you end up with two separate RAID Z1 vdevs of four drives each, which offers less redundancy than a single 8-drive RAID Z2 vdev.
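To make this concrete, here is roughly what that expansion looks like with a hypothetical pool named tank (the pool and device names are placeholders, not from my actual setup):

sudo zpool add tank raidz1 /dev/sde /dev/sdf /dev/sdg /dev/sdh
sudo zpool status tank # now shows two separate raidz1 vdevs

Each vdev still only tolerates a single drive failure within itself, which is why this is weaker than one 8-drive RAID Z2 vdev.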
So the conclusion is that if you decide to go with ZFS, you have to plan ahead. When you extend a ZFS pool, you typically extend it with a whole shelf of drives at once, in which case this limitation is not a problem.
So what else could you use? BTRFS has some of ZFS’s cool features as well: checksumming, self-healing, CoW, and you can extend it at any time! Great then. Right? Well. No. For RAID 1 it is pretty cool, but for RAID 5 and RAID 6 the support is still experimental and not recommended for production.
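If RAID 1 happens to be enough for your needs, a BTRFS mirror really is a one-liner (the device names are placeholders and the drives should be empty):

sudo mkfs.btrfs -m raid1 -d raid1 /dev/sdx1 /dev/sdy1

The -m and -d flags set the RAID profile for metadata and data respectively.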
What to use then? I’d recommend going back to the basics and using mdadm, LVM, and your filesystem of choice (which might be BTRFS, for example). It does not offer any of the fancy features that ZFS has; it basically provides the same functionality as a hardware RAID controller. But it is extremely flexible: you can add any number of drives at any time and you can change RAID levels, all while the array is running.
First of all, we need to install some tools for managing the RAID array:
sudo apt install mdadm fdisk
Now we need to partition the drives. This step is not strictly necessary, but it is highly recommended. With mdadm you can use the raw disks directly, but this can cause several problems. Mdadm stores the array metadata in a “superblock” near the start of each drive, and if the disk was previously formatted as GPT, some motherboards may delete the RAID superblock. This can be avoided by making sure the drives are “clean”, but it’s not worth the risk in my opinion.
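If a drive has been used before, I would clear any leftover filesystem or RAID signatures from it before partitioning. One way to do that is wipefs (this is destructive, so triple-check the device name; /dev/sdx is a placeholder):

sudo wipefs -a /dev/sdx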
The second issue is that 1TB is not always 1TB. I’m not talking about the difference between TB and TiB; the number of physical sectors on two drives with the same nominal capacity can differ between manufacturers or even between product series. For example, a 1TB drive from manufacturer A might hold 1 000 000 000 256 bytes while another from manufacturer B holds exactly 1 000 000 000 000 bytes.
Since RAID requires that any drive you add is at least as large as the existing members, this might cause problems later. So as a general rule of thumb, it’s better to leave a bit of unused space (a few hundred MB at most) to ensure that you can still add a drive later on, even if it’s marginally smaller.
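If you want to see how much your drives actually differ, you can compare their exact sizes in bytes before partitioning (adjust the device list to match your system):

for d in /dev/sda /dev/sdb /dev/sdc /dev/sdd; do
  echo -n "$d: "
  sudo blockdev --getsize64 "$d"
done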
To do this partitioning we’re going to use fdisk.
First of all list the available drives:
lsblk
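If you have a lot of drives, adding the model and serial columns makes it much easier to match device names to physical disks:

lsblk -o NAME,SIZE,MODEL,SERIAL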
Note the device names of the drives you want to use in your array and repeat the following step for each of them. Be aware that this will destroy the existing partition table and you will lose any data you have on the drives!
sudo fdisk /dev/sdx
This will open up the interactive partitioner, where we have to do the following:
- g to create a GPT partition table
- n to create a new partition
- 1 for the partition number
- 2048 for the first sector
- -100M for the last sector. This will leave 100MB of empty space at the end of the drive
- t to change the partition type
- Linux RAID, which has the index 42 or GUID A19D880F-05FC-4D3B-A006-743F0F84911E. But to go for sure, type L and look it up yourself.
- p to print the partition table and double-check it; use the same -100M end for the rest of the drives as well
- w to write the changes to disk
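If you don’t feel like repeating the interactive steps for every drive, the same layout can be scripted. This is just a sketch using sgdisk from the gdisk package (an extra tool, not something the steps above require), with placeholder device names and the same 100MB gap at the end; it is destructive, so double-check the list before running it:

sudo apt install gdisk
for d in /dev/sdb /dev/sdc /dev/sdd /dev/sde; do
  sudo sgdisk --zap-all "$d"                            # wipe the existing partition table
  sudo sgdisk --new=1:2048:-100M --typecode=1:FD00 "$d" # Linux RAID partition ending 100M before the end
done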
Now that you have all your drives ready, you have to decide what RAID level you’re going to use (as a quick reminder: RAID 1 needs at least 2 drives, RAID 5 at least 3, and RAID 6 at least 4). To check what capacity will be available to you with each RAID level, you can use this calculator.
Now that you’ve decided what RAID level you’re going to use, create the array itself:
sudo mdadm --create /dev/md0 --level=raid5 --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
Let’s break this down:
- --create /dev/md0 – create a new RAID device
- --level=raid5 – use RAID level 5
- --raid-devices=4 – we’re going to use 4 devices in our RAID array. You can set it to less than the number of available devices to have hot spares
- /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 – the list of devices participating in the array

And now we have to wait. It will take anywhere from several minutes to many hours to synchronize the drives, depending on your specific system. Technically you can already use the array, but it’s better to wait for the initialization to finish first.
You can monitor the process by running the following:
cat /proc/mdstat
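To keep an eye on it without re-running the command, something like this works (the refresh interval is arbitrary):

watch -n 10 cat /proc/mdstat
sudo mdadm --detail /dev/md0

The second command also shows the rebuild progress, the state of each member drive, and the array UUID.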
Once it’s done, you have to persist the RAID configuration:
sudo mdadm --detail --scan | sudo tee /etc/mdadm/mdadm.conf
And update the initramfs to have it available at boot:
sudo update-initramfs -u
From now on you can interact with /dev/md0 as if it were an ordinary hard drive. You can partition it, create a filesystem on it, and mount that filesystem.
For managing the space on the array I’d recommend LVM:
sudo apt install lvm2
sudo pvcreate /dev/md0
sudo vgcreate vault /dev/md0
sudo lvcreate -L 100G -n media vault
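A quick sanity check that the physical volume, the volume group, and the logical volume all exist as expected:

sudo pvs
sudo vgs
sudo lvs

You should see /dev/md0 as the physical volume, vault as the volume group, and media as the 100G logical volume.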
To create a filesystem on it:
sudo mkfs.ext4 /dev/vault/media
And to mount it:
sudo tee /etc/systemd/system/mnt-media.mount <<_EOF
[Mount]
What=/dev/vault/media
Where=/mnt/media
Type=ext4
[Install]
WantedBy=multi-user.target
_EOF
sudo systemctl daemon-reload
sudo systemctl enable mnt-media.mount
sudo systemctl start mnt-media.mount
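To verify that the mount actually came up:

systemctl status mnt-media.mount
df -h /mnt/media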
My primary reason for choosing mdadm over ZFS was that it’s easily extensible. So let’s see how to do it.
First of all, partition the new drive the same way as described when creating the array.
Now add the drive as a hot spare:
sudo mdadm --add /dev/md0 /dev/sde1
And grow the array:
sudo mdadm --grow /dev/md0 --raid-devices=5
You could also change the RAID level in the same step:
sudo mdadm --grow /dev/md0 --raid-devices=5 --level=raid6
Now you have to wait for the process to finish. You will only be able to access the additional free space once the reshape completes. When it’s done, you have to save the new configuration:
sudo mdadm --detail --scan | sudo tee /etc/mdadm/mdadm.conf
sudo update-initramfs -u
And finally, we can grow the LVM volume and the filesystem itself:
sudo pvresize /dev/md0
sudo lvextend -l +100%FREE /dev/vault/media
sudo resize2fs /dev/vault/media # or the equivalent grow command for your desired filesystem
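Once the resize is done, a quick check confirms that everything picked up the new capacity (in this example the array should now list 5 devices):

sudo mdadm --detail /dev/md0
df -h /mnt/media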