As described in my earlier post, Replacing a failed disk in a mdadm RAID, I have a 4-disk RAID 5 setup which I initially populated with 1TB WD GREEN disks (cheap, but not really suited for NAS operation). After a few years the file system started to fill up, so I wanted to grow the RAID by upgrading to 3TB WD RED disks, which are tailored to NAS workloads. Growing the mdadm RAID is done through the following steps:
- Fail, remove, and replace each 1TB disk with a 3TB disk, waiting for the RAID to resync after each replacement.
- Grow the RAID to use all the space on the new 3TB disks.
- Finally, grow the filesystem to use the space now available on the RAID device.
The following is similar to my previous article, Replacing a failed disk in a mdadm RAID, but I have included it here for completeness.
Removing the old drive
The enclosure I have does not support hot-swap, and it has no separate light for each disk, so I needed a way to find out which of the disks to replace. Finding the serial number of a disk is fairly easy:
# hdparm -i /dev/sde | grep SerialNo
 Model=WDC WD10EARS-003BB1, FwRev=80.00A80, SerialNo=WD-WCAV5K430328
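To avoid eyeballing the output, the serial can be pulled out with a bit of shell. This is just a sketch: the sample line is the hdparm output shown above, and the sed pattern assumes the SerialNo= field is always last on the line.

```shell
# Sketch: extract just the serial number from a hdparm identification line.
# The sample line is the hdparm output shown above; the sed pattern assumes
# the "SerialNo=" field sits at the end of the line.
line=' Model=WDC WD10EARS-003BB1, FwRev=80.00A80, SerialNo=WD-WCAV5K430328'
serial=$(printf '%s\n' "$line" | sed 's/.*SerialNo=//')
echo "$serial"
```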
Luckily, the Western Digital disks I have came with a small sticker showing the serial number on the disk itself. So now I knew the serial number of the disk I wanted to replace, and before shutting down and swapping the disk I marked it as failed in mdadm and removed it from the RAID:
mdadm --manage /dev/md0 --fail /dev/sde1
mdadm --manage /dev/md0 --remove /dev/sde1
Adding the new drive
Having replaced the old disk and inserted the new one, I found the serial number on the back and compared it to the serial of /dev/sde to make sure I was about to format the right disk:
# hdparm -i /dev/sde | grep SerialNo
 Model=WDC WD30EFRX-68EUZN0, FwRev=80.00A80, SerialNo=WD-WMC4N1096166
Partitioning disks over 2TB does not work with the MSDOS (MBR) partition table, so I needed to use parted (instead of fdisk) to partition the disk correctly with a GPT label. The “-a optimal” option makes parted use the optimum alignment as given by the disk topology information. This aligns the partition to a multiple of the physical block size in a way that guarantees optimal performance.
# parted -a optimal /dev/sde
(parted) mklabel gpt
(parted) mkpart primary 2048s 100%
(parted) align-check optimal 1
1 aligned
(parted) set 1 raid on
(parted) print
Model: ATA WDC WD30EFRX-68E (scsi)
Disk /dev/sde: 3001GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  Start   End     Size    File system  Name     Flags
 1      1049kB  3001GB  3001GB               primary  raid

(parted) quit
Information: You may need to update /etc/fstab.
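The check that align-check performs can also be done by hand: the partition's starting byte offset must be a multiple of the physical block size. A minimal sketch with the values from the print output above (512B logical sectors, 4096B physical blocks, start sector 2048):

```shell
# Alignment check by hand, using the values reported above: start sector 2048
# with 512B logical sectors gives a byte offset of 1048576 (1MiB), which must
# divide evenly by the 4096B physical block size.
start_sector=2048
logical_bytes=512
physical_bytes=4096
offset_bytes=$((start_sector * logical_bytes))
if [ $((offset_bytes % physical_bytes)) -eq 0 ]; then
    echo aligned
else
    echo "not aligned"
fi
```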
Now the disk was ready for inclusion in the raid:
mdadm --manage /dev/md0 --add /dev/sde1
Over the next 3 hours I could monitor the rebuild using the following command:
[root@kelvin ~][20:43]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sde1 sdc1 sdb1 sdd1
      2930280960 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [_UUU]
      [>....................]  recovery =  0.5% (4893636/976760320) finish=176.9min speed=91536K/sec
      bitmap: 4/8 pages [16KB], 65536KB chunk

unused devices: <none>
The rebuild takes around 3 hours per disk in my case, and it is very important to wait for the array to finish rebuilding after each replacement. After replacing all 4 disks and letting the RAID resync, I could continue.
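The finish estimate in /proc/mdstat is essentially the per-device size divided by the current resync speed. A rough sketch with the numbers from the snapshot above:

```shell
# Rough resync time estimate from the /proc/mdstat snapshot above:
# per-device size in KiB blocks divided by the reported speed in KiB/s.
size_kib=976760320
speed_kib_s=91536
minutes=$(( size_kib / speed_kib_s / 60 ))
echo "${minutes} min"   # close to the finish=176.9min that md reports
```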
Resize the array to the new maximal size
Now all the disks have been replaced with larger 3TB disks, but the RAID device is not using the extra space yet. To instruct mdadm to use all the available space, I issued the following commands:
mdadm --grow /dev/md0 --bitmap none
mdadm --grow /dev/md0 --size=max
Now this also takes quite a while to complete – several hours in my case. The RAID is still usable while this is happening.
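As a sanity check on the result: RAID 5 usable capacity is (n − 1) times the per-disk size, since one disk's worth of space is consumed by parity. For this setup:

```shell
# RAID 5 capacity sketch: with 4 disks of 3TB each, one disk's worth of
# space holds parity, leaving (4 - 1) * 3 = 9TB usable.
disks=4
per_disk_tb=3
usable_tb=$(( (disks - 1) * per_disk_tb ))
echo "${usable_tb} TB usable"
```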
Resize the filesystem
Finally, I had to grow the filesystem to use the newly available space on the array. My array is mounted under /home, so I had to unmount the filesystem first:

umount /home
To make sure everything is okay I force a check of the filesystem before the resizing:
fsck.ext4 -f /dev/md0
Finally, I started the resizing of the filesystem:

resize2fs /dev/md0

This step itself is very quick, as the majority of the work is done later, once the filesystem is mounted again, by a process called ext4lazyinit. In my case, ext4lazyinit took almost a full day to complete.
It seems using Raid5 with big hard drives is not recommended due to prevalence of URE relative to drive size. Are you not concerned about using Raid5 with 3TB disks?
I might have done it differently if I had started the server now, but the setup is 5 years old and working fine, so I am not planning to tinker with it now.
The reason I am not too worried is that I have a USB-connected 8TB external disk I do rsnapshot backups to every month, and a daily offsite backup through rsync to another server.
Why do you disable the bitmap before growing the array? Do you re-enable it afterwards?
I can’t remember it really, so the procedure might have changed since I did it, but I think I read it on https://raid.wiki.kernel.org/index.php/Growing#Adding_partitions
I grew a 25TB array to 50TB this way, but without disabling the bitmap, and it worked; it took about 10 hours. It looks like disabling the bitmap may help with speed, but if the resync is interrupted, the bitmap helps it restart and finish more quickly rather than starting over.
I grew my array this week, from 5x 3TB to 5x 8TB. Note that resize2fs 1.42 (standard with Ubuntu) only supports a max of 16TB (32bit addresses). I had to download and compile the 1.43 version as detailed here: https://askubuntu.com/questions/779754/how-do-i-resize-an-ext4-partition-beyond-the-16tb-limit
Otherwise, smooth sailing.
Hi, just wanna say thank you for this, exactly what I needed. 🙂 The tip about the serial number saved me some time.
Thomas you are an absolute champ. While I ultimately decided to backup, reformat and restore, the general process of adding disks and expanding the file system worked great. Thank you for posting these valuable details!