I have been using a quite secure setup for the last couple of years: a 4-drive RAID 6, which can tolerate two disk failures without any data loss. Recently, though, I have been getting close to the edge of the filesystem and could use some extra space. Since I have both a monthly backup to an external hard drive and a nightly offsite backup, I am actually not very afraid of data loss on a RAID 5 setup. So I have planned to change my 4-disk RAID 6 to a 4-disk RAID 5 without any spares.
A word of caution: Please do not perform any of the actions below before a backup has been made.
Changing the raid level
Using mdadm it is very easy to change the raid level. The command below changes my previous 4-disk RAID 6 setup to a RAID 5 with 3 active disks and a spare. The reason for using 3 disks and a spare is that mdadm recommends having a spare when downgrading, and since I am in no hurry this is fine with me. Additionally, the command below saves some critical data to a backup file during the process, to ease recovery if anything should go wrong or power should be lost. This file should be placed on a hard disk that is not part of the raid. The backup file is not big; in my case it was about 30 MB and was saved to my root partition, which resides on the system SSD.
mdadm --grow /dev/md0 --level=raid5 --raid-devices=3 --backup-file=/root/mdadm-backupfile
This process can take a long while (around 24 hours in my case), but during it the raid remains fully functional and everything is done in the background. To monitor the progress, use either of the commands below (assuming that your raid device is md0). Please note that this is not output from the actual process, but output from my functional raid 5 after it was all done, as I am writing this blog entry:
[root@kelvin ~][22:00]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sde1 sdb1 sdd1 sdc1
      2930280960 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 2/8 pages [8KB], 65536KB chunk

unused devices: <none>
[root@kelvin ~][22:00]# mdadm --detail /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Tue Feb  1 23:24:46 2011
     Raid Level : raid5
     Array Size : 2930280960 (2794.53 GiB 3000.61 GB)
  Used Dev Size : 976760320 (931.51 GiB 1000.20 GB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Thu Aug 29 22:00:58 2013
          State : active
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           Name : kelvin:0  (local to host kelvin)
           UUID : 3f335d48:4b43c1d5:6ee5c975:1428eb56
         Events : 1186983

    Number   Major   Minor   RaidDevice State
       0       8       65        0      active sync   /dev/sde1
       1       8       33        1      active sync   /dev/sdc1
       4       8       49        2      active sync   /dev/sdd1
       3       8       17        3      active sync   /dev/sdb1
Growing the raid from 3 to 4 active disks
After this operation my raid had 3 active disks and a spare. I would like to expand my raid, so I will convert the spare into an active disk using the following command:
mdadm --grow -n 4 /dev/md0
This process is also done while the raid is active, and took about 12 hours. My server actually crashed during the process, probably due to an XBMC bug, but the process resumed without a hitch when I came home and rebooted it. 🙂
Resizing the filesystem to fill the disk
Finally I needed to expand the filesystem (ext4 in my case) to use all the new space. I changed to init level 1 (single-user mode) to make sure that all users were off the system.
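On a sysvinit-based system like my Debian install, switching to single-user mode is a single command (a sketch; your init system may differ):

```shell
# Switch to runlevel 1 (single-user mode); non-essential
# services are stopped and other users are logged off.
telinit 1
```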
Next, I unmounted the raid filesystem.
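The unmount itself is one command; I am assuming here the mount point /home, which is where md0 lives on my system:

```shell
# Unmount the filesystem on the array before checking it.
umount /home
```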
Then I forced a check to ensure that the filesystem was healthy:
e2fsck -f /dev/md0
This took about 5 minutes. Finally I resized my ext4 filesystem to fill all the available space on md0:

resize2fs /dev/md0
This took about 10 minutes. I could have remounted and changed back to init level 2 (Debian's default multi-user mode), but I decided to restart instead, and had everything up and running shortly after. I now have plenty of space left on my /home partition:
[root@kelvin ~][22:08]# df -hTl
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/sda2      ext4   50G   18G   30G  37% /
/dev/md0       ext4  2.7T  1.5T  1.3T  56% /home
/dev/sda1      ext2  291M   92M  184M  34% /boot
...
As a final note, I noticed strange activity in my munin graphs for the disks after the reboot. All the drives in the raid were utilized at around 5% even though nothing was happening. Using iotop, it turned out to be a kernel thread called “ext4lazyinit” running in the background:
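If you want to track down mystery I/O the same way, iotop's -o flag limits the display to processes and threads that are actually doing I/O (needs root):

```shell
# Show only processes/threads currently performing I/O.
iotop -o
```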
When lazy_itable_init extended option is passed to mke2fs, it
considerably speed up filesystem creation because inode tables are
not zeroed out, thus contains some old data. When this fs is mounted
filesystem code should initialize (zero out) inode tables…
For purpose of zeroing inode tables it introduces new kernel thread
called ext4lazyinit, which is created on demand and destroyed, when it
is no longer needed…