How much of a file system monger are you?

Have you ever been lost in conversations or threads about one or the other file system? which one is faster? which one is slower? is that feature stable? which file system to use for this or that payload?

I was recently surprised by seeing ext4 as the default file system on a new linux installation. Yes, I know, ext4 has been around for a good while, and it does offer some pretty nifty features. But when it comes to my personal laptop and my data, well, I must confess switching to something newer always sends shrives down my back.

Better performance? Are you sure it's really that important? I'm lucky enough that most of my coding & browsing can fit in RAM. And if I have to recompile the kernel, I can wait that extra minute. Is the additional slowness actually impacting your user experience? and productivity?

Larger files? Never had to store anything that ext2 could not support. Even with a 4Gb file limit, I've only rarely had problems (no, I don't use FAT32, but when dmcrypt/ecryptfs/encfs and friends did not exist, I used for years the good old CFS, which turned out to have a 2Gb file size limit). Less fragmentation? More contiguous blocks? C'mon, how often have you had to worry about the fragmentation of your ext2 file system on your laptop?

What I generally worry about is the safety of my data. I want to be freaking sure that if I lose electric power, forget my laptop in suspend mode or my horrible wireless driver causes a kernel panic I don't lose any data. I don't want no freaking bug in the filesystem to cause any data loss or inconsistency. And of course, I want a good toolset to recover data in case the worst happens (fsck, debug.*fs, recovery tools, ...).

So, what do I do? I stick to older file systems for longer. At least, for as long as the old system is well maintained, and the new system doesn't have something I really really want (like a journal, when they started popping up).

What else? Well, talking about ext.* file system and my setup...

  1. I use data=journal in fstab, whenever possible. For the root partition, be careful that you need to either add the option rootflags=data=journal to your grub configuration, or use something like tune2fs -o journal_data /dev/your/root/device, so that the file system is first mounted with data journaling enabled. If you are curious, the problem stems from the fact that you can't change the journaling mode when the file system is already mounted, the boot process on some distros will fail if you don't follow those steps.
  2. I make sure barriers are enabled. Most modern disks cache your data in an internal memory to be faster. If you lose power, journal or not, that data will be lost. Journal or not, you risk corrupting data. With barrier=1 in fstab you ensure that at least the journal entries are written properly to disk. This again can slow you down, but makes corruption significantly more unlikely.
  3. keep the file systems I don't need to write to read only, with the hope that in case things go wrong, I will be at least able to boot and reduce the surface of damage.
  4. other tunings to reduce battery use.

So, here's what my fstab looks like:

/dev/sda1        /boot   ext2 nodiratime,noatime,ro,nodev,nosuid,noexec,sync 0       2
/dev/mapper/root  /          ext3 nodiratime,noatime,data=journal,errors=remount-ro 0 1
/dev/mapper/opt   /opt       ext3 nodiratime,noatime,data=journal,errors=remount-ro 0 1
/dev/mapper/media /opt/media ext3 nodiratime,noatime,data=journal,errors=remount-ro 0 2 
/dev/mapper/vms   /opt/vms   ext3 nodiratime,noatime,data=journal,errors=remount-ro 0 2

Note that I use LVM underneath. It was hard for me to start, I was fearing an extra layer of indirection and possible caching would have complicated things :), but encryption and snapshotting features sold me, and I've been happily using it for years.

I use a similar setup on a raspberry pi in a small appliance that I cannot properly shutdown, and have been plugging and unplugging it directly without headaches for a while (luckily? maybe).

So, what's next? I'm looking forward for a logged file system or a stable file system that properly supports snapshots. Something like NILFS or maybe btrfs. In the past, I had a script taking snapshots of my LVM partitions periodically and at boot, so if I braoke my system with an update or accidentally removed a file I could easily go back in time.

I gave up on that as LVM snapshots turned out to be fairly buggy from kernel to kernel, not well supported by distributions (had at least one instance of initrd scripts getting confused by the presence of a persistent snapshot, and refusing to boot), and often lead more headaches than advantages, at least for personal use. I will probably give them a shot again in the near future :), but for now, I'm happy as is.


Other posts

  • A simple way to generate snippets in python The Problem Let's say you want to add a search box to your web site to find words within your published content. Or let's say you want to display a l...
  • I/O performance in Python The Problem I am writing a small python script to keep track of various events and messages. It uses a flat file as an index, each record being of th...
  • Cleaning up a CSS Let's say you have a CSS with a few thousand selectors and many many rules. Let's say you want to eliminate the unused rules, how do you do that? I s...
Technology/Python