File Systems 101

As a computer science person, you should always know what’s going on in your system and how every bit of data flows through it. And since I just mentioned data, one cannot ignore file systems.

According to Wikipedia, a file system is “an abstraction to store, retrieve and update a set of files. The term also identifies the data structures specified by some of those abstractions, which are designed to organize multiple files as a single stream of bytes, and the network protocols specified by some other of those abstractions, which are designed to allow files on a remote machine to be accessed. By extension, the term also identifies software or firmware components that implement the abstraction.”

In this article, we’ll break down this definition in order to fully understand how file systems operate. The easiest way to do this is to put a file system under the microscope and look at its implementation. I chose ext2 because it’s typical and relatively simple.

The ext2 file system divides the disk space into “blocks”. Contiguous blocks are grouped together into what we’ll call a “block group”.
A block can be:

  • Superblock.
  • Inode (index node) block.
  • Indirect blocks (singly, doubly or triply indirect; discussed later).
  • Data block.

The superblock contains all the information about the layout of the file system, and may also contain other important information, such as which optional features were used to create the file system. The superblock is always located at offset 1024 bytes from the beginning of the volume and is 1024 bytes long.
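
To make the fixed offset concrete, here is a minimal Python sketch that reads a few superblock fields from a raw image. It is an illustration under assumptions, not the CATernel implementation: the function names are hypothetical, the field offsets follow my reading of the ext2 on-disk layout (little-endian, magic number 0xEF53 at byte 56 of the superblock), and only a handful of fields are decoded.

```python
import struct

SUPERBLOCK_OFFSET = 1024  # bytes from the start of the volume
SUPERBLOCK_SIZE = 1024    # bytes
EXT2_MAGIC = 0xEF53       # value expected in s_magic


def parse_superblock(raw):
    """Decode a few fields from the raw superblock bytes (hypothetical helper)."""
    # Seven consecutive 32-bit fields starting at byte 0 of the superblock.
    (inodes_count, blocks_count, _reserved_blocks, free_blocks, free_inodes,
     first_data_block, log_block_size) = struct.unpack_from("<7I", raw, 0)
    # s_blocks_per_group, s_frags_per_group, s_inodes_per_group start at byte 32.
    blocks_per_group, _frags_per_group, inodes_per_group = struct.unpack_from("<3I", raw, 32)
    (magic,) = struct.unpack_from("<H", raw, 56)
    if magic != EXT2_MAGIC:
        raise ValueError("not an ext2 superblock (bad magic 0x%04X)" % magic)
    return {
        "inodes_count": inodes_count,
        "blocks_count": blocks_count,
        "free_blocks": free_blocks,
        "free_inodes": free_inodes,
        "first_data_block": first_data_block,
        "block_size": 1024 << log_block_size,  # the size is stored as a shift
        "blocks_per_group": blocks_per_group,
        "inodes_per_group": inodes_per_group,
    }


def read_superblock(path):
    """Read the superblock from an image file or device at `path`."""
    with open(path, "rb") as f:
        f.seek(SUPERBLOCK_OFFSET)
        return parse_superblock(f.read(SUPERBLOCK_SIZE))
```

Calling `read_superblock("disk.img")` on an ext2 image (the file name is just a placeholder) would return a dictionary with the block size and the per-group counts that the rest of the parsing depends on.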

Let’s take a look at the superblock structure:
https://github.com/SaadTalaat/CATernel/blob/master/include/fs/ext2fs.h , scroll down to “struct ext2_super_block”. As you can see, it’s too big to paste in here.

The superblock includes a lot of information that the rest of the file system code relies on, so, as you might have concluded, it’s the first stop when you want to parse a file system. Right after the superblock comes the Block Group Descriptor Table (see ext2_group_desc). As you can see, each descriptor includes the block of the inode usage bitmap, the address of the inode table block, etc. Looks handy, huh?
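
As a rough illustration of what a descriptor gives you, the sketch below decodes one entry of the table in Python. The names `GroupDesc`, `gdt_offset` and `parse_group_desc` are hypothetical, and the layout (three 32-bit block addresses followed by three 16-bit counters in a 32-byte little-endian record, with the table starting in the block after the superblock) is my understanding of ext2 rather than something taken from the linked header.

```python
import struct
from collections import namedtuple

GROUP_DESC_SIZE = 32  # bytes per descriptor in the table

GroupDesc = namedtuple(
    "GroupDesc",
    "block_bitmap inode_bitmap inode_table free_blocks free_inodes used_dirs",
)


def gdt_offset(first_data_block, block_size):
    """Byte offset of the descriptor table: the block after the superblock's block."""
    return (first_data_block + 1) * block_size


def parse_group_desc(raw, index=0):
    """Decode descriptor number `index` from the raw table bytes."""
    fields = struct.unpack_from("<3I3H", raw, index * GROUP_DESC_SIZE)
    return GroupDesc(*fields)
```

With 1024-byte blocks the superblock lives in block 1, so the table starts at byte 2048; with larger blocks the superblock sits inside block 0 and the table starts at block 1.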

At this point you’re definitely wondering what an inode is. An inode describes a single file system object, i.e. a file, a directory or a symbolic link, and points to the blocks that hold its data. Each inode contains 12 direct pointers, one singly indirect pointer, one doubly indirect pointer, and one triply indirect pointer, which gives us a total of 15 pointers (refer to struct ext2_inode). The direct space “overflows” into the singly indirect space, which overflows into the doubly indirect space, which overflows into the triply indirect space.
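
Here is a small sketch of how those 15 pointers could be pulled out of a raw inode in Python. `parse_inode_pointers` is a hypothetical helper, and the offsets used (mode at byte 0, size at byte 4, the 15-entry block array at byte 40 of a little-endian 128-byte inode) are assumptions based on my reading of the ext2 layout; check them against struct ext2_inode before relying on them.

```python
import struct

I_BLOCK_OFFSET = 40   # assumed offset of the 15-entry pointer array
POINTER_COUNT = 15    # 12 direct + single + double + triple


def parse_inode_pointers(raw):
    """Split an inode's pointer array into its four roles."""
    (mode,) = struct.unpack_from("<H", raw, 0)
    (size,) = struct.unpack_from("<I", raw, 4)
    block = struct.unpack_from("<%dI" % POINTER_COUNT, raw, I_BLOCK_OFFSET)
    return {
        "mode": mode,
        "size": size,
        "direct": list(block[:12]),  # addresses of the first 12 data blocks
        "single": block[12],         # block holding data block addresses
        "double": block[13],         # block holding singly indirect addresses
        "triple": block[14],         # block holding doubly indirect addresses
    }
```

A pointer value of 0 conventionally means "no block allocated", so a small file will usually have zeros in most of these slots.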

Singly Indirect Block Pointer: used if a file needs more than 12 blocks. A separate block is allocated to store the addresses of the remaining data blocks needed to hold the file’s contents.
Doubly Indirect Block Pointer: used if the 12 direct blocks and the singly indirect block are not enough. The pointer points to a block holding the addresses of singly indirect blocks, each of which holds the addresses of data blocks.
Triply Indirect Block Pointer: used if that still isn’t enough. The pointer points to a block holding the addresses of doubly indirect blocks, which in turn hold the addresses of singly indirect blocks, which finally hold the addresses of the data blocks (that’s a lot of blocks!). You can see now why ext2 supports large files.
Here’s an awesome graph to make things clearer (credits to h-online.com).
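
In code form, the overflow order described above can be sketched as follows. This is an illustrative Python sketch, not part of any real ext2 driver: `locate_block` and `max_file_blocks` are hypothetical names, and it assumes 4-byte block addresses, so a block of size B holds B / 4 pointers.

```python
DIRECT_POINTERS = 12
POINTER_SIZE = 4  # assumed size of one block address, in bytes


def pointers_per_block(block_size):
    return block_size // POINTER_SIZE


def locate_block(n, block_size):
    """Return which pointer level covers logical block n, plus the
    index to follow at each level of indirection."""
    p = pointers_per_block(block_size)
    if n < DIRECT_POINTERS:
        return ("direct", (n,))
    n -= DIRECT_POINTERS
    if n < p:
        return ("single", (n,))
    n -= p
    if n < p * p:
        return ("double", (n // p, n % p))
    n -= p * p
    if n < p ** 3:
        return ("triple", (n // (p * p), (n // p) % p, n % p))
    raise ValueError("block index beyond the triply indirect range")


def max_file_blocks(block_size):
    """Number of data blocks reachable through all 15 pointers."""
    p = pointers_per_block(block_size)
    return DIRECT_POINTERS + p + p ** 2 + p ** 3
```

For 1024-byte blocks each indirect block holds 256 addresses, so the pointer structure alone reaches 12 + 256 + 256² + 256³ = 16,843,020 blocks, roughly 16 GiB. Other fields in the inode can impose a lower limit in practice; this only counts what the pointers can address.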


That’s the simplest way a file system structure can be described. I left out some details to avoid confusion and to keep the article from getting too long.

To sum things up: to open file X, we look up its ID (its inode number) in the inode table, which gives us the inode that points to the file’s data. So we locate the inode, follow its data block pointers and read the data.
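
The "locate the inode" step boils down to a bit of arithmetic, sketched below in Python. The helper names are hypothetical, and the sketch assumes that inode numbers start at 1, that inodes are spread evenly across block groups, and that each inode record is 128 bytes (later ext2 revisions store the actual size in the superblock).

```python
def locate_inode(inode_number, inodes_per_group):
    """Return (block group, index within that group's inode table)."""
    if inode_number < 1:
        raise ValueError("inode numbers start at 1")
    return divmod(inode_number - 1, inodes_per_group)


def inode_byte_offset(inode_table_block, index, block_size, inode_size=128):
    """Byte offset of the inode on disk, given its group's inode table block."""
    return inode_table_block * block_size + index * inode_size
```

The group number selects an entry in the Block Group Descriptor Table, that entry's inode table address goes into `inode_byte_offset`, and the bytes found there are the inode whose pointers lead to the file's data.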

Unfortunately, I haven’t implemented that part yet, so stay tuned for the full implementation article! And don’t let your research spirit go lazy: there are a lot of details that I haven’t mentioned yet. Try giving more modern file systems like ext4 and Btrfs a look.