Filesystems
Data model
inode
Index nodes, or inodes, are structures describing each file, returned by the stat() system call. A fixed number of these structures are allocated at filesystem creation time in most cases, with XFS being a notable exception. Note that filenames are stored elsewhere, either in directory data or a filesystem-managed B-tree.
At a high level, inodes contain the following properties:
- inode number.
- Parent device.
- Mode: file type (socket, symbolic link, regular file, directory, block device, character device, FIFO), special bits (setuid, setgid, sticky), and permissions for user, group, and other.
- uid and gid define the owning user and group.
- Device ID, if the file represents a device.
- acl defines POSIX ACLs.
- default_acl defines default POSIX ACLs.
- Number of allocated blocks.
- size of the file in bytes.
- blocksize denotes the preferred I/O block size.
- atime contains the last access time.
- mtime contains the last data modification time.
- ctime contains the last inode modification time.
- Number of hard links.
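unlink() operations remove a directory entry and decrement the inode's hard link count. The inode and its data blocks are freed only once the count reaches zero and no process still holds the file open; the freed blocks are generally not erased.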
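A minimal sketch of reading several of these properties with Python's os.stat, and of how the hard link count changes when a second name is added and one name is later removed with unlink. The file names and the inspect_inode helper are made up for illustration, and the sketch assumes a Unix-like system whose temporary directory supports hard links.

```python
import os
import tempfile

def inspect_inode(path):
    """Return a few inode properties for path, as reported by stat()."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,
        "device": st.st_dev,
        "mode": oct(st.st_mode),
        "uid": st.st_uid,
        "gid": st.st_gid,
        "size": st.st_size,
        "blocks": st.st_blocks,       # allocated blocks, in 512-byte units
        "blocksize": st.st_blksize,   # preferred I/O block size
        "links": st.st_nlink,
    }

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "original.txt")
    alias = os.path.join(tmp, "alias.txt")
    with open(original, "w") as fh:
        fh.write("hello")

    before = inspect_inode(original)   # one name points at the inode
    os.link(original, alias)           # add a second name (hard link)
    linked = inspect_inode(original)
    os.unlink(original)                # remove the first name only
    after = inspect_inode(alias)

    # The data is still reachable through the remaining name.
    with open(alias) as fh:
        surviving_data = fh.read()
```

The inode number stays the same throughout; only the set of names referring to it changes.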
dentry
Dentries, or directory entries, relate inode numbers to filenames. They're also used as boundaries for directory caching and filesystem traversal.
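As a rough illustration of the name-to-inode relationship, the sketch below lists a directory with os.scandir and records the inode number behind each name. The list_dentries helper and the file names are hypothetical; the example assumes a Unix-like system where hard links are available.

```python
import os
import tempfile

def list_dentries(directory):
    """Map each filename in directory to the inode number it refers to."""
    with os.scandir(directory) as entries:
        return {entry.name: entry.inode() for entry in entries}

with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "first.txt")
    second = os.path.join(tmp, "second.txt")
    open(first, "w").close()
    os.link(first, second)   # second directory entry for the same inode
    names_to_inodes = list_dentries(tmp)
```

Both names map to the same inode number, which is why the filename is not a property of the inode itself.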
File descriptors
Superblock
Superblocks are crucial data structures that contain metadata about a filesystem. Their loss prevents use of the filesystem, so filesystem drivers usually replicate them across the volume to account for damage.
They comprise:
- Filesystem size
- Block size
- Empty and filled block bitmap
- Size and location of inode table
- Disk block map
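Userspace rarely reads a superblock directly, but some of the same figures are exposed through statvfs(). The sketch below is one way to view them from Python; the filesystem_summary helper is a made-up name, and the values come from whatever the filesystem driver chooses to report (some filesystems report zero for inode counts).

```python
import os

def filesystem_summary(path="/"):
    """Collect filesystem-wide figures for the filesystem containing path."""
    vfs = os.statvfs(path)
    return {
        "block_size": vfs.f_frsize,      # fundamental block size in bytes
        "total_blocks": vfs.f_blocks,    # size of the filesystem in blocks
        "free_blocks": vfs.f_bfree,
        "total_inodes": vfs.f_files,
        "free_inodes": vfs.f_ffree,
        "total_bytes": vfs.f_frsize * vfs.f_blocks,
    }

summary = filesystem_summary("/")
```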
Mounting
Filesystems are mounted to locations in a single namespace, below the root filesystem (/).
Common flags:
- remount allows remounting an existing mount with new options.
- ro and rw determine whether the filesystem is writable or not.
- exec and noexec control whether binaries are executable when they have the +x mode.
- async allows I/O to be performed asynchronously.
- auto and noauto set whether the filesystem should be mounted when executing mount -a.
- defaults sets rw, suid, dev, exec, auto, nouser, and async.
- suid and nosuid allow or prevent use of the suid and sgid bits.
- user and nouser allow or prevent non-root users bringing up the mount.
- loop mounts images as loop devices to allow accessing their filesystems.
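To make the interaction between defaults and the individual flags concrete, here is a small sketch that expands an option string the way the list above describes, with later options overriding earlier ones. It is a simplified model rather than what mount itself does: resolve_options and the state keys are hypothetical names, and side effects such as user implying other restrictions are ignored.

```python
# Each flag maps to the (setting, value) it controls.
FLAGS = {
    "rw": ("writable", True),  "ro": ("writable", False),
    "suid": ("suid", True),    "nosuid": ("suid", False),
    "dev": ("dev", True),      "nodev": ("dev", False),
    "exec": ("exec", True),    "noexec": ("exec", False),
    "auto": ("auto", True),    "noauto": ("auto", False),
    "user": ("user", True),    "nouser": ("user", False),
    "async": ("async", True),  "sync": ("async", False),
}

# Expansion of "defaults", as listed in the text.
DEFAULTS = ["rw", "suid", "dev", "exec", "auto", "nouser", "async"]

def resolve_options(option_string):
    """Return (state, extras) after applying options left to right."""
    state = {}
    extras = []   # options this sketch does not model, e.g. "loop"
    for opt in option_string.split(","):
        opt = opt.strip()
        if not opt:
            continue
        expanded = DEFAULTS if opt == "defaults" else [opt]
        for flag in expanded:
            if flag in FLAGS:
                key, value = FLAGS[flag]
                state[key] = value   # later flags override earlier ones
            else:
                extras.append(flag)
    return state, extras

state, extras = resolve_options("defaults,noexec,ro,loop")
```

Here defaults first enables exec and rw, and the later noexec and ro then switch them off, while loop is passed through untouched.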
ext4
ext4 uses 48-bit block addressing, which with a 4KiB block size gives a maximum filesystem size of 1EiB and a maximum file size of 16TiB.
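The limits follow from simple arithmetic, sketched below. The 48-bit figure and the 4KiB block size come from the text; the 32-bit logical block number used for the per-file limit is an assumption about how ext4 extents index blocks within a file.

```python
BLOCK_SIZE = 4 * 1024        # 4 KiB blocks
PHYSICAL_BLOCK_BITS = 48     # 48-bit block addressing across the volume
LOGICAL_BLOCK_BITS = 32      # assumption: 32-bit block index within one file

EIB = 2 ** 60
TIB = 2 ** 40

# 2^48 blocks * 2^12 bytes per block = 2^60 bytes
max_fs_bytes = (2 ** PHYSICAL_BLOCK_BITS) * BLOCK_SIZE

# 2^32 blocks * 2^12 bytes per block = 2^44 bytes
max_file_bytes = (2 ** LOGICAL_BLOCK_BITS) * BLOCK_SIZE

print(max_fs_bytes // EIB, "EiB filesystem;", max_file_bytes // TIB, "TiB file")
```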
It offers three journaling levels, configured with the data mount option:
- journal offers the lowest risk, writing both metadata and data to the journal before committing changes to the filesystem. This ensures consistency at the cost of performance.
- ordered writes only metadata to the journal, but writes the data directly to the filesystem before the journal is committed. After a crash, transactions left incomplete in the journal are discarded, so metadata never refers to data that was not written.
- writeback removes the ordering constraint, allowing the changes in the journal to be committed to the filesystem before the data is written.
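One way to check which mode a mounted ext4 filesystem uses is to look at its mount options. The sketch below parses text in the /proc/mounts format; the ext4_data_modes helper and the sample lines are made up for illustration, and a missing data= option is treated as ordered on the assumption that it is ext4's default.

```python
def ext4_data_modes(mounts_text):
    """Map each ext4 mount point to its data= journaling mode."""
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] != "ext4":
            continue
        mount_point, options = fields[1], fields[3].split(",")
        mode = "ordered"   # assumed default when data= is not shown
        for opt in options:
            if opt.startswith("data="):
                mode = opt[len("data="):]
        modes[mount_point] = mode
    return modes

# Made-up sample in /proc/mounts format, not taken from a real system.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /srv/db ext4 rw,noatime,data=journal 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev 0 0
"""

modes = ext4_data_modes(SAMPLE)
```

On a live system the same function could be given the contents of /proc/self/mounts instead of the sample string.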