Linux Storage: Filesystems Introduction

2024-04-26

Linux Filesystem

In this article, our focus is files and filesystems. The UNIX philosophy of "everything is a file" continues to hold true in Linux. Although it's not an absolute rule, most resources in Linux are treated as files. These files can encompass a wide range of content, from the text of a school assignment to the humorous GIF you download (from a source you trust, of course).

Linux also extends this notion to other elements, including devices and pseudo-devices. For instance, the command echo "Hello modern Linux users" > /dev/pts/0 writes the message Hello modern Linux users to the terminal attached to the pseudo-terminal /dev/pts/0. While you might not typically think of these resources as files, you can interact with them using the same methods and tools you use for regular files. As another example, the kernel exposes runtime information about each process, such as its PID (process ID) or the binary used to execute it, as files under /proc.
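To make this concrete, here is a minimal sketch that treats a device and a process's kernel-exported state as ordinary files. It assumes a Linux system with /proc mounted; the variable names are purely illustrative.

```shell
# Writing to a device file uses the same redirection as writing to a regular file.
# /dev/null simply discards whatever is written to it.
echo "Hello modern Linux users" > /dev/null

# The kernel exposes per-process information under /proc/<PID>/.
# $$ expands to the PID of the current shell.
shell_pid=$$
pid_from_proc=$(awk '/^Pid:/ { print $2 }' "/proc/$shell_pid/status")
binary_path=$(readlink "/proc/$shell_pid/exe")

echo "PID reported by /proc: $pid_from_proc"
echo "Binary running this shell: $binary_path"
```

Nothing here needs a special API: awk and readlink are the same tools you would point at any other file or symbolic link.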

VFS - Basic Concepts

Let's first delve into more precise definitions of essential terms:

Drive

A drive refers to a physical block device, such as a hard disk drive (HDD) or a solid-state drive (SSD). Drives appear as device files: /dev/sda and /dev/sdb are the first and second disks handled by the kernel's SCSI layer (which also covers SATA and USB disks), while /dev/hda was the name used for IDE disks on older kernels. In virtual machines, drives can also be emulated.

For example:

$ lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 232.9G  0 disk 
├─sda1   8:1    0   500M  0 part /boot
├─sda2   8:2    0   1.5G  0 part [SWAP]
└─sda3   8:3    0 231.9G  0 part /

Partition

Partitions are logical divisions within drives, consisting of a set of storage sectors. For instance, you might decide to create two partitions on your HDD, resulting in /dev/sdb1 and /dev/sdb2.

$ fdisk -l 
Disk /dev/sda: 232.9 GiB, 250059350016 bytes, 488397168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt

Device         Start       End   Sectors   Size Type
/dev/sda1       2048   1026047   1024000   500M EFI System
/dev/sda2    1026048   3074047   2048000     1G Linux swap
/dev/sda3    3074048 488397134 485323087 231.9G Linux filesystem
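
The Sectors and Size columns follow directly from Start and End. As a worked check, using the /dev/sda1 row above and the 512-byte logical sector size reported by fdisk (the variable names are just for illustration):

```shell
start=2048
end=1026047
sector_size=512

# Both the first and the last sector belong to the partition, hence the +1.
sectors=$(( end - start + 1 ))
size_mib=$(( sectors * sector_size / 1024 / 1024 ))

echo "sectors=$sectors size=${size_mib}MiB"
```

This prints sectors=1024000 size=500MiB, matching the 500M that fdisk shows for /dev/sda1.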

Volume

A volume resembles a partition but offers greater flexibility, for example because it can be resized or span several drives. Like a partition, it is formatted with a specific filesystem. We will explore volumes in more depth in the section titled Logical Volume Manager.

$ lvdisplay
--- Logical volume ---
LV Path                /dev/vg01/lv01
LV Name                lv01
VG Name                vg01
LV UUID                hW4pH6-ZHvm-DcTg-M64Y-JePM-1PfX-CpNdqH
LV Write Access        read/write
LV Creation host, time myserver, 2023-01-01 12:00:00 +0000
LV Status              available

Super Block

When a filesystem is created, a dedicated section near its beginning, the superblock, is reserved for metadata about the filesystem itself. This metadata includes details such as the filesystem type, block size and counts, state, and the number of inodes per block group.

$ dumpe2fs /dev/sda3 | grep "superblock"
Primary superblock at 0, Group descriptors at 1-1
Backup superblock at 32768, Group descriptors at 32769-32769
Backup superblock at 98304, Group descriptors at 98305-98305

Inodes

In a filesystem, inodes store metadata about files, such as size, owner, location of the data blocks, timestamps, and permissions. Note that an inode stores neither the filename nor the file's contents. The contents live in data blocks that the inode points to, and filenames are kept in directories, which are essentially special files that map filenames to inode numbers.

$ ls -i filename.txt
12345 filename.txt
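
Because a directory entry is only a name pointing at an inode, two names can refer to the same inode. The following sketch shows this with a hard link; the file names original.txt and alias.txt are made up for the example, and everything happens in a temporary directory.

```shell
workdir=$(mktemp -d)
echo "some data" > "$workdir/original.txt"

# A hard link adds a second directory entry for the same inode.
ln "$workdir/original.txt" "$workdir/alias.txt"

# ls -i prints the inode number in front of the name.
inode_original=$(ls -i "$workdir/original.txt" | awk '{ print $1 }')
inode_alias=$(ls -i "$workdir/alias.txt" | awk '{ print $1 }')

echo "original.txt -> inode $inode_original"
echo "alias.txt    -> inode $inode_alias"

rm -r "$workdir"
```

Both lines report the same inode number, even though the names differ.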

Below is a table that includes some common filesystem commands:

VFS - Linux Virtual Filesystem

Linux employs an abstraction known as the Virtual File System (VFS) to grant file-like access to various types of resources, whether they are stored in memory, locally attached, or accessible over a network.

The fundamental concept behind VFS is to introduce a layer of separation between clients (making system calls) and the individual filesystems responsible for executing operations on specific devices or resources. In essence, VFS decouples the standard operations like open, read, and seek from the intricate implementation details.

Within the kernel, VFS serves as an abstraction layer that offers clients a consistent method for interacting with resources, all built around the file concept. In the Linux environment, a file is devoid of any prescribed structure; it merely represents a sequence of bytes. The interpretation of these bytes is left entirely to the discretion of the client, allowing for flexibility in how the data is utilized.

VFS accomplishes this abstraction through the following mechanisms:

Common Interface: VFS defines a standard set of system calls and data structures that applications and system components use to interact with files and directories. These include operations like open, read, write, close, stat, and many others. Regardless of the underlying filesystem, applications can use these common interfaces to perform file-related tasks.
File Abstraction: VFS abstracts the concept of a "file." In Linux, a file is not limited to regular files; it can represent various resources, including regular files, directories, devices, sockets, and more. VFS treats all of these resources as files, providing a consistent way to access and manipulate them.
Filesystem Drivers: Each specific filesystem, whether it's ext4, NTFS, FAT, or any other, has its own filesystem driver that interfaces with VFS. These drivers implement the low-level details of how data is stored, organized, and retrieved on a particular filesystem. VFS acts as a bridge between these drivers and the applications making system calls.
Filesystem Registration: Filesystem drivers are either built into the kernel or loaded as modules when they are needed. Each driver registers itself with VFS, informing it of the filesystem type it implements and the operations it supports. The registered types are listed in /proc/filesystems.
Mounting: VFS allows multiple filesystems to be "mounted" at different mount points within the directory hierarchy. When a filesystem is mounted, VFS hands the request to the driver registered for that filesystem type, and the directory tree below the mount point is then served by that filesystem. This enables Linux to use a variety of filesystem types simultaneously.

In summary, Linux's VFS abstracts access to different filesystems by presenting a common interface to applications and system components while delegating the actual filesystem-specific operations to the appropriate filesystem drivers. This abstraction allows Linux to support a wide range of filesystems seamlessly, making it a versatile and flexible operating system.
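
A small sketch can illustrate two of these points: which filesystem types are currently registered with the kernel, and how one and the same read operation works regardless of which filesystem backs the file. It assumes /proc is mounted; the temporary file and variable names are illustrative.

```shell
# Filesystem types currently registered with the kernel.
# Entries marked "nodev" are not backed by a block device.
registered_count=$(wc -l < /proc/filesystems)
echo "Registered filesystem types: $registered_count"

# The same read works on a regular file and on a kernel-generated procfs file.
tmpfile=$(mktemp)
echo "hello from a regular file" > "$tmpfile"
read -r file_line < "$tmpfile"
read -r kernel_line < /proc/version

echo "regular file: $file_line"
echo "procfs file:  $kernel_line"
rm -f "$tmpfile"
```

The shell does not know or care that /proc/version is generated on the fly by the kernel; VFS routes the read to the right driver.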

Common Filesystem Operations

Creating filesystems

Before you can use a filesystem, you have to create it, that is, set up its on-disk structures on a partition or volume. Once you have decided on the target device and the filesystem type, you can use the mkfs command to create the filesystem.

mkfs stands for "make filesystem." It is a command-line tool that is available on most Linux distributions. The primary purpose of mkfs is to format a block device, such as a partition or a volume, into a specific filesystem format, allowing it to store and manage files and directories. To create a filesystem using the mkfs command, you need to specify the filesystem type you want to create, the target device (partition or volume), and any optional parameters. The general syntax of the mkfs command is as follows:

$ mkfs -t filesystem_type device

For example:

$ mkfs -t ext4 /dev/sdb1
...
mke2fs 1.45.5 (07-Jan-2020)
Creating filesystem with 52428800 4k blocks and 13107200 inodes
Filesystem UUID: 9c7acdc2-4c19-4c15-bf5c-6c21b74a0b23
Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912

Allocating group tables: done
Writing inode tables: done
Creating journal (262144 blocks): done
Writing superblocks and filesystem accounting information: done
...

The above command creates an ext4 filesystem on the partition /dev/sdb1. Once you have created the filesystem with mkfs, you can make it available in the filesystem tree by mounting it.
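
If you want to experiment without touching a real partition, one option is to create the filesystem inside a regular file. The sketch below assumes that the e2fsprogs tools (mkfs.ext4 and dumpe2fs) are installed and on your PATH, and skips the experiment otherwise; the 16 MiB image size is an arbitrary choice.

```shell
# Scratch image file; nothing outside this temporary file is modified.
image=$(mktemp)
fs_created=no

if command -v mkfs.ext4 >/dev/null 2>&1 && command -v dumpe2fs >/dev/null 2>&1; then
    truncate -s 16M "$image"
    # -q: quiet, -F: proceed even though the target is not a block device
    if mkfs.ext4 -q -F "$image"; then
        fs_created=yes
        # -h prints only the superblock information
        dumpe2fs -h "$image" 2>/dev/null |
            grep -E '^(Filesystem UUID|Block size|Inode count):'
    fi
else
    echo "e2fsprogs not found; skipping"
fi

rm -f "$image"
```

No root privileges are needed here, because the target is an ordinary file owned by you rather than a device.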

Mounting filesystems

Mounting a filesystem involves connecting it to the existing filesystem hierarchy, which begins at the root (/) directory. You can achieve this by using the mount command, which requires two primary inputs:

The device you want to attach
The location within the filesystem hierarchy where it should be attached.

$ mount [options] device directory

You can also pass options with -o, for example -o ro for read-only mode, or create bind mounts using --bind to make a directory visible at a second location in the filesystem tree. We will explore bind mounts further in the context of containers.

$ mount /dev/sdb1 /mnt/data

In this example, we are mounting the device /dev/sdb1 on the directory /mnt/data, which must already exist. Because no -o option is given, the default mount options apply.

To mount a network share:

$ mount -t cifs //server/share /mnt/networkshare -o username=myuser,password=mypassword

Here, we are mounting a network share using the CIFS (Common Internet File System) filesystem type. We specify the server and share location as //server/share and the mount point as /mnt/networkshare. Additionally, we provide authentication details using the -o option.

Check mounted filesystems

$ mount -t ext4,tmpfs
tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=797596k,mode=755)
/dev/mapper/elementary--vg-root on / type ext4 (rw,relatime,errors=remount-ro)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
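
The same information is available as a file, in keeping with the theme of this article. As a rough sketch, the helper below looks up the filesystem type for a given mount point in /proc/self/mounts, the kernel's mount table for the current process; the function name fstype_of is made up for this example.

```shell
# Print the filesystem type mounted at a given mount point.
# Fields in /proc/self/mounts: device, mount point, type, options, dump, pass.
# If several entries share a mount point, the last one listed wins.
fstype_of() {
    awk -v target="$1" \
        '$2 == target { type = $3 } END { if (type != "") print type }' \
        /proc/self/mounts
}

echo "/     is mounted as: $(fstype_of /)"
echo "/proc is mounted as: $(fstype_of /proc)"
```

For a path that is not a mount point, the function prints nothing.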

Moreover, mounts established with the mount command are temporary: they last only until the filesystem is unmounted or the system is rebooted. To make a mount persist across reboots, you must configure it in the /etc/fstab file. For example:

/etc/fstab

# /etc/fstab: static file system information.
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
/dev/sda1       /               ext4    defaults        0       1
/dev/sdb1       /data           ext4    rw,user         0       2

<file system>: The block device or remote filesystem to be mounted. It can also be identified by UUID or label (UUID=<uuid> or LABEL=<label>).
<mount point>: The directory where the filesystem will be mounted.
<type>: The type of filesystem (e.g., ext4, iso9660).
<options>: Mount options (e.g., defaults, rw for read-write, ro for read-only).
<dump>: Used by the dump command to determine whether to back up the filesystem (usually set to 0).
<pass>: Used by the fsck command to determine the order of filesystem checks during boot (usually 1 for the root filesystem, 2 for others, and 0 to skip the check).
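
Since the six fields are separated by whitespace, an fstab entry is easy to take apart in a script. Here is a minimal sketch using the /data entry from the example above; the variable names are arbitrary.

```shell
# Sample entry taken from the fstab example.
entry='/dev/sdb1       /data           ext4    rw,user         0       2'

# read splits the line on whitespace, one variable per fstab field.
read -r fs_spec mount_point fs_type options dump_flag pass_no <<EOF
$entry
EOF

echo "device:      $fs_spec"
echo "mount point: $mount_point"
echo "type:        $fs_type"
echo "options:     $options"
echo "dump:        $dump_flag"
echo "fsck pass:   $pass_no"
```

To process a real /etc/fstab this way, you would additionally have to skip comment lines starting with # and empty lines.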

Common Filesystem Layouts

Linux has a common filesystem layout that organizes files and directories in a structured manner. Understanding this layout is essential for navigating and managing a Linux system effectively. Here's an explanation of some common directories in the Linux filesystem:

The above directories, along with their respective purposes, form the core of the Linux filesystem layout.
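
As a quick way to explore the layout on a given machine, the sketch below prints a few well-known top-level directories and whether they exist. The selection and the one-line descriptions are an assumption based on the Filesystem Hierarchy Standard, not an exhaustive list.

```shell
# Count and print those directories from the list that exist on this system.
found=0
while read -r dir purpose; do
    if [ -d "$dir" ]; then
        found=$(( found + 1 ))
        printf '%-6s %s\n' "$dir" "$purpose"
    fi
done <<EOF
/etc host-specific configuration files
/home user home directories
/usr shareable, read-only programs and libraries
/var variable data such as logs and spools
/boot boot loader files and kernel images
EOF

echo "$found of 5 directories found"
```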

Linux Pseudo Filesystems

Pseudo filesystems, often referred to as virtual filesystems, are a fundamental part of the Linux kernel that allows access to kernel data structures, hardware devices, and system-related information as if they were regular files and directories. These filesystems are essential for exposing critical system data to users, processes, and system utilities.

Below are some common pseudo filesystems in Linux:

/proc - Process Information

The /proc pseudo filesystem provides a dynamic view of running processes and kernel parameters. It allows users to access detailed information about processes, CPU and memory usage, system configurations, and more. Each process is represented by a directory with its PID (Process ID).

$ ls /proc
1      1469   1580   16606  19   39   762        crypto         kpagecgroup    slabinfo
10     14698  15848  1664   2    4    763        devices        kpagecount     softirqs
11     1470   15851  1675   20   40   8          diskstats      kpageflags     stat
12     14701  15852  16752  21   41   813        dma            latency_stats  swaps
120    1482   15873  16756  23   42   8394       driver         loadavg        sys
122    1484   15921  16757  257  43   8409       dynamic_debug  locks          sysrq-trigger
12220  1485   15976  168    26   5    8496       execdomains    mdstat         sysvipc
124    1487   15985  1685   27   51   8506       filesystems    meminfo        thread-self
1244   1489   16     1712   28   6    8619       fs             misc           timer_list
1278   1492   16053  1725   29   69   8632       interrupts     modules        tty
13     1497   16319  1731   3    70   8704       iomem          mounts         uptime
1314   1498   16523  1732   30   73   acpi       ioports        mtrr           version
132    15     16571  1733   32   755  buddyinfo  irq            net            vmallocinfo
133    1502   16572  1739   33   756  bus        kallsyms       pagetypeinfo   vmstat
1370   1506   16573  1778   34   757  cgroups    kcore          partitions     zoneinfo
14     1507   16575  178    35   758  cmdline    key-users      pressure
1401   1521   16578  179    36   759  consoles   keys           schedstat
14338  1549   16579  18     37   761  cpuinfo    kmsg           self
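
Beyond listing the directory, you can read individual entries with ordinary text tools. The following sketch pulls a few values out of /proc; it assumes the usual layout of /proc/meminfo and /proc/uptime, and the variable names are illustrative.

```shell
# Total memory in kB, as reported by the kernel.
mem_total_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)

# /proc/uptime holds two numbers: seconds since boot and accumulated idle time.
read -r uptime_seconds idle_seconds < /proc/uptime

# Each numeric directory under /proc corresponds to one running process.
process_count=$(ls /proc | grep -c '^[0-9][0-9]*$')

echo "MemTotal:  ${mem_total_kb} kB"
echo "Uptime:    ${uptime_seconds} s"
echo "Processes: ${process_count}"
```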

/sys - Kernel Parameters

The /sys pseudo filesystem (sysfs) exposes the kernel's view of devices, drivers, and loaded modules. It provides device and system-related information, including hardware settings, power management, and kernel module parameters, some of which can be changed by writing to the corresponding files.

$ ls /sys
block  bus  class  dev  devices  firmware  fs  hypervisor  kernel  module  power
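
As an example of reading device attributes from /sys, the sketch below prints the size of every block device the kernel knows about. It relies on the size attribute under /sys/block/<device>/ being expressed in 512-byte units; the helper name sectors_to_bytes is made up for this example.

```shell
# Convert a count of 512-byte units into bytes.
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

for dev in /sys/block/*; do
    # Skip entries without a readable size attribute (or an unmatched glob).
    [ -r "$dev/size" ] || continue
    read -r sectors < "$dev/size"
    echo "$(basename "$dev"): $(sectors_to_bytes "$sectors") bytes"
done
```

On a machine without block devices, such as some containers, the loop simply prints nothing.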

/dev - Device Files

Strictly speaking, /dev is a directory rather than a filesystem type, although on modern systems it is usually backed by the in-memory devtmpfs filesystem. It contains special device files that serve as interfaces to hardware devices and pseudo-devices. These files allow users and applications to interact with devices as if they were files, making /dev an essential part of Linux device management.

$ ls /dev
autofs           loop-control  nvram     tty15  tty32  tty5   ttyS0        vcsa4
block            loop0         ptmx      tty16  tty33  tty50  ttyS1        vcsa5
btrfs-control    loop1         pts       tty17  tty34  tty51  ttyS2        vcsa6
char             loop2         random    tty18  tty35  tty52  ttyS3        vcsu
console          loop3         rtc       tty19  tty36  tty53  uhid         vcsu1
core             loop4         rtc0      tty2   tty37  tty54  uinput       vcsu2
cpu              loop5         shm       tty20  tty38  tty55  urandom      vcsu3
cpu_dma_latency  loop6         snapshot  tty21  tty39  tty56  userfaultfd  vcsu4
cuse             loop7         stderr    tty22  tty4   tty57  vcs          vcsu5
disk             mapper        stdin     tty23  tty40  tty58  vcs1         vcsu6
fd               mqueue        stdout    tty24  tty41  tty59  vcs2         vfio
full             net           tty       tty25  tty42  tty6   vcs3         vhost-net
fuse             ng0n1         tty0      tty26  tty43  tty60  vcs4         vhost-vsock
hpet             null          tty1      tty27  tty44  tty61  vcs5         xvda
hugepages        nvme0         tty10     tty28  tty45  tty62  vcs6         xvda1
initctl          nvme0n1       tty11     tty29  tty46  tty63  vcsa         xvda127
input            nvme0n1p1     tty12     tty3   tty47  tty7   vcsa1        xvda128
kmsg             nvme0n1p127   tty13     tty30  tty48  tty8   vcsa2        zero
log              nvme0n1p128   tty14     tty31  tty49  tty9   vcsa3
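
A few of these device files are safe to play with and show how device access looks from a script. This is a small sketch using /dev/null and /dev/zero; the variable names are illustrative.

```shell
# -c tests for a character device (-b would test for a block device).
if [ -c /dev/null ]; then
    null_kind="character device"
else
    null_kind="other"
fi
echo "/dev/null is a $null_kind"

# /dev/zero yields as many zero bytes as are requested; here, one block of 8.
zero_bytes=$(dd if=/dev/zero bs=8 count=1 2>/dev/null | wc -c)

# Reading /dev/null returns end-of-file immediately.
null_bytes=$(wc -c < /dev/null)

echo "read $zero_bytes bytes from /dev/zero and $null_bytes bytes from /dev/null"
```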

/run - Runtime Data

The /run directory stores runtime data and system state information. System services and applications commonly use it for PID files, sockets, and other runtime data. It is usually an in-memory tmpfs, so its contents do not persist across reboots.

$ ls /run
acpid.socket          chrony       dbus           irqbalance  mount          sshd.pid    user
agetty.reload         chrony.d     faillock       lock        rpcbind        sssd.pid    utmp
amazon-ec2-net-utils  cloud-init   gssproxy.pid   log         screen         sudo
atd.pid               console      gssproxy.sock  lsm         sepermit       systemd
auditd.pid            credentials  initctl        motd        setrans        tmpfiles.d
blkid                 cryptsetup   initramfs      motd.d      sm-notify.pid  udev

/tmp - Temporary Files

The /tmp directory is used for temporary files created by various processes. It is typically cleared upon system reboot and provides a convenient location for programs to create and manage temporary data.

$ ls /tmp/
pyright-15985-9EBarp4ZNzX6
python-languageserver-cancellation
systemd-private-1d0916b7fbfb43c4845b381cb85e2885-chronyd.service-HMONVW
systemd-private-1d0916b7fbfb43c4845b381cb85e2885-dbus-broker.service-U5nshu
systemd-private-1d0916b7fbfb43c4845b381cb85e2885-policy-routes@ens5.service-T1MlC0
systemd-private-1d0916b7fbfb43c4845b381cb85e2885-systemd-logind.service-jE798m
systemd-private-1d0916b7fbfb43c4845b381cb85e2885-systemd-resolved.service-6zqxRV

Linux pseudo filesystems play a vital role in providing transparency and accessibility to kernel data and system information. They enable users and system administrators to interact with the kernel and hardware devices through familiar file and directory structures. Understanding and utilizing these virtual filesystems is essential for efficient system monitoring, debugging, and configuration on Linux-based systems.
