Can anyone explain to me why it was so important to break the Linux file system?
I assume it was, since literally every single distribution did it, but I don’t get why it was so important that things had to become incompatible unless you know what you are doing.
Because disks were too small back then. They aren’t anymore, but we still do it for… the sake of tradition, I guess.
The structure is defined by the Filesystem Hierarchy Standard 3.0, which can be implemented differently depending on the distro. /bin is usually a symlink pointing to /usr/bin.
See also (if you’re curious) two distros that purposefully don’t follow the FHS for one reason or another: GoboLinux and NixOS (there are probably others)
There’s also https://uapi-group.org/specifications/specs/linux_file_system_hierarchy/ nowadays, which aims to build on the FHS.
I love how on the first page of chapter 2 they specify the distinction of files into two classes: shareable and variable. Then they specify that files which differ in either of these two properties should belong to a different root folder. Then they go ahead and give an example which makes clear that /var should contain both shareable and non-shareable files. Good job with your 4 categories; I guess that’s why nobody really understands what /var is…

Is /var really such a mystery? I always understood it as the non-volatile system directory that can be written to: log files, databases, caches, etc. /var/tmp is somewhat weird, because a non-volatile temporary folder is, to me, just a cache, and /var/lib is somewhat weirdly named because it doesn’t hold what I’d usually call libraries.
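To make that concrete, here is an illustrative (and distro-dependent) slice of a typical /var:

```
/var/log/      # log files (syslog, the journal, application logs)
/var/cache/    # regenerable cached data (e.g. package manager caches)
/var/lib/      # persistent state for services ("lib" is historical, not libraries)
/var/tmp/      # temporary files meant to survive a reboot, unlike /tmp
/var/spool/    # queued work: mail, cron jobs, print jobs
```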
Indeed, but the organisation is quite a mess, and certain things don’t really feel like they should be together. Why should /var/www and /var/lock be in the same place?

Well, /var/www is in fact not part of the FHS, not even as an optional entry… it doesn’t exist on my machines either. I think the better choice would be /srv/www, which is given as an example at https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s17.html

It’s been a while since I worked on webservers, but all of the ones I worked on, be it Apache or Nginx, had their domains in /var/www. I would imagine /srv to be a much better option, but I’ve never seen it done that way.

Well, at least for nginx, you can specify the root (or alias, if required) directive; to me, it makes very little sense to rely on defaults. You need to specify your servers / virtual hosts anyway, so you might as well make the configuration more self-documenting…

The original reasoning for having all those directories was that some nerds in a university/lab kept running out of HD space and had to keep changing the file system layout to spread everything out across an increasing number of drives.
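For instance, a minimal nginx server block with an explicit root might look like this sketch (the domain and paths are made up for illustration, not defaults from any distro):

```nginx
server {
    listen 80;
    server_name example.org;          # hypothetical domain

    # Serve the site from /srv instead of relying on a compiled-in default:
    root /srv/www/example.org;

    location /downloads/ {
        # alias maps the URL prefix to a different directory entirely,
        # whereas root would append the full URI to the path
        alias /srv/files/public/;
    }
}
```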
Noobs should’ve just used ZFS
/home because you want to save the user files if you need to reinstall.
/var and /tmp because /var holds log files and /tmp holds temporary files, either of which can easily eat all your disk space. So it’s best to let them fill up a separate partition instead, preventing your system from locking up because the root filesystem is full.
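A sketch of what that looks like in /etc/fstab, keeping /home, /var, and /tmp on their own partitions (device names and filesystem choices here are placeholders, not a recommendation):

```
# hypothetical /etc/fstab layout — device names are illustrative
/dev/sda1  /      ext4  defaults  0 1
/dev/sda2  /home  ext4  defaults  0 2
/dev/sda3  /var   ext4  defaults  0 2
/dev/sda4  /tmp   ext4  defaults  0 2
```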
/usr and /bin… this I don’t know
Not just log files, but any variable/dynamic data used by packages installed on the system: caches, databases (like /var/lib/mysql for MySQL), Docker volumes, etc.
Traditionally, /var and /home are parts of a Linux server that use the most disk space, which is why they used to almost always be separate partitions.
Also /tmp is often a RAM disk (tmpfs mount) these days.
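That tmpfs mount is typically a single fstab line like the sketch below (systemd distros often use a tmp.mount unit instead; the size limit here is an arbitrary example):

```
tmpfs  /tmp  tmpfs  defaults,size=2G,mode=1777  0 0
```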
And on immutable distros, it’s one of the few writable areas
True.
I would think putting /bin and /lib on the fastest thing possible would be nice 🤷
I do not think that matters so much. I guess it just affects the speed at which you load your software into RAM, but once it is loaded, the difference in running the software should be pretty small. That’s unless you call a command thousands of times per second, in which case it may improve performance.

The fastest drive should generally be reserved for storing software input and output, since that’s where drive speed affects execution time the most: if your software does a blocking read, that read completes faster, so the software resumes running sooner. Moreover, software input in general tends to be larger than the binaries, unless we’re talking about an Electron-based text editor.
Could you not just use subdirectories?
They are subdirectories?!
Ok, technically, but why couldn’t we keep a stable, explicit hierarchy without breaking compatibility or relying on symlinks and assumptions?
In other words:
Why not /system/bin, /system/lib, /apps/bin.
Or why not keep /bin as a real directory forever
Or why force /usr to be mandatory so early?
Because someone in the 1970s–80s (who was smarter than we are) decided that single-user-mode files should be located in the root and multi-user files should be located in /usr. Somebody else (who is also smarter than we are) decided that it was a stupid-ass solution, because most of those files are identical, and it’s easier to just symlink them to the multi-user directories (since nobody runs single-user systems anymore) than to make sure that every search path contains the correct versions of the files, while also preserving backwards compatibility with systems that expect to run in single-user mode. Some distros, like Debian, also have separate executables for unprivileged sessions (/bin and /usr/bin) and privileged sessions (i.e. root: /sbin and /usr/sbin). Other distros, like Arch, symlink all of those directories to /usr/bin to preserve compatibility with programs that refer to executables using full paths.

But for most of us young whippersnappers, the most important reason is that it’s always been done like this, and changing it now would make a lot of developers and admins very unhappy, and a lot of software very broken.
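As a small illustration of that symlink scheme, here is a Python sketch; the is_usr_merged helper and its heuristic are my own invention, not from any spec. It checks whether a given root looks usr-merged, i.e. whether its bin is a symlink resolving into usr/bin, and demonstrates the check against a fake root built the way merged distros lay things out:

```python
import os
import tempfile

def is_usr_merged(root="/"):
    """Heuristic: on a merged-/usr system, <root>/bin is a symlink
    whose target resolves into <root>/usr/bin."""
    bin_path = os.path.join(root, "bin")
    return (os.path.islink(bin_path)
            and os.path.realpath(bin_path).rstrip("/").endswith("usr/bin"))

# Build a tiny fake root that mimics the Arch-style merge:
fake = tempfile.mkdtemp()
os.makedirs(os.path.join(fake, "usr", "bin"))
# a relative symlink, as real distros use: bin -> usr/bin
os.symlink("usr/bin", os.path.join(fake, "bin"))

print(is_usr_merged(fake))   # → True
```

On a merged system both /bin/ls and /usr/bin/ls resolve to the same file, which is exactly why hardcoded full paths keep working either way.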
The only thing better than perfect is standardized.
The move to storing everything in /usr/bin rather than /bin etc.? I think it actually makes things more compatible, since if you’re a program looking for something, you don’t need to care whether the specific distro decided it should go in /usr/bin or /bin.