Every single time I think of restructuring my homelab storage. What do you use for storage engines and how does it benefit you?

rcmd@lemmy.world · 27 days ago

Every single time I think of restructuring my homelab storage. What do you use for storage engines and how does it benefit you?

panda_abyss@lemmy.ca · edit-2 27 days ago

I set up garage, which works fine.

The advantage of an S3-style layer is its simplicity and integration with apps.

I also use it so I can run AI agents that have zero access to a disk based system

Possibly linux@lemmy.zip · 26 days ago

Just out of curiosity, what are you using S3 for?

melfie@lemy.lol · 26 days ago

I was considering MinIO, then evaluated Garage, then decided it wasn’t worth the trouble since a lot of the things I host don’t even natively support object storage. I do use LFS with Forgejo and it would’ve made sense there, and maybe Jellyfin supporting object storage would be a tipping point.

Pokexpert30 🌓@jlai.lu · 27 days ago

Longhorn is goated to manage volume availability across geographically distant nodes.

If you’re running a one-node show, hostpath will do fine (or just don’t kubernetes at all, tbh)

Dalraz@lemmy.ca · 27 days ago

This has been my journey.

I started with pure docker and hostpath on an Ubuntu server. This worked well for me for many years and is good for most people.

Later I really wanted to learn k8s, so I built a 3-node cluster with NFS-managed PVCs for storage; this was fantastic for learning. I enjoyed this for 3-plus years. This is all on top of Proxmox and ZFS.

About 8 months ago I decided I was done with my k8s learning and wanted more simplicity in my life. I created an LXC container for Docker and slowly migrated all my workloads back to Docker and hostpath, this time backed by my mirrored ZFS file system.

I guess my point is what are you hoping to get out of your journey and then tailor your solution to that.

Also I do recommend using proxmox and zfs.

Devjavu@lemmy.dbzer0.com · edit-2 26 days ago

A single ssd with whatever formatting came with it, along with a webdav frontend I made myself. Very high security (confidentiality) actually, since I check for client side cert, user auth, biometrics (that’s plural), behavior recognition through a custom typing website and hardware token, but the integrity could use some help. And I’m painfully aware that someone could just steal my session.

I love security.
You’ll never get my duck nudes.

In reality I just had a fun night

Devjavu@lemmy.dbzer0.com · 26 days ago

Shit I forgot to install a firewall.

InnerScientist@lemmy.world · 25 days ago

No worries, I installed it for you.

Devjavu@lemmy.dbzer0.com · 25 days ago

Pfew, close one.

Wait a minute.

entropicdrift@lemmy.sdf.org · 27 days ago

I just use mergerfs and SnapRAID so I can scale dynamically when I can afford new drives. Granted it’s all fully replaceable media files on my end, so I’m not obsessed with data integrity.

rcmd@lemmy.world · 27 days ago

Well, this path seems to be the most appropriate for what I am aiming for.

What’s more, both mergerfs and SnapRAID are available out of the box in the latest stable Debian release.

Thanks for pointing me at it!

signalsayge@infosec.pub · 26 days ago

This is what I’m doing as well. The nice thing about it is that it supports different sized drives in the same mergerfs mount and with snapraid, you just need to make sure one of your biggest drives is the parity drive. I’ve got 10 drives right now with 78TB usable in the mergerfs mount and two 14TB drives acting as parity. I’ve been able to build it up over the years and add slowly.
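
The sizing rule above (parity drives must be at least as large as the biggest data drive) can be sketched in a few lines of Python. This is only an illustration: the drive sizes are made up, and `plan_snapraid` is a hypothetical helper, not part of SnapRAID or mergerfs.

```python
from typing import NamedTuple


class SnapraidPlan(NamedTuple):
    parity_tb: list[float]   # drives reserved for parity
    data_tb: list[float]     # drives pooled by mergerfs
    usable_tb: float         # total space visible in the pool


def plan_snapraid(drive_sizes_tb: list[float], parity_count: int = 1) -> SnapraidPlan:
    """Reserve the largest drives for parity and pool the rest as data."""
    if not 0 < parity_count < len(drive_sizes_tb):
        raise ValueError("need at least one parity drive and one data drive")
    ordered = sorted(drive_sizes_tb, reverse=True)
    parity, data = ordered[:parity_count], ordered[parity_count:]
    # Largest-first ordering means every parity drive is at least as big
    # as every data drive, which is the constraint described above.
    return SnapraidPlan(parity, data, sum(data))


# Hypothetical mix of mismatched drives, sizes in TB.
plan = plan_snapraid([14, 14, 12, 8, 8, 4], parity_count=2)
print(plan)
```

Adding a drive later just means rerunning the plan: as long as the new drive is no bigger than the parity drives, it joins the data pool as-is.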

entropicdrift@lemmy.sdf.org · 27 days ago

Happy to help!

Jason2357@lemmy.ca · 25 days ago

Hot take: For personal use, I see no value at all in “availability,” only data preservation. If a drive fails catastrophically and I lose a day waiting for a restore from backups, no one is going to fire me. No one is going to be held up in their job. It’s not enterprise.

However, redundancy doesn’t save you when a file is deleted, corrupted, ransom-wared or whatever. Your RAID mirror will just copy the problem instantly. Snapshots and 3-2-1 backups are what are important to me because when personal data is lost, it’s lost forever.

I really do think a lot of hobbyists need to focus less on highly available redundancy and more on real backups. Both time and money are better spent on that.
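
For anyone unsure what the 3-2-1 rule asks for, here is a minimal sketch that checks a backup plan against it: at least 3 copies, on at least 2 different media, with at least 1 offsite. The `Copy` class and the example entries are hypothetical; note that a RAID mirror is modelled as a single copy, in line with the point above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    medium: str    # e.g. "nas-hdd", "usb-hdd", "cloud" (labels are arbitrary)
    offsite: bool


def check_3_2_1(copies: list[Copy]) -> list[str]:
    """Return a list of rule violations; an empty list means the plan passes."""
    problems = []
    if len(copies) < 3:
        problems.append(f"only {len(copies)} copies, need at least 3")
    if len({c.medium for c in copies}) < 2:
        problems.append("all copies are on the same kind of medium")
    if not any(c.offsite for c in copies):
        problems.append("no offsite copy")
    return problems


# A mirrored array is still one copy of the data.
raid_only = [Copy("raid1 array", "nas-hdd", offsite=False)]
full_plan = raid_only + [
    Copy("external drive", "usb-hdd", offsite=False),
    Copy("remote server", "cloud", offsite=True),
]
print(check_3_2_1(raid_only))
print(check_3_2_1(full_plan))
```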

Something Burger 🍔@jlai.lu · 25 days ago

Agreed. RAID is useless. Your drives will never fail before you’d want to replace them with larger ones anyway.

SparroHawc@lemmy.zip · edit-2 25 days ago

That’s true until it isn’t.

Unrecoverable hard drive failures definitely occur, even early on in the life cycle of a drive. I like having a RAID-5 array … but then again, I don’t really have any other backups (which I really should fix).

What I really need is an ISP that doesn’t have a 1.2TB data cap.

chirping@infosec.pub · 23 days ago

no what you really need is backups, isn’t it? having an external hdd that you’re backing up to is a lot better against data loss than putting that same drive into any kind of raid. (because now you truly have a copy, while in a raid it’s still a single point of failure)

I can feel your pain on the ISP part though. (Haven’t looked into this, but sounds like a zfs-job) Just saying that backups don’t have to be offsite, but they do need to be separate from the original data medium. Going offsite is an important early step, but getting it on separate storage is the first step.

If anything, I would argue that, especially in a homelab, using RAID can increase the risk of misconfigurations or mistakes when tinkering. If you have a couple of years of experience with RAID and do not agree with my argument above, then please share your experiences.

I am sorry for this wall of text, your comment caught my eye while thinking about something else, tl;dr: raid is not a backup

SparroHawc@lemmy.zip · 21 days ago

You’re absolutely right on all counts.

That said, my RAID setup is on a Synology, so it’s brain-dead simple and not especially prone to falling over.

squinky@sh.itjust.works · 26 days ago

Just btrfs.

MrModest@lemmy.world · edit-2 26 days ago

Why btrfs and not ZFS? In my info bubble, btrfs has a reputation as an unstable FS, with people ending up with unrecoverable data.

unit327@lemmy.zip · 26 days ago

Btrfs used to be easier to install because it is part of the kernel while zfs required shenanigans, though I think that has changed now.

Btrfs also just works with whatever drives of mismatched sizes you throw at it and adding more later is easy. This used to be impossible with zfs pools but I think is a feature now?

non_burglar@lemmy.world · 26 days ago

That is apparently not the case anymore, but ZFS is certainly more rich in features and more battle-tested.

ikidd@lemmy.world · 26 days ago

Just the 5-6 raid modes are shit. And its weird willingness to let you boot a failed raid without letting you know a drive is borked.

squinky@sh.itjust.works · 22 days ago

All I know about ZFS is that there are weird patent or closed source encumbrances or something. I hear it’s good, and it seems popular, I just avoid proprietary Oracle products.

As for btrfs, the only thing that’s claimed to be unstable is raid 5 or 6. And people use it in production saying the claims are overblown. I don’t. I use it in raid1 mode. But raid1 in btrfs doesn’t require a bunch of matching drives. It lets you glom together a number of mismatched disks and just puts every block on more than one of them. So it’s a nice cross between a raid and LFS or JBOD.
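
How much space that raid1 mode gives you with mismatched disks can be estimated with a short calculation. This is a rough sketch under the assumption of two copies of every block on two different devices; it ignores metadata overhead and chunk granularity, and `btrfs_raid1_usable` is a hypothetical helper, not a btrfs tool.

```python
def btrfs_raid1_usable(sizes_tb: list[float]) -> float:
    """Estimate usable space for a two-copy raid1 profile across mixed disks."""
    if len(sizes_tb) < 2:
        raise ValueError("raid1 needs at least two devices")
    total = sum(sizes_tb)
    largest = max(sizes_tb)
    # Every block needs a second copy on a different device, so the largest
    # disk can only be filled as far as the other disks can mirror it.
    return min(total / 2, total - largest)


# Hypothetical disk sets, sizes in TB.
print(btrfs_raid1_usable([4, 2, 2]))   # balanced enough: half the raw total
print(btrfs_raid1_usable([8, 2, 2]))   # one oversized disk: limited by the rest
```

In the second case part of the 8 TB disk stays unused until more capacity is added alongside it.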

MrModest@lemmy.world · edit-2 3 days ago

There’s a thing called OpenZFS. Almost the same thing happened with ZFS as with Java. Oracle bought the company behind it (Sun) and closed the source, but the community forked the last open release and kept developing it under a FOSS licence as OpenZFS. I don’t know who uses Oracle ZFS nowadays. Everyone uses OpenZFS.

It’s true that there’s some licence incompatibility that prevents integrating OpenZFS into the Linux kernel, but it’s not like ZFS is proprietary.

https://openzfs.github.io/openzfs-docs/License.html

While both (OpenZFS and Linux Kernel) are free open source licenses they are restrictive licenses. The combination of them causes problems

melfie@lemy.lol · 26 days ago

Ha, I went down the whole Ceph and Longhorn path as well, then ended up with hostPath and btrfs. Glad I’m not the only one who considers the former options too much of a headache after fully evaluating them.

spacemanspiffy@lemmy.world · 26 days ago

I have a few Ext4 drives connected and I mount them in /etc/fstab and that’s it.
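
For reference, an fstab entry for this kind of setup has six fields: device, mount point, filesystem type, options, dump flag, and fsck pass. A small sketch that renders one; the UUID and mount point are placeholders, and the `nofail` option is an assumption on my part (it lets the machine boot even if the drive is missing), not something stated above.

```python
def fstab_line(uuid: str, mountpoint: str,
               options: str = "defaults,nofail", fsck_pass: int = 2) -> str:
    """Render one ext4 entry for /etc/fstab, identified by filesystem UUID."""
    # Fields: device, mount point, type, options, dump (0 = off), fsck pass.
    return f"UUID={uuid} {mountpoint} ext4 {options} 0 {fsck_pass}"


# Placeholder UUID and path; real UUIDs can be listed with `blkid`.
print(fstab_line("0000-placeholder", "/mnt/media"))
```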

I’ve yet to find a reason to change it.

skilltheamps@feddit.org · 27 days ago

You need to ask yourself what properties you want in your storage, then you can judge which solution fits. For me it is:

effortless rollback (i.e. in case something with a db updates, does a db migration and fails)
effortless backups, that preserve database integrity without slow/cumbersome/downtime-inducing crutches like sql dump
a scheme that works the same way for every service I host, no tailored solutions for individual services/containers
low maintenance

The amount of data I’m handling fits on larger harddrives (so I don’t need pools), but I don’t want to waste storage space. And my homeserver is not my learn and break stuff environment anymore, but rather just needs to work.

I went with btrfs raid 1, with every service in its own subvolume. The containers are precisely referenced by their digest hashes, which get snapshotted together with all persistent data. So every snapshot holds exactly the data that is required to do a seamless rollback. Snapper maintains a timeline of snapshots for every service. Updating is semi-automated: snapshot -> update digest hash from container tags -> pull new images -> restart service. Nightly offsite backups happen with btrbk, which mirrors snapshots incrementally to another offsite server with btrfs.
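
The update sequence described above can be sketched as a small orchestration function. This is not the poster's actual script: the four callables are hypothetical stand-ins for whatever wraps Snapper, the registry lookup, the image pull, and the service restart, and reporting a rollback target on failure is my own assumption about how such a flow might be used.

```python
from typing import Callable


def update_service(
    service: str,
    snapshot: Callable[[str], str],
    resolve_digest: Callable[[str], str],
    pull: Callable[[str, str], None],
    restart: Callable[[str], None],
) -> dict:
    """Run snapshot -> resolve digest -> pull -> restart, in that order."""
    # The snapshot is taken first so a rollback target exists before
    # anything about the service changes.
    snapshot_id = snapshot(service)
    try:
        digest = resolve_digest(service)
        pull(service, digest)
        restart(service)
    except Exception as exc:
        return {"ok": False, "rollback_to": snapshot_id, "error": str(exc)}
    return {"ok": True, "snapshot": snapshot_id, "digest": digest}


# Stand-in steps that only print; real ones would call the actual tools.
def demo_snapshot(service: str) -> str:
    print(f"snapshot {service}")
    return "snap-1"


def demo_resolve(service: str) -> str:
    print(f"resolve digest for {service}")
    return "sha256:placeholder"


def demo_pull(service: str, digest: str) -> None:
    print(f"pull {service}@{digest}")


def demo_restart(service: str) -> None:
    print(f"restart {service}")


print(update_service("demo", demo_snapshot, demo_resolve, demo_pull, demo_restart))
```

Because the pinned digest lives inside the snapshotted subvolume, rolling back to the reported snapshot restores both the data and the exact image reference that went with it.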

arcayne@lemmy.today · 25 days ago

I’d recommend ZFS for most home server/NAS scenarios. Gives you everything you need, and nothing you don’t.

Stuff like Ceph is just as hungry as it is powerful. The performance sweet spot for Ceph barely begins at 5 dedicated nodes (with at least a dozen drives each, ideally). I could never recommend it for home use unless you want to run it in a lab for the sake of learning.

Source: I’ve designed/built/deployed several 1PB+ Ceph clusters over the last ~5yrs.

i_am_not_a_robot@discuss.tchncs.de · 27 days ago

You can use OpenEBS to provision and manage LVM volumes. Host path requires you to manually manage the host paths.

SayCyberOnceMore@feddit.uk · 26 days ago

Backups… with LVM, if you’re trying to do a full system backup (i.e. with Clonezilla, etc.) then you have to back up the whole thing - you can’t back up just 1 drive.

I have a media server with 2x 2TB HDDs and 1x SSD in an LVM, split into Music, Video, TV… and the OS … and I can back up the individual files of course, but I can’t back up just the OS drive.

btrfs didn’t exist when I created it, but I use it on my NAS and it’s great.

I’ll be rebuilding my media server one day and change LVM to btrfs.

fruitycoder@sh.itjust.works · 25 days ago

Why always scale to 1?

Balldowern@lemmy.zip · 6 days ago

Probably cause he’ll be the only one using it.

fruitycoder@sh.itjust.works · 6 days ago

I guess I like HA scaled stuff even if just for play. I hate hurdles though

azureskypirate@lemmy.zip · 26 days ago

I’ve got Proxmox running on an NVMe mirror. Two HDDs are passed to Turnkey Linux mediaserver; they are mirrored with BTRFS and act as storage. I am satisfied with all (prox, turnkey, btrfs) and would recommend.

I had one BTRFS drive fail, and replacing it with no experience took about an hour.

I do wish there was better user documentation for WebDAVcgi, the WebDAV frontend in Turnkey linux mediaserver.

mediaserver comes with Samba, so I use that to connect devices like phone or laptop to the server

Turnkey’s mediaserver was my replacement for Openmediavault with Filebrowser plugin. Filebrowser creates an internal user to write files for anything uploaded via web interface, so if you mount the folder later via NFS, the permissions don’t match. Openmediavault would stall or crash a lot as a container and especially as a VM, but maybe it runs better on bare metal.
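
The permission mismatch described above usually shows up as files owned by a uid other than the one the NFS client expects. A minimal sketch for spotting them, assuming a POSIX system; `find_foreign_owned` is a hypothetical helper, and the path and uid in the comment are placeholders.

```python
import os


def find_foreign_owned(root: str, expected_uid: int) -> list[str]:
    """List paths under root whose owner uid differs from expected_uid."""
    mismatched = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat inspects the entry itself rather than following symlinks.
            if os.lstat(path).st_uid != expected_uid:
                mismatched.append(path)
    return sorted(mismatched)


# Example call (placeholder path and uid):
# find_foreign_owned("/srv/share", 1000)
```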