robber@lemmy.ml · 12 days ago

Your biggest issue with 2010 cards will be software (inference engine) support, I assume.

robber@lemmy.ml · edit-2 12 days ago

To add some practical advice:

It depends on what you mean by more advanced models. I run Qwen3.6-27b on 48GB VRAM across 3 cards (RTX 2000e Ada), and with the recent software optimizations merged into llama.cpp (tensor parallelism & MTP) I get around 30 tokens per second in generation. I mostly use the model through Open WebUI for (agentic) web research and simple Q&A, and I’m quite happy with what it can do.

If you want something similar, maybe look at one or two second-hand V100 PCIe 32GB cards. Or something from the Intel Arc Pro series, if you don’t mind the software support lagging behind a bit (as in less optimized).

Also, it might be worth reading up on the difference between dense and MoE models, if you’re new to that. For MoE models, if your system RAM is fast enough, it’s often viable to offload the “experts” (the largest parts of such models) to RAM, reducing how much VRAM you need. Note that server motherboards with e.g. octa-channel RAM have a huge advantage over consumer boards (making DDR4 interesting despite the slower speed per module).
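To put rough numbers on the memory-channel point, here is a back-of-the-envelope sketch. It assumes a 64-bit (8-byte) data bus per channel, and the DDR5-6000 and DDR4-3200 speeds are example values I picked, not figures from any specific board. These are theoretical peaks; real-world throughput is lower.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers: int, bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    channels       -- number of populated memory channels
    mega_transfers -- module speed in MT/s (e.g. 3200 for DDR4-3200)
    bus_bytes      -- bytes moved per transfer per channel (assumed 8, i.e. 64-bit)
    """
    return channels * mega_transfers * 1e6 * bus_bytes / 1e9


# Example values (assumptions): a dual-channel consumer board with DDR5-6000
# versus an octa-channel server board with DDR4-3200.
consumer_ddr5 = peak_bandwidth_gbs(channels=2, mega_transfers=6000)
server_ddr4 = peak_bandwidth_gbs(channels=8, mega_transfers=3200)

print(f"dual-channel DDR5-6000: {consumer_ddr5:.1f} GB/s")
print(f"octa-channel DDR4-3200: {server_ddr4:.1f} GB/s")
```

With these example numbers, the slower DDR4 modules still end up with roughly twice the aggregate bandwidth, which is what matters when the experts are read from system RAM for every token.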

And to address your last question: while I have no direct experience, I’ve seen posts online about people connecting Strix Halo or DGX Spark devices, but usually via a 10+ Gbit/s switch, since interconnect bandwidth is crucial (unless you just want to load-balance).

Self-hosting LLMs is a very fun thing to do, but also a time- and money-consuming rabbit hole. You might wanna check out the LocalLlama community over at sh.itjust.works.

Edit: typos

robber@lemmy.ml · 3 months ago

Global sustainability rules???

robber@lemmy.ml · 8 months ago

Depends on the version you’re running.

https://forgejo.org/docs/latest/admin/upgrade/from-gitea/

robber@lemmy.ml · edit-2 9 months ago

One reason could be that the audience on lemmy has a left-ish bias and there’s a political component to the Spotify exodus.

Edit: don’t get me wrong, I love seeing content and engagement on here.

robber@lemmy.ml · 11 months ago

SFTPGo is such an awesome project, never had any problems with it.

robber@lemmy.ml · edit-2 1 year ago

I’ll add Pangolin to the list, it’s a self-hosted Cloudflare tunnel alternative.

robber@lemmy.ml · 1 year ago

It really depends on how much you enjoy setting things up yourself and how much it hurts you to give up control over your data with managed solutions.

If you want to do it yourself, I recommend taking a look at ZFS and its RAIDZ configurations, snapshots and replication capabilities. It’s probably the most solid setup you will achieve, but possibly also a bit complicated to wrap your head around at first.

But there are a ton of options as beautifully represented by all the comments.

robber@lemmy.ml · 1 year ago

Thanks for the hint to pocketID, haven’t heard of it before. That makes me think it’s time to upgrade my auth stack as well.

robber@lemmy.ml · 1 year ago

That sounds awesome! No issues at all so far?

robber@lemmy.ml · 1 year ago

Thanks for the list! Do you use Pangolin yourself?

robber@lemmy.ml · 1 year ago

Any experience with Pangolin?

robber@lemmy.ml · 1 year ago

I’ve been testing Zed for the last couple weeks for some Vue / Nuxt projects. It works great for that and seems very stable so far, but is also developed by a for-profit. Curious to see how the Zedless project works out.

robber@lemmy.ml · 1 year ago

I would recommend using redundant storage, such as RAID 1 (or 5 or 6, if you want a more advanced setup). This way your data doesn’t die with your SSD.
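For a feel of the trade-off between those levels, here is a minimal sketch that computes usable capacity and how many drive failures each level survives. It assumes identically sized drives and ignores filesystem overhead; the function name and the 4 TB figure are just for illustration.

```python
def raid_summary(level: int, disks: int, disk_tb: float) -> tuple[float, int]:
    """Return (usable capacity in TB, number of drive failures survived).

    Assumes all drives have the same size; overhead is ignored.
    """
    if level == 1:
        if disks < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        # Every drive holds a full copy, so capacity is that of one drive.
        return disk_tb, disks - 1
    if level == 5:
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        # One drive's worth of space goes to parity.
        return (disks - 1) * disk_tb, 1
    if level == 6:
        if disks < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        # Two drives' worth of space go to parity.
        return (disks - 2) * disk_tb, 2
    raise ValueError(f"unsupported RAID level: {level}")


# Example: 4 TB drives (arbitrary example size).
for level, disks in [(1, 2), (5, 4), (6, 4)]:
    usable, failures = raid_summary(level, disks, 4.0)
    print(f"RAID {level} with {disks} drives: {usable:.0f} TB usable, survives {failures} failure(s)")
```

Keep in mind that redundancy only protects against drive failure; it is not a substitute for backups.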

robber@lemmy.ml · 1 year ago

Did you configure port forwarding properly? Otherwise it might be that leechers can’t contact you.
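If you want to verify the forwarding, a small TCP check like the sketch below can help. It has to be run from a machine outside your network (e.g. a VPS or a phone hotspot) against your public IP, and it only tests TCP, so it says nothing about UDP traffic. The function name, host and port in the usage comment are placeholders, not values from your setup.

```python
import socket


def tcp_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError on refusal, timeout or routing failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Usage (placeholders, replace with your public IP and the client's listening port):
#   tcp_port_reachable("203.0.113.10", 51413)
```

A result of False can also mean the client simply isn't listening at that moment, so check while the torrent client is running.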

robber@lemmy.ml · 1 year ago

“Independent” browsers. Yeah right.

robber@lemmy.ml · 1 year ago

More than 140 Kenya Facebook moderators diagnosed with severe PTSD

robber@lemmy.ml · 1 year ago

Don't forget to ...

robber@lemmy.ml · 2 years ago

That’s really helpful, thank you. I’ve ordered an AX23 which will arrive tomorrow. I’ll try to figure it out in the next few days and report back.

robber@lemmy.ml · 2 years ago

Thank you! I’ll evaluate and report back.

robber@lemmy.ml · 2 years ago

And OpenWrt is capable enough?

Yeah, it’s insane, right? Every address is reachable when I open a port range. And there are only ~10 predefined services (HTTP/S, SMTP, …) plus the category “All other ports”, which also includes port 22. So my only choice is to either keep everything shut or leave everything wide open.

I think I can’t use my own modem but I’ll have to double check with my ISP. But yes the Wi-Fi is also provided by that router and it’s also quite crappy.

robber@lemmy.ml · 2 years ago

Thank you! Do you have an example of such a firewall device? Could something like the TP-Link Archer AX55 in IPv6 “pass-through” mode do the job? Or would you go for a standalone firewall? My budget is around a hundred bucks.

robber@lemmy.ml · edit-2 1 year ago

[Solved] Chaining routers and GUA IPv6 addresses

robber@lemmy.ml · 2 years ago

haha, word

robber@lemmy.ml · edit-2 2 years ago

USA to be renamed to XXX

robber@lemmy.ml · 2 years ago

Any of you have a self-hosted AI "hub"? (e.g. for LLM, stable-diffusion, ...)

robber@lemmy.ml · 2 years ago

Migrated my self-hosted Nextcloud to AIO and I absolutely love it