At least it required skill to make anything convincing.
…Isn’t that the joke of the post?


Problem is, I have non-tech-savvy family who do try. And succeed.


In practice, they’re not very good because of broken FP16, broken kernels, high idle usage and a bunch of other things.
Same with the AMD MI50 and MI100. Looks great on paper, not practical IRL, unless you want to pay a whole team of software devs to fix them for you.
Better to just save up for a 2080 Ti or 3090, sadly.


No.
Even the biggest open weights models are trained on pennies compared to what OpenAI and Anthropic spend. Their developers just don’t have the hardware to be so wasteful.
In fact, the Nvidia GPU ban was the best thing to ever happen to “small” AI devs. It made them thrifty.


Fun fact: USB A can fit in HDMI ports.


MoEs can be very fast with hybrid inference. I run Xiaomi Mimo 2.5 (a 310B model, 116GB weights) on my single 3090 + 7800 CPU, and it outputs faster than I can read it.
It’s also easier to fit long context, if you need that.
It’s best to use the ik_llama.cpp fork for that, though. It gives a huge boost to hybrid MoE speeds.
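For a rough sense of why that works: 116GB of weights for 310B parameters comes out to about 3 bits per weight, so it’s a heavily quantized file, and most of it sits in CPU RAM rather than on the GPU. A back-of-the-envelope sketch in Python, assuming decimal GB, that the file is essentially all weights, and a 24GB 3090 (the function names are just made up for illustration):

```python
def bits_per_weight(file_gb: float, params_billions: float) -> float:
    """Average bits stored per parameter, assuming decimal GB (1e9 bytes)."""
    return file_gb * 1e9 * 8 / (params_billions * 1e9)


def spill_to_cpu_gb(file_gb: float, vram_gb: float) -> float:
    """Weights that cannot sit in VRAM and have to live in CPU RAM instead."""
    return max(0.0, file_gb - vram_gb)


# Numbers from the comment above; 24 GB is the 3090's VRAM (assumed fully usable).
bpw = bits_per_weight(116, 310)
spill = spill_to_cpu_gb(116, 24)
print(f"{bpw:.2f} bits/weight, at least {spill:.0f} GB in CPU RAM")
```

It’s “at least” because the context cache and compute buffers also need VRAM, so in practice even fewer weights fit on the card.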


Probably Qwen 35B then. ~9GB free VRAM + (let’s say) ~16GB of free CPU RAM is a good size for that, and squeezing bigger models in would be hard unless it’s a headless Linux server.
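If you want to sanity check the sizing yourself, here’s a rough sketch in Python. The bits-per-weight values are illustrative assumptions, not specific quant formats, and it ignores context cache and OS overhead, so treat “fits” as optimistic:

```python
def quantized_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight size in decimal GB for a quantized model."""
    return params_billions * bits_per_weight / 8


def fits(size_gb: float, vram_gb: float, ram_gb: float) -> bool:
    """True if the weights alone fit in the combined VRAM + CPU RAM budget."""
    return size_gb <= vram_gb + ram_gb


# Budget from the comment above: ~9 GB free VRAM plus ~16 GB free CPU RAM.
VRAM_GB, RAM_GB = 9, 16

# Hypothetical average bits per weight for a 35B model.
sizes = {bpw: quantized_size_gb(35, bpw) for bpw in (4.5, 5.5, 8.0)}
for bpw, size in sizes.items():
    verdict = "fits" if fits(size, VRAM_GB, RAM_GB) else "too big"
    print(f"{bpw} bits/weight: {size:.1f} GB ({verdict})")
```

Anything that only barely fits leaves no room for context, which is why a mid-size quant is the comfortable choice here.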


I posit it’s a consumer culture issue.
Look at Temu, TikTok, Amazon, YouTube; people are bombarded with “buy this on impulse!” every day, 24/7, through notifications. Dozens of influencers urge them to buy high.
So they do.
And now that’s the culture. Competition isn’t going to fix that, and doesn’t naturally arise in that kind of environment anyway.


Depends on how much CPU RAM you have, and how fast it is.
As others said, Qwen 35B at the very least. But you can get better models with more CPU RAM.


Yeah.
It’s not even about efficiency, really, but independence from corporations, privacy, and principle. Kind of like Lemmy.


This is a “feel guilty about missing recycling” kind of complaint.
Having a server run for an hour or two (?) a day is negligible. You use more energy running a fridge, or leaving a few lights on, or browsing Lemmy for a while. Or running a docker container for other services. You release more greenhouse gases eating beef, or driving anywhere, or even opening your front door a few times, and individual industries are going to use vastly more electricity than a few self-hosters ever would. If you own an EV, you’ve probably out-consumed every self-hoster in your zip code.
…But if it still bothers you, you can find an e-waste smartphone (or a few) and host on that. This is actually a very neat use case IMO.
However, if you get to the homelab scale of “an EPYC + 3090s running all the time,” that electricity use does start to add up. But that’s quite a rare hobbyist tier, I’d say, and it really shouldn’t be running 24/7.


I hope people here give Valve the same flak they gave Nintendo and Sony for raising prices.
Narrator: they did not.


The other half is that, at these prices, the manufacturing doesn’t have to be competitive.
They can use a less advanced DRAM fab process, sell it for less than what Micron/SK Hynix/Samsung are charging and still make ends meet.
Maybe it’s higher voltage and slower. But again, at this point, people will take it.


That’s not going to decrease the cost, as Strix Halo is expensive. See: the Framework Desktop.
And it needs LPDDR5X, though they could use SOCAMM modules. Framework tried, but I think they ran out of R&D time to work out the electrical gremlins.


+1 for Kagi’s base tier limit. It’s just not close to enough.


For what it’s worth, DDG recognized this immediately.
They dipped their toe in AI search, felt the pushback, and went all-in on putting toggles and immediately accessible opt-outs everywhere. They put a filter for AI images (and I hope they do the same for AI SEO spam).
In other words, they actually leaned in and listened to their own users. Unlike the soulless vampire on a throne Google has become.


I can.
“Google” is a taken-for-granted utility to most people. It’s a verb. It’s kinda like saying “rain isn’t rain anymore,” as it’s just always been a part of their lives, until it changed out of the blue, enough to break their routine.


I got it once, and didn’t like it one bit.
And I’m a long-time user of local LLMs. I use Google AI Studio sometimes. But that’s just “AI” precisely when and where I do not want it.


Or that it’s supremely scummy and buggy, even compared to other LLM frontends.
That sums up my experience with Ubuntu Gnome.