

Probably Qwen 35B then. ~9GB free VRAM + (let’s say) ~16GB of free CPU RAM is a good size for that, and squeezing bigger models in would be hard unless it’s a headless Linux server.
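For anyone who wants to sanity-check that sizing, here is a rough back-of-envelope sketch. The 4.5 bits per weight and the 2GB of overhead for context and buffers are my assumptions, not measured values, and the function names are made up for illustration.

```python
def quantized_weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         free_vram_gb: float, free_ram_gb: float,
         overhead_gb: float = 2.0) -> bool:
    """True if weights plus an assumed overhead fit in VRAM + CPU RAM combined."""
    needed = quantized_weights_gb(params_billion, bits_per_weight) + overhead_gb
    return needed <= free_vram_gb + free_ram_gb


# 35B parameters at an assumed ~4.5 bits per weight
print(round(quantized_weights_gb(35, 4.5), 1))  # about 19.7 GB of weights
print(fits(35, 4.5, free_vram_gb=9, free_ram_gb=16))  # fits in ~25 GB total
print(fits(35, 8.0, free_vram_gb=9, free_ram_gb=16))  # an 8-bit quant does not
```

Real usage depends on the quant format and how much context you run, so treat this as a first guess only.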


I posit it’s a consumer culture issue.
Look at Temu, TikTok, Amazon, YouTube; people are bombarded with “buy this on impulse!” every day, 24/7, through notifications. They’re urged to buy high by dozens of influencers in their feeds.
So they do.
And now that’s the culture. Competition isn’t going to fix that, and doesn’t naturally arise in that kind of environment anyway.


Depends on how much CPU RAM you have, and how fast it is.
As others said, Qwen 35B at the very least. But you can get better models with more CPU RAM.


Yeah.
It’s not even about efficiency, really, but independence from corporations, privacy, and principle. Kind of like Lemmy.


This is a “feel guilty about missing recycling” kind of complaint.
Having a server run for an hour or two (?) a day is negligible. You use more energy running a fridge, or leaving a few lights on, or browsing Lemmy for a while. Or running a docker container for other services. You release more greenhouse gases eating beef, or driving anywhere, or even opening your front door a few times, and individual industries are going to use vastly more electricity than a few self-hosters ever would. If you own an EV, you’ve probably blown out your entire zip code of self-hosters.
…But if it still bothers you, you can find an e-waste smartphone (or several) and host on that. This is actually a very neat use case IMO.
However, if you get to the homelab scale of “an EPYC + 3090s running all the time,” that electricity use does start to add up. But that’s quite a rare hobbyist tier, I’d say, and it really shouldn’t be running 24/7.
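To put rough numbers on the difference between those two cases, here is a quick sketch. The wattages are placeholders I picked for illustration (50W for a small box, 600W for a big multi-GPU rig), not measurements, so plug in your own from a wall meter.

```python
def daily_kwh(watts: float, hours_per_day: float) -> float:
    """Energy used per day in kilowatt-hours: power (W) x time (h) / 1000."""
    return watts * hours_per_day / 1000


# Hypothetical small server, on for 2 hours a day
small = daily_kwh(watts=50, hours_per_day=2)

# Hypothetical EPYC + multi-GPU rig, on around the clock
big = daily_kwh(watts=600, hours_per_day=24)

print(f"small box: {small:.2f} kWh/day")
print(f"big rig:   {big:.2f} kWh/day")
print(f"ratio:     {big / small:.0f}x")
```

With those assumed numbers the always-on rig uses two orders of magnitude more, which is the gap I mean.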


I hope people here give Valve the same flak they gave Nintendo and Sony for raising prices.
Narrator: they did not.


The other half is that, at these prices, the manufacturing doesn’t have to be competitive.
They can use a less advanced DRAM fab process, sell it for less than what Micron/SK Hynix/Samsung are charging, and still make ends meet.
Maybe it’s higher voltage and slower. But again, at this point, people will take it.


That’s not going to decrease the cost, as Strix Halo is expensive. See: the Framework Desktop.
And it needs LPDDR5X, though they could use SOCAMM modules. Framework tried, but I think they ran out of R&D time to work out the electrical gremlins.


+1 for Kagi’s base tier limit. It’s just not close to enough.


For what it’s worth, DDG recognized this immediately.
They dipped their toe in AI search, felt the pushback, and went all-in on putting toggles and immediately accessible opt-outs everywhere. They put a filter for AI images (and I hope they do the same for AI SEO spam).
In other words, they actually leaned in and listened to their own users. Unlike the soulless vampire on a throne Google has become.


I can.
“Google” is a taken-for-granted utility to most people. It’s a verb. It’s kinda like saying “rain isn’t rain anymore,” as it’s just always been a part of their lives, until it changed out of the blue, enough to break their routine.


I got it once, and didn’t like it one bit.
And I’m a long-time user of local LLMs. I use Google AI Studio sometimes. But that’s just “AI” precisely when and where I do not want it.


Or that it’s supremely scummy and buggy, even compared to other LLM frontends.


Yeah.
DDG mobile’s “privacy ergonomics” are perfect. Every browser should be structured like that.
I mostly use Orion for other reasons, but I still use DDG a good bit.


It’s great it even has a toggle.
You can tell it’s Twitter because there’s no acknowledgment of talking to bots.


I would not bet against it, that’s for sure.
My bet is on j before Starfield 2. BGS will really, actually think “hey, let’s remind everyone how great Starfield 1 was!”


I don’t think it was a sales failure, was it? Not a short term one.


At risk of going Fallout 4…
How about medieval homebuilding?
Build a little hut? Store stuff. Have chickens. Fish in a pond out back, kiddo running around.
If it’s done even half as meticulously as KCDII, I personally know people that would play the hell out of content like that. That’s the premise of a whole lot of Skyrim modding.


MoEs can be very fast with hybrid inference. I run Xiaomi Mimo 2.5 (a 310B model, 116GB weights) on my single 3090 + 7800 CPU, and it outputs faster than I can read it.
It’s also easier to fit long context, if you need that.
It’s best to use the ik_llama.cpp fork for that, though. It gives a huge boost to hybrid MoE speeds.
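A rough sketch of why this works, for the curious. The 116GB figure is from my setup above and a 3090 has 24GB of VRAM; everything else here (the 4GB reserved for context, the 8GB of weights read per token, the 60GB/s RAM bandwidth) is a made-up placeholder to show the arithmetic, not a spec of any model or machine.

```python
def hybrid_split_gb(weights_gb: float, vram_gb: float,
                    vram_reserved_gb: float) -> tuple[float, float]:
    """Split weights into (on GPU, on CPU) after reserving VRAM for context."""
    on_gpu = max(0.0, min(weights_gb, vram_gb - vram_reserved_gb))
    return on_gpu, weights_gb - on_gpu


def decode_upper_bound_tps(active_gb_per_token: float,
                           ram_bandwidth_gb_s: float) -> float:
    """Crude ceiling on tokens/s if every active weight were read from CPU RAM.

    A MoE only reads the experts it routes to, so active_gb_per_token is much
    smaller than the full weight size. Ignores compute and the GPU-resident part.
    """
    return ram_bandwidth_gb_s / active_gb_per_token


on_gpu, on_cpu = hybrid_split_gb(weights_gb=116, vram_gb=24, vram_reserved_gb=4)
print(f"GPU: {on_gpu:.0f} GB, CPU RAM: {on_cpu:.0f} GB")

# Hypothetical: 8 GB of weights touched per token, 60 GB/s of RAM bandwidth
print(f"MoE ceiling:   {decode_upper_bound_tps(8, 60):.1f} tok/s")
# The same weights as a dense model would need all 116 GB read per token
print(f"dense ceiling: {decode_upper_bound_tps(116, 60):.2f} tok/s")
```

The point is the ratio between the last two lines, not the absolute numbers.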