

At ~3.5% market share, depending on the statistics source. And declining, month-on-month:
https://radar.cloudflare.com/reports/browser-market-share-2025-q4


uBlock was already somewhat neutered on Chrome, and people didn’t seem to notice. They keep using it.
I’m just so cynical these days. It’s not like the Windows XP era, where people eventually got fed up with enshittification and moved.
Google won. Facebook won.
They have absolute control, basically.


Except they aren’t moving anymore.
I think Google finally trapped most of the web’s population, for good.


Like how they killed JPEG-XL.
All over some employee’s ego associated with his AVIF contributions, from what I’ve seen.


Apparently not, as ads keep selling.
I hate to sound so cynical, but many folks are gullible. They’ll trust a flashy ad because it looks nice to them, and gives them a positive emotional response, and then internalize that judgement as their own decision (so when someone comes to challenge it, they take it personally).
It’s not just old people living in another time, either. I’ve watched teenagers and young adults trust obviously-sponsored influencers like they’re friends. Or wear brands as status symbols.


How about SRWare Iron?
It’s corporate backed, so security may view it favorably over FOSS forks like Helium or Ungoogled Chromium.
There’s a whole slew of Chromium forks that I think are trying to preserve V2 functionality.


Visibility.
See: Helium Browser. Which is already doing this, and shipping full UBO, yet most aren’t aware of it.
Also see: this comment is minimized by default, and most of Lemmy will never know the answer to OP’s question already exists. There are probably other forks that do this, too.


Could this ever be “self hosted” on a phone, in the future? E.g., run as a web app, basically?
That would get around the issue of rate limiting for those of us with no home server.
That’s just a far flung idea though. Either way, this is amazing.


Eh, most of the poison is the dark patterns in the UI, the relentless engagement optimization, algorithmic recommendations, the tracking, the ads, and so on.
This short circuits all of that.
You could still watch toxic influencers, but it’s not funneling you towards that anymore.


I wouldn’t use the word “desperate.”
Scaling is inefficient.
For training, it takes a ton of work to even get half-decent utilization across a bunch of servers, and it makes any sort of experimentation with architectures immensely more difficult.
Hence allegations that some GPUs are assigned “busywork” just to meet utilization quotas from the hardware seller.
For inference, scale isn’t so important. But the demand for tokens is self inflicted: from Meta shoving chatbots in random places in software, and from their architecture being archaic and inefficient.
In other words, none of this has to be. It’s just the whims of one insecure man, surrounded by sycophantic tech bros, who’s feeling FOMO but doesn’t understand transformer LLMs at all.
If he had half a brain, he wouldn’t have fired the team that literally founded the open weights LLM space.
But he’s also too rich to ever feel the consequences of bad decisions now.


A note: “AI” doesn’t have to be that way.
It’s not using evaporative cooling out of necessity. It’s just the absolute cheapest, fastest way to cool en masse. Just like slamming a gas generator down on a site, or housing servers in tents.

They could take an extra second to build something efficient, and they did not.
Or, they could just not waste so many GPUs on “intelligence scaling” that does not scale. Like most non-US firms do, just fine. But FOMO.
In other words, non technical decision makers, who don’t understand how transformer models even work, dictated this would happen. It’s not even a sane business planning decision, and they’re too rich to face any consequences now.


It’s not democracy though.
Whatever the ideals of crypto are, however user friendly it could be made, in reality, it’s just fundamentally too easy to abuse.
As-is, it’s one of those “it would work fine if everyone learned it in detail, and grifters would go away” ideas, and that’s not going to happen.
Democracy is fragile and exploitable too, but it has a track record of working across general populations for reasonable lengths of time.


Rarely of course, something is so complicated that it actually takes more time to come up with the right code than do a review. But that is only a rare thing.
This is definitely a thing though.
On this very topic, many llama.cpp PRs are good examples. A model trainer may present a PR with poor understanding of the (very complicated, highly specialized, sparsely documented) project. Then a maintainer comes to fix it, but has absolutely no knowledge of certain things the model trainer would know (“Oh, the whole thing NaNs if this one value on layer 23 isn’t FP32!”)
There has to be a back-and-forth. A whole lot of it.
That is an exception, yeah.
But I’m not sure I’d call it “rare.” There are definitely situations where fixing without explaining is ultimately a whole lot of work.


It would have been nice if crypto didn’t turn into a network of pyramid schemes.
Like, I am sympathetic to the idea. I mined a Bitcoin a long time ago (and lost it in intervening years). But holy moly, did it erupt into a tire fire.


I think it will massively correct, like the dotcom bubble for websites. LLMs are a useful utility, but not something that’s going to make economics irrelevant (like people thought about the internet).
Why? LLMs are tools, text models, not AGI magic lamps, and a couple of con artists are trying to convince the world otherwise. That’s an oversimplification, but that’s the gist of it.
And I’m no LLM skeptic. I’ve been playing with ML as a hobby for a decade, with local LLMs before ChatGPT was even available, but the market attitude towards all this is absolutely bonkers. It’s worse than crypto.


Like what? RT is not blocked for me; I read an article from there once in a while. Neither is the Russian government website or anything else I can think to test.


Really? I don’t use it for work, but I swore I was hitting some internal MS model for chat/code, as it was one of the worst experiences I’ve had with LLMs over 24B.


It’s more complex than that. The weights of big models are distributed, and then tokens are processed in parallel for multiple users. The setup varies, but it could be 8 GPUs serving many dozens of users at once, or bigger sets with even more parallelism.
I think the bigger problem is that Copilot is… shit.
It’s probably some ancient, inefficient architecture, not something super sparse and hardware efficient like (say) Deepseek V4, or Kimi 2.6, or Gemini Pro.
And literally every interesting dev team Microsoft has ever acquired (Phi, WizardLM, many more), and any interesting innovation they figured out, has just disappeared into a black hole.
They don’t have custom hardware, either, like Huawei NPUs or Cerebras WSEs, or Google TPUs. They’ve written some very interesting papers on that, and proceeded to do squat with them.
Also, it is AWFUL for its size. Tiny models that are basically free run circles around Copilot.
What I’m getting at is that Copilot is probably the most inefficient LLM out there. Like, it’s impressive how bad it is.


I use sigma N sampling at 1.0, a slop phrase banlist, and maybe a little rep penalty.
Beyond that it depends on the usage.
For scripts or “questioning a document,” it’s as low as can be until it loops. I start with zero temperature. But I don’t really use Gemma for coding, TBH, and it’s not good for longer documents.
If it’s for a specific language or a very specific script, I sometimes constrain grammar for the language.
For more “general” writing, like brainstorming or RP or whatever, I start at around 0.7 with minimal DRY sampling and look at the logit percentages in the Mikupad UI. Especially “important” tokens like names or information recall. If the probability of getting correct answers is too low, I turn the temperature down.
…But honestly, I tend to use big MoEs instead of Gemma for that, too.
And if none of this makes any sense…
Yeah. That’s the problem.
Sampling was supposed to be a temporary stopgap until looping and such was figured out, but the big LLM devs just never addressed it in production. There are all sorts of interesting papers, including one from Google about sampling logits per-layer, but they don’t implement any of them in the API models.
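For anyone lost on the sampling terms above, here is a rough sketch of what temperature plus sigma N (top-nσ) filtering does to one step of token selection. This is my understanding of the idea, not any particular backend's implementation: keep only tokens whose logit is within n standard deviations of the top logit, then sample from the survivors at the chosen temperature. The function name `sample_token` and its defaults are made up for illustration.

```python
import math
import random


def sample_token(logits, temperature=0.7, n_sigma=1.0, rng=random):
    """Pick a token index from raw logits (hypothetical helper, not a real API)."""
    if temperature <= 0:
        # Zero temperature: greedy decoding, always take the top logit.
        return max(range(len(logits)), key=lambda i: logits[i])

    # Sigma N filter (assumed form): drop every token whose logit falls
    # more than n_sigma standard deviations below the best one.
    mean = sum(logits) / len(logits)
    std = math.sqrt(sum((x - mean) ** 2 for x in logits) / len(logits))
    top = max(logits)
    cutoff = top - n_sigma * std
    kept = [i for i, x in enumerate(logits) if x >= cutoff]

    # Temperature-scaled softmax weights over the surviving tokens.
    # Subtracting the top logit keeps exp() numerically stable.
    weights = [math.exp((logits[i] - top) / temperature) for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]


# Two plausible tokens and a long tail of junk: the tail never gets sampled.
example_logits = [10.0, 9.5, 0.0, 0.0, 0.0, 0.0]
picks = {sample_token(example_logits) for _ in range(200)}
```

Turning the temperature down just sharpens the weights among the tokens that survive the filter, which is why I lower it when the "important" tokens look shaky.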


Yeah.
Say what you will about Apple, but they are the major hedge against Google controlling much of the world, at the moment.
That sounds hyperbolic and conspiratorial. But, sadly, it really isn’t. Without them, there would effectively be one web browser, one mobile operating system, one search engine, with basically no way out of whatever Google dictates for 97% of the population.