Basically a deer with a human face. Despite probably being some sort of magical nature spirit, his interests are primarily in technology and politics and science fiction.

Spent many years on Reddit before joining the Threadiverse as well.

  • 0 Posts
  • 1.21K Comments
Joined 2 years ago
Cake day: March 3rd, 2024

  • You can predict how much a task will take in tokens. The accuracy of the prediction may not be perfect, but if you can ballpark it that can tell you a lot about what models to make use of.

    Also, not all tokens are the same. Different models require different amounts and kinds of computing power to run. Using a very large context costs more per token because you need a computer with a lot of memory to fit it all. If you need it fast that’s more expensive than if you can take your time. Does the task involve vision or audio? Does the context need to be saved for an ongoing chat? Does it need to wait for tool calls to return between rounds? There are a lot of variables that can be tweaked to vary what an AI call will cost, and a lot of those variables can be predicted without having to actually run the whole thing first.
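
    To make the ballparking concrete, here’s a rough sketch of the kind of estimate I mean. Everything in it is an assumption for illustration: the model names, the prices, the "context gets re-sent every tool-call round" simplification, and the priority multiplier are all made up, not any provider’s real pricing.

```python
from dataclasses import dataclass


@dataclass
class ModelProfile:
    name: str                      # hypothetical model name
    usd_per_million_input: float   # made-up price, not a real rate card
    usd_per_million_output: float  # made-up price, not a real rate card


@dataclass
class Task:
    input_tokens: int      # ballpark size of the prompt plus context
    output_tokens: int     # ballpark size of the answer
    rounds: int = 1        # tool-call rounds; assumes context is re-sent each round
    priority: bool = False # "I need it fast" costs extra


def estimate_cost(task: Task, model: ModelProfile,
                  priority_multiplier: float = 2.0) -> float:
    """Return a rough USD estimate for running `task` on `model`."""
    # Simplifying assumption: no caching, so every round pays for the full context.
    total_input = task.input_tokens * task.rounds
    cost = (total_input * model.usd_per_million_input
            + task.output_tokens * model.usd_per_million_output) / 1_000_000
    if task.priority:
        cost *= priority_multiplier
    return cost


# Hypothetical profiles: a small local model and a big hosted one.
small = ModelProfile("small-local", 0.10, 0.40)
big = ModelProfile("big-hosted", 10.0, 40.0)

task = Task(input_tokens=2000, output_tokens=500)
ratio = estimate_cost(task, big) / estimate_cost(task, small)
```

    With those invented numbers the big model comes out roughly 100 times more expensive for the same small task, which is the sort of thing you can know before running anything.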

    The “cranking up” part has not even started yet, and we already have stories like Uber which blew through their complete AI budget for the year,

    This is exactly what I’m talking about. Current LLM usage patterns tend to be pretty inefficient because people just throw tasks at the biggest and bestest models. Those models handle them, sure, because they’re the biggest and bestest. But most tasks don’t need that much.

    I’ve used coding agents a fair bit along with the various other AI applications I’ve fiddled with, and often I ask them to do things that are dead simple. Create a function to sort some data and select whatever fits certain criteria. Add type checking to a file. Create a unit test for a function. Stuff like that could easily be done by a small local model, but the coding agent sends it off to Opus or whatever just like every other task. That can change.

    There still was no guarantee that the output was useable (and there can’t be such a guarantee, since hallucinations are a statistical fact, increasing in occurrence with smaller amounts of training data available).

    I don’t think you’ve used modern coding AIs much.

    Or, for that matter, worked with human coders.

    Remember, this is the “killer” application for LLMs.

    There is no one single “killer” application for LLMs. They’re about as general a computing platform as you can get.


  • Right, which is why I said 90% and not 100%, and called out the challenge of deciding which tasks to send to which AIs. A lot of the interesting work I’m seeing in AI right now is in the agentic frameworks and harnesses that call the LLMs rather than just the LLMs themselves. These are the things that will break big complicated tasks down into more focused sub-tasks that cheaper LLMs can handle.

    Given how some of the big providers like Gemini and Anthropic have been cranking up their API costs in recent weeks I expect we’ll see a lot more effort being put into rolling those sorts of features out.
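
    As a toy illustration of the routing idea, here’s a minimal sketch. It’s a deliberately crude keyword heuristic, not how any real agent framework does it; the hint list, the model names, and the context limit are all hypothetical.

```python
# Hypothetical hints that a task is simple enough for a small model.
SIMPLE_HINTS = ("sort", "type check", "unit test", "rename", "format")


def pick_model(task_description: str, context_tokens: int,
               small_context_limit: int = 8000) -> str:
    """Pick a (made-up) model name for a task using crude heuristics."""
    text = task_description.lower()
    looks_simple = any(hint in text for hint in SIMPLE_HINTS)
    fits_small_model = context_tokens <= small_context_limit
    # Only use the cheap model when the task looks simple AND fits its context.
    if looks_simple and fits_small_model:
        return "small-local"
    return "big-hosted"


choices = [
    pick_model("Create a unit test for a function", context_tokens=1500),
    pick_model("Add type checking to a file", context_tokens=3000),
    pick_model("Redesign the authentication architecture", context_tokens=3000),
    pick_model("Sort this data", context_tokens=50000),
]
```

    The first two go to the small model and the last two go to the big one. The hard part a real harness has to solve is doing that classification reliably, which keyword matching obviously doesn’t.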


  • I think a lot of people just want to conclude that AI is going to “go away”, and latch on to beliefs that lead to this conclusion.

    I think a lot of AI companies are likely to “go away.” That’s what happened when the dot com bubble popped. If there is indeed an AI bubble then we’ll see a similar massacre at the stock market. But the technology itself is sound, just like how the basic idea of e-commerce didn’t vanish with the dot-coms.

    I’ve been doing a lot of fiddling with locally-run AI models and I’m thinking that the local open-weight models will be good enough to perform 90% of the tasks that most of us are currently depending on those big companies like Anthropic and OpenAI for. That’s going to let a lot of the air out of them when the applications catch up and start using those cheaper commodity-level models instead. For now it’s easier to just throw an OpenAI API key into your application and let it use the heavyweight models for everything, since a powerful model can do simple tasks just as well as a simple model. Most tasks are simple, but adding the ability to distinguish those tasks from the complicated ones is hard.


  • Whether inference is profitable or not is not a global yes/no question. It depends heavily on the circumstances, what you’re using it for and what you’re charging for it. A lot of the money being invested in research right now is going into making inference cheaper, which would of course make it more profitable to sell at current price points. Or just run it yourself. Local models are getting quite capable these days.

    I wouldn’t bet on any specific company being the one to survive this, especially not first-movers like OpenAI. More likely they’ll spend their money blazing the trail and the ones to profit from it will be the ones who followed along behind. When a company goes bankrupt it doesn’t poof out of existence. Its assets get sold off at pennies on the dollar and then whoever bought those assets gets a chance at running them without the overhead of the previous company’s debt.


  • FaceDeer@fedia.io to Technology@lemmy.world · Is AI Profitable Yet? · 13 days ago

    This seems somewhat misleading. Lots of products take a lot of investment for many years before they reach profitability. The Boeing 787 Dreamliner, for example, spent 7 years in development, and it took another three years after that before it was profitable. The Falcon 9 rocket took 13 years to develop and now it’s the most profitable satellite launcher around. The Dyson bag-free cyclonic vacuum cleaner took 15 years to develop.

    Most of this AI stuff has only been in heavy development since ChatGPT burst upon the scene in late 2022. It’s not unreasonable to see the industry still heavily into the investment and development side of things.