• FaceDeer@fedia.io
    23 hours ago

    You can predict how many tokens a task will take. The prediction won’t be perfect, but even a ballpark figure tells you a lot about which models to use.

    Also, not all tokens are the same. Different models require different amounts and kinds of computing power to run. Using a very large context costs more per token because you need a computer with a lot of memory to fit it all. If you need it fast, that’s more expensive than if you can take your time. Does the task involve vision or audio? Does the context need to be saved for an ongoing chat? Does it need to wait for tool calls to return between rounds? There are a lot of variables that can be tweaked to change what an AI call costs, and many of them can be predicted without having to actually run the whole thing first.
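
    To make the ballparking concrete, here’s a rough sketch of the kind of estimate I mean. Everything in it is made up for illustration: the `ModelPrice` class, the model names, the prices, the `speed_multiplier` knob, and the “about 4 characters per token” rule of thumb are all assumptions, not any provider’s actual pricing or tokenizer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Hypothetical price sheet; the numbers used below are placeholders."""
    name: str
    usd_per_million_input: float
    usd_per_million_output: float


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Crude token count: assumes ~4 characters per token (rule of thumb, not a real tokenizer)."""
    return max(1, round(len(text) / chars_per_token))


def estimate_cost(price: ModelPrice, input_tokens: int, output_tokens: int,
                  rounds: int = 1, speed_multiplier: float = 1.0) -> float:
    """Ballpark cost in USD before running anything.

    rounds: how many request/response rounds are expected (e.g. tool calls);
            simplification: every round is assumed to cost the same.
    speed_multiplier: hypothetical surcharge (>1) or discount (<1) for latency.
    """
    per_round = (input_tokens * price.usd_per_million_input
                 + output_tokens * price.usd_per_million_output) / 1_000_000
    return per_round * rounds * speed_multiplier


# Invented price points, purely for comparison.
small = ModelPrice("small-local", 0.10, 0.40)
big = ModelPrice("big-hosted", 10.0, 40.0)

prompt_tokens = estimate_tokens("x" * 4000)
for model in (small, big):
    print(model.name, estimate_cost(model, prompt_tokens, 500))
```

    The absolute numbers mean nothing, but the ratio between the two estimates is the part that tells you whether the big model is worth it for a given task.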

    The “cranking up” part has not even started yet, and we already have stories like Uber which blew through their complete AI budget for the year,

    This is exactly what I’m talking about. Current LLM usage patterns tend to be pretty inefficient because people just throw tasks at the biggest and bestest models. Those models handle them, sure, because they’re the biggest and bestest. But most tasks don’t need that much.

    I’ve used coding agents a fair bit along with the various other AI applications I’ve fiddled with, and often I ask them to do things that are dead simple. Create a function to sort some data and select whatever fits certain criteria. Add type checking to a file. Create a unit test for a function. Stuff like that could easily be done by a small local model, but the coding agent sends it off to Opus or whatever just like every other task. That can change.
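
    As a toy illustration of what “that can change” might look like, here’s a minimal routing sketch. It’s not how any real coding agent works as far as I know; the keyword list, the model names, and the context limit are all invented, and a real router would need something smarter than substring matching.

```python
# Hypothetical hints that a request is simple enough for a small model.
SIMPLE_HINTS = ("sort", "type check", "unit test", "rename", "docstring")


def pick_model(task: str, context_tokens: int, small_context_limit: int = 8000) -> str:
    """Return a (made-up) model name for the task.

    A task goes to the small local model only if it looks simple AND its
    context fits within the small model's assumed context limit.
    """
    text = task.lower()
    looks_simple = any(hint in text for hint in SIMPLE_HINTS)
    if looks_simple and context_tokens <= small_context_limit:
        return "small-local"
    return "large-hosted"


print(pick_model("Create a unit test for a function", 1200))
print(pick_model("Redesign the authentication architecture", 1200))
```

    The point is only that the decision can be made cheaply up front, before any tokens are spent on the expensive model.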

    There still was no guarantee that the output was useable (and there can’t be such a guarantee, since hallucinations are a statistical fact, increasing in occurrence with smaller amounts of training Data available).

    I don’t think you’ve used modern coding AIs much.

    Or, for that matter, worked with human coders.

    Remember, this is the “killer” application for LLMs.

    There is no one single “killer” application for LLMs. They’re about as general a computing platform as you can get.

    • Wildmimic@anarchist.nexus
      22 hours ago

      I used to think like you, and I am still pro local LLMs - I use them as tutors for areas I don’t know much about, and since I use the output just as a guide and implement it on my own I quickly realize if something isn’t right.

      We will see what the real costs are when OpenAI and Anthropic rush towards IPO this year, which has become very likely now that SpaceX has upped the tempo. If this article and others I’ve read in the last year are correct, and prices have to go up 10x to break even, then we are in for a wild ride. I’m only grateful that for now they don’t get lumped into the index funds.