Basically a deer with a human face. Despite probably being some sort of magical nature spirit, his interests are primarily in technology and politics and science fiction.

Spent many years on Reddit before joining the Threadiverse as well.

  • 0 Posts
  • 1.22K Comments
Joined 2 years ago
Cake day: March 3rd, 2024

  • You can predict how much a task will take in tokens. The accuracy of the prediction may not be perfect, but if you can ballpark it that can tell you a lot about what models to make use of.

    Also, not all tokens are the same. Different models require different amounts and kinds of computing power to run. Using a very large context costs more per token because you need a computer with a lot of memory to fit it all. If you need it fast, that’s more expensive than if you can take your time. Does the task involve vision or audio? Does the context need to be saved for an ongoing chat? Does it need to wait for tool calls to return between rounds? There are a lot of variables that can be tweaked to change what an AI call costs, and a lot of those variables can be predicted without having to actually run the whole thing first.
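To make the ballparking concrete, here is a minimal sketch of a pre-run cost estimate. Everything in it is an assumption for illustration: the `ModelPrice` numbers are made up rather than any provider's real rates, the four-characters-per-token rule is only a rough heuristic for English text, and the model of tool-call rounds (each round resends the prompt plus all earlier output) is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Hypothetical prices in USD per million tokens (not real provider rates)."""
    input_per_mtok: float
    output_per_mtok: float


def rough_token_count(text: str) -> int:
    """Ballpark token count, assuming roughly 4 characters per token."""
    return max(1, len(text) // 4)


def estimate_cost(prompt_tokens: int, output_tokens_per_round: int,
                  price: ModelPrice, rounds: int = 1) -> float:
    """Estimate the cost of a call before running it.

    Assumes each round resends the original prompt plus the output of
    all earlier rounds, which is why multi-round agent loops get pricey.
    """
    input_total = 0
    for r in range(rounds):
        # Context grows by one round's worth of output each time.
        input_total += prompt_tokens + r * output_tokens_per_round
    output_total = rounds * output_tokens_per_round
    return (input_total / 1_000_000 * price.input_per_mtok
            + output_total / 1_000_000 * price.output_per_mtok)


# Made-up price points for a large hosted model and a small one.
big = ModelPrice(input_per_mtok=3.0, output_per_mtok=15.0)
small = ModelPrice(input_per_mtok=0.1, output_per_mtok=0.4)

one_shot = estimate_cost(1000, 500, big)
two_rounds = estimate_cost(1000, 500, big, rounds=2)
```

Running the same estimate against `small` instead of `big` gives a quick sense of how much a task would save by going to a cheaper model, before a single token has been spent.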

    The “cranking up” part has not even started yet, and we already have stories like Uber which blew through their complete AI budget for the year,

    This is exactly what I’m talking about. Current LLM usage patterns tend to be pretty inefficient because people just throw tasks at the biggest and bestest models. Those models handle them, sure, because they’re the biggest and bestest. But most tasks don’t need that much.

    I’ve used coding agents a fair bit along with the various other AI applications I’ve fiddled with, and often I ask them to do things that are dead simple. Create a function to sort some data and select whatever fits certain criteria. Add type checking to a file. Create a unit test for a function. Stuff like that could easily be done by a small local model, but the coding agent sends it off to Opus or whatever just like every other task. That can change.
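As a rough illustration of what that change could look like, here is a minimal sketch of a router that keeps dead-simple requests on a small local model. The hint phrases, the length cutoff, and the labels `small-local` and `large-hosted` are all hypothetical; a real coding agent would need a much better classifier than keyword matching.

```python
# Hypothetical phrases suggesting a request is routine enough for a small model.
SIMPLE_HINTS = ("sort", "type check", "unit test", "rename", "docstring")


def pick_model(task: str, max_simple_chars: int = 400) -> str:
    """Return a hypothetical model label for a task description.

    A task counts as simple only if it mentions a routine chore AND is
    short; long requests tend to hide extra requirements, so they go to
    the large model even when they mention something routine.
    """
    text = task.lower()
    looks_simple = any(hint in text for hint in SIMPLE_HINTS)
    if looks_simple and len(task) <= max_simple_chars:
        return "small-local"
    return "large-hosted"


choices = {
    task: pick_model(task)
    for task in (
        "Add type checking to a file",
        "Create a unit test for a function",
        "Redesign the authentication flow across all services",
    )
}
```

The point is just that the default flips: the expensive model becomes the fallback rather than the first stop.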

    There still was no guarantee that the output was useable (and there can’t be such a guarantee, since hallucinations are a statistical fact, increasing in occurrence with smaller amounts of training Data available).

    I don’t think you’ve used modern coding AIs much.

    Or, for that matter, worked with human coders.

    Remember, this is the “killer” application for LLMs.

    There is no one single “killer” application for LLMs. They’re about as general a computing platform as you can get.


  • Right, which is why I said 90% and not 100%, and called out the challenge of deciding which tasks to send to which AIs. A lot of the interesting work I’m seeing in AI right now is in the agentic frameworks and harnesses that call the LLMs rather than just the LLMs themselves. These are the things that will break big complicated tasks down into more focused sub-tasks that cheaper LLMs can handle.

    Given how some of the big providers like Gemini and Anthropic have been cranking up their API costs in recent weeks, I expect we’ll see a lot more effort being put into rolling those sorts of features out.
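One way a harness might handle the "which tasks go where" problem is to try the cheap model first and escalate only when the result fails a check. The sketch below assumes the big task has already been split into sub-tasks; the models are plain callables, and `accept` is a hypothetical validator (in practice it might run tests or a linter). None of this reflects how any particular framework actually works.

```python
from typing import Callable, Iterable, List, Tuple

Model = Callable[[str], str]            # takes a sub-task, returns an answer
Validator = Callable[[str, str], bool]  # takes (sub-task, answer), returns ok?


def run_with_escalation(subtasks: Iterable[str], cheap: Model, strong: Model,
                        accept: Validator) -> List[Tuple[str, str, str]]:
    """Run each sub-task on the cheap model, escalating on rejection.

    Returns (sub-task, tier used, answer) tuples so the caller can see
    how much of the work actually needed the expensive model.
    """
    results = []
    for task in subtasks:
        answer, tier = cheap(task), "cheap"
        if not accept(task, answer):
            # The cheap answer failed validation, so pay for the strong model.
            answer, tier = strong(task), "strong"
        results.append((task, tier, answer))
    return results


# Stub models for demonstration: the cheap one gives up on "hard" tasks.
def cheap_stub(task: str) -> str:
    return "" if "hard" in task else f"cheap:{task}"


def strong_stub(task: str) -> str:
    return f"strong:{task}"


report = run_with_escalation(
    ["easy sort", "hard refactor"],
    cheap_stub,
    strong_stub,
    accept=lambda task, answer: bool(answer),
)
```

The trade-off is that a rejected sub-task gets paid for twice, so this only saves money when most sub-tasks pass on the first try.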


  • I think a lot of people just want to conclude that AI is going to “go away”, and latch on to beliefs that lead to this conclusion.

    I think a lot of AI companies are likely to “go away.” That’s what happened when the dot-com bubble popped. If there is indeed an AI bubble, then we’ll see a similar massacre in the stock market. But the technology itself is sound, just like how the basic idea of e-commerce didn’t vanish with the dot-coms.

    I’ve been doing a lot of fiddling with locally-run AI models, and I’m thinking that the local open-weight models will be good enough to perform 90% of the tasks that most of us are currently depending on big companies like Anthropic and OpenAI for. That’s going to let a lot of the air out of those companies when the applications catch up and start using cheaper commodity-level models instead. For now it’s easier to just throw an OpenAI API key into your application and let it use the heavyweight models for everything, since a powerful model can do simple tasks just as well as a simple model. Most tasks are simple, but adding the ability to distinguish those tasks from the complicated ones is hard.