• Eager Eagle@lemmy.world
    link
    fedilink
    English
    arrow-up
    6
    arrow-down
    1
    ·
    edit-2
    6 小时前

    Not really, there are ways to count tokens before running an inference. Some providers make tokenizers public, so they even work offline. APIs also usually return rolling costs per response and have budget limits - though some could have more fine-grained limits.

    Users who are surprised by the bill are usually not paying attention to each call, or using autonomous subagents, or a setup where they have little or no control to what is sent to the provider.

    So the problem isn’t really the API provider, as much as it’s the tooling around it, which makes it too easy to overspend.