DeepSeek Permanently Reduces The Price Of Its Flagship V4 Model By 75 Percent

jaykrown@lemmy.world · 2 months ago

Eager Eagle@lemmy.world · edit-2 2 months ago

Not really, there are ways to count tokens before running an inference. Some providers make their tokenizers public, so counting even works offline. APIs also usually report usage or running costs with each response and support budget limits, though some could offer more fine-grained limits.

Users who are surprised by the bill are usually not paying attention to each call, or are using autonomous subagents, or are running a setup where they have little or no control over what is sent to the provider.

So the problem isn’t so much the API provider as the tooling around it, which makes it too easy to overspend.
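
To make the "count before you call" idea concrete, here is a minimal sketch of a client-side budget guard in Python. Everything in it is an assumption for illustration: `rough_token_count` is a crude stand-in (about 4 characters per token) rather than any provider's real tokenizer, the prices are made-up numbers, and `BudgetGuard` is a hypothetical helper, not part of any API. With a provider's published tokenizer you would pass its counting function in place of the stand-in.

```python
from dataclasses import dataclass
from typing import Callable


def rough_token_count(text: str) -> int:
    """Stand-in tokenizer (assumption): roughly 4 characters per token."""
    return (len(text) + 3) // 4


class BudgetExceeded(RuntimeError):
    """Raised when a call would push spending past the configured limit."""


@dataclass
class BudgetGuard:
    limit_usd: float
    price_per_million_input: float   # hypothetical price, USD per 1M tokens
    price_per_million_output: float  # hypothetical price, USD per 1M tokens
    count_tokens: Callable[[str], int] = rough_token_count
    spent_usd: float = 0.0

    def _cost(self, input_tokens: int, output_tokens: int) -> float:
        return (
            input_tokens * self.price_per_million_input
            + output_tokens * self.price_per_million_output
        ) / 1_000_000

    def estimate(self, prompt: str, max_output_tokens: int) -> float:
        """Worst-case cost of one call, computed before anything is sent."""
        return self._cost(self.count_tokens(prompt), max_output_tokens)

    def check(self, prompt: str, max_output_tokens: int) -> float:
        """Refuse the call if its worst-case cost would exceed the budget."""
        cost = self.estimate(prompt, max_output_tokens)
        if self.spent_usd + cost > self.limit_usd:
            raise BudgetExceeded(
                f"call may cost ${cost:.4f}; "
                f"${self.spent_usd:.4f} of ${self.limit_usd:.4f} already spent"
            )
        return cost

    def record(self, input_tokens: int, output_tokens: int) -> None:
        """Add the usage actually reported for a finished call."""
        self.spent_usd += self._cost(input_tokens, output_tokens)


if __name__ == "__main__":
    guard = BudgetGuard(
        limit_usd=0.01,
        price_per_million_input=1.0,
        price_per_million_output=2.0,
    )
    prompt = "Summarize this thread. " * 100
    print(f"worst-case cost: ${guard.check(prompt, max_output_tokens=500):.6f}")
```

The point of the design is that `check` runs before every request, including ones issued by subagents, so the spending limit sits in the tooling rather than depending on someone reading the bill afterwards.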