For the longest time, I’ve been trying to figure out a way to “survive” in this new AI age without having to fork over a ton of money just to keep up. I’ve tried using local models via Ollama, and while they definitely work to a degree, they’re (unsurprisingly) not as good as the models from the big providers.

The local models tend to:

  • Forget what they’re doing
  • Struggle to break larger tasks into smaller ones
  • Lose focus easily
  • Have weaker coding performance
  • Drift over longer sessions

So to improve the reliability of fully local, smaller models (and to keep all my data local and in my own network), I created Loki.

It’s a local-first, batteries-included command-line tool and runtime for building and running LLM workflows. It’s model-agnostic and supports things like:

  • Agents and agent delegation
  • Roles/personas
  • MCP servers
  • RAG
  • Custom tools
  • Macros
  • Workflow scripting

A lot of the features it supports are specifically designed to compensate for weaknesses in smaller local models. For example:

  • Auto-continuation to keep pushing models to completion instead of letting them stop halfway through a problem
  • Parallel agent delegation so tasks can be split into smaller, focused scopes
  • Workflow-based execution (“If this, do that”) for building more reliable and repeatable automations
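
To make the first two ideas concrete, here is a minimal Python sketch. It is an illustration of the general techniques only, not Loki’s actual implementation: the `[DONE]` marker, the function names, and the canned stand-in model are all hypothetical, and a real setup would call a local model instead.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical completion signal for this sketch; not a Loki convention.
DONE_MARKER = "[DONE]"


def run_with_auto_continue(
    model: Callable[[str], str], prompt: str, max_rounds: int = 5
) -> str:
    """Re-prompt the model until it emits DONE_MARKER or max_rounds is reached."""
    transcript = ""
    for _ in range(max_rounds):
        # Feed back everything produced so far so the model can pick up
        # where it stopped instead of starting over.
        chunk = model(prompt + transcript)
        transcript += chunk
        if DONE_MARKER in chunk:
            break
    return transcript.replace(DONE_MARKER, "").strip()


def delegate_in_parallel(
    agent: Callable[[str], str], subtasks: list[str]
) -> list[str]:
    """Run one agent call per subtask concurrently; results keep subtask order."""
    with ThreadPoolExecutor(max_workers=max(1, len(subtasks))) as pool:
        return list(pool.map(agent, subtasks))


def make_fake_model(chunks: list[str]) -> Callable[[str], str]:
    """Stand-in for a model that stops early: returns one canned chunk per call."""
    remaining = iter(chunks)
    return lambda _prompt: next(remaining)


if __name__ == "__main__":
    # The fake model "gives up" twice before finishing on the third call.
    model = make_fake_model(["step 1. ", "step 2. ", "step 3. [DONE]"])
    print(run_with_auto_continue(model, "Refactor the parser: "))

    # Each subtask gets its own small, focused call.
    print(delegate_in_parallel(str.upper, ["write tests", "update docs"]))
```

The `max_rounds` cap matters in practice: a small model may never emit the completion marker, so the loop needs a hard stop.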

It also supports the major cloud providers if you want them (which definitely helped while testing 😄), but my long-term goal is simple:

Get as close as possible to Claude Code-style reliability using fully local models.

I’m always open to feedback, questions, or ideas.

Repo: https://github.com/Dark-Alex-17/loki

    • JollyForeheadRidges@lemmy.zip
      7 hours ago

      Crap. I was just starting to play with Ollama and thought it might be a good balance between running local models and using one of the proprietary services.

      Could you elaborate on what’s happening with them / what to watch out for?

      • MalReynolds@slrpnk.net
        3 hours ago

        If it gets you started with local models, by all means go ahead. Their onboarding is the easiest, and it works. A lot of third-party software also treats it as a first-class citizen, which lets you try out other things (e.g. Open WebUI) easily as you explore what’s possible. For now, try the Qwen 3.6 and Gemma4 models as the best bang for the buck. There’s also a “does it fit in my machine” website somewhere that can help (search for it).

        That said, basically all roads in local LLMs lead to llama.cpp, which gets the innovations first while the others copy its homework. For a long time Ollama (which looks like it’s angling to go commercial) used llama.cpp internally without attribution. Now they use a bodged-up engine of their own that is less performant and almost certainly a copy (possibly vibe-coded) of llama.cpp. They heavily encourage using their own models/quantizations, and they don’t let you play with a lot of parameters without a lot of friction (possibly because those parameters aren’t implemented yet, but who knows; transparency is low). You get the picture: wannabe techbros. That’s off the top of my head, so search for more authoritative sources.

        After you’ve gotten the hang of things, have a look at llama-swap (which just wraps llama.cpp), Lemonade if you’re on AMD, vLLM for NVIDIA, and LM Studio for Mac.

    • Dark-Alex-17@lemmy.world (OP)
      9 hours ago

      Looking at llama-swap: since it says it supports an OpenAI-compatible API, it should already work natively. Just set the client up with type: openai-compatible, fill in the URL, and provide the models. Should work out of the box!

      • MalReynolds@slrpnk.net
        9 hours ago

        Hope so, but I bet it doesn’t without some tweaking. OpenAI-compatible seldom is, and Ollama is bad for that. Still, it’s worth checking out. I’ll have a go at it sometime soonish, and perhaps you’ll see a PR (or, in the best case, some docs).

        • Dark-Alex-17@lemmy.world (OP)
          9 hours ago

          Looking forward to it! Heads up in case you missed it: I’ve settled on renaming it to Coyote, so sometime this week there will be a breaking change and a release to get that done.

          The biggest pains are just going to be updating the repo tokens for Crates.io and renaming the Homebrew repo.