Doing the Lord’s work in the Devil’s basement

  • 0 Posts
  • 92 Comments
Joined 6 months ago
Cake day: May 8th, 2024





  • I … I don’t know, man!

    I think some time around the year 10000 humanity solved most of its problems, and the only remaining scarcity was “good living”. Cultures that had a sophisticated way of enjoying life through good food, good drink and good companionship suddenly came at a high premium. The people from SW France became insanely wealthy very quickly, and a sort of federation was struck between the Gascons, the Basques and the Bretons. It was really the only possible counter-power to the more colonialist and military-minded Italians.

    Boar religion could be described as Albigensian Catharism, except in space. Their freedom-loving ways are despised by the Italian Catholic Church, but the galaxy is so vast that religious wars never really break out; it’s just local skirmishes.

    I haven’t yet determined what animal the Italians have morphed into, so I’d be really glad to hear any suggestions.

    Oh, and here’s a picture of the Assembly of the Perfecti, held annually at Baiona Station:










  • Yeah, I did some looking up in the meantime, and you are indeed going to have a context size issue. That’s why it’s only summarizing the last few thousand characters of the text: that’s the size of its context window.
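
    To make the truncation concrete, here is a rough sketch of how much of a long text survives in a given context window. It assumes the front end keeps only the most recent text, uses a rule of thumb of about 4 characters per token rather than a real tokenizer, and reserves a hypothetical 512 tokens for the prompt and the summary; all names and numbers are illustrative.

```python
# Rough sketch: which part of a long text fits in a model's context window.
# CHARS_PER_TOKEN is a rule-of-thumb assumption, not a real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Very rough token count: ceiling of characters / CHARS_PER_TOKEN."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits(text: str, context_tokens: int, reserved_tokens: int = 512) -> bool:
    """True if the whole text fits, leaving room for prompt and answer."""
    return estimate_tokens(text) <= context_tokens - reserved_tokens


def visible_tail(text: str, context_tokens: int, reserved_tokens: int = 512) -> str:
    """Tail of the text that survives if only the newest tokens are kept."""
    budget_chars = max(0, context_tokens - reserved_tokens) * CHARS_PER_TOKEN
    # text[-0:] would return the whole string, so handle a zero budget explicitly.
    return text[-budget_chars:] if budget_chars else ""


if __name__ == "__main__":
    document = "word " * 20_000  # stand-in for a long text (100,000 characters)
    for window in (2048, 8192, 16384):
        kept = len(visible_tail(document, window))
        print(f"{window:>6}-token window: fits={fits(document, window)}, "
              f"keeps roughly the last {kept} of {len(document)} characters")
```

    With a real tokenizer the numbers will shift, but the order of magnitude should hold.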

    There are some models fine-tuned for an 8K-token context window, some even for 16K, like this Mistral brew. If you have a GPU with 8 GB of VRAM you should be able to run it using one of the quantized versions (Q4 or Q5 should be fine). Summarizing should still be reasonably good.
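
    As a back-of-the-envelope check on the VRAM claim, the sketch below estimates the size of the quantized weights only. It assumes a model of about 7 billion parameters and roughly 4.5 and 5.5 bits per weight for Q4 and Q5; those figures are approximations, and the context cache and runtime overhead come on top.

```python
# Back-of-the-envelope weight size for a quantized model.
# All figures below are assumptions for illustration, not measured values.
N_PARAMS = 7e9                                   # assumed 7B-parameter model
BITS_PER_WEIGHT = {"Q4": 4.5, "Q5": 5.5}         # approximate, varies by format
VRAM_GIB = 8                                     # the GPU size discussed above


def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Size of the weights in GiB: params * bits / 8 bits per byte / 2**30."""
    return n_params * bits_per_weight / 8 / 1024**3


if __name__ == "__main__":
    for name, bits in BITS_PER_WEIGHT.items():
        size = weights_gib(N_PARAMS, bits)
        # Whatever is left has to hold the context cache and runtime overhead.
        print(f"{name}: weights ~{size:.1f} GiB, ~{VRAM_GIB - size:.1f} GiB left")
```

    A longer context window eats into the remaining headroom, so the lighter quantization is the safer pick at 16K.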

    If 16K isn’t enough for you, then that’s probably not something you can do locally. However, you can still run a larger model privately in the cloud. Hugging Face, for example, lets you rent GPUs by the minute and run inference on them; it should only cost you a few dollars. As far as I know, this approach should still be compatible with Open WebUI.