• De Lancre@lemmy.world
    35 minutes ago

    You don’t need 170+ GB of VRAM. The whole model can run at around 1 token/second on modern hardware straight from an SSD. That is slow, don’t get me wrong, but it’s still somewhat usable.

    Upd. Once again, for those who use AI because they struggle to read: it is slow, but it is usable. Which, by definition, means that you don’t need 170+ GB of VRAM to run this model. Period. It runs from an SSD. That is a fact.

    • dil@lemmy.zip
      45 minutes ago

      1 token/second isn’t remotely acceptable lol

    • placebo@lemmy.zip
      5 hours ago

      “Somewhat” is doing a lot of heavy lifting there 😂 How much time does it take to process your average request?
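
For a rough sense of scale, the arithmetic behind the 1 token/second figure can be sketched as below. The 500-token response length is an assumption picked for illustration, not a number from the thread, and the sketch ignores prompt processing time, which would only add to the total.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a steady decode rate, in seconds."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Assumed response length (hypothetical); the rate is the one quoted in the thread.
response_tokens = 500
rate = 1.0  # tokens per second

seconds = generation_seconds(response_tokens, rate)
print(f"{response_tokens} tokens at {rate} tok/s: {seconds / 60:.1f} minutes")
# prints "500 tokens at 1.0 tok/s: 8.3 minutes"
```

Under that assumption a single answer takes several minutes, which is probably why one side calls this usable and the other does not.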