• brucethemoose@lemmy.world
    6 hours ago

    What I meant is: just not power- or cost-efficiently on CPU alone. CPUs don’t have the compute for batching (running many generation requests in parallel). You need an accelerator, like Huawei’s, to be economical.

    It’s fine for local inference, of course.
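    To put some rough numbers on the batching point: a decode step is memory-bound at batch 1 (the weights are read once per step no matter how many requests share it), and batching keeps adding throughput until compute saturates. A toy roofline-style sketch, with all hardware numbers purely illustrative assumptions rather than specs of any real chip:

    ```python
    def tokens_per_second(batch, mem_bw_gbs, compute_tflops, params_b):
        """Toy decode-throughput estimate for a params_b-billion-param model.

        Weight reads are fixed per step; compute grows with batch size.
        Assumes ~1 byte/param (8-bit weights) and 2 FLOPs/param/token.
        """
        mem_time = (params_b * 1e9) / (mem_bw_gbs * 1e9)
        compute_time = (2 * params_b * 1e9 * batch) / (compute_tflops * 1e12)
        return batch / max(mem_time, compute_time)  # slower resource wins

    # Hypothetical 7B model on made-up hardware profiles.
    for name, bw, tf in [("CPU-ish", 100, 1), ("accelerator-ish", 2000, 200)]:
        t1 = tokens_per_second(1, bw, tf, 7)
        t32 = tokens_per_second(32, bw, tf, 7)
        print(f"{name}: {t1:.0f} tok/s -> {t32:.0f} tok/s at batch 32 "
              f"({t32 / t1:.1f}x)")
    ```

    With these toy numbers the CPU-ish profile goes compute-bound after only a few concurrent requests, so batching caps out early, while the accelerator-ish profile keeps the full 32x gain — which is the economics gap being described.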

    • ag10n@lemmy.world
      4 hours ago

      A whole ecosystem that can run on any hardware, efficiently or not, is a whole ecosystem developed for the Chinese market.