• definitemaybe@lemmy.ca

    Re: your last paragraph:

    I think the future is likely going to be more task-specific, targeted models. I don’t have the research handy, but small, targeted LLMs can outperform massive LLMs at a tiny fraction of the compute costs to both train and run the model, and can be run on much more modest hardware to boot.

    Like, an LLM that is targeted only at:

    • teaching writing and reading skills
    • teaching English writing to English Language Learners
    • writing business emails and documents
    • writing/editing only resumes and cover letters
    • summarizing text
    • summarizing fiction texts
    • writing & analyzing poetry
    • analyzing poetry only (not even writing poetry)
    • a counselor
    • an ADHD counselor
    • a depression counselor

    The more specific the model, the smaller the LLM can be that does the targeted task(s) “well”.

    • ErmahgherdDavid@lemmy.dbzer0.com

      Yeah, I agree. Small models are the way. You can also use LoRA/QLoRA adapters to “fine-tune” the same big model for specific tasks and swap the use case in real time. This is what Apple does with Apple Intelligence. You can outperform a big general LLM with an SLM if you have a nice, specific use case and some data (which you can synthesize in some cases).
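
      The adapter-swapping idea above can be sketched in a few lines. This is a toy illustration of the LoRA math only (not the real PEFT/Apple implementation): the frozen base weight `W` is shared, and each task contributes a small low-rank update `B @ A` that can be swapped at runtime. All names and numbers here are made up for illustration.

      ```python
      # Toy LoRA sketch: effective weight = frozen base W + task-specific B @ A.
      # Pure-Python 2x2 example; real adapters are large matrices inside a transformer.

      def matmul(X, Y):
          """Plain-Python matrix multiply."""
          return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

      def add(X, Y):
          """Element-wise matrix addition."""
          return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

      base_W = [[1.0, 0.0], [0.0, 1.0]]  # frozen pretrained weight (shared by all tasks)

      # Two hypothetical task adapters, each a rank-1 factorization (B: 2x1, A: 1x2).
      # Swapping the use case at runtime just means picking a different (B, A) pair.
      adapters = {
          "email":  ([[0.5], [0.0]], [[1.0, 0.0]]),
          "resume": ([[0.0], [0.5]], [[0.0, 1.0]]),
      }

      def effective_weight(task):
          B, A = adapters[task]
          return add(base_W, matmul(B, A))  # W + B @ A for the selected task

      print(effective_weight("email"))
      print(effective_weight("resume"))
      ```

      The point of the sketch: only the tiny `(B, A)` pair differs per task, which is why one big base model plus cheap swappable adapters can stand in for many specialized models.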