• not_amm@lemmy.ml
      English
      arrow-up
      9
      ·
      5 months ago

      I found a small command that runs KDE Spectacle (the screenshot tool) together with Tesseract, so I can OCR a screenshot whenever I want. I only had to install Tesseract and one main language pack. You could easily do the same with an API and/or a local AI model.
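
      For anyone who wants to try it, here is a rough Python sketch of that kind of pipeline. It is not necessarily the command I found, just one way to wire the two tools together. It assumes `spectacle` and `tesseract` are on your PATH and that the `eng` language data is installed; the function names and the temporary file name are made up for illustration.

```python
import subprocess
import tempfile
from pathlib import Path


def spectacle_command(image_path):
    # -b: run without the GUI, -r: let the user select a region,
    # -n: suppress the notification, -o: write the capture to this file
    return ["spectacle", "-b", "-r", "-n", "-o", str(image_path)]


def tesseract_command(image_path, lang="eng"):
    # Passing "stdout" as the output base makes tesseract print the text
    # instead of writing a .txt file; -l selects the language data.
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def ocr_screenshot(lang="eng", run=subprocess.run):
    # `run` is injectable so the pipeline can be exercised without the tools.
    with tempfile.TemporaryDirectory() as tmp:
        image_path = Path(tmp) / "shot.png"  # hypothetical file name
        run(spectacle_command(image_path), check=True)
        result = run(
            tesseract_command(image_path, lang),
            check=True,
            capture_output=True,
            text=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(ocr_screenshot())
```

      Bind the script to a keyboard shortcut and pipe the output into your clipboard tool of choice if you want the text ready to paste.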

    • MacN'Cheezus@lemmy.today
      English
      arrow-up
      3
      ·
      5 months ago

      Llava and Bakllava are two Ollama models that can not only extract text but also describe what’s happening on screen.
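
      A minimal sketch of how you might ask one of them about a screenshot, assuming an Ollama server is running locally on its default port and you have already pulled the `llava` model. The helper names, the default prompt, and the file name are just placeholders I picked.

```python
import base64
import json
import urllib.request

# Default address of a local Ollama server (assumed; adjust if yours differs).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(image_bytes, prompt, model="llava"):
    # Ollama's generate endpoint takes images as base64-encoded strings.
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one JSON reply instead of a token stream
    }


def describe_image(image_path, prompt="Transcribe any text in this image and describe it.",
                   model="llava", url=OLLAMA_URL):
    with open(image_path, "rb") as f:
        payload = build_payload(f.read(), prompt, model)
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


if __name__ == "__main__":
    print(describe_image("shot.png"))  # hypothetical screenshot file
```

      Swap `model="bakllava"` in if you would rather use that one.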

      Using tesseract-ocr, as the other guy suggested, is probably simpler and less resource-intensive, though.