Yup, I’m posting another this week. Sorry.

This week I’m hoping we can wrangle a solution around AI and our selfhosted community. There are plenty of strong opinions (both pro and con), but one thing is for certain - there needs to be better disclosure in promo posts. Two options (that aren’t mutually exclusive):

  • Any post about AI-focused or AI-developed software gets an [AI] tag. No, a [Not-AI] tag is not needed to accomplish this; that’s kind of a “non-golfer” sort of tag.
  • A comment requesting an AI disclosure in response to every promo post, if it’s not detailed in the post itself. Specifics (generating docs for commands, translation, whole-boat vibe-coded this app, etc.) would be requested.

I will say that having disclosure and/or tagging would mean that comments that just say “slop” or “fuck ai” or whatever would be off topic at that point. That information is already provided, so it’s just noise (and sometimes pretty uncivil; I’ve been light on that for now due to the need for a rule on this).

The [AI] tag would make it easy to filter out (or search for, if that’s your thing), but there are wildly different degrees of AI use out there. Among the posts with a positive score, it’s usually due to responsible AI use (translations, a snippet they had to do something obscure with, available to use with AI but doesn’t require it, whatever), which is why I think the disclosure has a place as a benefit to everyone.

Please provide any input or alternative options on this, and I can then put it to a vote like the last one. Comments seem to be the best approach without involving something off-site, but if you have a better idea/option, please share.

  • midribbon_action@lemmy.blahaj.zone
    19 hours ago

    I don’t think you need much hardware at all to do OCR. USPS started doing reliable OCR on 80s hardware. You really think an AI cluster is necessary for that?

    Anyways, cool anecdote, not an actual financial study or report, and very long-winded honestly.

    Post-edit reply: wow, that’s kinda fucked up not to disclose that they disassembled it already. Looks like they found better uses. That’s your success story?

    • curbstickle@anarchist.nexusOPM
      20 hours ago

      OCR <> data ingest

      OCR wouldn’t work, as I mentioned, because of the varying structures of the forms.

      I’m sorry my answer was too “long winded” for you, I was trying to be informative, but clearly you aren’t interested in that. Enjoy your day.

      • midribbon_action@lemmy.blahaj.zone
        20 hours ago

        Don’t think that’s true. You can run the whole form through and come out with an identical PDF with searchable/copyable text. Even a completely novel form uses the same alphabet. Add some regex to pull out the fields you need to enter, and on failure give it to a human. All of that can be done with Python on a Raspberry Pi. A decade ago.

        https://github.com/ocrmypdf/OCRmyPDF
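
A rough sketch of the "regex, then fall back to a human" step described above, assuming the OCR text is already available as a plain string. The field names, patterns, and the `route` helper are made up for illustration; a real form would need its own patterns, and this says nothing about whether the approach fits the forms discussed in this thread.

```python
import re

# Hypothetical field patterns, for illustration only.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:No\.?|Number)\s*[:#]?\s*(\w[\w-]*)", re.IGNORECASE
    ),
    "total": re.compile(
        r"Total\s*(?:Due)?\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE
    ),
}


def extract_fields(ocr_text, patterns=FIELD_PATTERNS):
    """Return (fields, missing): matched values and names of unmatched fields."""
    fields, missing = {}, []
    for name, pattern in patterns.items():
        match = pattern.search(ocr_text)
        if match:
            fields[name] = match.group(1)
        else:
            missing.append(name)
    return fields, missing


def route(ocr_text):
    """Accept the form automatically only if every field matched."""
    fields, missing = extract_fields(ocr_text)
    if missing:
        # Any unmatched field sends the whole form to manual review.
        return {"status": "needs_human", "fields": fields, "missing": missing}
    return {"status": "auto", "fields": fields, "missing": []}
```

Calling `route(text)` on the text of one form returns a small dict: `status` is `"auto"` when every pattern matched and `"needs_human"` otherwise, with the unmatched field names listed under `missing`.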

        • curbstickle@anarchist.nexusOPM
          20 hours ago

          You’d be wrong.

          The fields aren’t all the same kinds of values, so the relationships between the data have to be evaluated before entry.

          You’re assuming this is transposing contents, which was not the issue. Your example is what was initially planned and halted before transitioning to the approach I helped deploy.

          • midribbon_action@lemmy.blahaj.zone
            20 hours ago

            That’s wrong, you didn’t know that there’s another if/else statement required by them. That’s what the supercomputer is for.

            That’s how you sound.

            • curbstickle@anarchist.nexusOPM
              20 hours ago

              So I’ll go back to my previous comment; you’re not actually interested in understanding the use, you have a pre-determined (and uninformed) view of use and operation, and providing that information as an example is “long-winded”.

              I’ll be done with this discussion now. Enjoy your day.

              • midribbon_action@lemmy.blahaj.zone
                19 hours ago

                The difference is that each person didn’t need to hunt across the form to find the details. When the comparison comes up for approval at each stage, they get the snippet being brought in and the field it’s being applied to.

                This is the only technical detail in the whole 500 word comment.