TL;DR: See title. How can I tell Google they’re probably processing their mail wrong?

After setting up the Matrix Authentication Service (MAS) with exim-relay as the mail server, I noticed that verification mails sent from the service often end up in the spam folder.

When digging deeper, I found out the mails are failing DKIM authentication. This was weird because DKIM is set up correctly, as verified by other mail providers and online DKIM test tools such as DMARC Tester.

Searching online for “gmail fails DKIM authentication, while other providers pass”, I regularly found reports and posts that either had no resolution or pointed to unrelated causes such as DKIM alignment.

Using meld, I compared the original source of mails as received by gmail with those of other providers, and found a difference:

With other providers, the “From:” and “Reply-To:” header fields appear with double quotes around the display name:

From: "John Smith" <j.smith@example.com>
Reply-To: "John Smith" <j.smith@example.com>

In gmail, where DKIM fails, there are no double-quotes:

From: John Smith <j.smith@example.com>
Reply-To: John Smith <j.smith@example.com>

Since both are supposed to be the raw source, I ruled out presentation issues and dug deeper.
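To see why two quote characters matter at all: DKIM signs a canonicalized copy of the selected header fields. A minimal Python sketch of the "relaxed" header canonicalization from RFC 6376, section 3.4.2 (the function name relaxed_header is my own, and this is not a full DKIM verifier) suggests that quotes are not normalized away, so the two variants stay different. With "simple" canonicalization even less is normalized.

```python
import re


def relaxed_header(raw: str) -> str:
    """Apply RFC 6376 'relaxed' canonicalization to one header field."""
    name, _, value = raw.partition(":")
    name = name.strip().lower()                    # lowercase the field name
    value = re.sub(r"\r?\n(?=[ \t])", "", value)   # unfold continuation lines
    value = re.sub(r"[ \t]+", " ", value)          # collapse runs of whitespace
    value = value.strip()                          # drop whitespace around the value
    return f"{name}:{value}\r\n"


quoted = relaxed_header('From: "John Smith" <j.smith@example.com>')
plain = relaxed_header('From: John Smith <j.smith@example.com>')

# Only case, folding and whitespace are normalized; the quotes survive.
print(repr(quoted))
print(repr(plain))
print("identical after canonicalization:", quoted == plain)
```

Since the signature is computed over the canonicalized bytes, any rewrite that adds or removes the quotes after signing would be expected to invalidate it.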

I found that the Rust crate lettre, which MAS uses, wraps display names containing whitespace in double quotes. After more research and reading RFC 2822 sections 3.2.4 and 3.2.5, I came to the conclusion that whitespace in a display name does not require quoting in mail headers.
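As a rough illustration of that reading of the RFC, here is a small Python check of whether a display name can be written as a plain sequence of atoms. It assumes ASCII-only names and ignores comments and encoded-words; needs_quoting is a hypothetical helper of mine, not part of any library.

```python
import string

# atext as listed in RFC 2822, section 3.2.4: letters, digits and these symbols.
ATEXT = set(string.ascii_letters + string.digits + "!#$%&'*+-/=?^_`{|}~")


def needs_quoting(display_name: str) -> bool:
    """Return True if any whitespace-separated word contains a non-atext character."""
    return any(ch not in ATEXT for word in display_name.split() for ch in word)


for candidate in ("John Smith", "Joe Q. Public", "Smith, John"):
    print(candidate, "->", "quote it" if needs_quoting(candidate) else "atoms are enough")
```

Under these assumptions a name like John Smith passes as two atoms, while a dot or a comma forces the quoted-string form (or the obsolete syntax).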

I created issues upstream and downstream to report the issue at lettre and MAS, particularly that their mails are failing DKIM checks at gmail:

If you’ve read this far, you probably wonder why I’m posting all of this. For one, to provide another data point for people scratching their heads over mail issues.

But beyond that: I’m fairly sure the Google mail servers should not strip the quotes before performing the DKIM check. I assume they run some kind of decode -> process -> encode workflow that re-encodes the headers, this time without the quotes. IMHO a correctly signed message should not produce an authentication error, even if its contents are not perfectly encoded.
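I obviously cannot see what Gmail does internally, but the suspected decode -> encode round trip is easy to reproduce with Python's standard library. This is only an illustration of how such a pipeline can lose the quotes, using email.utils.parseaddr and email.utils.formataddr; it says nothing about Gmail's actual code.

```python
from email.utils import formataddr, parseaddr

signed_value = '"John Smith" <j.smith@example.com>'

# Decode: the quotes are syntax, so they are not part of the parsed name.
name, address = parseaddr(signed_value)

# Encode: formataddr only adds quotes when the name contains special characters.
rewritten_value = formataddr((name, address))

print(rewritten_value)                   # John Smith <j.smith@example.com>
print(rewritten_value == signed_value)   # False

# A dot is a special character, so this name keeps its quotes on re-encoding.
dotted_value = formataddr(("Joe Q. Public", "john.q.public@example.com"))
print(dotted_value)                      # "Joe Q. Public" <john.q.public@example.com>
```

Both header values mean the same thing to a mail client, but they are different byte sequences, which is all a DKIM verifier cares about.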

I would be curious to get feedback from mail experts on what is happening here. This is not my field of expertise, and I’m going by what I’ve learned over the past 48h.

  • hperrin@lemmy.ca

    Quotes around the display name in a header certainly conform to the standard (and in fact, the way Gmail rewrites it does not conform to the standard when it contains a space, but is the obsolete form), and I would expect any decent mail program to leave it alone. Then again, Gmail is not a decent mail program, and hasn’t been for a long time.

    Edit: @w2xel@gehirneimer.de points out below that Gmail’s rewrite does follow the spec, so it conforms to the standard when white space is in the display name. It’s only when there is a dot (.) that it would be using the obsolete form.

    • w2xel@gehirneimer.de (OP)

      I had to think about this for a while, but the standard only obsoletes folding white space, i.e. white space that wraps lines, such as:

      Subject: This
       is a folded line
      

      which is equivalent to

      Subject: This is a folded line
      

      As I understand it, white space is allowed in both the current and the obsolete syntax. Or do I understand it wrong?

      Edit:

      I think in the obsoleted language the following would have been allowed for a From: field as well:

      From: John
       Smith <j.smith@example.com>
      
      • hperrin@lemmy.ca

        Folding white space is a different issue. It’s the atom / quoted-string part. That’s the standard form of a display name. Meaning if it’s more than one word (separated by white space), it should use the quoted string form. It does list obs-phrase as an alternative, but using obsolete syntax should be avoided when possible.

        • w2xel@gehirneimer.de (OP)

          phrase is 1*word, not word. What sets obs-phrase apart is that it additionally allows dots (“.”) and folding whitespace, not that it allows whitespace.

          • hperrin@lemmy.ca

            Atom is defined as such:

            atext           =       ALPHA / DIGIT / ; Any character except controls,
                                    "!" / "#" /     ;  SP, and specials.
                                    "$" / "%" /     ;  Used for atoms
                                    "&" / "'" /
                                    "*" / "+" /
                                    "-" / "/" /
                                    "=" / "?" /
                                    "^" / "_" /
                                    "`" / "{" /
                                    "|" / "}" /
                                    "~"
            
            atom            =       [CFWS] 1*atext [CFWS]
            

            So since atom does not contain white space, there should be no white space. Only comments can separate 1*atom.

            If they wanted to allow separating atoms by a space, it would be written:

            atom *(SP atom)

            Edit: wait, I’m wrong. CFWS includes white space. You’re right.

            • w2xel@gehirneimer.de (OP)

              Ah, but I was right for the wrong reasons. I only now understand it fully myself. Thanks for the nice discussion.

      • hperrin@lemmy.ca

        The section right below shows that it’s obsolete:

        A.6.1. Obsolete addressing
        
           Note in the below example the lack of quotes around Joe Q. Public,
           the route that appears in the address for Mary Smith, the two commas
           that appear in the "To:" field, and the spaces that appear around the
           "." in the jdoe address.
        
        ----
        From: Joe Q. Public <john.q.public@example.com>
        To: Mary Smith <@machine.tld:mary@example.net>, , jdoe@test   . example
        Date: Tue, 1 Jul 2003 10:52:37 +0200
        Message-ID: <5678.21-Nov-1997@example.com>
        
        Hi everyone.
        ----
        

        And in the spec itself, that syntax is named as “obs-phrase”.

        But yes, though obsolete, it is still legal syntax. So I guess I shouldn’t say it “does not conform to the standard”, but rather “just barely conforms to the standard”.

        • w2xel@gehirneimer.de (OP)

          In that same example, Mary Smith is not the issue; the @machine.tld route is, as the description says.

          Joe Q. Public is an issue because of the dot, not the spaces. The spaces are an issue in jdoe@test . example, because that is an address, not a name.

          • hperrin@lemmy.ca

            Yep, you are right. So Gmail is following the spec, but still altering headers. So at least it’s not breaking the email, just breaking the authentication.