• AmbitiousProcess (they/them)@piefed.social · 2 days ago

      The outage also took down people’s banks, which stopped many of them from doing things like buying groceries 💀

      I don’t think saying it’s good for us “touching grass” is a good argument here when AWS hosts such a substantial portion of all online services.

      • CallMeAnAI@lemmy.world · 2 days ago

        How many banks didn’t work? Which ones? You have a source? Visa and MC were good all day here in the real world on the east coast.

        Sounds like you’re just trying to exaggerate around an edge case that frankly isn’t the end of the world, even if it were common for 4 hours a year.

        Why aren’t you blaming the banks for not having redundancy outside a single DC? How many banks do you know of that were successfully using other providers with a higher SLO/SLA?
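
        For scale, here’s the generic availability math behind figures like “4 hours a year” (standard SLA-style arithmetic, not any particular provider’s published numbers):

        ```python
        # Back-of-the-envelope: yearly downtime vs. availability ("nines").
        HOURS_PER_YEAR = 24 * 365  # 8760

        scenarios = [
            ("4 h/yr down", 4.0),
            ("three nines (99.9%)", 8.76),
            ("four nines (99.99%)", 0.876),
        ]

        for label, downtime_hours in scenarios:
            availability = 100 * (1 - downtime_hours / HOURS_PER_YEAR)
            print(f"{label:22s} -> {availability:.3f}% uptime")
        # 4 h/yr works out to ~99.954%, between three and four nines.
        ```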

        • AmbitiousProcess (they/them)@piefed.social · 2 days ago

          I can see why your account is marked with two red marks on PieFed for low reputation, because man do you come off confrontational.

          > How many banks didn’t work? Which ones? You have a source?

          Search engines exist. Use them before acting as if I’m making shit up.

          The list of financial institutions that had issues, as far as I can tell from industry reporting and Downdetector graphs, is Navy Federal Credit Union (~15 million members), Truist (~15 million customers), Chime (~8-9 million customers), Venmo (~60 million users), Ally Bank (~10 million customers), and Lloyds Banking Group (~30 million customers).

          Assuming no overlap, that’s nearly 140 million people who lost banking and money-transfer access.

          > Sounds like you’re just trying to exaggerate around an edge case that frankly isn’t the end of the world, even if it were common for 4 hours a year.

          The outage lasted up to 15 hours in some cases, because many AWS services came back online only to spend additional hours working through the backlog that built up while they were down. Many services also depend on AWS in a way where AWS coming back online doesn’t instantaneously restore their own service. These systems are complex, and not every company that relied on them could start back up the moment the main outage was resolved, let alone while many AWS services were still marked as impacted for hours afterward as they worked through their backlogs.
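
          A rough way to see why the catch-up drags on: while a service is down, work keeps arriving, and once it’s back up it can only drain the backlog with whatever headroom it has over the normal arrival rate. A minimal sketch with made-up illustrative rates (not figures from the actual incident):

          ```python
          # Back-of-the-envelope: how long a backlog takes to drain after an outage.
          # All rates below are made-up assumptions for illustration, not incident data.

          arrival_rate = 10_000   # events/hour that keep arriving regardless
          drain_rate = 12_000     # events/hour the recovered service can process
          outage_hours = 4        # hours the service was fully down

          backlog = arrival_rate * outage_hours             # events queued while down
          catch_up = backlog / (drain_rate - arrival_rate)  # hours to clear them

          print(f"backlog: {backlog} events")
          print(f"extra hours to catch up: {catch_up:.1f}")
          # -> 40000 events and ~20 extra hours, even though the outage itself was 4 hours
          ```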

          > Why aren’t you blaming the banks for not having redundancy outside a single DC? How many banks do you know of that were successfully using other providers with a higher SLO/SLA?

          I do blame the banks too, for not having a fallback, and I blame AWS for allowing such a major outage to happen in the first place. Shockingly, more than one party can be at fault.

        • jj4211@lemmy.world · 2 days ago

          I’m also skeptical that any payment processing networks were impacted. I would be surprised, though less so if people couldn’t manage their accounts online, which might have a similar effect. I’m not surprised at all if grocery stores or restaurants were significantly impacted. I know a lot of the apps were broken, and I could imagine someone used to apping everything leaving their cards at home and being unable to get lunch. There might also be some aggressively “modern” establishments that are kiosk-only, and I could imagine them getting downed by the AWS outage.

          > outside a single DC?

          I’m told that a lot of the companies did all the right things but still got taken down, because some of the Amazon services they depend on are tethered to that single DC and only Amazon has the power to change that.

          • CallMeAnAI@lemmy.world · 2 days ago

            I’ll wait for the final root cause but…

            We mitigated most of it by swapping to secondary DNS and completely taking anything related to AWS DNS and us-east-1 services out of the path. If you didn’t have secondary DNS and were heavily reliant on AWS internal DNS, this might be what they experienced.
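
            As a rough sketch of what checking the secondary before cutting over can look like (hypothetical hostnames and nameserver IPs, using the dnspython library; not our exact tooling):

            ```python
            import dns.resolver  # pip install dnspython

            # Hypothetical zone records and secondary-provider nameservers.
            RECORDS = ["api.example.com", "app.example.com"]
            SECONDARY_NS = ["203.0.113.10", "203.0.113.11"]  # documentation-range IPs

            resolver = dns.resolver.Resolver(configure=False)  # ignore /etc/resolv.conf
            resolver.nameservers = SECONDARY_NS
            resolver.lifetime = 5  # seconds before the secondary counts as unhealthy too

            for name in RECORDS:
                try:
                    answer = resolver.resolve(name, "A")
                    print(name, "->", [r.address for r in answer])
                except Exception as exc:
                    print(name, "FAILED via secondary DNS:", exc)
            ```

            If every record resolves cleanly from the secondary, flipping traffic away from AWS DNS is a much less scary call.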

              • jj4211@lemmy.world · 2 days ago

                I’m not familiar with AWS myself, but they seemed to be referencing something they vaguely characterized as ‘security infrastructure’, kind of as hand-waving for why they thought it made sense for it to be a single point of failure, because distributing it would somehow be insecure…

                I frankly wasn’t interested in delving deeper, because that excuse sounds pretty stupid, but I’d just be chasing details I don’t personally need about something I probably shouldn’t be arguing about anyway. I’ve gotten burned too many times by someone championing something stupid ostensibly in the name of ‘security’ to sign up for another one of those arguments.

      • CallMeAnAI@lemmy.world · 2 days ago

        Sure, that’s what I said.

        Go ahead and move to Rackspace or SAP; I’m sure you’ll have a much more reliable experience. Or just run your own. I’m sure it’ll be easy peasy and super reliable.

    • balance8873@lemmy.myserv.one · 2 days ago

      Some of us have jobs. I mean, I guess you have a job, but in your case losing the network just means those pesky humans stop bothering you and go to a real therapist.

      • CallMeAnAI@lemmy.world · 2 days ago

        I’m a staff engineer who has been dealing with the fallout of SLAs since before Amazon was an idea.

        God forbid I have a P0 where I have to message a bunch of non-technical directors that it’s AWS, not us. Much, much worse than having to figure out and then pull in the team that pushed whatever untested shit made its way into production on a Friday afternoon.

        Unless you’ve been responsible for a SaaS with SLAs in a B2B setting, I know more about the consequences of a provider outage than you.

        • balance8873@lemmy.myserv.one · 2 days ago

          I don’t know what you’re responding to, but it doesn’t seem to be me. Either that, or you forgot the username you picked for yourself, in which case: whoosh.