• MangoPenguin@lemmy.blahaj.zone · ↑14 ↓3 · 1 day ago

    A download of 6 GB is wild. Is that re-downloading the entire package for each one that needs an update? Shouldn’t it be more efficient to download only the changes and patch the existing files?

    At this point it seems like my desktop Linux install needs as much space and bandwidth as Windows does.

    • KubeRoot@discuss.tchncs.de · ↑3 · 18 hours ago

      Shouldn’t it be more efficient to download only the changes and patch the existing files?

      As people mentioned, that becomes problematic with a distro like Arch. You could easily be jumping 5-6 versions in a single update with busier packages, especially if you don’t update often. This means you need to apply the diffs in order, and all of those diffs need to still be available.

      This actually poses two issues. The first is that software usually isn’t built for this kind of binary stability: anything compiled or autogenerated can change a lot from a small source change, and even just compressing data files will throw it off. Because of that, a diff/delta might not end up saving much space, and chaining several of them could end up bigger than just downloading the files directly.

      The second issue is mirrors: they need to store and serve a lot of data, and they’re not controlled by the distribution. Presumably to save space, they quickly drop older package versions, and by older I mean potentially less than a week old. For diffs/deltas to work, the mirrors would not only have to keep the full package files they already do (for any new installs), but also deltas going N days back, and those would only be useful to people who update more often than every N days.
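
      Just to make that bookkeeping concrete, here’s a rough sketch (the package history, versions and mirror contents below are all made up): the client has to walk every intermediate hop in order, and the moment the mirror has pruned any one of those deltas, the whole chain is useless and you’re back to a full download.

      ```python
      # Hypothetical example, not real package-manager code.
      installed = "1.2.1"
      # Versions the package went through on the rolling release, oldest to newest.
      history = ["1.2.1", "1.2.2", "1.2.3", "1.3.0"]
      # Deltas the mirror still carries; the oldest one has already been pruned.
      deltas_on_mirror = {("1.2.2", "1.2.3"), ("1.2.3", "1.3.0")}

      # Deltas have to be applied in order, one hop per intermediate version.
      hops = list(zip(history, history[1:]))
      needed = hops[history.index(installed):]

      if all(hop in deltas_on_mirror for hop in needed):
          print("apply deltas in order:", needed)
      else:
          # The 1.2.1 -> 1.2.2 delta is gone, so the chain can't be applied.
          print("fall back to downloading the full package")
      ```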

    • Ephera@lemmy.ml · ↑5 · 23 hours ago

      This doesn’t work too well for rolling releases, because users will quickly get several version jumps behind.

      For example, let’s say libbanana is currently at version 1.2.1 and then releases 1.2.2, which you as a distro ship right away; a few days later they’ve already released 1.2.3, which you ship too.
      Now Agnes comes home on the weekend and runs package updates on her system, which is still on libbanana v1.2.1. At that point she would need the diffs 1.2.1→1.2.2 and then 1.2.2→1.2.3 separately, even though they may overlap in which files changed.

      In principle, you could additionally provide the diff 1.2.1→1.2.3, but if Greg updates only every other weekend and libbanana has celebrated its 1.3.0 release by then, you will also need the diffs 1.2.1→1.3.0, 1.2.2→1.3.0 and 1.2.3→1.3.0. So this strategy quickly explodes in the number of different diffs you might need.

      At that point, just not bothering with diffs and making users always download the new package version in full is generally preferred.
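
      To put rough numbers on that explosion (reusing the made-up libbanana versions from above): giving every older install a direct jump to every newer version means one delta per ordered pair of versions, which grows quadratically, while shipping full packages stays at one file per version.

      ```python
      from itertools import combinations

      # Hypothetical version history of libbanana from the example above.
      versions = ["1.2.1", "1.2.2", "1.2.3", "1.3.0"]

      # One direct delta per (older, newer) pair of versions.
      deltas = list(combinations(versions, 2))

      print(len(versions), "full packages vs", len(deltas), "direct deltas")
      # -> 4 full packages vs 6 direct deltas; with n versions that is
      #    n*(n-1)/2 deltas to build and host, versus exactly n full files.
      ```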

      • MangoPenguin@lemmy.blahaj.zone · ↑2 · 10 hours ago

        Interesting. It wouldn’t work like rsync, where it compares the new files to the old ones and only transfers the parts that changed?

        • Ephera@lemmy.ml · ↑1 · 9 hours ago

          Hmm, good question. The one such implementation I know of is Delta RPM, and it works the way I described.
          But I’m not sure whether they just designed it that way to fit into the existing architecture, where the mirrors and such were already set up to deal with whole package files.

          I could imagine that doing it rsync-style would be really terrible for server load, since you can’t really cache things at that point…
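
          For what it’s worth, here’s a very stripped-down sketch of the rsync idea (fixed-size blocks instead of rsync’s rolling checksum, purely hypothetical code). The expensive part is the matching step: it has to be redone against each client’s particular old file, so there’s no static artifact a mirror could just cache and serve.

          ```python
          import hashlib

          BLOCK = 4096  # real rsync uses a rolling checksum so matches can start
                        # at any byte offset; fixed blocks keep the sketch short

          def old_block_hashes(old: bytes) -> dict[bytes, int]:
              """Client side: hash each block of the old file it already has."""
              return {hashlib.sha1(old[i:i + BLOCK]).digest(): i
                      for i in range(0, len(old), BLOCK)}

          def make_delta(new: bytes, old_hashes: dict[bytes, int]) -> list:
              """Server side: walk the new file, reusing blocks the client has."""
              delta = []
              for i in range(0, len(new), BLOCK):
                  chunk = new[i:i + BLOCK]
                  digest = hashlib.sha1(chunk).digest()
                  if digest in old_hashes:
                      delta.append(("copy", old_hashes[digest]))  # client already has it
                  else:
                      delta.append(("data", chunk))  # literal bytes that must be sent
              return delta
          ```

          Delta RPM sidesteps that by shipping precomputed delta files for specific version pairs, which a mirror can store and serve like any other static file.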

          • MangoPenguin@lemmy.blahaj.zone · ↑2 · 3 hours ago

            Yeah, I guess these days the majority of users have fast enough connections that it’s not worth it. It sucks if you have crappy internet though, hah.

    • Ricaz@lemmy.dbzer0.com · ↑1 · 19 hours ago

      No, that’s not how compiling works. And yes, 6 GB is wild. If I don’t update for a month, the download might be 2 GB and the net change will still be smaller.

      I don’t think I could get close to my Windows installation even if I installed literally every single package…

    • Olap@lemmy.world · ↑16 ↓1 · 1 day ago

      Patching means rebuilding. And packagers don’t really publish diffs. So it’s “use all your bandwidth” instead!

      • [object Object]@lemmy.world · ↑1 · 3 hours ago

        With stuff like rsync, diffs can be calculated on the fly, but that requires way more server CPU than just chucking files onto the network.

      • definitemaybe@lemmy.ca · ↑29 · 1 day ago

        Which is WAY more economical.

        Rebuilding packages takes a lot of compute. Downloading mostly requires just flashing some very small lights very quickly.

        • cmnybo@discuss.tchncs.de · ↑7 ↓1 · 1 day ago

          If you have multiple computers, you can always set up a caching proxy so you only have to download the packages once.
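
          As a toy illustration of that idea (hypothetical mirror URL, no error handling, cache validation or header passthrough), a tiny proxy only has to store each package file the first time any machine on the LAN asks for it:

          ```python
          import os
          import urllib.request
          from http.server import BaseHTTPRequestHandler, HTTPServer

          UPSTREAM = "https://mirror.example.org/archlinux"  # hypothetical upstream mirror
          CACHE_DIR = "./pkg-cache"

          class CachingHandler(BaseHTTPRequestHandler):
              def do_GET(self):
                  cached = os.path.join(CACHE_DIR, self.path.strip("/").replace("/", "_"))
                  if not os.path.exists(cached):
                      # The first machine to ask pulls the file from the mirror once.
                      os.makedirs(CACHE_DIR, exist_ok=True)
                      with urllib.request.urlopen(UPSTREAM + self.path) as resp:
                          data = resp.read()
                      with open(cached, "wb") as f:
                          f.write(data)
                  # Every later request from the LAN is served from local disk.
                  with open(cached, "rb") as f:
                      data = f.read()
                  self.send_response(200)
                  self.send_header("Content-Length", str(len(data)))
                  self.end_headers()
                  self.wfile.write(data)

          if __name__ == "__main__":
              HTTPServer(("0.0.0.0", 8080), CachingHandler).serve_forever()
          ```

          A real setup would use a proper caching proxy rather than this sketch; the point is just that only the first request ever hits the mirror.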

          • SmoochyPit@lemmy.ca · ↑1 · 1 day ago

            That reminds me of Chaotic AUR, though it’s an online public repo. It automatically builds popular AUR packages and lets you download the binaries.

            It sometimes builds against outdated libraries/dependencies though, so for pre-release software I’ve sometimes still had to download and compile it locally. Also, you can’t apply any patches or pin an older commit like you can with normal AUR packages.

            I’ve found it’s better to use Arch Linux’s official packages when I can, though, since they always publish binaries built with the same latest-release dependencies. I haven’t had dependency version issues with that, as long as I’ve avoided partial upgrades.

          • Ephera@lemmy.ml · ↑2 · 23 hours ago

            openSUSE Leap does have differential package updates. Pretty sure I once saw it on one of the Red-Hat-likes, too.

            But yeah, it makes most sense on slow-moving, versioned releases with corporate backing.