• MHLoppy@fedia.io · 17 hours ago

    I actually think this video is doing a pretty bad job of summarizing the practical-comparison part of the paper.

    If you go here you can get a GitHub link, which in turn has a OneDrive link with a dataset of the images and textures they used. (This doesn’t include some of the images shown in the paper - not sure why, and I don’t really want to dig into it, because spending an hour writing one comment as-is is already a suspicious use of my time.)

    Using the example with an explicit file size mentioned in the video, which I’ll re-encode with Paint.NET while trying to match the ~160KB file size:

    Hadriscus has the right idea in suggesting that JPEG is the wrong comparison, but this type of low-detail image at low bit rates is actually where AVIF, rather than JPEG XL, shines. JPEG XL (for this specific image) looks a lot worse at the above settings, and WebP is generally just worse than AVIF or JPEG XL for compression efficiency since it’s much older. This type of image is also where I’d guess this kind of compression/reconstruction technique does comparatively well.
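
    (For anyone who wants to reproduce this kind of size-matched comparison, here’s a rough sketch of the process - it assumes Pillow rather than Paint.NET, since Paint.NET is GUI-only, and the file name, format, and target are placeholders:)

    ```python
    # Sketch: binary-search the encoder quality so the output lands at or
    # just under a byte budget. Assumes quality maps monotonically to file
    # size, which holds closely enough for JPEG/WebP in practice.
    import io
    from PIL import Image

    def encode_near_size(src: str, target_bytes: int, fmt: str = "WEBP") -> bytes:
        im = Image.open(src)
        lo, hi, best = 1, 100, b""
        while lo <= hi:
            q = (lo + hi) // 2
            buf = io.BytesIO()
            im.save(buf, format=fmt, quality=q)
            data = buf.getvalue()
            if len(data) <= target_bytes:
                best, lo = data, q + 1  # fits the budget; try higher quality
            else:
                hi = q - 1              # too big; back off
        return best

    data = encode_near_size("example.png", 160 * 1024)  # ~160KB target
    ```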

    But honestly, the technique as described by the paper doesn’t seem to be trying to directly compete against JPEG, which is another reason I don’t like that the video put a spotlight on that comparison; quoting the paper:

    We also include JPEG [Wallace 1991] as a conventional baseline for completeness. Since our objective is to represent high-resolution images at ultra-low bitrates, the allowable memory budget exceeds the range explored by most baselines.

    Most image compression formats (with AVIF being a possible exception) aren’t tailored for “ultra-low bitrates”. Nevertheless, here’s another comparison with the flamingo photo in the dataset, where I’ll try to match the 0.061 bpp low-side bit rate target (if I’ve got my math right that’s 255,860.544 bits; the arithmetic is sketched after the list):

    • Original PNG (2,811,804 bytes) https://files.catbox.moe/w72nsv.png
    • AVIF; as above but quality 30 (31,238 bytes) https://files.catbox.moe/w2k2eo.avif
    • JPEG XL could not go below ~36KB even at quality 0 when using my available encoder, so I considered it to fail this test
    • JPEG (including when using MozJPEG, which is generally more efficient than “normal” JPEG) and WebP could only hit the target file size by looking garbage, so I considered them to fail this test out of hand
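
    (For reference, here’s the bpp-to-bytes arithmetic behind that target as a quick sketch - the flamingo photo’s exact resolution isn’t restated here, so the dimensions below are placeholders:)

    ```python
    # Bits-per-pixel budget -> file size. Width and height are hypothetical;
    # plug in the actual image's dimensions.
    def budget(width: int, height: int, bpp: float) -> tuple[float, float]:
        bits = width * height * bpp
        return bits, bits / 8  # (bits, bytes)

    bits, nbytes = budget(2048, 2048, 0.061)
    print(f"{bits:,.3f} bits -> {nbytes:,.1f} bytes")  # 255,852.544 -> 31,981.6
    ```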

    (Ideally I would now compare this image at some of the other, higher bpp targets but I am le tired.)

    It looks like interesting research for low bit rate / low bpp compression techniques and is probably also more exciting for anyone in the “AI compression” scene, but I’m not convinced about “Intel Just Changed Computer Graphics Forever!” as the video title.


    As an aside, every image in the supplied dataset looks weird to me (even the ones marked as photos), as though it were AI-generated or AI-enhanced or something - not sure if the authors are trying to pull a fast one or if misuse of generative AI has eroded my ability to discern reality 🤔


    edit: to save you from JPEG XL hell, here’s the JPEG XL image (which you probably can’t view) losslessly re-encoded to a PNG: https://files.catbox.moe/8ar1px.png

    • WereCat@lemmy.world (OP) · 7 hours ago

      Thanks for the analysis and comparisons. I agree that the spotlight was too much on the image compression, which is not really the main advantage of this rendering technique.

      My main takeaway from this video was not that this should be used for image compression, but that it’s viable for real-time 3D scene rendering. Yes, the compression is impressive vs JPEG, and there are image formats with better compression… but this technique runs in real time, so the processing speed is viable for 3D scene rendering.

      I feel like most people are comparing only quality vs file size, but speed should also be a big factor in those comparisons.
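
      (Something like this sketch could fold decode speed into the comparison - it assumes Pillow with whatever codec plugins your build can decode, and the file names are placeholders:)

      ```python
      # Minimal decode-speed measurement; a real benchmark would want
      # warm-up runs, more samples, and checks that outputs match.
      import time
      from PIL import Image

      def decode_ms(path: str, runs: int = 10) -> float:
          total = 0.0
          for _ in range(runs):
              start = time.perf_counter()
              with Image.open(path) as im:
                  im.load()  # force a full decode, not just a header read
              total += time.perf_counter() - start
          return total / runs * 1000

      for path in ("flamingo.png", "flamingo.avif"):  # hypothetical files
          print(path, f"{decode_ms(path):.1f} ms")
      ```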

    • KingRandomGuy@lemmy.world · 9 hours ago

      I’ll need to give this a read, but I’m not super sure what’s novel here. The core idea sounds a lot like GaussianImage (ECCV ’24), in which they basically perform 3DGS except with 2D Gaussians, fitting an image with fewer parameters than implicit neural methods. Thanks for the breakdown!
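
      (Not the paper’s code or GaussianImage’s - just a toy sketch of the shared core idea, an image represented as a sum of colored, anisotropic 2D Gaussians. All parameters below are random; an actual method would fit them to a target image, e.g. by gradient descent on an MSE loss:)

      ```python
      # Toy 2D Gaussian "splat" renderer: each Gaussian has a center,
      # per-axis scales, a rotation, and an RGB color; the image is the
      # clipped sum of all splats.
      import numpy as np

      H, W, N = 64, 64, 3  # image size and number of Gaussians

      rng = np.random.default_rng(0)
      means = rng.uniform([0, 0], [W, H], size=(N, 2))  # centers (x, y)
      scales = rng.uniform(4.0, 12.0, size=(N, 2))      # per-axis std dev
      thetas = rng.uniform(0.0, np.pi, size=N)          # rotations
      colors = rng.uniform(0.0, 1.0, size=(N, 3))       # RGB weights

      ys, xs = np.mgrid[0:H, 0:W]
      pix = np.stack([xs, ys], axis=-1).astype(np.float64)  # (H, W, 2)

      image = np.zeros((H, W, 3))
      for mu, s, th, c in zip(means, scales, thetas, colors):
          # Build the 2x2 covariance from rotation and scales, mirroring
          # how 3DGS parameterizes its (3D) covariances.
          R = np.array([[np.cos(th), -np.sin(th)],
                        [np.sin(th),  np.cos(th)]])
          cov = R @ np.diag(s**2) @ R.T
          d = pix - mu
          # Per-pixel Mahalanobis distance d^T Sigma^-1 d
          m = np.einsum("hwi,ij,hwj->hw", d, np.linalg.inv(cov), d)
          image += np.exp(-0.5 * m)[..., None] * c

      image = np.clip(image, 0.0, 1.0)  # ready to compare against a target
      ```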

  • Hadriscus@jlai.lu · 1 day ago

    Is this the future of image compression? They compare it to a JPEG of equivalent size and it’s much better, but JPEG is ancient; it’s far from being a good baseline. Compare it to something modern like JPEG XL or WebP.