• 4 Posts
  • 1.92K Comments
Joined 2 years ago
Cake day: March 22nd, 2024






  • brucethemoose@lemmy.world to Games@lemmy.world: End of an era?
    11 hours ago

    A funny anecdote from outside gaming land:


    Sony mirrorless cameras, even MEGA expensive ones, have this quirk where they can’t film 30P HEVC video.

    The A6700, for instance, can record H.264 at 24, 30, or 60 fps, but H.265 only at 24 or 60 fps. The A7C II is particularly handicapped, with only 24 fps H.265.

    The $7000 A1 flagship is the same. It doesn’t make any sense; all their hardware supports it.

    Why?

    Well, people kept asking Sony, and someone finally got a response:

    https://us.community.sony.com/s/question/0D54O00007B6m2ISAR/why-sony-a7s-iii-doesnt-support-25-or-30-fps-in-h265-codech265-is-available-for-24-50-60-100-120-fps-but-not-the-once-ive-mentioned-aboveplease-sony-update-this-h265-is-going-to-get-more-and-more-popular-with-new-powerful-laptops-coming?language=en_US

    Hello, Sony’s commitment to customer satisfaction is our top priority. The reason for that is that we determined the current product specifications from our analyses of target users and available technologies. For XAVC HS, we envision users enjoying HLG movies by connecting the camera to an HDR (HLG) compatible BRAVIA TV over HDMI, and 60p is what we view as the default setting for playback with it. To record movies in higher quality in 4K/30p, we would recommend selecting XAVC S and XAVC S-I, 10-bit 4:2:2 sampling, either Long GoP and All-I codec. Could you tell us how creators such as yourself want to use HEVC at 30p so we can provide feedback to our design team?


    I found that oddly enlightening.

    “Well, it’s not optimal for a BRAVIA TV over HDMI, so why would users want this? Why is it worth it for us to waste half a kilobyte for this option in the UI?” Sony cannot fathom anything outside that.

    It makes that whole list in OP’s post make total sense. It’s not even cynicism on their part; with that mindset, what would they need a disc player for? What value is Bluepoint in a business review?

    They literally cannot understand why PlayStation users would want any of those things. It’s outside the domain.








  • Because, with a cursory glance, it doesn’t always look like spam.

    A classic example I see starts with “I built a…” in the title, has a wall of text in the description, and actually promises to do something interesting. Only upon deeply inspecting the code (or trying it yourself)… it becomes clear it’s hallucinated nonsense.

    And it’s not always malicious, either. A lot of devs get deep in AI psychosis and truly believe they’re building something revolutionary with their vibe coding agent.

    And sometimes these projects are interesting!


    Hence it would be EXTREMELY helpful to have this tagged, up front. To me, an [AIP] tag is a gigantic red flag that warrants extra caution, but not necessarily a smoking gun, and it would help “regular” homebuilt projects stand out from the vibecoded ones.

    And [AIT] is just nice to have. Some users don’t want to see any AI in /c/selfhosted, period. Hence AI discussion posts get reported because people interpret them as spam. A tag would clarify that nebulous distinction, while giving those users a way to easily filter AI posts out.




  • Failure to provide a disclosure after using the tag would mean removing the post. It could be locked, but I would have to assume the majority of the spam-type postings that happened to make it past the rule 7 criteria are the ones who will not provide the requested disclosure. I think it makes for a good filter this way, but please comment if you think otherwise.

    Sounds reasonable to me!

    I think the major choice is for y’all (the mod team), as enforcing a tagging system is going to increase the moderation workload. Though I guess it would cut back on AI reports, like you said.

    I have no recommendations for an existing bot.


    …You could use an embeddings model for a little extra automation though.

    This is a pre-LLM thing, but basically you could feed a script new untagged posts, use an embeddings model to compare the text of their bodies to a keyword (“AI”?), and spit out a number as a rough “similarity” metric. If it’s above a certain threshold (i.e. if the post seems AI related), send a message to the moderation team to check it, or maybe even post a rules reminder in the comments.

    And FYI, embeddings models are tiny, so it doesn’t need special resources to run or anything.
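To make that concrete, here is a rough sketch of the triage loop. Everything in it is an assumption for illustration: `toy_embed` is a bag-of-words stand-in (not a real embeddings model), the post dictionaries, the query string, and the threshold are all made up, and nothing here talks to Lemmy. In practice you would swap `toy_embed` for whatever small embeddings model you actually run and tune the threshold by hand.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Stand-in embedder: plain word counts. Replace with a real embeddings model."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return dict(Counter(words))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def flag_untagged(
    posts: List[Dict[str, str]],
    query: str = "ai llm vibe coded agent",  # hypothetical "AI-ness" probe text
    threshold: float = 0.15,                 # hypothetical cutoff, tune by hand
    embed: Callable[[str], Vector] = toy_embed,
) -> List[Tuple[str, float]]:
    """Return (title, score) for untagged posts that look AI related."""
    query_vec = embed(query)
    flagged = []
    for post in posts:
        # Posts that already carry a tag need no reminder.
        if post["title"].startswith(("[AIP]", "[AIT]")):
            continue
        score = cosine(embed(post["title"] + " " + post["body"]), query_vec)
        if score >= threshold:
            flagged.append((post["title"], round(score, 3)))
    return flagged
```

The flagged list is only a hint for a human moderator to look at; a similarity score like this will misfire in both directions, so it should not remove anything on its own.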



  • brucethemoose@lemmy.world to Selfhosted@lemmy.world: Selfhosted & AI
    3 days ago

    They 100% do. They’re probably serving “naive” FP8 via vLLM, which is worse than you’d think, especially if they flip on the awful FP8 KV cache.


    In a local quant, you can stop quantized models from falling apart at higher CTX by leaving the attention heads at a higher-precision quantization. As an example, with MiMo 2.5, I have all the MoE MLP layers at IQ3_KT, the dense experts at Q6_K, but all the attention layers at Q8_0.

    For Qwen 27B, I’m still experimenting, but leaning towards IQ4_KT for the MLPs, Q6_K for attention, and Q8_0 for the small, very sensitive KV heads. Or a similar scheme as an exl3 quant.
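As a sketch of what a per-tensor scheme like that looks like when written down, here is a small rule table that maps tensor names to quant types, first match wins. The tensor name patterns are illustrative guesses at GGUF-style naming, the reading of "KV heads" as the K and V projection tensors is an assumption, and the F32 rule for norms and the Q8_0 fallback are additions for the example. It only produces a plan; it does not call any quantization tool.

```python
import re
from typing import Dict, List, Tuple

# (regex, quant type) pairs, checked in order. Patterns are hypothetical
# examples of GGUF-style tensor names, not taken from a specific model.
RULES: List[Tuple[str, str]] = [
    (r"_norm\.weight$", "F32"),          # norms are tiny, keep them unquantized
    (r"attn_(k|v)\.weight$", "Q8_0"),    # small, sensitive K/V projections
    (r"attn_.*\.weight$", "Q6_K"),       # the rest of attention
    (r"ffn_.*\.weight$", "IQ4_KT"),      # MLP tensors, the bulk of the weights
]
DEFAULT_QUANT = "Q8_0"  # assumed fallback for anything unmatched (embeddings, output)


def pick_quant(tensor_name: str, rules: List[Tuple[str, str]] = RULES) -> str:
    """Return the quant type of the first rule whose pattern matches the name."""
    for pattern, quant_type in rules:
        if re.search(pattern, tensor_name):
            return quant_type
    return DEFAULT_QUANT


def plan(tensor_names: List[str]) -> Dict[str, str]:
    """Build a name -> quant type mapping for a list of tensors."""
    return {name: pick_quant(name) for name in tensor_names}
```

Rule order matters here: the K/V rule has to sit above the general attention rule, otherwise the broader pattern would claim those tensors first.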


    That being said, sometimes even unquantized models fall apart in certain long context scenarios because the max advertised context is a lie. You just have to test them and see, but Qwen has certainly done this in the past.



  • brucethemoose@lemmy.world to Selfhosted@lemmy.world: Selfhosted & AI
    3 days ago

    It drops off, but not as much as you’d think.

    MiMo uses 5:1 SWA, so its long-context compute doesn’t increase as catastrophically as older models. That, and most of the “slowness” comes from the MoE layers being on CPU (whereas the attention layers that get heavier at high context are all on the 3090).
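Some back-of-the-envelope arithmetic shows why the interleaving helps. This sketch counts how many cached positions a single new token attends to, summed over layers, reading "5:1" as five sliding-window layers per full-attention layer (an assumption). The layer count and window size used in the usage lines are hypothetical placeholders, not MiMo's real configuration.

```python
def attended_positions(
    context_len: int,
    n_layers: int,
    window: int,
    swa_per_global: int = 5,
) -> int:
    """KV positions one new token attends to, summed across all layers."""
    group = swa_per_global + 1
    total = 0
    for layer in range(n_layers):
        # Last layer of every group is full attention; the rest are windowed.
        is_global = layer % group == swa_per_global
        total += context_len if is_global else min(context_len, window)
    return total


def full_attention_positions(context_len: int, n_layers: int) -> int:
    """Same count for a model where every layer attends to the whole context."""
    return context_len * n_layers


# Hypothetical shape: 48 layers, 1024-token window, 85K tokens of context.
mixed = attended_positions(85_000, n_layers=48, window=1024)
dense = full_attention_positions(85_000, n_layers=48)
print(f"mixed/dense attention work per token: {mixed / dense:.2f}")
```

The windowed layers stop growing once the context exceeds the window, so only the one-in-six full-attention layers keep getting heavier as the context fills up.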

    That’s the beauty of these MoEs: they’re just the right size for the “compute-lite” parts to stay in CPU RAM.

    I will measure it tomorrow. It is a constant ~9-10 TPS for short queries, but definitely slower near my current max context of 85K.


    And do you mean prompt compaction? I don’t automate that; when I use that particular model, I tend to use it in Mikupad, aka “raw” notepad mode, and manipulate the context directly. This is so I can do things like chop out conversations, pick different tokens from the logprobs, or edit its own replies/thinking and continue mid reply.

    I like manually handling this because, being a local model, prompts are cached. Streaming starts quickly if most of the prompt stays cached, which is actually a really nice advantage over APIs.
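A minimal sketch of why manual edits interact with caching, assuming the cache is reused by longest shared token prefix (a common scheme, though the details depend on the backend). The token IDs are made up.

```python
from typing import Sequence


def shared_prefix_len(cached: Sequence[int], new: Sequence[int]) -> int:
    """Number of leading tokens the new prompt shares with the cached one."""
    count = 0
    for old_tok, new_tok in zip(cached, new):
        if old_tok != new_tok:
            break
        count += 1
    return count


def tokens_to_reprocess(cached: Sequence[int], new: Sequence[int]) -> int:
    """Tokens that fall after the shared prefix and must be processed again."""
    return len(new) - shared_prefix_len(cached, new)


cached = [1, 2, 3, 4, 5, 6, 7, 8]          # hypothetical token IDs from the last run
edit_tail = [1, 2, 3, 4, 5, 6, 9, 9, 9]    # rewrote the end of the last reply
chop_early = [1, 4, 5, 6, 7, 8]            # cut out an early chunk of conversation

print(tokens_to_reprocess(cached, edit_tail))   # only the tail is recomputed
print(tokens_to_reprocess(cached, chop_early))  # nearly everything is recomputed
```

Under that assumption, editing near the end of the context is cheap, while chopping something out near the start invalidates everything after the cut.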