• brucethemoose@lemmy.world
    link
    fedilink
    English
    arrow-up
    2
    arrow-down
    1
    ·
    6 hours ago

    I just meant for mass inference serving.

    Yeah, I haven’t seen much in the way of bitnet training savings yet, like regular old QAT. It does appear that Deepseek is finetuning their MoEs in a 4-bit format now, though.