• [object Object]@lemmy.ca · 6 hours ago

    Even with a bitnet, it’s almost definitely better to train at high floating-point precision and then refine the weights down to bits.

    I would expect bitnet to require more layers for equivalent quality too.
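    For context on what “refine down to bits” means here: BitNet-style models map high-precision weights to ternary values {-1, 0, +1} with a per-tensor scale. A minimal sketch of that absmean quantization step (function and variable names are mine, not from the thread):

    ```python
    import numpy as np

    def ternary_quantize(w, eps=1e-8):
        """BitNet-style absmean quantization: map float weights to
        {-1, 0, +1} times a per-tensor scale."""
        scale = np.abs(w).mean() + eps           # absmean scale
        q = np.clip(np.round(w / scale), -1, 1)  # ternary codes
        return q, scale

    # Weights are trained in high precision first...
    w = np.random.randn(4, 4).astype(np.float32)
    q, s = ternary_quantize(w)
    w_hat = q * s  # the low-bit approximation actually served
    ```

    The point of the comment stands: the gradient updates happen on `w`, and the ternary `q` is only what you deploy.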

    • brucethemoose@lemmy.world · 6 hours ago

      I just meant for mass inference serving.

      Yeah, I haven’t seen much in the way of bitnet training savings yet; it looks like regular old QAT. It does appear that DeepSeek is fine-tuning their MoEs in a 4-bit format now, though.
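
      For readers unfamiliar with why QAT doesn’t save training compute: the forward pass uses a quantize-then-dequantize ("fake quant") view of the weights, but the master weights and gradients stay in full precision. A minimal sketch of 4-bit symmetric fake quantization (my own illustration, not DeepSeek’s actual recipe):

      ```python
      import numpy as np

      def fake_quant(w, bits=4, eps=1e-8):
          """Uniform symmetric fake quantization as used in QAT:
          quantize then immediately dequantize, so the forward pass
          sees quantization error while weights remain floats."""
          qmax = 2 ** (bits - 1) - 1            # 7 for 4-bit
          scale = np.abs(w).max() / qmax + eps
          return np.round(w / scale).clip(-qmax, qmax) * scale

      w = np.random.randn(64).astype(np.float32)
      w_q = fake_quant(w, bits=4)
      # In a real framework, gradients flow through the rounding via
      # the straight-through estimator: d(fake_quant)/dw is taken as 1.
      ```

      So every step still does full-precision math on the master weights, which is why the training cost matches ordinary high-precision training.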