“The DeepSeek API isn’t free, and to use Qwen you’d have to sign up for Ollama Cloud or something like that.”
To use Qwen, all you need is a decent video card and a local LLM server like LM Studio.
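Seriously, that’s the whole setup. LM Studio serves an OpenAI-compatible API on localhost, so any standard OpenAI client can talk to the local model. Here’s a rough sketch of what that looks like; the model name is just a placeholder for whatever identifier LM Studio shows once you’ve loaded a Qwen3 build:

```python
# Minimal sketch: chatting with a locally served Qwen model through
# LM Studio's OpenAI-compatible local server (default port 1234).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1234/v1",  # LM Studio's local endpoint
    api_key="lm-studio",  # any placeholder works; no real key needed locally
)

response = client.chat.completions.create(
    model="qwen3-30b-a3b",  # placeholder; use whatever name LM Studio lists
    messages=[{"role": "user", "content": "Explain quantization in one paragraph."}],
)
print(response.choices[0].message.content)
```

No cloud account, no API bill; the same script works against any model you load.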
“Local deployment is prohibitive.”
There’s a shitton of LLMs in various sizes to fit whatever video card you have. Don’t have the 256GB of VRAM required for the full 235B Qwen3 model at 8-bit quantization? Fine, grab the 4-bit quantized 30B model that fits into a 24GB card. Or a Qwen3 8B Base post-trained on DeepSeek-R1, quantized to 6-bit, that fits on an 8GB card (rough math below).
There are literally hundreds of variations that people have made to fit whatever size you need… because it’s fucking open-source!
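And if you want to sanity-check those VRAM numbers, the back-of-the-envelope math is dead simple: weight memory is roughly parameters × bits-per-weight ÷ 8 bytes, plus some headroom for the KV cache and runtime buffers. Quick sketch below; the 10% overhead factor is my guess, not a spec, and real usage shifts with context length and architecture:

```python
# Rough VRAM estimate: weights take params * bits / 8 bytes; add a
# little headroom for KV cache and runtime buffers. Ballpark only.
def approx_vram_gb(params_billion: float, bits_per_weight: int,
                   overhead: float = 1.1) -> float:
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb * overhead

for name, params, bits in [
    ("Qwen3 235B @ 8-bit", 235, 8),  # ~258 GB -> the "256GB" tier
    ("Qwen3 30B  @ 4-bit",  30, 4),  # ~16 GB  -> fits a 24GB card
    ("Qwen3 8B   @ 6-bit",   8, 6),  # ~7 GB   -> fits an 8GB card
]:
    print(f"{name}: ~{approx_vram_gb(params, bits):.0f} GB")
```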