Qwen3 30B A3B, for example, is brilliant for its size, and I can run it on my 8 GB VRAM + 32 GB RAM system at around 20 tokens per second. For lower-powered systems, Qwen3 4B + a search tool is also insanely great for its size and can fit in less than 3 GB of RAM or VRAM at Q5 quantization.
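For anyone who wants to sanity-check the "less than 3 GB" figure, here is a rough back-of-the-envelope sketch. The numbers in it are my assumptions, not measurements: about 4.0 billion parameters for the 4B model, roughly 5.5 bits per weight for a Q5-style quant, and a hypothetical 4.5 bits per weight for the 30B model (the quant used for it isn't stated above). It only counts the weights, so KV cache and runtime overhead come on top.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes).

    Ignores KV cache, activations, and runtime overhead.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Assumption: ~4.0B parameters at ~5.5 bits per weight (Q5-style quant).
small = weight_memory_gb(4.0, 5.5)
print(f"4B at ~5.5 bpw:  {small:.2f} GB")

# Assumption: ~30B parameters at a hypothetical ~4.5 bits per weight.
large = weight_memory_gb(30.0, 4.5)
print(f"30B at ~4.5 bpw: {large:.2f} GB")
```

Under those assumptions the 4B weights land a bit under 3 GB, and the 30B weights come out well above 8 GB, which is why that model has to be split between VRAM and system RAM rather than living entirely on the GPU.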