My rack is finished for now (because I’m out of money).

Last time I posted, I had some jank cables going through the rack; now we're using patch panels with color-coordinated cables!

But as is tradition, I'm thinking about upgrades and I'm looking at that 1U filler panel. A mini PC with a 5060 Ti 16GB or maybe a 5070 12GB would be pretty sick for moving my AI slop generation into my tiny rack.

I'm also thinking about the Pi cluster at the top. Currently that's running a Kubernetes cluster that I'm trying to learn on. They're all Pi 4 4GB boards, so I was going to start replacing them with Pi 5 8GB/16GB models. Would those be better price/performance for mostly coding tasks? Or maybe a Discord bot for shitposting.

Thoughts? MiniPC recs? Wanna bully me for using AI? Please do!

  • ZeDoTelhado@lemmy.world · 7 points · 1 day ago

    I have a question about AI usage on this: how do you do it? Every time I see AI usage mentioned, it's paired with some sort of 4090 or 5090, so I'm curious what kind of AI usage you can do here.

    • teslasdisciple@lemmy.ca · 13 points · 1 day ago

      I'm running AI on an old 1080 Ti. You can run AI on almost anything, but the less memory you have, the smaller (i.e. dumber) your models will have to be.

      As for the “how”, I use Ollama and Open WebUI. It’s pretty easy to set up.
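
      If it helps, here's a rough sketch of that kind of setup with Docker (assuming an NVIDIA card with the NVIDIA Container Toolkit installed; the ports, volume names, and model tag are just examples):

        # Ollama with GPU access, API exposed on port 11434
        docker run -d --name ollama --gpus=all \
          -v ollama:/root/.ollama -p 11434:11434 ollama/ollama

        # pull a model sized for your VRAM
        docker exec -it ollama ollama pull llama3.2:3b

        # Open WebUI on port 3000, pointed at the Ollama API
        docker run -d --name open-webui -p 3000:8080 \
          --add-host=host.docker.internal:host-gateway \
          -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
          -v open-webui:/app/backend/data \
          ghcr.io/open-webui/open-webui:main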

      • kata1yst@sh.itjust.works · 4 points · 7 hours ago

        Similar setup here with a 7900 XTX; works great, and the 20-30B models are honestly pretty good these days. Magistral, Qwen 3 Coder, and GPT-OSS are most of what I use.

      • ZeDoTelhado@lemmy.world · 1 point · 1 day ago

        I tried a couple of times with Jan AI and local llama, but somehow it doesn't work that well for me.

        But at the same time I have a 9070 XT, so not exactly optimal.

    • chaospatterns@lemmy.world · 5 points · 1 day ago

      Your options are to run smaller models or wait. llama3.2:3b fits in my 1080 Ti's VRAM and is sufficiently fast. Bigger models will get split between VRAM and system RAM and run slower, but it'll work.
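
      For example (the exact numbers depend on the card and model, this is just illustrative):

        # small model: loads entirely into VRAM
        ollama run llama3.2:3b "hello"
        # the PROCESSOR column shows how a loaded model is split,
        # e.g. "100% GPU" for small models vs "43%/57% CPU/GPU" for big ones
        ollama ps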

      Not all models are gen-AI-style LLMs. I also run speech-to-text models on my GPU for my smart home.
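
      As a rough example of that kind of workload (not my exact setup; the model size and file name here are made up), plain Whisper handles short clips fine on older cards:

        pip install -U openai-whisper
        # transcribe a clip on the GPU; pick a model size that fits your VRAM
        whisper doorbell_clip.wav --model small.en --device cuda --output_format txt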

    • nagaram@startrek.website (OP) · 4 points · 1 day ago

      With an RTX 3060 12GB, I have been perfectly happy with the quality and speed of the responses. It's much slower than my 5060 Ti, which I think is the sweet spot for text-based LLM tasks. A larger context window, provided by more VRAM or a web-based AI, is cool and useful, but I haven't found the need for that yet in my use case.

      As you may have guessed, I can't fit a 3060 in this rack. That's in a different server that houses my NAS. I have done AI on my 2018 Epyc server CPU, and it's just not usable; even with 109GB of RAM, not usable. Even clustered, I wouldn't try running anything on these machines. They're for Docker containers and Minecraft servers. Jeff Geerling probably has a video on trying to run AI on a bunch of Raspberry Pis. I just saw his video using Ryzen AI Strix boards, and that was ass compared to my 3060.

      But for my use case, I am just asking AI to generate simple scripts based on manuals I feed it, or to handle some sort of writing task. I either have it take my notes on a topic and make an outline that makes sense, which I then fill in, or I feed it finished writing and ask for grammatical or tone fixes. That's fucking it, and it boggles my mind that anyone is doing anything more intensive than that. I am not training anything, and 12GB of VRAM is plenty if I wanna feed it like 10-100 pages of context. Would it be better with a 4090? Probably, but for my uses I haven't noticed a difference in quality between my local LLM and the web-based stuff.
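
      For what it's worth, that whole workflow is basically one-liners against Ollama (the file names and model tag here are just examples):

        # turn rough notes into an outline to fill in
        ollama run gemma3:12b "Turn these notes into a sensible outline: $(cat notes.md)" > outline.md

        # grammar/tone pass on a finished draft
        ollama run gemma3:12b "Fix grammar and tighten the tone, keep my voice: $(cat draft.md)"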

      • ZeDoTelhado@lemmy.world · 2 points · 1 day ago

        So it's not in this rack. OK, because for a second I was thinking you were somehow able to run AI tasks with some sort of small cluster.

        Nowadays I have a 9070 XT in my system. I have just dabbled in this, but so far I haven't been that successful. Maybe I'll read more into it to understand it better.

        • nagaram@startrek.website (OP) · 3 points · 1 day ago

          Ollama + Gemma/DeepSeek is a great start. I have only run AI on my AMD 6600 XT, and that wasn't great; everything I know says AMD is fine for gaming AI tasks these days, but not really for LLM or gen-AI tasks.

          An RTX 3060 12GB is the easiest and best self-hosted option, in my opinion. New for under $300, and used for even less. However, I was running with a GeForce 1660 Ti for a while, and that's under $100.
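
          If you give it a shot, the pulls are one-liners; the tags below are just examples sized for a 12GB card:

            ollama pull gemma3:12b        # ~8GB at the default 4-bit quant, fits in 12GB VRAM
            ollama pull deepseek-r1:14b   # a bit tighter, but still fits
            ollama run gemma3:12b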