I picked up an old Bitcoin mining rig from my cousin for cheap: four GTX 1660 Supers, a GTX 1660 Ti, and an RTX 3060, all crammed onto an ASUS Z270-P board with 64GB of RAM.
Most people would have parted it out. I turned it into a local AI cluster.
It now runs Ollama, serving multiple LLMs simultaneously: a 30B-parameter model for my primary AI agent (Mr. Peepers, named after the Chris Kattan SNL sketch) and a 14B model for my marketing agent (Velma, the Scooby-Doo nerd with glasses [I don't know why I went there]). No API keys, no cloud bills, no data leaving my network.
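For anyone wondering how the agents actually talk to the rig: Ollama exposes a local HTTP API on port 11434, so each agent just posts to a different model tag. A minimal sketch using only the standard library — the model tags here are placeholders, not necessarily the exact ones I run:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> dict:
    # Ollama's /api/generate takes a model tag and a prompt;
    # stream=False returns a single JSON object instead of chunks.
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model: str, prompt: str) -> str:
    payload = json.dumps(build_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Two agents, two models, one box (placeholder tags — use whatever
# `ollama list` shows on your machine):
# ask("qwen2.5:32b", "Draft the weekly status report.")      # Mr. Peepers
# ask("qwen2.5:14b", "Write three taglines for the launch.") # Velma
```

Both requests hit the same daemon; Ollama handles loading and scheduling the models across the cards.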
Total VRAM across all six cards: 42GB (6GB on each 1660, 12GB on the 3060). Not A100 territory, but more than enough to run quantized models that rival GPT-3.5 in practical use cases.
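The back-of-the-envelope math on why that fits: a quantized model needs roughly (parameters × bits per weight ÷ 8) bytes for the weights, plus overhead for the KV cache and runtime. A rough sketch — the ~4.5 effective bits/weight for Q4 GGUF and the 2GB overhead are my assumptions, not measured numbers:

```python
def vram_gb(params_b: float, bits_per_weight: float = 4.5,
            overhead_gb: float = 2.0) -> float:
    """Rough VRAM estimate: quantized weights + KV cache/runtime overhead."""
    weights_gb = params_b * bits_per_weight / 8  # billions of params -> GB
    return weights_gb + overhead_gb

# 30B at Q4: ~19GB -> has to spread across several of the 6GB cards
# 14B at Q4: ~10GB -> fits on the 12GB 3060 by itself
print(round(vram_gb(30), 1), round(vram_gb(14), 1))  # → 18.9 9.9
```

That's why the 30B model gets sharded across cards while the 14B agent can live entirely on the 3060.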
What it costs me: electricity. What it would cost in the cloud: hundreds of dollars per month, plus time. Lots of time.
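To put a number on "electricity": the wattage, duty cycle, and rate below are assumptions for illustration, not my metered values, but the shape of the comparison holds.

```python
def monthly_power_cost(watts: float, hours_per_day: float,
                       rate_per_kwh: float) -> float:
    """Monthly electricity cost: average draw * runtime * utility rate."""
    kwh_per_month = watts / 1000 * hours_per_day * 30
    return kwh_per_month * rate_per_kwh

# Assumed: ~700W average draw under inference load, 8h/day active, $0.15/kWh
print(f"${monthly_power_cost(700, 8, 0.15):.2f}/month")  # → $25.20/month
```

Even doubling every assumption keeps it well under what comparable always-on cloud GPU time runs.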
The lesson? Used gaming hardware is the best-kept secret in local AI. These cards were designed for rendering frames at 144fps — turns out they're also great at running transformer inference.
#AI #HomeLab #GPU #LocalAI #MachineLearning #DIY #Ollama