A self-hosted LLM runs AI on your own hardware, so your data never leaves the building, and once the card is paid for, it costs only a few cents an hour in electricity to run. Cloud AI is faster and smarter, with nothing to maintain, but your words go to someone else's servers and the bill never stops. For most small businesses, the honest answer to local AI vs cloud AI is that you will use both.
This is a trade, not an upgrade. Below is a fair look at each side, with the real numbers, so you can match the choice to what your business actually needs.
What a self-hosted LLM gives you (and what it costs)
A self-hosted LLM is an AI model running on a computer you own, in your office, with nothing sent to a third party. The appeal is simple. Your data stays put, and after the upfront hardware cost the running cost is tiny. A graphics card pulling around 200 watts uses 0.2 kWh every hour, which at a power rate of $0.1527 per kWh comes to roughly three cents an hour. So "expensive to run at home" is mostly a myth.
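The three-cents figure is simple arithmetic, and it is worth rerunning with your own power rate. Here is a minimal sketch in Python. The 200 watts and the $0.1527 per kWh come from the paragraph above; the 8 hours a day and 22 working days a month are assumptions for illustration, not measurements.

```python
def hourly_power_cost(watts: float, rate_per_kwh: float) -> float:
    """Dollars of electricity used by a device running for one hour."""
    kilowatts = watts / 1000.0
    return kilowatts * rate_per_kwh


# Figures from the article: a ~200 W card at $0.1527 per kWh.
cost_per_hour = hourly_power_cost(watts=200, rate_per_kwh=0.1527)
print(f"${cost_per_hour:.4f} per hour")  # $0.0305 per hour

# Assumed usage pattern (hypothetical): 8 hours a day, 22 working days a month.
hours_per_month = 8 * 22
print(f"${cost_per_hour * hours_per_month:.2f} per month")  # $5.38 per month
```

Swap in the wattage your card actually draws and the rate on your own power bill. A card under load can pull more than its idle figure, so treat the result as a ballpark.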
The honest cost lives elsewhere. The card itself is the real money, and the time you spend setting it up and keeping it behaving is the part most guides skip. This is not "download an app." It is drivers, hardware, and settings you did not know existed. The electricity is cheap. The tinkering is not free.
What cloud AI gives you (and what it costs)
Cloud AI, like the major subscription assistants, is faster and easier, and the models are simply better than anything you can fit on a used card at home. You pay a monthly fee, you type, it works, and someone else handles the hardware and the updates. For most small businesses, most of the time, that is the correct answer, and it is worth saying plainly.
The cost is twofold. First, the bill never stops. A common reference point is a subscription around $200 a month, which is $2,400 a year, every year. Second, your words go to their servers. For a lot of work that is completely fine. For the conversations you specifically do not want logged on a machine you will never see, it is not. That is the gap private AI for business is built to close.
The setup: a mini PC, an eGPU dock, and a used GPU
A practical private AI for business build has three pieces. First, a mini PC that is small, quiet, and lives on the desk. Second, an eGPU dock, which is an external enclosure that lets a desktop graphics card talk to the little PC over one cable. Third, a used GPU to put in the dock, because a brand-new card with enough memory to run a real model costs as much as a car payment, and used is where the actual value sits.
The number that matters most is video memory, or VRAM, because the whole model has to fit inside it or performance falls off a cliff. A used card with enough VRAM to run a useful model is the heart of the build. The rest is getting the dock recognized, loading the model, and confirming it is actually using the GPU and not quietly falling back to the CPU, which is slow enough that you will stop using it.
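A rough way to check whether a model will fit before you buy a card is to estimate its size from the parameter count and the precision it is stored at. The sketch below is a rule of thumb, not a guarantee. The 20 percent overhead for context and runtime buffers is an assumption, and the model and card sizes in the examples are hypothetical, since real usage varies with the software and the context length.

```python
def estimated_vram_gb(params_billions: float, bits_per_weight: float,
                      overhead_fraction: float = 0.2) -> float:
    """Rough VRAM needed, in GB, to hold a model's weights plus headroom.

    overhead_fraction is an assumed allowance for context and runtime
    buffers, not a measured value.
    """
    # One billion parameters at 8 bits per weight is about 1 GB of weights.
    weights_gb = params_billions * bits_per_weight / 8
    return weights_gb * (1 + overhead_fraction)


def fits_in_vram(params_billions: float, bits_per_weight: float,
                 vram_gb: float, overhead_fraction: float = 0.2) -> bool:
    """True if the estimated footprint fits inside the card's VRAM."""
    needed = estimated_vram_gb(params_billions, bits_per_weight, overhead_fraction)
    return needed <= vram_gb


# Hypothetical examples: model sizes and card sizes are placeholders.
print(fits_in_vram(params_billions=8, bits_per_weight=4, vram_gb=12))   # True
print(fits_in_vram(params_billions=70, bits_per_weight=4, vram_gb=24))  # False
```

If the estimate lands close to the card's limit, assume it will not fit comfortably and look at a smaller model or a card with more memory.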
Which one wins for your business
It comes down to your real constraint. If you need speed, ease, and the smartest possible answers, cloud is the right call most of the time, and a used card under a desk does not replace a frontier model. If your real constraint is keeping data in the building, or you run AI heavily enough that a recurring bill stings, a self-hosted LLM earns its place.
On the dollars, a local setup comes out cheaper than a $200-a-month subscription if you count only the sticker prices and use it enough to replace that subscription. What is harder to price is your time, and whether the privacy keeps mattering once the novelty fades, which only months of real use will show. Many businesses land on both: cloud for the heavy, public-facing work, and a local box for the private conversations. You do not have to pick a team.
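To put a number on "use it enough," you can work out how many months it takes for the hardware to pay for itself. This is a sticker-price sketch only. The $1,200 hardware cost and the 176 hours a month are hypothetical placeholders, it assumes the local box fully replaces the subscription, and it leaves out the value of your setup time.

```python
from typing import Optional


def break_even_months(hardware_cost: float, subscription_per_month: float,
                      watts: float, rate_per_kwh: float,
                      hours_per_month: float) -> Optional[float]:
    """Months until the hardware cost is recovered by the subscription saved.

    Returns None if electricity alone costs as much as the subscription,
    in which case the hardware never pays for itself.
    """
    power_per_month = watts / 1000.0 * rate_per_kwh * hours_per_month
    monthly_saving = subscription_per_month - power_per_month
    if monthly_saving <= 0:
        return None
    return hardware_cost / monthly_saving


months = break_even_months(
    hardware_cost=1200,          # hypothetical: mini PC + dock + used GPU
    subscription_per_month=200,  # reference point used in the article
    watts=200,                   # from the article
    rate_per_kwh=0.1527,         # from the article
    hours_per_month=176,         # assumption: 8 hours a day, 22 days
)
print(f"Break-even after about {months:.1f} months")
```

If you end up keeping a cloud subscription alongside the local box, which is where many businesses land, set subscription_per_month to only the portion you actually cancel.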
Companion video on The Chronicler: "Local AI vs cloud AI for a small business" walks through the actual build, the misfires, and the honest comparison as it happened.
Want private AI on your own hardware?
If you have read this far and the privacy side is your real constraint, the build above is exactly the kind of thing we set up. We handle the mini PC, the eGPU dock, the used GPU, the drivers, and the model, and we hand you a working local AI server. Your hardware, your data, nothing sent to a third party.
FAQ
Is a self-hosted LLM cheaper than a cloud AI subscription?
On running costs, usually yes. A card pulling around 200 watts costs roughly three cents an hour to run at a power rate of $0.1527 per kWh. Against a $200-a-month cloud subscription, the hardware pays for itself if you use it heavily enough to replace that subscription. The real cost is the card up front and the hours spent setting it up and keeping it working.
Is a self-hosted LLM as smart as ChatGPT or Claude?
No. The models you can run on a used graphics card at home are not as capable as the frontier models behind a cloud subscription. For private back-and-forth and routine work, a local model is plenty. For hard reasoning or production code, cloud is still the better tool. Anyone claiming a used card replaces a frontier model is overselling it.
What hardware do I need to run private AI for business?
A common setup is three pieces: a small mini PC, an external GPU dock that connects a desktop graphics card over one cable, and a used GPU with enough video memory (VRAM) to fit the model. VRAM is the number that matters most, because the whole model has to fit in it. A used card delivers the most value here.
Should a small business choose local AI or cloud AI?
It depends on your real constraint. If you need speed, ease, and the smartest possible answers, cloud is the correct choice most of the time. If you need your data to stay in the building, or you run AI heavily enough that a recurring bill stings, a self-hosted LLM is worth the setup. Many businesses end up using both.