Do-It-Yourself AI, Part 1: Bringing AI In-House

Let's explore how easy it is to host an LLM on your Windows machine.

AI chatbots such as ChatGPT, Grok and Microsoft Copilot have completely changed the way that we work. Even so, these and other AI chatbots are not without their problems. As an example, I tend to use ChatGPT for research purposes (I do not use AI chatbots to write my blog posts), and I often find myself reaching the daily usage limits for these services. Similarly, because ChatGPT is a cloud-based solution, it sometimes becomes oversaturated with traffic and slows to a crawl. As such, I have occasionally found myself wondering if there is a way to bring ChatGPT or something similar in-house and run it on my own hardware as a way of avoiding Internet latency and daily usage limits.

As it turns out, there is indeed a way of running a large language model (LLM) on your own hardware. In fact, it's possible to have a ChatGPT-like experience without even being connected to the Internet (although an initial download is required). In this article series, I will show you how it's done.

Before I get started, there are two things that you need to know. First, what I am about to show you can work on a wide variety of hardware configurations. Having higher-end hardware will allow you to run more advanced AI models, but there are basic models that will run on relatively modest hardware. I have written a separate blog post on large language model hardware considerations.

One of the things that I discovered as I was preparing to write this blog series was that although having a good GPU tends to result in much better overall performance, you don't absolutely have to have a GPU, despite what some articles on the Internet claim. During my tests, I ran one of the models on a virtual machine with no GPU, and it worked fine (albeit slowly).

The other thing that you need to know before I get into the how-to portion of this blog post is that there are a variety of models to choose from. Some of these models do fine, while others bog you down with a lot of unnecessary information. DeepSeek-R1, for example, is a reasoning model. As such, it takes a long time to get to the point and give you an actual answer to your question.

The good news, however, is that by using a bit of PowerShell scripting, you can make things a lot better. I have developed a script that gets rid of the unnecessary output. I have also designed my script to act as a full-blown GUI chatbot, so that you don't have to worry about interacting with the AI model from the command line.

All of this raises a couple of questions. First, what kind of hardware do you actually need? Second, what is this unnecessary output that my script is filtering?

The AI model that is most notorious for producing unnecessary output is DeepSeek-R1 (there are other models that you can use instead). Like many of the other Ollama models, DeepSeek-R1 is available in various sizes, which means that you can pick a version that is well suited to your own hardware platform. The smallest model contains 1.5 billion parameters and can run on something as modest as the Jetson Orin Nano Super. At the opposite end of the spectrum, there is a model containing 671 billion parameters. Needless to say, that model requires some rather extreme hardware. You can find a list of the available DeepSeek-R1 models here. Other models can be found here.
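
As a quick preview of what Part 2 covers, downloading and running one of these models takes just a couple of console commands, assuming that the Ollama runtime is already installed:

    # Download the smallest (1.5 billion parameter) DeepSeek-R1 variant,
    # then start an interactive chat session with it. Larger variants such
    # as 7b, 14b and 70b follow the same naming pattern.
    ollama pull deepseek-r1:1.5b
    ollama run deepseek-r1:1.5b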

So let's get back to the other question: why does DeepSeek-R1 produce unnecessary output that needs to be filtered out? DeepSeek-R1 works differently from ChatGPT and similar AI chatbots. Most AI chatbots analyze Web content and then use probabilities to string together a series of words that collectively form an answer to your question. DeepSeek-R1, on the other hand, is a reasoning model. In other words, it works through a “thought process” to come up with an answer to your question. The problem is that this thought process is displayed onscreen and can be quite lengthy. It can sometimes be difficult to figure out where the thought process ends and the answer begins.
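
Fortunately, the boundary is easier to find programmatically than visually. When DeepSeek-R1 is hosted through Ollama, the reasoning text typically arrives wrapped in <think> and </think> markers, which is what makes automated filtering practical. A hypothetical raw response might look something like this:

    <think>
    Okay, the user wants to know why the sky is blue. I remember that this
    has something to do with the way sunlight scatters in the atmosphere...
    </think>
    The sky appears blue because shorter (blue) wavelengths of sunlight are
    scattered by air molecules much more strongly than longer wavelengths.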

To show you what I mean, check out Figure 1. Here, I asked DeepSeek-R1 a simple question: why is the sky blue? As you can see in the figure, DeepSeek-R1 goes through a long reasoning process before getting to the answer. Interestingly, the model mentions having once been a child and having siblings. You will also notice that the reasoning process goes on for quite some time and that the answer to the question does not even appear within the screen capture. For anyone who might be curious, you can see what the answer looks like in Figure 2.

Figure 1. This is what DeepSeek-R1's reasoning process looks like.
Figure 2. This is what an answer looks like.

As previously noted, my PowerShell script filters out all of the AI's reasoning and displays only the answer to the question. The chatbot must still take the time to reason out its answer, but as a user, you won't have to wade through the model's ramblings to find it.
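
To give you a rough idea of the technique, here is a minimal sketch (this is not my full script, and it assumes that Ollama is serving the model on its default port of 11434). It submits a question to the model's local REST API and then uses a regular expression to strip out the reasoning block:

    # Build the request body. The model tag is just an example; use
    # whichever DeepSeek-R1 size you pulled earlier.
    $body = @{
        model  = 'deepseek-r1:1.5b'
        prompt = 'Why is the sky blue?'
        stream = $false
    } | ConvertTo-Json

    # Ollama exposes a local REST API on port 11434 by default.
    $reply = Invoke-RestMethod -Uri 'http://localhost:11434/api/generate' `
        -Method Post -ContentType 'application/json' -Body $body

    # Remove the <think>...</think> reasoning block, leaving only the
    # answer. The (?s) flag allows . to match newlines, so the regular
    # expression spans the entire multi-line thought process.
    ($reply.response -replace '(?s)<think>.*?</think>', '').Trim()

Parts 3 and 4 of this series expand this basic pattern into a full-blown GUI chatbot.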

So now that I have explained why I designed my PowerShell script the way that I did, I want to show you how it all works. In Part 2 of this series, I will show you how to deploy DeepSeek-R1 (or other models) and how to use it in a manner similar to what you have seen in the previous screen captures. In Part 3, I will show you how to capture the DeepSeek-R1 output and then filter that output using PowerShell. Finally, in Part 4, I will show you how to use PowerShell to create a full-blown chatbot.

About the Author

Brien Posey is a 22-time Microsoft MVP with decades of IT experience. As a freelance writer, Posey has written thousands of articles and contributed to several dozen books on a wide variety of IT topics. Prior to going freelance, Posey was a CIO for a national chain of hospitals and health care facilities. He has also served as a network administrator for some of the country's largest insurance companies and for the Department of Defense at Fort Knox. In addition to his continued work in IT, Posey has spent the last several years actively training as a commercial scientist-astronaut candidate in preparation to fly on a mission to study polar mesospheric clouds from space. You can follow his spaceflight training on his Web site.
