Ollama
Preparation and Requirements
Theoretically you could even run everything on a laptop, but for sure you would face several issues at least after a while. Better is to have a server with installed docker up and running. Also a reverse proxy, and an URL would make things more smooth, but are not mandatory. As we see later in chapter performance, it would be beneficial if your server has a dedicated graphics card, but also that is not a must.
Installation and Configuration
The guys from the „Open WebUI" project made it extremely easy to get your chatbot running. Basically you just create a new docker-compose.yml file like the one in the example below and start the thing as usual by command „docker compose up -d". That’s it, no joke!
services:
chat:
container_name: chat
image: ghcr.io/open-webui/open-webui:ollama
volumes:
- ./ollama:/root/.ollama
- ./open-webui:/app/backend/data
restart: unless-stopped
#ports:
# - 8080:8080
networks:
caddy:
networks:
caddy:
external: true
As you can see in my example file I customized the network configuration, and also configured my reverse proxy caddy to point access to chat.handtrixxx.com to my new container. As you can see in the following screenshot you can click on „Sign up" to create a new user account for yourself as Administrator.
Now, after you logged in, there are just two steps more to do to start your A.I. chats. At first you should go to the admin panel and then at the „Admin settings" to disable registration for other users to avoid other users just create an account on your instance. Then in the settings at the models tab you will have to download one ore more language models. There are plenty to choose from. An overview is available at: https://ollama.com/library . You are done and as you see it does not have to take more than 5 minutes in case you are a bit experienced in docker and setting up tools in general.
Costs
Since everything I introduced and described is based on Open Source software, there are no costs or licensing fees at all. Great, isn’t it? But to say it is completly free is also not completly true, since you have to cover the charges for the server if you do not „have" one anyway 🙂.
Performance
As mentioned before, a dedicated graphics card would speed up the response times of the chatbot trendemously. By running it only on CPU, like i did in my example, every generation of a response took all the CPU power i have (and I have a lot) for some seconds. So the whole thing feels a bit like the early versions of ChatGPT. That’s no drama, but definitly noticeable.
Conclusion
As conclusion i let the openchat language model answer itself to my prompt:


