Ollama

Published: 2025-11-14 | Updated: 2026-03-20

Language Models

Since my server has no GPU, I am limited in my choice of language models.

| Name | Type | Context | File size | Performance | Quality | Released | Publisher |
|---|---|---|---|---|---|---|---|
| deepseek-r1:8b-llama-distill-q4_K_M | ? | ? | ? | ⭐ (6.2 Tokens/s) | ⭐⭐ | ? | Deepseek (MIT) |
| deepseek-coder-v2:16b-lite-base-q2_K | ? | ? | ? | ? | ⭐⭐ | ? | Deepseek (MIT) |
| gemma3:1b | | | | ⭐⭐⭐⭐ (30.7 Tokens/s) | ⭐⭐ | | |
| qwen3.5:9b | Text, Image | 256K | 6.6GB | - | - | 2026-03 | Alibaba |
| qwen3.5:4b | Text, Image | 256K | 3.4GB | - | - | 2026-03 | Alibaba |
| qwen3.5:2b | Text, Image | 256K | 2.7GB | ⭐⭐ (9.6 Tokens/s) | ⭐⭐⭐ | 2026-03 | Alibaba |
| qwen2.5-coder:3b | Text | 32k | 2.7GB | ⭐⭐⭐⭐ (38.2 Tokens/s) | ⭐⭐ | 2025-07 | Alibaba |
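To put the tokens-per-second figures from the table into perspective, here is a small back-of-the-envelope helper. The function name and the 300-token answer length are my own assumptions; the throughput values are taken from the table above.

```python
def estimated_generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Rough lower bound for the generation time of an answer.

    Ignores prompt processing and model load time, so real responses
    take at least this long.
    """
    return num_tokens / tokens_per_second

# Hypothetical 300-token answer, throughput values from the table above:
slow = estimated_generation_seconds(300, 6.2)    # deepseek-r1:8b distill
fast = estimated_generation_seconds(300, 38.2)   # qwen2.5-coder:3b
print(f"{slow:.1f}s vs. {fast:.1f}s")  # → 48.4s vs. 7.9s
```

In other words, the small coder model answers in seconds on CPU, while the distilled 8B model makes you wait the better part of a minute.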


Installation and Configuration

The people behind the "Open WebUI" project have made it extremely easy to get your own chatbot running. Basically, you just create a new docker-compose.yml file like the one in the example below and start it as usual with "docker compose up -d". That's it, no joke!


services:
  chat:
    container_name: chat
    image: ghcr.io/open-webui/open-webui:ollama
    volumes:
      - ./ollama:/root/.ollama          # persists downloaded language models
      - ./open-webui:/app/backend/data  # persists Open WebUI settings and chats
    restart: unless-stopped
    # No published ports: the reverse proxy reaches the container
    # through the shared "caddy" network instead.
    #ports:
    #  - 8080:8080
    networks:
      caddy:
networks:
  caddy:
    external: true

As you can see in my example file, I customized the network configuration and also configured my reverse proxy, Caddy, to route requests for chat.handtrixxx.com to the new container. As the following screenshot shows, you can then click "Sign up" to create a new user account for yourself as administrator.
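For reference, the corresponding Caddy site block might look roughly like this. This is a sketch under the assumption that Caddy runs as a container attached to the same external `caddy` network; the hostname is the one mentioned above, and 8080 is Open WebUI's container-internal port, as seen in the commented-out ports mapping of the compose file.

```caddyfile
chat.handtrixxx.com {
    # "chat" resolves via the shared Docker network;
    # 8080 is the container-internal port of Open WebUI.
    reverse_proxy chat:8080
}
```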

Now, after you have logged in, there are just two more steps before you can start your A.I. chats. First, go to the admin panel and, under "Admin settings", disable registration so that other people cannot simply create accounts on your instance. Then, in the settings on the models tab, download one or more language models. There are plenty to choose from; an overview is available at https://ollama.com/library. That's it, and as you see, the whole setup does not have to take more than five minutes if you have some experience with Docker and setting up tools in general.
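Besides the web UI, models can also be pulled from the command line inside the container. A sketch, assuming the container name `chat` from the compose file above and a model from the table:

```shell
# Pull a model with the bundled ollama CLI inside the container
docker exec chat ollama pull qwen2.5-coder:3b

# List the models that are now available locally
docker exec chat ollama list
```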

Costs

Since everything I introduced and described is based on open-source software, there are no costs or licensing fees at all. Great, isn't it? But calling it completely free would not be entirely true either, since you still have to cover the charges for the server if you don't already "have" one anyway 🙂.

Performance

As mentioned before, a dedicated graphics card would speed up the chatbot's response times tremendously. Running it on the CPU only, as in my example, every generated response consumed all the CPU power I have (and I have a lot) for several seconds. The whole thing therefore feels a bit like the early versions of ChatGPT. That's no drama, but it is definitely noticeable.

Conclusion

To conclude, I let the openchat language model itself answer my prompt: