Convenient LLMs for Home

For a while I wanted to experiment with LLMs in my homelab, but I didn't like the overhead of a dedicated GPU machine or the slowness of CPU inference. I also wanted everything to be convenient in the long run: updates had to be automated, and if my OS died, rebuilding it would have to be quick and simple.

Running NixOS on my gaming computer with WSL seemed like the ideal solution. However, I ran into several challenges.

  • Concerns about my VRAM being permanently locked to LLMs.
  • WSL shutting down automatically; Microsoft does not keep WSL running when you aren’t actively using it.
  • NixOS on WSL did not support Nvidia out of the box.
  • It was not worth it to me to manage a separate Ubuntu machine that would require reconfiguring everything from scratch.

I spent a few weeks hacking at it and have now solved these blockers.

  • Ollama unloads models by default if they haven’t been used in the last 5 minutes, so my VRAM isn’t locked up.
  • WSL starts automatically and remains active.
  • The Nvidia Container Toolkit is configured on WSL.
  • An Ollama container is configured for NixOS.
  • NixOS manages the configuration of the entire system, so rebuilding is easy.
  • My NixOS flake is already configured to update automatically, which my WSL system inherits.

Although there is some general information here, this post is heavily NixOS-focused. I rely heavily on Tailscale for my own networking convenience, so there are also some optional Tailscale steps.


Live configuration that I actively use at home (just for reference).


Force WSL not to stop

I found an OS-independent fix for the biggest problem. This GitHub post is a good place to start. If you are using Ubuntu on WSL, running

    wsl --exec dbus-launch true

will launch WSL and keep it running. You can set up a basic task in Windows Task Scheduler that runs this command automatically at startup. Set it to run whether or not the user is logged in.

I found that this didn’t work for NixOS on WSL, as the --exec option seemed to have issues, so I set it up like this:

    wsl.exe dbus-launch true

For NixOS, this means a shell keeps running in the background. This is less ideal than --exec on Ubuntu, but I’ll take what I can get.
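For reference, here is a rough sketch of registering that startup task from PowerShell. The task name is my own placeholder, substitute whichever wsl command variant applies to your distro, and use the Task Scheduler GUI if you want the "run whether user is logged on or not" behaviour:

    # Hypothetical task name; runs the keep-alive command at logon.
    # For the "run while logged out" behaviour, edit the task in the
    # Task Scheduler GUI and select "Run whether user is logged on or not".
    schtasks /Create /TN "KeepWSLAlive" /TR "wsl.exe dbus-launch true" /SC ONLOGON /F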


Installation of NixOS on WSL

NixOS meets most of my long-term convenience needs. It lets you configure the entire system, including Nvidia, networking, and containers, which makes it easy to redeploy everything. Also, my NixOS flake is already configured for automatic weekly updates via a GitHub Action, and all of my NixOS hosts automatically pull those updates and rebuild. My NixOS on WSL will inherit these benefits.

This isn’t the only approach; there are other ways to automate updates on a single NixOS machine if you’d prefer. One option is sketched below.
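For example, a minimal sketch using NixOS’s built-in auto-upgrade service. This is not what I use here (my updates come from the flake’s GitHub Action), it is just one alternative:

    # configuration.nix: pull and rebuild on a schedule
    system.autoUpgrade = {
      enable = true;
      dates = "weekly";      # systemd calendar expression
      allowReboot = false;   # don't reboot WSL automatically
    };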

To get started, follow the steps in the NixOS-WSL GitHub readme:

  1. Enable WSL, if you haven’t already done so:

     wsl --install --no-distribution

  2. Download nixos.wsl from the latest release.

  3. Double-click the file you downloaded (requires WSL >= 2.4.4).

  4. Now you can run NixOS. Set it as the default distribution if you like (see the sketch below).
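A quick sketch of setting the default, assuming the distribution registered under the name NixOS (check with wsl --list):

    # List installed distributions and make NixOS the default
    wsl --list
    wsl --set-default NixOS

    # Enter the distribution
    wsl -d NixOS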


Basic NixOS Configuration

Enter WSL and navigate to /etc/nixos/ to configure NixOS. There you will find a configuration.nix that describes the entire system. It starts out very basic, so we will add a few essentials to make life easier. You’ll have to use nano until the first rebuild has completed. Tailscale is a networking tool I use; it’s not required.

    environment.systemPackages = [
      pkgs.vim
      pkgs.git
      pkgs.tailscale
      pkgs.docker
    ];
    services.tailscale.enable = true;
    wsl.useWindowsDriver = true;
    nixpkgs.config.allowUnfree = true;

Now run:

    sudo nix-channel --update
    sudo nixos-rebuild switch

Then log in to Tailscale by following the link it prints (see the sketch below).
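The login link comes from bringing the Tailscale daemon up; a minimal sketch of that step and a quick check afterwards:

    # Authenticate this machine to your tailnet; prints a login URL to open in a browser
    sudo tailscale up

    # Confirm the machine shows up as connected
    tailscale status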


Configuring NixOS for the Nvidia Container Toolkit

NixOS on WSL currently doesn't support Nvidia or the Nvidia Container Toolkit the way standard NixOS does, so I needed to make some changes to get it working.

With these fixes the Nvidia Container Toolkit works and you can interact with the GPU as you would expect (as far as I have tested, with things like nvidia-smi). However, built-in NixOS modules that rely on Nvidia, such as services.ollama, do not work. Each of those services would need its own patches to find CUDA correctly, and I'm not working on them since I only run GPU workloads in containers.

This is the configuration that I used to get it running:

    services.xserver.videoDrivers = [ "nvidia" ];
    hardware.nvidia.open = true;

    environment.sessionVariables = {
      CUDA_PATH = "${pkgs.cudatoolkit}";
      EXTRA_LDFLAGS = "-L/lib -L${pkgs.linuxPackages.nvidia_x11}/lib";
      EXTRA_CCFLAGS = "-I/usr/include";
      LD_LIBRARY_PATH = [
        "/usr/lib/wsl/lib"
        "${pkgs.linuxPackages.nvidia_x11}/lib"
        "${pkgs.ncurses5}/lib"
      ];
      MESA_D3D12_DEFAULT_ADAPTER_NAME = "Nvidia";
    };

    hardware.nvidia-container-toolkit = {
      enable = true;
      mount-nvidia-executables = false;
    };

    # Generates the CDI spec that lets Docker hand the WSL GPU to containers
    systemd.services.nvidia-cdi-generator = {
      description = "Generate nvidia cdi";
      wantedBy = [ "docker.service" ];

      serviceConfig = {
        Type = "oneshot";
        # Writes the spec into /etc/cdi; the nvidia-ctk path here is an assumption,
        # adjust it to whatever provides nvidia-ctk on your system
        ExecStart = "${pkgs.nvidia-container-toolkit}/bin/nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml";
      };
    };

    virtualisation.docker.daemon.settings = {
      features.cdi = true;
      cdi-spec-dirs = [ "/etc/cdi" ];
    };

Do another nixos-rebuild switch and restart WSL. You should now be able to run nvidia-smi and see your GPU. You’ll have to run your Docker containers with --device=nvidia.com/gpu=all in order to give them access to the GPU.
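Concretely, the rebuild-and-check sequence looks roughly like this (the container part comes later):

    # Inside NixOS on WSL: apply the new configuration
    sudo nixos-rebuild switch

    # From Windows PowerShell: shut WSL down so it restarts with the new settings
    wsl --shutdown

    # Back inside NixOS: the GPU should now be visible
    nvidia-smi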

I did not discover these fixes on my own; I pieced this information together from these two GitHub issues:

  • https://github.com/nix-community/NixOS-WSL/issues/454
  • https://github.com/nix-community/NixOS-WSL/issues/578
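If something doesn’t work, two quick checks I’d suggest, assuming the service name and spec path from the configuration above:

    # Did the CDI generator service run successfully?
    systemctl status nvidia-cdi-generator

    # Was the CDI spec actually written?
    ls -l /etc/cdi/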

Configure the Ollama Container

To make networking easier, I’ve set up an example Ollama container and an optional Tailscale container to pair with it. To use the Tailscale container, uncomment its code and add your Tailscale domain, then comment out the ports and networking.firewall entries for the Ollama container.

For those who use the Tailscale container, Tailscale Serve is already configured to serve the Ollama HTTP API at https://ollama.${YOUR_TAILSCALE_DOMAIN}.ts.net.

    virtualisation.oci-containers.backend = "docker";
    virtualisation.docker = {
      enable = true;
      autoPrune.enable = true;
    };

    systemd.tmpfiles.rules = [
      "d /var/lib/ollama 0755 root root"
      #"d /var/lib/tailscale-container 0755 root root"
    ];

    # Comment this out if you use the Tailscale container instead
    networking.firewall.allowedTCPPorts = [ 11434 ];

    virtualisation.oci-containers.containers = {
      "ollama" = {
        image = "docker.io/ollama/ollama:latest";
        autoStart = true;
        environment = {
          "OLLAMA_NUM_PARALLEL" = "1";
        };
        # Comment out ports if you use the Tailscale container instead
        ports = [ "11434:11434" ];
        volumes = [ "/var/lib/ollama:/root/.ollama" ];
        extraOptions = [
          "--pull=always"
          "--device=nvidia.com/gpu=all"
          # Uncomment to share the Tailscale container's network namespace
          #"--network=container:ollama-tailscale"
        ];
      };
      #"ollama-tailscale" = {
      #  image = "ghcr.io/tailscale/tailscale:latest";
      #  autoStart = true;
      #  environment = {
      #    "TS_HOSTNAME" = "ollama";
      #    "TS_STATE_DIR" = "/var/lib/tailscale";
      #    "TS_SERVE_CONFIG" = "/config/tailscaleCfg.json";
      #  };
      #  volumes = [
      #    "/var/lib/tailscale-container:/var/lib"
      #    "/dev/net/tun:/dev/net/tun"
      #    "${pkgs.writeTextFile {
      #      name = "ollamaTScfg";
      #      text = ''
      #        {
      #          "TCP": {
      #            "443": {
      #              "HTTPS": true
      #            }
      #          },
      #          "Web": {
      #            # replace this with YOUR tailscale domain
      #            "ollama.${YOUR_TAILSCALE_DOMAIN}.ts.net:443": {
      #              "Handlers": {
      #                "/": {
      #                  "Proxy": "http://127.0.0.1:11434"
      #                }
      #              }
      #            }
      #          }
      #        }
      #      '';
      #    }}:/config/tailscaleCfg.json"
      #  ];
      #  extraOptions = [
      #    "--pull=always"
      #    "--cap-add=net_admin"
      #    "--cap-add=sys_module"
      #    "--device=/dev/net/tun:/dev/net/tun"
      #  ];
      #};
    };

One more nixos-rebuild switch and your Ollama container should start.
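A quick way to confirm the container came up (the container name ollama comes from the config above):

    # The ollama container should be listed as running
    sudo docker ps

    # Follow its logs if it isn't
    sudo docker logs -f ollama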


Testing and Networking

If you are using Tailscale:

  • The Tailscale container must be set up before either container will work.
  • Exec into the Tailscale container: sudo docker exec -it ollama-tailscale sh
  • Run tailscale up.
  • Use the link it prints to add the machine to your tailnet.
  • Exec into the Ollama container to pull a model: sudo docker exec -it ollama ollama run gemma3
  • Run a test prompt and verify with nvidia-smi that the GPU is being used.
  • Test the API from another Tailscale-connected device:

    curl https://ollama.$YOUR_TAILSCALE_DOMAIN.ts.net/api/generate -d '{
      "model": "gemma3",
      "prompt": "test",
      "stream": false
    }'

If NOT using Tailscale:

  • Exec into the Ollama container to pull a model: sudo docker exec -it ollama ollama run gemma3
  • Run a test prompt and verify with nvidia-smi in WSL that the GPU is in use.
  • Ollama is now listening on port 11434, but only inside WSL. Follow this guide to expose it on your network, or use the tl;dr below.
  • Add

        "OLLAMA_HOST" = "0.0.0.0";
        "OLLAMA_ORIGINS" = "*";

    to the Ollama container’s environment in the NixOS configuration to expose it on your network.

  • Use the ifconfig command to find your WSL IP address, which is usually found under eth0.
  • Create firewall rules on Windows using PowerShell with admin rights:

        New-NetFirewallRule -DisplayName 'WSL firewall unlock' -Direction Outbound -LocalPort 11434 -Action Allow -Protocol TCP
        New-NetFirewallRule -DisplayName 'WSL firewall unlock' -Direction Inbound -LocalPort 11434 -Action Allow -Protocol TCP
  • In Windows PowerShell with admin rights again:

        netsh interface portproxy add v4tov4 listenport=11434 listenaddress=0.0.0.0 connectport=11434 connectaddress=$WSL-IP-ADDRESS

    Replace $WSL-IP-ADDRESS with the WSL IP address you found above.

  • Using your Windows machine’s LAN address, you should now be able to access Ollama, e.g. http://192.168.1.123:11434
  • Use your Windows LAN address to test the API:

        curl http://WINDOWS-LAN-IP:11434/api/generate -d '{
          "model": "gemma3",
          "prompt": "test",
          "stream": false
        }'


  • Done!

    You can now connect to your Ollama API from anywhere you like, for example from Open-WebUI.
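    For instance, a rough sketch of pointing an Open-WebUI container at it (the image and the OLLAMA_BASE_URL variable come from Open-WebUI’s documentation; substitute whichever Ollama URL matches your setup):

        # Run Open-WebUI and point it at the Ollama API exposed above
        docker run -d -p 3000:8080 \
          -e OLLAMA_BASE_URL=http://WINDOWS-LAN-IP:11434 \
          -v open-webui:/app/backend/data \
          --name open-webui \
          ghcr.io/open-webui/open-webui:main

    After it starts, Open-WebUI is available at http://localhost:3000 on the machine running it.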
