From 329659b2fba54cebab7bd7fee404b28f28be5de5 Mon Sep 17 00:00:00 2001
From: Mert <101130780+mertalev@users.noreply.github.com>
Date: Sat, 3 Feb 2024 09:11:53 -0500
Subject: [PATCH] docs(ml,server): updated hwaccel docs (#6878)

---
 docs/docs/features/hardware-transcoding.md    | 52 +++++++++++-----
 .../docs/features/ml-hardware-acceleration.md | 59 ++++++++++++++++++-
 2 files changed, 95 insertions(+), 16 deletions(-)

diff --git a/docs/docs/features/hardware-transcoding.md b/docs/docs/features/hardware-transcoding.md
index 0bc5a2f19c..db3d1ba7d6 100644
--- a/docs/docs/features/hardware-transcoding.md
+++ b/docs/docs/features/hardware-transcoding.md
@@ -4,6 +4,10 @@ This feature allows you to use a GPU to accelerate transcoding and reduce CPU load.
 Note that hardware transcoding is much less efficient for file sizes.
 As this is a new feature, it is still experimental and may not work on all systems.
 
+:::info
+You do not need to redo any transcoding jobs after enabling hardware acceleration. The acceleration device will be used for any jobs that run after enabling it.
+:::
+
 ## Supported APIs
 
 - NVENC (NVIDIA)
@@ -50,6 +54,40 @@ As this is a new feature, it is still experimental and may not work on all syste
 3. Redeploy the `immich-microservices` container with these updated settings.
 4. In the Admin page under `Video transcoding settings`, change the hardware acceleration setting to the appropriate option and save.
 
+#### Single Compose File
+
+Some platforms, including Unraid and Portainer, do not support multiple Compose files as of writing. As an alternative, you can "inline" the relevant contents of the [`hwaccel.transcoding.yml`][hw-file] file into the `immich-microservices` service directly.
+
+For example, the `qsv` section in this file is:
+
+```yaml
+devices:
+  - /dev/dri:/dev/dri
+```
+
+You can add this to the `immich-microservices` service instead of extending from `hwaccel.transcoding.yml`:
+
+```yaml
+immich-microservices:
+  container_name: immich_microservices
+  image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
+  # Note the lack of an `extends` section
+  devices:
+    - /dev/dri:/dev/dri
+  command: ['start.sh', 'microservices']
+  volumes:
+    - ${UPLOAD_LOCATION}:/usr/src/app/upload
+    - /etc/localtime:/etc/localtime:ro
+  env_file:
+    - .env
+  depends_on:
+    - redis
+    - database
+  restart: always
+```
+
+Once this is done, you can continue to step 3 of "Basic Setup".
 
 #### All-In-One - Unraid Setup
 
 ##### NVENC - NVIDIA GPUs
@@ -59,20 +97,6 @@ As this is a new feature, it is still experimental and may not work on all syste
 3. Restart the container app.
 4. Continue to step 4 of "Basic Setup".
 
-##### Other APIs
-
-Unraid does not currently support multiple Compose files. As an alternative, you can "inline" the relevant contents of the [`hwaccel.transcoding.yml`][hw-file] file into the `immich-microservices` service directly.
-
-For example, the `qsv` section in this file is:
-
-```
-devices:
-  - /dev/dri:/dev/dri
-```
-
-You can add this to the `immich-microservices` service instead of extending from `hwaccel.transcoding.yml`.
-Once this is done, you can continue to step 3 of "Basic Setup".
-
 ## Tips
 
 - You may want to choose a slower preset than for software transcoding to maintain quality and efficiency
diff --git a/docs/docs/features/ml-hardware-acceleration.md b/docs/docs/features/ml-hardware-acceleration.md
index 2323b468cb..b50185c580 100644
--- a/docs/docs/features/ml-hardware-acceleration.md
+++ b/docs/docs/features/ml-hardware-acceleration.md
@@ -3,7 +3,11 @@
 This feature allows you to use a GPU to accelerate machine learning tasks, such as Smart Search and Facial Recognition, while reducing CPU load.
 As this is a new feature, it is still experimental and may not work on all systems.
 
-## Supported APIs
+:::info
+You do not need to redo any machine learning jobs after enabling hardware acceleration. The acceleration device will be used for any jobs that run after enabling it.
+:::
+
+## Supported Backends
 
 - ARM NN (Mali)
 - CUDA (NVIDIA)
@@ -14,7 +18,8 @@
 - The instructions and configurations here are specific to Docker Compose. Other container engines may require different configuration.
 - Only Linux and Windows (through WSL2) servers are supported.
 - ARM NN is only supported on devices with Mali GPUs. Other Arm devices are not supported.
-- The OpenVINO backend has only been tested on an iGPU. ARC GPUs may not work without other changes.
+- There is currently an upstream issue with OpenVINO, so whether it will work is device-dependent.
+- Some models may not be compatible with certain backends. CUDA is the most reliable.
 
 ## Prerequisites
 
@@ -40,10 +45,60 @@ As this is a new feature, it is still experimental and may not work on all syste
 2. In the `docker-compose.yml` under `immich-machine-learning`, uncomment the `extends` section and change `cpu` to the appropriate backend.
 3. Redeploy the `immich-machine-learning` container with these updated settings.
 
+#### Single Compose File
+
+Some platforms, including Unraid and Portainer, do not support multiple Compose files as of writing. As an alternative, you can "inline" the relevant contents of the [`hwaccel.ml.yml`][hw-file] file into the `immich-machine-learning` service directly.
+
+For example, the `cuda` section in this file is:
+
+```yaml
+deploy:
+  resources:
+    reservations:
+      devices:
+        - driver: nvidia
+          count: 1
+          capabilities:
+            - gpu
+            - compute
+            - video
+```
+
+You can add this to the `immich-machine-learning` service instead of extending from `hwaccel.ml.yml`:
+
+```yaml
+immich-machine-learning:
+  container_name: immich_machine_learning
+  image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
+  # Note the lack of an `extends` section
+  deploy:
+    resources:
+      reservations:
+        devices:
+          - driver: nvidia
+            count: 1
+            capabilities:
+              - gpu
+              - compute
+              - video
+  volumes:
+    - model-cache:/cache
+  env_file:
+    - .env
+  restart: always
+```
+
+Once this is done, you can redeploy the `immich-machine-learning` container.
+
+:::info
+You can confirm the device is being recognized and used by checking its utilization (via `nvtop` for CUDA, `intel_gpu_top` for OpenVINO, etc.). You can also enable debug logging by setting `LOG_LEVEL=debug` in the `.env` file and restarting the `immich-machine-learning` container. When a Smart Search or Face Detection job begins, you should see a log entry for `Available ORT providers` containing the relevant provider. In the case of ARM NN, the absence of a `Could not load ANN shared libraries` log entry means it loaded successfully.
+:::
+
 [hw-file]: https://github.com/immich-app/immich/releases/latest/download/hwaccel.ml.yml
 [nvcr]: https://github.com/NVIDIA/nvidia-container-runtime/
 
 ## Tips
 
+- If you encounter an error when a model is running, try a different model to see if the issue is model-specific.
 - You may want to increase concurrency past the default for higher utilization. However, keep in mind that this will also increase VRAM consumption.
 - Larger models benefit more from hardware acceleration, if you have the VRAM for them.
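The debug-logging step described in the patch's info block can also be kept in the Compose file itself rather than in `.env`. The following is a sketch, not part of the patch above: it assumes standard Compose `environment` syntax and that the rest of the `immich-machine-learning` service stays as shown in the inlined example.

```yaml
immich-machine-learning:
  # Equivalent to setting LOG_LEVEL=debug in the .env file.
  # Remove again once you have confirmed the expected provider loads.
  environment:
    - LOG_LEVEL=debug
```

After redeploying the container, watch its logs for the `Available ORT providers` entry mentioned in the info block.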