The scheduling of OpenCL devices on most typical systems can be optimized using the “OpenCL scheduling profile” settings. However, if your system is equipped with a variety of GPUs, you might want to set the relative device priority manually. To do so, select the “default” scheduling profile and enter your settings in the “opencl_device_priority” configuration parameter.
It is important to understand how darktable uses OpenCL devices. Each processing sequence of an image, which converts an input to the final output using a certain history stack, is run in a so-called pixelpipe. There are four different types of pixelpipe in darktable. One type is responsible for processing the center image view (or full view) in darkroom mode; another processes the preview image (the navigation window at the top left in darkroom mode). Only one pixelpipe of each of these two types can exist at a time, with the full and the preview pixelpipes running in parallel. In addition, there can be multiple parallel pixelpipes performing file exports and multiple parallel pixelpipes generating thumbnails. If an OpenCL device is available, darktable dynamically allocates it to one specific pixelpipe for one run and releases it afterwards.
The computational demand depends a lot on the pixelpipe type. Preview images and thumbnails have a low resolution and can be processed quickly. The center image view is more demanding, and a file export is more demanding still.
The configuration parameter “opencl_device_priority” holds a string with the following structure:
a,b,c.../k,l,m.../o,p,q.../x,y,z...
Each letter represents one specific OpenCL device. There are four fields in the parameter string separated by a slash, each representing one type of pixelpipe. “a,b,c...” defines the devices that are allowed to process the center image (full) pixelpipe. Likewise devices “k,l,m...” can process the preview pixelpipe, devices “o,p,q...” the export pixelpipes and finally devices “x,y,z...” the thumbnail pixelpipes. An empty field means that no OpenCL device may serve this type of pixelpipe.
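To make the four-field layout concrete, here is a minimal Python sketch that splits such a string into its per-pixelpipe device lists. It is an illustration only, not darktable's own parser: the function name `split_priority` and the labels in `PIPE_TYPES` are hypothetical, and the strict check for exactly four fields is an assumption of this sketch.

```python
# Hypothetical labels for the four fields, in the order described above.
PIPE_TYPES = ("full", "preview", "export", "thumbnail")


def split_priority(value):
    """Split an opencl_device_priority string into one device list per pixelpipe type."""
    fields = value.split("/")
    if len(fields) != len(PIPE_TYPES):
        # Assumption of this sketch: anything other than four fields is rejected.
        raise ValueError(f"expected {len(PIPE_TYPES)} fields, got {len(fields)}")
    result = {}
    for pipe, field in zip(PIPE_TYPES, fields):
        # An empty field yields an empty list: no OpenCL device for this pixelpipe type.
        result[pipe] = [item.strip() for item in field.split(",") if item.strip()]
    return result


print(split_priority("0,1/1/0/"))
```

In this example the thumbnail field is empty, so its list is empty and thumbnails would not be processed on any OpenCL device.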
darktable has an internal numbering system, where the first available OpenCL device will receive number “0”. All further devices are numbered consecutively. This number together with the device name is displayed when you start darktable with “darktable -d opencl”. You can specify a device either by number or by name (upper/lower case and whitespace do not matter). If you have more than one device – all with the same name – you need to use the device numbers in order to differentiate them.
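The name-matching rule can be sketched in a few lines of Python. This is an assumption-laden illustration rather than darktable's actual comparison code: it handles only the two things stated above (letter case and whitespace), and the helper names `normalize` and `same_device` are hypothetical. The device name is taken from the example later in this section.

```python
def normalize(name):
    """Lower-case a device name and remove all whitespace."""
    return "".join(name.split()).lower()


def same_device(specifier, reported_name):
    """Return True if a specifier from the config matches a reported device name."""
    return normalize(specifier) == normalize(reported_name)


# "GeForce GTS450" in the configuration matches a device reported as 'GeForce GTS 450'.
print(same_device("GeForce GTS450", "GeForce GTS 450"))
```

Two identical cards would normalize to the same string, which is why device numbers are needed to tell them apart.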
A device specifier can be prefixed with an exclamation mark “!”, in which case the device is excluded from processing this pixelpipe. You can also give an asterisk “*” as a wildcard, representing all devices not mentioned explicitly before in that group.
Sequence order within a group matters. darktable will read the list from left to right and whenever it tries to allocate an OpenCL device to a pixelpipe it will scan the devices in that order, taking the first free device it finds.
If a pixelpipe is about to be started and all GPUs in the corresponding group are busy, darktable by default processes the image on the CPU instead. You can enforce GPU processing by prefixing the list of allowed GPUs with a plus sign “+”. In that case darktable will not fall back to the CPU but will suspend processing until the next allowed OpenCL device becomes available.
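The allocation rule described above (scan left to right, take the first free device, otherwise fall back to the CPU or wait) can be modelled as follows. This is a simplified sketch, not darktable's scheduler: the function `pick_device`, the `mandatory` flag standing in for the “+” prefix, the string `"CPU"`, and the use of `None` to mean "wait" are all conventions invented for this illustration.

```python
def pick_device(ordered_devices, busy, mandatory=False):
    """Pick a device for one pixelpipe run.

    ordered_devices: allowed device numbers in priority order (left to right).
    busy: set of device numbers currently allocated to other pixelpipes.
    mandatory: True models a field prefixed with "+" (never fall back to the CPU).
    """
    for device in ordered_devices:
        if device not in busy:
            return device  # first free device wins
    # No allowed device is free: wait if GPU use is enforced, else use the CPU.
    return None if mandatory else "CPU"


print(pick_device([1, 0], busy={1}))                     # device 1 is busy, so 0 is used
print(pick_device([1, 0], busy={0, 1}))                  # falls back to the CPU
print(pick_device([1, 0], busy={0, 1}, mandatory=True))  # None: wait for a free device
```

An empty list models an empty field: no OpenCL device may serve the pixelpipe, so the sketch returns `"CPU"` straight away.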
darktable's default setting for “opencl_device_priority” is:
*/!0,*/*/*
Any detected OpenCL device is allowed to process the center image view. The first OpenCL device (0) is not allowed to process the preview pixelpipe. As a consequence, if your system has only one GPU, the preview pixelpipe will always be processed on the CPU, keeping your single GPU exclusively for the more demanding center image view. This is a reasonable setting for most systems. No restrictions apply to export and thumbnail pixelpipes.
The default is a good choice if you have only one device. If you have several devices, it forms a reasonable starting point. However, as your devices might differ considerably in processing power, it is worth spending some thought on optimizing your priority list.
Here is an example. Let's assume we have a system with two devices, a fast Radeon HD7950 and an older and slower GeForce GTS450. darktable (started with “darktable -d opencl”) will report the following devices:
[opencl_init] successfully initialized.
[opencl_init] here are the internal numbers and names of OpenCL devices available to darktable:
[opencl_init] 0 'GeForce GTS 450'
[opencl_init] 1 'Tahiti'
[opencl_init] FINALLY: opencl is AVAILABLE on this system.
So the GeForce GTS 450 is detected as the first device and the Radeon HD7950 ('Tahiti') as the second one. This order will normally not change unless the hardware or driver configuration is modified. Still, to be on the safe side, it is better to use device names rather than numbers.
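If you want to read the number-to-name mapping programmatically, a small script can extract it from the debug output. The sketch below assumes the output has already been captured into a string and that device lines look like the ones shown above (a number followed by a quoted name). Other darktable versions may print a different format, and the names `DEVICE_LINE`, `parse_devices` and `SAMPLE_LOG` are hypothetical.

```python
import re

# Matches lines such as: [opencl_init] 0 'GeForce GTS 450'
DEVICE_LINE = re.compile(r"^\[opencl_init\]\s+(\d+)\s+'(.*)'\s*$")


def parse_devices(log_text):
    """Return a dict mapping internal device numbers to device names."""
    devices = {}
    for line in log_text.splitlines():
        match = DEVICE_LINE.match(line.strip())
        if match:
            devices[int(match.group(1))] = match.group(2)
    return devices


# Sample input copied from the output shown above.
SAMPLE_LOG = """\
[opencl_init] successfully initialized.
[opencl_init] here are the internal numbers and names of OpenCL devices available to darktable:
[opencl_init] 0 'GeForce GTS 450'
[opencl_init] 1 'Tahiti'
[opencl_init] FINALLY: opencl is AVAILABLE on this system.
"""

print(parse_devices(SAMPLE_LOG))
```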
As the GTS450 is slower than the HD7950, an optimized opencl_device_priority could look like:
!GeForce GTS450,*/!Tahiti,*/Tahiti,*/Tahiti,*
The GTS450 is explicitly excluded from processing the center image pixelpipe; this is reserved for “all” other devices (i.e. the HD7950/Tahiti). The opposite applies to the preview pixelpipe: here the Tahiti is excluded, so that only the GTS450 is allowed to do the work.
For file export and thumbnail generation we want all hands on deck. However, darktable should first check whether the Tahiti device is free, because it is faster. If it is not, all other devices (in this case only the GTS450) are checked.
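The example can be checked with a short Python model of the rules from this section: exclusion with “!”, the “*” wildcard, left-to-right order, and name matching that ignores case and whitespace. This is a sketch under those stated rules, not darktable's implementation; it ignores the “+” prefix, and the helper names (`normalize`, `lookup`, `resolve_field`) are hypothetical.

```python
def normalize(name):
    """Lower-case a name and drop whitespace, so 'GeForce GTS450' matches 'GeForce GTS 450'."""
    return "".join(name.split()).lower()


def lookup(spec, devices):
    """Map one specifier (a number or a name) to the matching device numbers."""
    if spec.isdigit():
        number = int(spec)
        return [number] if number in devices else []
    return [n for n in sorted(devices) if normalize(devices[n]) == normalize(spec)]


def resolve_field(field, devices):
    """Return the allowed device numbers for one field, in priority order."""
    allowed = []
    seen = set()  # devices already listed or excluded in this field
    for item in field.split(","):
        item = item.strip()
        if not item:
            continue
        if item == "*":
            candidates = sorted(devices)  # wildcard: every device not mentioned before
        elif item.startswith("!"):
            seen.update(lookup(item[1:], devices))  # excluded: never added later
            continue
        else:
            candidates = lookup(item, devices)
        for number in candidates:
            if number not in seen:
                seen.add(number)
                allowed.append(number)
    return allowed


DEVICES = {0: "GeForce GTS 450", 1: "Tahiti"}
PRIORITY = "!GeForce GTS450,*/!Tahiti,*/Tahiti,*/Tahiti,*"

for pipe, field in zip(("full", "preview", "export", "thumbnail"), PRIORITY.split("/")):
    names = [DEVICES[n] for n in resolve_field(field, DEVICES)]
    print(f"{pipe:<10}{names}")
```

Under this model the full pixelpipe resolves to the Tahiti only, the preview pixelpipe to the GTS 450 only, and export and thumbnail pixelpipes to the Tahiti first and the GTS 450 second. This matches the intent described above.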