DeviceManager

Status: Public API in netcl.core.device.DeviceManager

DeviceManager is netcl's central registry of OpenCL devices. It enumerates the available platforms and devices, builds a profile for each, and provides the "default" device the rest of the stack uses when no explicit device is requested.

The manager is the entry point for the rest of the runtime: the Tape, the BufferPool, the DataLoader, and the AMP module all look up the device through the manager. User code typically interacts with it exactly once, at startup, to pick the device to run on.

Overview

DeviceManager is a singleton: there is exactly one instance per process, accessed via nc.device.manager (or from netcl.core.device import manager). The singleton is constructed lazily on first access, after the OpenCL platforms have been enumerated.
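The lazy-singleton pattern described above can be sketched in plain Python. This is a minimal illustration, not netcl's implementation: the class name LazyManager, the instance() accessor, and the enumerations counter are all hypothetical, and the device list is a placeholder for real OpenCL enumeration.

```python
import threading


class LazyManager:
    """Hypothetical stand-in for DeviceManager; not netcl's actual class."""

    _instance = None
    _lock = threading.Lock()
    enumerations = 0  # counts how often the expensive step ran

    def __init__(self):
        # Placeholder for the expensive OpenCL platform/device enumeration.
        type(self).enumerations += 1
        self.devices = ["gpu0", "cpu0"]

    @classmethod
    def instance(cls):
        # Double-checked locking: construct at most once per process,
        # and only when somebody actually asks for the manager.
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    cls._instance = cls()
        return cls._instance


first = LazyManager.instance()
second = LazyManager.instance()
```

Both calls return the same object, and the placeholder enumeration runs only once.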

The manager holds three pieces of state:

  • devices — the list of all cl.Devices the process can see.
  • profiles — the per-device DeviceProfile records (FP16 support, subgroup support, SVM support, etc.).
  • default — the device the rest of the stack uses when no explicit device is requested. Set by pick_default() or by the NETCL_PLATFORM_FILTER environment variable.
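As a rough illustration of what a per-device profile could contain, the sketch below derives the three flags from values that OpenCL itself reports: the space-separated extension string and the CL_DEVICE_SVM_CAPABILITIES bitfield. SketchProfile and build_profile are hypothetical names, and the mapping from extensions to flags is an assumption; netcl's actual DeviceProfile may be built differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchProfile:
    """Hypothetical record; field names mirror the flags used on this page."""
    has_fp16: bool
    has_subgroups: bool
    has_svm_support: bool


def build_profile(extensions: str, svm_capabilities: int) -> SketchProfile:
    # OpenCL reports device extensions as one space-separated string.
    ext = set(extensions.split())
    return SketchProfile(
        has_fp16="cl_khr_fp16" in ext,
        has_subgroups=bool(ext & {"cl_khr_subgroups", "cl_intel_subgroups"}),
        # CL_DEVICE_SVM_CAPABILITIES is a bitfield; zero means no SVM.
        has_svm_support=svm_capabilities != 0,
    )


profile = build_profile("cl_khr_fp16 cl_khr_subgroups cl_khr_icd",
                        svm_capabilities=0)
```

With these inputs the profile reports FP16 and subgroup support but no SVM.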

Where It Lives

  • File path: core/device.py (class DeviceManager).
  • Module path: netcl.core.device.
  • Public re-export: from netcl.core.device import manager.

How It Works

from netcl.core.device import manager

# Enumerate all devices.
for d in manager.devices:
    print(d.name, d.type)

# Inspect a device profile.
profile = manager.profile(manager.devices[0])
print(profile.has_fp16)         # True / False
print(profile.has_subgroups)    # True / False
print(profile.has_svm_support)  # True / False

# Pick the device for the rest of the program.
ctx, queue = manager.default()  # uses manager.default

The default method is the standard way to start a netcl program. It returns a (context, queue) tuple that the rest of the runtime accepts. Internally:

  1. The manager looks at the NETCL_PLATFORM_FILTER env var. If set, it restricts the candidate set to devices whose platform name contains the filter substring.
  2. Of the remaining devices, it picks the first GPU (preferred) or, if no GPU is present, the first CPU device.
  3. It creates a cl.Context from the chosen device and a cl.CommandQueue for it. The queue is the one the rest of the runtime uses.
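Steps 1 and 2 above can be sketched without any OpenCL dependency. FakeDevice and choose_device are hypothetical names used only for illustration, and raising an error when nothing matches is an assumption; the page does not say what the manager does in that case. Step 3 (creating the context and queue) is omitted.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class FakeDevice:
    """Hypothetical stand-in for a cl.Device plus its platform name."""
    name: str
    kind: str            # "GPU" or "CPU"
    platform_name: str


def choose_device(devices: Sequence[FakeDevice],
                  platform_filter: Optional[str] = None) -> FakeDevice:
    # Step 1: optional substring filter on the platform name
    # (the manager reads this from NETCL_PLATFORM_FILTER).
    if platform_filter:
        devices = [d for d in devices if platform_filter in d.platform_name]
    # Step 2: first GPU if any, otherwise first CPU.
    for kind in ("GPU", "CPU"):
        for d in devices:
            if d.kind == kind:
                return d
    # Assumption: fail loudly when no candidate remains.
    raise RuntimeError("no usable OpenCL device matches the filter")


DEVICES = [
    FakeDevice("Xeon", "CPU", "Intel(R) OpenCL"),
    FakeDevice("RTX 3080", "GPU", "NVIDIA CUDA"),
    FakeDevice("Radeon", "GPU", "AMD Accelerated Parallel Processing"),
]

unfiltered = choose_device(DEVICES)           # first GPU in enumeration order
intel_only = choose_device(DEVICES, "Intel")  # no GPU left, falls back to CPU
```

Note that the result depends on enumeration order: with no filter, the first GPU in the list wins even if a later GPU is faster.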

default() is idempotent: calling it a second time returns the same cached (context, queue) pair rather than creating a new one.
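The "same pair on every call" behavior amounts to memoizing the result of the first call. A minimal sketch, assuming nothing about netcl's internals (CachedDefault and its builds counter are hypothetical, and plain objects stand in for the OpenCL context and queue):

```python
class CachedDefault:
    """Hypothetical sketch of memoizing the (context, queue) pair."""

    def __init__(self):
        self._pair = None
        self.builds = 0  # how many times the pair was actually created

    def default(self):
        if self._pair is None:
            self.builds += 1
            # Placeholders for cl.Context(...) and cl.CommandQueue(...).
            context, queue = object(), object()
            self._pair = (context, queue)
        return self._pair


m = CachedDefault()
ctx1, q1 = m.default()
ctx2, q2 = m.default()
```

Because the second call returns the stored tuple, components that obtain the pair independently end up sharing one context and one queue.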

Code Example

A minimal program:

from netcl.core.device import manager
import netcl as nc

# Pick the default device.
ctx, queue = manager.default()

# Use it.
x = nc.Tensor.zeros((4, 1024), dtype="float32",
                    context=ctx, queue=queue)
y = nc.relu(x)

Picking a specific device by index:

manager.default = manager.devices[2]    # third device
ctx, queue = manager.default()

Picking a device by environment variable:

NETCL_PLATFORM_FILTER=NVIDIA python my_training.py

The manager will only consider devices whose platform name contains "NVIDIA". This is the standard way to make device selection reproducible on a multi-GPU box.

Performance & Trade-offs

  • default() is cached. The first call pays the OpenCL enumeration cost (a few hundred microseconds); subsequent calls are essentially free.
  • Switching manager.default mid-process is supported but dangerous: the BufferPool and the Tape are tied to the original context. Reset them too if you switch.
  • The NETCL_PLATFORM_FILTER env var is a substring match, not a regex. Use "NVIDIA" to match all NVIDIA devices and "Intel(R) OpenCL" to match Intel's CPU OpenCL.
  • The manager is not multi-process safe. In a DistributedDataParallel setup, each replica has its own manager and its own device; do not share a manager across processes.
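The substring-versus-regex distinction in the NETCL_PLATFORM_FILTER bullet matters for names such as "Intel(R) OpenCL", where the parentheses are regex metacharacters. The snippet below uses only the Python standard library to show the difference; it does not call into netcl.

```python
import re

platform_name = "Intel(R) OpenCL"
flt = "Intel(R) OpenCL"

# Literal substring test, which is how the filter is described on this page.
substring_hit = flt in platform_name

# Treated as a regex, "(R)" is a capture group matching a bare "R",
# so the pattern looks for "IntelR OpenCL" and does not match.
regex_hit = re.search(flt, platform_name) is not None

# Escaping the metacharacters restores the literal meaning.
escaped_hit = re.search(re.escape(flt), platform_name) is not None
```

Here substring_hit and escaped_hit are True while regex_hit is False, which is why the filter value can be pasted verbatim from the platform name without escaping.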

See also