no_grad

Status: Public API in netcl.autograd (no_grad, set_grad_enabled)

no_grad is the context manager that disables autograd recording; set_grad_enabled is the module-level function that sets the same flag directly. Inside a no_grad block, op calls do not add new nodes to the Tape, and the resulting tensors have requires_grad=False even if their inputs had requires_grad=True.

no_grad is the standard tool for evaluation, weight updates, and any other "we are not training right now" code path. The Trainer wraps the evaluation loop in no_grad automatically.

Overview

The recording machinery in netcl is driven by a single module-global flag: _GRAD_MODE. The flag is True by default. no_grad sets it to False for the duration of the context; set_grad_enabled(enabled) sets it to the given value.

When the flag is False, apply_op does not insert the new node into the tape. The output tensor's requires_grad is False and its grad_fn is None. The op still runs (the forward computation is unchanged); only the autograd side effects are skipped.

Where It Lives

  • File path: autograd/engine.py.
  • Module path: netcl.autograd.
  • Public re-export: from netcl.autograd import no_grad, set_grad_enabled, is_grad_enabled.

How It Works

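class no_grad:
    def __init__(self, enabled=True):
        # enabled=False turns the context manager into a no-op.
        self.enabled = enabled

    def __enter__(self):
        global _GRAD_MODE
        # Save the flag at entry, not at construction, so nested and
        # reused instances restore the right value.
        self.prev = _GRAD_MODE
        if self.enabled:
            _GRAD_MODE = False
        return self

    def __exit__(self, *args):
        global _GRAD_MODE
        _GRAD_MODE = self.prev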

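The constructor takes an enabled argument so that both no_grad() and no_grad(enabled=False) are supported; the latter does nothing. The module-level function set_grad_enabled(enabled) is the one-shot version: it sets the global flag and leaves it at that value until something changes it again, with no automatic restore.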

is_grad_enabled() returns the current value of the flag; it is mainly useful for debugging.
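How the three entry points interact can be checked with a self-contained sketch. The definitions below are a hypothetical re-implementation written for illustration, assuming the behavior described in this section (a single global flag, a no-op when enabled=False, no automatic restore for set_grad_enabled); they are not copied from netcl.

```python
# Hypothetical re-implementation for illustration; not netcl's source.
_GRAD_MODE = True


def is_grad_enabled():
    return _GRAD_MODE


def set_grad_enabled(enabled):
    # One-shot: the flag stays at this value until someone changes it again.
    global _GRAD_MODE
    _GRAD_MODE = bool(enabled)


class no_grad:
    def __init__(self, enabled=True):
        self.enabled = enabled

    def __enter__(self):
        # Capture the flag at entry so nested blocks restore correctly.
        self.prev = is_grad_enabled()
        if self.enabled:
            set_grad_enabled(False)
        return self

    def __exit__(self, exc_type, exc, tb):
        set_grad_enabled(self.prev)
        return False   # never swallow exceptions


states = []
with no_grad():
    states.append(is_grad_enabled())      # False
    with no_grad(enabled=False):          # no-op: still False
        states.append(is_grad_enabled())
    states.append(is_grad_enabled())      # False again
states.append(is_grad_enabled())          # True: restored on exit

try:
    with no_grad():
        raise RuntimeError("eval failed")
except RuntimeError:
    pass
states.append(is_grad_enabled())          # True: restored despite the exception

set_grad_enabled(False)
states.append(is_grad_enabled())          # False, and it stays that way
print(states)   # [False, False, False, True, True, False]
```

Reading the flag inside __enter__ rather than in the constructor matters when an instance is created in one grad mode and entered in another; the value restored on exit is then the one that was actually in effect when the block began.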

Code Example

The standard evaluation loop:

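import netcl as nc
import netcl.autograd as ag

# model and test_loader are assumed to be defined earlier in the script.
model.eval()
correct, total = 0, 0
with ag.no_grad():  # nothing in this loop is recorded on the tape
    for x, y in test_loader:
        logits = model(x)
        pred = logits.argmax(axis=-1)          # predicted class per sample
        correct += (pred == y).sum().item()    # matches in this batch
        total += y.shape[0]                    # samples in this batch
print(f"accuracy: {correct / total:.2%}")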

The weight-update step (the optimizer step itself is a plain operation that does not need the tape):

with ag.no_grad():
    optimizer.step()
    optimizer.zero_grad()

Temporarily enabling grad inside a no_grad block:

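with ag.no_grad():
    # ... evaluation ...
    # set_grad_enabled is described above as a plain function, so it is
    # called here rather than used in a with statement.
    ag.set_grad_enabled(True)
    try:
        # ... this op is recorded ...
        y = model(x)
    finally:
        ag.set_grad_enabled(False)  # back to no_grad
    # ... more evaluation, not recorded ...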

Performance & Trade-offs

  • no_grad saves memory: no Node is allocated, no grad field is kept on the output tensors, no grad_fn closure is built. For a typical inference pass, this is a 20% to 40% memory reduction.
  • no_grad saves compute only indirectly: the forward kernels run at the same speed, but the autograd engine never sees the ops, so there is nothing for a backward pass to walk.
  • Forgetting to wrap the evaluation loop in no_grad is the most common cause of "the model uses 3x more memory than expected" bugs.
  • set_grad_enabled(False) is the global version of no_grad. Use it at the start of a script to disable autograd for the whole run; this is what some inference scripts do.

See also