Bug description
transformers4rec.torch.model.base.Model.load (rewritten in PR #808, commit 41b14d7b) and its accompanying test call torch.load(...) without passing weights_only=True:
- Test call site (Transformers4Rec/tests/unit/torch/model/test_model.py, line 425 in 8bf122f):
    state_dict = torch.load(os.path.join(tmpdir, "t4rec_model_class.pt"))
- Docstring guidance in Model.load directs users toward the same pattern.
requirements/pytorch.txt currently pins only torch>=1.0:
https://github.com/NVIDIA-Merlin/Transformers4Rec/blob/8bf122f5dcb39feecfc6dabde734d79c2d1c4380/requirements/pytorch.txt
On installs of PyTorch <2.6 (which the pin permits), torch.load defaults to weights_only=False. That is unrestricted pickle deserialization: any __reduce__ payload embedded in the file executes on load, which is the same class of issue as CVE-2025-32434. PyTorch ≥2.6 flipped the default to weights_only=True, but T4Rec's pin does not enforce that.
This is particularly relevant because PRs #802/#804 were framed as pickle-security hardening, yet that hardening does not cover this code path. When the allowlist-based utils/serialization.py was removed in PR #808, torch.load became the primary load path again, with its version-dependent default.
Steps/Code to reproduce bug
Benign repro (side effect is a temp file; no persistent change):
    import os, tempfile, torch

    class Exploit:
        def __reduce__(self):
            # pickle calls os.system(...) when this object is deserialized
            return (os.system, ("touch /tmp/t4rec_rce_demo",))

    with tempfile.NamedTemporaryFile(suffix=".pt", delete=False) as tf:
        torch.save(Exploit(), tf.name)
        path = tf.name

    torch.load(path, weights_only=False)  # runs os.system under pickle's __reduce__
    assert os.path.exists("/tmp/t4rec_rce_demo")
torch.load(path, weights_only=True) refuses the same payload with UnpicklingError:
    Weights only load failed. [...] Trying to load unsupported GLOBAL posix.system whose module posix is blocked.
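The refusal above comes from restricting which globals the unpickler is allowed to resolve. As a rough, stdlib-only illustration of that mechanism (this is not PyTorch's actual weights_only implementation), the sketch below overrides pickle.Unpickler.find_class. The names AllowlistUnpickler, SAFE_GLOBALS, Payload, and restricted_loads are hypothetical, and the payload calls the harmless os.getcwd rather than os.system.

```python
import io
import os
import pickle
from collections import OrderedDict

# Hypothetical allowlist: only these (module, name) pairs may be resolved.
SAFE_GLOBALS = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not listed in SAFE_GLOBALS."""

    def find_class(self, module, name):
        if (module, name) in SAFE_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global {module}.{name}")


class Payload:
    """Stand-in for the exploit object; os.getcwd is harmless."""

    def __reduce__(self):
        return (os.getcwd, ())


def restricted_loads(data: bytes):
    return AllowlistUnpickler(io.BytesIO(data)).load()


blob = pickle.dumps(Payload())

# Plain pickle.loads invokes os.getcwd() and returns its result.
unrestricted_result = pickle.loads(blob)

# The allowlisting unpickler refuses to resolve the callable at all.
try:
    restricted_loads(blob)
    blocked = False
except pickle.UnpicklingError:
    blocked = True

# Plain containers of primitive values still load.
allowed = restricted_loads(pickle.dumps(OrderedDict(w=1.0)))
```

The point of the sketch is that the callable named in __reduce__ is never resolved, so it never runs, while plain containers still round-trip.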
Expected behavior
- Every internal torch.load call passes weights_only=True.
- Tutorials and docstrings are updated to use the same pattern.
- requirements/pytorch.txt bumps the floor to torch>=2.6 so the safe default is always in effect.
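One way to check the first two expectations is a small static audit. The sketch below is an assumption about how such a check could look, not part of the repository: find_unsafe_torch_load is a hypothetical helper that uses the stdlib ast module to flag torch.load(...) calls that do not pass a literal weights_only=True. It only inspects Python source text, so notebook cells would need to be extracted first, and it does not detect aliased imports.

```python
import ast


def find_unsafe_torch_load(source: str):
    """Return line numbers of torch.load(...) calls lacking weights_only=True."""
    unsafe = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match the literal attribute access `torch.load`; aliased imports
        # (e.g. `from torch import load`) are not detected by this sketch.
        is_torch_load = (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "torch"
        )
        if not is_torch_load:
            continue
        explicit_true = any(
            kw.arg == "weights_only"
            and isinstance(kw.value, ast.Constant)
            and kw.value.value is True
            for kw in node.keywords
        )
        if not explicit_true:
            unsafe.append(node.lineno)
    return sorted(unsafe)


sample = '''
import torch
a = torch.load("a.pt")
b = torch.load("b.pt", weights_only=False)
c = torch.load("c.pt", weights_only=True)
'''
flagged = find_unsafe_torch_load(sample)  # lines of the calls for a and b
```

Running a check like this over the library and tests in CI would keep new call sites from reintroducing the version-dependent default.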
Environment details
- Transformers4Rec: main @ 8bf122f5
- PyTorch: the pin is >=1.0 (see requirements/pytorch.txt); installs on <2.6 are the affected population.
- Python: any
Additional context
Two-part remediation:
- In code: add weights_only=True to every torch.load(...) call in the library, tests, and example notebooks. This is safe because the surrounding code already expects a state_dict (plain tensor dict), which weights_only=True supports.
- In requirements: bump torch>=1.0 to torch>=2.6. The project's other dependencies (HF transformers, Merlin) already require a recent PyTorch, so this bump is effectively a no-op for supported users.
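To make the version dependence concrete, here is a minimal sketch of a check for whether a given PyTorch version string falls on the safe side of the 2.6 default flip. The helper names parse_major_minor and default_is_weights_only are hypothetical; in practice the input would be torch.__version__, which this stdlib-only sketch does not import.

```python
import re


def parse_major_minor(version: str):
    """Extract (major, minor) from strings such as '2.5.1+cu121'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def default_is_weights_only(version: str) -> bool:
    """True when torch.load defaults to weights_only=True (PyTorch >= 2.6)."""
    return parse_major_minor(version) >= (2, 6)


# Versions the current torch>=1.0 pin permits, on both sides of the flip.
examples = {v: default_is_weights_only(v) for v in ("1.13.1", "2.5.1+cu121", "2.6.0")}
```

Comparing integer tuples rather than raw strings avoids misordering versions such as 2.10 against 2.6.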
Happy to send a PR covering both parts.