Move TRT-RTX runtime controls to runtime context managers (v3, for review) by tp5uiuc · Pull Request #3 · tp5uiuc/TensorRT

tp5uiuc · 2026-06-03T17:18:42Z

Rewrites the v2 design (PR #2 base branch) to move cuda_graph_strategy, dynamic_shapes_kernel_specialization_strategy, and runtime_cache from CompilationSettings / serialized engine slots to runtime context managers, per pytorch#4310.

Summary

New RuntimeSettings dataclass on both Python and C++ sides; RuntimeCacheHandle registered as a torchbind class for shared-cache semantics.
Three new CMs in torch_tensorrt.runtime: runtime_config (pool API), runtime_cache (shared cache), plus per-knob sugars. All accept a list of modules.
New runtime_settings= kwarg on compile() / cross_compile_for_windows() / convert_module() for compile-time hints (one context-creation cost, no recreate on CM enter/exit).
Per-engine update_runtime_settings(rs) with fast-path equality check; rebuilds IRuntimeConfig + recreates execution context on diff.
SerializedInfoIndex drops 4 RTX slots; SERIALIZATION_LEN back to 12.

Tests

New test_004_runtime_settings.py (12 tests) covering data model, compile-time hint, CM restore, multi-target, dispatch.
test_000_runtime_cache.py, test_001_dynamic_shapes_kernel_strategy.py, test_001_cuda_graph_strategy.py migrated to the new API.

Status

Pre-commit clean (SKIP=mypy for the pre-existing _TRTEngine.py errors tracked separately).
RTX wheel build succeeds; test_004 all 12 pass; Python-runtime half of the three other test files passes.
C++-engine path crashes inside libtensorrt_rtx.so.1 at cuda_engine->getStreamableWeightsSize() -- I confirmed this is a pre-existing environmental issue on the test node (the same crash occurs with a known-good pre-built v2 wheel installed in the same env), not a regression from this refactor.

…anagers

Replaces the v2 design that packed three runtime-mode controls (``cuda_graph_strategy``, ``dynamic_shapes_kernel_specialization_strategy``, ``runtime_cache``) into ``CompilationSettings`` and the serialized engine tuple. Per pytorch#4310, these are runtime mode controls -- not engine properties -- and shouldn't pin at compile time or round-trip through serialization.

Highlights:

* New ``RuntimeSettings`` dataclass on both Python and C++ sides (``py/torch_tensorrt/runtime/_runtime_settings.py``, ``core/runtime/RuntimeSettings.h``). Three fields: ``dynamic_shapes_kernel_specialization_strategy``, ``cuda_graph_strategy``, ``runtime_cache``. The cache field accepts ``None``, a path string (engine creates an implicit handle, saves on ``__del__``, mirrors old ``runtime_cache_path=`` behavior), or a ``RuntimeCacheHandle`` (shared cache, lifecycle owned by the ``runtime_cache()`` CM).
* New ``RuntimeCacheHandle`` registered as a torchbind class (``torch.classes.tensorrt.RuntimeCacheHandle``) so the same C++ ``IRuntimeCache`` shared_ptr crosses the Python/C++ boundary.
* New per-engine ``update_runtime_settings`` API on both ``TRTEngine`` flavors. Fast-paths on settings equality; eagerly rebuilds ``IRuntimeConfig`` + recreates execution context on diff.
* Three new context managers in ``torch_tensorrt.runtime``: ``runtime_config(target_or_targets, **kw)`` (the pool API; also yields the target so ``with runtime_config(model, ...) as m:`` works), ``runtime_cache(target, path)`` (shared cache CM), and the per-knob sugars ``set_cuda_graph_strategy`` / ``set_dynamic_shapes_kernel_strategy``. All three accept a list of modules for multi-target use; the cache CM yields the ``RuntimeCacheHandle`` for inspection or explicit ``save()``.
* New ``runtime_settings=`` kwarg on ``compile()``, ``cross_compile_for_windows()``, and ``convert_module()`` so callers can prime the engine with the right values up front. Compile-time hint avoids the enter/exit recreate cost.
* ``CompilationSettings`` loses the three fields; the compiler entry points drop the three kwargs. ``SerializedInfoIndex`` drops the four RTX-related slots; ``SERIALIZATION_LEN`` returns to 12. Engines saved with the old 16-slot layout will raise the existing layout-mismatch error on load.
* Three existing test files migrated to the new API; new ``tests/py/dynamo/runtime/test_004_runtime_settings.py`` covers the data model, compile-time hint, runtime CM restore semantics, multi-target form, and dispatch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
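The restore and multi-target semantics described in this commit message can be sketched with the standard library alone. This is an illustration, not the code in this PR: ``ToyModule`` is a hypothetical stand-in for a compiled module, the strategy strings are placeholder values, and only the three field names and the ``runtime_config`` name come from the description above.

```python
from contextlib import contextmanager
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class RuntimeSettings:
    # Field names follow the PR description; the defaults are placeholders.
    dynamic_shapes_kernel_specialization_strategy: str = "lazy"
    cuda_graph_strategy: str = "disabled"
    runtime_cache: Optional[str] = None


class ToyModule:
    """Hypothetical stand-in for a compiled module that holds runtime settings."""

    def __init__(self) -> None:
        self.runtime_settings = RuntimeSettings()


@contextmanager
def runtime_config(targets, **overrides):
    """Apply overrides to one module or a list of modules, restore on exit."""
    modules = targets if isinstance(targets, list) else [targets]
    saved = [m.runtime_settings for m in modules]
    try:
        for m in modules:
            m.runtime_settings = replace(m.runtime_settings, **overrides)
        yield targets  # yields the target(s) so `with ... as m:` works
    finally:
        # Restore even if the body raised.
        for m, old in zip(modules, saved):
            m.runtime_settings = old


a, b = ToyModule(), ToyModule()
with runtime_config([a, b], cuda_graph_strategy="whole_graph_capture") as mods:
    inside = [m.runtime_settings.cuda_graph_strategy for m in mods]
after = [m.runtime_settings.cuda_graph_strategy for m in (a, b)]
```

Snapshotting the settings before applying any override keeps the restore step independent of which fields were changed inside the block.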

github-actions

Code conforms to C++ style guidelines

github-actions

Code conforms to Python style guidelines

Two follow-up bugs exposed by the cross-runtime test parameterization on the C++ engine path:

1. ``torch.classes.tensorrt.Engine.update_runtime_settings(...)`` rejected Python ``None`` for the ``RuntimeCacheHandle`` argument because TorchBind does not auto-convert ``None`` to a null ``c10::intrusive_ptr``. Switch the signature to ``c10::optional<c10::intrusive_ptr<RuntimeCacheHandle>>`` so the default ``runtime_cache=None`` case round-trips cleanly.
2. ``RuntimeSettings(runtime_cache="/some/path")`` only auto-saved to disk on engine destruction for the Python runtime (via ``_TRTEngine.__del__``). The C++ engine had no equivalent saver and the IRuntimeCache it materialized internally wasn't accessible from Python. Make the cpp path symmetric:
   - Expose ``serialize() -> at::Tensor`` / ``deserialize(at::Tensor)`` / ``has_cache()`` on the torchbind ``RuntimeCacheHandle`` class. ``at::Tensor`` of uint8 is used instead of ``std::string`` because TorchBind forces ``std::string`` through Python ``str`` (UTF-8) and serialized cache bytes are not valid UTF-8.
   - In ``TorchTensorRTModule.setup_engine`` (cpp branch), pre-materialize a torchbind handle when ``runtime_cache`` is a path string, store it on the module, and substitute it into ``_runtime_settings`` so the dispatch passes the same handle through.
   - Add ``_load_cpp_implicit_cache`` / ``_save_cpp_implicit_cache`` and a module ``__del__`` that mirrors the Python ``_TRTEngine`` saver, with ``filelock`` + atomic-rename semantics.
   - Teach ``_to_torchbind_handle`` to pass an already-torchbind ``torch.ScriptObject`` through unchanged.

All cpp + python runtime tests pass on TRT-RTX 1.5: test_004 (12/12), test_000 (10/10), test_001 dynamic_shapes (14/14), test_001 cuda_graph (13/13).
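Two points in this message lend themselves to a small standalone illustration: arbitrary binary payloads generally do not decode as UTF-8 (the reason given for using a uint8 tensor rather than a string), and a write-to-temp-then-rename save avoids leaving a half-written file behind. The sketch below uses only the standard library; the payload is made up, the helper names are hypothetical, and the ``filelock`` step mentioned above is left out.

```python
import os
import tempfile


def is_valid_utf8(blob: bytes) -> bool:
    """Return True only if the bytes decode cleanly as UTF-8."""
    try:
        blob.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def atomic_write_bytes(path: str, blob: bytes) -> None:
    """Write to a temp file in the target directory, then rename over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(blob)
        os.replace(tmp_path, path)  # readers see the old file or the new one
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


cache_bytes = bytes([0x00, 0xFF, 0xFE, 0x80])  # made-up payload, not a real cache
utf8_ok = is_valid_utf8(cache_bytes)

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "runtime_cache.bin")
    atomic_write_bytes(target, cache_bytes)
    with open(target, "rb") as f:
        round_trip = f.read()
    leftovers = [name for name in os.listdir(d) if name.endswith(".tmp")]
```

Creating the temp file in the same directory as the target keeps the rename on one filesystem, which is what lets ``os.replace`` swap the file in a single step.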

…timeCacheHandle lifecycle

Structural cleanup on top of the v3 work (no observable behavior change).

C++ side
--------

``RuntimeSettings`` migrates from a ``TRTEngine`` member to a ``TRTRuntimeConfig`` member -- the value-type now lives with its primary consumer (the IRuntimeConfig builder). ``TRTRuntimeConfig`` gains ``set_settings()`` (the diff-and-invalidate primitive) and turns the static ``uses_internal_capture`` / ``is_monolithic_capturable`` helpers into instance methods so callers do not need to pass settings around. ``TRTEngine::runtime_settings()`` forwards through.

Python side
-----------

Introduces a Python ``TRTRuntimeConfig`` class mirroring the C++ struct. ``_TRTEngine`` drops its three legacy fields (``runtime_config``, ``runtime_settings``, ``_implicit_cache_handle``) for a single ``self._trt_runtime_config`` member; ``_create_execution_context`` / ``update_runtime_settings`` / ``_is_monolithic_capturable`` / ``_enable_rtx_native_cudagraphs`` all delegate. Every ``ENABLED_FEATURES.tensorrt_rtx`` branch related to runtime-mode controls is absorbed into the shim, so engine and module call sites stay uniform across TRT and TRT-RTX builds.

Following the project's grouping convention, ``py/torch_tensorrt/runtime/_runtime_settings.py`` is merged into ``_runtime_config.py``; that file now holds ``RuntimeSettings``, the new ``TRTRuntimeConfig``, the existing ``runtime_config()`` CM, and its factory. Imports across the tree are repointed.

RuntimeCacheHandle ownership model
----------------------------------

Save-on-destruction moves from the two engine-side ``__del__`` paths (``_TRTEngine.close()`` for Python runtime, ``TorchTensorRTModule.__del__`` for cpp runtime) onto ``RuntimeCacheHandle.__del__`` itself, gated by a new ``autosave_on_del`` flag. The flag is set by ownership context:

* Engine-implicit handles (created from a path-string compile-time hint) get ``autosave_on_del=True`` -- no other Python object holds them, so the destructor is the only save opportunity.
* The ``runtime_cache(target, path)`` CM uses ``autosave_on_del=False`` on the handle it constructs; its ``__exit__`` saves explicitly.
* Hand-built handles default to ``autosave_on_del=False`` so save timing stays under the user's control.

The handle additionally accepts a ``torchbind_handle`` sibling so the same Python object can wrap either a ``trt.IRuntimeCache`` (Python rt) or a ``torch.classes.tensorrt.RuntimeCacheHandle`` (cpp rt); ``save`` / ``load`` source bytes from whichever is populated.

The cpp-runtime helpers on ``TorchTensorRTModule`` (``_load_cpp_implicit_cache``, ``_save_cpp_implicit_cache``, ``__del__``) and the duplicate save logic in ``_TRTEngine.close()`` are removed; both runtimes funnel through the single ``RuntimeCacheHandle.__del__`` path.

Tests
-----

test_000 grows two new tests asserting the new contract:

* ``test_cm_does_not_double_save_on_rc_gc`` -- only one save fires per CM block even after ``rc`` is GC'd.
* ``test_user_built_handle_no_autosave_by_default`` -- hand-built handles do not autosave on GC.

All 51 runtime tests pass on the refactored design (test_004 12/12, test_000 12/12, test_001 ds 14/14, test_001 cg 13/13).
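The three ownership cases above can be illustrated with a toy handle whose destructor saves only when the flag is set. This is a sketch under assumptions: ``ToyCacheHandle`` is hypothetical, "saving" just appends to a list instead of touching disk, and the example relies on CPython running ``__del__`` as soon as the last reference is dropped.

```python
import gc
from typing import List


class ToyCacheHandle:
    """Hypothetical handle: __del__ saves only when autosave_on_del is set."""

    def __init__(self, path: str, log: List[str], autosave_on_del: bool = False) -> None:
        self.path = path
        self.autosave_on_del = autosave_on_del
        self._log = log

    def save(self) -> None:
        self._log.append(self.path)  # stands in for writing the cache to disk

    def __del__(self) -> None:
        if self.autosave_on_del:
            self.save()


saves: List[str] = []

# Engine-implicit handle: the destructor is the only save opportunity.
implicit = ToyCacheHandle("implicit.bin", saves, autosave_on_del=True)
del implicit
gc.collect()

# Hand-built handle: the flag defaults to False, so GC writes nothing.
hand_built = ToyCacheHandle("manual.bin", saves)
del hand_built
gc.collect()

# CM-owned handle: the explicit save plays the role of the CM's __exit__.
cm_owned = ToyCacheHandle("cm.bin", saves)
cm_owned.save()
del cm_owned  # no second save when the handle is collected
gc.collect()
```

With the flag off for CM-owned handles, exactly one save is recorded per block, which is the contract the two new tests assert.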

Four follow-up changes responding to PR review comments:

* **Fold strategy sugar into ``_runtime_config.py``.** Delete ``_dynamic_shapes_kernel_strategy.py`` and ``_cuda_graph_strategy.py``; ``set_dynamic_shapes_kernel_strategy`` / ``set_cuda_graph_strategy`` now live alongside the ``runtime_config`` CM they delegate to. ``torch_tensorrt/runtime/__init__.py`` re-exports them from the consolidated module.
* **Hoist ``RuntimeSettings`` defaults into ``_defaults.py``.** Three new constants (``DYNAMIC_SHAPES_KERNEL_SPECIALIZATION_STRATEGY``, ``CUDA_GRAPH_STRATEGY``, ``RUNTIME_CACHE_PATH``) mirror the compilation-settings pattern. ``RUNTIME_CACHE_PATH`` defaults to a per-user temp file similar to ``ENGINE_CACHE_DIR``, so users get a disk-backed runtime cache without explicit opt-in; override via ``RuntimeSettings(runtime_cache="/path")`` or the ``runtime_cache`` CM. test_000 and test_004 are updated to reflect the new default.
* **Warn on non-RTX ``RuntimeSettings`` construction.** ``__post_init__`` now emits a one-shot ``UserWarning`` on regular TRT builds (gated by ``ENABLED_FEATURES.tensorrt_rtx``) so users see that the settings have no effect.
* **Drop ``TYPE_CHECKING`` string forward-refs for ``RuntimeSettings``.** Direct top-level imports across ``_compiler.py``, ``_conversion.py``, ``_TRTEngine.py`` and ``_TorchTensorRTModule.py``; bare ``Optional[RuntimeSettings]`` annotations everywhere. Deferred imports inside ``__init__`` / ``__setstate__`` removed.

All 51 runtime tests pass (test_004 12/12, test_000 12/12, test_001 ds 14/14, test_001 cg 13/13).

tp5uiuc · 2026-06-04T03:54:34Z

+            [](const c10::intrusive_ptr<TRTEngine>& self,
+               std::string const& dynamic_shapes_kernel_specialization_strategy,
+               std::string const& cuda_graph_strategy,
+               c10::optional<c10::intrusive_ptr<RuntimeCacheHandle>> runtime_cache) -> void {


Is it not possible to implement this as a property with getter and setter because of this c10::optional<c10::intrusive_ptr<RuntimeCacheHandle>> runtime_cache?

Possible: the c10::optional<c10::intrusive_ptr<RuntimeCacheHandle>> signature is fine for torchbind def_property (for comparison, device_memory_budget, immediately below in this same registration, is a property on TRTEngine).

The reason update_runtime_settings is a single bundled setter is that RuntimeSettings is the unit of context invalidation: changing any one of the three fields ends up calling recreate_execution_context once. Splitting into three individual properties would cause three sequential context-recreates on the engine-setup path (where all three are set together via _dispatch_runtime_settings_to_engine). The diff-check inside TRTRuntimeConfig::set_settings would catch no-op repeats, but consecutive changing writes would each trigger a recreate.

If you would rather have property syntax I can split it, but the bundled form keeps setup tight. WDYT?
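To make the recreate-count argument concrete, here is a toy model of the trade-off. It is a sketch, not the engine code: ``ToyEngine`` is hypothetical, the strategy strings and cache path are placeholders, and "recreating the execution context" is reduced to incrementing a counter.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class RuntimeSettings:
    # Field names follow the PR; the values are placeholders.
    dynamic_shapes_kernel_specialization_strategy: str = "lazy"
    cuda_graph_strategy: str = "disabled"
    runtime_cache: Optional[str] = None


class ToyEngine:
    """Hypothetical engine that counts execution-context recreations."""

    def __init__(self) -> None:
        self.settings = RuntimeSettings()
        self.recreate_count = 0

    def update_runtime_settings(self, new: RuntimeSettings) -> bool:
        if new == self.settings:  # fast path: nothing changed
            return False
        self.settings = new
        self.recreate_count += 1  # stands in for recreating the context
        return True


target = RuntimeSettings("eager", "whole_graph_capture", "/tmp/cache.bin")

# Bundled setter: one diff, one recreate; the repeat is a no-op.
bundled = ToyEngine()
bundled.update_runtime_settings(target)
bundled.update_runtime_settings(target)

# Per-field writes: each changing write triggers its own recreate.
per_field = ToyEngine()
for name in (
    "dynamic_shapes_kernel_specialization_strategy",
    "cuda_graph_strategy",
    "runtime_cache",
):
    changed = replace(per_field.settings, **{name: getattr(target, name)})
    per_field.update_runtime_settings(changed)
```

Both engines end in the same state; only the number of recreations differs (one versus three in this toy model).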

Maybe a compromise here would be to have a tuple(...) as a setter in both Python and C++ and pass the data back and forth, so that .settings = would call the update-settings method? But that would mean the Python and C++ code paths within TRTEngine.py need to be handled differently (since RuntimeSettings is not available in the C++ API, nor should it be, since we only use the Python API). Then internally (in this function) we can unpack the tuple (or even use std::apply()) to convert it to runtime settings and move it into update_runtime_settings.

Discussion-only: the tuple-as-property idea on torchbind is doable but I want to flag the cost before going down that road.

To make engine.settings = ... work as a Python-side property on torch.classes.tensorrt.Engine we would need to:

Define a torchbind def_property("settings", getter, setter) whose setter accepts a tuple-of-primitives (since TorchBind cannot carry the RuntimeSettings value type natively -- only scalars, strings, tensors, and registered torchbind classes).

The tuple shape would have to mirror our struct: (int64_t ds_strategy, int64_t cg_strategy, optional<intrusive_ptr<RuntimeCacheHandle>>). Same data as update_runtime_settings today, just packaged.

On the Python _TRTEngine side, mirror the same property: engine.settings returns a RuntimeSettings dataclass; engine.settings = rs does the dispatch.

The asymmetry you flagged is real: _TRTEngine.py (Python runtime) has access to the RuntimeSettings dataclass directly, but the cpp-torchbind Engine only sees the tuple form. Python module code that talks to self.engine has to branch on isinstance(self.engine, TRTEngine) -- exactly the pattern we already have in _dispatch_runtime_settings_to_engine, except now it would also be true for the property read path (not just write).

Net: the current state -- update_runtime_settings method on the C++ torchbind binding + runtime_settings property on the Python TorchTensorRTModule wrapper -- already gives you mod.runtime_settings = rs at the user-facing layer, without forcing the engine-class boundary to also be a property. Going the extra step to make self.engine.settings = ... work has only an internal-API benefit (the dispatch path), at the cost of a more complex tuple-marshaling property.

Happy to do it if you want it for symmetry, but my preference would be to leave the engine binding as a method and treat the module-level property as the API contract. WDYT?

Let's leave it as-is now.

Mirror ``TRTRuntimeConfig.set_settings`` (Python runtime) on the cpp runtime path. Previously the cpp side dropped the C++ engine's intrusive_ptr on settings change but left ``self._implicit_cache_handle`` on the ``TorchTensorRTModule`` pointing at the *old* wrapper -- the new cache had no Python autosave companion and never wrote to disk.

Factor the path-string-to-torchbind-handle materialization into ``TorchTensorRTModule._materialize_cpp_implicit_handle``. Called from ``setup_engine`` and ``_dispatch_runtime_settings_to_engine`` (cpp branch); synchronously saves the prior wrapper before swap, replaces ``self._implicit_cache_handle`` with the new one, then runs ``load()`` after the C++ engine has attached the IRuntimeCache.

Test: ``test_set_runtime_settings_saves_prior_cache_on_swap`` (parametrized over both runtimes). Compiles with path A; swaps to path B; asserts A is written synchronously at swap time and B is written on ``del compiled``. The walk-to-inner-module is wrapped in a helper so the loop variable doesn't outlive the call and keep the inner TRT module alive past ``del compiled`` (which would suppress the post-del autosave).

All 53 tests pass (test_004 12/12, test_000 14/14, test_001 ds 14/14, test_001 cg 13/13).

C++:

- ``RuntimeSettings`` strategy fields are now typed ``enum class : int32_t`` values (``DynamicShapesKernelSpecializationStrategy`` / ``CudaGraphStrategy``) mirroring the nvinfer1 enums. Validation moves to dedicated boundary helpers ``to_dynamic_shapes_kernel_strategy`` / ``to_cuda_graph_strategy`` called from the torchbind ``update_runtime_settings`` lambda; the rest of the code uses enum values directly (no more raw ``int32_t`` field reads).
- Reverse-lookup helpers ``ds_strategy_name`` / ``cg_strategy_name`` now take the enum type and return ``std::string_view``; the lookup tables switch to ``std::array<std::string_view, N>``.
- ``RuntimeCacheHandle::cache`` renamed to ``trt_handle`` so call sites read ``runtime_cache->trt_handle`` instead of ``runtime_cache->cache``.
- ``TRTRuntimeConfig::set_settings`` renamed to ``settings(RuntimeSettings)`` (overload of the getter) with ``[[nodiscard]]``. ``TRTEngine``'s ``update_runtime_settings`` similarly renamed to ``runtime_settings(...)`` overload with ``[[nodiscard]] bool`` return. Torchbind binding name stays ``update_runtime_settings`` for Python contract stability.
- ``TRTRuntimeConfig::is_monolithic_capturable`` drops the unconditional ``noexcept`` (the RTX branch uses ``TORCHTRT_ASSERT`` which can throw).
- ``TRTEngine::num_execution_contexts_created`` regains ``noexcept`` -- bound via a torchbind lambda to sidestep the lack of a ``const noexcept`` ``def`` specialization.
- ``TRTEngine::has_dynamic_inputs`` default changed to ``false``.
- ``TRTRuntimeConfig::ensure_initialized`` introduces an ``auto& rt_cache = settings_.runtime_cache`` alias for the cache attachment block.
- ``RuntimeSettings::to_str`` wraps its output in ``RuntimeSettings{...}``.
- ``RuntimeCacheHandle::serialize`` collapses the three early ``at::empty({0}, opts)`` returns into a single ``empty`` lambda.

Python:

- ``TorchTensorRTModule.set_runtime_settings(rs)`` becomes a ``runtime_settings`` property setter so callers write ``mod.runtime_settings = rs``. Operates on ``self``; outer callers walk ``named_modules()`` themselves (the ``runtime_config`` CM and tests already do).
- Docstrings + the prior caller in ``runtime_config`` CM updated to use the setter syntax.

All 61 runtime tests pass on TRT-RTX 1.5.0.103.

tp5uiuc · 2026-06-07T15:56:27Z

Round 4 review feedback addressed in 38b7033 (full build + 61/61 runtime tests pass on TRT-RTX 1.5.0.103).

C++ changes

RuntimeSettings strategy fields → enum class : int32_t (DynamicShapesKernelSpecializationStrategy, CudaGraphStrategy); validators to_*_strategy(int32_t) on the Py→C++ boundary.
ds_strategy_name / cg_strategy_name reverse-lookup helpers return std::string_view; tables are std::array<std::string_view, N>.
RuntimeCacheHandle::cache → trt_handle (so runtime_cache->trt_handle reads cleanly).
TRTRuntimeConfig::set_settings → settings(RuntimeSettings) overload + [[nodiscard]]. Same pattern on TRTEngine::update_runtime_settings → runtime_settings(...) overload + [[nodiscard]] bool. Torchbind binding name stays update_runtime_settings for Python API stability.
TRTRuntimeConfig::is_monolithic_capturable drops the unconditional noexcept (RTX branch can throw via TORCHTRT_ASSERT).
TRTEngine::num_execution_contexts_created regains const noexcept; bound via lambda to sidestep torchbind missing a const noexcept def specialization.
TRTEngine::has_dynamic_inputs default → false.
TRTRuntimeConfig::ensure_initialized uses an auto& rt_cache = settings_.runtime_cache alias.
RuntimeSettings::to_str() wraps in RuntimeSettings{...}.
RuntimeCacheHandle::serialize collapses 3 empty-tensor returns into one empty lambda local.

Python changes

TorchTensorRTModule.set_runtime_settings(rs) → runtime_settings property setter. Callers now write mod.runtime_settings = rs; the runtime_config CM and tests walk named_modules() themselves.
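The shape of the property-setter change can be sketched as follows. The classes here are hypothetical toys, not the real ``TorchTensorRTModule`` or engine binding; the only names taken from this PR are ``runtime_settings``, ``update_runtime_settings`` and ``_dispatch_runtime_settings_to_engine``, and a plain list stands in for walking ``named_modules()``.

```python
from typing import List, Optional


class ToyEngine:
    """Hypothetical engine that records every settings object it receives."""

    def __init__(self) -> None:
        self.applied: List[object] = []

    def update_runtime_settings(self, rs: object) -> None:
        self.applied.append(rs)


class ToyTRTModule:
    """Hypothetical wrapper exposing runtime_settings as a property."""

    def __init__(self, engine: Optional[ToyEngine] = None) -> None:
        self.engine = engine
        self._runtime_settings: object = None

    @property
    def runtime_settings(self) -> object:
        return self._runtime_settings

    @runtime_settings.setter
    def runtime_settings(self, rs: object) -> None:
        self._runtime_settings = rs
        self._dispatch_runtime_settings_to_engine(rs)

    def _dispatch_runtime_settings_to_engine(self, rs: object) -> None:
        if self.engine is None:  # nothing to update before the engine exists
            return
        self.engine.update_runtime_settings(rs)


settings = {"cuda_graph_strategy": "disabled"}  # placeholder settings object
modules = [ToyTRTModule(ToyEngine()), ToyTRTModule(engine=None)]
for mod in modules:  # the outer caller walks the submodules itself
    mod.runtime_settings = settings
```

Keeping the engine-level call a method and exposing the property only on the module wrapper matches the preference stated in the discussion reply.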

Discussion-only replies posted on:

register_jit_hooks.cpp:72 (tuple-as-property on torch.classes.tensorrt.Engine -- doable but pushes complexity; preference noted to keep the engine binding as a method and treat the module-level property as the API contract).

Layered: stream <-> bytes is the new primitive, path-mode opens the file and delegates. The CM accepts ``str`` / ``os.PathLike`` / file-like in the same positional slot; a stream is read once on enter and written once on exit, with open/close ownership staying with the caller's ``with open(...)`` block. ``io.BytesIO``, gzip streams, and HTTP buffers all "just work" through the same code path.

* ``RuntimeCacheHandle.load_from_stream`` / ``save_to_stream`` are the byte-bridge primitives; ``load`` / ``save`` now delegate (atomic tmp+rename + filelock stays in path-mode where it belongs).
* ``_RuntimeCacheContextManager`` duck-types the IO arg, raises TypeError on anything that's not a path, PathLike, or file-like.
* Read-only / write-only streams degrade silently (OSError / UnsupportedOperation are caught), matching the early-return path for ``path=""``.

Tests: 4 new cases for BytesIO round-trip, real file handle, the handle-level stream primitives, and the TypeError contract. Existing 65 runtime tests (incl. all CM + persistence + autosave tests) stay green on TRT-RTX 1.5.0.103.
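A rough sketch of the layering described here, using only the standard library. ``ToyCacheHandle`` and ``classify_io_arg`` are hypothetical names; the path-mode methods simply open the file and delegate to the stream primitives, and the atomic tmp+rename and filelock steps are deliberately omitted.

```python
import io
import os
import tempfile


class ToyCacheHandle:
    """Hypothetical handle whose cache contents are just a bytes attribute."""

    def __init__(self) -> None:
        self.data = b""

    def load_from_stream(self, stream) -> int:
        self.data = stream.read()
        return len(self.data)

    def save_to_stream(self, stream) -> int:
        stream.write(self.data)
        return len(self.data)

    def load(self, path) -> int:
        if not os.path.exists(path):
            return 0  # first run: nothing to load yet
        with open(path, "rb") as f:
            return self.load_from_stream(f)

    def save(self, path) -> int:
        with open(path, "wb") as f:
            return self.save_to_stream(f)


def classify_io_arg(arg) -> str:
    """Duck-type the positional IO argument the way the message describes."""
    if isinstance(arg, (str, os.PathLike)):
        return "path"
    if hasattr(arg, "read") or hasattr(arg, "write"):
        return "stream"
    raise TypeError(f"expected a path or file-like object, got {type(arg).__name__}")


src = ToyCacheHandle()
src.data = b"\x00\x01kernel-blob"  # made-up payload

buf = io.BytesIO()
src.save_to_stream(buf)
buf.seek(0)
from_stream = ToyCacheHandle()
n_loaded = from_stream.load_from_stream(buf)

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "cache.bin")
    src.save(path)
    from_path = ToyCacheHandle()
    from_path.load(path)
```

Because the stream primitives never open or close anything, ownership of the stream stays with the caller, which is the property the commit message relies on for ``with open(...)`` blocks.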

…rs propagate

Two follow-up cleanups after review.

* ``RuntimeCacheHandle.__eq__`` / ``__hash__`` were spelling out the default ``object`` semantics (identity comparison, id-derived hash). Deleted both; moved the rationale ("handles wrap distinct ``IRuntimeCache`` instances even when paths match -- separate slots in ``IRuntimeConfig``, no kernel-specialization sharing") into the class docstring under "Equality is identity-based".
* ``load_from_stream`` / ``save_to_stream`` were swallowing ``(AttributeError, OSError)`` and conflating it with the legitimate "nothing to load / nothing to save" cases (both returned ``0``). That hid programmer bugs: passing a write-only sink to load, or a closed handle to save, looked identical to first-run empty. Path-mode ``load`` / ``save`` already let IO errors propagate; the stream primitives now do the same, so ``0`` unambiguously means "the buffer was empty". Callers that genuinely want a tolerant variant can wrap the call themselves.

All 65 runtime tests stay green on TRT-RTX 1.5.0.103.
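The difference between the strict primitive and a caller-side tolerant wrapper might look like the sketch below. The function names are illustrative free functions rather than the handle's methods, and the example returns the bytes read instead of a count.

```python
import io
import os
import tempfile


def load_from_stream(stream) -> bytes:
    """Strict primitive: IO errors propagate; b'' means the stream was empty."""
    return stream.read()


def tolerant_load(stream) -> bytes:
    """Opt-in wrapper for callers that prefer the old swallow-errors behavior."""
    try:
        return load_from_stream(stream)
    except (AttributeError, OSError):
        return b""


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "cache.bin")
    with open(path, "wb") as write_only:
        # Reading from a write-only file object is a programmer error.
        try:
            load_from_stream(write_only)
            strict_raised = False
        except io.UnsupportedOperation:
            strict_raised = True
        # The wrapper hides the same error and reports "empty".
        tolerant_result = tolerant_load(write_only)

# A genuinely empty buffer is the only way the strict call yields b''.
empty_result = load_from_stream(io.BytesIO())
```

``io.UnsupportedOperation`` is a subclass of ``OSError``, which is why the tolerant wrapper catches it.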

…ensorRTModule

Before: the Python-rt implicit ``RuntimeCacheHandle`` lived on ``TRTRuntimeConfig._implicit_cache_handle`` (exposed via a forwarding property on ``_TRTEngine``), while the cpp-rt one already lived on ``TorchTensorRTModule._implicit_cache_handle``. Two locations, two construct/swap/save code paths, mostly mirrored. The save-unification landed earlier via ``RuntimeCacheHandle.__del__`` -- this commit closes the loop by giving both runtimes one storage slot.

* ``TorchTensorRTModule._implicit_cache_handle`` is now the canonical single owner. ``_materialize_cpp_implicit_handle`` renamed to ``_materialize_implicit_handle`` and branches on ``self._use_python_runtime``; the helper builds a wrapper, swap-saves the prior on change, and returns the dispatch-flavored settings.
* ``setup_engine`` pre-wraps string ``runtime_cache`` paths for BOTH runtimes before the engine is constructed, so ``TRTRuntimeConfig`` only ever sees ``None`` or an external ``RuntimeCacheHandle`` -- the ``isinstance(rc, str)`` branch in ``_apply_settings`` is gone, replaced by an explicit ``TypeError`` to catch new callers.
* ``TRTRuntimeConfig`` shrinks: no ``_implicit_cache_handle`` field, no ``implicit_cache_handle`` property, no save-on-swap inside ``set_settings``. The class is now a pure-execution shim. To keep the Python-rt's lazy ``IExecutionContext`` semantics (the handle's pybind ``IRuntimeCache`` doesn't exist until ``ensure_cache`` fires inside ``_apply_settings``), ``_apply_settings`` auto-loads when an attached handle has ``autosave_on_del=True and path``. The module's ``needs_load`` after dispatch still drives the cpp-rt load (the torchbind sibling materializes eagerly in C++).
* ``_to_torchbind_handle`` learns to pull ``_torchbind`` from a Python ``RuntimeCacheHandle`` wrapper rather than constructing a fresh torchbind sibling -- otherwise CM enter/exit would orphan the cpp-rt cache pointer on every re-dispatch.
* ``_materialize_implicit_handle`` gains a no-op fast path for "incoming ``rc`` is the wrapper we already own", which is exactly the shape of ``mod.runtime_settings = current`` (CM enter with no override on ``runtime_cache``). Without it the helper would relinquish ownership and save-then-rebuild on every CM step.
* Tests: 2 introspection tests now reach the handle through ``module._implicit_cache_handle`` instead of ``engine._implicit_cache_handle``; new ``_find_python_trt_module`` helper.

The 65-test runtime suite stays green on TRT-RTX 1.5.0.103.

Four small clean-ups from the latest review pass; no behavior change.

* ``TorchTensorRTModule.runtime_settings`` setter: drop the ``if self.engine is None`` early return -- ``_dispatch_runtime_settings_to_engine`` already no-ops on a None engine, so the setter collapses to two lines.
* ``_materialize_implicit_handle``: ``getattr(old, "path", None) == rc`` was defensive paranoia. ``old`` is always a ``RuntimeCacheHandle`` here, so ``old.path == rc`` is enough.
* ``TRTEngine._num_execution_contexts_created``: initialized to ``0`` in ``__init__`` and ``__setstate__`` instead of being lazily summoned by ``getattr``. Increment is now ``+= 1`` and the getter returns the attribute directly.
* ``TRTRuntimeConfig.set_settings``: ``self._live = None`` becomes ``self.reset()`` -- one fewer place to remember which fields ``reset()`` clears.

All 65 runtime tests stay green on TRT-RTX 1.5.0.103.

…ross-compile

* ``RuntimeSettings.cpp::{ds,cg}_strategy_name``: a ``size_t`` cast wraps negative underlying values to a huge unsigned, so a single ``i < size`` check covers both ends. No ``std::clamp``, no separate ``i < 0`` arm, fewer casts at the call site.
* ``dynamo._compiler.cross_compile_for_windows``: dropped the ``runtime_settings`` keyword. Runtime settings are runtime-only knobs applied at ``IExecutionContext`` creation; the cross-compiled engine is consumed on Windows where the caller controls them via ``mod.runtime_settings = ...`` or the ``runtime_config`` CM. Passing them at cross-compile time was a no-op signal. ``compile`` and ``compile_module`` still accept the kwarg for the same-platform flow.

All 65 runtime tests stay green on TRT-RTX 1.5.0.103.
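The unsigned-cast bounds check is a C++ idiom, but its arithmetic can be emulated in Python to show why one comparison is enough. This is only an illustration: it assumes a 64-bit ``size_t``, the table entries are placeholder names rather than the real strategy names, and ``"unknown"`` is a made-up fallback.

```python
SIZE_T_MASK = (1 << 64) - 1  # assumes a 64-bit size_t

CG_STRATEGY_NAMES = ("disabled", "whole_graph_capture")  # placeholder table


def as_size_t(i: int) -> int:
    """Emulate a cast to size_t: negative values wrap to huge unsigned ones."""
    return i & SIZE_T_MASK


def strategy_name(i: int, table=CG_STRATEGY_NAMES) -> str:
    idx = as_size_t(i)
    if idx < len(table):  # one check rejects both negative and too-large values
        return table[idx]
    return "unknown"


wrapped = as_size_t(-1)
names = [strategy_name(i) for i in (-1, 0, 1, 2)]
```

After the cast, a negative index is indistinguishable from an index far past the end of the table, so no separate ``i < 0`` arm is needed.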

github-actions Bot added component: api [Python] component: core component: dynamo component: runtime component: tests component: conversion labels Jun 3, 2026

github-actions Bot approved these changes Jun 3, 2026

tp5uiuc commented Jun 3, 2026
Comment thread py/torch_tensorrt/runtime/_cuda_graph_strategy.py Outdated

tp5uiuc commented Jun 3, 2026
Comment thread py/torch_tensorrt/runtime/_dynamic_shapes_kernel_strategy.py Outdated

tp5uiuc commented Jun 3, 2026
Comment thread py/torch_tensorrt/runtime/_runtime_settings.py Outdated

tp5uiuc commented Jun 3, 2026
Comment thread py/torch_tensorrt/runtime/_runtime_settings.py Outdated

tp5uiuc commented Jun 3, 2026
Comment thread py/torch_tensorrt/dynamo/conversion/_conversion.py Outdated

github-actions Bot approved these changes Jun 4, 2026

tp5uiuc marked this pull request as draft June 4, 2026 02:22

github-actions Bot approved these changes Jun 4, 2026

tp5uiuc commented Jun 4, 2026
Comment thread core/runtime/register_jit_hooks.cpp Outdated

tp5uiuc commented Jun 4, 2026
Comment thread core/runtime/RuntimeSettings.cpp Outdated

github-actions Bot approved these changes Jun 4, 2026

tp5uiuc commented Jun 7, 2026
Comment thread core/runtime/TRTEngine.h Outdated

tp5uiuc commented Jun 7, 2026
Comment thread core/runtime/TRTEngine.h Outdated

tp5uiuc commented Jun 7, 2026
Comment thread core/runtime/TRTRuntimeConfig.h Outdated

tp5uiuc commented Jun 7, 2026
Comment thread core/runtime/TRTRuntimeConfig.cpp Outdated

tp5uiuc commented Jun 7, 2026
Comment thread py/torch_tensorrt/dynamo/runtime/_TorchTensorRTModule.py Outdated

github-actions Bot approved these changes Jun 7, 2026

tp5uiuc commented Jun 7, 2026
Comment thread core/runtime/RuntimeSettings.cpp Outdated

tp5uiuc commented Jun 9, 2026
Comment thread py/torch_tensorrt/dynamo/runtime/_TorchTensorRTModule.py Outdated

tp5uiuc commented Jun 9, 2026
Comment thread py/torch_tensorrt/dynamo/runtime/_TorchTensorRTModule.py Outdated

tp5uiuc commented Jun 9, 2026
Comment thread py/torch_tensorrt/dynamo/runtime/_TRTEngine.py Outdated

tp5uiuc commented Jun 9, 2026
Comment thread py/torch_tensorrt/dynamo/_compiler.py Outdated

github-actions Bot approved these changes Jun 9, 2026

tp5uiuc commented Jun 9, 2026
Comment thread py/torch_tensorrt/runtime/_runtime_config.py Outdated

github-actions Bot approved these changes Jun 9, 2026
