Skip to content

Add graph reinit on failure#4237

Open
atobiszei wants to merge 10 commits into
mainfrom
atobisze_mp_reinit_graph
Open

Add graph reinit on failure#4237
atobiszei wants to merge 10 commits into
mainfrom
atobisze_mp_reinit_graph

Conversation

@atobiszei
Copy link
Copy Markdown
Collaborator

No description provided.

Copilot AI review requested due to automatic review settings May 22, 2026 14:12
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds automatic MediaPipe graph reinitialization on inference failure when using the graph queue/pool, to prevent returning an errored (“poisoned”) graph instance back to the pool for reuse.

Changes:

  • Introduces GraphHelper::reinitialize() plus an RAII GraphReinitGuard that rebuilds a fresh CalculatorGraph unless dismissed on the success path.
  • Hooks the guard into queue-based unary and streaming inference paths to trigger graph rebuild after failures.
  • Moves per-graph shutdown logic into GraphHelper’s destructor and simplifies GraphQueue destructor.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
src/mediapipe_internal/mediapipegraphexecutor.hpp Adds GraphReinitGuard around queue-path inference and dismisses it on success.
src/mediapipe_internal/graphqueue.hpp Adds GraphHelper destructor, reinitialize() declaration, and new GraphReinitGuard type.
src/mediapipe_internal/graphqueue.cpp Implements GraphHelper shutdown + reinit logic; defaults GraphQueue destructor.
Comments suppressed due to low confidence (2)

src/mediapipe_internal/graphqueue.cpp:142

  • reinitialize() sets graph to nullptr on ObserveOutputStream failure (graph.reset()). The next request acquiring this GraphHelper will dereference graphHelper->graph in GraphIdGuard (graph(*graphHelper->graph)), which can crash if graph is null. Ensure reinitialize() always leaves a non-null graph, or update acquisition logic to detect nullptr and rebuild/replace before use.
        if (!absStatus.ok()) {
            SPDLOG_ERROR("Graph reinitialize: ObserveOutputStream failed: {}", absStatus.ToString());
            graph.reset();
            return;
        }

src/mediapipe_internal/graphqueue.cpp:162

  • reinitialize() sets graph to nullptr on StartRun failure (graph.reset()). The next request acquiring this GraphHelper will dereference graphHelper->graph in GraphIdGuard (graph(*graphHelper->graph)), which can crash if graph is null. Ensure reinitialize() always leaves a non-null graph, or update acquisition logic to detect nullptr and rebuild/replace before use.
    if (!absStatus.ok()) {
        SPDLOG_ERROR("Graph reinitialize: StartRun failed: {}", absStatus.ToString());
        graph.reset();
        return;
    }

Comment on lines +93 to +95
~GraphReinitGuard() {
if (!dismissed) {
helper.reinitialize(config, sidePacketMaps);
Comment on lines +110 to +114
void GraphHelper::reinitialize(const ::mediapipe::CalculatorGraphConfig& config, const GraphSidePackets& sidePacketMaps) {
SPDLOG_DEBUG("Reinitializing graph after error");
// Tear down the old graph (best-effort, errors expected since graph is in bad state)
if (this->graph) {
auto absStatus = this->graph->CloseAllPacketSources();
Comment on lines +128 to +132
auto absStatus = graph->Initialize(config);
if (!absStatus.ok()) {
SPDLOG_ERROR("Graph reinitialize: Initialize failed: {}", absStatus.ToString());
graph.reset();
return;
graph->Cancel();
}

void GraphHelper::reinitialize(const ::mediapipe::CalculatorGraphConfig& config, const GraphSidePackets& sidePacketMaps) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unit test that will ensure failed graphs are reusable is missing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants