fix(oban): handle non-list stacktraces in Oban error reporter#1071

Merged: whatyouhide merged 1 commit into master from fix/oban-error-reporter-non-list-stacktrace on May 27, 2026

@whatyouhide (Collaborator) commented May 27, 2026

Problem

When an Oban job dies from a gen-call exit, Oban puts the {module, function, args} tuple from the exit reason in the :stacktrace field of the [:oban, :job, :exception] telemetry metadata, rather than a stacktrace list.

Sentry.Integrations.Oban.ErrorReporter.report/5 passes that value straight into Sentry's :stacktrace option, which is typed as a list. This raises the following error, which we saw in production:

** (NimbleOptions.ValidationError) invalid value for :stacktrace option: expected list, got: {NimblePool, :checkout, [#PID<0.493.0>]}
    (sentry) lib/sentry/event.ex:186: Sentry.Event.create_event/1
    (sentry) lib/sentry.ex: Sentry.capture_message/2
    (sentry) lib/sentry/integrations/oban/error_reporter.ex:114: Sentry.Integrations.Oban.ErrorReporter.report/5

The handler raises, and :telemetry then permanently detaches it. The affected node stops reporting all Oban job errors to Sentry until it restarts, so the first such exit silently creates an observability blind spot.

The existing normalization at the top of report/5 only handled the empty-list ([]) case; any other non-list value was passed through unchanged.

When an Oban job dies from a gen-call exit (for example a GenServer.call
or a NimblePool/Finch checkout timeout), Oban surfaces the
`{module, function, args}` from the exit reason in the `:stacktrace`
field of the `[:oban, :job, :exception]` telemetry metadata rather than a
stacktrace list.

`Sentry.Integrations.Oban.ErrorReporter` forwarded that value straight
into Sentry's `:stacktrace` option, which only accepts a list. The
resulting `NimbleOptions.ValidationError` crashed the telemetry handler,
and `:telemetry` then permanently detached it — so the affected node
stopped reporting all Oban job errors to Sentry until restart.

Coerce any non-list stacktrace to an empty list before use, which then
falls back to the synthesized worker frame just like the existing
empty-stacktrace path.
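
A minimal sketch of the coercion described above, for illustration only. The module `StacktraceSketch` and its functions are hypothetical names, not the code in `Sentry.Integrations.Oban.ErrorReporter`, and the shape of the fallback frame (`{worker, :process, 1, []}`) is an assumption about what a synthesized worker frame could look like.

```elixir
defmodule StacktraceSketch do
  # Hypothetical helper, not the actual Sentry implementation.

  # A real stacktrace is a list of frames, so keep it as-is.
  def normalize(stacktrace) when is_list(stacktrace), do: stacktrace

  # Anything else (for example the {module, function, args} tuple from a
  # gen-call exit) is coerced to an empty list.
  def normalize(_other), do: []

  # Assumed fallback: when no frames are available, synthesize a single
  # frame pointing at the worker's process/1 callback.
  def with_fallback([], worker) when is_atom(worker), do: [{worker, :process, 1, []}]
  def with_fallback(stacktrace, _worker) when is_list(stacktrace), do: stacktrace
end

# The value from the production error is a tuple, not a list.
bad_stacktrace = {NimblePool, :checkout, [self()]}

bad_stacktrace
|> StacktraceSketch.normalize()
|> StacktraceSketch.with_fallback(MyApp.Worker)
|> IO.inspect(label: "stacktrace passed on")
```

Here `MyApp.Worker` is a placeholder worker module. The point is only that the value handed on as `:stacktrace` is always a list, so option validation can no longer raise inside the telemetry handler.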
@whatyouhide whatyouhide force-pushed the fix/oban-error-reporter-non-list-stacktrace branch from 3e1a19c to 348d047 Compare May 27, 2026 09:44
@whatyouhide whatyouhide marked this pull request as ready for review May 27, 2026 09:44
@whatyouhide (Collaborator, Author) commented

Ah man, @solnic, I just saw that this overlaps #1064 (which I had not seen). The tests and comments here are a bit more thorough, but it's up to you.

@solnic (Collaborator) commented May 27, 2026

@whatyouhide no worries, great that we know what is causing it!

@whatyouhide whatyouhide merged commit f80a33e into master May 27, 2026
13 checks passed
@whatyouhide whatyouhide deleted the fix/oban-error-reporter-non-list-stacktrace branch May 27, 2026 11:24