feat(relocation) Use a shared bucket for relocations #116108
There were more file moves :| Instead of copying all that data around, let's put the upload/export in the right place to start with. Now that we have a shared bucket, there is no reason to move files around.
With a shared bucket we don't need to shuffle files around nearly as much. This should address both the slow RPC durations and the overall performance of relocations.
Transfer operations don't touch relocation blobs anymore.
The storage backend caches file contents and this cost me an hour of debugging.
# was stored in the shared bucket, create a RelocationFile record
# so that the import process can begin.
# TODO(cells) Remove this once RelocationFile.file is optional.
file = File.objects.create(name="stub", type=RELOCATION_FILE_TYPE, size=10000)
Bug: A stub File for SAAS-to-SAAS relocations is created with a hardcoded size=10000, causing all such relocations to be miscategorized as "small" for throttling purposes.
Severity: MEDIUM
Suggested Fix
Instead of creating a stub File with a hardcoded size, the actual size of the relocation export should be determined and stored. This will ensure the should_throttle_relocation function correctly categorizes relocations and applies the throttling limits as intended.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: src/sentry/relocation/services/relocation_export/impl.py#L112
Potential issue: For SAAS-to-SAAS relocations, a stub `File` object is created with a
hardcoded `size=10000`. The throttling logic in `should_throttle_relocation` uses this
size to categorize relocations. Consequently, all SAAS-to-SAAS relocations, regardless
of their actual size, are classified as "small". This skews the throttling counts, as
large relocations are not counted against the "medium" or "large" limits. This could
lead to bypassing the daily quota for large self-hosted relocations, as previous large
SAAS-to-SAAS relocations no longer contribute to the limit.
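The miscategorization described above can be illustrated with a minimal sketch. The thresholds, the `SizeBucket` enum, and the `size_bucket` helper are hypothetical and are not taken from `should_throttle_relocation`; only the hardcoded `10000` comes from the diff. The point is simply that any size-based classifier returns the same category for every relocation if it is always fed the stub size.

```python
from enum import Enum

# Hypothetical thresholds, chosen only for illustration. The real limits
# used by the throttling code may differ.
SMALL_MAX_BYTES = 10 * 1024 * 1024
MEDIUM_MAX_BYTES = 100 * 1024 * 1024


class SizeBucket(Enum):
    SMALL = "small"
    MEDIUM = "medium"
    LARGE = "large"


def size_bucket(size_bytes: int) -> SizeBucket:
    """Map an export size in bytes to a throttling category (hypothetical)."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes < SMALL_MAX_BYTES:
        return SizeBucket.SMALL
    if size_bytes < MEDIUM_MAX_BYTES:
        return SizeBucket.MEDIUM
    return SizeBucket.LARGE


STUB_SIZE = 10000  # the hardcoded value from the diff
actual_size = 500 * 1024 * 1024  # hypothetical size of a real export

# With the stub size, the export is categorized as small regardless of
# how large the underlying file actually is.
print(size_bucket(STUB_SIZE))
print(size_bucket(actual_size))
```

Under these assumed thresholds, recording the real export size instead of the stub would place the same relocation in a different category.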
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 0ececbc. Configure here.
)
RelocationFile.objects.create(
    relocation=new_relocation,
    file=file,
Retry flow breaks due to mismatched bucket path
High Severity
When retrying a relocation, the new RelocationFile copies bucket_path from the old relocation (e.g. runs/{old_uuid}/in/raw-relocation-data.tar), but the new relocation gets a different UUID. The old preprocessing_transfer code used to copy the raw data file from Django's filestore into runs/{new_uuid}/in/, but that copy step was removed in this PR. Now preprocessing_complete checks runs/{new_uuid}/in/raw-relocation-data.tar and Cloud Build copies from runs/{new_uuid}/in/ — but the file only exists at the old UUID's path. Retried relocations will always fail with a FileNotFoundError.
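The path mismatch can be reproduced with a small, self-contained sketch. `InMemoryBucket`, `raw_data_path`, and `prepare_retry` are hypothetical names used only for illustration; they are not Sentry APIs. Only the `runs/{uuid}/in/raw-relocation-data.tar` layout comes from the report. Copying the object to the new UUID's path within the same bucket is one possible fix; another would be to have later steps read the stored `bucket_path` instead of deriving the path from the new UUID.

```python
RAW_DATA_NAME = "raw-relocation-data.tar"


def raw_data_path(relocation_uuid: str) -> str:
    """Build the per-relocation input path described in the report."""
    return f"runs/{relocation_uuid}/in/{RAW_DATA_NAME}"


class InMemoryBucket:
    """Hypothetical stand-in for the shared bucket, for illustration only."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def save(self, path: str, data: bytes) -> None:
        self._blobs[path] = data

    def exists(self, path: str) -> bool:
        return path in self._blobs

    def copy(self, src: str, dst: str) -> None:
        self._blobs[dst] = self._blobs[src]


def prepare_retry(bucket: InMemoryBucket, old_uuid: str, new_uuid: str) -> str:
    """Make the raw export available under the retried relocation's path."""
    src = raw_data_path(old_uuid)
    dst = raw_data_path(new_uuid)
    if not bucket.exists(src):
        raise FileNotFoundError(src)
    bucket.copy(src, dst)
    return dst


bucket = InMemoryBucket()
bucket.save(raw_data_path("old-uuid"), b"export bytes")

# Before the fix: the retried relocation looks under its own UUID and
# finds nothing, because the file only exists at the old UUID's path.
missing_before = not bucket.exists(raw_data_path("new-uuid"))

new_path = prepare_retry(bucket, "old-uuid", "new-uuid")
```

A copy within a single bucket avoids moving data between control and cells, but it is still a copy; reading from the stored `bucket_path` would avoid it entirely at the cost of touching every consumer of the path.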


We've had a number of problems with relocations timing out or consuming too much memory as export files are moved between control and cells. The RPC calls and file operations would be much simpler if there were a single shared bucket instead of one bucket per cell. Furthermore, I've removed all of the file copying, which should improve performance and reliability.
We've already provisioned a new shared bucket, and the configuration changes to use that bucket are in getsentry/ops#20737.
There is still some cleanup work to be done and I've added some TODOs for what I plan to address in the short term.
Refs INFRENG-318