
perf improvements#806

Closed
LadyBluenotes wants to merge 16 commits into main from github-calls

Conversation


@LadyBluenotes LadyBluenotes commented Apr 1, 2026

Summary by CodeRabbit

  • Performance

    • Optimized file prefetching in the code explorer with delayed, cancellable scheduling to reduce unnecessary loads.
  • Reliability

    • Enhanced resilience for documentation and content retrieval when GitHub is unavailable, with graceful fallbacks.
    • Improved sponsor data availability during GitHub service disruptions.
  • User Experience

    • Clearer copy-to-clipboard notifications indicating whether content comes from the markdown endpoint or GitHub.
    • Improved internal doc navigation preload behavior for faster menu-driven page loads.


netlify bot commented Apr 1, 2026

Deploy Preview for tanstack ready!

  • 🔨 Latest commit: a4c638a
  • 🔍 Latest deploy log: https://app.netlify.com/projects/tanstack/deploys/69ce900dfdc5d90008a0429a
  • 😎 Deploy Preview: https://deploy-preview-806--tanstack.netlify.app
Lighthouse
1 path audited
Performance: 30 (🔴 down 19 from production)
Accessibility: 90 (no change from production)
Best Practices: 75 (🔴 down 8 from production)
SEO: 97 (no change from production)
PWA: 70 (no change from production)
View the detailed breakdown and full score reports

To edit notification comments on pull requests, go to your Netlify project configuration.


coderabbitai bot commented Apr 1, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c48541df-8524-434f-bf7b-c36795713e6b

📥 Commits

Reviewing files that changed from the base of the PR and between 22d43ad and e754f6f.

📒 Files selected for processing (1)
  • src/components/DocsLayout.tsx

📝 Walkthrough

Walkthrough

Handler refactor: discovery logic was extracted into two utilities—NPM and GitHub discovery. The background handler now delegates to these functions, aggregates their metrics and errors (prefixed by source), and enqueues discovered intent versions based on utility results.

Changes

Cohort / File(s) Summary
Intent discovery extraction
netlify/functions/sync-intent-discover-background.ts, src/utils/intent-admin.server.ts, src/utils/intent-discovery.server.ts
Moved NPM/GitHub discovery into discoverIntentPackagesFromNpm() and discoverIntentPackagesViaGitHub(); handler and admin now delegate to these helpers and aggregate metrics/errors with npm/ or github/ prefixes.
GitHub API error model
src/server/github.ts
Added GitHubApiErrorKind, GitHubApiError class, normalizeGitHubApiError, and isRecoverableGitHubApiError to standardize GitHub API error classification (including rate-limit parsing).
Content error classification & recursive tree
src/utils/documents.server.ts
Added GitHubContentError and related guards; switched remote traversal to recursive git tree (/git/trees/...) and integrated stale-cache fallback usage for content fetches.
Cache + stale-fallback and dedupe
src/utils/cache.server.ts
Introduced fetchCachedWithStaleFallback, pendingDocCache deduplication, getCached/getStaleCached, and changed LRU to retain stale entries (deleteOnStale: false).
Resilient content fallbacks
src/utils/docs.ts, src/utils/config.ts, src/server/sponsors.ts
Wrap repo file fetches with recoverable-error detection; return placeholder/EMPTY_CONFIG or metadata-only sponsors on recoverable GitHub content errors and log warnings.
GitHub examples & repo helpers
src/utils/github-example.server.ts, src/utils/github-repo.server.ts
Added cached branch→tree resolution (fetchExampleGitTree), concurrency-limited raw file fetches, and getAuthenticatedGitHubLogin (token-hash+cache) used by repo name checks.
Prefetch scheduling & query optimization
src/components/FileExplorer.tsx, src/routes/$libraryId/$version.docs.framework.$framework.examples.$.tsx
Added delayed, cancellable prefetch intent (PREFETCH_INTENT_DELAY_MS, schedulePrefetchIntent, clearPrefetchIntent) and updated prefetchFileContent to skip when path is current or React Query already has success/fetching state.
Copy/URL resolution changes
src/components/CopyPageDropdown.tsx, src/components/CopyMarkdownButton.tsx
Unified sourceMarkdownUrl resolution (prefer site markdown vs GitHub raw per repo) and removed CopyMarkdownButton implementation while keeping useCopyButton.
Minor validations & formatting
src/routes/api/builder/deploy/check-name.ts, src/builder/api/remote.ts, src/components/DocsLayout.tsx
Added validateRepoName usage in name check endpoint, line-ending/formatting adjustments in remote API file, and changed Link preload to "intent" for doc nav items.

Sequence Diagram(s)

sequenceDiagram
  actor Scheduler
  participant Handler
  participant NpmWorker as "NPM Discovery"
  participant GhWorker as "GitHub Discovery"
  participant NPM as "NPM Registry"
  participant GH as "GitHub API"
  participant DB
  participant Queue
  participant Logger

  Scheduler->>Handler: trigger discovery (cron/background)
  Handler->>NpmWorker: call discoverIntentPackagesFromNpm()
  Handler->>GhWorker: call discoverIntentPackagesViaGitHub()
  NpmWorker->>NPM: search packages / fetch packuments
  NpmWorker->>DB: upsert package (unverified/verified)
  NpmWorker->>Queue: enqueue selected versions
  NpmWorker->>Logger: return npm metrics/errors

  GhWorker->>GH: search code / fetch package.json / fetch tarballs
  GhWorker->>DB: upsert package (unverified/verified)
  GhWorker->>Queue: enqueue selected versions
  GhWorker->>Logger: return github metrics/errors

  NpmWorker-->>Handler: npm result (counts, errors)
  GhWorker-->>Handler: github result (counts, errors)
  Handler->>Logger: log aggregated metrics and prefixed errors ("npm/...", "github/...")
  Handler->>Scheduler: complete

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 I hopped through trees of code and found,
caches that keep the stale around.
Discovery split, both paths set free,
enqueued the skills for you and me.
A rabbit cheers—resilience crowned! 🥕

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 8.89%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.
  • Title check (❓ Inconclusive): the title 'perf improvements' is vague and does not convey the substantial architectural changes in the changeset. Resolution: use a more specific title, such as 'Extract intent discovery logic into reusable server utilities' or 'Refactor cache utilities and add stale-fallback support for GitHub API errors'.

✅ Passed checks (1 passed)
  • Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@LadyBluenotes LadyBluenotes marked this pull request as ready for review April 1, 2026 23:10

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 10

🧹 Nitpick comments (9)
src/utils/intent-discovery.server.ts (2)

169-170: Redundant database operations.

upsertIntentPackage({ name: pkgName, verified: true }) on line 169 already sets verified: true during the upsert. The subsequent markPackageVerified(pkgName) on line 170 performs an additional database update to set the same flag. One of these calls is unnecessary.

♻️ Remove redundant call
       results.hadSkills++

       await upsertIntentPackage({ name: pkgName, verified: true })
-      await markPackageVerified(pkgName)

       results.enqueued += await enqueueDiscoveredVersions(pkgName)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/intent-discovery.server.ts` around lines 169 - 170, The code calls
upsertIntentPackage({ name: pkgName, verified: true }) and then immediately
calls markPackageVerified(pkgName), causing a redundant DB update; remove the
extra update by deleting the markPackageVerified(pkgName) call (or alternatively
stop setting verified:true in the upsert if you intend to verify separately),
keeping upsertIntentPackage as the single operation that sets verified for
pkgName.

36-51: Consider parallelizing version enqueueing for better throughput.

The sequential await within the loop processes versions one at a time. For packages with many versions, this could be slow. Since this is a background task, parallelizing with Promise.all or batched concurrency would improve throughput.

♻️ Suggested parallel approach
 async function enqueueDiscoveredVersions(packageName: string) {
   const packument = await fetchPackument(packageName)
   const knownVersions = await getKnownVersions(packageName)
   const versionsToEnqueue = selectVersionsToSync(packument, knownVersions)

-  for (const { version, tarball, publishedAt } of versionsToEnqueue) {
-    await enqueuePackageVersion({
+  await Promise.all(
+    versionsToEnqueue.map(({ version, tarball, publishedAt }) =>
+      enqueuePackageVersion({
         packageName,
         version,
         tarballUrl: tarball,
         publishedAt,
-    })
-  }
+      })
+    )
+  )

   return versionsToEnqueue.length
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/intent-discovery.server.ts` around lines 36 - 51,
enqueueDiscoveredVersions is currently awaiting enqueuePackageVersion serially
inside the for-loop, which is slow for many versions; change it to run the
enqueues in parallel by mapping versionsToEnqueue to promises and using
Promise.all or Promise.allSettled (or a small concurrency limiter like p-map) to
execute enqueuePackageVersion concurrently, ensuring you still await the
combined promise and handle/report per-item errors; keep references to
enqueueDiscoveredVersions, selectVersionsToSync, and enqueuePackageVersion when
locating the change.
src/utils/config.ts (1)

54-56: Return a fresh fallback config instead of the shared singleton.

EMPTY_CONFIG.sections is shared across requests. If any downstream code mutates the config in place, later requests will inherit that state. Clone or freeze the fallback before returning it.

Also applies to: 71-79

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/config.ts` around lines 54 - 56, The EMPTY_CONFIG constant (type
ConfigSchema) is a shared mutable fallback which can be mutated by downstream
code; change code that returns this fallback to return a fresh immutable
instance instead of the singleton. Locate usages that return EMPTY_CONFIG (and
the similar fallback in the 71-79 area) and replace them so they either return a
shallow/deep clone (e.g., new object with sections: [...EMPTY_CONFIG.sections])
or return a frozen copy (Object.freeze({ ...EMPTY_CONFIG, sections:
Object.freeze([...EMPTY_CONFIG.sections])})), ensuring callers get a non-shared,
immutable ConfigSchema.
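One way to implement the suggested fix is to return a fresh copy on every call; the ConfigSchema shape below is a simplified stand-in for the real type in src/utils/config.ts:

```typescript
// Hypothetical, simplified config shape; the real ConfigSchema has more fields.
type ConfigSchema = { sections: Array<{ label: string }> }

const EMPTY_CONFIG: ConfigSchema = { sections: [] }

// Return a fresh copy so no caller can mutate the shared singleton in place.
function makeFallbackConfig(): ConfigSchema {
  return {
    ...EMPTY_CONFIG,
    sections: EMPTY_CONFIG.sections.map((section) => ({ ...section })),
  }
}
```

Callers that previously returned EMPTY_CONFIG directly would return makeFallbackConfig() instead, so downstream mutations never leak across requests.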
src/server/sponsors.ts (2)

31-32: TTL of 1ms in development effectively disables caching.

The 1ms TTL in development means the cache will almost always be stale, forcing a fetch on every request. Consider using a slightly higher value (e.g., 5000ms) to still exercise the caching logic during development while keeping data relatively fresh.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/sponsors.ts` around lines 31 - 32, SPONSOR_CACHE_TTL_MS is set to
1ms in development which effectively disables caching; change the development
value to a small but meaningful TTL (e.g., 5000) so the cache exercise runs in
dev while keeping prod as 5 * 60 * 1000; update the constant definition
(SPONSOR_CACHE_TTL_MS) to use 5000 when process.env.NODE_ENV === 'development'
and leave the production expression unchanged.

92-125: mergeSponsorsWithMetadata mutates the input sponsors array.

The function uses Object.assign and push to modify the passed-in array. If the caller expects immutability or reuses the array, this could cause subtle bugs. Consider working on a copy, or document the mutation clearly.

Proposed fix for immutability
 function mergeSponsorsWithMetadata(
   sponsors: Sponsor[],
   sponsorsMeta: SponsorMeta[],
 ) {
+  const merged = [...sponsors]
   sponsorsMeta.forEach((sponsorMeta: SponsorMeta) => {
-    const matchingSponsor = sponsors.find((d) => d.login === sponsorMeta.login)
+    const matchingSponsor = merged.find((d) => d.login === sponsorMeta.login)
     // ... rest of logic operates on `merged`
   })
-  sponsors.sort(...)
+  merged.sort(...)
-  return sponsors
+  return merged
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/sponsors.ts` around lines 92 - 125, mergeSponsorsWithMetadata
currently mutates the input sponsors array via Object.assign and push; change it
to work on a copy and return a new array instead: clone the input (e.g., const
merged = sponsors.map(s => ({ ...s }))), then for each sponsorMeta locate the
cloned item by login and replace/update it immutably (create a new object with
spreads, not Object.assign), and if missing add a new object to the cloned array
(use merged.push with a newly created object), finally sort the cloned array and
return it; ensure you reference mergeSponsorsWithMetadata, Sponsor, SponsorMeta,
matchingSponsor, and sponsorsMeta in your changes.
src/utils/documents.server.ts (2)

9-10: TTL of 1ms in development effectively disables caching.

Same as in sponsors.ts—consider a higher value for development to exercise caching logic.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/documents.server.ts` around lines 9 - 10, DOC_FILE_CACHE_TTL_MS is
set to 1ms in development which effectively disables caching; update the
development branch of the constant so it uses a meaningful TTL (for example a
few seconds or a minute) instead of 1, e.g., change the ternary from
process.env.NODE_ENV === 'development' ? 1 : ... to use a larger dev TTL
(matching the value used in sponsors.ts) so the caching logic in code that reads
DOC_FILE_CACHE_TTL_MS actually gets exercised.

728-730: Branch 404 returns empty array—callers may not distinguish from empty tree.

When the branch doesn't exist, returning [] is later treated the same as an empty tree by fetchApiContentsRemote (line 901-902 returns null). This works, but returning null directly here would be more explicit about the distinction.

Proposed clarification
   if (branchResponse.status === 404) {
-    return []
+    // Branch doesn't exist; let caller handle as missing content
+    return []
   }

Or throw a not_found error to let the stale-fallback mechanism handle it appropriately (though not_found is not recoverable, so it would propagate).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/documents.server.ts` around lines 728 - 730, The branch 404 handler
currently returns an empty array which is indistinguishable from a legitimately
empty tree; instead, change the branch-not-found case to return null (or throw a
not_found error if you prefer fallback behavior) where branchResponse.status ===
404 is checked so callers can distinguish "branch missing" from "empty tree";
update the behavior at the branchResponse check in this file and ensure
fetchApiContentsRemote (the caller that currently treats [] as null at lines
around 901–902) will receive null (or propagate the not_found) to preserve the
intended stale-fallback logic.
src/utils/cache.server.ts (2)

94-109: Same race-condition pattern in fetchCached as in fetchCachedWithStaleFallback.

Both functions register the promise in pendingDocCache after the promise chain is constructed. Since this happens synchronously (no await between creation and set), concurrent callers in the same event-loop tick would see the check at line 88-92 pass and could theoretically start duplicate fetches.

In practice, Node.js single-threaded execution makes this unlikely within a tick, but if opts.fn() is synchronous or microtask-based, the window exists.

Alternative: use IIFE for cleaner ordering
   const resultPromise = (async () => {
     try {
       const result = await opts.fn()
       docCache.set(opts.key, result, { ttl: opts.ttl })
       return result
     } finally {
       pendingDocCache.delete(opts.key)
     }
   })()

   pendingDocCache.set(opts.key, resultPromise)
   return resultPromise
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/cache.server.ts` around lines 94 - 109, The race happens because
both fetchCached and fetchCachedWithStaleFallback create the promise chain and
only then register it in pendingDocCache; fix by creating and registering the
promise before invoking opts.fn() so concurrent callers see the pending promise
immediately. Concretely, wrap the work in an async IIFE (or new Promise) named
e.g. resultPromise that calls await opts.fn(), sets docCache
(docCache.set(opts.key,...)), and in finally deletes pendingDocCache, then call
pendingDocCache.set(opts.key, resultPromise) before invoking opts.fn() (i.e.,
ensure the IIFE is created and assigned to resultPromise and stored in
pendingDocCache prior to awaiting/starting opts.fn()); apply same change to
fetchCached and fetchCachedWithStaleFallback.

20-26: getCached returns undefined for falsy cached values.

If undefined is a valid cached value, this implementation cannot distinguish between "not cached" and "cached as undefined". This is likely fine for your use case (caching objects), but worth noting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/cache.server.ts` around lines 20 - 26, The current getCached
implementation can’t reliably distinguish “not cached” from “cached as
undefined” unless you explicitly check presence; update getCached to use a
presence check on docCache rather than rely on truthiness — if docCache is a Map
use docCache.has(key) before returning docCache.get(key) as T, otherwise (if
docCache is a plain object) use Object.prototype.hasOwnProperty.call(docCache,
key) and return docCache[key] as T; keep the same function name getCached and
ensure the presence check is performed so cached undefined is returned rather
than treated as missing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/components/CopyPageDropdown.tsx`:
- Around line 182-184: The toast source label in CopyPageDropdown is currently
chosen based on repo, which mislabels the fallback case where content is fetched
from pageMarkdownUrl; update the ternary that builds the toast message to check
whether pageMarkdownUrl (or the actual fetched URL variable) is truthy and use
'Markdown content copied from markdown endpoint' when it is, otherwise fall back
to the GitHub-specific message (still keep the special-case for repo ===
'tanstack/tanstack.com' if intended). Locate the ternary near the repo variable
usage inside the CopyPageDropdown component and replace the repo-based condition
with a pageMarkdownUrl-based condition so the label reflects the real source.

In `@src/routes/api/builder/deploy/check-name.ts`:
- Around line 23-28: The handler currently returns a 200 JSON payload when
validateRepoName(name) fails, which should be a client error; update the code
path where validation.valid is false (the block using validateRepoName and
Response.json) to return the same JSON body but with an HTTP 400 status (e.g.,
Response.json({...}, { status: 400 })) so invalid repository names are signaled
consistently as bad requests.

In `@src/server/github.ts`:
- Around line 112-113: The function isRecoverableGitHubApiError currently treats
all GitHubApiError instances as recoverable; change it to inspect error.kind and
return false for authorization failures by checking that the error is a
GitHubApiError and that error.kind is not 'unauthorized' or 'forbidden',
returning true only for other kinds; reference the isRecoverableGitHubApiError
function and the GitHubApiError type/kind property when making the change.
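A sketch of the kind-based check described above; the exact GitHubApiErrorKind union is an assumption, and only transient kinds are treated as recoverable here (consistent with the note elsewhere in this review that not_found is not recoverable):

```typescript
// Assumed error-kind union; the real one in src/server/github.ts may differ.
type GitHubApiErrorKind =
  | 'rate_limited'
  | 'server_error'
  | 'unauthorized'
  | 'forbidden'
  | 'not_found'

class GitHubApiError extends Error {
  constructor(public kind: GitHubApiErrorKind, message: string) {
    super(message)
  }
}

function isRecoverableGitHubApiError(error: unknown): boolean {
  if (!(error instanceof GitHubApiError)) return false
  // Only transient failures warrant stale/fallback data; auth failures
  // and missing content should surface to the caller.
  return error.kind === 'rate_limited' || error.kind === 'server_error'
}
```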

In `@src/server/sponsors.ts`:
- Around line 39-66: isRecoverableGitHubApiError currently treats all
GitHubApiError instances (including kind === 'unauthorized') as recoverable,
causing 401s to silently fall back in getSponsors logic (the
fetchCachedWithStaleFallback call and the catch that calls
getSponsorsFromMetadataOnly). Modify isRecoverableGitHubApiError so it returns
false for unauthorized errors (e.g., error.kind === 'unauthorized' or status ===
401) and only returns true for transient server/rate-limit errors; this will
cause authentication failures to be thrown instead of using stale/metadata-only
sponsors.

In `@src/utils/docs.ts`:
- Around line 35-54: The fallback generator buildUnavailableFile currently
returns language comment lines for non-Markdown files which can produce invalid
syntax for arbitrary assets; update buildUnavailableFile so that when isMarkdown
is false it returns a neutral plain-text fallback (e.g., a short human-readable
message with filename and "Content temporarily unavailable") instead of
language-specific comment markers, using the existing filePath and lowerFilePath
checks to detect Markdown and preserving the Markdown branch as-is; ensure the
returned string is plain text that won't break copy/paste or viewers for assets
like package.json, styles.css, or index.html.

In `@src/utils/documents.server.ts`:
- Around line 772-780: The code currently only logs when treeData.truncated is
true inside fetchGitTreeRecursive, which can silently omit files; update
fetchGitTreeRecursive to handle truncation by either (a) throwing a descriptive
error (include repo and branch) so callers can fail-fast, or (b) returning an
explicit status/flag up the call chain (e.g., {tree, truncated: true}) so
callers can decide, or (c) implement a fallback to a paginated/content-based
fetch for large repos; locate the check of treeData.truncated and replace the
console.warn with one of these concrete behaviors and ensure callers of
fetchGitTreeRecursive (and any code expecting a complete tree) are updated to
handle the new error/flag.

In `@src/utils/github-example.server.ts`:
- Around line 64-67: The fetch calls (e.g., the branchResponse fetch for
`https://api.github.com/repos/${repo}/branches/${branch}` and the other
GitHub/raw fetches around the other locations) lack timeouts and can hang; fix
by using an AbortController for each fetch: create an AbortController, pass
controller.signal into fetch, set a timeout (e.g., 10_000 ms) that calls
controller.abort(), and ensure you clear the timeout in a finally block so you
don’t leak timers; apply this same pattern to the other fetches in this file
(the fetch that uses `headers` at the branchResponse site and the subsequent
raw.githubusercontent.com fetches referenced at the other ranges).
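The timeout pattern described above might look like this (the helper name and the 10-second default are illustrative, not the file's actual code):

```typescript
// Illustrative AbortController timeout wrapper for fetch.
async function fetchWithTimeout(url: string, timeoutMs = 10_000): Promise<Response> {
  const controller = new AbortController()
  const timer = setTimeout(() => controller.abort(), timeoutMs)
  try {
    return await fetch(url, { signal: controller.signal })
  } finally {
    // Always clear the timer so it doesn't leak when the fetch finishes first.
    clearTimeout(timer)
  }
}
```

The finally block is the important part: without it, every successful request leaves a live timer behind.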
- Around line 97-104: The current fetchExampleFiles logic treats
treeData.truncated as a warning and returns incomplete treeData.tree; change
this to treat truncation as a failure: when treeData.truncated is true (check
treeData.truncated inside fetchExampleFiles), do not return treeData.tree but
instead either (A) implement the non-recursive fallback — fetch the top-level
tree entries and then fetch each sub-tree individually (non-recursive API calls)
recursively to assemble the full tree, or (B) throw a descriptive error
(including repo and branch) so callers can fallback to cloning or other
strategies; ensure the error/log includes repo, branch and the truncated flag so
the caller can decide how to proceed rather than silently returning partial
data.
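Option (B) above, failing fast on truncation, could be sketched like this (the response type and helper name are illustrative):

```typescript
// Minimal shape of the GitHub git-tree response fields used here.
interface GitTreeResponse {
  tree: Array<{ path: string; type: string }>
  truncated: boolean
}

// Throw a descriptive error instead of silently returning a partial tree,
// so callers can fall back to cloning or a non-recursive walk.
function assertCompleteTree(
  treeData: GitTreeResponse,
  repo: string,
  branch: string,
): GitTreeResponse['tree'] {
  if (treeData.truncated) {
    throw new Error(
      `Git tree for ${repo}@${branch} was truncated by the GitHub API; ` +
        `a non-recursive walk or clone is needed for a complete listing`,
    )
  }
  return treeData.tree
}
```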

In `@src/utils/github-repo.server.ts`:
- Around line 117-143: The current getAuthenticatedGitHubLogin() call uses
fetchCached(...) and will store null on transient /user failures; change to
avoid poisoning the cache by either (a) replacing fetchCached(...) with
fetchCachedWithStaleFallback(...) so stale values are returned on failures, or
(b) keep using fetchCached(...) but ensure the inner fn never returns null
(e.g., on non-ok or missing login throw or return a sentinel that prevents cache
writes). Update the async fn inside the fetchCached/fetchCachedWithStaleFallback
call (the one creating AbortController, calling
fetch('https://api.github.com/user'), and returning user.login) to implement the
chosen approach so null results are not written to the cache for the key
`github-user-login:${tokenHash}`.
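Option (b) above can be sketched with a throwing inner function, so failures never reach the cache; fetchCached here is a minimal stand-in for the real helper in cache.server.ts:

```typescript
// Minimal stand-in cache and fetchCached; the real helper takes a ttl too.
const cache = new Map<string, string>()

async function fetchCached(key: string, fn: () => Promise<string>): Promise<string> {
  const hit = cache.get(key)
  if (hit !== undefined) return hit
  const value = await fn() // a throw here leaves the cache untouched
  cache.set(key, value)
  return value
}

// Sketch: the inner fn throws on a failed /user lookup instead of
// returning null, so a transient failure is never cached.
async function getLogin(
  fetchUser: () => Promise<{ login?: string } | null>,
): Promise<string> {
  return fetchCached('github-user-login', async () => {
    const user = await fetchUser()
    if (!user?.login) throw new Error('GitHub /user lookup failed; not caching')
    return user.login
  })
}
```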

In `@src/utils/intent-discovery.server.ts`:
- Around line 127-136: The loop over candidates that calls
fetch(`https://api.github.com/repos/${repo}/contents/${path}`) (variables:
candidates, repo, path, contentRes, ghHeaders) can trigger GitHub secondary rate
limits; add rate limiting by inserting a small delay between requests (e.g.,
await sleep(200-500ms) after each fetch) or integrate a rate-limiter/pool (e.g.,
p-limit or Bottleneck) to throttle concurrent/serial requests, ensuring the
delay or limiter is applied around the fetch call in the same loop so sequential
requests to the GitHub contents API are slowed to avoid 403s.
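The simple-delay approach could look like this (the delay value and helper names are illustrative; a limiter like p-limit or Bottleneck is the heavier-duty alternative):

```typescript
// Illustrative throttled loop: pause between sequential requests so
// rapid-fire calls to the GitHub contents API don't trip secondary
// rate limits (403s).
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

async function forEachThrottled<T>(
  items: T[],
  processOne: (item: T) => Promise<void>,
  delayMs = 300,
): Promise<void> {
  for (const item of items) {
    await processOne(item)
    await sleep(delayMs) // small gap between requests
  }
}
```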


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 796c578e-948a-47b3-9bbb-47688c54d089

📥 Commits

Reviewing files that changed from the base of the PR and between e7ac541 and 22d43ad.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (17)
  • netlify/functions/sync-intent-discover-background.ts
  • src/builder/api/remote.ts
  • src/components/CopyMarkdownButton.tsx
  • src/components/CopyPageDropdown.tsx
  • src/components/FileExplorer.tsx
  • src/routes/$libraryId/$version.docs.framework.$framework.examples.$.tsx
  • src/routes/api/builder/deploy/check-name.ts
  • src/server/github.ts
  • src/server/sponsors.ts
  • src/utils/cache.server.ts
  • src/utils/config.ts
  • src/utils/docs.ts
  • src/utils/documents.server.ts
  • src/utils/github-example.server.ts
  • src/utils/github-repo.server.ts
  • src/utils/intent-admin.server.ts
  • src/utils/intent-discovery.server.ts
💤 Files with no reviewable changes (1)
  • src/components/CopyMarkdownButton.tsx

Comment on lines +182 to +184
repo === 'tanstack/tanstack.com'
? 'Markdown content copied from markdown endpoint'
: 'Markdown content copied from GitHub',

⚠️ Potential issue | 🟡 Minor

Source label can be incorrect on fallback path.

On Line 182, the message is derived from repo, not the actual fetched URL. If repo is set but branch/filePath is missing, content comes from pageMarkdownUrl yet the toast says GitHub.

Proposed fix
-      await copyContent(
-        content,
-        repo === 'tanstack/tanstack.com'
-          ? 'Markdown content copied from markdown endpoint'
-          : 'Markdown content copied from GitHub',
-      )
+      const sourceLabel =
+        urlToFetch === pageMarkdownUrl
+          ? 'Markdown content copied from markdown endpoint'
+          : 'Markdown content copied from GitHub'
+      await copyContent(content, sourceLabel)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/components/CopyPageDropdown.tsx` around lines 182 - 184, The toast source
label in CopyPageDropdown is currently chosen based on repo, which mislabels the
fallback case where content is fetched from pageMarkdownUrl; update the ternary
that builds the toast message to check whether pageMarkdownUrl (or the actual
fetched URL variable) is truthy and use 'Markdown content copied from markdown
endpoint' when it is, otherwise fall back to the GitHub-specific message (still
keep the special-case for repo === 'tanstack/tanstack.com' if intended). Locate
the ternary near the repo variable usage inside the CopyPageDropdown component
and replace the repo-based condition with a pageMarkdownUrl-based condition so
the label reflects the real source.

Comment on lines +23 to +28
const validation = validateRepoName(name)
if (!validation.valid) {
return Response.json({
available: false,
error: validation.error ?? 'Invalid repository name',
})

⚠️ Potential issue | 🟡 Minor

Return a 400 for invalid repository names.

Right now this path responds with 200, unlike the other input/auth failures in this handler. That makes invalid input look like a successful availability check to callers and intermediaries.

Suggested fix
         const validation = validateRepoName(name)
         if (!validation.valid) {
-          return Response.json({
-            available: false,
-            error: validation.error ?? 'Invalid repository name',
-          })
+          return Response.json(
+            {
+              available: false,
+              error: validation.error ?? 'Invalid repository name',
+            },
+            { status: 400 },
+          )
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routes/api/builder/deploy/check-name.ts` around lines 23 - 28, The
handler currently returns a 200 JSON payload when validateRepoName(name) fails,
which should be a client error; update the code path where validation.valid is
false (the block using validateRepoName and Response.json) to return the same
JSON body but with an HTTP 400 status (e.g., Response.json({...}, { status: 400
})) so invalid repository names are signaled consistently as bad requests.

Comment on lines +112 to +113
export function isRecoverableGitHubApiError(error: unknown) {
return error instanceof GitHubApiError

⚠️ Potential issue | 🟠 Major

Don't classify auth failures as recoverable.

This helper currently returns true for unauthorized and forbidden too, so any retry/stale-fallback path built on it will treat broken credentials and missing scopes like transient outages. Check error.kind, not just the class.

Suggested fix
 export function isRecoverableGitHubApiError(error: unknown) {
-  return error instanceof GitHubApiError
+  return (
+    error instanceof GitHubApiError &&
+    error.kind !== 'unauthorized' &&
+    error.kind !== 'forbidden'
+  )
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/github.ts` around lines 112 - 113, The function
isRecoverableGitHubApiError currently treats all GitHubApiError instances as
recoverable; change it to inspect error.kind and return false for authorization
failures by checking that the error is a GitHubApiError and that error.kind is
not 'unauthorized' or 'forbidden', returning true only for other kinds;
reference the isRecoverableGitHubApiError function and the GitHubApiError
type/kind property when making the change.

Comment on lines +39 to +66
try {
sponsors = await fetchCachedWithStaleFallback({
key: 'sponsors',
ttl: SPONSOR_CACHE_TTL_MS,
fn: getSponsors,
shouldFallbackToStale: isRecoverableGitHubApiError,
onStaleFallback: (error) => {
console.warn(
'[getSponsorsForSponsorPack] Serving stale sponsors after GitHub failure:',
{
message: error instanceof Error ? error.message : String(error),
},
)
},
})
} catch (error) {
if (isRecoverableGitHubApiError(error)) {
console.warn(
'[getSponsorsForSponsorPack] Falling back to metadata-only sponsors:',
{
message: error.message,
},
)
sponsors = await getSponsorsFromMetadataOnly()
} else {
throw error
}
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify how isRecoverableGitHubApiError is used across the codebase
rg -n -A5 'isRecoverableGitHubApiError' --type=ts

Repository: TanStack/tanstack.com

Length of output: 1608


🏁 Script executed:

#!/bin/bash
# Find GitHubApiError class definition and understand its structure
rg -n -B2 -A10 'class GitHubApiError' --type=ts

Repository: TanStack/tanstack.com

Length of output: 614


🏁 Script executed:

#!/bin/bash
# Check how GitHubApiError is created and what kind values exist
rg -n 'kind.*:' src/server/github.ts --type=ts | head -30

Repository: TanStack/tanstack.com

Length of output: 268


🏁 Script executed:

#!/bin/bash
# Check all usages of isRecoverableGitHubApiError to understand impact
rg -n 'isRecoverableGitHubApiError' --type=ts -B2 -A2

Repository: TanStack/tanstack.com

Length of output: 1146


🏁 Script executed:

#!/bin/bash
# Find all usages of isRecoverableGitHubApiError in the entire codebase
find . -type f \( -name "*.ts" -o -name "*.js" -o -name "*.tsx" -o -name "*.jsx" \) | xargs grep -l 'isRecoverableGitHubApiError' 2>/dev/null

Repository: TanStack/tanstack.com

Length of output: 112


🏁 Script executed:

#!/bin/bash
# Look at the error creation logic in github.ts to understand when each kind is assigned
rg -n -B3 -A3 "kind: 'unauthorized'" src/server/github.ts --type=ts

Repository: TanStack/tanstack.com

Length of output: 237


🏁 Script executed:

#!/bin/bash
# Check the fetchCachedWithStaleFallback function to understand fallback logic
rg -n -B5 -A15 'fetchCachedWithStaleFallback' src/server/sponsors.ts --type=ts | head -50

Repository: TanStack/tanstack.com

Length of output: 1273


🏁 Script executed:

#!/bin/bash
# Get full error normalization function to understand error handling
rg -n -B2 -A25 'export function normalizeGitHubApiError' src/server/github.ts --type=ts

Repository: TanStack/tanstack.com

Length of output: 981


Unauthorized (401) errors will silently fall back to stale/metadata-only sponsors instead of surfacing the authentication failure.

The isRecoverableGitHubApiError function returns true for all GitHubApiError instances, including those with kind: 'unauthorized' (401 status). This means authentication misconfigurations—invalid tokens, expired credentials, or insufficient permissions—will be masked indefinitely, with stale sponsors or metadata-only fallbacks served without alerting operators to the root cause.

Exclude unauthorized from recoverable errors so authentication failures bubble up:

Fix in src/server/github.ts
 export function isRecoverableGitHubApiError(error: unknown) {
-  return error instanceof GitHubApiError
+  return error instanceof GitHubApiError && error.kind !== 'unauthorized'
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/sponsors.ts` around lines 39 - 66, isRecoverableGitHubApiError
currently treats all GitHubApiError instances (including kind ===
'unauthorized') as recoverable, causing 401s to silently fall back in
getSponsors logic (the fetchCachedWithStaleFallback call and the catch that
calls getSponsorsFromMetadataOnly). Modify isRecoverableGitHubApiError so it
returns false for unauthorized errors (e.g., error.kind === 'unauthorized' or
status === 401) and only returns true for transient server/rate-limit errors;
this will cause authentication failures to be thrown instead of using
stale/metadata-only sponsors.

Comment on lines +35 to +54
function buildUnavailableFile(filePath: string) {
const lowerFilePath = filePath.toLowerCase()
const isMarkdown = lowerFilePath.endsWith('.md')

if (isMarkdown) {
return [
'# Content temporarily unavailable',
'',
'This file could not be refreshed from GitHub right now.',
'Please try again shortly.',
].join('\n')
}

return [
'// Content temporarily unavailable',
`// ${filePath}`,
'// This file could not be refreshed from GitHub right now.',
'// Please try again shortly.',
].join('\n')
}

⚠️ Potential issue | 🟡 Minor

Use a neutral fallback body for non-Markdown files.

fetchFile() can serve arbitrary assets, so a package.json, styles.css, or index.html placeholder now becomes syntactically wrong content. Plain text or extension-specific comment styles would avoid breaking copy/paste and viewers in degraded mode.

Suggested fix
   if (isMarkdown) {
     return [
       '# Content temporarily unavailable',
       '',
       'This file could not be refreshed from GitHub right now.',
       'Please try again shortly.',
     ].join('\n')
   }

   return [
-    '// Content temporarily unavailable',
-    `// ${filePath}`,
-    '// This file could not be refreshed from GitHub right now.',
-    '// Please try again shortly.',
+    'Content temporarily unavailable',
+    filePath,
+    'This file could not be refreshed from GitHub right now.',
+    'Please try again shortly.',
   ].join('\n')
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/docs.ts` around lines 35 - 54, The fallback generator
buildUnavailableFile currently returns language comment lines for non-Markdown
files which can produce invalid syntax for arbitrary assets; update
buildUnavailableFile so that when isMarkdown is false it returns a neutral
plain-text fallback (e.g., a short human-readable message with filename and
"Content temporarily unavailable") instead of language-specific comment markers,
using the existing filePath and lowerFilePath checks to detect Markdown and
preserving the Markdown branch as-is; ensure the returned string is plain text
that won't break copy/paste or viewers for assets like package.json, styles.css,
or index.html.

Comment on lines +772 to +780
if (treeData.truncated) {
console.warn(
'[fetchGitTreeRecursive] Git tree response was truncated:',
{
repo,
branch,
},
)
}

⚠️ Potential issue | 🟡 Minor

Truncated tree response is logged but not handled.

When GitHub truncates the tree (>100k entries or >7MB), the response may be incomplete. The warning is helpful, but depending on repository size, this could silently omit files. Consider documenting this limitation or implementing pagination for very large repos.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/documents.server.ts` around lines 772 - 780, The code currently
only logs when treeData.truncated is true inside fetchGitTreeRecursive, which
can silently omit files; update fetchGitTreeRecursive to handle truncation by
either (a) throwing a descriptive error (include repo and branch) so callers can
fail-fast, or (b) returning an explicit status/flag up the call chain (e.g.,
{tree, truncated: true}) so callers can decide, or (c) implement a fallback to a
paginated/content-based fetch for large repos; locate the check of
treeData.truncated and replace the console.warn with one of these concrete
behaviors and ensure callers of fetchGitTreeRecursive (and any code expecting a
complete tree) are updated to handle the new error/flag.
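One way to make option (b) from the prompt concrete — a hedged sketch with hypothetical names, returning an explicit flag instead of a bare array and letting callers fail fast:

```typescript
interface GitTreeEntry { path: string; type: string }

interface GitTreeResult {
  tree: GitTreeEntry[]
  // True when GitHub capped the recursive listing (>100k entries or >7 MB).
  truncated: boolean
}

// Hypothetical caller-side guard: refuse a partial tree instead of silently
// treating it as complete.
function assertCompleteTree(result: GitTreeResult, repo: string, branch: string): GitTreeEntry[] {
  if (result.truncated) {
    throw new Error(`Git tree for ${repo}@${branch} was truncated; refusing partial listing`)
  }
  return result.tree
}
```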

Comment on lines +64 to +67
const branchResponse = await fetch(
`https://api.github.com/repos/${repo}/branches/${branch}`,
{ headers },
)

⚠️ Potential issue | 🔴 Critical

Add timeouts to the new GitHub/raw fetches.

These external calls now sit directly on the request path with no timeout. A stalled GitHub or raw.githubusercontent.com response can hang the deploy flow indefinitely and tie up workers.

Suggested fix
+const GITHUB_FETCH_TIMEOUT_MS = 5000
+
+async function fetchWithTimeout(url: string, init: RequestInit) {
+  const controller = new AbortController()
+  const timeout = setTimeout(() => controller.abort(), GITHUB_FETCH_TIMEOUT_MS)
+  try {
+    return await fetch(url, {
+      ...init,
+      signal: controller.signal,
+    })
+  } finally {
+    clearTimeout(timeout)
+  }
+}
+
 async function fetchExampleGitTree(
   repo: string,
   branch: string,
 ): Promise<Array<GitHubTreeEntry>> {
   return fetchCached({
@@
-      const branchResponse = await fetch(
+      const branchResponse = await fetchWithTimeout(
         `https://api.github.com/repos/${repo}/branches/${branch}`,
         { headers },
       )
@@
-      const treeResponse = await fetch(
+      const treeResponse = await fetchWithTimeout(
         `https://api.github.com/repos/${repo}/git/trees/${treeSha}?recursive=1`,
         { headers },
       )
@@
-  const response = await fetch(href, {
+  const response = await fetchWithTimeout(href, {
     headers: { 'User-Agent': `example-fetch:${repo}` },
   })

Also applies to: 82-85, 119-121

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/github-example.server.ts` around lines 64 - 67, The fetch calls
(e.g., the branchResponse fetch for
`https://api.github.com/repos/${repo}/branches/${branch}` and the other
GitHub/raw fetches around the other locations) lack timeouts and can hang; fix
by using an AbortController for each fetch: create an AbortController, pass
controller.signal into fetch, set a timeout (e.g., 10_000 ms) that calls
controller.abort(), and ensure you clear the timeout in a finally block so you
don’t leak timers; apply this same pattern to the other fetches in this file
(the fetch that uses `headers` at the branchResponse site and the subsequent
raw.githubusercontent.com fetches referenced at the other ranges).

Comment on lines +97 to +104
if (treeData.truncated) {
console.warn('[fetchExampleFiles] Git tree response was truncated:', {
repo,
branch,
})
}

return treeData.tree

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

GitHub REST API recursive git tree truncated field meaning and recommended handling for incomplete tree responses

💡 Result:

The truncated field in GitHub REST API responses for the Get a tree endpoint (GET /repos/{owner}/{repo}/git/trees/{tree_sha}) is a boolean that indicates whether the returned tree array was truncated due to exceeding GitHub's maximum limits when using the recursive parameter.

Meaning: if truncated is true, the number of items in the tree array exceeded the limit of 100,000 entries or 7 MB total size, and the response contains only a partial list of the full recursive tree.

Recommended handling for incomplete (truncated) responses:

  • Do not rely on the response as complete.
  • Fetch sub-trees individually using non-recursive requests: call the endpoint without recursive (or explicitly omit it) to get immediate children, then recursively fetch each sub-tree (type: "tree") one at a time until all are obtained.
  • Alternative: clone the repository locally and iterate over the Git data directly for full access without API limits.

This applies specifically to recursive mode; non-recursive trees do not truncate in the same way.



Treat a truncated tree as failure, not success.

According to GitHub's REST API documentation, when truncated is true, the tree array is incomplete and the response contains only a partial list. The current code logs a warning but continues returning this incomplete data, which can silently ship broken example payloads with missing files. Recommended approaches when truncation occurs:

  • Fetch sub-trees individually using non-recursive requests, then recursively fetch each sub-tree one at a time
  • Alternatively, clone the repository locally and iterate over the Git data directly to bypass API limits

Do not return incomplete tree data as if it were complete.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/github-example.server.ts` around lines 97 - 104, The current
fetchExampleFiles logic treats treeData.truncated as a warning and returns
incomplete treeData.tree; change this to treat truncation as a failure: when
treeData.truncated is true (check treeData.truncated inside fetchExampleFiles),
do not return treeData.tree but instead either (A) implement the non-recursive
fallback — fetch the top-level tree entries and then fetch each sub-tree
individually (non-recursive API calls) recursively to assemble the full tree, or
(B) throw a descriptive error (including repo and branch) so callers can
fallback to cloning or other strategies; ensure the error/log includes repo,
branch and the truncated flag so the caller can decide how to proceed rather
than silently returning partial data.
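The first recommended approach can be sketched as below. `fetchTree` is a hypothetical stand-in for a single non-recursive `GET /git/trees/{sha}` call; per GitHub's docs, non-recursive responses do not truncate this way:

```typescript
type Entry = { path: string; type: 'blob' | 'tree'; sha: string }

// Assemble the full listing by walking sub-trees one level at a time,
// prefixing each entry's path with its parent directory.
async function fetchFullTree(
  fetchTree: (sha: string) => Promise<Entry[]>,
  rootSha: string,
): Promise<Entry[]> {
  const out: Entry[] = []
  const stack: Array<{ sha: string; prefix: string }> = [{ sha: rootSha, prefix: '' }]
  while (stack.length > 0) {
    const { sha, prefix } = stack.pop()!
    const entries = await fetchTree(sha)
    for (const entry of entries) {
      const path = prefix ? `${prefix}/${entry.path}` : entry.path
      if (entry.type === 'tree') {
        stack.push({ sha: entry.sha, prefix: path })
      } else {
        out.push({ ...entry, path })
      }
    }
  }
  return out
}
```

This trades one request for many, so it belongs behind the truncation check rather than on the happy path.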

Comment on lines +117 to +143
return fetchCached({
key: `github-user-login:${tokenHash}`,
ttl: GITHUB_LOGIN_CACHE_TTL_MS,
fn: async () => {
const controller = new AbortController()
const timeout = setTimeout(() => controller.abort(), 5000)

try {
const userResponse = await fetch('https://api.github.com/user', {
headers: {
Authorization: `Bearer ${accessToken}`,
Accept: 'application/vnd.github.v3+json',
},
signal: controller.signal,
})

if (!userResponse.ok) {
return null
}

const user = (await userResponse.json()) as GitHubUserResponse
return user.login ?? null
} finally {
clearTimeout(timeout)
}
},
})

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

fd 'cache.server.ts' src | xargs -r sed -n '1,260p'
echo
rg -n -C3 'fetchCached\(' src/utils

Repository: TanStack/tanstack.com

Length of output: 12209


Transient /user failures will poison the cache with null for the entire TTL.

fetchCached() stores any result from fn() without filtering null values. If a transient error (5xx, rate-limit) occurs on the GitHub /user endpoint, getAuthenticatedGitHubLogin() returns null, which gets cached. During the TTL window, checkRepoNameAvailable() will treat the missing login as available for every repository name check with that token.

Consider using fetchCachedWithStaleFallback() or skip caching null results.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/github-repo.server.ts` around lines 117 - 143, The current
getAuthenticatedGitHubLogin() call uses fetchCached(...) and will store null on
transient /user failures; change to avoid poisoning the cache by either (a)
replacing fetchCached(...) with fetchCachedWithStaleFallback(...) so stale
values are returned on failures, or (b) keep using fetchCached(...) but ensure
the inner fn never returns null (e.g., on non-ok or missing login throw or
return a sentinel that prevents cache writes). Update the async fn inside the
fetchCached/fetchCachedWithStaleFallback call (the one creating AbortController,
calling fetch('https://api.github.com/user'), and returning user.login) to
implement the chosen approach so null results are not written to the cache for
the key `github-user-login:${tokenHash}`.
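A minimal illustration of option (b) — the cache layer here is a toy Map, not the real fetchCached from cache.server.ts, but it shows the rule: failed lookups are never written, so a transient 5xx cannot pin null for the whole TTL:

```typescript
const cache = new Map<string, { value: string; expires: number }>()

// Toy cached-fetch: only successful (non-null) lookups are written, so a
// transient failure stays uncached and is retried on the next call.
async function fetchCachedLogin(
  key: string,
  ttlMs: number,
  fn: () => Promise<string | null>,
): Promise<string | null> {
  const hit = cache.get(key)
  if (hit && hit.expires > Date.now()) return hit.value
  const value = await fn()
  if (value !== null) {
    cache.set(key, { value, expires: Date.now() + ttlMs })
  }
  return value
}
```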

Comment on lines +127 to +136
for (const { repo, path } of candidates.map((item) => ({
repo: item.repository.full_name,
path: item.path,
}))) {
try {
const contentRes = await fetch(
`https://api.github.com/repos/${repo}/contents/${path}`,
{ headers: ghHeaders },
)
if (!contentRes.ok) continue

⚠️ Potential issue | 🟡 Minor

GitHub API rate limiting risk with sequential rapid requests.

The loop makes a GitHub contents API call for each candidate without any delay. GitHub's secondary rate limits can trigger when making ~100+ requests in quick succession, even with a valid token. This could cause sporadic 403 Forbidden responses.

Consider adding a small delay between requests or using a rate-limiting utility. Since this runs every 6 hours, a slight slowdown won't impact functionality.

🛡️ Simple throttle example
+const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms))
+
 for (const { repo, path } of candidates.map((item) => ({
   repo: item.repository.full_name,
   path: item.path,
 }))) {
   try {
+    await sleep(100) // Throttle to ~10 req/sec
     const contentRes = await fetch(
       `https://api.github.com/repos/${repo}/contents/${path}`,
       { headers: ghHeaders },
     )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
for (const { repo, path } of candidates.map((item) => ({
repo: item.repository.full_name,
path: item.path,
}))) {
try {
const contentRes = await fetch(
`https://api.github.com/repos/${repo}/contents/${path}`,
{ headers: ghHeaders },
)
if (!contentRes.ok) continue
const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms))
for (const { repo, path } of candidates.map((item) => ({
repo: item.repository.full_name,
path: item.path,
}))) {
try {
await sleep(100) // Throttle to ~10 req/sec
const contentRes = await fetch(
`https://api.github.com/repos/${repo}/contents/${path}`,
{ headers: ghHeaders },
)
if (!contentRes.ok) continue
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/intent-discovery.server.ts` around lines 127 - 136, The loop over
candidates that calls
fetch(`https://api.github.com/repos/${repo}/contents/${path}`) (variables:
candidates, repo, path, contentRes, ghHeaders) can trigger GitHub secondary rate
limits; add rate limiting by inserting a small delay between requests (e.g.,
await sleep(200-500ms) after each fetch) or integrate a rate-limiter/pool (e.g.,
p-limit or Bottleneck) to throttle concurrent/serial requests, ensuring the
delay or limiter is applied around the fetch call in the same loop so sequential
requests to the GitHub contents API are slowed to avoid 403s.

@tannerlinsley
Member

Closing this out because the prod fix landed directly on main.

We plucked the relevant pieces from this PR to preserve credit:

  • cb703a8 fix: update preload attribute to intent in DocsLayout
  • b3bb6ef perf: route page markdown copy through first-party docs endpoints

The remaining docs/GitHub work was folded into the broader production fix in be58fdf, which moved docs fetching behind the DB-backed GitHub cache, restored redirect support safely, tightened Start import boundaries, and cleaned up the client/server graph issues that showed up while landing this.

Thanks for the original perf work here. It directly informed the final shape.

@tannerlinsley
Member

Correcting the earlier comment because the shell ate the backticks.

Closing this out because the prod fix landed directly on main.

We plucked the relevant pieces from this PR to preserve credit:

  • cb703a8 fix: update preload attribute to intent in DocsLayout
  • b3bb6ef perf: route page markdown copy through first-party docs endpoints

The remaining docs/GitHub work was folded into the broader production fix in be58fdf, which moved docs fetching behind the DB-backed GitHub cache, restored redirect support safely, tightened Start import boundaries, and cleaned up the client/server graph issues that showed up while landing this.

Thanks for the original perf work here. It directly informed the final shape.

