
Conversation


@mydea mydea commented Sep 10, 2025

This adds a better way to ensure we flush in Node-based Vercel functions.

Note that this does not work in edge functions, so we need to keep the vercelWaitUntil code around for those - but we can make it a no-op in Node and instead rely on the SIGTERM-based approach, I believe.
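A rough sketch of the intended split (illustrative only - the guard and the request-context access are assumptions based on the discussion below, not the PR's literal code):

```ts
// Sketch: no-op in Node, defer to Vercel's waitUntil in the Edge runtime.
// Assumptions: the Edge runtime exposes a global `EdgeRuntime` string, and
// Vercel publishes its request context under a well-known global symbol.
export function vercelWaitUntil(task: Promise<unknown>): void {
  if (typeof (globalThis as { EdgeRuntime?: unknown }).EdgeRuntime !== 'string') {
    // Node runtime: rely on the process.on('SIGTERM') flush instead.
    return;
  }

  const ctx = (globalThis as Record<symbol, unknown>)[Symbol.for('@vercel/request-context')] as
    | { get?: () => { waitUntil?: (task: Promise<unknown>) => void } | undefined }
    | undefined;
  ctx?.get?.()?.waitUntil?.(task);
}
```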

@chargome when you're back, can you verify if this makes sense? 😅

Closes #17567

@mydea mydea self-assigned this Sep 10, 2025

github-actions bot commented Sep 10, 2025

size-limit report 📦

| Path | Size | % Change | Change |
| --- | --- | --- | --- |
| @sentry/browser | 24.84 kB | - | - |
| @sentry/browser - with treeshaking flags | 23.34 kB | - | - |
| @sentry/browser (incl. Tracing) | 41.62 kB | - | - |
| @sentry/browser (incl. Tracing, Profiling) | 46.21 kB | - | - |
| @sentry/browser (incl. Tracing, Replay) | 80.19 kB | - | - |
| @sentry/browser (incl. Tracing, Replay) - with treeshaking flags | 69.93 kB | - | - |
| @sentry/browser (incl. Tracing, Replay with Canvas) | 84.87 kB | - | - |
| @sentry/browser (incl. Tracing, Replay, Feedback) | 97.12 kB | - | - |
| @sentry/browser (incl. Feedback) | 41.56 kB | - | - |
| @sentry/browser (incl. sendFeedback) | 29.53 kB | - | - |
| @sentry/browser (incl. FeedbackAsync) | 34.52 kB | - | - |
| @sentry/browser (incl. Metrics) | 25.87 kB | - | - |
| @sentry/browser (incl. Logs) | 26.09 kB | - | - |
| @sentry/browser (incl. Metrics & Logs) | 26.75 kB | - | - |
| @sentry/react | 26.56 kB | - | - |
| @sentry/react (incl. Tracing) | 43.81 kB | - | - |
| @sentry/vue | 29.3 kB | - | - |
| @sentry/vue (incl. Tracing) | 43.43 kB | - | - |
| @sentry/svelte | 24.85 kB | - | - |
| CDN Bundle | 27.27 kB | - | - |
| CDN Bundle (incl. Tracing) | 42.26 kB | - | - |
| CDN Bundle (incl. Tracing, Replay) | 78.97 kB | - | - |
| CDN Bundle (incl. Tracing, Replay, Feedback) | 84.42 kB | - | - |
| CDN Bundle - uncompressed | 80.08 kB | - | - |
| CDN Bundle (incl. Tracing) - uncompressed | 125.51 kB | - | - |
| CDN Bundle (incl. Tracing, Replay) - uncompressed | 242.05 kB | - | - |
| CDN Bundle (incl. Tracing, Replay, Feedback) - uncompressed | 254.81 kB | - | - |
| @sentry/nextjs (client) | 46.17 kB | - | - |
| @sentry/sveltekit (client) | 41.99 kB | - | - |
| @sentry/node-core | 51.69 kB | +0.06% | +30 B 🔺 |
| @sentry/node | 161.61 kB | +0.02% | +32 B 🔺 |
| @sentry/node - without tracing | 93.12 kB | +0.04% | +34 B 🔺 |
| @sentry/aws-serverless | 108.63 kB | +0.03% | +30 B 🔺 |

View base workflow run


github-actions bot commented Sep 10, 2025

node-overhead report 🧳

Note: This is a synthetic benchmark with a minimal express app and does not necessarily reflect the real-world performance impact in an application.

| Scenario | Requests/s | % of Baseline | Prev. Requests/s | Change % |
| --- | --- | --- | --- | --- |
| GET Baseline | 8,947 | - | 8,642 | +4% |
| GET With Sentry | 1,675 | 19% | 1,678 | -0% |
| GET With Sentry (error only) | 6,023 | 67% | 5,935 | +1% |
| POST Baseline | 1,190 | - | 1,171 | +2% |
| POST With Sentry | 580 | 49% | 575 | +1% |
| POST With Sentry (error only) | 1,050 | 88% | 1,034 | +2% |
| MYSQL Baseline | 3,277 | - | 3,305 | -1% |
| MYSQL With Sentry | 456 | 14% | 457 | -0% |
| MYSQL With Sentry (error only) | 2,689 | 82% | 2,656 | +1% |

View base workflow run

```ts
if (process.env.VERCEL) {
  process.on('SIGTERM', async () => {
    // We have 500ms for processing here, so we try to make sure to have enough time to send the events
    await client.flush(200);
```
Member

Did you test this on a Vercel function? Generally makes sense, but I'd say we could bump this up a bit more?

Member Author

did not test this so far at all, maybe you can take this PR over when you have some time and properly test it. The reason I went with 200 here is because this is just the timeout used for finishing events, we still need time to actually make the http request so there needs to be some wiggle room -to the max. 500ms - if we spend 500ms on processing events, we have no more time to actually send the events 😅
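To spell out the budget (illustrative constants only, based on the figures in this thread):

```ts
// Rough timing budget sketch - not the PR's literal code.
const SIGTERM_GRACE_MS = 500; // Vercel's shutdown window after SIGTERM
const FLUSH_TIMEOUT_MS = 200; // cap on waiting for event *processing*
const SEND_HEADROOM_MS = SIGTERM_GRACE_MS - FLUSH_TIMEOUT_MS; // ~300ms left to actually send
```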

Member

Ah yeah, that makes sense - I thought this time was already included. I'll take it over.

@chargome chargome assigned chargome and unassigned mydea Sep 18, 2025
@mydea
Copy link
Member Author

mydea commented Sep 18, 2025

@chargome this PR may also fix this: #17689

@chargome chargome marked this pull request as ready for review December 30, 2025 12:54
Comment on lines 18 to +20
```ts
export function vercelWaitUntil(task: Promise<unknown>): void {
  // We only flush manually in Vercel Edge runtime
  // In Node runtime, we use process.on('SIGTERM') instead
```

This comment was marked as outdated.

Member

no, that's the whole point of this PR, my friend

Comment on lines +148 to +153
```ts
if (process.env.VERCEL) {
  process.on('SIGTERM', async () => {
    // We have 500ms for processing here, so we try to make sure to have enough time to send the events
    await client.flush(200);
  });
}
```

This comment was marked as outdated.

```ts
    // We have 500ms for processing here, so we try to make sure to have enough time to send the events
    await client.flush(200);
  });
}
```

Multiple SIGTERM handlers accumulate on repeated init calls

Each call to init() on Vercel registers a new SIGTERM handler using process.on('SIGTERM', ...) without checking if one was already registered. Since calling init() multiple times is a supported scenario (as confirmed by the existing multiple-init integration test), this can lead to multiple handlers being registered. When SIGTERM is received, each handler will call client.flush(200) on its respective client, potentially causing multiple concurrent flush operations within the 500ms Vercel shutdown window, which could lead to race conditions or unexpected behavior.
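A possible guard, sketched with an assumed module-level flag (the maintainers chose to ignore this case, see below):

```ts
// Hypothetical sketch: register the Vercel SIGTERM flush at most once per
// process, no matter how many times init() runs. Names are illustrative.
interface FlushableClient {
  flush(timeout?: number): PromiseLike<boolean>;
}

let sigtermFlushRegistered = false;

function registerVercelSigtermFlush(client: FlushableClient): void {
  if (!process.env.VERCEL || sigtermFlushRegistered) {
    return;
  }
  sigtermFlushRegistered = true;
  process.on('SIGTERM', async () => {
    await client.flush(200);
  });
}
```

Note that a flag like this would pin the flush to the first client; tracking the most recently initialized client would need a slightly different shape.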


Member

I think we can safely ignore this case.

Comment on lines +54 to 58
```ts
  // Note: vercelWaitUntil only does something in Vercel Edge runtime
  // In Node runtime, we use process.on('SIGTERM') instead
  // @ts-expect-error This is not typed
  if (GLOBAL_OBJ[Symbol.for('@vercel/request-context')]) {
    // Vercel has a waitUntil equivalent that works without execution context
```

This comment was marked as outdated.

Member

we instead moved the flushing to node-core


```ts
beforeEach(() => {
  globalWithEdgeRuntime.EdgeRuntime = 'edge-runtime';
});
```

Tests don't verify new non-Edge early return behavior (Bugbot Rules)

Per the review rules requiring "tests actually test the newly added behaviour": The beforeEach sets EdgeRuntime = 'edge-runtime' for all tests, so none of them exercise the newly added early return path when EdgeRuntime is not a string (the Node runtime case). The core behavior change of this PR — vercelWaitUntil becoming a no-op in Node runtime — has no test coverage. A test verifying that waitUntil is NOT called when EdgeRuntime is undefined would validate this new behavior.
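A minimal vitest-style sketch of the missing case, assuming the test file's existing globalWithEdgeRuntime helper, an importable vercelWaitUntil, and that the request context can be stubbed via the global symbol shown in the diff above:

```ts
import { expect, it, vi } from 'vitest';
// import { vercelWaitUntil } from '../../src/utils/vercelWaitUntil'; // assumed path

// Assumption: mirrors the helper used elsewhere in this test file.
const globalWithEdgeRuntime = globalThis as { EdgeRuntime?: string };

it('does not call waitUntil when EdgeRuntime is undefined (Node runtime)', () => {
  // Simulate the Node runtime: no EdgeRuntime global.
  delete globalWithEdgeRuntime.EdgeRuntime;

  const waitUntil = vi.fn();
  // Assumption: the request context is reachable via get().waitUntil.
  (globalThis as Record<symbol, unknown>)[Symbol.for('@vercel/request-context')] = {
    get: () => ({ waitUntil }),
  };

  vercelWaitUntil(Promise.resolve());

  expect(waitUntil).not.toHaveBeenCalled();
});
```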


Comment on lines +149 to +154
```ts
  process.on('SIGTERM', async () => {
    // We have 500ms for processing here, so we try to make sure to have enough time to send the events
    await client.flush(200);
  });
}
```


Bug: The async SIGTERM handler for process.on() is not awaited, creating a race condition where the process may exit before client.flush() completes, leading to data loss.
Severity: CRITICAL | Confidence: High

🔍 Detailed Analysis

The SIGTERM handler is an async function, but Node.js's process.on() does not wait for the returned Promise to resolve. When a SIGTERM signal is received, the client.flush(200) operation is initiated, but the process may terminate before this asynchronous operation completes. This creates a race condition that can lead to the loss of buffered events that were supposed to be sent to Sentry before shutdown. The integration tests mask this issue by using setInterval to artificially keep the process alive, which is not a reliable mechanism for production environments.

💡 Suggested Fix

To ensure the flush completes, the process exit should be explicitly managed. After the await client.flush(200) call, add process.exit(0) to gracefully terminate the process only after the flush has finished. This prevents the race condition and ensures events are not lost.
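Sketched, that suggestion looks like this (treat it as the bot's proposal rather than an established fix - whether an explicit exit is appropriate depends on how Vercel manages the process lifetime):

```ts
// Assumption: `client` is the initialized Sentry client with flush(timeout).
declare const client: { flush(timeout?: number): Promise<boolean> };

process.on('SIGTERM', async () => {
  try {
    // Bounded wait so event processing cannot consume the whole 500ms window.
    await client.flush(200);
  } finally {
    // Proposed: terminate only once the flush has settled.
    process.exit(0);
  }
});
```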

🤖 Prompt for AI Agent
```
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: packages/node-core/src/sdk/index.ts#L149-L154

Potential issue: The `SIGTERM` handler is an `async` function, but Node.js's
`process.on()` does not wait for the returned Promise to resolve. When a `SIGTERM`
signal is received, the `client.flush(200)` operation is initiated, but the process may
terminate before this asynchronous operation completes. This creates a race condition
that can lead to the loss of buffered events that were supposed to be sent to Sentry
before shutdown. The integration tests mask this issue by using `setInterval` to
artificially keep the process alive, which is not a reliable mechanism for production
environments.
```


```ts
}

// Note: vercelWaitUntil only does something in Vercel Edge runtime
// In Node runtime, we use process.on('SIGTERM') instead
```

flushIfServerless returns early without flushing in Vercel Node

In Vercel Node runtime, when @vercel/request-context exists, flushIfServerless calls vercelWaitUntil and immediately returns. However, vercelWaitUntil now returns early when EdgeRuntime is not defined (which is the case in Node). This means flushIfServerless returns without awaiting the flush and without registering with waitUntil. The code never reaches the serverless fallback that would await flushWithTimeout(timeout). While the SIGTERM handler provides a safety net at function termination, explicit calls to await flushIfServerless() (e.g., in framework integrations like astro, nuxt, react-router) won't actually wait for the flush to complete in Vercel Node runtime.
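A sketch of the gap being described, with stubbed shapes for the helpers this comment names (assumptions, not the SDK's actual signatures):

```ts
// Stubs for the helpers referenced in this review comment (shapes assumed):
declare function vercelWaitUntil(task: Promise<unknown>): void;
declare function flushWithTimeout(timeout: number): Promise<void>;

async function flushIfServerless(timeout = 2000): Promise<void> {
  const hasRequestContext =
    (globalThis as Record<symbol, unknown>)[Symbol.for('@vercel/request-context')] !== undefined;

  if (hasRequestContext) {
    // In Node, vercelWaitUntil is now a no-op, so nothing awaits the flush here.
    vercelWaitUntil(flushWithTimeout(timeout));
    return; // <- this early return skips the serverless fallback below
  }

  await flushWithTimeout(timeout); // only reached without a request context
}
```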


@chargome chargome requested a review from s1gr1d December 30, 2025 14:24

@s1gr1d s1gr1d left a comment


Looks good 👍

```ts
  return;
}

// Note: vercelWaitUntil only does something in Vercel Edge runtime
```
Member

shouldn't this comment be in the other file, vercelWaitUntil?

Member

there's a comment in both places 👍

@chargome chargome merged commit 363a910 into develop Dec 30, 2025
206 checks passed
@chargome chargome deleted the fn/vercel-sigterm branch December 30, 2025 14:36

Development

Successfully merging this pull request may close these issues.

Use new process.on('SIGTERM') to flush on Vercel
