Skip to content

feat: add telemetry for PET process failures and exits#1306

Open
eleanorjboyd wants to merge 2 commits intomicrosoft:mainfrom
eleanorjboyd:regular-ladybug
Open

feat: add telemetry for PET process failures and exits#1306
eleanorjboyd wants to merge 2 commits intomicrosoft:mainfrom
eleanorjboyd:regular-ladybug

Conversation

@eleanorjboyd
Copy link
Member

@eleanorjboyd eleanorjboyd commented Mar 3, 2026

  • Add telemetry and better diagnostics around the pet binary
  • PET_START_FAILED event is now emitted with the error code, reason, platform, and architecture
  • PET_PROCESS_EXIT event is emitted flagging whether the exit was expected (shutdown/restart) or a crash
  • For activation the single broad try-catch was split into two — one specifically for pet startup (which emits telemetry and bails early on failure) and one for manager registration — so errors are attributed correctly and don't bleed into each other.

@eleanorjboyd eleanorjboyd self-assigned this Mar 3, 2026
@eleanorjboyd eleanorjboyd added the bug Issue identified by VS Code Team member as probable bug label Mar 3, 2026
src/extension.ts Outdated
arch: process.arch,
});

window.showErrorMessage(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@karthiknadig if we cannot find the binary, is that a game-over scenario ? how should we present / prompt the user? I guess they could enter a interpreter path themselves so maybe its a notification instead?

@vs-code-engineering vs-code-engineering bot added this to the March 2026 milestone Mar 3, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds telemetry and diagnostics around PET (Python Environment Tools) startup failures and process exits to improve failure attribution and troubleshooting during extension activation.

Changes:

  • Emit PET_START_FAILED telemetry (with platform/arch/error info) when PET cannot be started during activation.
  • Emit PET_PROCESS_EXIT telemetry when the PET process exits, tracking restart/dispose vs crash.
  • Add verbose diagnostics for PET binary path resolution and split activation error handling to isolate PET vs manager-registration failures.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
src/managers/common/nativePythonFinder.ts Adds PET path resolution diagnostics, tracks disposal state, and emits process-exit telemetry.
src/extension.ts Splits activation into PET startup vs manager registration and emits PET startup-failure telemetry.
src/common/telemetry/constants.ts Introduces new telemetry event names and GDPR property mappings for PET events.

Comment on lines +532 to +541
const errnoError = error as NodeJS.ErrnoException;
// Plain Error (no .code) = binary not found by getNativePythonToolsPath.
// Errno error (has .code) = spawn failed (ENOENT, EACCES, EPERM, etc.).
const reason = errnoError.code ? 'spawn_failed' : 'binary_not_found';
sendTelemetryEvent(EventNames.PET_START_FAILED, undefined, {
errorCode: errnoError.code ?? 'UNKNOWN',
reason,
platform: process.platform,
arch: process.arch,
});
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PET_START_FAILED reason classification assumes that any error without an errnoError.code must be "binary_not_found", but createNativePythonFinder() can also throw plain Errors after the binary is found/spawned (e.g., missing stdio streams, JSON-RPC setup failures). This will misattribute failures in telemetry. Consider setting reason to 'unknown' by default and only using 'binary_not_found' for the specific errors thrown by getNativePythonToolsPath() (e.g., by throwing a dedicated error type or attaching a discriminator).

Copilot uses AI. Check for mistakes.
Comment on lines +492 to 497
const wasExpected = this.isRestarting || this.isDisposed;
if (code !== 0) {
this.outputChannel.error(
`[pet] Python Environment Tools exited unexpectedly with code ${code}, signal ${signal}`,
);
}
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exit handler logs an "unexpected" error whenever code !== 0, but code can be null (terminated by signal) and exits during restart/dispose are expected (wasExpected === true). This can generate noisy error logs on normal shutdown/restart. Consider gating the error log on !wasExpected && code !== 0 (and treating null as non-error unless !wasExpected).

Copilot uses AI. Check for mistakes.
@joaomorenoalt joaomorenoalt removed this from the March 2026 milestone Mar 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Issue identified by VS Code Team member as probable bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants