Handle ServerNotAvailableException in CollectLinuxCommand process probing #5705

Open
Copilot wants to merge 9 commits into main from copilot/fix-diagnostics-client-exception

Conversation

Copilot AI commented Feb 5, 2026

Summary

Handle ServerNotAvailableException and DiagnosticToolException in CollectLinuxCommand process probing, so that processes that cannot be resolved or connected to no longer crash the command.

Fixes #5694

Problem

DiagnosticsClient.GetProcessInfo() throws ServerNotAvailableException when the diagnostics server is unavailable (process exits between enumeration and probing, connection failures, etc.). Additionally, CommandUtils.ResolveProcess() throws DiagnosticToolException for invalid process IDs or names. These unhandled exceptions caused probe operations to crash.

Solution

Introduce a four-state probe result to handle all outcomes:

  • Supported: Process supports UserEvents IPC command
  • NotSupported: Process does not support UserEvents IPC command (runtime too old)
  • ProcessNotFound: Process could not be resolved (invalid PID, no process with given
    name)
  • ConnectionFailed: Process resolved but unable to connect to diagnostic endpoint
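For illustration, the four-state result and the two separate catch paths can be sketched roughly as follows. This is a minimal sketch, not the code in this PR: the two exception classes are local stand-ins for the real DiagnosticToolException and ServerNotAvailableException, and the resolveProcess and supportsUserEvents delegates are hypothetical placeholders for CommandUtils.ResolveProcess() and the DiagnosticsClient.GetProcessInfo()-based check.

```csharp
using System;

// Local stand-ins so the sketch compiles without the diagnostics packages.
// The real exception types live in the dotnet/diagnostics repository.
public sealed class DiagnosticToolException : Exception
{
    public DiagnosticToolException(string message) : base(message) { }
}

public sealed class ServerNotAvailableException : Exception
{
    public ServerNotAvailableException(string message) : base(message) { }
}

public enum UserEventsProbeResult
{
    Supported,        // process supports the UserEvents IPC command
    NotSupported,     // runtime too old for the UserEvents IPC command
    ProcessNotFound,  // invalid PID, or no process with the given name
    ConnectionFailed  // resolved, but the diagnostic endpoint is unreachable
}

public static class ProbeSketch
{
    // resolveProcess and supportsUserEvents are hypothetical delegates
    // standing in for the real resolution and runtime-capability checks.
    public static UserEventsProbeResult ProbeProcess(
        Func<int> resolveProcess,
        Func<int, bool> supportsUserEvents)
    {
        int pid;
        try
        {
            pid = resolveProcess();
        }
        catch (DiagnosticToolException)
        {
            // Resolution failed before any connection was attempted.
            return UserEventsProbeResult.ProcessNotFound;
        }

        try
        {
            return supportsUserEvents(pid)
                ? UserEventsProbeResult.Supported
                : UserEventsProbeResult.NotSupported;
        }
        catch (ServerNotAvailableException)
        {
            // The process exited or its diagnostics server is unavailable.
            return UserEventsProbeResult.ConnectionFailed;
        }
    }
}
```

Keeping the two try blocks separate is what lets callers tell "no such process" apart from "process exists but cannot be reached".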

Behavior Changes

Non-probe mode (dotnet-trace collect-linux -p <pid>):

  • ProcessNotFound: [ERROR] Could not resolve process '<id>'.
  • ConnectionFailed: [ERROR] Unable to connect to process '<id>'. The process may have exited or its diagnostic endpoint is not accessible.
  • Both return TracingError

Single-process probe mode (dotnet-trace collect-linux --probe -p <pid>):

  • ProcessNotFound: Could not resolve process '<id>'.
  • ConnectionFailed: Process '<id>' could not be probed. Unable to connect to the process's diagnostic endpoint.
  • Returns Ok (informational output)
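The non-probe and single-process probe behaviors listed above could be expressed as one mapping from probe result to return code and message. The sketch below only mirrors the messages as written in this description; the ReturnCode enum is a reduced stand-in, and DescribeFailure is a hypothetical helper name, not something this PR adds.

```csharp
using System;

public enum UserEventsProbeResult { Supported, NotSupported, ProcessNotFound, ConnectionFailed }

// Reduced stand-in for the tool's return codes; only the two values
// mentioned in this description are modeled.
public enum ReturnCode { Ok, TracingError }

public static class ProbeReporting
{
    // Hypothetical helper: returns null for results that are not failures,
    // since the lists above only describe the two failure outcomes.
    public static (ReturnCode Code, string Message)? DescribeFailure(
        UserEventsProbeResult result, string id, bool probeMode)
    {
        switch (result)
        {
            case UserEventsProbeResult.ProcessNotFound:
                return probeMode
                    ? (ReturnCode.Ok, $"Could not resolve process '{id}'.")
                    : (ReturnCode.TracingError, $"[ERROR] Could not resolve process '{id}'.");

            case UserEventsProbeResult.ConnectionFailed:
                return probeMode
                    ? (ReturnCode.Ok,
                       $"Process '{id}' could not be probed. Unable to connect to the process's diagnostic endpoint.")
                    : (ReturnCode.TracingError,
                       $"[ERROR] Unable to connect to process '{id}'. The process may have exited or its diagnostic endpoint is not accessible.");

            default:
                // Supported / NotSupported are handled elsewhere.
                return null;
        }
    }
}
```

The same failure therefore yields an error exit code when the user asked to collect a trace, but only informational output when the user asked to probe.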

Machine-wide probe mode (dotnet-trace collect-linux --probe):

  • Shows "Processes that could not be probed" section when applicable
  • CSV output includes the value "unknown" for processes that could not be probed
  • Processes that exit between enumeration and probing are handled gracefully

Other Changes

  • Added FormatProcessIdentifier helper - shows name (pid) when name is provided, just pid otherwise
  • Changed ".NET process" to "Process" in messages (probe accepts arbitrary PIDs)
  • Updated --probe option help text to document result categories

Copilot AI changed the title [WIP] Investigate DiagnosticsClient.GetProcessInfo exception Handle ServerNotAvailableException in CollectLinuxCommand process probing Feb 5, 2026
Copilot AI requested a review from mdh1418 February 5, 2026 19:03
@mdh1418 mdh1418 force-pushed the copilot/fix-diagnostics-client-exception branch from 8fe26f7 to fe76ee6 Compare February 6, 2026 17:21
…lpers

Add UserEventsProbeResult enum (Supported/NotSupported) to replace boolean return.
Introduce ProbeProcess helper for probing a single process.
Add GetAndProbeAllProcesses helper that enumerates and probes all published processes.
Update callers in CollectLinux and SupportsCollectLinux to use new helpers.
Update BuildProcessSupportCsv to use UserEventsProbeResult enum.
…cess probing

Add ProcessNotFound and ConnectionFailed values to UserEventsProbeResult enum.
Update ProbeProcess to catch DiagnosticToolException (process resolution failed) and
ServerNotAvailableException (diagnostic endpoint not accessible) separately.
Add FormatProcessIdentifier helper for clean display of process ID/name.
Add unknownProcesses/unknownCsv tracking for processes that could not be probed.
Update probe mode output to show 'Processes that could not be probed' section.
Include 'unknown' value in CSV output for unprobed processes.
Update non-probe mode to show distinct errors for each failure type.
Change '.NET process' to 'Process' in messages since arbitrary PIDs may not be .NET.

Fixes #5694
Document that results are categorized as supported, not supported, or unknown.
Clarify that unknown status occurs when diagnostic endpoint is not accessible.
…iled handling

Update test expectations to match new behavior:
- Add FormatProcessNotFoundError and FormatProcessIdentifier helpers
- Update ResolveProcessExceptions test data for ProcessNotFound handling
- Update probe error test cases for process resolution errors
- Tests now expect ReturnCode.TracingError for failures in non-probe mode
- Tests expect ReturnCode.Ok for probe mode with informational output
@mdh1418 mdh1418 force-pushed the copilot/fix-diagnostics-client-exception branch from fe76ee6 to 0cbcf44 Compare February 6, 2026 19:54
@mdh1418 mdh1418 marked this pull request as ready for review February 6, 2026 20:02
@mdh1418 mdh1418 requested a review from a team as a code owner February 6, 2026 20:02
Copilot AI left a comment

Pull request overview

This PR improves dotnet-trace collect-linux resilience by handling process-resolution and diagnostics-connection failures during “process probing” so the command no longer crashes when a target process can’t be resolved or connected to (e.g., exits between enumeration and probing, cross-container endpoint issues).

Changes:

  • Replaced boolean “supports” probing with a 4-state probe result (Supported/NotSupported/ProcessNotFound/ConnectionFailed) and updated user-facing output.
  • Updated machine-wide probe to track and report “unknown/unprobed” processes and emit unknown in CSV.
  • Adjusted functional tests to match new probe behaviors/messages (partially—some existing expectations still appear outdated).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.

File | Description
src/Tools/dotnet-trace/CommandLine/Commands/CollectLinuxCommand.cs | Introduces multi-state probing, catches DiagnosticToolException/ServerNotAvailableException, updates probe messaging and CSV output.
src/tests/dotnet-trace/CollectLinuxCommandFunctionalTests.cs | Updates/extends tests for new probe outcomes and adds helpers for the new process identifier/message formatting.

mdh1418 and others added 3 commits March 6, 2026 18:38
…codes

Per review feedback, single-process paths (explicit -p PID or -n NAME)
now call CommandUtils.ResolveProcess separately so argument validation
errors propagate with original specific messages and ArgumentError
return code. ProbeProcess is only used for the resolved PID's runtime
check and connection attempt.

Restore '.NET process(es)' wording in probe output messages.
Remove unused FormatProcessIdentifier helper from both source and tests.
Revert probe error tests to expect ArgumentError with original messages.
Revert ResolveProcessExceptions test data to original error text.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The supportedCsv, unsupportedCsv, and unknownCsv variables are always
non-null when the CsvToConsole and Csv output blocks are reached,
since generateCsv is true for those modes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
mdh1418 and others added 2 commits March 9, 2026 23:43
Update connection failure messages per review: use 'diagnostic port'
instead of 'diagnostic endpoint', and reword to indicate the process
may not have a .NET diagnostic port rather than implying it exists
but is inaccessible.

Skip processes that exit during name resolution silently rather than
reporting them as unknown, per reviewer suggestion that users wouldn't
find it surprising.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

[dotnet-trace][collect-linux] Unable to specifically trace a cross-container process

4 participants