…374)

## Description

- Add `winapp ui click` command that uses `SendInput` mouse simulation to click UI elements
- Supports `--double` (double-click) and `--right` (right-click) options
- Solves the problem where `ui invoke` fails on controls that don't support `InvokePattern` (e.g., column headers, list items)

## Usage Example

Single click on a control that doesn't support InvokePattern

    winapp ui click btn-column1-a3f2 -a myapp            # single click by slug
    winapp ui click "Column1" -a myapp                   # single click by text search
    winapp ui click btn-column1-a3f2 -a myapp --double   # double-click
    winapp ui click btn-column1-a3f2 -a myapp --right    # right-click

## Type of Change

- ✨ New feature

## Checklist

- [x] Tested locally on Windows
- [x] [docs/usage.md](../docs/usage.md) updated (if CLI commands changed)

## AI Description

<!-- ai-description-start -->
_This section is auto-generated by AI when the PR is opened or updated. To opt out, delete this entire section including the marker comments._
<!-- ai-description-end -->
Pull request overview
Adds a new winapp ui command group to the WinApp CLI (and the npm wrapper/docs) for cross-app UI automation via Microsoft UI Automation (UIA), including inspection, search, invocation, screenshots, scrolling, focus, and waiting.
Changes:
- Introduces `winapp ui` commands + shared options, registers them in DI/command routing, and adds UIA-backed services/models/helpers.
- Adds npm API wrappers (`uiClick`, `uiInspect`, `uiSearch`, etc.) and expands documentation (usage, dedicated UI automation doc, skill templates/schema).
- Adds unit tests with fake UI services for the new command handlers.
Reviewed changes
Copilot reviewed 46 out of 46 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| src/winapp-npm/src/winapp-commands.ts | Adds npm wrapper functions/options for winapp ui subcommands. |
| src/winapp-CLI/WinApp.Cli/WinApp.Cli.csproj | Enables app manifest + unsafe blocks for UIA/Win32 interop. |
| src/winapp-CLI/WinApp.Cli/Services/UiSessionService.cs | Implements app/window resolution to create UI automation sessions. |
| src/winapp-CLI/WinApp.Cli/Services/SlugGenerator.cs | Adds semantic slug generation/parsing for stable element selectors. |
| src/winapp-CLI/WinApp.Cli/Services/SelectorService.cs | Parses selector strings into either slug or query. |
| src/winapp-CLI/WinApp.Cli/Services/IUiSessionService.cs | Defines UI session service contract. |
| src/winapp-CLI/WinApp.Cli/Services/IUiAutomationService.cs | Defines UIA operations surface (inspect/search/invoke/etc.). |
| src/winapp-CLI/WinApp.Cli/Services/ISelectorService.cs | Defines selector parsing contract. |
| src/winapp-CLI/WinApp.Cli/NativeMethods.txt | Adds CsWin32 UIA + Win32 input/screenshot APIs. |
| src/winapp-CLI/WinApp.Cli/Models/UiSessionInfo.cs | Adds session model for PID/HWND targeting. |
| src/winapp-CLI/WinApp.Cli/Models/UiElement.cs | Adds UI element model including selector + pattern state fields. |
| src/winapp-CLI/WinApp.Cli/Models/SelectorExpression.cs | Adds selector representation (slug vs query). |
| src/winapp-CLI/WinApp.Cli/Helpers/UiJsonContext.cs | Adds source-generated JSON context + --json output models. |
| src/winapp-CLI/WinApp.Cli/Helpers/MouseInput.cs | Adds SendInput-based mouse click helper for ui click. |
| src/winapp-CLI/WinApp.Cli/Helpers/HostBuilderExtensions.cs | Registers UI services and commands in DI + command pipeline. |
| src/winapp-CLI/WinApp.Cli/Commands/WinAppRootCommand.cs | Adds ui to root subcommands + help grouping. |
| src/winapp-CLI/WinApp.Cli/Commands/UiWaitForCommand.cs | Adds ui wait-for command. |
| src/winapp-CLI/WinApp.Cli/Commands/UiStatusCommand.cs | Adds ui status command. |
| src/winapp-CLI/WinApp.Cli/Commands/UiSetValueCommand.cs | Adds ui set-value command. |
| src/winapp-CLI/WinApp.Cli/Commands/UiSearchCommand.cs | Adds ui search command. |
| src/winapp-CLI/WinApp.Cli/Commands/UiScrollIntoViewCommand.cs | Adds ui scroll-into-view command. |
| src/winapp-CLI/WinApp.Cli/Commands/UiScrollCommand.cs | Adds ui scroll command. |
| src/winapp-CLI/WinApp.Cli/Commands/UiScreenshotCommand.cs | Adds ui screenshot command + PNG encoding path. |
| src/winapp-CLI/WinApp.Cli/Commands/UiListWindowsCommand.cs | Adds ui list-windows command. |
| src/winapp-CLI/WinApp.Cli/Commands/UiInvokeCommand.cs | Adds ui invoke command. |
| src/winapp-CLI/WinApp.Cli/Commands/UiInspectCommand.cs | Adds ui inspect command. |
| src/winapp-CLI/WinApp.Cli/Commands/UiGetPropertyCommand.cs | Adds ui get-property command. |
| src/winapp-CLI/WinApp.Cli/Commands/UiGetFocusedCommand.cs | Adds ui get-focused command. |
| src/winapp-CLI/WinApp.Cli/Commands/UiFocusCommand.cs | Adds ui focus command. |
| src/winapp-CLI/WinApp.Cli/Commands/UiCommand.cs | Adds ui command group and wires subcommands. |
| src/winapp-CLI/WinApp.Cli/Commands/UiClickCommand.cs | Adds ui click command using mouse simulation. |
| src/winapp-CLI/WinApp.Cli/Commands/SharedUiOptions.cs | Adds shared options/arguments used across ui subcommands. |
| src/winapp-CLI/WinApp.Cli/app.manifest | Adds PerMonitorV2 DPI awareness for accurate coordinates/screenshot behavior. |
| src/winapp-CLI/WinApp.Cli.Tests/UiCommandTests.cs | Adds handler-level tests for ui subcommands using fakes. |
| src/winapp-CLI/WinApp.Cli.Tests/SelectorServiceTests.cs | Adds tests for slug-vs-query selector parsing. |
| src/winapp-CLI/WinApp.Cli.Tests/FakeUiServices.cs | Adds fake UIA + session services for unit tests. |
| scripts/generate-llm-docs.ps1 | Adds ui-automation skill mapping + generation inputs. |
| llms.txt | Documents new ui-automation skill presence. |
| docs/usage.md | Adds ui section to general CLI usage docs. |
| docs/ui-automation.md | Adds full UI automation documentation page. |
| docs/npm-usage.md | Documents new npm wrapper APIs/options. |
| docs/fragments/skills/winapp-cli/ui-automation.md | Adds hand-authored skill template content for UI automation. |
| docs/cli-schema.json | Adds CLI schema entries for the new ui command group. |
| .github/plugin/skills/winapp-cli/ui-automation/SKILL.md | Adds generated Copilot skill doc for UI automation. |
| .github/plugin/agents/winapp.agent.md | Expands agent scope/docs to include UI automation workflows. |
Build Metrics Report
Binary Sizes
Test Results: ✅ 690 passed out of 690 tests in 327.3s (+28 tests, -10.6s vs. baseline)
Test Coverage: ❌ 20.9% line coverage, 37% branch coverage
CLI Startup Time: 39ms median (x64, Updated 2026-04-08 06:20:35 UTC · commit
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…dance

Phase 1 of UI automation improvements based on agent feedback:
- set-value: remove --text flag, make value a positional argument
  New syntax: winapp ui set-value <selector> <value> -a <app>
  Eliminates the #1 syntax confusion (8+ agent trials wasted)
- Create Helpers/UiErrors.cs with standardized error templates applied consistently across all 14 Ui*Command.cs files: MissingApp, MissingSelector, ElementNotFound, StaleElement, GenericError
- Error messages now recommend AutomationId for stable targeting
- Update all docs: ui-automation.md, usage.md, npm-usage.md, cli-schema.json (regenerated), SKILL.md (regenerated), agent.md, skill fragment template
- Update npm wrapper: text property -> value property
- Update tests for new positional syntax

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Promote unique AutomationIds to selectors in inspect/search output:
- After building tree, FindAll(TrueCondition) checks AutomationId uniqueness across the full UIA tree
- Unique AutomationIds replace generated slugs as selectors
- Adds exact AutomationId match in FindSingleElementAsync for faster, unambiguous resolution

Improve inspect output clarity:
- Wrap selectors in [brackets] for visual distinction
- Add 2-line header explaining format and selector types
- Skip quoted name when it equals the selector (no redundancy)
- Apply brackets consistently in inspect, search, get-focused

Update docs: selectors section rewritten to explain AutomationId vs slug selectors, new inspect output format documented. Regenerate cli-schema.json and SKILL.md files.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
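The uniqueness rule in this commit can be illustrated with a minimal Python sketch. It is not the C# implementation from the PR: the `Element` class, its field names, and the sample slugs are hypothetical, and the only behavior taken from the commit message is that an AutomationId becomes the selector when it occurs exactly once in the tree.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Element:
    # Hypothetical stand-in for a UIA element; field names are illustrative.
    automation_id: Optional[str]
    slug: str


def assign_selectors(elements: List[Element]) -> List[str]:
    """Use the AutomationId when it is unique across the tree, else keep the slug."""
    counts = Counter(e.automation_id for e in elements if e.automation_id)
    return [
        e.automation_id if e.automation_id and counts[e.automation_id] == 1 else e.slug
        for e in elements
    ]


tree = [
    Element("SaveButton", "btn-save-d1a2"),
    Element("ListItem", "itm-row1-0001"),   # duplicated id, so the slug is kept
    Element("ListItem", "itm-row2-0002"),
    Element(None, "txt-title-0003"),        # no id at all, so the slug is kept
]
selectors = assign_selectors(tree)
```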
Replace brackets with colored output for visual clarity:
- Selector: bold cyan (most important, what to copy)
- Name: green (display text)
- value: yellow (editable content)
- State/bounds: gray (secondary metadata)

Move legend from header to footer, merged with element count: 'Found 10 elements (depth 3). Use the first word as selector, e.g.: winapp ui invoke TabView -a terminal'

Footer dynamically picks first interactive element for the example. Drop brackets to avoid agents copying them as part of the selector. Apply same color scheme to inspect, search, and get-focused output.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Spectre.Console uses [[ and ]] to escape literal brackets in markup, not backslash. Fixes 'Could not find color or style' crash when elements have [collapsed], [disabled], [scroll:v] etc. state markers. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
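The escaping rule described here (doubling square brackets rather than prefixing a backslash) can be sketched in a few lines of Python. This only illustrates the rule; `escape_markup` is a hypothetical helper name, not the code in the PR.

```python
def escape_markup(text: str) -> str:
    """Double square brackets so a markup parser treats them as literal characters."""
    return text.replace("[", "[[").replace("]", "]]")


# State markers such as [disabled] would otherwise be read as style tags.
line = escape_markup("Button 'Save' [disabled]")
```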
When --app matches multiple windows, auto-select the best window and proceed with a warning instead of throwing a blocking error. This eliminates a wasted tool call for agents.

Selection heuristic:
- Prefer the foreground window (GetForegroundWindow)
- Fall back to the largest window (GetWindowRect area)

Warning output shows all windows with metadata:
- Label: window/popup/dialog (from Win32 class name)
- Size: from GetWindowRect
- Foreground marker
- Owner HWND: from GetWindow(GW_OWNER); works across processes, linking WinUI 3 file pickers (PickerHost.exe) to parent app
- Win32 class name in brackets (debug info)

Also update list-windows command to show same metadata per window.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
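The selection heuristic (foreground window first, otherwise the largest by area) might look roughly like the following Python sketch. The `Window` class and `pick_window` function are hypothetical; in the PR the inputs come from Win32 calls such as GetForegroundWindow and GetWindowRect, which are replaced here by plain values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Window:
    # Hypothetical record; width/height stand in for the GetWindowRect result.
    hwnd: int
    width: int
    height: int


def pick_window(windows: List[Window], foreground_hwnd: Optional[int]) -> Window:
    """Prefer the foreground window; otherwise fall back to the largest area."""
    for window in windows:
        if window.hwnd == foreground_hwnd:
            return window
    return max(windows, key=lambda w: w.width * w.height)


candidates = [Window(0x10, 800, 600), Window(0x20, 400, 300)]
chosen_foreground = pick_window(candidates, foreground_hwnd=0x20)
chosen_largest = pick_window(candidates, foreground_hwnd=None)
```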
…nappcli into nmetulev/ui-command
When --app matches multiple windows, auto-select the best window and proceed with a warning instead of throwing a blocking error. This eliminates a wasted tool call for agents.

Selection heuristic:
- Prefer the foreground window (GetForegroundWindow)
- Fall back to the largest window (GetWindowRect area)

Warning output shows all windows with colored metadata:
- HWND in cyan, selected line bold with (selected) label
- Label: window/popup/dialog (from Win32 class name)
- Size: from GetWindowRect
- Foreground marker in green
- Owner HWND: from GetWindow(GW_OWNER); works across processes, linking WinUI 3 file pickers (PickerHost.exe) to parent app
- Win32 class name in brackets (debug info)

Also update list-windows command to show same colored metadata.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…nappcli into nmetulev/ui-command
Replace manual bracket escaping with Spectre's built-in Markup.Escape() which handles all special characters. Fixes broken output when element names, values, or window titles contain characters that Spectre interprets as markup. Applied to: UiInspectCommand, UiSearchCommand, UiGetFocusedCommand, UiListWindowsCommand, UiSessionService. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Element values from UIA can contain carriage returns, newlines, and tabs (e.g., Notepad's Document element value='terminal\r'). These break Spectre MarkupLine rendering mid-line. Replace \r\n, \r, \n with ↵ and \t with → for display. Applied to both inspect and search output via EscapeMarkup. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
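A minimal Python sketch of the display sanitization described above, assuming the replacements are applied in the order listed so that a Windows line ending becomes a single symbol. The function name is hypothetical.

```python
def sanitize_for_display(value: str) -> str:
    """Replace line breaks and tabs with visible symbols for single-line output."""
    return (
        value.replace("\r\n", "↵")  # handle CRLF first so it maps to one symbol
        .replace("\r", "↵")
        .replace("\n", "↵")
        .replace("\t", "→")
    )


shown = sanitize_for_display("terminal\r")
```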
Remove the WinUI 3 content-pane drill-down heuristic from GetRootElement. The heuristic picked the largest child Pane as the root, which skipped menu bars, title bars, status bars, and other chrome elements. For Notepad, this showed only 2 elements (editor pane + document) instead of the full tree. Now ElementFromHandle returns the window element directly, giving the complete UI tree including menus and chrome. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When screenshot targets an app with multiple windows (e.g., app + open dialog), capture each window to a separate file:
- screenshot.png: main window (largest)
- screenshot-dialog-1.png: dialog/popup

Cross-process dialog detection via GetWindow(GW_OWNER) finds file pickers in PickerHost.exe and other owned windows.

Single window behavior unchanged. Direct -w targeting unchanged. Element cropping (selector arg) uses single-window path.

JSON output returns array of UiScreenshotResult when multiple windows are captured.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
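The file naming above could be derived as in this Python sketch. It assumes the largest window keeps the requested output name and the remaining windows are numbered from 1 in descending order of area; the commit message only shows one dialog, so the numbering of further windows is a guess, and `screenshot_paths` is a hypothetical name.

```python
import os
from typing import List, Tuple


def screenshot_paths(output: str, areas: List[int]) -> List[Tuple[int, str]]:
    """Pair each window index with a file name; the largest window gets the base name."""
    stem, ext = os.path.splitext(output)
    # Window indices ordered from largest to smallest area.
    order = sorted(range(len(areas)), key=lambda i: areas[i], reverse=True)
    paths = []
    for rank, index in enumerate(order):
        name = output if rank == 0 else f"{stem}-dialog-{rank}{ext}"
        paths.append((index, name))
    return paths


# Window 0 is a small dialog, window 1 is the main app window.
plan = screenshot_paths("screenshot.png", [200, 1000])
```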
set-value, focus, scroll-into-view, and get-property were logging the internal sequential ID (e0, e1) instead of the selector (AutomationId or slug). Now uses element.Selector ?? element.Id consistently, matching invoke and click commands. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…nappcli into nmetulev/ui-command
New command: winapp ui get-text <selector> -a <app>

Reads text from elements using 3-tier fallback:
1. TextPattern.DocumentRange.GetText(-1): full text from RichEditBox/Document controls (not accessible via ValuePattern)
2. ValuePattern.CurrentValue: simple text from TextBox/ComboBox
3. element.Name: display text from labels/static elements

This fills a real gap: RichEditBox controls expose text only via TextPattern, which get-property doesn't query. Agents previously had no way to read rich text content.

Registered in UiCommand, HostBuilderExtensions, UiJsonContext. Added GetTextAsync to IUiAutomationService + implementation. Added IUIAutomationTextPattern/IUIAutomationTextRange to CsWin32.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
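The 3-tier fallback can be sketched in Python as below. `TextSource` is a hypothetical record that models "pattern not supported" as `None`; how the real implementation treats an empty string from a supported pattern is not stated in the commit, so this sketch simply returns it.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TextSource:
    # Hypothetical view of one element; None means the pattern is unsupported.
    name: str = ""
    text_pattern_text: Optional[str] = None
    value_pattern_value: Optional[str] = None


def get_text(element: TextSource) -> Tuple[str, str]:
    """Return (text, source) using TextPattern, then ValuePattern, then Name."""
    if element.text_pattern_text is not None:
        return element.text_pattern_text, "TextPattern"
    if element.value_pattern_value is not None:
        return element.value_pattern_value, "ValuePattern"
    return element.name, "Name"


rich_edit = get_text(TextSource(name="Editor", text_pattern_text="Hello\nworld"))
text_box = get_text(TextSource(name="Search", value_pattern_value="query"))
label = get_text(TextSource(name="Status: ready"))
```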
Docs:
- Add get-text command to ui-automation.md, usage.md, skill fragment
- Update screenshot section with multi-window dialog example
- Add get-text to skill command map in generate-llm-docs.ps1
- Add ui click to skill command map (was missing)
- Regenerate cli-schema.json and SKILL.md

Tests:
- Add GetText_ReturnsText test (JSON output)
- Add GetText_WithoutSelector_ReturnsError test
- Total: 18 tests (was 16)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Property values from UIA (e.g., Document text content) can contain newlines and carriage returns that break the single-line property listing. Replace control chars with visual symbols (↵ for newlines, → for tabs), matching the sanitization already in inspect output. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Rename the command from 'get-text' to 'get-value' to pair naturally with 'set-value'. Broader meaning covers text content, slider values, and any element's current value. Same 3-tier fallback: TextPattern → ValuePattern → Name. Updated: command file, UiCommand, HostBuilderExtensions, UiJsonContext, tests, all docs, skill command map, regenerated cli-schema.json and SKILL.md. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- samples/winui-app: WinUI 3 sample app with testable controls (counter button, text input, checkbox, submit flow) and AutomationProperties for UI automation
- scripts/test-e2e-winui-ui.ps1: E2E test script that exercises all winapp ui commands (35 tests, ~6s) using --json assertions and wait-for property value checks. Validates screenshots are non-empty.
- .github/workflows/build-package.yml: Added e2e-test-ui job
- Fix get-property --json: Changed UiPropertyResult.Properties from Dictionary<string, object?> to Dictionary<string, string?> since source-gen JSON serialization can't handle object? in NativeAOT

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
On CI the first-run marker doesn't exist, so the banner fires on the first command. Run --version first to create the marker file, ensuring subsequent --json commands get clean stdout. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- TreatUnmatchedTokensAsErrors=true on root command: unknown options like --text now error instead of being silently ignored
- search returns exit code 1 when no matches found (was always 0)
- scroll command: added --json support with UiScrollResult model, making all 14 ui commands support --json

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
On CI, the first run may produce extra output (banner, error JSON) before the actual launch JSON. Use regex to extract the JSON object containing ProcessId instead of assuming clean single-object stdout. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
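A Python sketch of the extraction idea, assuming the launch object is flat (no nested braces and no braces inside string values). The E2E script in the PR is PowerShell and its actual pattern is not shown here; the banner text and the `Status` field in the sample are invented for illustration.

```python
import json
import re
from typing import Optional


def extract_launch_json(stdout: str) -> Optional[dict]:
    """Find the first flat JSON object that mentions "ProcessId" and parse it."""
    match = re.search(r'\{[^{}]*"ProcessId"[^{}]*\}', stdout)
    return json.loads(match.group(0)) if match else None


# Invented sample: a banner line and an unrelated JSON object precede the payload.
noisy = 'first-run banner text\n{"error": "first run"}\n{"ProcessId": 1234, "Status": "running"}\n'
launch = extract_launch_json(noisy)
```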
Logs --version warmup output, first-run marker status, and raw stdout/stderr from winapp run --detach --json line by line. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1. Push-Location to sample dir before winapp run so that FetchDotNetPackageListAsync can find the .csproj and resolve the WinAppSDK runtime package for installation.
2. Remove duplicate PrintJson in RunCommand error path; StatusService.ExecuteWithStatusAsync already writes JSON error output when --json is active.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
StatusService and RunCommand both write JSON errors - this is by design since StatusService writes generic format and RunCommand writes command-specific format. The e2e script handles multiple JSON objects via regex extraction. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
StatusService was writing its own JsonErrorOutput when a task returned non-zero in --json mode, then the command handler (e.g. RunCommand) would also write its own JSON error — producing two JSON objects on stdout. Fix: StatusService no longer writes JSON for handled task errors (non-zero return code). It only writes JSON for unhandled exceptions where no command handler will get a chance to respond. Command handlers own their JSON output schema. 690/690 unit tests pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CI passes a relative path like 'artifacts/cli/win-x64/winapp.exe'. After Push-Location to the sample dir, the relative path breaks. Resolve-Path at startup ensures it works from any directory. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
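The failure mode is generic: a relative path changes meaning once the working directory changes. A small Python sketch of the same fix, resolving to an absolute path once at startup, is below. It uses `os.path.abspath` as a stand-in for PowerShell's Resolve-Path (which additionally requires the path to exist), and a temporary directory plays the role of the sample directory.

```python
import os
import tempfile

relative = os.path.join("artifacts", "cli", "win-x64", "winapp.exe")
resolved = os.path.abspath(relative)  # resolve once, before any directory change

original_dir = os.getcwd()
with tempfile.TemporaryDirectory() as sample_dir:
    os.chdir(sample_dir)  # analogous to Push-Location into the sample dir
    try:
        # The same relative string now points somewhere else entirely.
        drifted = os.path.abspath(relative)
    finally:
        os.chdir(original_dir)
```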
Description
Adds a new winapp ui command group that enables UI automation for Windows apps using Microsoft UI Automation (UIA). This allows
developers and AI agents to programmatically discover, inspect, and interact with UI elements across any Windows application.
Subcommands
Key design decisions
directly to other commands
does not
patterns
Example usage
    winapp ui inspect --app notepad
    winapp ui search "Save" --app myapp
    winapp ui invoke btn-save-d1a2 --app myapp
    winapp ui screenshot --app myapp --output screen.png
    winapp ui wait-for btn-done-f3c1 --app myapp --timeout 10
Type of Change
Checklist
docs/fragments/skills/(if CLI commands/workflows changed)