Platform Details
xa11y normalizes platform-specific APIs into a unified interface. This page documents how roles and actions map across platforms, and where behavior differs.
Role mapping
| xa11y Role | macOS (AX) | Linux (AT-SPI) | Windows (UIA) |
|---|---|---|---|
| Window | AXWindow, AXDrawer | window, frame | WindowControlType |
| Application | AXApplication | application | (root element) |
| Button | AXButton | push button | ButtonControlType |
| CheckBox | AXCheckBox | check box, check menu item | CheckBoxControlType |
| RadioButton | AXRadioButton | radio button, radio menu item | RadioButtonControlType |
| TextField | AXTextField, AXSecureTextField | entry, password text | EditControlType |
| TextArea | AXTextArea | text | EditControlType |
| StaticText | AXStaticText | label, static, caption | TextControlType |
| ComboBox | AXComboBox, AXPopUpButton | combo box | ComboBoxControlType |
| List | AXList, AXOutline | list, list box | ListControlType, TreeControlType |
| ListItem | (via AXRow) | list item | ListItemControlType |
| Menu | AXMenu | menu | MenuControlType |
| MenuItem | AXMenuItem, AXMenuBarItem | menu item, tearoff menu item | MenuItemControlType |
| MenuBar | AXMenuBar, AXMenuBarExtra | menu bar | MenuBarControlType |
| Tab | (subrole AXTabButton) | page tab | TabItemControlType |
| TabGroup | AXTabGroup | page tab list | TabControlType |
| Table | AXTable | table, tree table | TableControlType, DataGridControlType |
| TableRow | (via AXRow) | table row | DataItemControlType |
| TableCell | AXCell | table cell, column/row header | HeaderItemControlType |
| Toolbar | AXToolbar | tool bar | ToolBarControlType |
| ScrollBar | AXScrollBar | scroll bar | ScrollBarControlType |
| Slider | AXSlider | slider | SliderControlType |
| Image | AXImage | image, icon | ImageControlType |
| Link | AXLink | link | HyperlinkControlType |
| Group | AXGroup, AXScrollArea, AXRadioGroup | panel, section, form, scroll pane | GroupControlType, PaneControlType |
| Dialog | AXSheet (or subrole AXDialog) | dialog, file chooser | WindowControlType (with IsDialog) |
| Alert | (subrole AXApplicationAlert) | alert, notification | (alert pattern) |
| ProgressBar | AXProgressIndicator, AXBusyIndicator | progress bar | ProgressBarControlType |
| TreeItem | AXDisclosureTriangle (or subrole AXOutlineRow) | tree item | TreeItemControlType |
| WebArea | AXWebArea | document web, document frame | DocumentControlType |
| Heading | AXHeading (or subrole) | heading | (landmark pattern) |
| Separator | AXSplitter | separator | SeparatorControlType |
| SplitGroup | AXSplitGroup | split pane | PaneControlType |
| Switch | (subrole AXSwitch) | (inferred) | (inferred) |
| SpinButton | AXIncrementor | spin button | SpinnerControlType |
| Tooltip | AXToolTip | tooltip, tool tip | ToolTipControlType |
| Status | AXStatusBar | status bar | StatusBarControlType |
| Navigation | (landmark) | landmark, navigation | (landmark) |
Roles not recognized by a platform map to Unknown.
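The lookup-with-fallback behavior can be pictured with a small sketch. This is only an illustration built from a few rows of the table above, not xa11y's implementation; the `Role` enum, the `NATIVE_TO_ROLE` table, and `normalize_role` are hypothetical names.

```python
from enum import Enum


class Role(Enum):
    # Hypothetical subset of unified roles, for illustration only.
    BUTTON = "Button"
    CHECK_BOX = "CheckBox"
    TEXT_FIELD = "TextField"
    UNKNOWN = "Unknown"


# A few rows of the role table, keyed by (platform, native role name).
NATIVE_TO_ROLE = {
    ("macos", "AXButton"): Role.BUTTON,
    ("linux", "push button"): Role.BUTTON,
    ("windows", "ButtonControlType"): Role.BUTTON,
    ("macos", "AXCheckBox"): Role.CHECK_BOX,
    ("linux", "check box"): Role.CHECK_BOX,
    ("linux", "check menu item"): Role.CHECK_BOX,
    ("windows", "CheckBoxControlType"): Role.CHECK_BOX,
    ("macos", "AXTextField"): Role.TEXT_FIELD,
    ("macos", "AXSecureTextField"): Role.TEXT_FIELD,
    ("linux", "entry"): Role.TEXT_FIELD,
    ("linux", "password text"): Role.TEXT_FIELD,
}


def normalize_role(platform: str, native_role: str) -> Role:
    """Return the unified role, or Role.UNKNOWN for unmapped native roles."""
    return NATIVE_TO_ROLE.get((platform, native_role), Role.UNKNOWN)
```

Several native roles can collapse into one unified role, so the mapping is many-to-one and cannot be reversed without extra information.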
Action mapping
| xa11y Action | macOS | Linux (AT-SPI) | Windows (UIA) |
|---|---|---|---|
| Press | AXPress | click, activate, press, invoke | InvokePattern |
| Focus | set AXFocused=true | Component.GrabFocus | SetFocus |
| Blur | set AXFocused=false | (not directly supported) | (not directly supported) |
| Toggle | AXPress (on checkbox) | toggle, check, uncheck | TogglePattern |
| Expand | AXShowMenu / AXPress | expand, open | ExpandCollapsePattern.Expand |
| Collapse | AXCancel / AXPress | collapse, close | ExpandCollapsePattern.Collapse |
| Select | AXPress | select | SelectionItemPattern |
| SetValue | set AXValue attribute | Value.SetCurrentValue (numeric) / EditableText.ReplaceText (text) | RangeValuePattern / ValuePattern |
| TypeText | set AXSelectedText | EditableText.InsertText | ValuePattern (splice) |
| SetTextSelection | set AXSelectedTextRange | Text.SetSelection | TextPattern range ops |
| Increment | AXIncrement | increment (or Value +step) | RangeValuePattern (+step) |
| Decrement | AXDecrement | decrement (or Value -step) | RangeValuePattern (-step) |
| ShowMenu | AXShowMenu | menu, showmenu, popup | ExpandCollapsePattern |
| ScrollIntoView | (no-op; no AX equivalent) | Component.ScrollTo | ScrollItemPattern |
Platform caveats
ScrollIntoView
macOS has no direct equivalent, so this action is a no-op there. On Linux it uses Component.ScrollTo with top-edge alignment. On Windows it uses ScrollItemPattern.ScrollIntoView.
SetValue: text vs numeric
- macOS: Sets the `AXValue` attribute directly; this works for both text and numeric values.
- Linux: The `Value` D-Bus interface only supports `f64`. Text values require `EditableText.ReplaceText`. If neither interface is available, returns `TextValueNotSupported`.
- Windows: Tries `RangeValuePattern` for numeric values, falls back to `ValuePattern.SetValue` for text.
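The Linux branch can be sketched as a small dispatch function. This is a simplified illustration under stated assumptions, not xa11y's code: `plan_linux_set_value` is a hypothetical helper, the exception class is a stand-in for the error named above, and raising when the value type and the available interface do not match is an assumption.

```python
from typing import Set, Union


class TextValueNotSupported(Exception):
    """Stand-in for the error named above; the real type lives in xa11y."""


def plan_linux_set_value(value: Union[float, str], interfaces: Set[str]) -> str:
    """Pick the AT-SPI call for a value, given the element's interfaces."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if is_number and "Value" in interfaces:
        # The Value interface only carries an f64.
        return "Value.SetCurrentValue"
    if isinstance(value, str) and "EditableText" in interfaces:
        return "EditableText.ReplaceText"
    # Assumption: any unmatched combination is reported as unsupported.
    raise TextValueNotSupported(f"no suitable interface for {value!r}")
```

Returning the name of the call rather than performing it keeps the sketch free of any D-Bus dependency.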
TypeText
Inserts text via the accessibility API; it never simulates keyboard events.
- macOS: Sets `AXSelectedText` (replaces selection or inserts at cursor).
- Linux: Calls `EditableText.InsertText` at the current caret offset.
- Windows: Reads current value via `ValuePattern`, splices in the new text, then calls `SetValue`.
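The Windows read-splice-write step amounts to a string splice. A minimal sketch, assuming the caret or selection is available as a pair of offsets into the current value (how xa11y obtains those offsets is not described here); `splice_text` is a hypothetical helper.

```python
def splice_text(current: str, insert: str, sel_start: int, sel_end: int) -> str:
    """Replace current[sel_start:sel_end] with insert.

    Equal offsets mean a plain insertion at the caret.
    """
    if not 0 <= sel_start <= sel_end <= len(current):
        raise ValueError("selection out of range")
    return current[:sel_start] + insert + current[sel_end:]
```

The result is what would then be written back in a single `SetValue` call, which is why this path replaces the whole field value rather than inserting incrementally.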
Blur
- macOS: Sets `AXFocused` to false on the element.
- Linux / Windows: No direct API equivalent. The action is not supported.
Coordinates
All platforms report bounds as screen coordinates with origin at the top-left of the primary display. Note:
- macOS reports in points, not pixels. On a 2x Retina display, 1 point spans 2 physical pixels along each axis.
- Windows / Linux report in physical pixels. Multi-monitor setups can produce negative coordinates for displays to the left or above the primary.
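When comparing macOS bounds against a pixel-based screenshot, the point values need scaling. A minimal sketch, assuming a single display with a uniform backing scale factor; `Rect` and `points_to_pixels` are hypothetical names, and the default of 2.0 simply mirrors the Retina note above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float


def points_to_pixels(bounds: Rect, scale: float = 2.0) -> Rect:
    """Scale a rectangle in points to physical pixels.

    Assumes one display with a uniform scale; mixed-scale multi-monitor
    layouts need per-display handling that this sketch does not attempt.
    """
    return Rect(
        bounds.x * scale,
        bounds.y * scale,
        bounds.width * scale,
        bounds.height * scale,
    )
```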
Name resolution
Platforms use different attributes to determine an element’s name:
- macOS: AXTitle, then AXDescription, then AXValue (for text elements).
- Linux: Accessible.Name, then Description. For `StaticText` with no name, the first portion of the value is used.
- Windows: CurrentName property.
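The fallback chains above follow a "first non-empty attribute wins" pattern, which can be sketched as below. This is a simplification: it ignores the role-specific conditions (AXValue only for text elements, the `StaticText` value fallback on Linux), and `NAME_SOURCES` and `resolve_name` are hypothetical names.

```python
from typing import Mapping, Optional

# Attribute priority per platform, taken from the list above.
NAME_SOURCES = {
    "macos": ("AXTitle", "AXDescription", "AXValue"),
    "linux": ("Name", "Description"),
    "windows": ("CurrentName",),
}


def resolve_name(platform: str, attrs: Mapping[str, str]) -> Optional[str]:
    """Return the first non-empty attribute in platform priority order."""
    for key in NAME_SOURCES[platform]:
        value = attrs.get(key)
        if value:  # skip both missing and empty-string attributes
            return value
    return None
```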
Checked state
The tri-state checked value (Off / On / Mixed) is derived differently:
- macOS: Parses the `AXValue` attribute: `"0"` = Off, `"1"` = On, `"2"` = Mixed.
- Linux: Uses AT-SPI state bits: `Checked` (0x10) and `Mixed` (0x2000).
- Windows: Reads `TogglePattern.CurrentToggleState`: 0 = Off, 1 = On, 2 = Indeterminate.
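The three derivations can be written side by side as a sketch. The bit masks and value encodings are the ones listed above; the function names and the `Checked` enum are hypothetical, and giving `Mixed` precedence when both Linux bits are set is an assumption.

```python
from enum import Enum


class Checked(Enum):
    OFF = "Off"
    ON = "On"
    MIXED = "Mixed"


# State-bit masks as documented above.
ATSPI_CHECKED = 0x10
ATSPI_MIXED = 0x2000


def checked_from_ax_value(value: str) -> Checked:
    """macOS: AXValue arrives as the string "0", "1", or "2"."""
    return {"0": Checked.OFF, "1": Checked.ON, "2": Checked.MIXED}[value]


def checked_from_atspi_bits(state: int) -> Checked:
    """Linux: test the Mixed bit first (assumed precedence), then Checked."""
    if state & ATSPI_MIXED:
        return Checked.MIXED
    return Checked.ON if state & ATSPI_CHECKED else Checked.OFF


def checked_from_toggle_state(toggle_state: int) -> Checked:
    """Windows: 0 = Off, 1 = On, 2 = Indeterminate (reported as Mixed)."""
    return (Checked.OFF, Checked.ON, Checked.MIXED)[toggle_state]
```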
Linux: X11 vs Wayland
xa11y supports both X11 and Wayland Linux sessions. Backends are selected at runtime; there is no compile-time feature flag.
| Capability | X11 | Wayland |
|---|---|---|
| Accessibility (AT-SPI2) | D-Bus — no display-server dependency | D-Bus — no display-server dependency |
| Input simulation | XTest extension (no setup) | /dev/uinput virtual evdev device (requires input group) |
| Screen capture | GetImage on the root window | PNG URI from org.freedesktop.portal.Screenshot |
The selection rule:
- `DISPLAY` set → X11 (regardless of `WAYLAND_DISPLAY`).
- Otherwise → Wayland: uinput for input, screenshot portal for capture.
- Screenshot also returns `Unsupported` if neither `DISPLAY` nor `WAYLAND_DISPLAY` is set.
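The rule above reduces to two environment checks. A minimal sketch with hypothetical function names; treating an empty variable the same as an unset one is an assumption.

```python
from typing import Mapping


def select_backend(env: Mapping[str, str]) -> str:
    """DISPLAY wins whenever it is set; everything else goes to Wayland."""
    if env.get("DISPLAY"):
        return "x11"
    return "wayland"


def screenshot_supported(env: Mapping[str, str]) -> bool:
    """Screenshots need at least one display variable to be present."""
    return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))
```

Passing the environment in as a mapping (for example `os.environ`) keeps the rule easy to exercise without touching the real session. One consequence is that an XWayland session, where both variables are typically set, takes the X11 path.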
Why uinput instead of libei + portal RemoteDesktop?
org.freedesktop.portal.RemoteDesktop (libei) is the architecturally “correct” Wayland input-sim path and the future of structured input injection on Linux, but today it covers a strict subset of the ecosystem: only xdg-desktop-portal-gnome and xdg-desktop-portal-kde implement it, and neither has a usable headless mode without a session manager (GDM / SDDM). uinput goes through the kernel and is compositor-agnostic: it works on every Wayland desktop (GNOME, KDE Plasma, sway, Hyprland, Cosmic, weston) and on headless Linux servers, with the same code path. It is the same mechanism xdotool --using-uinput, ydotool, wtype, Steam Input, and Wine all use.
The trade-off is that uinput requires the user to be in the input group, which grants global input read/write — comparable to macOS Input Monitoring. The portal model is more granular (per-app consent) but isn’t broadly available yet. xa11y may add a libei backend later as an opt-in upgrade once the portal landscape stabilises.
Wayland screenshot portal consent
The Screenshot portal usually auto-approves for non-interactive callers (interactive=false, modal=false); xa11y passes both. wlroots compositors via xdg-desktop-portal-wlr and GNOME via xdg-desktop-portal-gnome both support this.
Coordinates and HiDPI
X11 and Wayland coordinates are accepted in physical pixels at scale 1.0 today. HiDPI scale is not surfaced through the existing Screenshot.scale field on Linux; it stays 1.0. Point arguments to input simulation are taken at face value, then mapped onto the uinput virtual coordinate range. Override the assumed screen size (default 1920×1080) with the XA11Y_SCREEN_WIDTH / XA11Y_SCREEN_HEIGHT env vars if your setup differs.
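The mapping from pixel coordinates to an absolute uinput axis can be sketched as a linear rescale. The environment variable names and the 1920×1080 default come from the text above; everything else is an assumption for illustration, in particular the `ABS_MAX` axis range, the clamping, and the rounding, none of which are specified here.

```python
from typing import Mapping, Tuple

# Hypothetical axis maximum; the range xa11y registers is not stated.
ABS_MAX = 65535


def assumed_screen_size(env: Mapping[str, str]) -> Tuple[int, int]:
    """Read the override variables, falling back to 1920x1080."""
    width = int(env.get("XA11Y_SCREEN_WIDTH", "1920"))
    height = int(env.get("XA11Y_SCREEN_HEIGHT", "1080"))
    return width, height


def to_virtual(x: float, y: float, env: Mapping[str, str]) -> Tuple[int, int]:
    """Linearly map a pixel position onto the 0..ABS_MAX axis range."""
    width, height = assumed_screen_size(env)
    # Clamp to the assumed screen so out-of-range points stay on the axis.
    cx = min(max(x, 0), width - 1)
    cy = min(max(y, 0), height - 1)
    return (
        round(cx * ABS_MAX / (width - 1)),
        round(cy * ABS_MAX / (height - 1)),
    )
```

A sketch like this also shows why a wrong assumed size matters: on a 3840-pixel-wide screen with the default 1920 width, every x beyond 1919 would clamp to the right-hand edge of the axis.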
GTK press fallback (Linux)
GTK 4 menu-button widgets (GtkMenuButton, AdwMenuButton, AdwSplitButton) present as an outer push button accessible that advertises NActions = 0 wrapping an inner toggle button that carries the real click action. Calling press() on the outer would normally raise ActionNotSupported. When the owning application identifies itself as GTK via Application.ToolkitName == "GTK", xa11y-linux instead walks a bounded slice of the outer’s subtree (BFS, depth 3, actionable roles only, name must match) and invokes the single actionable descendant it finds. The fallback is strictly scoped: it only runs inside GTK apps, only when the widget’s own Action interface is empty, and only when exactly one candidate matches; ambiguous subtrees still surface the original error.
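The bounded search can be sketched as a depth-limited BFS that only succeeds on a unique match. This is an illustration of the description above, not xa11y-linux's code: the `Node` type, the `ACTIONABLE_ROLES` set, and the reading that "actionable roles only" filters candidates rather than the traversal are all assumptions.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List, Optional

ACTIONABLE_ROLES = {"push button", "toggle button"}  # assumed role set
MAX_DEPTH = 3


@dataclass
class Node:
    role: str
    name: str
    n_actions: int = 0
    children: List["Node"] = field(default_factory=list)


def find_press_target(outer: Node, toolkit_name: str) -> Optional[Node]:
    """Return the single actionable descendant to press.

    None means the fallback does not apply and the original
    ActionNotSupported error should surface.
    """
    # Scope checks: GTK apps only, and only when the outer has no actions.
    if toolkit_name != "GTK" or outer.n_actions != 0:
        return None
    matches = []
    queue = deque((child, 1) for child in outer.children)
    while queue:
        node, depth = queue.popleft()
        if (
            node.role in ACTIONABLE_ROLES
            and node.n_actions > 0
            and node.name == outer.name
        ):
            matches.append(node)
        if depth < MAX_DEPTH:
            queue.extend((child, depth + 1) for child in node.children)
    # Ambiguous or empty results fall through to the original error.
    return matches[0] if len(matches) == 1 else None
```

Requiring exactly one match is what keeps the fallback from pressing an arbitrary descendant when a widget wraps several actionable children.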
Linux Electron and Chromium apps
On Linux, Electron and Chromium-based apps ship with their AT-SPI2 bridge disabled by default for performance. xa11y will connect to the app, but its tree will contain only the top-level window: App.by_name("…").locator("button").count() returns 0 even though buttons are visible on screen.
To expose the full tree, launch the app with --force-renderer-accessibility:
```sh
# VS Code
code --force-renderer-accessibility
# Cursor
cursor --force-renderer-accessibility
# Google Chrome / Chromium
google-chrome --force-renderer-accessibility
```

Representative node counts on Ubuntu 24.04 + GNOME 46 (Wayland):
| App | Without flag | With flag |
|---|---|---|
| VS Code | 1 | 140 |
| Cursor | 1 | 116 |
| Chrome | 1 | 210 |
Native GTK apps (Nautilus, gnome-terminal, GNOME Calculator, gnome-text-editor) don’t need the flag — their AT-SPI2 bridge is enabled by default.
Firefox exposes its tree when launched with MOZ_ACCESSIBILITY_ATK2=1 set in the environment.
To diagnose whether a target app has AT-SPI2 enabled at all:
```sh
busctl --user tree org.a11y.atspi.Registry | grep -i "<app-name>"
```

If the app’s subtree is missing, the problem is the app’s accessibility configuration, not xa11y.