Platform Details

xa11y normalizes platform-specific APIs into a unified interface. This page documents how roles and actions map across platforms, and where behavior differs.

| xa11y Role | macOS (AX) | Linux (AT-SPI) | Windows (UIA) |
|---|---|---|---|
| Window | AXWindow, AXDrawer | window, frame | WindowControlType |
| Application | AXApplication | application | (root element) |
| Button | AXButton | push button | ButtonControlType |
| CheckBox | AXCheckBox | check box, check menu item | CheckBoxControlType |
| RadioButton | AXRadioButton | radio button, radio menu item | RadioButtonControlType |
| TextField | AXTextField, AXSecureTextField | entry, password text | EditControlType |
| TextArea | AXTextArea | text | EditControlType |
| StaticText | AXStaticText | label, static, caption | TextControlType |
| ComboBox | AXComboBox, AXPopUpButton | combo box | ComboBoxControlType |
| List | AXList, AXOutline | list, list box | ListControlType, TreeControlType |
| ListItem | (via AXRow) | list item | ListItemControlType |
| Menu | AXMenu | menu | MenuControlType |
| MenuItem | AXMenuItem, AXMenuBarItem | menu item, tearoff menu item | MenuItemControlType |
| MenuBar | AXMenuBar, AXMenuBarExtra | menu bar | MenuBarControlType |
| Tab | (subrole AXTabButton) | page tab | TabItemControlType |
| TabGroup | AXTabGroup | page tab list | TabControlType |
| Table | AXTable | table, tree table | TableControlType, DataGridControlType |
| TableRow | (via AXRow) | table row | DataItemControlType |
| TableCell | AXCell | table cell, column/row header | HeaderItemControlType |
| Toolbar | AXToolbar | tool bar | ToolBarControlType |
| ScrollBar | AXScrollBar | scroll bar | ScrollBarControlType |
| Slider | AXSlider | slider | SliderControlType |
| Image | AXImage | image, icon | ImageControlType |
| Link | AXLink | link | HyperlinkControlType |
| Group | AXGroup, AXScrollArea, AXRadioGroup | panel, section, form, scroll pane | GroupControlType, PaneControlType |
| Dialog | AXSheet (or subrole AXDialog) | dialog, file chooser | WindowControlType (with IsDialog) |
| Alert | (subrole AXApplicationAlert) | alert, notification | (alert pattern) |
| ProgressBar | AXProgressIndicator, AXBusyIndicator | progress bar | ProgressBarControlType |
| TreeItem | AXDisclosureTriangle (or subrole AXOutlineRow) | tree item | TreeItemControlType |
| WebArea | AXWebArea | document web, document frame | DocumentControlType |
| Heading | AXHeading (or subrole) | heading | (landmark pattern) |
| Separator | AXSplitter | separator | SeparatorControlType |
| SplitGroup | AXSplitGroup | split pane | PaneControlType |
| Switch | (subrole AXSwitch) | (inferred) | (inferred) |
| SpinButton | AXIncrementor | spin button | SpinnerControlType |
| Tooltip | AXToolTip | tooltip, tool tip | ToolTipControlType |
| Status | AXStatusBar | status bar | StatusBarControlType |
| Navigation | (landmark) | landmark, navigation | (landmark) |

Roles not recognized by a platform map to Unknown.
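
As an illustration of the Unknown fallback, the sketch below normalizes a handful of AT-SPI role names taken from the table above. The dictionary and the function name are hypothetical and are not xa11y's internals; only a few rows are reproduced.

```python
# Hypothetical lookup; the entries are AT-SPI rows copied from the role table.
ATSPI_TO_XA11Y = {
    "push button": "Button",
    "check box": "CheckBox",
    "check menu item": "CheckBox",
    "entry": "TextField",
    "password text": "TextField",
    "page tab": "Tab",
    "page tab list": "TabGroup",
}


def normalize_atspi_role(role_name: str) -> str:
    """Return the unified role, or "Unknown" when the role is not listed."""
    return ATSPI_TO_XA11Y.get(role_name, "Unknown")


print(normalize_atspi_role("push button"))
print(normalize_atspi_role("made-up role"))
```

Several platform roles can collapse into one unified role (both "entry" and "password text" become TextField), so the mapping is many-to-one and cannot be reversed without extra information.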

| xa11y Action | macOS | Linux (AT-SPI) | Windows (UIA) |
|---|---|---|---|
| Press | AXPress | click, activate, press, invoke | InvokePattern |
| Focus | set AXFocused=true | Component.GrabFocus | SetFocus |
| Blur | set AXFocused=false | (not directly supported) | (not directly supported) |
| Toggle | AXPress (on checkbox) | toggle, check, uncheck | TogglePattern |
| Expand | AXShowMenu / AXPress | expand, open | ExpandCollapsePattern.Expand |
| Collapse | AXCancel / AXPress | collapse, close | ExpandCollapsePattern.Collapse |
| Select | AXPress | select | SelectionItemPattern |
| SetValue | set AXValue attribute | Value.SetCurrentValue (numeric) / EditableText.ReplaceText (text) | RangeValuePattern / ValuePattern |
| TypeText | set AXSelectedText | EditableText.InsertText | ValuePattern (splice) |
| SetTextSelection | set AXSelectedTextRange | Text.SetSelection | TextPattern range ops |
| Increment | AXIncrement | increment (or Value +step) | RangeValuePattern (+step) |
| Decrement | AXDecrement | decrement (or Value -step) | RangeValuePattern (-step) |
| ShowMenu | AXShowMenu | menu, showmenu, popup | ExpandCollapsePattern |
| ScrollIntoView | (no-op; no AX equivalent) | Component.ScrollTo | ScrollItemPattern |

ScrollIntoView has no direct equivalent on macOS, so the action is a no-op there. On Linux it uses Component.ScrollTo with top-edge alignment. On Windows it uses ScrollItemPattern.ScrollIntoView.

SetValue takes a different path on each platform:

  • macOS: Sets the AXValue attribute directly, which works for both text and numeric values.
  • Linux: The Value D-Bus interface only supports f64. Text values require EditableText.ReplaceText. If neither interface is available, returns TextValueNotSupported.
  • Windows: Tries RangeValuePattern for numeric values, falls back to ValuePattern.SetValue for text.
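
The Linux rule for SetValue can be sketched as a small dispatch function. This is a sketch under stated assumptions, not xa11y's implementation: the `interfaces` set, the function name, and the exception class are hypothetical stand-ins, and the handling of a number on an element that exposes only EditableText is a guess.

```python
class TextValueNotSupported(Exception):
    """Stand-in for the error named above; the real type may differ."""


def choose_linux_set_value_path(value, interfaces):
    """Pick the AT-SPI call for a SetValue request.

    `interfaces` is the set of interface names the element exposes
    (a hypothetical representation).
    """
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if is_number and "Value" in interfaces:
        # The Value interface carries a 64-bit float only.
        return "Value.SetCurrentValue"
    if "EditableText" in interfaces:
        # Assumption: anything else is written as text when possible.
        return "EditableText.ReplaceText"
    raise TextValueNotSupported(f"cannot set {value!r} on this element")
```

For example, `choose_linux_set_value_path(0.5, {"Value"})` selects the numeric path, while a string on an element that exposes neither interface raises the error.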

TypeText inserts text via the accessibility API and never simulates keyboard events.

  • macOS: Sets AXSelectedText (replaces selection or inserts at cursor).
  • Linux: Calls EditableText.InsertText at the current caret offset.
  • Windows: Reads the current value via ValuePattern, splices in the new text, then calls SetValue.

Blur is supported only on macOS:

  • macOS: Sets AXFocused to false on the element.
  • Linux / Windows: No direct API equivalent. The action is not supported.
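
Returning to the Windows TypeText path: the read, splice, write sequence reduces to a string operation between the read and the SetValue call. The sketch below shows only that middle step. The `start` and `end` offsets are assumptions, since the text above does not say where the splice position comes from.

```python
def splice_text(current: str, start: int, end: int, inserted: str) -> str:
    """Replace current[start:end] with `inserted`; start == end is a plain insert."""
    if not 0 <= start <= end <= len(current):
        raise ValueError("selection range is outside the current value")
    return current[:start] + inserted + current[end:]


print(splice_text("hello world", 5, 5, ","))       # insert at offset 5
print(splice_text("hello world", 6, 11, "there"))  # replace a selection
```

Because the whole value is rewritten, this approach replaces the field contents in one call rather than inserting in place, unlike the macOS and Linux paths.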

All platforms report bounds as screen coordinates with origin at the top-left of the primary display. Note:

  • macOS reports in points, not pixels. On a 2x Retina display, 1 point spans 2 physical pixels along each axis.
  • Windows / Linux report in physical pixels. Multi-monitor setups can produce negative coordinates for displays to the left or above the primary.
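
When bounds from macOS need to be compared with pixel data such as a screenshot, the conversion is a multiplication by the display scale. A minimal sketch, assuming a single scale factor for the whole desktop; `Rect` and `points_to_pixels` are illustrative names and not part of xa11y.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float


def points_to_pixels(rect: Rect, scale: float) -> Rect:
    """Scale a rectangle in points to physical pixels (scale 2.0 on a 2x display)."""
    return Rect(rect.x * scale, rect.y * scale, rect.width * scale, rect.height * scale)


print(points_to_pixels(Rect(10, 20, 100, 50), 2.0))
```

On setups that mix displays with different scale factors, a single multiplier for the origin is not enough; the sketch does not cover that case.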

Platforms use different attributes to determine an element’s name:

  • macOS: AXTitle, then AXDescription, then AXValue (for text elements).
  • Linux: Accessible.Name, then Description. For StaticText with no name, the first portion of the value is used.
  • Windows: CurrentName property.
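
The macOS order is a first-non-empty fallback chain, sketched below. The attribute dictionary and both function names are hypothetical, and treating an empty string as missing is an assumption.

```python
def first_non_empty(*candidates):
    """Return the first candidate that is a non-empty string, else None."""
    for candidate in candidates:
        if candidate:
            return candidate
    return None


def resolve_macos_name(attrs: dict, is_text_element: bool):
    """Apply the macOS order: AXTitle, AXDescription, then AXValue for text elements."""
    value = attrs.get("AXValue") if is_text_element else None
    return first_non_empty(attrs.get("AXTitle"), attrs.get("AXDescription"), value)


print(resolve_macos_name({"AXTitle": "", "AXDescription": "Save"}, False))
```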

The tri-state checked value (Off / On / Mixed) is derived differently:

  • macOS: Parses the AXValue attribute — "0" = Off, "1" = On, "2" = Mixed.
  • Linux: Uses AT-SPI state bits — Checked (0x10) and Mixed (0x2000).
  • Windows: Reads TogglePattern.CurrentToggleState — 0 = Off, 1 = On, 2 = Indeterminate.
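
The three derivations can be sketched side by side. The mask constants reuse the values quoted in the list above; the enum, the function names, and the choice to let Mixed win when both AT-SPI bits are set are assumptions for illustration.

```python
from enum import Enum


class Checked(Enum):
    OFF = "Off"
    ON = "On"
    MIXED = "Mixed"


# Mask values as quoted in the list above.
ATSPI_CHECKED = 0x10
ATSPI_MIXED = 0x2000


def checked_from_ax_value(ax_value: str) -> Checked:
    """macOS: AXValue arrives as the string "0", "1", or "2"."""
    return {"0": Checked.OFF, "1": Checked.ON, "2": Checked.MIXED}[ax_value]


def checked_from_atspi_state(state_bits: int) -> Checked:
    """Linux: test the two state bits."""
    # Assumption: Mixed takes precedence when both bits are set.
    if state_bits & ATSPI_MIXED:
        return Checked.MIXED
    if state_bits & ATSPI_CHECKED:
        return Checked.ON
    return Checked.OFF


def checked_from_toggle_state(toggle_state: int) -> Checked:
    """Windows: ToggleState 2 (Indeterminate) maps to Mixed."""
    return {0: Checked.OFF, 1: Checked.ON, 2: Checked.MIXED}[toggle_state]
```

The dictionary lookups raise KeyError on unexpected input; a real backend would more likely report "no checked state" for elements that are not checkable.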

xa11y supports both X11 and Wayland Linux sessions. Backends are selected at runtime — there is no compile-time feature flag.

| Capability | X11 | Wayland |
|---|---|---|
| Accessibility (AT-SPI2) | D-Bus (no display-server dependency) | D-Bus (no display-server dependency) |
| Input simulation | XTest extension (no setup) | /dev/uinput virtual evdev device (requires input group) |
| Screen capture | GetImage on the root window | PNG URI from org.freedesktop.portal.Screenshot |

The selection rule:

  • DISPLAY set → X11 (regardless of WAYLAND_DISPLAY).
  • Otherwise → Wayland: uinput for input, screenshot portal for capture.
  • If neither DISPLAY nor WAYLAND_DISPLAY is set, Screenshot returns Unsupported.
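
The selection rule above can be written as a pure function over the environment. This is a sketch rather than xa11y's code: the function name and the backend labels are made up, and treating an empty variable as unset is an assumption.

```python
def select_linux_backends(env: dict) -> dict:
    """Return the input and capture backends implied by the environment."""
    if env.get("DISPLAY"):
        # DISPLAY wins even when WAYLAND_DISPLAY is also set.
        return {"session": "x11", "input": "xtest", "capture": "x11-getimage"}
    # No DISPLAY: uinput for input; capture needs a Wayland session.
    capture = "screenshot-portal" if env.get("WAYLAND_DISPLAY") else "unsupported"
    return {"session": "wayland", "input": "uinput", "capture": capture}


print(select_linux_backends({"DISPLAY": ":0", "WAYLAND_DISPLAY": "wayland-0"}))
print(select_linux_backends({}))
```

Passing `dict(os.environ)` shows which path the current session would take, for example under XWayland where both variables are usually set and the X11 path is chosen.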

Why uinput instead of libei + portal RemoteDesktop?

org.freedesktop.portal.RemoteDesktop (libei) is the architecturally preferred Wayland input-simulation path and the likely future of structured input injection on Linux, but today it covers only part of the ecosystem: only xdg-desktop-portal-gnome and xdg-desktop-portal-kde implement it, and neither has a usable headless mode without a session manager (GDM / SDDM). uinput goes through the kernel and is compositor-agnostic: it works on every Wayland desktop (GNOME, KDE Plasma, sway, Hyprland, Cosmic, weston) and on headless Linux servers, with the same code path. Other kernel-level input tools, such as ydotool and Steam Input, use the same mechanism.

The trade-off is that uinput requires the user to be in the input group, which grants global input read/write — comparable to macOS Input Monitoring. The portal model is more granular (per-app consent) but isn’t broadly available yet. xa11y may add a libei backend later as an opt-in upgrade once the portal landscape stabilises.

The Screenshot portal usually auto-approves for non-interactive callers (interactive=false, modal=false); xa11y passes both. wlroots compositors via xdg-desktop-portal-wlr and GNOME via xdg-desktop-portal-gnome both support this.

X11 and Wayland coordinates are accepted in physical pixels at scale 1.0 today. HiDPI scale is not surfaced through the existing Screenshot.scale field on Linux — it stays 1.0. Point arguments to input simulation are taken at face value, then mapped onto the uinput virtual coordinate range. Override the assumed screen size (default 1920×1080) with XA11Y_SCREEN_WIDTH / XA11Y_SCREEN_HEIGHT env vars if your setup differs.
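
A minimal sketch of the mapping described above, assuming a linear scale from screen pixels onto an absolute axis range. The `ABS_MAX` value, the clamping of off-screen points, and the function names are assumptions; only the two environment variables and the 1920×1080 default come from the text.

```python
import os

ABS_MAX = 65535  # Hypothetical upper bound of the virtual absolute axes.


def screen_size(env=os.environ):
    """Read the assumed screen size, falling back to 1920x1080."""
    width = int(env.get("XA11Y_SCREEN_WIDTH", "1920"))
    height = int(env.get("XA11Y_SCREEN_HEIGHT", "1080"))
    return width, height


def to_virtual(x: int, y: int, env=os.environ):
    """Map a pixel position onto the 0..ABS_MAX range of each axis."""
    width, height = screen_size(env)
    # Clamp so off-screen points land on the edge of the device range.
    cx = min(max(x, 0), width - 1)
    cy = min(max(y, 0), height - 1)
    return (cx * ABS_MAX // max(width - 1, 1), cy * ABS_MAX // max(height - 1, 1))


print(to_virtual(1919, 1079, {}))
```

If the assumed size is smaller than the real screen, clicks land proportionally too far right and down, which is why the override matters on anything other than a 1920×1080 display.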

GTK 4 menu-button widgets (GtkMenuButton, AdwMenuButton, AdwSplitButton) present as an outer push button accessible that advertises NActions = 0 wrapping an inner toggle button that carries the real click action. Calling press() on the outer would normally raise ActionNotSupported. When the owning application identifies itself as GTK via Application.ToolkitName == "GTK", xa11y-linux instead walks a bounded slice of the outer’s subtree (BFS, depth 3, actionable roles only, name must match) and invokes the single actionable descendant it finds. The fallback is strictly scoped: it only runs inside GTK apps, only when the widget’s own Action interface is empty, and only when exactly one candidate matches — ambiguous subtrees still surface the original error.
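
The bounded search can be sketched as a breadth-first walk with three guards. This is an illustration under assumptions, not the xa11y-linux source: the `Node` type, the set of actionable roles, and the reading of "name must match" as "equals the outer widget's name" are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical set of roles treated as actionable.
ACTIONABLE_ROLES = {"push button", "toggle button"}


@dataclass
class Node:
    role: str
    name: str
    n_actions: int = 0
    children: list = field(default_factory=list)


def find_press_target(outer: Node, toolkit_name: str, max_depth: int = 3):
    """Return the single actionable descendant, or None to keep the original error."""
    # Guards: GTK apps only, and only when the outer widget has no actions.
    if toolkit_name != "GTK" or outer.n_actions != 0:
        return None
    matches = []
    queue = deque((child, 1) for child in outer.children)
    while queue:
        node, depth = queue.popleft()
        if (node.role in ACTIONABLE_ROLES and node.n_actions > 0
                and node.name == outer.name):
            matches.append(node)
        if depth < max_depth:
            queue.extend((child, depth + 1) for child in node.children)
    # Ambiguous or empty results fall through to the original error.
    return matches[0] if len(matches) == 1 else None
```

Returning None for both zero and multiple matches keeps the fallback conservative: the caller raises ActionNotSupported exactly as it would have without the search.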

On Linux, Electron and Chromium-based apps ship with their AT-SPI2 bridge disabled by default for performance. xa11y will connect to the app but its tree will contain only the top-level window — App.by_name("…").locator("button").count() returns 0 even though buttons are visible on screen.

To expose the full tree, launch the app with --force-renderer-accessibility:

# VS Code
code --force-renderer-accessibility
# Cursor
cursor --force-renderer-accessibility
# Google Chrome / Chromium
google-chrome --force-renderer-accessibility

Representative node counts on Ubuntu 24.04 + GNOME 46 (Wayland):

| App | Without flag | With flag |
|---|---|---|
| VS Code | 1 | 140 |
| Cursor | 1 | 116 |
| Chrome | 1 | 210 |

Native GTK apps (Nautilus, gnome-terminal, GNOME Calculator, gnome-text-editor) don’t need the flag — their AT-SPI2 bridge is enabled by default.

Firefox exposes its tree when launched with MOZ_ACCESSIBILITY_ATK2=1 set in the environment.

To diagnose whether a target app has AT-SPI2 enabled at all:

busctl --user tree org.a11y.atspi.Registry | grep -i "<app-name>"

If the app’s subtree is missing, the problem is the app’s accessibility configuration, not xa11y.