
Python API Reference

The xa11y Python package provides bindings to the xa11y Rust library via PyO3.

Install from PyPI:

pip install xa11y

Quick start:

import xa11y
safari = xa11y.App.by_name("Safari")
for button in safari.locator("button").elements():
    print(button.name)
safari.locator("button[name='OK']").press()

Create a top-level Locator searching from the system root.

Parameter Type Default
selector str

Set the process-wide default timeout, in seconds.

Becomes the default for every auto-waiting action method, wait_* call, and app lookup (App.by_name / App.by_pid / App.find) that doesn’t pass an explicit timeout=. An explicit per-call timeout= always wins. Takes precedence over the XA11Y_DEFAULT_TIMEOUT environment variable (seconds, read once at import).

Pass 0 for “single attempt, no polling” semantics. Raises ValueError for negative or non-finite values.

Parameter Type Default
timeout float

Get the effective process-wide default timeout, in seconds.

Resolution order: the set_default_timeout value, else the XA11Y_DEFAULT_TIMEOUT environment variable, else the built-in 5.0.
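The precedence rules above can be modelled in a few lines of plain Python. This is only a sketch of the documented behaviour, not the library's implementation: `resolve_timeout` and `validate_default` are hypothetical helpers, and parsing the environment value with `float()` is an assumption.

```python
import math

BUILT_IN_DEFAULT = 5.0  # documented built-in default, in seconds


def validate_default(timeout):
    """Reject negative or non-finite values, as documented for set_default_timeout."""
    value = float(timeout)
    if value < 0 or not math.isfinite(value):
        raise ValueError(f"invalid timeout: {timeout!r}")
    return value


def resolve_timeout(explicit=None, configured=None, env_value=None):
    """Model the documented precedence (hypothetical helper, not xa11y API).

    explicit   -- a per-call timeout= argument, if one was passed
    configured -- the value given to set_default_timeout, if any
    env_value  -- the XA11Y_DEFAULT_TIMEOUT string, if the variable is set
    """
    if explicit is not None:
        return float(explicit)        # a per-call timeout= always wins
    if configured is not None:
        return float(configured)      # then the set_default_timeout value
    if env_value is not None:
        return float(env_value)       # then the environment variable (assumed float parse)
    return BUILT_IN_DEFAULT           # finally the built-in 5.0
```

Note that `0` is a valid value at every level and is distinct from "not set", which is why the sketch compares against `None` rather than testing truthiness.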

Construct an InputSim backed by the platform’s native input path.

Capture pixels from the screen.

With no arguments, captures the full primary display. Pass element to capture the pixels under an element’s current bounds, or region as (x, y, width, height) to capture an explicit rectangle in logical screen coordinates. Passing both raises ValueError.

Parameter Type Default
element Element | None None
region tuple[int, int, int, int] | None None

A resilient element reference that re-queries on each interaction.

Locators never hold a live reference to a UI element. Instead, they store a selector and resolve it on demand, making them immune to staleness. Action methods auto-wait for the element to appear before acting, using the process-wide default timeout (5 seconds unless overridden via set_default_timeout or the XA11Y_DEFAULT_TIMEOUT environment variable). wait_* methods use the same default when no explicit timeout= is passed.

Property Type Description
selector str The CSS-like selector string for this locator.

Method Returns Description
nth(n: int) Locator Return a new Locator that selects the n-th match (1-based).
first() Locator Return a new Locator that selects the first match.
child(selector: str) Locator Return a new Locator scoped to direct children matching selector.
descendant(selector: str) Locator Return a new Locator scoped to descendants matching selector.

Method Returns Description
exists() bool Check if a matching element exists.
count() int Count matching elements.
element() Element Get a single Element handle for the matched element.
elements() list[Element] Get all matching elements.

Method Returns Description
tree(max_depth: int | None = None) dict Capture the subtree rooted at the matched element as a recursive dict.
dump(max_depth: int | None = None) str Render the subtree rooted at the matched element as an indented string.
press() Click / invoke the matched element.
focus() Set keyboard focus on the matched element.
blur() Remove keyboard focus from the matched element.
toggle() Toggle the matched element (checkbox, switch).
expand() Expand the matched element.
collapse() Collapse the matched element.
select() Select the matched element (list item, tab, etc.).
show_menu() Show the context menu for the matched element.
scroll_into_view() Scroll the matched element into the visible area.
increment() Increment the matched element (slider, spinner).
decrement() Decrement the matched element (slider, spinner).
set_value(value: str) Set the text value of the matched element.
set_numeric_value(value: float) Set the numeric value of the matched element (slider, spinner).
type_text(text: str) Type text at the current cursor position on the matched element.
select_text(start: int, end: int) Select a text range within the matched element (0-based offsets).
perform_action(action: str) Perform an action by snake_case name.

Method Returns Description
wait_visible(timeout: float | None = None) Element Wait until the element is visible, polling the provider.
wait_attached(timeout: float | None = None) Element Wait until the element exists in the tree.
wait_detached(timeout: float | None = None) Wait until the element is removed from the tree.
wait_enabled(timeout: float | None = None) Element Wait until the element is enabled.
wait_hidden(timeout: float | None = None) Wait until the element is hidden or removed.
wait_disabled(timeout: float | None = None) Element Wait until the element is disabled.
wait_focused(timeout: float | None = None) Element Wait until the element has keyboard focus.
wait_unfocused(timeout: float | None = None) Element Wait until the element does not have keyboard focus.
wait_until(predicate: Callable[[Element | None], bool], timeout: float | None = None) Wait until an arbitrary predicate is satisfied.
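To make the auto-wait and wait_* semantics concrete, here is a minimal polling loop in plain Python. It is a sketch of the general technique, not xa11y's internals: `poll_until`, its `interval` parameter, and the `is_ready` predicate are hypothetical, and it raises the built-in TimeoutError rather than xa11y's own exception.

```python
import time


def poll_until(check, timeout=5.0, interval=0.05):
    """Call check() until it returns a truthy value or the timeout elapses.

    timeout=0 makes exactly one attempt, mirroring the documented
    "single attempt, no polling" semantics. `interval` is an assumed
    polling period; the library's real period is not documented here.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = check()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        time.sleep(interval)


def is_ready(element):
    """Example predicate in the shape wait_until expects.

    The signature allows None, so the predicate guards against it
    (assumed here to mean that nothing currently matches).
    """
    return element is not None and element.enabled and element.visible


# Demo: the condition becomes true on the third attempt.
attempts = []

def flaky():
    attempts.append(1)
    return len(attempts) >= 3

result = poll_until(flaky, timeout=1.0, interval=0.001)

# Demo: timeout=0 tries once and then gives up.
single = []
try:
    poll_until(lambda: single.append(1), timeout=0)
except TimeoutError:
    pass
```

With a real locator, a predicate such as `is_ready` would be passed as `locator.wait_until(is_ready, timeout=2.0)`, per the signature in the table above.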

Accessibility event type constants.

Each constant’s value is the string carried in Event.event_type, so handlers can compare against the constant or the literal string interchangeably.

Constant Value Description
FOCUS_CHANGED 'focus_changed'
VALUE_CHANGED 'value_changed'
NAME_CHANGED 'name_changed'
STATE_CHANGED 'state_changed'
STRUCTURE_CHANGED 'structure_changed'
WINDOW_OPENED 'window_opened'
WINDOW_CLOSED 'window_closed'
WINDOW_ACTIVATED 'window_activated'
WINDOW_DEACTIVATED 'window_deactivated'
SELECTION_CHANGED 'selection_changed'
MENU_OPENED 'menu_opened'
MENU_CLOSED 'menu_closed'
TEXT_CHANGED 'text_changed'
ANNOUNCEMENT 'announcement'

An accessibility event delivered to subscribers.

Property Type Description
event_type str
app_name str
app_pid int
target Element | None
state_flag str | None For state_changed events: the flag that changed (e.g. 'checked').
state_value bool | None For state_changed events: the new boolean value of the flag.
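Because event_type carries a plain string, event filters are ordinary functions over the Event properties. The sketch below shows one such filter; `became_checked` is a hypothetical name, and the events are `SimpleNamespace` stand-ins rather than real xa11y Event objects.

```python
from types import SimpleNamespace

STATE_CHANGED = "state_changed"  # same string as the documented constant's value


def became_checked(event):
    """True when a state_changed event reports the 'checked' flag turning on."""
    return (
        event.event_type == STATE_CHANGED
        and event.state_flag == "checked"
        and event.state_value is True
    )


# Stand-in events for illustration only.
checked_on = SimpleNamespace(
    event_type="state_changed", state_flag="checked", state_value=True
)
focus_move = SimpleNamespace(
    event_type="focus_changed", state_flag=None, state_value=None
)

results = [became_checked(checked_on), became_checked(focus_move)]
```

Checking `state_value is True` rather than relying on truthiness keeps the filter correct for events where state_value is None.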

A live event subscription.

Method Returns Description
try_recv() Event | None
recv(timeout: float = 5.0) Event
wait_for(predicate: Callable[[Event], bool], timeout: float = 5.0) Event
close()
__iter__() Iterator[Event]
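A common pattern with a non-blocking receive is to drain whatever is already queued. The sketch below assumes that try_recv() returns None when no event is pending, which the `Event | None` return type suggests but this page does not spell out. `drain` is a hypothetical helper and `FakeSubscription` is a stand-in used only so the example runs without a live application.

```python
from collections import deque


def drain(subscription, limit=100):
    """Collect already-queued events without blocking, up to `limit`."""
    events = []
    while len(events) < limit:
        event = subscription.try_recv()
        if event is None:  # assumed to mean "nothing pending right now"
            break
        events.append(event)
    return events


class FakeSubscription:
    """Stand-in for illustration only; not the xa11y Subscription class."""

    def __init__(self, events):
        self._queue = deque(events)

    def try_recv(self):
        return self._queue.popleft() if self._queue else None


pending = drain(FakeSubscription(["a", "b", "c"]))
```

With a real subscription, wrapping the work in try/finally and calling close() in the finally clause releases the subscription even if a handler raises.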

A running application — the entry point for accessibility queries.

Property Type Description
name str
pid int | None
focused bool Whether this application currently holds the foreground / input focus.

Method Returns Description
by_name(name: str, *, timeout: float | None = None) App Find an application by exact name.
by_pid(pid: int, *, timeout: float | None = None) App Find an application by process ID.
find(predicate: Callable[[App], bool], *, timeout: float | None = None) App Find an application matching predicate.
list() list[App] List all running applications.
locator(selector: str) Locator Create a Locator scoped to this application’s accessibility tree.
subscribe() Subscription Subscribe to accessibility events from this application.
children() list[Element] Get direct children (typically windows) of this application.
as_element() Element Get an Element handle for the application root.
tree(max_depth: int | None = None) dict Capture this application’s accessibility tree as a recursive dict.
dump(max_depth: int | None = None) str Render this application’s accessibility tree as an indented string.
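Since by_name matches the exact name, find with a predicate is the natural route for looser matching. The sketch below builds a case-insensitive substring predicate; `name_contains` is a hypothetical helper, and the app objects are `SimpleNamespace` stand-ins exposing only the documented name and pid properties.

```python
from types import SimpleNamespace


def name_contains(fragment):
    """Build a predicate matching apps whose name contains `fragment`,
    ignoring case."""
    needle = fragment.casefold()

    def predicate(app):
        return needle in app.name.casefold()

    return predicate


# Stand-in apps for illustration only.
apps = [
    SimpleNamespace(name="Safari", pid=101),
    SimpleNamespace(name="Safari Technology Preview", pid=102),
    SimpleNamespace(name="Finder", pid=103),
]

matches = [app.pid for app in apps if name_contains("safari")(app)]
```

Per the signature in the table above, the same predicate would be passed as `xa11y.App.find(name_contains("safari"), timeout=2.0)`.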

A live element with lazy navigation.

Property Type Description
role str
name str | None
value str | None
description str | None
numeric_value float | None
min_value float | None
max_value float | None
stable_id str | None
pid int | None
actions list[str]
bounds Rect | None
raw dict[str, object] Platform-specific raw data attached to this element.
enabled bool
visible bool
focused bool
checked str | None Tri-state toggle value: 'on', 'off', 'mixed', or None.
selected bool
expanded bool | None
editable bool
focusable bool
modal bool
required bool
busy bool

Method Returns Description
children() list[Element] Get direct children (lazy — each call queries the provider).
parent() Element | None Get parent element (lazy — each call queries the provider).
tree(max_depth: int | None = None) dict Capture the subtree rooted at this element as a recursive dict snapshot.
dump(max_depth: int | None = None) str Render the subtree rooted at this element as an indented string.
subscribe() Subscription Subscribe to accessibility events for this element (typically an app).
press() Press (default activate) this element.
focus() Move keyboard focus to this element.
blur() Remove keyboard focus from this element.
toggle() Toggle this element’s checked state.
expand() Expand this element (e.g. tree node, combo box).
collapse() Collapse this element.
select() Select this element (e.g. list item, tab).
show_menu() Show this element’s context menu.
scroll_into_view() Scroll this element into view.
increment() Increment this element’s value (e.g. slider, spinner).
decrement() Decrement this element’s value.
set_value(value: str) Replace this element’s text value.
set_numeric_value(value: float) Set this element’s numeric value.
type_text(text: str) Insert text at the current cursor position.
select_text(start: int, end: int) Select the text range from start to end (0-based offsets).
perform_action(action: str) Perform an action by snake_case name.
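Several Element properties are optional or tri-state, so callers usually normalise them before use. The helpers below are a sketch with hypothetical names (`value_fraction`, `checked_as_bool`), exercised against `SimpleNamespace` stand-ins rather than live elements.

```python
from types import SimpleNamespace


def value_fraction(element):
    """Return numeric_value as a 0..1 fraction of [min_value, max_value].

    Returns None when any of the three values is missing or the range is empty.
    """
    value, low, high = element.numeric_value, element.min_value, element.max_value
    if value is None or low is None or high is None or high == low:
        return None
    return (value - low) / (high - low)


def checked_as_bool(element):
    """Map the tri-state `checked` string to True/False; 'mixed' and None give None."""
    return {"on": True, "off": False}.get(element.checked)


# Stand-in elements for illustration only.
slider = SimpleNamespace(numeric_value=25.0, min_value=0.0, max_value=100.0)
label = SimpleNamespace(numeric_value=None, min_value=None, max_value=None)
boxes = [SimpleNamespace(checked=state) for state in ("on", "off", "mixed", None)]

fraction = value_fraction(slider)
states = [checked_as_bool(box) for box in boxes]
```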

Input-simulation façade for synthesised pointer and keyboard events.

Targets are either an (x, y) tuple in screen pixels, or an Element (uses its bounds centre). Key values are strings: printable characters are literal ("a", "7", ";"); named keys use their PascalCase name ("Enter", "ArrowUp", "F5"); modifiers are "Shift", "Ctrl", "Alt", "Meta".

Input simulation is distinct from the accessibility action layer — prefer Locator.press() / Locator.type_text() when the target exposes the semantic action. Use InputSim for gestures with no a11y equivalent (drag-and-drop, scroll wheels, global shortcuts).

Method Returns Description
click(target: tuple[int, int] | Element) Left-click once at target.
double_click(target: tuple[int, int] | Element) Left double-click at target.
right_click(target: tuple[int, int] | Element) Right-click at target.
move_to(target: tuple[int, int] | Element) Move the pointer to target without pressing any button.
drag(start: tuple[int, int] | Element, end: tuple[int, int] | Element) Left-drag from start to end.
scroll(target: tuple[int, int] | Element, dx: int = 0, dy: int = 0) Scroll at target. dx positive → right, dy positive → down.
press(key: str) Tap a key (press + release).
chord(key: str, held: list[str] = ...) Tap key while the keys in held are held down.
type_text(text: str) Type literal text into the currently focused control.
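The guidance to prefer semantic actions and fall back to input simulation can be sketched as a small helper. It assumes that Element.actions lists the same snake_case names accepted by perform_action, so that "press" in the list means press() is supported; this page does not confirm that. `activate` is a hypothetical helper, and the two Fake classes are stand-ins so the example runs without a live UI.

```python
def activate(element, input_sim):
    """Use the accessibility action when exposed, otherwise click.

    Returns which path was taken, for logging or assertions.
    """
    if "press" in element.actions:  # assumed naming of the advertised action
        element.press()
        return "a11y"
    input_sim.click(element)  # falls back to a synthesised left click
    return "input"


class FakeElement:
    """Stand-in for illustration only; not the xa11y Element class."""

    def __init__(self, actions):
        self.actions = actions
        self.pressed = False

    def press(self):
        self.pressed = True


class FakeInputSim:
    """Stand-in for illustration only; records click targets."""

    def __init__(self):
        self.clicks = []

    def click(self, target):
        self.clicks.append(target)


sim = FakeInputSim()
button = FakeElement(["press", "focus"])
canvas = FakeElement([])
paths = [activate(button, sim), activate(canvas, sim)]
```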

A captured image: raw RGBA8 pixels plus dimensions and scale.

width and height are in physical pixels. scale is the physical-to-logical ratio (1.0 on standard displays, 2.0 on typical Retina). pixels has length width * height * 4.

Property Type Description
width int
height int
scale float
pixels bytes Raw RGBA8 pixel bytes (width * height * 4).

Method Returns Description
to_png() bytes Encode the image as a PNG and return the bytes.
save_png(path: str | bytes | object) Encode as PNG and write to path. Accepts str, bytes or os.PathLike.
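Given the documented layout (RGBA8, length width * height * 4), individual pixels can be read straight from the byte buffer. The sketch below assumes rows are stored top to bottom in row-major order, which is conventional but not stated on this page. `pixel_at` and `logical_to_physical` are hypothetical helpers.

```python
def pixel_at(pixels, width, x, y):
    """Return (r, g, b, a) for the physical pixel at column x, row y.

    Assumes row-major storage with 4 bytes per pixel.
    """
    offset = (y * width + x) * 4
    return tuple(pixels[offset:offset + 4])


def logical_to_physical(value, scale):
    """Convert a logical coordinate to physical pixels using the image scale."""
    return round(value * scale)


# A 2x2 test image: red, green on the first row; blue, white on the second.
pixels = bytes([
    255, 0, 0, 255,     0, 255, 0, 255,
    0, 0, 255, 255,     255, 255, 255, 255,
])

top_right = pixel_at(pixels, 2, 1, 0)
bottom_left = pixel_at(pixels, 2, 0, 1)
```

The conversion helper matters when mixing screenshot data with element bounds or a region argument: on a display with scale 2.0, logical x = 100 falls at physical column 200.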

Method Returns Description
locator(selector: str) Locator
actions() list[list[object]]
clear()

A bounding rectangle in screen coordinates (pixels).

Property Type Description
x int
y int
width int
height int
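Two small geometry helpers cover most uses of Rect: finding the centre point (the point InputSim is documented to use for Element targets) and hit-testing a coordinate. The names `centre_of` and `contains` are hypothetical; integer division for the centre and half-open edges for hit-testing are assumptions, not documented behaviour.

```python
from types import SimpleNamespace


def centre_of(element):
    """Return the (x, y) centre of an element's bounds, or None without bounds."""
    rect = element.bounds
    if rect is None:
        return None
    return (rect.x + rect.width // 2, rect.y + rect.height // 2)


def contains(rect, x, y):
    """True when (x, y) lies inside rect; right and bottom edges are excluded."""
    return (
        rect.x <= x < rect.x + rect.width
        and rect.y <= y < rect.y + rect.height
    )


# Stand-in objects for illustration only.
bounds = SimpleNamespace(x=100, y=40, width=80, height=20)
button = SimpleNamespace(bounds=bounds)
offscreen = SimpleNamespace(bounds=None)

centre = centre_of(button)
```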

All xa11y exceptions inherit from xa11y.XA11yError.

Exception Description
XA11yError Base exception for all xa11y errors.
PermissionDeniedError Accessibility permissions have not been granted.
AccessibilityNotEnabledError The target app advertises an accessibility tree but it is empty.
SelectorNotMatchedError No element in the tree matched the given selector.
ActionNotSupportedError The requested action is not supported on the target element.
TimeoutError An operation exceeded its timeout.
InvalidSelectorError The selector string has invalid syntax.
InvalidActionDataError An action received invalid data (e.g. Locator.nth(0), or a
PlatformError An OS-level accessibility error occurred.