Python API Reference

The xa11y Python package provides bindings to the xa11y Rust library via PyO3.

Install from PyPI:

```shell
pip install xa11y
```

Quick start:

```python
import xa11y

safari = xa11y.App.by_name("Safari")

# Print the name of every button in the app.
for button in safari.locator("button").elements():
    print(button.name)

# Press the button named "OK".
safari.locator("button[name='OK']").press()
```

Module Functions

xa11y.locator() → Locator

Create a top-level Locator searching from the system root.

| Parameter | Type | Default |
| --- | --- | --- |
| `selector` | `str` | |

xa11y.input_sim() → InputSim

Construct an InputSim backed by the platform’s native input path.

xa11y.screenshot() → Screenshot

Capture pixels from the screen.

With no arguments, captures the full primary display. Pass element to capture the pixels under an element’s current bounds, or region as (x, y, width, height) to capture an explicit rectangle in logical screen coordinates. Passing both raises ValueError.

| Parameter | Type | Default |
| --- | --- | --- |
| `element` | `Element \| None` | `None` |
| `region` | `tuple[int, int, int, int] \| None` | `None` |

Classes

Locator

A resilient element reference that re-queries on each interaction.

Locators never hold a live reference to a UI element. Instead, they store a selector and resolve it on demand, making them immune to staleness. Action methods auto-wait for the element to appear before acting.
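The store-a-selector, resolve-on-demand idea can be illustrated with a toy class. This is only a conceptual sketch, not xa11y's implementation: `LazyRef`, its `query` callback, and the dict standing in for a UI tree are all hypothetical names invented for the example.

```python
from typing import Callable, List


class LazyRef:
    """Toy stand-in for the locator idea: keep a selector, resolve on each use."""

    def __init__(self, selector: str, query: Callable[[str], List[str]]):
        self.selector = selector
        self._query = query  # hypothetical callback that searches the current tree

    def resolve(self) -> List[str]:
        # Nothing is cached, so every call sees the tree as it is right now.
        return self._query(self.selector)


# A fake "UI tree" that changes between interactions.
tree = {"button": ["OK"]}
ref = LazyRef("button", lambda sel: list(tree.get(sel, [])))

before = ref.resolve()
tree["button"] = ["OK", "Cancel"]  # the UI re-renders
after = ref.resolve()  # the same reference now sees both buttons
```

Because the reference holds only the selector, it stays usable after the underlying tree has been rebuilt.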

Properties

| Property | Type | Description |
| --- | --- | --- |
| `selector` | `str` | The CSS-like selector string for this locator. |

| Method | Returns | Description |
| --- | --- | --- |
| `nth(n: int)` | `Locator` | Return a new Locator that selects the n-th match (1-based). |
| `first()` | `Locator` | Return a new Locator that selects the first match. |
| `child(selector: str)` | `Locator` | Return a new Locator scoped to direct children matching selector. |
| `descendant(selector: str)` | `Locator` | Return a new Locator scoped to descendants matching selector. |
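Because each of these methods returns a new Locator, they can be chained. The helper below is a small sketch of that pattern; `second_button_in` is a hypothetical name, and the container selector is whatever matches your own UI.

```python
def second_button_in(app, container_selector: str):
    """Build a Locator for the second button anywhere under a container."""
    return (
        app.locator(container_selector)
        .descendant("button")
        .nth(2)  # nth() is 1-based, so 2 selects the second match
    )
```

Nothing is resolved while the chain is built. The lookup happens only when an inspection, action, or wait method is called on the result.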

Inspection

| Method | Returns | Description |
| --- | --- | --- |
| `exists()` | `bool` | Check if a matching element exists. |
| `count()` | `int` | Count matching elements. |
| `element()` | `Element` | Get a single Element handle for the matched element. |
| `elements()` | `list[Element]` | Get all matching elements. |
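A minimal sketch of the inspection methods working together, assuming `app` is an `App` (or any object with a compatible `locator()` method). The helper name `button_names` is illustrative only.

```python
def button_names(app):
    """Return the names of all buttons, skipping unnamed ones."""
    buttons = app.locator("button")
    if not buttons.exists():
        return []
    # Element.name is str | None, so filter out buttons without a name.
    return [el.name for el in buttons.elements() if el.name is not None]
```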

Actions

| Method | Returns | Description |
| --- | --- | --- |
| `tree(max_depth: int \| None = None)` | `dict` | Capture the subtree rooted at the matched element as a recursive dict. |
| `dump(max_depth: int \| None = None)` | `str` | Render the subtree rooted at the matched element as an indented string. |
| `press()` | | Click / invoke the matched element. |
| `focus()` | | Set keyboard focus on the matched element. |
| `blur()` | | Remove keyboard focus from the matched element. |
| `toggle()` | | Toggle the matched element (checkbox, switch). |
| `expand()` | | Expand the matched element. |
| `collapse()` | | Collapse the matched element. |
| `select()` | | Select the matched element (list item, tab, etc.). |
| `show_menu()` | | Show the context menu for the matched element. |
| `scroll_into_view()` | | Scroll the matched element into the visible area. |
| `increment()` | | Increment the matched element (slider, spinner). |
| `decrement()` | | Decrement the matched element (slider, spinner). |
| `set_value(value: str)` | | Set the text value of the matched element. |
| `set_numeric_value(value: float)` | | Set the numeric value of the matched element (slider, spinner). |
| `type_text(text: str)` | | Type text at the current cursor position on the matched element. |
| `select_text(start: int, end: int)` | | Select a text range within the matched element (0-based offsets). |
| `perform_action(action: str)` | | Perform an action by snake_case name. |
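As one example of combining an action with element properties, the sketch below positions a slider at a fraction of its range. It assumes the matched element reports `min_value` and `max_value`. The helper name `set_fraction` is hypothetical.

```python
def set_fraction(slider, fraction: float) -> float:
    """Set a slider-like Locator to a fraction (0.0 to 1.0) of its range."""
    el = slider.element()
    if el.min_value is None or el.max_value is None:
        raise ValueError("element does not report a numeric range")
    fraction = min(1.0, max(0.0, fraction))  # clamp to [0, 1]
    target = el.min_value + fraction * (el.max_value - el.min_value)
    slider.set_numeric_value(target)
    return target
```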

Waiting

| Method | Returns | Description |
| --- | --- | --- |
| `wait_visible(timeout: float = 5.0)` | `Element` | Wait until the element is visible, polling the provider. |
| `wait_attached(timeout: float = 5.0)` | `Element` | Wait until the element exists in the tree. |
| `wait_detached(timeout: float = 5.0)` | | Wait until the element is removed from the tree. |
| `wait_enabled(timeout: float = 5.0)` | `Element` | Wait until the element is enabled. |
| `wait_hidden(timeout: float = 5.0)` | | Wait until the element is hidden or removed. |
| `wait_disabled(timeout: float = 5.0)` | `Element` | Wait until the element is disabled. |
| `wait_focused(timeout: float = 5.0)` | `Element` | Wait until the element has keyboard focus. |
| `wait_unfocused(timeout: float = 5.0)` | `Element` | Wait until the element does not have keyboard focus. |
| `wait_until(predicate: Callable[[Element \| None], bool], timeout: float = 5.0)` | | Wait until an arbitrary predicate is satisfied. |
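The `wait_until` predicate is typed as taking `Element | None`, so it should handle the case where nothing matches yet. The following is a sketch of one such predicate; `value_is` and `wait_for_value` are hypothetical helper names.

```python
def value_is(expected: str):
    """Build a predicate that is true once the element's value equals expected."""
    def predicate(element) -> bool:
        # The predicate may receive None while no element matches the selector.
        return element is not None and element.value == expected
    return predicate


def wait_for_value(locator, expected: str, timeout: float = 5.0) -> None:
    """Block until the located element's value equals expected."""
    locator.wait_until(value_is(expected), timeout=timeout)
```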

EventType

Accessibility event type constants.


Event

An accessibility event delivered to subscribers.

Properties

| Property | Type | Description |
| --- | --- | --- |
| `event_type` | `str` | |
| `app_name` | `str` | |
| `app_pid` | `int` | |
| `target` | `Element \| None` | |

Subscription

A live event subscription.

Methods

| Method | Returns | Description |
| --- | --- | --- |
| `try_recv()` | `Event \| None` | |
| `recv(timeout: float = 5.0)` | `Event` | |
| `wait_for(predicate: Callable[[Event], bool], timeout: float = 5.0)` | `Event` | |
| `close()` | | |
| `__iter__()` | `Iterator[Event]` | |
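The reference does not describe these methods, so the sketch below relies on assumptions read off the signatures: that `try_recv()` returns `None` when no event is pending, and that `wait_for()` returns the first event satisfying the predicate. The helper names and the event-type string passed in are hypothetical.

```python
def drain(subscription, limit: int = 100):
    """Collect pending events without blocking (assumes try_recv() is non-blocking)."""
    events = []
    while len(events) < limit:
        event = subscription.try_recv()
        if event is None:
            break
        events.append(event)
    return events


def first_event_of_type(app, event_type: str, timeout: float = 5.0):
    """Subscribe, wait for one event of the given type, then always close."""
    subscription = app.subscribe()
    try:
        return subscription.wait_for(
            lambda ev: ev.event_type == event_type, timeout=timeout
        )
    finally:
        subscription.close()
```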

App

A running application — the entry point for accessibility queries.

Properties

| Property | Type | Description |
| --- | --- | --- |
| `name` | `str` | |
| `pid` | `int \| None` | |

Methods

| Method | Returns | Description |
| --- | --- | --- |
| `by_name(name: str, *, timeout: float = 5.0)` | `App` | Find an application by exact name. |
| `by_pid(pid: int, *, timeout: float = 5.0)` | `App` | Find an application by process ID. See `by_name` for `timeout`. |
| `list()` | `list[App]` | List all running applications. |
| `locator(selector: str)` | `Locator` | Create a Locator scoped to this application’s accessibility tree. |
| `subscribe()` | `Subscription` | Subscribe to accessibility events from this application. |
| `children()` | `list[Element]` | Get direct children (typically windows) of this application. |
| `as_element()` | `Element` | Get an Element handle for the application root. |
| `tree(max_depth: int \| None = None)` | `dict` | Capture this application’s accessibility tree as a recursive dict. |
| `dump(max_depth: int \| None = None)` | `str` | Render this application’s accessibility tree as an indented string. |
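Since `by_name` matches the exact name, a looser lookup can be layered on top of `App.list()`. The helper below is a sketch of that idea; `apps_matching` is a hypothetical name, and it only reads the documented `name` property.

```python
def apps_matching(apps, fragment: str):
    """Return the apps whose name contains fragment, ignoring case."""
    needle = fragment.casefold()
    return [app for app in apps if needle in app.name.casefold()]
```

With the package installed, a call might look like `apps_matching(xa11y.App.list(), "saf")`.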

Element

A live element with lazy navigation.

Properties

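| Property | Type | Description |
| --- | --- | --- |
| `role` | `str` | |
| `name` | `str \| None` | |
| `value` | `str \| None` | |
| `description` | `str \| None` | |
| `numeric_value` | `float \| None` | |
| `min_value` | `float \| None` | |
| `max_value` | `float \| None` | |
| `stable_id` | `str \| None` | |
| `pid` | `int \| None` | |
| `actions` | `list[str]` | |
| `bounds` | `Rect \| None` | |
| `raw` | `dict[str, object]` | Platform-specific raw data attached to this element. |
| `enabled` | `bool` | |
| `visible` | `bool` | |
| `focused` | `bool` | |
| `checked` | `str \| None` | |
| `selected` | `bool` | |
| `expanded` | `bool \| None` | |
| `editable` | `bool` | |
| `focusable` | `bool` | |
| `modal` | `bool` | |
| `required` | `bool` | |
| `busy` | `bool` | |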

Methods

| Method | Returns | Description |
| --- | --- | --- |
| `children()` | `list[Element]` | Get direct children (lazy; each call queries the provider). |
| `parent()` | `Element \| None` | Get parent element (lazy; each call queries the provider). |
| `tree(max_depth: int \| None = None)` | `dict` | Capture the subtree rooted at this element as a recursive dict snapshot. |
| `dump(max_depth: int \| None = None)` | `str` | Render the subtree rooted at this element as an indented string. |
| `subscribe()` | `Subscription` | Subscribe to accessibility events for this element (typically an app). |
| `press()` | | Press (default activate) this element. |
| `focus()` | | Move keyboard focus to this element. |
| `blur()` | | Remove keyboard focus from this element. |
| `toggle()` | | Toggle this element’s checked state. |
| `expand()` | | Expand this element (e.g. tree node, combo box). |
| `collapse()` | | Collapse this element. |
| `select()` | | Select this element (e.g. list item, tab). |
| `show_menu()` | | Show this element’s context menu. |
| `scroll_into_view()` | | Scroll this element into view. |
| `increment()` | | Increment this element’s value (e.g. slider, spinner). |
| `decrement()` | | Decrement this element’s value. |
| `set_value(value: str)` | | Replace this element’s text value. |
| `set_numeric_value(value: float)` | | Set this element’s numeric value. |
| `type_text(text: str)` | | Insert text at the current cursor position. |
| `select_text(start: int, end: int)` | | Select the text range from start to end (0-based offsets). |
| `perform_action(action: str)` | | Perform an action by snake_case name. |
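`dump()` already renders a subtree, but when a custom format is needed, the lazy `children()` call can drive a depth-first walk. The sketch below uses only `role`, `name`, and `children()`; the names `walk` and `outline` are hypothetical helpers.

```python
from typing import Iterator, Optional, Tuple


def walk(element, max_depth: Optional[int] = None,
         _depth: int = 0) -> Iterator[Tuple[int, object]]:
    """Yield (depth, element) pairs depth-first, starting at element."""
    yield _depth, element
    if max_depth is not None and _depth >= max_depth:
        return
    # Each children() call queries the provider, so this walk is live.
    for child in element.children():
        yield from walk(child, max_depth, _depth + 1)


def outline(element, max_depth: Optional[int] = None) -> str:
    """Render role and name for each element, indented by depth."""
    lines = []
    for depth, el in walk(element, max_depth):
        label = el.role if el.name is None else f"{el.role} {el.name!r}"
        lines.append("  " * depth + label)
    return "\n".join(lines)
```

Since every `children()` call goes back to the provider, passing `max_depth` is a reasonable way to keep large trees manageable.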

InputSim

Input-simulation façade for synthesised pointer and keyboard events.

Targets are either an (x, y) tuple in screen pixels, or an Element (uses its bounds centre). Key values are strings: printable characters are literal ("a", "7", ";"); named keys use their PascalCase name ("Enter", "ArrowUp", "F5"); modifiers are "Shift", "Ctrl", "Alt", "Meta".

Input simulation is distinct from the accessibility action layer — prefer Locator.press() / Locator.type_text() when the target exposes the semantic action. Use InputSim for gestures with no a11y equivalent (drag-and-drop, scroll wheels, global shortcuts).

Methods

| Method | Returns | Description |
| --- | --- | --- |
| `click(target: tuple[int, int] \| Element)` | | Left-click once at target. |
| `double_click(target: tuple[int, int] \| Element)` | | Left double-click at target. |
| `right_click(target: tuple[int, int] \| Element)` | | Right-click at target. |
| `move_to(target: tuple[int, int] \| Element)` | | Move the pointer to target without pressing any button. |
| `drag(start: tuple[int, int] \| Element, end: tuple[int, int] \| Element)` | | Left-drag from start to end. |
| `scroll(target: tuple[int, int] \| Element, dx: int = 0, dy: int = 0)` | | Scroll at target. dx positive → right, dy positive → down. |
| `press(key: str)` | | Tap a key (press + release). |
| `chord(key: str, held: list[str] = ...)` | | Tap key while the keys in held are held down. |
| `type_text(text: str)` | | Type literal text into the currently focused control. |
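Two short sketches of gestures built from these methods, assuming `sim` is an `InputSim`. The helper names are hypothetical. The centre calculation uses integer division, which is an assumption about rounding rather than documented behaviour.

```python
def copy_all(sim, modifier: str = "Ctrl") -> None:
    """Select all, then copy; pass modifier="Meta" where that is the shortcut key."""
    sim.chord("a", [modifier])
    sim.chord("c", [modifier])


def drag_by(sim, element, dx: int, dy: int) -> None:
    """Drag from the centre of element's bounds by (dx, dy) screen pixels."""
    rect = element.bounds
    if rect is None:
        raise ValueError("element has no bounds")
    start = (rect.x + rect.width // 2, rect.y + rect.height // 2)
    end = (start[0] + dx, start[1] + dy)
    sim.drag(start, end)
```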

Screenshot

A captured image: raw RGBA8 pixels plus dimensions and scale.

width and height are in physical pixels. scale is the physical-to-logical ratio (1.0 on standard displays, 2.0 on typical Retina). pixels has length width * height * 4.

Properties

| Property | Type | Description |
| --- | --- | --- |
| `width` | `int` | |
| `height` | `int` | |
| `scale` | `float` | |
| `pixels` | `bytes` | Raw RGBA8 pixel bytes (width * height * 4). |

Methods

| Method | Returns | Description |
| --- | --- | --- |
| `to_png()` | `bytes` | Encode the image as a PNG and return the bytes. |
| `save_png(path: str \| bytes \| object)` | | Encode as PNG and write to path. Accepts `str`, `bytes` or `os.PathLike`. |
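The raw buffer can be indexed directly when a single pixel is needed. The sketch below assumes the bytes are tightly packed in row-major order, top row first. The stated length of `width * height * 4` implies tight packing, but the row order is an assumption. `pixel_at` and `logical_size` are hypothetical helpers.

```python
def pixel_at(shot, x: int, y: int):
    """Return the (r, g, b, a) tuple at physical pixel (x, y)."""
    if not (0 <= x < shot.width and 0 <= y < shot.height):
        raise IndexError(f"({x}, {y}) is outside {shot.width}x{shot.height}")
    offset = (y * shot.width + x) * 4  # 4 bytes per RGBA8 pixel
    r, g, b, a = shot.pixels[offset:offset + 4]
    return r, g, b, a


def logical_size(shot):
    """Convert physical pixel dimensions to logical units using scale."""
    return shot.width / shot.scale, shot.height / shot.scale
```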

_TestActionProbe

Methods

| Method | Returns | Description |
| --- | --- | --- |
| `locator(selector: str)` | `Locator` | |
| `actions()` | `list[list[object]]` | |
| `clear()` | | |

Data Classes

Rect

A bounding rectangle in screen coordinates (pixels).

Properties

| Property | Type | Description |
| --- | --- | --- |
| `x` | `int` | |
| `y` | `int` | |
| `width` | `int` | |
| `height` | `int` | |

Exceptions

All xa11y exceptions inherit from xa11y.XA11yError.

| Exception | Description |
| --- | --- |
| `XA11yError` | Base exception for all xa11y errors. |
| `PermissionDeniedError` | Accessibility permissions have not been granted. |
| `AccessibilityNotEnabledError` | The target app advertises an accessibility tree but it is empty. |
| `SelectorNotMatchedError` | No element in the tree matched the given selector. |
| `ActionNotSupportedError` | The requested action is not supported on the target element. |
| `TimeoutError` | An operation exceeded its timeout. |
| `InvalidSelectorError` | The selector string has invalid syntax. |
| `InvalidActionDataError` | An action received invalid data (e.g. `Locator.nth(0)`, or a |
| `PlatformError` | An OS-level accessibility error occurred. |
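Because every exception derives from `xa11y.XA11yError`, catching the base class handles any failure from the library. The sketch below falls back to a local stand-in class when the package is not installed, purely so the example can run anywhere. `try_press` is a hypothetical helper.

```python
from typing import Optional

try:
    from xa11y import XA11yError
except ImportError:
    class XA11yError(Exception):
        """Local stand-in so this sketch runs without xa11y installed."""


def try_press(locator) -> Optional[str]:
    """Press the located element; return an error summary instead of raising."""
    try:
        locator.press()
    except XA11yError as exc:
        # Any subclass listed in the table above is caught here.
        return f"{type(exc).__name__}: {exc}"
    return None
```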