
Quick Start

Installation

[dependencies]
xa11y = "0.4"

Platform setup

macOS — Two permissions are required:

  1. Accessibility — System Settings → Privacy & Security → Accessibility. Grant permission to your terminal (or IDE).
  2. Screen & System Audio Recording (macOS 26+) — System Settings → Privacy & Security → Screen & System Audio Recording. Grant permission to your terminal. Without this, the accessibility API only exposes menu bars — window content (buttons, text fields, etc.) is invisible.

After changing permissions, restart your terminal for them to take effect. App::by_name() (Rust) or xa11y.App.by_name() (Python) will return a PermissionDenied error with setup instructions if either permission is missing.

Linux — Ensure AT-SPI2 is running (default on GNOME and most desktop environments). No special permissions needed. Electron and Chromium-based apps (VS Code, Cursor, Chrome) require an additional launch flag to expose their accessibility tree — see Platform Details.

Windows — No special permissions needed.

Discover the tree first

Before writing selectors, dump the app’s accessibility tree so you know the exact roles and names to target. Accessibility labels are often different from what’s painted on screen, so don’t guess.

The fastest way is the CLI — it dumps the whole app in one command and works the same regardless of language:

xa11y apps # list running apps with PIDs
xa11y tree --app Calculator # full indented tree
xa11y find "button" --app Calculator # try a selector before using it

See the CLI guide for the full command reference.

From code, dump() and tree() are available on App, Locator, and Element in every binding:

import xa11y
calc = xa11y.App.by_name("Calculator")
print(calc.dump()) # full indented tree, rooted at the app
print(calc.dump(max_depth=2)) # limit depth
snapshot = calc.tree() # same data as a nested dict
# Locators and elements also support dump() / tree()
print(calc.locator("group[name='Keypad']").dump())
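The nested dict returned by tree() lends itself to scripted searches. The sketch below walks such a structure depth-first and collects (depth, role, name) rows. The key names "role", "name", and "children" are assumptions made for illustration, and the sample dict is hand-written rather than real output, so check an actual snapshot for the real schema before relying on it.

```python
from __future__ import annotations

from typing import Any, Iterator, Optional, Tuple


def walk(node: dict[str, Any], depth: int = 0) -> Iterator[Tuple[int, str, Optional[str]]]:
    """Yield (depth, role, name) for a node and all of its descendants, depth-first."""
    yield depth, node.get("role", "?"), node.get("name")
    for child in node.get("children", []):
        yield from walk(child, depth + 1)


# Hand-written stand-in for calc.tree(); the key names are assumed, not confirmed.
sample = {
    "role": "application",
    "name": "Calculator",
    "children": [
        {
            "role": "window",
            "name": "Calculator",
            "children": [
                {"role": "button", "name": "7", "children": []},
                {"role": "button", "name": "=", "children": []},
            ],
        },
    ],
}

rows = list(walk(sample))
button_names = [name for _, role, name in rows if role == "button"]

# Print an indented outline, one node per line.
for depth, role, name in rows:
    print("  " * depth + f"{role} {name!r}")
```

Collecting the names this way gives the exact strings to paste into a selector instead of guessing them from the screen.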

Example

use xa11y::*;
use std::time::Duration;

fn main() -> Result<()> {
    // Connect to the Calculator app, polling up to 5 seconds for it to
    // register with the platform's a11y API (pass Duration::ZERO for a
    // single attempt with no waiting).
    // (returns PermissionDenied if accessibility is not enabled)
    let calc = App::by_name("Calculator", Duration::from_secs(5))?;

    // Navigate the tree
    for window in calc.children()? {
        println!("{}: {:?}", window.role, window.name);
    }

    // Find all buttons via locator
    let buttons = calc.locator("button").elements()?;
    println!("Found {} buttons", buttons.len());

    // Read properties from a specific element
    let btn = calc.locator("button[name='=']").element()?;
    println!("role={} name={:?} bounds={:?}", btn.role, btn.name, btn.bounds);

    // Locators re-resolve on every call,
    // so they always target the current UI state
    calc.locator("button[name='7']").press()?;

    // Type into a text field
    calc.locator("text_field[name='Input']").type_text("42")?;

    // Wait for UI changes before continuing
    let result = calc.locator("static_text[name='Result']");
    result.wait_visible(Duration::from_secs(5))?;
    println!("Result: {:?}", result.element()?.value);

    // List all running apps
    for app in App::list()? {
        println!("{} (pid: {:?})", app.name, app.pid);
    }

    Ok(())
}

Selector syntax

| Pattern | Meaning |
| --- | --- |
| button | Elements with role Button |
| button[name='OK'] | Button named exactly “OK” |
| textfield[name^='Search'] | Text field whose name starts with “Search” |
| textfield[name*='email'] | Text field whose name contains “email” |
| group > button | Buttons that are direct children of a group |
| window button[name='OK'] | Button named “OK” anywhere inside a window |
| button:nth(2) | The 2nd button match |
| button[name='All Clear'], button[name='Clear'] | Either button — see Selector groups below |
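To make the three attribute operators concrete, here is a minimal stand-alone sketch of how exact, prefix, and substring matching differ. It is a toy model and not xa11y's implementation: attr_matches is a hypothetical helper, and treating the comparison as case-sensitive is an assumption.

```python
def attr_matches(actual, op, expected):
    """Toy model of the attribute operators in the table above.

    actual   -- the element's attribute value, or None if it has none
    op       -- "=", "^=" or "*="
    expected -- the quoted value from the selector
    """
    if actual is None:
        return False  # an element without the attribute never matches
    if op == "=":
        return actual == expected  # exact match
    if op == "^=":
        return actual.startswith(expected)  # prefix match
    if op == "*=":
        return expected in actual  # substring match
    raise ValueError(f"unsupported operator: {op!r}")


print(attr_matches("Search mail", "^=", "Search"))
print(attr_matches("Work email address", "*=", "email"))
print(attr_matches("OK ", "=", "OK"))
```

The last call is false because of the trailing space, which is the kind of mismatch that dumping the tree first helps to catch.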

Selector groups (alternation)

A top-level comma separates alternation clauses: the result is the union of each clause’s matches, deduplicated by element identity and returned in document order — the same semantics CSS uses for selector lists. This is useful when the label of an element changes with state and you want one locator that handles every variant:

# macOS Calculator's leftmost button is "All Clear" when the display is 0
# and "Clear" once you've typed anything — match either.
app.locator("button[name='All Clear'], button[name='Clear']").press()
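The union semantics can be illustrated with a small stand-alone model. This is only a sketch under simplifying assumptions, not xa11y code: Node and select_group are hypothetical names, each clause is a plain Python predicate, and element identity is reduced to a position index in document order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    position: int  # index of the element in document order
    role: str
    name: str


def select_group(nodes, clauses):
    """Union of every clause's matches, deduplicated, in document order."""
    matched = {}
    for clause in clauses:
        for node in nodes:
            if clause(node):
                # Keyed by position, so an element matched by two
                # clauses is stored only once.
                matched[node.position] = node
    # Sort by position so clause order does not affect the result order.
    return [matched[pos] for pos in sorted(matched)]


nodes = [
    Node(0, "button", "Clear"),
    Node(1, "static_text", "0"),
    Node(2, "button", "7"),
]

clauses = [
    lambda n: n.role == "button" and n.name == "7",  # like button[name='7']
    lambda n: n.role == "button",                    # like button
]

result = select_group(nodes, clauses)
print([n.name for n in result])
```

The "7" button matches both clauses but appears once, and "Clear" comes first even though the clause that matched it is listed second.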

Each clause is parsed independently, so combinators apply per clause: window button, dialog button reads as “(a button anywhere inside a window) or (a button anywhere inside a dialog)”, not as a stray comma inside one selector. Chained descendant() and child() calls on a group locator distribute over every clause:

# Equivalent to: "toolbar button, dialog button"
app.locator("toolbar, dialog").descendant("button")
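Distribution can be pictured as a rewrite of the selector string. The helper below is a hypothetical illustration, not part of xa11y, and it assumes for simplicity that no clause contains a comma inside a quoted value.

```python
def distribute(group, selector, combinator=" "):
    """Append `selector` to every clause of a comma-separated group.

    combinator=" "   models descendant()
    combinator=" > " models child()
    Assumes no commas inside quoted attribute values.
    """
    clauses = [clause.strip() for clause in group.split(",")]
    return ", ".join(f"{clause}{combinator}{selector}" for clause in clauses)


print(distribute("toolbar, dialog", "button"))
print(distribute("toolbar, dialog", "button", combinator=" > "))
```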

Commas inside quoted attribute values ([name='a,b']) are not separators.
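One way to picture this rule is a splitter that tracks whether it is currently inside a quoted value. The sketch below is a simplified model and not xa11y's parser: split_top_level is a hypothetical name, and backslash escapes inside quotes are not handled.

```python
from __future__ import annotations


def split_top_level(selector: str) -> list[str]:
    """Split a selector on commas that are not inside a quoted value."""
    clauses = []
    current = []
    quote = None  # the quote character we are inside, if any
    for ch in selector:
        if quote is not None:
            current.append(ch)
            if ch == quote:
                quote = None  # closing quote
        elif ch in ("'", '"'):
            quote = ch  # opening quote
            current.append(ch)
        elif ch == ",":
            # A comma outside quotes ends the current clause.
            clauses.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    clauses.append("".join(current).strip())
    return clauses


print(split_top_level("button[name='a,b'], button[name='Clear']"))
```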

Next steps