JavaScript API Reference

Overview

The @crowecawcaw/xa11y package provides cross-platform accessibility queries and actions for Node.js. All methods that touch the accessibility tree are asynchronous — they run on the napi tokio worker pool so the Node event loop stays responsive.

import { App } from '@crowecawcaw/xa11y';
const app = await App.byName('Safari');
await app.locator('button[name="OK"]').press();

Types

type CheckedState

Checked state of a toggleable element.

type CheckedState = 'on' | 'off' | 'mixed';

type EventTypeName

Accessibility event type names, normalised across platforms.

type EventTypeName =
  | 'focusChanged'
  | 'valueChanged'
  | 'nameChanged'
  | 'stateChanged'
  | 'structureChanged'
  | 'windowOpened'
  | 'windowClosed'
  | 'windowActivated'
  | 'windowDeactivated'
  | 'selectionChanged'
  | 'menuOpened'
  | 'menuClosed'
  | 'alert'
  | 'textChanged';

type NativeSubscription

type NativeSubscription = _NativeSubscription;

Errors

All operations throw subclasses of XA11yError. Catch a specific subclass with instanceof and let the rest propagate.
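The "catch one subclass, let the rest propagate" pattern can be wrapped in a small helper. The sketch below is illustrative only: recoverFrom and ErrorClass are hypothetical names, not part of xa11y, and the helper is generic over the error class so it does not depend on how the package exports its errors.

```typescript
// A constructor type for any Error subclass (for example, one of the
// xa11y error classes). Hypothetical helper type, not part of xa11y.
type ErrorClass<E extends Error> = new (...args: any[]) => E;

/**
 * Runs `op`. If it rejects with an instance of `errorClass`, resolves to
 * `fallback(err)` instead. Every other error is rethrown unchanged.
 */
async function recoverFrom<T, E extends Error>(
  op: () => Promise<T>,
  errorClass: ErrorClass<E>,
  fallback: (err: E) => T,
): Promise<T> {
  try {
    return await op();
  } catch (err) {
    if (err instanceof errorClass) {
      return fallback(err);
    }
    throw err; // not the subclass we asked for, so let it propagate
  }
}
```

Assuming the error classes are exported under the names listed in this section, a call might pass SelectorNotMatchedError as errorClass and return null from fallback, so that a missing element becomes null while permission or platform errors still surface.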

XA11yError

Base class for all xa11y errors.

PermissionDeniedError

Accessibility permissions have not been granted.

AccessibilityNotEnabledError

The target app advertises an accessibility tree but it is empty.

Raised on Linux when a Chromium/Electron app is launched without --force-renderer-accessibility (or the ACCESSIBILITY_ENABLED=1 environment variable), so the renderer accessibility bridge never populates the window’s subtree.

SelectorNotMatchedError

No element matched the selector (also used for stale elements).

ActionNotSupportedError

The requested action is not supported on the target element.

TimeoutError

An operation exceeded its timeout.

InvalidSelectorError

The selector string has invalid syntax.

InvalidActionDataError

The data passed to an action method was rejected (e.g. out-of-range slider value).

PlatformError

An OS-level accessibility error occurred.

Classes

App

A running application — the entry point for accessibility queries.

Construct via App.byName, App.byPid, or App.list. An App is not an Element — it represents the application as a whole and provides App.locator to search its accessibility tree.

Static methods

static byName(name: string, options?: AppLookupOptions | undefined | null): Promise<App>

Find an application by exact name.

Pass options.timeout (ms) to poll the accessibility API until the app appears. Useful when the app may not yet be registered (e.g. just-launched). Only “not found” errors trigger a retry; permission errors and the like fail fast.

Rejects with PermissionDeniedError if accessibility is not enabled, or SelectorNotMatchedError if no matching app is found.
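The retry rule described for options.timeout can be pictured as a polling loop. The following is a minimal sketch of that idea, not xa11y's actual implementation: pollUntilFound, isNotFound, and the 100 ms default interval are all hypothetical.

```typescript
/**
 * Calls `lookup` until it succeeds or `timeoutMs` elapses.
 * Only errors accepted by `isNotFound` are retried; anything else
 * (for example a permission error) is rethrown on the first attempt.
 */
async function pollUntilFound<T>(
  lookup: () => Promise<T>,
  isNotFound: (err: unknown) => boolean,
  timeoutMs: number,
  intervalMs = 100,
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    try {
      return await lookup();
    } catch (err) {
      // Fail fast on non-retryable errors, or once the deadline has passed.
      if (!isNotFound(err) || Date.now() >= deadline) {
        throw err;
      }
    }
    // Sleep before the next attempt.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

In real code the built-in options.timeout should make a loop like this unnecessary; the sketch only shows why a "not found" result is retried while other failures are reported immediately.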

static byPid(pid: number, options?: AppLookupOptions | undefined | null): Promise<App>

Find an application by process ID.

See App.byName for the options.timeout behaviour.

static list(): Promise<App[]>

List all running applications with an accessibility tree.

Properties

name

Type: string

The application’s human-readable name (e.g. "Safari").

pid

Type: number | null

The application’s process ID, or null if the platform does not expose one for this app.

Methods

locator(selector: string): Locator

Create a Locator scoped to this application’s accessibility tree.

The locator re-resolves selector on every operation, so it always targets the current UI state — see the Locator class for the full API.

children(): Promise<Element[]>

Get direct children (typically windows) of this application.

subscribe(): Promise<_NativeSubscription>

Subscribe to accessibility events from this application.

Element

A snapshot of a node in the accessibility tree.

Property getters (role, name, value, state flags, etc.) are synchronous — they read the snapshot data captured when the element was fetched. Navigation methods (children(), parent()) are async and re-query the provider on every call, so you always see the latest tree state.

Elements are cheap to pass around; they share the provider handle internally.
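The split between synchronous getters and asynchronous navigation shows up clearly when walking the tree. The sketch below is an illustration under assumptions: ElementLike is a local structural stand-in for the parts of Element it uses, and dumpTree is a hypothetical helper, not part of the package.

```typescript
// Structural stand-in for the Element members used below.
interface ElementLike {
  readonly role: string;
  readonly name: string | null;
  children(): Promise<ElementLike[]>;
}

/** Returns one indented line per node, depth-first, down to `maxDepth`. */
async function dumpTree(
  el: ElementLike,
  depth = 0,
  maxDepth = 3,
): Promise<string[]> {
  // Property getters are synchronous: they read the captured snapshot.
  const label = el.name === null ? el.role : `${el.role} "${el.name}"`;
  const lines = [`${"  ".repeat(depth)}${label}`];
  if (depth < maxDepth) {
    // children() is asynchronous and re-queries the provider on each call.
    for (const child of await el.children()) {
      lines.push(...(await dumpTree(child, depth + 1, maxDepth)));
    }
  }
  return lines;
}
```

Because children() re-queries on every call, a dump of a large tree issues one query per visited node; the maxDepth cap keeps that bounded.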

Properties

role

Type: string

The element’s role, as a snake_case string (e.g. "button", "check_box").

name

Type: string | null

Human-readable name (title, label, or ARIA name).

value

Type: string | null

Current value — text content for editable fields, stringified slider position, etc. For numeric controls, prefer numericValue.

description

Type: string | null

Supplementary description (tooltip text, ARIA description).

numericValue

Type: number | null

Numeric value for sliders, spin buttons, and progress indicators.

minValue

Type: number | null

Minimum numeric value for bounded controls (slider, spin button).

maxValue

Type: number | null

Maximum numeric value for bounded controls (slider, spin button).

stableId

Type: string | null

Platform-assigned identifier that is stable across queries for the same element. Not available on every platform / every widget.

pid

Type: number | null

Process ID of the owning application.

actions

Type: Array<string>

Names of actions the element advertises (e.g. ["press", "focus"]). Use Locator.performAction(name) to invoke a custom action, or the named convenience methods (press, toggle, etc.) for the common ones.
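One way to use the actions list is to check it before calling performAction, rather than relying on the rejection. This is a sketch under assumptions: ActionTarget and ActionLocator are local structural stand-ins for Element and Locator, and performIfAdvertised is a hypothetical helper.

```typescript
// Structural stand-ins for the Element and Locator members used below.
interface ActionTarget {
  readonly actions: string[];
}
interface ActionLocator {
  element(): Promise<ActionTarget>;
  performAction(action: string): Promise<void>;
}

/**
 * Performs `action` only if the current snapshot advertises it.
 * Resolves to true if the action was invoked, false if it was skipped.
 */
async function performIfAdvertised(
  locator: ActionLocator,
  action: string,
): Promise<boolean> {
  const snapshot = await locator.element();
  if (!snapshot.actions.includes(action)) {
    return false;
  }
  // The locator resolves again here, so the UI may have changed since the
  // snapshot; callers should still be prepared for a rejection.
  await locator.performAction(action);
  return true;
}
```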

bounds

Type: Rect | null

Screen-coordinate bounding rectangle, or null for virtual / off-screen elements that do not have a physical position.

raw

Type: Record<string, unknown>

Platform-specific raw data attached to this element, as a plain JS object. Keys are provider-defined (e.g. ax_role/ax_subrole on macOS, uia_control_type on Windows). Values are JSON-compatible — strings, numbers, booleans, arrays, nested objects. Intended for debugging and platform-specific queries.

enabled

Type: boolean

true if the element is interactive (not greyed out or disabled).

visible

Type: boolean

true if the element is currently rendered on screen (not hidden, not clipped off the viewport).

focused

Type: boolean

true if the element currently has keyboard focus.

checked

Type: CheckedState | null

Tri-state checked value for checkboxes, toggle buttons, and menu items: "on", "off", "mixed", or null if the element is not toggleable.

selected

Type: boolean

true if the element is selected (list item, tab, row).

expanded

Type: boolean | null

true / false for expandable elements (disclosures, menus, tree items); null if the element is not expandable.

editable

Type: boolean

true if the element accepts text editing (text field, text area, rich-text region).

focusable

Type: boolean

true if the element can receive keyboard focus (distinct from focused, which reports the current state).

Type: boolean

true if the element is a modal dialog that blocks interaction with the rest of the app.

required

Type: boolean

true for form fields that are marked required.

busy

Type: boolean

true if the element is loading or otherwise indicating a busy state (progress indicator, spinner region).

Methods

children(): Promise<Element[]>

Get direct children (lazy — each call re-queries the provider).

parent(): Promise<Element | null>

Get the parent element, or null if this is the root.

subscribe(): Promise<_NativeSubscription>

Subscribe to accessibility events for this element (typically an app).

Event

An accessibility event delivered to a Subscription.

Events are emitted from the source application — focus changes, value edits, window lifecycle, structural updates. Attach a listener via subscription.on(type, handler) or await one with subscription.waitForEvent(type, opts).

Properties

type

Type: EventTypeName

Event kind, as a camelCase string (e.g. "focusChanged", "valueChanged").

stateFlag

Type: string | null

For stateChanged events: the flag that changed (e.g. "checked", "busy"). null for all other event kinds.

stateValue

Type: boolean | null

For stateChanged events: the new boolean value of the flag. null for all other event kinds.

appName

Type: string

Name of the application that emitted this event.

appPid

Type: number

Process ID of the application that emitted this event.

target

Type: Element | null

Snapshot of the element that triggered the event, if available.
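The properties above are enough to build a one-line log entry per event. The sketch below assumes nothing beyond the documented fields: EventLike is a local structural stand-in for Event, and describeEvent is a hypothetical formatting helper.

```typescript
// Structural stand-in for the Event properties documented above.
interface EventLike {
  readonly type: string;
  readonly stateFlag: string | null;
  readonly stateValue: boolean | null;
  readonly appName: string;
  readonly appPid: number;
  readonly target: { readonly role: string; readonly name: string | null } | null;
}

/** Formats an event as "App[pid] type flag=value (role "name")". */
function describeEvent(ev: EventLike): string {
  const source = `${ev.appName}[${ev.appPid}]`;
  // stateFlag and stateValue are only populated for stateChanged events.
  const detail =
    ev.stateFlag !== null && ev.stateValue !== null
      ? ` ${ev.stateFlag}=${ev.stateValue}`
      : "";
  let target = "no target";
  if (ev.target !== null) {
    target =
      ev.target.name === null
        ? ev.target.role
        : `${ev.target.role} "${ev.target.name}"`;
  }
  return `${source} ${ev.type}${detail} (${target})`;
}
```

A function like this could be passed as (or called from) the handler given to subscription.on; the exact handler signature is not specified in this section, so treat that wiring as an assumption.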

InputSim

Synthesises OS-level pointer and keyboard events.

Constructed via the module-level inputSim() function. Targets are either an [x, y] tuple in screen pixels, or an Element (centred on its bounds). Key values are strings: printable characters are literal ("a", "7", ";"); named keys use their PascalCase name ("Enter", "ArrowUp", "F5"); modifiers are "Shift", "Ctrl", "Alt", "Meta".

Input simulation is distinct from the accessibility action layer — prefer Locator.press / Locator.typeText when the target exposes the semantic action. Use InputSim for gestures with no a11y equivalent (drag-and-drop, scroll wheels, global shortcuts).

Methods return Promise<void> — the underlying OS input APIs are synchronous but can block briefly, so they run on the napi worker pool.

Methods

click(target: Array<number> | Element): Promise<void>

Left-click once at target.

doubleClick(target: Array<number> | Element): Promise<void>

Left double-click at target.

rightClick(target: Array<number> | Element): Promise<void>

Right-click at target.

moveTo(target: Array<number> | Element): Promise<void>

Move the pointer to target without pressing any button.

drag(start: Array<number> | Element, end: Array<number> | Element): Promise<void>

Left-drag from start to end, using the default duration (150 ms).

scroll(target: Array<number> | Element, dx?: number | undefined | null, dy?: number | undefined | null): Promise<void>

Scroll at target. Positive dx scrolls right; positive dy scrolls the content down. Both default to 0, so omitting them makes the call a no-op.

press(key: string): Promise<void>

Tap a key (press + release).

chord(key: string, held?: Array<string> | undefined | null): Promise<void>

Tap key while the keys in held are held down.

typeText(text: string): Promise<void>

Type literal text into the currently focused control.
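The keyboard methods compose naturally: a chord followed by typeText covers a "select all, then overwrite" gesture. The sketch below is illustrative: KeyboardSim is a local structural stand-in for the two InputSim methods it uses, replaceFocusedText is a hypothetical helper, and it assumes the focused control treats the chord as "select all" and replaces the selection on typing.

```typescript
// Structural stand-in for the InputSim methods used below.
interface KeyboardSim {
  chord(key: string, held?: string[] | null): Promise<void>;
  typeText(text: string): Promise<void>;
}

/**
 * Presses <modifier>+A, then types `text` into the focused control.
 * The caller picks the modifier: typically "Meta" on macOS and "Ctrl"
 * elsewhere (an assumption about the host's shortcut conventions).
 */
async function replaceFocusedText(
  sim: KeyboardSim,
  text: string,
  modifier: "Ctrl" | "Meta",
): Promise<void> {
  await sim.chord("a", [modifier]); // tap "a" while the modifier is held
  await sim.typeText(text);         // typed text replaces the selection
}
```

When the target exposes a text value through the accessibility API, Locator.setValue is the more direct route; this gesture is for controls that only respond to real key events.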

Locator

A resilient element reference that re-queries on each interaction.

Locators never hold a live reference to a UI element. Instead, they store a selector and resolve it on demand, making them immune to staleness. Action methods (press, typeText, toggle, …) auto-wait for the element to appear (up to 5 seconds by default) before acting.

Locators are cheap to clone — the chaining methods (child, descendant, nth, first) return new locators rather than mutating in place.

Example

const app = await App.byName('MyApp');
const save = app.locator("button[name='Save']");
await save.press(); // auto-waits, then presses
await save.waitEnabled(10); // wait up to 10 seconds

Properties

selector

Type: string

The CSS-like selector string for this locator.

Methods

nth(n: number): Locator

Return a new Locator that selects the n-th match (1-based).

first(): Locator

Return a new Locator that selects the first match.

child(selector: string): Locator

Return a new Locator scoped to direct children matching selector.

descendant(selector: string): Locator

Return a new Locator scoped to descendants matching selector.

exists(): Promise<boolean>

Check whether a matching element exists (does not throw on miss).

count(): Promise<number>

Count matching elements.

element(): Promise<Element>

Resolve to a single Element snapshot. Throws SelectorNotMatchedError if no element matches.

elements(): Promise<Element[]>

Resolve to all matching Element snapshots.

press(): Promise<void>

Click / invoke the matched element.

Auto-waits for the element to exist before acting. For elements whose primary activation is toggle or select (checkbox, tab, radio), press dispatches to that semantic — there is no need to distinguish.

focus(): Promise<void>

Move keyboard focus to the matched element.

blur(): Promise<void>

Remove keyboard focus from the matched element.

Not supported on Linux or Windows — on those platforms this rejects with ActionNotSupportedError.

toggle(): Promise<void>

Toggle a two- or three-state control (checkbox, switch, toggle button).

expand(): Promise<void>

Expand a disclosure, menu, or tree item.

collapse(): Promise<void>

Collapse a disclosure, menu, or tree item.

select(): Promise<void>

Select the matched element (list item, tab, row).

showMenu(): Promise<void>

Open the element’s context menu.

scrollIntoView(): Promise<void>

Scroll the element into the visible area.

No-op on macOS — the macOS accessibility API has no equivalent. Uses Component.ScrollTo on Linux and ScrollItemPattern on Windows.

increment(): Promise<void>

Increment a numeric value (slider, spin button) by its platform step.

decrement(): Promise<void>

Decrement a numeric value (slider, spin button) by its platform step.

setValue(value: string): Promise<void>

Set the text value of the matched element. Replaces the entire value rather than inserting at the caret — use typeText for insertion.

setNumericValue(value: number): Promise<void>

Set the numeric value of the matched element (slider, spin button).

typeText(text: string): Promise<void>

Type text at the current caret position.

Uses the platform accessibility API — never simulates keyboard events. For synthesised keystrokes (global shortcuts, drag gestures), use the InputSim surface instead.

selectText(start: number, end: number): Promise<void>

Select the text range from start to end (0-based character offsets). Rejects with InvalidActionDataError if start > end.

performAction(action: string): Promise<void>

Perform a custom action by its snake_case name.

Use this for actions the element advertises in its actions list that don’t have a dedicated method. Rejects with ActionNotSupportedError if the element does not advertise action.

waitVisible(timeoutSeconds?: number): Promise<Element>

Wait for a matching element to become visible. Rejects with TimeoutError if still hidden after timeoutSeconds.

waitAttached(timeoutSeconds?: number): Promise<Element>

Wait for a matching element to exist in the tree (may not be visible). Rejects with TimeoutError if no match appears within timeoutSeconds.

waitDetached(timeoutSeconds?: number): Promise<void>

Wait for the matching element to be removed from the tree.

waitEnabled(timeoutSeconds?: number): Promise<Element>

Wait for the matching element to become enabled (interactive).

waitHidden(timeoutSeconds?: number): Promise<void>

Wait for the matching element to be hidden or removed.

waitDisabled(timeoutSeconds?: number): Promise<Element>

Wait for the matching element to become disabled (non-interactive).

waitFocused(timeoutSeconds?: number): Promise<Element>

Wait for the matching element to receive keyboard focus.

waitUnfocused(timeoutSeconds?: number): Promise<Element>

Wait for the matching element to lose keyboard focus.
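Since press only auto-waits for the element to exist, a common pairing is to wait for the enabled state first. The following is a minimal sketch, assuming the documented signatures: WaitableLocator is a local structural stand-in for Locator, and pressWhenEnabled is a hypothetical helper.

```typescript
// Structural stand-in for the Locator methods used below.
interface WaitableLocator {
  waitEnabled(timeoutSeconds?: number): Promise<unknown>;
  press(): Promise<void>;
}

/**
 * Waits for the control to become enabled, then presses it.
 * If waitEnabled rejects (presumably with TimeoutError), the rejection
 * propagates and press is never called.
 */
async function pressWhenEnabled(
  locator: WaitableLocator,
  timeoutSeconds = 10,
): Promise<void> {
  await locator.waitEnabled(timeoutSeconds); // note: seconds, not milliseconds
  await locator.press();
}
```

Note the unit difference within this API: the wait methods take seconds, while options.timeout on App.byName and App.byPid is in milliseconds.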

Screenshot

A captured image: raw RGBA8 pixels plus dimensions and scale.

width and height are in physical pixels. scale is the physical-to-logical ratio (1.0 on standard displays, 2.0 on typical Retina). pixels.length equals width * height * 4.

Properties

width

Type: number

Image width in physical pixels.

height

Type: number

Image height in physical pixels.

scale

Type: number

Physical-to-logical pixel ratio (1.0 on standard displays, 2.0 on typical Retina, 1.5 / 1.75 / 2.0 on common Windows / Linux HiDPI).

pixels

Type: Buffer

Raw RGBA8 pixel bytes (width * height * 4).
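Given that pixels holds exactly width * height * 4 bytes, individual pixels can be read by index arithmetic. The sketch below assumes a row-major layout with the top row first, which this section does not state explicitly; ScreenshotLike, pixelAt, and logicalSize are hypothetical names.

```typescript
// Structural stand-in for Screenshot. Node's Buffer is a Uint8Array
// subclass, so a real Screenshot's pixels would fit this shape.
interface ScreenshotLike {
  readonly width: number;
  readonly height: number;
  readonly scale: number;
  readonly pixels: Uint8Array;
}

type Rgba = [number, number, number, number];

/** Reads the RGBA bytes at physical pixel (x, y). */
function pixelAt(shot: ScreenshotLike, x: number, y: number): Rgba {
  const inRange =
    Number.isInteger(x) && Number.isInteger(y) &&
    x >= 0 && y >= 0 && x < shot.width && y < shot.height;
  if (!inRange) {
    throw new RangeError(`(${x}, ${y}) is outside ${shot.width}x${shot.height}`);
  }
  // Assumption: rows are stored top to bottom with no padding.
  const i = (y * shot.width + x) * 4;
  return [shot.pixels[i], shot.pixels[i + 1], shot.pixels[i + 2], shot.pixels[i + 3]];
}

/** Converts the physical dimensions to logical units using scale. */
function logicalSize(shot: ScreenshotLike): { width: number; height: number } {
  return { width: shot.width / shot.scale, height: shot.height / shot.scale };
}
```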

Methods

toPng(): Buffer

Encode the image as a PNG and return the bytes.

savePng(path: string): void

Encode as PNG and write to path.

AppLookupOptions

Options for App.byName / App.byPid.

Rect

A bounding rectangle in screen coordinates.

Coordinates use the platform’s native coordinate space: points on macOS, physical pixels on Windows and Linux. Origin is the top-left of the primary display; negative x / y are valid on multi-monitor setups.

Properties

x

Type: number

Left edge, in screen coordinates.

y

Type: number

Top edge, in screen coordinates.

width

Type: number

Width in screen-coordinate units.

height

Type: number

Height in screen-coordinate units.
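A Rect converts to a point target with simple arithmetic, which is useful when an offset from the element centre is needed. This is a sketch only: RectLike and centerOf are hypothetical names, and whether Rect units and InputSim's [x, y] pixels coincide on every platform is not stated here, so treat that as an assumption (particularly on macOS, where Rect is in points).

```typescript
// Structural stand-in for Rect.
interface RectLike {
  readonly x: number;
  readonly y: number;
  readonly width: number;
  readonly height: number;
}

/** Returns the centre of `rect` as an [x, y] tuple in the same units. */
function centerOf(rect: RectLike): [number, number] {
  return [rect.x + rect.width / 2, rect.y + rect.height / 2];
}
```

Negative coordinates pass through unchanged, so rectangles on a monitor left of or above the primary display work the same way.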

Functions

inputSim(): InputSim

Construct an InputSim backed by the platform’s native input path (CGEvent on macOS, SendInput on Windows, XTest on X11).

Throws PlatformError on a Wayland-only Linux session (no XTest available). InputSim is cheap to hold; construct one and reuse.

locator(selector: string): Locator

Create a top-level Locator that searches from the system accessibility root (across all applications).