TV Testing

VibeView supports tvOS (Apple TV) and Android TV in the Device Sandbox and for AI-driven test runs, plus Roku in beta (streaming and remote control — see Roku (beta)). These platforms use a focus-based interaction model rather than touch: navigation happens with a d-pad, and actions operate on whichever UI element currently holds focus. Recording, replay, and the AI tool set all account for this.

Supported device groups

The device picker in the Device Sandbox shows up to three TV groups when TV-category devices are registered with the worker:

Apple TV — tvOS simulator entries (e.g. Apple TV 4K running tvOS).
Android TV — Android TV emulator entries.
Roku — physical Roku devices (beta).

Exact models and OS versions come from the worker’s installed simulator and emulator set, not from a fixed list in the UI.

Launching a TV session

1. Open the Device Sandbox.
2. In the device dropdown, open the Apple TV or Android TV group and pick a device.
3. Click Start.

The session boots the same way as a phone or tablet session, and the video stream appears once the device is ready. You can install an app from your organization’s library into a TV session the same way you would on a phone.

Navigating with the d-pad

TV sessions have no touch interaction — tapping the stream does nothing. Instead, the sandbox forwards keyboard events as d-pad presses while a TV session is active:

Keyboard	D-pad action
Arrow Up	DPAD_UP
Arrow Down	DPAD_DOWN
Arrow Left	DPAD_LEFT
Arrow Right	DPAD_RIGHT
Enter	DPAD_CENTER (select)
Escape	BACK

While a TV session is live, the hint bar under the video stream reads “Use arrow keys to navigate, Enter to select, Esc to go back”.

Keyboard forwarding is suppressed when focus is in an input, textarea, or select control in the sandbox UI, so you can still type into the sandbox chrome without firing d-pad events at the device.
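The mapping and the suppression rule above can be sketched in a few lines. This is an illustration only, not VibeView's implementation: the key names follow the browser's KeyboardEvent.key values (an assumption about how the sandbox reads keys), and dpad_action_for and TEXT_ENTRY_TAGS are hypothetical names.

```python
from typing import Optional

# Keyboard key -> d-pad action, copied from the table above.
# Key names assume the browser's KeyboardEvent.key values.
KEY_TO_DPAD = {
    "ArrowUp": "DPAD_UP",
    "ArrowDown": "DPAD_DOWN",
    "ArrowLeft": "DPAD_LEFT",
    "ArrowRight": "DPAD_RIGHT",
    "Enter": "DPAD_CENTER",
    "Escape": "BACK",
}

# Sandbox controls that keep their own keystrokes (hypothetical constant).
TEXT_ENTRY_TAGS = {"input", "textarea", "select"}


def dpad_action_for(key: str, focused_tag: Optional[str],
                    tv_session_active: bool) -> Optional[str]:
    """Return the d-pad action to forward, or None when nothing is sent."""
    if not tv_session_active:
        return None  # forwarding only applies while a TV session is live
    if focused_tag is not None and focused_tag.lower() in TEXT_ENTRY_TAGS:
        return None  # the user is typing into the sandbox chrome
    return KEY_TO_DPAD.get(key)  # unmapped keys yield None
```

For instance, dpad_action_for("Enter", "textarea", True) returns None, while the same key with focus on the page body returns "DPAD_CENTER".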

Focus model

TV apps draw a focus cursor that moves between interactive elements. A d-pad direction press moves that focus; DPAD_CENTER activates whatever is focused. VibeView surfaces focus in two places:

UI tree — focused elements are marked [focused] in the formatted tree the AI agent sees. On Apple TV the tree comes from the native automation layer. On Android TV the focused attribute from the device’s accessibility tree is preserved on each element.
Recordings — d-pad events captured during a recording are enriched with the focused element at the time of the press, so replay can confirm the intended target instead of only replaying key codes.
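To make the two surfaces concrete, here is a small sketch of a tree formatter that tags the focused element and of a d-pad press enriched with that element. The Node shape, the line layout, and the names format_tree, find_focused, and enrich_press are all assumptions for illustration; only the [focused] marker comes from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    role: str
    label: str
    focused: bool = False
    children: List["Node"] = field(default_factory=list)


def format_tree(node: Node, depth: int = 0) -> str:
    """Render one line per element, tagging the focused one with [focused]."""
    marker = " [focused]" if node.focused else ""
    lines = ["  " * depth + f'{node.role} "{node.label}"{marker}']
    for child in node.children:
        lines.append(format_tree(child, depth + 1))
    return "\n".join(lines)


def find_focused(node: Node) -> Optional[Node]:
    """Depth-first search for the element that currently holds focus."""
    if node.focused:
        return node
    for child in node.children:
        hit = find_focused(child)
        if hit is not None:
            return hit
    return None


def enrich_press(keycode: str, tree: Node) -> dict:
    """Attach the focused element's label to a recorded d-pad press."""
    target = find_focused(tree)
    return {"keycode": keycode, "focused": target.label if target else None}


# A hypothetical screen: one row with two tiles, the second one focused.
screen = Node("row", "Trending", children=[
    Node("tile", "Movie A"),
    Node("tile", "Movie B", focused=True),
])
```

With this sample screen, format_tree(screen) ends with the line `  tile "Movie B" [focused]`, and enrich_press("DPAD_CENTER", screen) records "Movie B" as the intended target.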

AI tools on TV sessions

When a test runs against a TV session, the agent’s tool list is adjusted to match the platform:

Touch-only tools (tap, tap_and_type, tap_coordinates, swipe, scroll_until_visible, gesture_preset, alert, long_press, select_picker_value, drag, keys) are removed.
press_button is used for d-pad and back: dpad_up, dpad_down, dpad_left, dpad_right, dpad_center, back. home is also accepted, but the agent is explicitly warned that it exits the app and should only be used when a step requires leaving the app.
type_text, clear_text, wait, locate, assert_visible, assert_not_visible, assert_value, find_element, step_complete, and step_failed work the same as on mobile.
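The adjustment described in the list above amounts to a filter plus a button check. The sketch below assumes the tool list is a plain list of names. tools_for_tv and check_button are hypothetical helpers, and the warning text is illustrative rather than the wording the agent actually receives.

```python
from typing import List, Optional

# Touch-only tools removed on TV sessions (names from the list above).
TOUCH_ONLY_TOOLS = {
    "tap", "tap_and_type", "tap_coordinates", "swipe", "scroll_until_visible",
    "gesture_preset", "alert", "long_press", "select_picker_value", "drag",
    "keys",
}

# Buttons accepted by press_button on TV sessions.
TV_BUTTONS = {"dpad_up", "dpad_down", "dpad_left", "dpad_right",
              "dpad_center", "back", "home"}


def tools_for_tv(mobile_tools: List[str]) -> List[str]:
    """Drop touch-only tools and keep the rest in their original order."""
    return [name for name in mobile_tools if name not in TOUCH_ONLY_TOOLS]


def check_button(button: str) -> Optional[str]:
    """Reject unknown buttons; return a warning for `home`, else None."""
    if button not in TV_BUTTONS:
        raise ValueError(f"unsupported TV button: {button}")
    if button == "home":
        return "home exits the app; use it only when a step requires it"
    return None
```

For example, tools_for_tv(["tap", "press_button", "type_text"]) keeps only press_button and type_text.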

On both TV platforms the agent gets two additional focus tools:

tap_focused_tv(ref) — move focus to the element identified by ref and press SELECT in one call. Use this instead of chaining several press_button(dpad_*) calls followed by dpad_center.
focus_element_tv(ref) — move focus to the element identified by ref without pressing SELECT. Use when you want to verify state or position focus before a subsequent action.

The two tools work on both platforms, tuned to each:

On Apple TV they use the platform’s built-in focus navigation to reach the element, then select it. The tools resolve ref against the current UI tree.
On Android TV VibeView navigates focus to the target element automatically and verifies focus actually arrived. If the element can’t be reached, the tool stops and reports it instead of pressing keys indefinitely.

This automatic navigation on Android TV (focus-walk) requires the app to expose focus to the device’s accessibility layer, as react-native-tvos and Leanback apps that use native focus do. Apps that track focus only in their own JS never surface it, so focus-walk can’t see or verify it. See Limitations.
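A bounded navigate-and-verify loop of the kind described above might look like the sketch below. It is deliberately naive and is not VibeView's algorithm: it assumes elements sit on a simple grid, steers greedily toward the target, and uses a fake device class in place of a real accessibility tree. FakeGridDevice, focus_walk, and the reason strings are all hypothetical.

```python
from typing import Dict, Optional, Tuple

Pos = Tuple[int, int]  # (row, column)

MOVES = {"dpad_up": (-1, 0), "dpad_down": (1, 0),
         "dpad_left": (0, -1), "dpad_right": (0, 1)}


class FakeGridDevice:
    """Stand-in for a TV screen: elements on a grid, one of them focused."""

    def __init__(self, layout: Dict[str, Pos], focused: str) -> None:
        self.layout = layout
        self.focused = focused

    def focused_ref(self) -> Optional[str]:
        return self.focused

    def press(self, button: str) -> None:
        d_row, d_col = MOVES[button]
        row, col = self.layout[self.focused]
        for ref, pos in self.layout.items():
            if pos == (row + d_row, col + d_col):
                self.focused = ref  # focus moves only if a neighbour exists
                return


def focus_walk(device: FakeGridDevice, target: str,
               max_presses: int = 20) -> str:
    """Move focus to `target`; return "focused" or the reason it failed."""
    if target not in device.layout:
        return "target not on screen"
    for _ in range(max_presses):
        current = device.focused_ref()
        if current is None:
            return "focus not observable"
        if current == target:
            return "focused"
        (row, col), (t_row, t_col) = device.layout[current], device.layout[target]
        if row != t_row:
            button = "dpad_down" if t_row > row else "dpad_up"
        else:
            button = "dpad_right" if t_col > col else "dpad_left"
        device.press(button)
        if device.focused_ref() == current:
            return "focus did not move"  # stop instead of pressing forever
    return "focused" if device.focused_ref() == target else "press budget exhausted"
```

The two properties that matter are that focus is re-read after every press, so arrival is verified rather than assumed, and that every exit path is bounded, so an unreachable target produces a report rather than an endless run of key presses.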

Recording TV interactions

Enhanced recording captures d-pad presses on TV sessions. Each press is automatically enriched with the element that held focus at the time, so replay can verify the intended target instead of only re-sending key codes. This happens in the background and never slows down your recording.

Focus-verified replay

When a recorded TV d-pad step carries a focused element, replay does not blindly re-send the recorded keycode. Instead it navigates to that element and verifies that focus actually landed there before continuing. This lets replay survive layout drift: if a row or tile has been added or removed since recording, replay still targets the intended element instead of silently mis-firing the old keycode at whatever now sits in that position.

If the element can’t be reached — it’s gone from the screen, or the app’s focus isn’t observable to the device’s accessibility layer — replay falls back to the recorded d-pad keycode and marks the step unverified, recording the reason. The behavior is identical on both TV platforms.
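The verify-then-fall-back behavior can be summarized as a small decision function. This is a sketch under assumptions, not the replay engine itself: RecordedStep, StepResult, and replay_step are hypothetical names, navigate_and_verify stands in for whatever performs focus navigation (it returns None on success or a reason string on failure), and treating a step with no recorded element as unverified is also an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RecordedStep:
    keycode: str                 # e.g. "DPAD_RIGHT"
    focused_ref: Optional[str]   # element that held focus when recorded


@dataclass
class StepResult:
    verified: bool
    reason: Optional[str] = None  # why the step fell back, if it did


def replay_step(step: RecordedStep,
                navigate_and_verify: Callable[[str], Optional[str]],
                send_key: Callable[[str], None]) -> StepResult:
    """Prefer focus-verified replay; fall back to the raw keycode."""
    if step.focused_ref is None:
        # Nothing to verify against (assumed handling): replay the keycode.
        send_key(step.keycode)
        return StepResult(verified=False, reason="no focused element recorded")
    failure = navigate_and_verify(step.focused_ref)
    if failure is None:
        return StepResult(verified=True)
    # Target unreachable or focus not observable: use the recorded keycode
    # and keep the reason so the step is reported as unverified.
    send_key(step.keycode)
    return StepResult(verified=False, reason=failure)
```

Keeping the failure reason on the result is what lets a report distinguish "element gone from the screen" from "focus not observable" after the run.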

Apple TV specifics

The AI agent’s UI tree and focus actions on Apple TV are served by the native automation layer, which exposes the full element tree and focus state for the running app.

Android TV specifics

The UI tree comes from the device’s accessibility tree. The focused and selected attributes are preserved on every element, so the agent can see which row or tile holds focus.
Focus navigation is automatic: VibeView moves focus to the target element and verifies it arrived, or reports that the target can’t be reached. This is what backs tap_focused_tv / focus_element_tv and focus-verified replay on Android TV.

Roku (beta)

Roku support is in beta. Roku devices appear as a third group in the device picker, and a Roku session streams a real physical Roku device live to your browser.

Remote control works with the same keyboard model as the other TV platforms: arrow keys move the d-pad, Enter is OK/select, and Escape is back. Text entry is supported too — when a Roku on-screen keyboard is focused, printable keys you type are sent straight to it (Backspace deletes), with no separate typing mode.

The UI Tree Inspector works on Roku sessions. Element coordinates in the tree are best-effort, so treat highlighted positions as approximate.

Roku sessions are confined to the app under test: system and home navigation is blocked, and if the device leaves the app for any reason, VibeView brings the app back automatically.

Not yet available on Roku:

App upload / sideload — the app under test must already be installed on the device.
AI test runs — the AI test runner does not support Roku sessions yet.

Limitations

tap_focused_tv and focus_element_tv work on both TV platforms. On Android TV they require focus to be observable to the device’s accessibility layer (react-native-tvos / Leanback native-focus apps). Apps that track focus only in their own JS never surface it: focus-walk can’t see or verify focus, so the agent and replay fall back to plain press_button(dpad_*) navigation and steps are marked unverified. This is an app-architecture limitation, not a bug.
Touch input into the video canvas is disabled on TV sessions. Navigate via keyboard.
Exact device models depend on the worker’s installed simulator/emulator set; the sandbox does not ship a fixed catalog of TV devices.

Running TV tests from the CLI

Pass --device-category tv to any of the CLI run commands to allocate a TV device for the run. The platform (tvOS vs Android TV) is still derived from the test case’s app — --device-category only selects the form factor.

vibeview run-test test-789 --device-category tv
vibeview run-suite suite-abc --device-category tv
vibeview run-suites --tag smoke --device-category tv

If no TV-category device is registered with the worker, the run queues until one becomes available (same behavior as phone / tablet).

When multiple TV devices are registered, use vibeview list-devices --category tv to find a specific device_id, then pin your run to it:

vibeview run-test test-789 --device atv-01

The --device flag takes the place of --device-category; the two cannot be combined.
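If you drive the CLI from a script, the flag rules above are easy to encode once. The helper below is a hypothetical wrapper (run_test_args is not part of the CLI) that only assembles the argument list using the flags shown in this section. It rejects the combination up front, which is a choice made for this sketch.

```python
from typing import List, Optional


def run_test_args(test_id: str,
                  device: Optional[str] = None,
                  device_category: Optional[str] = None) -> List[str]:
    """Build the argv for `vibeview run-test` with at most one device flag."""
    if device and device_category:
        raise ValueError("--device and --device-category cannot be combined")
    args = ["vibeview", "run-test", test_id]
    if device:
        args += ["--device", device]                    # pin a specific device
    elif device_category:
        args += ["--device-category", device_category]  # any device of this form factor
    return args
```

The resulting list can be passed to subprocess.run(args, check=True) on a machine where the vibeview CLI is installed.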

Cross-TV replay (record on one TV platform, replay on the other)

A TV test recorded on one TV platform can be replayed on the other. Apple TV and Android TV share the same form factor (--device-category tv), so recordings replay across them the same way iOS and Android mobile recordings replay cross-platform.

There is no special flag for this. Replay targets whichever TV platform the run points at: the test case’s app (or suite), or the live sandbox session, determines the platform. As described in Running TV tests from the CLI, --device-category tv selects only the form factor, and the platform is derived from the app. To replay on the other platform, point the same recorded test at that platform’s app, or run it in a session on that platform.

Whether a step replays verified or unverified depends on shared identifiers:

When the app exposes shared, stable identifiers across platforms, replay reaches the recorded element by selector and focus-verifies it. In react-native-tvos apps, a JS testID is mapped to both the iOS accessibilityIdentifier and the Android resource-id / content-desc, so the same selector matches on either platform. The result is focus-verified cross-TV replay.
Without shared identifiers, the selector can’t match on the other platform, so replay falls back to portable d-pad keycodes (the DPAD_* map is identical across both platforms) and the step is marked unverified.

So shared testIDs are what unlock verified cross-TV replay. The custom-JS-focus limitation applies here too: if an Android TV app’s focus isn’t observable to the device’s accessibility layer, focus can’t be verified on that side regardless of identifiers, and those steps fall back to unverified d-pad navigation.
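The decision between verified and unverified cross-TV replay can be sketched as a lookup by shared identifier. This is illustrative only: elements are modeled as plain dictionaries, the platform keys "tvos" and "android_tv" are made up for this example, and find_by_test_id and replay_mode are hypothetical helpers. The attribute names are the ones mentioned above.

```python
from typing import Dict, List, Optional

# Attributes a shared testID may surface as on each platform.
ID_ATTRIBUTES = {
    "tvos": ("accessibilityIdentifier",),
    "android_tv": ("resource-id", "content-desc"),
}


def find_by_test_id(elements: List[Dict[str, str]], test_id: str,
                    platform: str) -> Optional[Dict[str, str]]:
    """Return the first element whose platform identifier equals test_id."""
    for element in elements:
        if any(element.get(attr) == test_id for attr in ID_ATTRIBUTES[platform]):
            return element
    return None


def replay_mode(elements: List[Dict[str, str]], test_id: str, platform: str,
                focus_observable: bool = True) -> str:
    """Classify how a recorded step would replay on the target platform."""
    if not focus_observable:
        return "unverified"  # focus cannot be checked, whatever the identifiers
    if find_by_test_id(elements, test_id, platform) is None:
        return "unverified"  # no shared identifier: d-pad keycode fallback
    return "verified"
```

With a tvOS tree containing {"accessibilityIdentifier": "play_button"} and an Android TV tree containing {"resource-id": "play_button"}, the same recorded selector resolves on both sides, which is the case where replay stays focus-verified.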