CI Testing with test-spec.json
Define mobile tests as code and run them automatically in CI/CD pipelines. The test-spec.json format lets AI agents and automation tools describe tests that VibeView executes on real simulators.
Quick Start
- Create .vibeview/test-spec.json in your repository root:
{
  "version": 1,
  "test_name": "Login Flow",
  "steps": [
    "Launch the app",
    "Tap Sign In",
    "Enter test credentials",
    "Verify home screen loads"
  ]
}
- Copy the workflow template templates/vibeview-test.yml (provided by VibeView) to .github/workflows/vibeview-test.yml in your repo. See Workflow Template below.
Or run manually with the CLI:
# Install CLI from GitHub Packages
npm install -g @scriptx-com/vibeview-cli
vibeview run-test <test-id> --output json
Schema Reference
| Field | Type | Required | Description |
|---|---|---|---|
| version | number | Yes | Schema version. Must be 1. |
| steps | string[] | Yes | Array of natural language test steps for the AI agent. |
| test_name | string | No | Name for the test case. Defaults to "CI Test". |
| context | string | No | Additional context for the AI agent (e.g., login credentials, app state assumptions). |
Field Details
version — Must be 1. This field is mandatory from day one for forward compatibility. Future schema versions will increment this number and maintain backward compatibility.
steps — Each string is a natural language instruction for the AI agent. Write steps as you would explain them to a human tester. The AI agent interprets each step, interacts with the app UI, and reports pass/fail.
test_name — Optional label for the test case. Appears in PR comments, the VibeView dashboard, and CLI output.
context — Optional background information the AI agent uses when executing steps. Useful for providing test credentials, describing expected app state, or clarifying ambiguous UI elements.
Note: app_path and platform are not in the test spec. The app build artifact path is configured in the workflow template’s APP_PATH env var, and platform is auto-detected from the binary when uploading.
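The schema and defaults above can be sketched as a small loader. This is a minimal illustration, not VibeView's own parser: the Spec class and the parse_spec and load_spec helpers are hypothetical names, and the empty-string default for context is an assumption (the schema only documents the "CI Test" default for test_name).

```python
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Spec:
    """In-memory form of test-spec.json (hypothetical helper type)."""
    version: int
    steps: list
    test_name: str = "CI Test"  # documented default
    context: str = ""           # assumption: empty when omitted


def parse_spec(raw: str) -> Spec:
    """Parse test-spec.json text and fill in defaults for optional fields."""
    data = json.loads(raw)
    return Spec(
        version=data["version"],
        steps=list(data["steps"]),
        test_name=data.get("test_name", "CI Test"),
        context=data.get("context", ""),
    )


def load_spec(path: str = ".vibeview/test-spec.json") -> Spec:
    """Read the spec from its required location in the repository."""
    return parse_spec(Path(path).read_text(encoding="utf-8"))
```

Calling parse_spec on the minimal example yields test_name "CI Test" and an empty context.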
File Location
The file must be at .vibeview/test-spec.json in the repository root:
your-repo/
  .vibeview/
    test-spec.json
  src/
  build/
  ...
The workflow template looks for this path by default. If the file is missing or invalid, the workflow fails with a clear error message.
Examples
Minimal (required fields only)
{
  "version": 1,
  "steps": [
    "Launch the app",
    "Verify the home screen is visible"
  ]
}
With context and test name
{
  "version": 1,
  "test_name": "Login Flow",
  "steps": [
    "Launch the app",
    "Tap the Sign In button",
    "Enter 'testuser@example.com' in the email field",
    "Enter 'password123' in the password field",
    "Tap Submit",
    "Verify the dashboard screen loads with a welcome message"
  ],
  "context": "The app has a test account pre-configured: testuser@example.com / password123. The Sign In button is on the initial landing screen."
}
Onboarding example
{
  "version": 1,
  "test_name": "Onboarding Flow",
  "steps": [
    "Launch the app",
    "Swipe left through the onboarding screens",
    "Tap Get Started",
    "Verify the main screen appears"
  ],
  "context": "First launch after fresh install. The app shows 3 onboarding slides before the Get Started button."
}
Using with GitHub Actions
Copy the workflow template into your repo and configure secrets. The workflow handles app upload, test creation, execution, and PR reporting automatically.
Minimal workflow
# Copy templates/vibeview-test.yml to .github/workflows/vibeview-test.yml
# Then add your build step and configure secrets (see Workflow Template section)
The workflow:
- Validates required secrets (VIBEVIEW_API_TOKEN, VIBEVIEW_NPM_TOKEN).
- Installs @scriptx-com/vibeview-cli from GitHub Packages.
- Reads .vibeview/test-spec.json and validates all required fields.
- Uploads the app artifact to VibeView.
- Creates a test case from the spec steps and context.
- Runs the test with --commit-sha set to the PR head SHA.
- Posts a detailed PR comment with pass/fail status, step results, and a link to the full run in VibeView.
Pinning a specific build
By default, CI runs install the app’s latest uploaded build. To run against an older build — for example, to reproduce a regression on the last release before landing a fix — pass --build <id> to the CLI or set build_id on the CI API payload.
The id is the numeric AppBuild id. You can find it on the app detail page under Build History, or in the JSON response of POST /api/v1/apps/{app_id}/builds.
CLI
vibeview run-test <test-id> --build 142 --commit-sha $(git rev-parse HEAD)
vibeview run-suite <suite-id> --build 142 --commit-sha $(git rev-parse HEAD)
GitHub Actions
- name: Run VibeView suite against a specific build
  env:
    VIBEVIEW_API_TOKEN: ${{ secrets.VIBEVIEW_API_TOKEN }}
  run: |
    vibeview run-suite ${{ secrets.SUITE_ID }} \
      --build ${{ vars.BUILD_ID }} \
      --commit-sha ${{ github.sha }} \
      --output junit > results.xml
If --build is omitted, the run falls back to the app’s latest build.
CI HTTP API
The CLI ultimately calls POST /api/v1/ci/run-test and POST /api/v1/ci/run-suite. Both endpoints accept an optional build_id integer in the JSON body. When omitted, the server resolves the latest build for the attached app. When the case or suite has no app attached, sending build_id returns HTTP 400.
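For callers who skip the CLI, the optional build_id behaviour can be sketched as a request builder. This is an illustration under stated assumptions: only the endpoint path and the optional integer build_id come from this guide. The test_id and commit_sha body field names, the Bearer authorization scheme on this endpoint, and the build_run_test_request helper itself are hypothetical, so check the API reference before relying on them.

```python
import json
import urllib.request


def build_run_test_request(base_url, token, test_id, commit_sha, build_id=None):
    """Build (but do not send) a POST to /api/v1/ci/run-test."""
    # Hypothetical field names; only build_id is documented in this guide.
    body = {"test_id": test_id, "commit_sha": commit_sha}
    if build_id is not None:
        # Omit the key entirely to let the server resolve the latest build.
        body["build_id"] = int(build_id)
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/ci/run-test",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to urllib.request.urlopen would send it. Leaving build_id as None keeps the key out of the body, which matches the "latest build" fallback described above.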
Workflow Template
VibeView is distributed as a copy-paste workflow template rather than a uses: action reference.
Setup:
- Copy the workflow template templates/vibeview-test.yml (provided by VibeView) to .github/workflows/vibeview-test.yml in your repo.
- Add these secrets to your repository (Settings > Secrets and variables > Actions):
  - VIBEVIEW_API_TOKEN: Your API token from VibeView Settings > API Tokens
  - VIBEVIEW_NPM_TOKEN: A GitHub PAT with packages:read scope on the vibeview org
  - VIBEVIEW_BASE_URL (optional): Your VibeView instance URL if not using the default
- Add your app build step in the workflow where indicated by the comment.
- Create .vibeview/test-spec.json with your test definition.
- Open a PR. The workflow runs automatically.
The template validates your test spec, uploads the app, runs AI tests, and posts results as a PR comment.
Using with Generic CI
For non-GitHub CI systems (GitLab CI, CircleCI, Jenkins, etc.), install the CLI and call commands directly:
# Install CLI from GitHub Packages
npm install -g @scriptx-com/vibeview-cli
# Parse the spec
STEPS=$(jq -c .steps .vibeview/test-spec.json)
CONTEXT=$(jq -r '.context // ""' .vibeview/test-spec.json)
TEST_NAME=$(jq -r '.test_name // "CI Test"' .vibeview/test-spec.json)
# Upload the app (platform is auto-detected from the binary)
APP_PATH="./build/MyApp.app" # Set to your build output path
APP_ID=$(vibeview upload-app "$APP_PATH" --output json | jq -r .app_id)
# Create and run the test
TEST_ID=$(vibeview create-test \
  --app "$APP_ID" \
  --steps "$STEPS" \
  --context "$CONTEXT" \
  --test-name "$TEST_NAME" \
  --output json | jq -r .test_id)
vibeview run-test "$TEST_ID" \
  --commit-sha "$(git rev-parse HEAD)" \
  --output json
Environment variables
Set these in your CI environment:
| Variable | Description |
|---|---|
| VIBEVIEW_API_TOKEN | Your VibeView API token (from Settings > API Tokens) |
| VIBEVIEW_BASE_URL | Your VibeView instance URL (default: https://vibeview.io) |
Validation
The workflow template validates test-spec.json before proceeding. If any required field is missing or invalid, it fails immediately with a clear error:
- Missing version: “test-spec.json: ‘version’ field is required”
- Wrong version: “test-spec.json: unsupported version 2, expected 1”
- Missing steps: “test-spec.json: ‘steps’ field is required”
- Empty steps: “test-spec.json: ‘steps’ must contain at least one step”
- App file not found: “App artifact not found at: $APP_PATH. Set the APP_PATH env var or build your app in a prior step.”
Validation runs before any API calls, so you get fast feedback without waiting for uploads or test execution.
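For CI systems that do not use the workflow template, the same checks can be approximated locally before any upload. The sketch below is not the template's implementation: validate_spec is a hypothetical helper, the four spec messages mirror the list above using straight quotes (an assumption about the exact characters), and the invalid-JSON, non-object, and non-string-steps messages are invented for illustration.

```python
import json


def validate_spec(raw: str) -> list:
    """Return a list of error messages; an empty list means the spec passed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"test-spec.json: invalid JSON ({exc.msg})"]  # hypothetical message
    if not isinstance(data, dict):
        return ["test-spec.json: top level must be an object"]  # hypothetical message

    errors = []
    if "version" not in data:
        errors.append("test-spec.json: 'version' field is required")
    elif data["version"] != 1:
        errors.append(
            f"test-spec.json: unsupported version {data['version']}, expected 1"
        )

    steps = data.get("steps")
    if "steps" not in data:
        errors.append("test-spec.json: 'steps' field is required")
    elif not isinstance(steps, list) or not all(isinstance(s, str) for s in steps):
        # Hypothetical message: the schema says steps is a string array.
        errors.append("test-spec.json: 'steps' must be an array of strings")
    elif not steps:
        errors.append("test-spec.json: 'steps' must contain at least one step")
    return errors
```

A CI script could print each returned message and exit non-zero when the list is not empty, which keeps the fast-feedback property described above.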
CLI Output with Run URLs
When a test completes, the CLI output includes a URL to view the full results in VibeView:
Human format:
PASS Login Flow 12.3s
4/4 cases passed
https://vibeview.io/tests/runs/abc123
JSON format:
{
  "run_id": "abc123",
  "status": "passed",
  "url": "https://vibeview.io/tests/runs/abc123",
  "duration_ms": 12340,
  "test_runs": [...]
}
The URL is useful for PR comments, Slack notifications, or linking from other CI tools.
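As one way to reuse the run URL, a CI step can turn the JSON output into a one-line summary for a chat message or PR comment. This sketch relies only on the run_id, status, url, and duration_ms fields shown in the JSON example; summarize_run and the wording of the message are hypothetical.

```python
import json


def summarize_run(cli_json: str) -> str:
    """Turn `--output json` text into a single human-readable line."""
    run = json.loads(cli_json)
    seconds = run["duration_ms"] / 1000
    return (
        f"VibeView run {run['run_id']}: {run['status']} "
        f"in {seconds:.1f}s ({run['url']})"
    )
```

The function takes the CLI's standard output as a string, so it can be fed from a file or a captured subprocess result.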
Reading run results
Run commands (run-test, run-suite, run-suites) are async: the server accepts the request and returns {run_id, status} immediately. The CLI then polls GET /api/v1/ci/runs/{run_id} every 3 seconds until the run completes. This section documents the poll payload for programmatic consumers who call the endpoint directly.
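The polling behaviour described above can be sketched as a loop around any function that fetches the poll payload. This is an illustration, not the CLI's code: wait_for_run is a hypothetical helper, the 30-minute timeout is arbitrary, and the terminal status set is partly assumed ("passed" and "failed" appear in this guide, while "error" and "cancelled" are guesses at the remaining names).

```python
import time
from typing import Callable

# "passed" and "failed" appear in this guide; "error" and "cancelled" are
# assumed names for other terminal states and may differ on the server.
TERMINAL_STATUSES = frozenset({"passed", "failed", "error", "cancelled"})


def wait_for_run(
    fetch: Callable[[], dict],
    interval: float = 3.0,
    timeout: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until the payload reports a terminal status, then return it."""
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch()
        if payload.get("status") in TERMINAL_STATUSES:
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"run still {payload.get('status')!r} after {timeout}s"
            )
        sleep(interval)
```

Here fetch would wrap a GET to /api/v1/ci/runs/{run_id} and return the decoded JSON. Injecting fetch and sleep keeps the loop testable without a network or real delays.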
Response shapes
The poll response has two shapes depending on whether the run is a single test or a suite.
type: "test_run" — a single test case run. Top-level run fields plus a steps[] array (present once the run reaches a terminal status):
{
  "run_id": "tr_abc123",
  "type": "test_run",
  "status": "failed",
  "case_name": "Login Flow",
  "duration_ms": 8420,
  "run_url": "https://vibeview.io/tests/runs/tr_abc123",
  "model_used": "claude-4-sonnet",
  "progress": null,
  "steps": [
    {
      "step": "Tap the Sign In button",
      "status": "failed",
      "error": "Element not found: Sign In button",
      "error_kind": "element_not_found",
      "error_kind_description": "The element the step needed was not present or could not be located on screen.",
      "duration_ms": 3100,
      "screenshot_url": "/api/v1/tests/test-runs/tr_abc123/screenshots/step_1.jpg",
      "screenshot_signed_url": "https://vibeview.io/api/v1/public/screenshots/tr_abc123/step_1.jpg?exp=1746835200&sig=abc123",
      "iteration_screenshots": []
    }
  ]
}
While the run is still in flight, steps is absent and progress carries the live step-progress events instead; progress is null once the run is terminal.
type: "suite_run" — a suite run. Top-level run fields plus a test_runs[] array of per-case entries. total_cases is the suite’s full case count — test_runs[] only contains cases that have started, so use total_cases as the denominator for “N/M complete”. Note the run link: the top-level run_url points at the suite detail page (/tests/{suite_public_id}), where suite runs are shown as expandable rows; each per-case entry’s url points at that case run’s own page.
{
  "run_id": "sr_def456",
  "type": "suite_run",
  "status": "failed",
  "suite_public_id": "suite_abc123",
  "suite_name": "Login Suite",
  "duration_ms": 24800,
  "run_url": "https://vibeview.io/tests/suite_abc123",
  "total_cases": 4,
  "test_runs": [
    {
      "public_id": "tr_ghi789",
      "case_name": "Login Flow",
      "status": "failed",
      "duration_ms": 8200,
      "error_message": "Visual mismatch on home screen",
      "error_kind": "visual_regression",
      "error_kind_description": "The screen differed from its visual baseline beyond the allowed threshold.",
      "screenshot_signed_url": "https://vibeview.io/api/v1/public/screenshots/tr_ghi789/step_3.jpg?exp=1746835200&sig=xyz789",
      "url": "https://vibeview.io/tests/runs/tr_ghi789"
    }
  ]
}
Both shapes also carry started_at, finished_at, and error_message at the top level. While a run is waiting for a device (status: "queued"), the payload includes a queue block with entry_id, position, and specific_device_id.
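A consumer that handles both shapes might look like the sketch below. The field names come from the two examples above, while the helper names and the rule that a case counts as complete once its status is "passed" or "failed" are assumptions; other per-case statuses may exist.

```python
# Assumption: a suite case is complete once it reports one of these statuses.
DONE_STATUSES = ("passed", "failed")


def suite_progress(payload: dict) -> str:
    """Format "N/M complete" using total_cases as the denominator."""
    runs = payload.get("test_runs") or []
    done = sum(1 for run in runs if run.get("status") in DONE_STATUSES)
    return f"{done}/{payload['total_cases']} complete"


def failures(payload: dict) -> list:
    """Return (name, error_kind) pairs for failed cases or failed steps."""
    if payload.get("type") == "suite_run":
        return [
            (run["case_name"], run.get("error_kind"))
            for run in payload.get("test_runs") or []
            if run.get("status") == "failed"
        ]
    # test_run shape: steps is absent while the run is still in flight.
    return [
        (step["step"], step.get("error_kind"))
        for step in payload.get("steps") or []
        if step.get("status") == "failed"
    ]
```

Branching on the type field first avoids guessing the shape from which arrays happen to be present.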
screenshot_url vs screenshot_signed_url
Each failed step or suite case includes two screenshot fields:
- screenshot_url: a relative API path (/api/v1/tests/test-runs/{run_id}/screenshots/{file}). Resolve it against your VibeView API host and send an Authorization: Bearer <token> header to fetch it.
- screenshot_signed_url: a signed, time-limited public URL. No auth header needed. Returns image/jpeg directly (suitable for embedding in Slack messages, PR comments, or email). Expires after 7 days; returns HTTP 403 if expired or the signature is invalid.
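The difference between the two fields can be sketched as follows: the relative path needs a host and a bearer token, while the signed URL can be embedded as-is. Both helper names are hypothetical, and the Markdown image syntax is just one possible embedding.

```python
import urllib.request
from urllib.parse import urljoin


def screenshot_request(api_host: str, screenshot_url: str, token: str):
    """Build (but do not send) an authenticated GET for a relative screenshot_url."""
    return urllib.request.Request(
        urljoin(api_host, screenshot_url),
        headers={"Authorization": f"Bearer {token}"},
    )


def markdown_image(entry: dict, alt: str = "failure screenshot") -> str:
    """Embed the signed URL directly; returns "" if the field is missing."""
    signed = entry.get("screenshot_signed_url")
    return f"![{alt}]({signed})" if signed else ""
```

Because signed URLs expire, a message that must stay readable for longer than 7 days may be better served by downloading the image through the authenticated path and storing it elsewhere.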
error_kind values
The error_kind field classifies why a step or suite case failed. It is populated for failed or errored steps/cases and null for passed ones. Every payload that carries error_kind also carries error_kind_description — a human-readable sentence explaining the classification.
AI-supplied — the AI agent picks these when it calls step_failed:
| Value | Description |
|---|---|
| element_not_found | The element the step needed was not present or could not be located on screen. |
| app_unresponsive | The screen was frozen or no UI rendered, so the step could not proceed. |
| app_bug | The app crashed or behaved incorrectly (e.g. an error dialog or clearly wrong screen). |
| wrong_state | The app was on a different screen than the step expected. |
| assertion_mismatch | A condition the agent verified was not true. |
| blocked | The step could not run because an external precondition was unmet (e.g. missing test data, login wall, paywall). |
| needs_human | The failure was genuinely ambiguous and needs a person to judge. |
System-supplied — set in code when the cause is known:
| Value | Description |
|---|---|
| assertion_failed | An explicit assertion in the test did not hold. |
| max_iterations | The agent could not complete the step within the allowed number of attempts. The step may be too complex; try breaking it into smaller steps. |
| aborted | The run was halted by a safety guard (e.g. token budget exceeded, or a replay step permanently failed). |
| visual_regression | The screen differed from its visual baseline beyond the allowed threshold. |
Fallback:
| Value | Description |
|---|---|
| unknown | The failure could not be classified (older run, or an unrecognized category). |
For webhook payloads that also carry error_kind, see the error_kind table in the Webhooks & Slack guide.
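When routing failures, it can help to know whether a value came from the AI agent or from the system. The sketch below simply encodes the three tables above as sets; error_kind_source and its return labels are hypothetical, and any value not listed here is treated like the fallback.

```python
AI_SUPPLIED = frozenset({
    "element_not_found", "app_unresponsive", "app_bug", "wrong_state",
    "assertion_mismatch", "blocked", "needs_human",
})
SYSTEM_SUPPLIED = frozenset({
    "assertion_failed", "max_iterations", "aborted", "visual_regression",
})


def error_kind_source(kind):
    """Classify an error_kind value by which table it belongs to."""
    if kind is None:
        return "none"      # passed steps and cases carry null
    if kind in AI_SUPPLIED:
        return "ai"
    if kind in SYSTEM_SUPPLIED:
        return "system"
    return "fallback"      # "unknown" or a value this sketch does not list
```

Treating unlisted values as the fallback keeps a consumer working if new categories are added later.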
Filtering suites by tag
Tag suites in the VibeView UI (suite create/edit), then have CI run a tag-filtered subset instead of a single hard-coded suite ID. This is the recommended pattern for monorepos where different teams own different suites.
Discovering tags
vibeview list-suites --tag profiles
Lists every suite tagged profiles so you can confirm what would run.
Running by tag
vibeview run-suites --tag profiles --commit-sha $GIT_SHA
Runs every matching suite sequentially, one device session per suite, and aggregates the results. Exit codes: 0 if the aggregate passed, 1 if any suite failed, 2 for no matches, timeout, error, or cancellation.
To pin a tagged run to a specific device, first list registered devices with vibeview list-devices, then pass --device.
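A wrapper script can branch on the run-suites exit codes listed above, for example to separate test failures from infrastructure problems. The helper names and the returned labels are hypothetical; the command line uses only the --tag and --commit-sha flags shown in this section.

```python
import subprocess

# Meanings taken from the exit codes documented for run-suites.
EXIT_MEANINGS = {
    0: "passed",
    1: "suite_failed",
    2: "no_result",  # no matches, timeout, error, or cancelled
}


def describe_exit_code(code: int) -> str:
    """Map a run-suites exit code to a short label."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_tagged_suites(tag: str, commit_sha: str) -> str:
    """Run `vibeview run-suites` for one tag and label the outcome."""
    proc = subprocess.run(
        ["vibeview", "run-suites", "--tag", tag, "--commit-sha", commit_sha]
    )
    return describe_exit_code(proc.returncode)
```

A pipeline might retry on "no_result" but fail the build immediately on "suite_failed"; that policy is a design choice, not something the CLI prescribes.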
PR-driven workflow
Trigger different suites based on what changed in the PR:
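# changed_files is a file count, not a path list, so diff against the base branch (needs actions/checkout with fetch-depth: 0).
- id: changes
  run: git diff --name-only "origin/${{ github.base_ref }}...HEAD" | grep '^profiles/' > /dev/null && echo "profiles=true" >> "$GITHUB_OUTPUT" || true
- name: Run profile tests if profiles/ changed
  if: steps.changes.outputs.profiles == 'true'
  run: vibeview run-suites --tag profiles --commit-sha ${{ github.sha }}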
Use --match all if you need a suite to have every listed tag (e.g., --tag smoke --tag ios --match all runs only the iOS smoke suites).
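The tag-matching rule can be illustrated with set operations. This sketch assumes that the default mode matches a suite carrying any of the listed tags, which is implied by the need for --match all but not stated outright; suite_matches is a hypothetical helper, not part of the CLI.

```python
def suite_matches(suite_tags, wanted_tags, match="any"):
    """Return True if a suite's tags satisfy the requested tag filter."""
    have, want = set(suite_tags), set(wanted_tags)
    if match == "all":
        return want <= have       # every requested tag must be present
    if match == "any":            # assumed default behaviour
        return bool(want & have)  # at least one requested tag is present
    raise ValueError(f"unsupported match mode: {match!r}")
```

With --tag smoke --tag ios --match all, a suite tagged smoke and android is excluded because ios is missing, while under the assumed "any" mode it would still run.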
In a build matrix where each job builds a single platform, scope the run to that platform with --platform so a shared tag like prod only runs the suites you built an app for — suites of other platforms are skipped and reported as excluded:
# matrix.platform is one of ios / android / tvos / androidtv
- run: vibeview run-suites --tag prod --platform ${{ matrix.platform }} --commit-sha ${{ github.sha }}
This is cleaner than encoding the platform as a tag — the suite’s own platform (or its linked app’s) is used. See run-suites for details.
Version History
| Version | Date | Changes |
|---|---|---|
| 1 | 2026-03-30 | Initial schema: version, steps, context, test_name |