Releases: takahirom/arbigent
0.20.0
Fixed Focus Logic
The previous focus logic could stop moving focus before it reached the target, which confused the AI. This has been fixed by improving the loop termination conditions.
Enhanced Multi-Image Assertion
We addressed challenges in verifying video playback status by implementing a multi-image AI assertion system. This allows sequential image comparisons to validate dynamic content states.
What's Changed
- Fix focus by index by @takahirom in #129
- Add multiple image assertion by @takahirom in #130
Full Changelog: 0.19.0...0.20.0
0.19.0
Experiment to Optimize System Prompt
We will store JSONL files in arbigent-result/ containing requestBody and responseBody data. User feedback from the interface can also be recorded in arbigent-result/ to enable AI-driven optimization.
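As a rough illustration of the JSONL layout described above, here is a minimal Python sketch. It is not Arbigent's implementation: the file name, the feedback field name, and the helper functions are assumptions made for this example; only the requestBody and responseBody keys and the arbigent-result/ directory come from these notes.

```python
import json
from pathlib import Path
from typing import List, Optional


def append_jsonl(path: Path, request_body: dict, response_body: dict,
                 feedback: Optional[str] = None) -> None:
    """Append one API call to a JSONL file as a single JSON line."""
    record = {"requestBody": request_body, "responseBody": response_body}
    if feedback is not None:
        # "feedback" is a hypothetical field name used for illustration only.
        record["feedback"] = feedback
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path: Path) -> List[dict]:
    """Read every non-empty line back as one JSON object."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

One record per line keeps the file append-only, so request-response pairs can be collected across runs and read back later as training data.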

The current system prompt was developed through trial and error. We plan to implement the COPRO optimization method using collected request-response pairs as training data. While not yet implemented, we have sample code for prompt optimization.
Failed Cache Removal
We encountered recurring failures due to preserved AI decision caches after unsuccessful tests. This was resolved by automatically removing corresponding cache entries when tests fail.
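The idea can be sketched in Python as follows. This is only a sketch under assumptions: the TrackingCache class, its method names, and the notion of "keys touched during a run" are hypothetical and do not describe Arbigent's actual code.

```python
class TrackingCache:
    """Decision cache that remembers which entries a test run touched."""

    def __init__(self):
        self._entries = {}
        self._touched = set()

    def get(self, key):
        """Return the cached decision, or None when there is no entry."""
        if key in self._entries:
            self._touched.add(key)
            return self._entries[key]
        return None

    def put(self, key, decision):
        self._entries[key] = decision
        self._touched.add(key)

    def finish_run(self, passed: bool) -> None:
        """On failure, drop every entry the run read or wrote."""
        if not passed:
            for key in self._touched:
                self._entries.pop(key, None)
        self._touched.clear()

    def __contains__(self, key) -> bool:
        return key in self._entries
```

Dropping only the touched entries means a failing scenario asks the AI again on its next run, while decisions cached for unrelated scenarios stay in place.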
What's Changed
- Add experimental methods for maintain ssot by @takahirom in #123
- Remove failed cache by @takahirom in #127
- Save API call jsonl and add feedback feature by @takahirom in #124
Full Changelog: 0.18.0...0.19.0
0.18.0
New Feature
You can add notes for other team members and rename scenario IDs to make the YAML more readable.
What's Changed
- Add note for humans by @takahirom in #121
- Make scenario id changeable by @takahirom in #122
Full Changelog: 0.17.0...0.18.0
0.17.0
Fix for Windows Compatibility Issues
Though I don't have a Windows environment for testing, the issue might be resolved now. Thank you for reporting this, @anunay1!
If you're still experiencing issues, please let me know.
#118
Bugfix
Fixed a bug where image assertions ran even when none were configured. They were triggered whenever the Agent completed a task, resulting in redundant executions. We've corrected this behavior and added test cases to verify the fix.
What's Changed
- Fix issue where image assertion runs even if image assertions are not set by @takahirom in #116
- Improve auto focus by @takahirom in #117
- Fix windows adb connection by @takahirom in #119
- Add perUserInstall for Windows by @takahirom in #120
Full Changelog: 0.16.0...0.17.0
0.16.0
New Feature: AI Decision-Making Cache
When the ViewTree structure and goal (prompt) are identical, the system can now utilize a cached memory of AI decisions. This addresses performance bottlenecks since AI decision-making is typically the most resource-intensive and time-consuming component. This is an experimental feature and is disabled by default.
- Launch app // ← This task can be cached when executing the "Open Member Page" scenario
- Open search
- Open member page
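A minimal Python sketch of such a cache is shown below. It assumes the cache key is a hash over the serialized ViewTree and the goal; the cache_key function, the DecisionCache class, and the ask_ai callback are hypothetical names for this example and are not Arbigent's API.

```python
import hashlib
from typing import Callable, Dict


def cache_key(view_tree: str, goal: str) -> str:
    """Derive a key that is equal only when both the tree and the goal match."""
    h = hashlib.sha256()
    h.update(view_tree.encode("utf-8"))
    h.update(b"\x00")  # separator so ("ab", "c") and ("a", "bc") differ
    h.update(goal.encode("utf-8"))
    return h.hexdigest()


class DecisionCache:
    """Reuse an earlier AI decision for an identical (view tree, goal) pair."""

    def __init__(self, enabled: bool = False):
        # Disabled by default, matching the experimental status of the feature.
        self.enabled = enabled
        self._store: Dict[str, str] = {}

    def decide(self, view_tree: str, goal: str,
               ask_ai: Callable[[str, str], str]) -> str:
        if not self.enabled:
            return ask_ai(view_tree, goal)
        key = cache_key(view_tree, goal)
        if key not in self._store:
            self._store[key] = ask_ai(view_tree, goal)  # slow path
        return self._store[key]
```

In the example above, the "Launch app" step would hit the cache on a later run as long as the screen and the goal are unchanged.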

Important bug fix
In version 0.15.0, the API key could be written to the log file unintentionally.
We've introduced a mechanism to replace the API key with placeholders and added a CI check to prevent potential API key leaks.
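For illustration, a placeholder replacement of this kind could look like the Python sketch below. It is an assumption-laden example, not the mechanism used in Arbigent: the redact_api_key function, the RedactingFilter class, and the <API_KEY> placeholder text are all made up here.

```python
import logging


def redact_api_key(message: str, api_key: str,
                   placeholder: str = "<API_KEY>") -> str:
    """Replace every occurrence of the key with a placeholder."""
    if not api_key:
        return message  # nothing to redact; avoids replacing the empty string
    return message.replace(api_key, placeholder)


class RedactingFilter(logging.Filter):
    """Logging filter that scrubs the key from each record before output."""

    def __init__(self, api_key: str, placeholder: str = "<API_KEY>"):
        super().__init__()
        self.api_key = api_key
        self.placeholder = placeholder

    def filter(self, record: logging.LogRecord) -> bool:
        # Format first so a key passed through args is also caught.
        record.msg = redact_api_key(record.getMessage(), self.api_key,
                                    self.placeholder)
        record.args = ()
        return True  # keep the record, now redacted
```

A CI check in the same spirit could scan generated log files for the known key and fail the build if it is found.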
What's Changed
- Refactor: Extract detectStuckScreen by @takahirom in #112
- Refactor: Make step suspend by @takahirom in #113
- Refactor: Make execute interceptable by @takahirom in #114
- Add AI decision cache by @takahirom in #115
Full Changelog: 0.15.0...0.16.0
[Deprecated] 0.15.0
In version 0.15.0, log files may contain an API key. We are actively addressing this issue.
UI Updates
We have several updates to the UI:
- Added launch app arguments
- Added a console for debugging
The scenario.cleanupData parameter in YAML is no longer used (though it currently does not cause errors). You can use the initialization methods instead.
CLI Updates
You can now use --scenario-id=foo to filter scenarios. You can also specify multiple scenarios using --scenario-id=foo,bar or --scenario-id=foo --scenario-id=bar.
Use --dry-run to preview which scenarios will run. This is particularly useful with the --shard option, as it might otherwise be difficult to determine target scenarios. (This is also used in Arbigent tests.)
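To make the two accepted forms of --scenario-id concrete, here is a small Python sketch using argparse. It only mimics the flags described above for illustration; it is not Arbigent's CLI parser, and the helper names parse_scenario_ids and select_scenarios are hypothetical.

```python
import argparse
from typing import List, Sequence, Tuple


def parse_scenario_ids(argv: Sequence[str]) -> Tuple[List[str], bool]:
    """Accept both --scenario-id=foo,bar and repeated --scenario-id flags."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--scenario-id", action="append", default=None)
    parser.add_argument("--dry-run", action="store_true")
    args = parser.parse_args(list(argv))
    values = args.scenario_id or []
    # Split each occurrence on commas and drop empty pieces.
    ids = [part for value in values for part in value.split(",") if part]
    return ids, args.dry_run


def select_scenarios(all_ids: Sequence[str], wanted: Sequence[str]) -> List[str]:
    """Keep project order; an empty filter selects every scenario."""
    if not wanted:
        return list(all_ids)
    wanted_set = set(wanted)
    return [s for s in all_ids if s in wanted_set]
```

With a dry run, a tool built this way would print the result of select_scenarios and exit without executing anything.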
What's Changed
- Refactor initialization methods and add wait option and arguments for… by @takahirom in #106
- Refactor console by @takahirom in #107
- Add instructions for installing apps from unidentified developers by @takahirom in #108
- Add scenario-id filter and dry-run by @takahirom in #109
- Add test for initialization methods by @takahirom in #110
- Remove confusing cleanup option by @takahirom in #111
Full Changelog: 0.14.0...0.15.0
0.14.0
Fix critical issues in UI
We encountered issues with Arbigent where device connectivity and API key input occasionally failed to save properly. These issues have now been resolved. Please test the updated version.
What's Changed
- Connect device only once on UI by @takahirom in #102
- Fix API key input bug by @takahirom in #103
Full Changelog: 0.13.0...0.14.0
0.13.0
Add shard option to enable parallel tests
You can split the tests into shards and run each shard separately with the --shard option. For example, to run the first of four shards:
arbigent --shard=1/4
cli-e2e-android:
  runs-on: ubuntu-latest
  strategy:
    fail-fast: false
    matrix:
      shardIndex: [ 1, 2, 3, 4 ]
      shardTotal: [ 4 ]
  steps:
    ...
    - name: CLI E2E test
      uses: reactivecircus/android-emulator-runner@v2
      ...
        script: |
          arbigent --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }} --os=android --project-file=sample-test/src/main/resources/projects/e2e-test-android.yaml --ai-type=gemini --gemini-model-name=gemini-2.0-flash-exp
    ...
    - uses: actions/upload-artifact@b4b15b8c7c6ac21ea08fcf65892d2ee8f75cf882 # v4
      if: ${{ always() }}
      with:
        name: cli-report-android-${{ matrix.shardIndex }}-${{ matrix.shardTotal }}
        path: |
          arbigent-result/*
        retention-days: 90
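To show how an index/total value such as 1/4 can map to a subset of scenarios, here is a Python sketch. The round-robin assignment is an assumption chosen for simplicity; Arbigent's actual distribution strategy may differ, and both function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def parse_shard(spec: str) -> Tuple[int, int]:
    """Parse "index/total" (1-based index), e.g. "1/4"."""
    index_text, total_text = spec.split("/")
    index, total = int(index_text), int(total_text)
    if total < 1 or not 1 <= index <= total:
        raise ValueError("invalid shard: " + spec)
    return index, total


def scenarios_for_shard(scenarios: Sequence[str], index: int,
                        total: int) -> List[str]:
    """Assumed round-robin split: scenario i goes to shard (i % total) + 1."""
    return [s for i, s in enumerate(scenarios) if i % total == index - 1]
```

Whatever the strategy, the shards should be disjoint and together cover every scenario, which is what makes the matrix job above equivalent to one full run.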
What's Changed
- Remove UI tree from the result. Because UI tree is too big by @takahirom in #100
- Add shard option to enable parallel execution by @takahirom in #99
- Add shard to README by @takahirom in #101
Full Changelog: 0.12.0...0.13.0
0.12.0
You can now see the Arbigent running status at the bottom of the screen.
What's Changed
- [Doc/Sample] Add now in android sample prompts by @takahirom in #95
- [UI] Add GlobalStatus for UI by @takahirom in #96
- [UI] Show device connecting message in UI by @takahirom in #97
- [Doc] Update yaml file by @takahirom in #98
Full Changelog: 0.11.0...0.12.0
0.11.0
New Feature: screen stuck detection
Identifies and recovers from situations where the AI agent gets stuck on the same screen, prompting it to reconsider its actions.
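One simple way to detect this situation is to count consecutive identical screens, as in the Python sketch below. This is an assumption about the general approach, not Arbigent's detectStuckScreen logic: the StuckScreenDetector class, the screen signature string, and the threshold are hypothetical.

```python
class StuckScreenDetector:
    """Flag the agent as stuck after N consecutive identical screens."""

    def __init__(self, threshold: int = 3):
        if threshold < 2:
            raise ValueError("threshold must be at least 2")
        self.threshold = threshold
        self._last = None
        self._count = 0

    def observe(self, screen_signature: str) -> bool:
        """Record one screen; return True once the screen has repeated enough."""
        if screen_signature == self._last:
            self._count += 1
        else:
            self._last = screen_signature
            self._count = 1
        return self._count >= self.threshold
```

When observe returns True, the caller could add a hint to the next prompt asking the AI to try a different action instead of repeating the last one.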
What's Changed
- [Web] Fix Web issue where Web can't find element that clicks by @takahirom in #89
- [UI] Improve window close logic by @takahirom in #90
- [Refactor] Refactor destructing by @takahirom in #91
- [New Feature] Add screen stuck detection by @takahirom in #92
- [Docs] Add SMURF and Stuck Screen Detection to README by @takahirom in #93
- [Docs] Refactor features of README by @takahirom in #94
Full Changelog: 0.10.1...0.11.0