feat(gitlab): incremental /review -i and /improve -i#2389
Conversation
When push_commands re-runs /review on every MR update, the full diff is re-evaluated and the persistent review comment is rewritten — losing the context of prior findings. The /review -i path already supported this on GitHub but was a no-op for GitLab. Mirror the GitHub provider's incremental flow for GitLab: - Locate the most recent prior review note (## PR Reviewer Guide). - Use its timestamp to partition MR commits into seen vs. new. - Use repository_compare(last_seen_sha, head_sha) to derive the diff restricted to commits added since the last review, anchoring file contents to last_seen_sha instead of the MR base. - Adapt GitLab commits / notes to the PyGithub-shaped attributes (`.sha`, `.commit.author.date`, `.created_at`, `.html_url`) consumed by the shared incremental code path in PRReviewer. Existing default push_commands are unchanged; users opt in by setting `push_commands = ["/describe", "/review -i"]` in their GitLab config.
Review Summary by QodoAdd GitLab incremental review support via /review -i
WalkthroughsDescription• Add incremental review support for GitLab via /review -i flag • Implement commit and note adapters to match PyGithub attribute shape • Restrict diff analysis to commits added since last review • Anchor file contents to last-seen commit instead of MR base • Add comprehensive unit tests for incremental review flow Diagramflowchart LR
A["MR Update"] --> B["get_incremental_commits"]
B --> C["Find Previous Review Note"]
C --> D{Review Found?}
D -->|No| E["Full Review"]
D -->|Yes| F["Partition Commits"]
F --> G["repository_compare<br/>last_seen_sha..head_sha"]
G --> H["Restrict to New Files"]
H --> I["Incremental Review"]
E --> J["get_diff_files"]
I --> J
J --> K["PR Review Output"]
File Changes1. pr_agent/git_providers/gitlab_provider.py
|
Code Review by Qodo
1.
|
- get_commit_range: skip commits whose timestamp fails to parse so they never become last_seen_commit. PRReviewer compares that field with `>` against a datetime and would raise TypeError if the date were None. - get_diff_files (incremental branch): run raw_changes through _expand_submodule_changes so GITLAB.EXPAND_SUBMODULE_DIFFS keeps working under /review -i, matching the full-review path. - get_diff_files (incremental branch): drop the unsafe diff_refs index fallback for head_sha; rely on _incremental_head_sha (always populated when incremental_active) and a defensive .get() for the rare callers that arrange state by hand. Tests cover both fixes: a dateless commit no longer anchors the incremental window, and _expand_submodule_changes is invoked on the incremental path.
|
Persistent review updated to latest commit acd1c17 |
`get_previous_review()` guarded the notes cache with `not getattr(self, '_incremental_notes_cache', None)`, which treats an empty list as "not cached" and re-fetches from GitLab on every call. Switch to `hasattr` so a legitimately empty notes list is preserved as a cached sentinel; the API call only happens once.
|
Persistent review updated to latest commit cd141c0 |
|
Update on the bot review feedback — for the maintainer's convenience: Resolved in
Resolved earlier in
Not addressing in this PR — flagging for follow-up:
Full unit-test suite stays green (351 tests, including the new GitLab incremental coverage). |
`_get_incremental_commits` previously returned early with `is_incremental=True` whenever `commits_range` came back empty, which conflated two distinct cases: - legitimately no new commits since the previous review (last_seen_commit set) — the reviewer should publish "Incremental Review Skipped"; - a broken timeline (previous review's timestamp didn't parse, or every post-review commit had an unparseable date) — `last_seen_commit` was never set, but the run still silently produced "no new files". Distinguish the two by checking whether `last_seen_commit` was anchored. When it wasn't, disable `is_incremental` so the run falls back to a full review instead of silently dropping it. Two new tests cover both failure modes (unparseable review timestamp and all-dateless commits).
|
Persistent review updated to latest commit 4e4b931 |
…ental-review # Conflicts: # tests/unittest/test_gitlab_provider.py
|
Persistent review updated to latest commit 71280ed |
| def __init__(self, gl_commit): | ||
| self._gl_commit = gl_commit | ||
| self.sha = getattr(gl_commit, 'id', None) | ||
| date = _parse_gitlab_iso_datetime( | ||
| getattr(gl_commit, 'committed_date', None) | ||
| or getattr(gl_commit, 'authored_date', None) | ||
| or getattr(gl_commit, 'created_at', None) | ||
| ) | ||
| self.commit = SimpleNamespace(author=SimpleNamespace(date=date)) |
There was a problem hiding this comment.
1. Commit shape mismatch 🐞 Bug ≡ Correctness
_GitlabIncrementalCommit only reads commit fields via attribute access (getattr), but GitLabProvider.get_commit_messages accesses commit data via dict indexing (commit['message']). If MR commits are dict-like (or only support item access), the adapter will yield sha/date as None and get_commit_range() will skip commits, breaking incremental anchoring and forcing incorrect fallback behavior.
Agent Prompt
## Issue description
`_GitlabIncrementalCommit` currently extracts `id` and timestamp fields via `getattr(...)` only. Elsewhere in the same provider, commit objects are treated as dict-like (`commit['message']`), meaning commit objects may not reliably expose attributes.
If commits are dict-like (or only implement item access), the incremental adapter will produce `sha=None` and `date=None`, causing `get_commit_range()` to skip commits (or fail to anchor correctly), which breaks incremental review behavior.
## Issue Context
Incremental review relies on `_GitlabIncrementalCommit.sha` and `_GitlabIncrementalCommit.commit.author.date` to determine `first_new_commit` / `last_seen_commit`.
## Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[58-75]
- Update `_GitlabIncrementalCommit` to read fields from both attribute-shaped and mapping/item-shaped commit objects (e.g., try `getattr`, then fallback to `gl_commit.get('field')` / `gl_commit['field']` when supported).
- Ensure `sha` and `date` are populated whenever the underlying commit object contains `id` and a date field, regardless of access style.
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
`/improve` was unaffected by the GitLab incremental work: each push to an MR re-scanned the full diff and re-posted the same `**Suggestion:** ...` notes on the same lines, producing duplicate inline comments. Witness note IDs 158625 / 158636 / 158637 on MR teamtravelata/travelata!1115 — three identical suggestions for the same `parseInt(...)` line across three push events. Mirror what `/review -i` does for incremental scope, but anchor on the last /improve output rather than the last /review: - GitLabProvider.get_incremental_commits gains a `kind="review"` kwarg. `kind="suggestions"` anchors on the most recent prior `## PR Code Suggestions ✨` summary OR inline `**Suggestion:**` note, whichever is newer. `kind="review"` keeps existing /review -i semantics. - Extracted shared `_find_anchor_note(prefixes)` helper so review and suggestions paths share the cache + sort logic. - PRCodeSuggestions parses `-i` exactly like PRReviewer.parse_incremental and activates `kind="suggestions"` incremental on the provider when the kwarg is supported. If the provider doesn't accept the kwarg, the flag degrades to a no-op (full /improve), so older providers keep working unchanged. - `run()` short-circuits with a log line when incremental is active but no files changed since the last suggestions pass, instead of falling through to a full /improve pass. Tests cover three new branches: anchoring on a suggestion note ahead of a later review note, fallback when no prior suggestion exists, and the default `kind="review"` still ignores suggestion notes.
When `git merge <target>` is run on the MR's source branch between two incremental passes, `repository_compare(last_seen_sha, head_sha)` walks through the merge commit and surfaces every file the target branch touched since the MR's merge-base — files that are NOT part of the MR's own contribution. Reproduction (teamtravelata/travelata!1115): 1. MR open on `test-pr`, only frontend JS files changed. 2. /improve and /review run, post initial suggestions/review. 3. User runs `git merge master`, which brings in CI config changes (notably `.gitlab-ci.yml` from !1116) via a merge commit. 4. Next /review -i posts a suggestion on `.gitlab-ci.yml`, a file the MR has never touched. `mr.changes()` is anchored on the MR's merge-base with target, so it correctly excludes target-side content. Intersect the file set from `repository_compare` with `mr.changes()` to drop these "phantom" files. Files where the MR also has its own changes still go through (with their full compare diff — partial overlap with target content is rare and benign next to the original bug).
|
Persistent review updated to latest commit 1fc55a1 |
Six findings from the latest bot review: 1. `_find_anchor_note` calls `body.startswith` without checking the body is a string — add an `isinstance(body, str)` guard. Paranoid but cheap. 2. Anchor selection was wrong when the newest matching note's timestamp couldn't be parsed: the local `sorted(...)` with a `datetime.min` fallback demoted the newest match below older parseable ones, so incremental runs anchored on an older note and re-reviewed already-reviewed commits. Drop the local sort entirely — GitLab returns notes in created_at-DESC order by default, so iterating in natural order picks the newest match. When that match has an unparseable timestamp, `_get_incremental_commits` already falls back to a full run via the `last_seen_commit is None` branch. 3. `_parse_gitlab_iso_datetime` used single-quoted string literals — switched to double quotes to match the repo's Ruff convention. 4. `_get_incremental_commits` extracted `new_path` via dict access only; non-dict diff entries (a stricter library or a stubbed client) were silently dropped. Add a `getattr(diff, 'new_path', None)` fallback so the object-shape branch isn't dead code. 5. `pr_code_suggestions.py` import line wasn't isort-clean after adding `IncrementalPR` — reorder to `GitProvider, IncrementalPR, get_main_pr_language`. 6. `PRCodeSuggestions.__init__` called `get_main_pr_language(get_languages(), get_files())` before `run()` short-circuited the no-op incremental path. For GitLab, `get_files()` falls back to a full `mr.changes()` call when `unreviewed_files_set` is empty — that's a wasted API round-trip on every push that finds nothing new to review. Set `self._incremental_empty_scope = True` and return early from `__init__`; `run()` short-circuits on the same flag. Bot finding The-PR-Agent#2 ("Commit shape mismatch") was a false positive — python- gitlab's `RESTObject` supports both attribute and item access on the same object, so the inconsistency the bot flagged doesn't manifest in practice.
|
Persistent review updated to latest commit a534a03 |
|
A quick note on two findings I deliberately left alone in #2 "Commit shape mismatch" — false positive in my read. The bot flagged that #14 "No-new-commits fetches full diff" — valid finding, deliberately deferred. The pattern (falsy check on
Doing only (1) here would either lose the user-facing skip message on GitLab or — worse — change GitHub's behaviour as a side-effect. Happy to open a dedicated follow-up PR for it if the maintainer wants that scope; it didn't feel right to bolt onto this one. (For the |
Summary
When
push_commandsre-runs the GitLab CI ai-reviewer on every MR update, both/reviewand/improvepreviously re-ran on the full MR diff, re-publishing the same findings every push:/reviewrewrote the persistent## PR Reviewer Guidecomment, losing prior context./improvere-posted identical**Suggestion:** ...inline notes on the same lines (witnessed in our production: three duplicateparseIntsuggestions for the same code across three pushes).This PR adds GitLab incremental support for both tools, opt-in via the
-iflag (matching the existing GitHub/review -ipattern). The full flow has been hardened across multiple iterations of code-review feedback.What's in scope
/review -ion GitLabMirrors
GithubProvider.get_incremental_commits:## PR Reviewer Guide/## Incremental PR Reviewer Guidenote.repository_compare(last_seen_sha, head_sha)to derive the diff restricted to new commits.last_seen_sha.PRReviewer./improve -ion GitLabReuses the same incremental scope plumbing with a different anchor:
GitLabProvider.get_incremental_commitsnow accepts akind="review"|"suggestions"kwarg (defaults to"review", so existing callers are unchanged).kind="suggestions"anchors on the latest prior## PR Code Suggestions ✨summary OR inline**Suggestion:**note.PRCodeSuggestionsparses-ilikePRReviewer.parse_incrementaland activates the suggestions-scoped incremental mode if the provider supports it; otherwise it gracefully falls back to a full pass.Robustness
Three rounds of code-review feedback addressed:
get_commit_rangeignores commits whose timestamps fail to parse instead of letting them becomelast_seen_commit, which would laterTypeErrorwhen compared to a datetime._find_anchor_noteuseshasattrinstead of a falsy check, so MRs with no notes don't refetch on every call._expand_submodule_changesruns on the incremental path too, matching full-review behaviour underGITLAB.EXPAND_SUBMODULE_DIFFS.commits_rangeis empty AND nolast_seen_commitwas anchored (unparseable review timestamp, all-dateless commits), the run now falls back to a full review instead of silently surfacing "Incremental Review Skipped".git merge <target>on the MR branch,repository_compare(last_seen, head)walks through the merge commit and surfaces files that the target branch touched — files NOT part of the MR's actual contribution. The incremental scope now intersects withmr.changes()(which is anchored on the MR's merge-base with target) to drop these "phantom" files.Out of scope
GithubProvider.get_files()(same falsy check onunreviewed_files_set). Fixing it cleanly without UX regression requires changes in both providers andPRReviewer.run(). Happy to address in a follow-up PR if maintainers want.Configuration
After this lands, GitLab users opt in by setting their CI/push commands:
Or in CI scripts:
Defaults stay unchanged: existing users keep the current full-review behaviour and opt in explicitly.
Test plan