
feat(gitlab): incremental /review -i and /improve -i#2389

Open
sizickp wants to merge 8 commits into The-PR-Agent:main from sizickp:feature/gitlab-incremental-review


sizickp commented May 14, 2026

Summary

When push_commands re-runs the GitLab CI ai-reviewer on every MR update, both /review and /improve previously re-ran on the full MR diff and re-published the same findings on every push:

  • /review rewrote the persistent ## PR Reviewer Guide comment, losing prior context.
  • /improve re-posted identical **Suggestion:** ... inline notes on the same lines (observed in our production setup: three duplicate parseInt suggestions for the same code across three pushes).

This PR adds GitLab incremental support for both tools, opt-in via the -i flag (matching the existing GitHub /review -i pattern). The full flow has been hardened across multiple iterations of code-review feedback.

What's in scope

/review -i on GitLab

Mirrors GithubProvider.get_incremental_commits:

  • Locates the most recent prior ## PR Reviewer Guide / ## Incremental PR Reviewer Guide note.
  • Partitions MR commits into seen vs. new by note timestamp.
  • Uses repository_compare(last_seen_sha, head_sha) to derive the diff restricted to new commits.
  • Anchors file contents to last_seen_sha.
  • Adapts GitLab commits/notes to PyGithub-shaped attributes consumed by the shared incremental code path in PRReviewer.
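
For illustration, the seen/new partition can be sketched as below. This is a simplified sketch and not the PR's code: it assumes commits are plain `(sha, committed_at)` tuples ordered oldest first with naive UTC datetimes, and `partition_commits` is a hypothetical name.

```python
from datetime import datetime


def partition_commits(commits, last_review_time):
    """Split (sha, committed_at) pairs, oldest first, into seen and new lists."""
    seen = [c for c in commits if c[1] <= last_review_time]
    new = [c for c in commits if c[1] > last_review_time]
    return seen, new


commits = [
    ("a1", datetime(2026, 5, 14, 9, 0)),
    ("b2", datetime(2026, 5, 14, 10, 0)),
    ("c3", datetime(2026, 5, 14, 12, 0)),
]
# Timestamp of the prior review note (assumed already parsed).
seen, new = partition_commits(commits, datetime(2026, 5, 14, 11, 0))

# The compare call would then run from the last seen commit to the MR head.
last_seen_sha = seen[-1][0] if seen else None
head_sha = commits[-1][0]
```

With this input the compare range would be `b2..c3`, so only the commit pushed after the review note contributes to the incremental diff.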

/improve -i on GitLab

Reuses the same incremental scope plumbing with a different anchor:

  • GitLabProvider.get_incremental_commits now accepts a kind="review"|"suggestions" kwarg (defaults to "review", so existing callers are unchanged).
  • kind="suggestions" anchors on the latest prior ## PR Code Suggestions ✨ summary OR inline **Suggestion:** note.
  • PRCodeSuggestions parses -i like PRReviewer.parse_incremental and activates the suggestions-scoped incremental mode if the provider supports it; otherwise it gracefully falls back to a full pass.
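
The `kind` switch boils down to choosing which note prefixes count as an anchor. A minimal sketch, assuming a lookup table keyed by `kind`; `ANCHOR_PREFIXES` and `anchor_prefixes` are hypothetical names, and the suggestions heading is matched without its trailing emoji:

```python
# Hypothetical lookup table; the prefix strings are the ones named in this description.
ANCHOR_PREFIXES = {
    "review": ("## PR Reviewer Guide", "## Incremental PR Reviewer Guide"),
    "suggestions": ("## PR Code Suggestions", "**Suggestion:**"),
}


def anchor_prefixes(kind="review"):
    """Return the note prefixes that qualify as an anchor for the given tool."""
    try:
        return ANCHOR_PREFIXES[kind]
    except KeyError:
        raise ValueError(f"unsupported incremental kind: {kind!r}") from None


# str.startswith accepts a tuple, so one call checks every prefix.
is_anchor = "**Suggestion:** use int()".startswith(anchor_prefixes("suggestions"))
```

Defaulting `kind` to `"review"` is what keeps existing callers unchanged.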

Robustness

Three rounds of code-review feedback addressed:

  • Dateless commits skipped: get_commit_range ignores commits whose timestamps fail to parse instead of letting them become last_seen_commit, which would later raise a TypeError when compared to a datetime.
  • Empty notes cache stabilises: _find_anchor_note uses hasattr instead of a falsy check, so MRs with no notes don't refetch on every call.
  • Submodule expansion: _expand_submodule_changes runs on the incremental path too, matching full-review behaviour under GITLAB.EXPAND_SUBMODULE_DIFFS.
  • Broken-timeline fallback: when commits_range is empty AND no last_seen_commit was anchored (unparseable review timestamp, all-dateless commits), the run now falls back to a full review instead of silently surfacing "Incremental Review Skipped".
  • Target-branch merge exclusion (most recent, important): when a user runs git merge <target> on the MR branch, repository_compare(last_seen, head) walks through the merge commit and surfaces files that the target branch touched — files NOT part of the MR's actual contribution. The incremental scope now intersects with mr.changes() (which is anchored on the MR's merge-base with target) to drop these "phantom" files.
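
The target-branch merge exclusion amounts to an intersection on file paths. A minimal sketch, assuming both inputs are lists of dicts carrying a `new_path` key; `drop_phantom_files` is a hypothetical name, not the helper in this PR:

```python
def drop_phantom_files(compare_diffs, mr_changes):
    """Keep only incremental diffs whose path is also part of the MR's own changes."""
    mr_paths = {c.get("new_path") for c in mr_changes if c.get("new_path")}
    return {
        d["new_path"]: d
        for d in compare_diffs
        if d.get("new_path") in mr_paths
    }


# Result of comparing last_seen..head after a `git merge <target>` on the MR branch.
compare_diffs = [
    {"new_path": "src/feature.py", "diff": "..."},
    {"new_path": "docs/from_target.md", "diff": "..."},  # came in via the merge
]
# Files the MR actually contributes relative to its merge-base.
mr_changes = [{"new_path": "src/feature.py"}, {"new_path": "src/older_file.py"}]

scoped = drop_phantom_files(compare_diffs, mr_changes)
```

Here `docs/from_target.md` is dropped because it never appears in the MR's own change list, while `src/older_file.py` stays out of scope because it was not touched since the last review.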

Out of scope

  • No-new-commits fetches full diff — a pre-existing pattern shared with GithubProvider.get_files() (same falsy check on unreviewed_files_set). Fixing it cleanly without UX regression requires changes in both providers and PRReviewer.run(). Happy to address in a follow-up PR if maintainers want.

Configuration

After this lands, GitLab users opt in by setting their CI/push commands:

[gitlab]
push_commands = ["/describe", "/review -i", "/improve -i"]

Or in CI scripts:

python pr_agent/cli.py --pr_url "${PR_URL}" review -i
python pr_agent/cli.py --pr_url "${PR_URL}" improve -i

Defaults stay unchanged: existing users keep the current full-review behaviour and opt in explicitly.

Test plan

  • Full unit suite green: 416 tests passing locally (47 GitLab-specific, including 16 new ones covering the incremental code paths).
  • Live verification on GitLab: the entire flow has been driving an internal pr-agent fork against multiple test MRs. Every behavioural case in this PR description (anchor-on-suggestion-note, no-new-commits skip, master-merge phantom-file exclusion) has been observed end-to-end against a self-hosted GitLab.

When push_commands re-runs /review on every MR update, the full diff is
re-evaluated and the persistent review comment is rewritten — losing the
context of prior findings. The /review -i path already supported this on
GitHub but was a no-op for GitLab.

Mirror the GitHub provider's incremental flow for GitLab:
- Locate the most recent prior review note (## PR Reviewer Guide).
- Use its timestamp to partition MR commits into seen vs. new.
- Use repository_compare(last_seen_sha, head_sha) to derive the diff
  restricted to commits added since the last review, anchoring file
  contents to last_seen_sha instead of the MR base.
- Adapt GitLab commits / notes to the PyGithub-shaped attributes
  (`.sha`, `.commit.author.date`, `.created_at`, `.html_url`) consumed by
  the shared incremental code path in PRReviewer.

Existing default push_commands are unchanged; users opt in by setting
`push_commands = ["/describe", "/review -i"]` in their GitLab config.
@github-actions github-actions Bot added the feature 💡 label May 14, 2026
@qodo-free-for-open-source-projects
Contributor

Review Summary by Qodo

Add GitLab incremental review support via /review -i

✨ Enhancement

Walkthroughs

Description
• Add incremental review support for GitLab via /review -i flag
• Implement commit and note adapters to match PyGithub attribute shape
• Restrict diff analysis to commits added since last review
• Anchor file contents to last-seen commit instead of MR base
• Add comprehensive unit tests for incremental review flow
Diagram
flowchart LR
  A["MR Update"] --> B["get_incremental_commits"]
  B --> C["Find Previous Review Note"]
  C --> D{Review Found?}
  D -->|No| E["Full Review"]
  D -->|Yes| F["Partition Commits"]
  F --> G["repository_compare<br/>last_seen_sha..head_sha"]
  G --> H["Restrict to New Files"]
  H --> I["Incremental Review"]
  E --> J["get_diff_files"]
  I --> J
  J --> K["PR Review Output"]

File Changes

1. pr_agent/git_providers/gitlab_provider.py ✨ Enhancement +196/-7

Implement GitLab incremental review with commit/note adapters

• Add _parse_gitlab_iso_datetime() helper to normalize GitLab ISO 8601 timestamps to naive UTC
• Implement _GitlabIncrementalCommit adapter exposing .sha and .commit.author.date attributes
• Implement _GitlabIncrementalNote adapter exposing .created_at and .html_url attributes
• Add get_incremental_commits() method to populate incremental review state
• Add _get_incremental_commits() to locate previous review, partition commits, and compute
 incremental diff via repository_compare()
• Add get_commit_range() to identify first new commit after previous review timestamp
• Add get_previous_review() to find most recent matching review note by prefix
• Modify get_diff_files() to use incremental file set and anchor content to last-seen commit when
 active
• Modify get_files() to return incremental file set when active


2. tests/unittest/test_gitlab_provider.py 🧪 Tests +202/-1

Add comprehensive incremental review unit tests

• Add parametrized tests for _parse_gitlab_iso_datetime() covering Z suffix, timezone offsets, and
 edge cases
• Add tests for _GitlabIncrementalCommit adapter shape and date fallback logic
• Add tests for _GitlabIncrementalNote adapter HTML URL construction
• Add integration tests for get_incremental_commits() covering no-review fallback, commit
 partitioning, empty file sets, and missing anchor commits
• Add tests for get_files() and get_previous_review() behavior in incremental mode


qodo-free-for-open-source-projects Bot commented May 14, 2026

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (0)


Action required

1. note.body.startswith no type check ✓ Resolved 📘 Rule violation ☼ Reliability
Description
_find_anchor_note() calls note.body.startswith(...) without verifying note.body is a string,
which can raise AttributeError if GitLab returns an unexpected type and break incremental runs.
Code

pr_agent/git_providers/gitlab_provider.py[R599-601]

+        for note in notes_sorted:
+            if any(note.body.startswith(prefix) for prefix in prefixes):
+                return _GitlabIncrementalNote(note, mr_web_url=mr_web_url)
Evidence
PR Compliance ID 19 requires defensive access for optional/variable external structures. The new
_find_anchor_note() logic filters on truthiness but still calls .startswith() on note.body
without ensuring it is a str.

pr_agent/git_providers/gitlab_provider.py[589-602]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`_find_anchor_note()` assumes `note.body` is a string and calls `.startswith()` directly. If GitLab returns `None` or a non-string value, incremental review can crash.
## Issue Context
Notes are external/provider objects with variable schemas; boundary code should defensively check types before calling string methods.
## Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[589-602]
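
One possible shape for the defensive check, as a sketch only: `body_matches` is a hypothetical helper, and the note objects are stood in for by `SimpleNamespace`.

```python
from types import SimpleNamespace


def body_matches(note, prefixes):
    """True only when note.body is a string that starts with one of the prefixes."""
    body = getattr(note, "body", None)
    # isinstance guards against None and any other non-string payload.
    return isinstance(body, str) and body.startswith(tuple(prefixes))


prefixes = ["## PR Reviewer Guide", "## Incremental PR Reviewer Guide"]
notes = [
    SimpleNamespace(body=None),
    SimpleNamespace(body=12345),
    SimpleNamespace(body="## PR Reviewer Guide\n..."),
]
matches = [body_matches(n, prefixes) for n in notes]
```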



2. Commit shape mismatch 🐞 Bug ≡ Correctness
Description
_GitlabIncrementalCommit only reads commit fields via attribute access (getattr), but
GitLabProvider.get_commit_messages accesses commit data via dict indexing (commit['message']). If MR
commits are dict-like (or only support item access), the adapter will yield sha/date as None and
get_commit_range() will skip commits, breaking incremental anchoring and forcing incorrect fallback
behavior.
Code

pr_agent/git_providers/gitlab_provider.py[R66-74]

+    def __init__(self, gl_commit):
+        self._gl_commit = gl_commit
+        self.sha = getattr(gl_commit, 'id', None)
+        date = _parse_gitlab_iso_datetime(
+            getattr(gl_commit, 'committed_date', None)
+            or getattr(gl_commit, 'authored_date', None)
+            or getattr(gl_commit, 'created_at', None)
+        )
+        self.commit = SimpleNamespace(author=SimpleNamespace(date=date))
Evidence
The incremental adapter is implemented using attribute access for id and date fields, while the
same provider’s existing get_commit_messages() implementation indexes commits as dictionaries.
This inconsistency means at least one of these code paths will mis-handle the commit object shape,
and incremental anchoring can fail when commits are dict-like.

pr_agent/git_providers/gitlab_provider.py[58-74]
pr_agent/git_providers/gitlab_provider.py[1118-1129]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`_GitlabIncrementalCommit` currently extracts `id` and timestamp fields via `getattr(...)` only. Elsewhere in the same provider, commit objects are treated as dict-like (`commit['message']`), meaning commit objects may not reliably expose attributes.
If commits are dict-like (or only implement item access), the incremental adapter will produce `sha=None` and `date=None`, causing `get_commit_range()` to skip commits (or fail to anchor correctly), which breaks incremental review behavior.
## Issue Context
Incremental review relies on `_GitlabIncrementalCommit.sha` and `_GitlabIncrementalCommit.commit.author.date` to determine `first_new_commit` / `last_seen_commit`.
## Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[58-75]
- Update `_GitlabIncrementalCommit` to read fields from both attribute-shaped and mapping/item-shaped commit objects (e.g., try `getattr`, then fallback to `gl_commit.get('field')` / `gl_commit['field']` when supported).
- Ensure `sha` and `date` are populated whenever the underlying commit object contains `id` and a date field, regardless of access style.
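
A sketch of the dual-access idea, under the assumption that a commit may arrive either as a mapping or as an attribute-style object. `read_field` and `first_present` are hypothetical helpers, and the date strings are left unparsed here.

```python
from collections.abc import Mapping
from types import SimpleNamespace


def read_field(obj, name, default=None):
    """Read a field from either a mapping or an attribute-style object."""
    if isinstance(obj, Mapping):
        return obj.get(name, default)
    return getattr(obj, name, default)


def first_present(obj, *names):
    """Return the first truthy field among names, or None if none is set."""
    for name in names:
        value = read_field(obj, name)
        if value:
            return value
    return None


DATE_FIELDS = ("committed_date", "authored_date", "created_at")
dict_commit = {"id": "abc", "authored_date": "2026-05-14T10:00:00Z"}
obj_commit = SimpleNamespace(id="def", committed_date="2026-05-14T11:00:00Z")

dict_sha = read_field(dict_commit, "id")
dict_date = first_present(dict_commit, *DATE_FIELDS)
obj_date = first_present(obj_commit, *DATE_FIELDS)
```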



3. None commit date crashes ✓ Resolved 🐞 Bug ☼ Reliability
Description
GitLabProvider.get_commit_range can set incremental.last_seen_commit even when the adapted commit
timestamp is None, and PRReviewer then performs a > comparison against a datetime which can
raise TypeError and abort /review -i. This can happen when GitLab returns an unexpected/empty
timestamp and _parse_gitlab_iso_datetime returns None.
Code

pr_agent/git_providers/gitlab_provider.py[R482-490]

+        for index in range(len(self.mr_commits) - 1, -1, -1):
+            adapter = _GitlabIncrementalCommit(self.mr_commits[index])
+            commit_time = adapter.commit.author.date
+            if commit_time is not None and commit_time > last_review_time:
+                self.incremental.first_new_commit = adapter
+                first_new_commit_index = index
+            else:
+                self.incremental.last_seen_commit = adapter
+                break
Evidence
The GitLab adapter can legally produce None timestamps, and get_commit_range() currently stores
such a commit as last_seen_commit. PRReviewer then uses last_seen_commit.commit.author.date in a
> comparison without checking for None, which will raise TypeError when the stored date is
None.

pr_agent/git_providers/gitlab_provider.py[33-55]
pr_agent/git_providers/gitlab_provider.py[477-491]
pr_agent/tools/pr_reviewer.py[328-353]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`get_commit_range()` may assign `self.incremental.last_seen_commit` to a commit whose adapted `commit.author.date` is `None`. Later, `PRReviewer._can_run_incremental_review()` compares that `None` value to a datetime (`last_seen_commit_date > recent_commits_threshold`), which can raise `TypeError` and break incremental reviews.
## Issue Context
This occurs when `_parse_gitlab_iso_datetime()` fails to parse a GitLab timestamp (returns `None`), which is then propagated into `_GitlabIncrementalCommit(...).commit.author.date`.
## Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[33-55]
- pr_agent/git_providers/gitlab_provider.py[477-491]
## Suggested fix
- In `GitLabProvider.get_commit_range()`, treat `commit_time is None` as an incremental-fallback condition:
- either skip that commit and continue scanning older commits (do not set `last_seen_commit`), or
- explicitly disable incremental (`self.incremental.is_incremental = False`) and return an empty range so the caller falls back to a full review.
- Ensure `self.incremental.last_seen_commit` is only set when `adapter.commit.author.date` is a real `datetime` (never `None`).




Remediation recommended

4. Unsorted git_provider imports ✓ Resolved 📘 Rule violation ⚙ Maintainability
Description
The from pr_agent.git_providers.git_provider import ... names are not ordered as Ruff/isort
expects, which can introduce lint failures and inconsistent formatting.
Code

pr_agent/tools/pr_code_suggestions.py[27]

+from pr_agent.git_providers.git_provider import IncrementalPR, get_main_pr_language, GitProvider
Evidence
PR Compliance ID 10 requires new Python code to conform to Ruff/isort style. The modified import
line adds IncrementalPR but leaves the imported names in a non-isort order.

AGENTS.md
pr_agent/tools/pr_code_suggestions.py[24-28]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The import list from `pr_agent.git_providers.git_provider` is not in isort/Ruff order.
## Issue Context
The repo enables Ruff isort checks (`I001`), so unsorted imports can fail linting and add noise to future diffs.
## Fix Focus Areas
- pr_agent/tools/pr_code_suggestions.py[24-28]



5. Newest anchor note skipped ✓ Resolved 🐞 Bug ≡ Correctness
Description
GitLabProvider._find_anchor_note() sorts notes by parsed created_at and pushes unparseable
timestamps to the end, so if the newest matching anchor note has an unparseable created_at but an
older matching note parses, the older note will be selected as the anchor. This can make
`/review -i` or `/improve -i` re-process commits that were already covered by the newest (but unparseable)
anchor note instead of safely falling back.
Code

pr_agent/git_providers/gitlab_provider.py[R590-601]

+        notes_sorted = sorted(
+            (n for n in self._incremental_notes_cache if getattr(n, 'body', None)),
+            key=lambda n: (
+                _parse_gitlab_iso_datetime(getattr(n, 'created_at', None))
+                or datetime.min
+            ),
+            reverse=True,
+        )
+        mr_web_url = getattr(self.mr, 'web_url', None)
+        for note in notes_sorted:
+            if any(note.body.startswith(prefix) for prefix in prefixes):
+                return _GitlabIncrementalNote(note, mr_web_url=mr_web_url)
Evidence
_find_anchor_note() sorts notes by _parse_gitlab_iso_datetime(created_at) and substitutes
datetime.min when parsing fails, which demotes unparseable timestamps (even if the note is newest)
and can cause an older matching note to be chosen. Incremental anchoring then uses
previous_review.created_at in get_commit_range(), so selecting an older note changes the commit
partition.

pr_agent/git_providers/gitlab_provider.py[538-560]
pr_agent/git_providers/gitlab_provider.py[572-602]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`_find_anchor_note()` sorts matching notes by parsed timestamps and uses `datetime.min` for unparseable timestamps, which can cause the newest matching note (by GitLab order) to be ignored in favor of an older parseable note. This changes the incremental anchor point and can re-review already-reviewed commits.
## Issue Context
GitLab note listing is already newest-first; incremental logic should anchor to the newest matching note and if that note’s timestamp is unparseable, treat it as “cannot anchor” (fall back to full) rather than silently anchoring on an older note.
## Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[572-602]
## Suggested approach
- Avoid sorting; iterate notes in the order returned by GitLab (newest-first) and return the first note whose body starts with any prefix.
- If that returned note’s `created_at` parses to `None`, let higher-level logic fall back (or explicitly disable incremental) rather than selecting an older anchor.
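
A sketch of the no-sort variant, assuming the notes list really is newest first as stated above. `find_anchor_note` is a hypothetical standalone function and the notes are `SimpleNamespace` stand-ins.

```python
from types import SimpleNamespace


def find_anchor_note(notes_newest_first, prefixes):
    """Return the first (newest) note whose body starts with any prefix."""
    for note in notes_newest_first:
        body = getattr(note, "body", None)
        if isinstance(body, str) and body.startswith(tuple(prefixes)):
            return note
    return None


notes = [
    SimpleNamespace(body="LGTM", created_at="2026-05-14T13:00:00Z"),
    # Newest matching note, but with a timestamp that will not parse.
    SimpleNamespace(body="## PR Reviewer Guide v2", created_at="not-a-date"),
    SimpleNamespace(body="## PR Reviewer Guide v1", created_at="2026-05-14T09:00:00Z"),
]
anchor = find_anchor_note(notes, ["## PR Reviewer Guide"])
```

The newest matching note wins even though its timestamp is unusable; the caller is then expected to treat that as "cannot anchor" and run a full pass rather than anchoring on the older note.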



6. Improve -i prefetches APIs ✓ Resolved 🐞 Bug ➹ Performance
Description
PRCodeSuggestions.__init__() calls get_languages() and get_files() (via
get_main_pr_language) before run() short-circuits no-op incremental /improve -i runs with an
empty unreviewed_files_set. On GitLab, get_files() falls back to mr.changes() when
unreviewed_files_set is empty, causing extra GitLab API calls for runs that will immediately skip.
Code

pr_agent/tools/pr_code_suggestions.py[R38-56]

      self.git_provider = get_git_provider_with_context(pr_url)
+        self.args = args
+        self.incremental = self._parse_incremental(args)
+        # When invoked as `/improve -i`, narrow `git_provider.get_diff_files()` to the files
+        # changed since the previous suggestions pass. Falls back to full when the provider
+        # doesn't support incremental scope or no prior suggestion comment exists.
+        if self.incremental.is_incremental and hasattr(self.git_provider, "get_incremental_commits"):
+            try:
+                self.git_provider.get_incremental_commits(self.incremental, kind="suggestions")
+            except TypeError:
+                # Older provider signature without the `kind` kwarg — skip incremental scope.
+                get_logger().info(
+                    "Provider does not support kind-based incremental commits; "
+                    "running /improve on the full MR diff"
+                )
+                self.incremental = IncrementalPR(False)
      self.main_language = get_main_pr_language(
          self.git_provider.get_languages(), self.git_provider.get_files()
      )
Evidence
__init__ always computes main_language using get_languages()/get_files() before run()
checks and skips incremental no-op runs. GitLab’s get_files() uses mr.changes() unless
unreviewed_files_set is truthy, so empty incremental scope still triggers the full MR file fetch.

pr_agent/tools/pr_code_suggestions.py[34-56]
pr_agent/tools/pr_code_suggestions.py[114-127]
pr_agent/git_providers/gitlab_provider.py[752-761]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`PRCodeSuggestions` computes `main_language` in `__init__` (calling provider `get_languages()` and `get_files()`) even for incremental `/improve -i` runs that will later return early because `unreviewed_files_set` is empty. This adds avoidable API calls/latency on the common “no new files” incremental path.
## Issue Context
The early-return check lives in `run()`, but the expensive provider calls happen earlier in `__init__`. For GitLab, `get_files()` will call `mr.changes()` unless `unreviewed_files_set` is truthy.
## Fix Focus Areas
- pr_agent/tools/pr_code_suggestions.py[34-56]
- pr_agent/tools/pr_code_suggestions.py[114-127]
## Suggested approach
- Move `main_language = get_main_pr_language(...)` (and any other provider-heavy initialization that depends on it) to after the early-return check in `run()`, or
- If incremental is enabled and the provider exposes `unreviewed_files_set`, perform the “empty scope” check immediately after `get_incremental_commits()` in `__init__` and skip initializing language-related state.
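
The first option (deferring language detection until after the early-return check) could look roughly like this. All names here (`FakeProvider`, `SuggestionsRun`, `detect_language`) are hypothetical stand-ins, not pr-agent classes.

```python
def detect_language(files):
    # Hypothetical stand-in for get_main_pr_language(): extension of the first path.
    return files[0].rsplit(".", 1)[-1] if files else None


class FakeProvider:
    """Counts get_files() calls so avoided API traffic is observable."""

    def __init__(self, unreviewed_files):
        self.unreviewed_files_set = unreviewed_files
        self.calls = 0

    def get_files(self):
        self.calls += 1
        return list(self.unreviewed_files_set)


class SuggestionsRun:
    def __init__(self, provider, is_incremental):
        self.provider = provider
        self.is_incremental = is_incremental
        self.main_language = None  # no provider calls in __init__

    def run(self):
        # Empty incremental scope: skip before touching the provider.
        if self.is_incremental and not self.provider.unreviewed_files_set:
            return "skipped"
        self.main_language = detect_language(self.provider.get_files())
        return "ran"


idle = FakeProvider({})
idle_result = SuggestionsRun(idle, is_incremental=True).run()

busy = FakeProvider({"src/app.py": {}})
busy_run = SuggestionsRun(busy, is_incremental=True)
busy_result = busy_run.run()
```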



7. Single quotes in ISO parser ✓ Resolved 📘 Rule violation ⚙ Maintainability
Description
New Python code uses single-quoted string literals (e.g., value.replace('Z', '+00:00')), which
conflicts with the repo’s Ruff convention requiring double quotes. This may introduce lint failures
and inconsistent formatting across the codebase.
Code

pr_agent/git_providers/gitlab_provider.py[R49-53]

+        s = value.replace('Z', '+00:00')
+        dt = datetime.fromisoformat(s)
+        if dt.tzinfo is not None:
+            dt = dt.astimezone(timezone.utc).replace(tzinfo=None)
+        return dt
Evidence
PR Compliance ID 10 requires Ruff conventions including double quotes. The added ISO datetime
parsing logic introduces single-quoted string literals on the cited lines.

AGENTS.md
pr_agent/git_providers/gitlab_provider.py[49-53]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
New/modified Python lines use single-quoted strings, but the compliance checklist requires Ruff-style double quotes.
## Issue Context
The repository style requirement (Ruff) calls for double quotes; new additions should follow it to avoid lint/style drift.
## Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[49-53]
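
For reference, a double-quoted version of the cited lines wrapped into a standalone function. The input guard and the `try`/`except` are assumptions about the surrounding code, which is not shown in the snippet above.

```python
from datetime import datetime, timezone


def parse_gitlab_iso_datetime(value):
    """Parse an ISO 8601 timestamp into a naive UTC datetime, or None on failure."""
    if not isinstance(value, str) or not value:
        return None  # assumed guard, not part of the cited lines
    try:
        dt = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return None  # assumed fallback for unexpected formats
    if dt.tzinfo is not None:
        # Convert to UTC first, then drop tzinfo so comparisons stay naive.
        dt = dt.astimezone(timezone.utc).replace(tzinfo=None)
    return dt


zulu = parse_gitlab_iso_datetime("2026-05-14T10:00:00Z")
offset = parse_gitlab_iso_datetime("2026-05-14T12:00:00+02:00")
```

Both inputs normalize to the same naive UTC value, which is what makes commit and note timestamps directly comparable.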



8. Non-dict diffs dropped ✓ Resolved 🐞 Bug ≡ Correctness
Description
GitLabProvider._get_incremental_commits() supports repository_compare returning a non-dict
result, but then only extracts new_path when each diff item is a dict. If diffs contains
object-shaped entries, unreviewed_files_set will stay empty and incremental behavior can degrade
(incorrect fallback/skip).
Code

pr_agent/git_providers/gitlab_provider.py[R481-490]

+        if isinstance(compare_result, dict):
+            diffs = compare_result.get('diffs', []) or []
+        else:
+            diffs = getattr(compare_result, 'diffs', []) or []
+
+        for diff in diffs:
+            new_path = diff.get('new_path') if isinstance(diff, dict) else None
+            if new_path:
+                self.unreviewed_files_set[new_path] = diff
+
Evidence
The code explicitly anticipates a non-dict compare_result (`getattr(compare_result, 'diffs', ...)`)
but then ignores any non-dict diff items by setting new_path=None unless `diff` is a
dict, preventing incremental file collection.

pr_agent/git_providers/gitlab_provider.py[481-490]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`_get_incremental_commits()` allows `compare_result` to be either a `dict` or an object, but the subsequent diff loop only supports dict-shaped diff entries. If `compare_result.diffs` yields objects (or any non-dict mapping), the code will not extract `new_path` and will silently drop those diffs.
## Issue Context
This happens in the incremental-review diff collection path after `project.repository_compare(last_seen_sha, head_sha)`.
## Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[481-490]
## Suggested fix
In the `for diff in diffs:` loop, extract `new_path` (and store the diff) for both dict and object shapes, e.g.:
- `new_path = diff.get('new_path')` when `isinstance(diff, dict)`
- otherwise `new_path = getattr(diff, 'new_path', None)`
Optionally normalize stored diff entries to a dict-like form if later code assumes dict indexing.
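
The suggested loop, sketched as a standalone function. `collect_unreviewed` is a hypothetical name, and object-shaped diffs are simulated with `SimpleNamespace`.

```python
from types import SimpleNamespace


def collect_unreviewed(diffs):
    """Index diff entries by new_path, accepting dict and object shapes."""
    unreviewed = {}
    for diff in diffs:
        if isinstance(diff, dict):
            new_path = diff.get("new_path")
        else:
            new_path = getattr(diff, "new_path", None)
        if new_path:
            unreviewed[new_path] = diff
    return unreviewed


diffs = [
    {"new_path": "a.py", "diff": "..."},
    SimpleNamespace(new_path="b.py", diff="..."),
    {"old_path": "gone.py"},  # no new_path, so it is skipped
]
unreviewed = collect_unreviewed(diffs)
```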



9. Unparseable review timestamp misclassified ✓ Resolved 🐞 Bug ☼ Reliability
Description
If the chosen previous review note parses to created_at=None, get_commit_range() returns an
empty list and _get_incremental_commits() returns early without disabling incremental mode, so the
run can be treated as “no new files” rather than falling back to a full review. This makes
incremental review behavior depend on timestamp parse success rather than actual MR changes.
Code

pr_agent/git_providers/gitlab_provider.py[R436-438]

+        self.incremental.commits_range = self.get_commit_range()
+        if not self.incremental.commits_range:
+            return
Evidence
The provider can select a previous review note whose created_at is None (parse failure) and then
get_commit_range() returns []; _get_incremental_commits() treats any falsy commits_range as
a reason to return without disabling incremental, which later allows PRReviewer to enter the “no new
files” handling for incremental runs.

pr_agent/git_providers/gitlab_provider.py[77-89]
pr_agent/git_providers/gitlab_provider.py[425-439]
pr_agent/git_providers/gitlab_provider.py[477-481]
pr_agent/tools/pr_reviewer.py[126-151]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
When the previous review note’s `created_at` cannot be parsed (becomes `None`), `GitLabProvider.get_commit_range()` returns `[]` and `_get_incremental_commits()` exits early, leaving incremental mode enabled but without computing an incremental diff. This can lead to the PRReviewer path treating the run as “no new files” instead of safely reverting to a full review.
### Issue Context
`_GitlabIncrementalNote.created_at` is derived via `_parse_gitlab_iso_datetime(...)`, which can return `None` on unexpected formats. The current incremental flow treats an empty `commits_range` as a valid “no new commits” outcome, but it’s also used for “cannot determine commit range due to missing review timestamp”.
### Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[77-100]
- pr_agent/git_providers/gitlab_provider.py[425-456]
- pr_agent/git_providers/gitlab_provider.py[477-500]
### Suggested approach
- Ensure `get_previous_review(...)` does **not** return a note adapter with `created_at is None` (skip such notes and continue searching older matching notes).
- If no valid dated prior review exists, explicitly set `self.incremental.is_incremental = False` (fall back to full review). Alternatively (or additionally), if `last_review_time is None` inside `get_commit_range()`, disable incremental and return `[]` with a clear log message indicating fallback rather than “no new commits”.
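
The second option can be sketched as a small gate in front of the commit-range logic. `resolve_anchor` is a hypothetical helper, and `incremental` is a stand-in object with an `is_incremental` flag.

```python
from datetime import datetime
from types import SimpleNamespace


def resolve_anchor(incremental, previous_review):
    """Disable incremental mode unless the prior review note has a usable timestamp."""
    created_at = getattr(previous_review, "created_at", None)
    if not isinstance(created_at, datetime):
        # Cannot place the review on the timeline: fall back to a full review.
        incremental.is_incremental = False
        return None
    return created_at


broken = SimpleNamespace(is_incremental=True)
broken_anchor = resolve_anchor(broken, SimpleNamespace(created_at=None))

healthy = SimpleNamespace(is_incremental=True)
healthy_anchor = resolve_anchor(
    healthy, SimpleNamespace(created_at=datetime(2026, 5, 14, 11, 0))
)
```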



10. Dateless commits can suppress diff ✓ Resolved 🐞 Bug ≡ Correctness
Description
get_commit_range() currently skips commits whose timestamps can’t be parsed; if all commits after
the previous review are “dateless”, the method returns an empty range and
_get_incremental_commits() never calls repository_compare, preventing any incremental diff from
being computed. In that scenario the run can be misinterpreted as having nothing new to review.
Code

pr_agent/git_providers/gitlab_provider.py[R485-492]

+            if commit_time is None:
+                # A commit without a parseable timestamp cannot be placed on the timeline;
+                # skip it so it never lands in last_seen_commit (PRReviewer compares that
+                # date with `>`, which would TypeError against None).
+                get_logger().warning(
+                    f"Skipping commit {adapter.sha} with unparseable timestamp during incremental review"
+                )
+                continue
Evidence
The code explicitly continues on commit_time is None, and then returns an empty range when
first_new_commit_index remains unset; _get_incremental_commits() exits immediately on an empty
commits_range, so repository_compare(...) is never invoked and unreviewed_files_set remains
empty.

pr_agent/git_providers/gitlab_provider.py[477-500]
pr_agent/git_providers/gitlab_provider.py[425-439]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
In `GitLabProvider.get_commit_range()`, commits with an unparseable timestamp are skipped (`continue`). If the only commits after the last review are unparseable, `first_new_commit_index` never gets set and the method returns `[]`, which makes `_get_incremental_commits()` return early and skip computing `repository_compare(...)` diffs.
### Issue Context
Skipping is used to avoid `TypeError` when comparing `None` with datetimes, but it also means the provider can fail to detect “new commits exist” and therefore fail to compute incremental diffs.
### Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[425-476]
- pr_agent/git_providers/gitlab_provider.py[477-500]
### Suggested approach
- Keep the existing safety (don’t compare `None` dates), but avoid returning an empty range solely due to dateless commits.
- For example: if `commit_time is None` and `first_new_commit_index` is still `None` (i.e., we haven’t identified any newer commits yet), treat that commit as part of the “new” range by setting `first_new_commit_index = index` (and `first_new_commit = adapter`) so incremental diff computation proceeds.
- If you don’t want to treat dateless commits as “new”, then explicitly disable incremental mode (fallback to full review) when dateless commits prevent establishing a commit range.
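
One way to keep "nothing new" and "cannot tell" apart is to return an explicit outcome instead of an empty list. The sketch below follows the fallback option; `RangeOutcome` and `classify_range` are hypothetical names, and commits are reduced to their parsed timestamps.

```python
from datetime import datetime
from enum import Enum


class RangeOutcome(Enum):
    NEW_COMMITS = "new_commits"
    NOTHING_NEW = "nothing_new"
    FALLBACK_FULL = "fallback_full"


def classify_range(commit_times, last_review_time):
    """commit_times: oldest-first list of datetime, or None when unparseable."""
    saw_dateless = False
    for commit_time in reversed(commit_times):
        if commit_time is None:
            saw_dateless = True  # cannot place this commit on the timeline
            continue
        if commit_time > last_review_time:
            return RangeOutcome.NEW_COMMITS
        break  # first dated commit at or before the review; older ones were seen
    # Dateless commits newer than the anchor might be new work, so do not skip.
    return RangeOutcome.FALLBACK_FULL if saw_dateless else RangeOutcome.NOTHING_NEW


review = datetime(2026, 5, 14, 11, 0)
early = datetime(2026, 5, 14, 9, 0)
late = datetime(2026, 5, 14, 12, 0)

with_new = classify_range([early, late], review)
all_seen = classify_range([early], review)
dateless_tail = classify_range([early, None], review)
```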



11. Empty notes cache refetched ✓ Resolved 🐞 Bug ➹ Performance
Description
GitLabProvider.get_previous_review() uses if not getattr(self, '_incremental_notes_cache', None)
so an empty notes list ([]) is treated as “not cached” and triggers mr.notes.list(get_all=True)
again on every call. This causes repeated GitLab API calls for MRs with no notes and makes caching
behavior inconsistent.
Code

pr_agent/git_providers/gitlab_provider.py[R504-509]

+        if not getattr(self, '_incremental_notes_cache', None):
+            try:
+                self._incremental_notes_cache = list(self.mr.notes.list(get_all=True))
+            except Exception as e:
+                get_logger().error(f"Failed to list MR notes for incremental review: {e}")
+                return None
Evidence
The cache guard uses a falsy check; when GitLab returns no notes, _incremental_notes_cache becomes
[], which remains falsy and triggers another API fetch on the next call.

pr_agent/git_providers/gitlab_provider.py[501-507]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`get_previous_review()` re-fetches MR notes repeatedly when the notes list is empty because it uses a falsy check (`if not getattr(...):`). An empty list is falsy, so the cache guard never stabilizes.
### Issue Context
This method is part of the incremental-review plumbing and may be called multiple times within a run or across commands in the same process, so repeated `notes.list(get_all=True)` calls are unnecessary load.
### Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[501-510]
### Suggested implementation directions
- Change the guard to distinguish between “not set” and “set to empty list”, e.g.:
- `if not hasattr(self, '_incremental_notes_cache'):`
- or `if getattr(self, '_incremental_notes_cache', None) is None:`
- Keep behavior otherwise identical.
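
A minimal illustration of the second direction, using a `None` sentinel so that an empty list still counts as cached. `NotesCache` and `fetch_notes` are hypothetical names for this sketch only; the real provider stores the cache on `self._incremental_notes_cache`.

```python
class NotesCache:
    """Caches a notes listing, including a legitimately empty one."""

    def __init__(self, fetch_notes):
        self._fetch_notes = fetch_notes  # hypothetical callable returning an iterable
        self._cache = None               # None means "not fetched yet"

    def get(self):
        # `is None` distinguishes "never fetched" from "fetched, but empty".
        if self._cache is None:
            self._cache = list(self._fetch_notes())
        return self._cache


calls = []


def fetch_empty():
    calls.append(1)  # count how often the (simulated) API is hit
    return []


cache = NotesCache(fetch_empty)
cache.get()
cache.get()
```

After two `get()` calls the simulated fetch has run once, whereas a falsy check (`if not self._cache`) would have run it twice.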



12. diff_refs[...] used without guard ✓ Resolved 📘 Rule violation ☼ Reliability
Description
get_diff_files() directly indexes self.mr.diff_refs[...] in both incremental and non-incremental
paths, which can raise KeyError if GitLab returns missing/partial diff_refs. This can break
reviews at runtime instead of failing gracefully or falling back to full review.
Code

pr_agent/git_providers/gitlab_provider.py[R589-595]

+            head_sha_for_content = getattr(self, '_incremental_head_sha', None) \
+                or self.mr.diff_refs['head_sha']
+        else:
+            raw_changes = self.mr.changes().get('changes', [])
+            raw_changes = self._expand_submodule_changes(raw_changes)
+            base_sha_for_content = self.mr.diff_refs['base_sha']
+            head_sha_for_content = self.mr.diff_refs['head_sha']
Evidence
PR Compliance ID 20 requires defensive access (e.g., dict .get) for variable external payloads.
The changed code directly indexes self.mr.diff_refs['head_sha'] / self.mr.diff_refs['base_sha']
when computing base_sha_for_content/head_sha_for_content, which can raise KeyError if those
keys are absent.

pr_agent/git_providers/gitlab_provider.py[586-595]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`get_diff_files()` directly indexes `self.mr.diff_refs['head_sha']` / `self.mr.diff_refs['base_sha']` which can raise `KeyError` when `diff_refs` is missing keys or has an unexpected shape.
## Issue Context
This data comes from an external provider payload (GitLab). The compliance requirement is to use defensive access patterns for optional/variable structures.
## Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[586-595]
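
One way to express the defensive access pattern is a small helper like the one below. `get_diff_ref` is a hypothetical name, and the sketch assumes `diff_refs` arrives as a plain dict (or `None`) with string SHA values.

```python
def get_diff_ref(diff_refs, key):
    """Return diff_refs[key], or None when the payload is missing or malformed."""
    if not isinstance(diff_refs, dict):
        return None
    value = diff_refs.get(key)
    # Treat empty or non-string values as absent so callers check one condition.
    return value if isinstance(value, str) and value else None


head_sha = get_diff_ref({"base_sha": "aaa", "head_sha": "bbb"}, "head_sha")
missing = get_diff_ref(None, "head_sha")
```

A caller would then decide what to do when the result is `None`, for example fall back to a full review or log and skip, rather than letting a `KeyError` escape.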



13. Submodule diffs not expanded ✓ Resolved 🐞 Bug ≡ Correctness
Description
In incremental mode, GitLabProvider.get_diff_files uses repository_compare diffs directly and
never applies _expand_submodule_changes, so submodule bumps won't be expanded into file-level
diffs even when GITLAB.EXPAND_SUBMODULE_DIFFS is enabled. This causes incremental reviews to miss
submodule-internal changes that full reviews would include.
Code

pr_agent/git_providers/gitlab_provider.py[R586-595]

+        if incremental_active:
+            raw_changes = list(self.unreviewed_files_set.values())
+            base_sha_for_content = self.incremental.last_seen_commit_sha
+            head_sha_for_content = getattr(self, '_incremental_head_sha', None) \
+                or self.mr.diff_refs['head_sha']
+        else:
+            raw_changes = self.mr.changes().get('changes', [])
+            raw_changes = self._expand_submodule_changes(raw_changes)
+            base_sha_for_content = self.mr.diff_refs['base_sha']
+            head_sha_for_content = self.mr.diff_refs['head_sha']
Evidence
_expand_submodule_changes() is explicitly responsible for expanding 'Subproject commit' bumps when
enabled, but it is only invoked on the non-incremental mr.changes() path. The incremental path
uses unreviewed_files_set diffs and skips the expansion step entirely.

pr_agent/git_providers/gitlab_provider.py[294-347]
pr_agent/git_providers/gitlab_provider.py[580-595]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Incremental `get_diff_files()` bypasses `_expand_submodule_changes()`, so the opt-in submodule expansion feature only works for full MR reviews and not for incremental reviews.
## Issue Context
- Full mode: `mr.changes()` output is passed through `_expand_submodule_changes()`.
- Incremental mode: `raw_changes` is sourced from `unreviewed_files_set` (repository_compare diffs) and is not expanded.
## Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[294-347]
- pr_agent/git_providers/gitlab_provider.py[580-595]
## Suggested fix
- In the `incremental_active` branch of `get_diff_files()`, run:
- `raw_changes = self._expand_submodule_changes(raw_changes)`
so behavior matches the full-review path when `GITLAB.EXPAND_SUBMODULE_DIFFS` is enabled.
- (Optional) Add/extend a unit test to assert submodule expansion is applied in incremental mode as well.




Advisory comments

14. No-new-commits fetches full diff 🐞 Bug ➹ Performance
Description
When /review -i finds no new commits, GitLabProvider._get_incremental_commits() returns with
is_incremental still true but unreviewed_files_set empty, and get_diff_files() then treats
incremental mode as inactive and falls back to mr.changes() (full MR) because the empty dict is
falsy. Since PRReviewer.__init__ always calls get_num_of_files() (which calls
len(get_diff_files())), this adds avoidable API calls/work on the common “nothing new to review”
path before the run later short-circuits.
Code

pr_agent/git_providers/gitlab_provider.py[R588-609]

+        incremental_active = bool(
+            getattr(self, 'incremental', None)
+            and getattr(self.incremental, 'is_incremental', False)
+            and getattr(self, 'unreviewed_files_set', None)
+        )
+
+        if incremental_active:
+            raw_changes = list(self.unreviewed_files_set.values())
+            # Apply submodule expansion symmetrically with the full-review path so that
+            # `GITLAB.EXPAND_SUBMODULE_DIFFS` keeps working under `/review -i`.
+            raw_changes = self._expand_submodule_changes(raw_changes)
+            base_sha_for_content = self.incremental.last_seen_commit_sha
+            # `_incremental_head_sha` is populated by `_get_incremental_commits()` whenever
+            # incremental_active is true; we still guard for defensive callers.
+            head_sha_for_content = getattr(self, '_incremental_head_sha', None)
+            if not head_sha_for_content:
+                head_sha_for_content = (self.mr.diff_refs or {}).get('head_sha')
+        else:
+            raw_changes = self.mr.changes().get('changes', [])
+            raw_changes = self._expand_submodule_changes(raw_changes)
+            base_sha_for_content = self.mr.diff_refs['base_sha']
+            head_sha_for_content = self.mr.diff_refs['head_sha']
Evidence
The provider leaves incremental enabled but may have an empty unreviewed_files_set (no new
commits), and get_diff_files() treats that as non-incremental due to a truthiness check.
Separately, PRReviewer always computes num_pr_files via get_num_of_files(), and the default
implementation calls get_diff_files(), forcing the fallback path to run during initialization.

pr_agent/git_providers/gitlab_provider.py[436-438]
pr_agent/git_providers/gitlab_provider.py[588-609]
pr_agent/tools/pr_reviewer.py[47-55]
pr_agent/tools/pr_reviewer.py[77-84]
pr_agent/git_providers/git_provider.py[398-402]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
In incremental mode, `get_diff_files()` currently considers incremental “active” only if `unreviewed_files_set` is truthy. When there are **no new commits/files**, `unreviewed_files_set` is `{}` (falsy) while `is_incremental` remains `True`, so `get_diff_files()` falls back to fetching the full MR changes.
This is especially wasteful because `PRReviewer.__init__` always calls `get_num_of_files()`, which calls `get_diff_files()`; so the full changes fetch/work happens even though the run path will later publish/skip for “no new files”.
### Issue Context
- `PRReviewer.__init__` calls `get_num_of_files()` unconditionally.
- Default `GitProvider.get_num_of_files()` calls `len(self.get_diff_files())`.
- `GitLabProvider.get_diff_files()` uses a truthiness check on `unreviewed_files_set`, so `{}` disables the incremental path.
### Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[585-610]
- pr_agent/git_providers/gitlab_provider.py[678-688]
- pr_agent/git_providers/git_provider.py[398-402]
### Suggested implementation directions
- Make `incremental_active` depend on incremental mode being enabled and the *presence* of `unreviewed_files_set` (not its truthiness), so an empty dict yields `raw_changes = []`.
- Optionally override `GitLabProvider.get_num_of_files()` to return `len(self.unreviewed_files_set)` when incremental is enabled and `unreviewed_files_set` exists, avoiding a `get_diff_files()` call entirely in this mode.
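
The first direction, presence rather than truthiness, might be sketched as follows. The function name and the `SimpleNamespace` stand-ins are assumptions made for illustration; only the attribute names come from the snippet above.

```python
from types import SimpleNamespace


def incremental_active(provider):
    """True when incremental mode is on and a file set exists, even an empty one."""
    incremental = getattr(provider, "incremental", None)
    if not getattr(incremental, "is_incremental", False):
        return False
    # Presence check: an empty dict means "nothing new", not "not incremental".
    return getattr(provider, "unreviewed_files_set", None) is not None


nothing_new = SimpleNamespace(
    incremental=SimpleNamespace(is_incremental=True),
    unreviewed_files_set={},
)
never_computed = SimpleNamespace(incremental=SimpleNamespace(is_incremental=True))
```

Here `nothing_new` stays on the incremental path and would produce an empty `raw_changes`, while `never_computed` still falls back to the full diff.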




Previous review results

Review updated until commit a534a03

Results up to commit N/A


🐞 Bugs (2) 📘 Rule violations (0) 📎 Requirement gaps (0)


Action required
1. note.body.startswith no type check ✓ Resolved 📘 Rule violation ☼ Reliability
Description
_find_anchor_note() calls note.body.startswith(...) without verifying note.body is a string,
which can raise AttributeError if GitLab returns an unexpected type and break incremental runs.
Code

pr_agent/git_providers/gitlab_provider.py[R599-601]

+        for note in notes_sorted:
+            if any(note.body.startswith(prefix) for prefix in prefixes):
+                return _GitlabIncrementalNote(note, mr_web_url=mr_web_url)
Evidence
PR Compliance ID 19 requires defensive access for optional/variable external structures. The new
_find_anchor_note() logic filters on truthiness but still calls .startswith() on note.body
without ensuring it is a str.

pr_agent/git_providers/gitlab_provider.py[589-602]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`_find_anchor_note()` assumes `note.body` is a string and calls `.startswith()` directly. If GitLab returns `None` or a non-string value, incremental review can crash.
## Issue Context
Notes are external/provider objects with variable schemas; boundary code should defensively check types before calling string methods.
## Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[589-602]
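
A type guard for this could be as small as the sketch below. `find_anchor` is a simplified, hypothetical stand-in for `_find_anchor_note()` that returns the raw note instead of wrapping it.

```python
from types import SimpleNamespace


def find_anchor(notes, prefixes):
    """Return the first note whose body is a string starting with any prefix."""
    prefixes = tuple(prefixes)  # str.startswith accepts a tuple of candidates
    for note in notes:
        body = getattr(note, "body", None)
        if isinstance(body, str) and body.startswith(prefixes):
            return note
    return None


notes = [
    SimpleNamespace(body=None),
    SimpleNamespace(body=123),  # unexpected non-string payload
    SimpleNamespace(body="## PR Reviewer Guide"),
]
anchor = find_anchor(notes, ["## PR Reviewer Guide", "## Incremental PR Reviewer Guide"])
```

Notes with a missing or non-string body are skipped rather than raising `AttributeError`.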



2. Commit shape mismatch 🐞 Bug ≡ Correctness
Description
_GitlabIncrementalCommit only reads commit fields via attribute access (getattr), but
GitLabProvider.get_commit_messages accesses commit data via dict indexing (commit['message']). If MR
commits are dict-like (or only support item access), the adapter will yield sha/date as None and
get_commit_range() will skip commits, breaking incremental anchoring and forcing incorrect fallback
behavior.
Code

pr_agent/git_providers/gitlab_provider.py[R66-74]

+    def __init__(self, gl_commit):
+        self._gl_commit = gl_commit
+        self.sha = getattr(gl_commit, 'id', None)
+        date = _parse_gitlab_iso_datetime(
+            getattr(gl_commit, 'committed_date', None)
+            or getattr(gl_commit, 'authored_date', None)
+            or getattr(gl_commit, 'created_at', None)
+        )
+        self.commit = SimpleNamespace(author=SimpleNamespace(date=date))
Evidence
The incremental adapter is implemented using attribute access for id and date fields, while the
same provider’s existing get_commit_messages() implementation indexes commits as dictionaries.
This inconsistency means at least one of these code paths will mis-handle the commit object shape,
and incremental anchoring can fail when commits are dict-like.

pr_agent/git_providers/gitlab_provider.py[58-74]
pr_agent/git_providers/gitlab_provider.py[1118-1129]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`_GitlabIncrementalCommit` currently extracts `id` and timestamp fields via `getattr(...)` only. Elsewhere in the same provider, commit objects are treated as dict-like (`commit['message']`), meaning commit objects may not reliably expose attributes.
If commits are dict-like (or only implement item access), the incremental adapter will produce `sha=None` and `date=None`, causing `get_commit_range()` to skip commits (or fail to anchor correctly), which breaks incremental review behavior.
## Issue Context
Incremental review relies on `_GitlabIncrementalCommit.sha` and `_GitlabIncrementalCommit.commit.author.date` to determine `first_new_commit` / `last_seen_commit`.
## Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[58-75]
- Update `_GitlabIncrementalCommit` to read fields from both attribute-shaped and mapping/item-shaped commit objects (e.g., try `getattr`, then fallback to `gl_commit.get('field')` / `gl_commit['field']` when supported).
- Ensure `sha` and `date` are populated whenever the underlying commit object contains `id` and a date field, regardless of access style.
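
If one did want to support both shapes, a helper along these lines would cover it. `read_field` is a hypothetical name; whether the extra branch is needed depends on what objects `self.mr_commits` actually holds, which this sketch does not assume.

```python
from collections.abc import Mapping
from types import SimpleNamespace


def read_field(obj, name, default=None):
    """Read `name` from a mapping-shaped or attribute-shaped object."""
    if isinstance(obj, Mapping):
        return obj.get(name, default)
    return getattr(obj, name, default)


as_dict = {"id": "abc123", "committed_date": "2026-05-14T09:00:00Z"}
as_object = SimpleNamespace(id="abc123", committed_date="2026-05-14T09:00:00Z")
```

Both `read_field(as_dict, "id")` and `read_field(as_object, "id")` then yield the same SHA, and a missing field returns the default instead of raising.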



3. None commit date crashes ✓ Resolved 🐞 Bug ☼ Reliability
Description
GitLabProvider.get_commit_range can set incremental.last_seen_commit even when the adapted commit
timestamp is None, and PRReviewer then performs a > comparison against a datetime which can
raise TypeError and abort /review -i. This can happen when GitLab returns an unexpected/empty
timestamp and _parse_gitlab_iso_datetime returns None.
Code

pr_agent/git_providers/gitlab_provider.py[R482-490]

+        for index in range(len(self.mr_commits) - 1, -1, -1):
+            adapter = _GitlabIncrementalCommit(self.mr_commits[index])
+            commit_time = adapter.commit.author.date
+            if commit_time is not None and commit_time > last_review_time:
+                self.incremental.first_new_commit = adapter
+                first_new_commit_index = index
+            else:
+                self.incremental.last_seen_commit = adapter
+                break
Evidence
The GitLab adapter can legally produce None timestamps, and get_commit_range() currently stores
such a commit as last_seen_commit. PRReviewer then uses last_seen_commit.commit.author.date in a
> comparison without checking for None, which will raise TypeError when the stored date is
None.

pr_agent/git_providers/gitlab_provider.py[33-55]
pr_agent/git_providers/gitlab_provider.py[477-491]
pr_agent/tools/pr_reviewer.py[328-353]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`get_commit_range()` may assign `self.incremental.last_seen_commit` to a commit whose adapted `commit.author.date` is `None`. Later, `PRReviewer._can_run_incremental_review()` compares that `None` value to a datetime (`last_seen_commit_date > recent_commits_threshold`), which can raise `TypeError` and break incremental reviews.
## Issue Context
This occurs when `_parse_gitlab_iso_datetime()` fails to parse a GitLab timestamp (returns `None`), which is then propagated into `_GitlabIncrementalCommit(...).commit.author.date`.
## Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[33-55]
- pr_agent/git_providers/gitlab_provider.py[477-491]
## Suggested fix
- In `GitLabProvider.get_commit_range()`, treat `commit_time is None` as an incremental-fallback condition:
- either skip that commit and continue scanning older commits (do not set `last_seen_commit`), or
- explicitly disable incremental (`self.incremental.is_incremental = False`) and return an empty range so the caller falls back to a full review.
- Ensure `self.incremental.last_seen_commit` is only set when `adapter.commit.author.date` is a real `datetime` (never `None`).




Remediation recommended
4. Unsorted git_provider imports ✓ Resolved 📘 Rule violation ⚙ Maintainability
Description
The from pr_agent.git_providers.git_provider import ... names are not ordered as Ruff/isort
expects, which can introduce lint failures and inconsistent formatting.
Code

pr_agent/tools/pr_code_suggestions.py[27]

+from pr_agent.git_providers.git_provider import IncrementalPR, get_main_pr_language, GitProvider
Evidence
PR Compliance ID 10 requires new Python code to conform to Ruff/isort style. The modified import
line adds IncrementalPR but leaves the imported names in a non-isort order.

AGENTS.md
pr_agent/tools/pr_code_suggestions.py[24-28]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The import list from `pr_agent.git_providers.git_provider` is not in isort/Ruff order.
## Issue Context
The repo enables Ruff isort checks (`I001`), so unsorted imports can fail linting and add noise to future diffs.
## Fix Focus Areas
- pr_agent/tools/pr_code_suggestions.py[24-28]



5. Newest anchor note skipped ✓ Resolved 🐞 Bug ≡ Correctness
Description
GitLabProvider._find_anchor_note() sorts notes by parsed created_at and pushes unparseable
timestamps to the end, so if the newest matching anchor note has an unparseable created_at but an
older matching note parses, the older note will be selected as the anchor. This can make
`/review -i` or `/improve -i` re-process commits that were already covered by the newest (but
unparseable) anchor note instead of safely falling back.
Code

pr_agent/git_providers/gitlab_provider.py[R590-601]

+        notes_sorted = sorted(
+            (n for n in self._incremental_notes_cache if getattr(n, 'body', None)),
+            key=lambda n: (
+                _parse_gitlab_iso_datetime(getattr(n, 'created_at', None))
+                or datetime.min
+            ),
+            reverse=True,
+        )
+        mr_web_url = getattr(self.mr, 'web_url', None)
+        for note in notes_sorted:
+            if any(note.body.startswith(prefix) for prefix in prefixes):
+                return _GitlabIncrementalNote(note, mr_web_url=mr_web_url)
Evidence
_find_anchor_note() sorts notes by _parse_gitlab_iso_datetime(created_at) and substitutes
datetime.min when parsing fails, which demotes unparseable timestamps (even if the note is newest)
and can cause an older matching note to be chosen. Incremental anchoring then uses
previous_review.created_at in get_commit_range(), so selecting an older note changes the commit
partition.

pr_agent/git_providers/gitlab_provider.py[538-560]
pr_agent/git_providers/gitlab_provider.py[572-602]
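
The demotion effect can be reproduced in isolation. The dict-shaped notes below are illustrative stand-ins, with `created_at` already parsed to a `datetime` or `None`.

```python
from datetime import datetime

# Listed newest first; the newest note has a timestamp that failed to parse.
notes = [
    {"id": 3, "created_at": None},
    {"id": 2, "created_at": datetime(2026, 5, 14, 10, 0)},
    {"id": 1, "created_at": datetime(2026, 5, 13, 10, 0)},
]

# Substituting datetime.min sends the unparseable (but newest) note to the end.
sorted_notes = sorted(
    notes,
    key=lambda n: n["created_at"] or datetime.min,
    reverse=True,
)

picked_after_sort = sorted_notes[0]["id"]
picked_in_listing_order = notes[0]["id"]
```

After sorting, the first candidate is note 2 rather than note 3, which is how an older anchor ends up being selected.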

Agent prompt
The issue below was found during a cod...

Comment thread pr_agent/git_providers/gitlab_provider.py
- get_commit_range: skip commits whose timestamp fails to parse so they
  never become last_seen_commit. PRReviewer compares that field with `>`
  against a datetime and would raise TypeError if the date were None.
- get_diff_files (incremental branch): run raw_changes through
  _expand_submodule_changes so GITLAB.EXPAND_SUBMODULE_DIFFS keeps
  working under /review -i, matching the full-review path.
- get_diff_files (incremental branch): drop the unsafe diff_refs index
  fallback for head_sha; rely on _incremental_head_sha (always populated
  when incremental_active) and a defensive .get() for the rare callers
  that arrange state by hand.

Tests cover both fixes: a dateless commit no longer anchors the
incremental window, and _expand_submodule_changes is invoked on the
incremental path.
@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects Bot commented May 14, 2026

Persistent review updated to latest commit acd1c17

`get_previous_review()` guarded the notes cache with `not getattr(self,
'_incremental_notes_cache', None)`, which treats an empty list as
"not cached" and re-fetches from GitLab on every call. Switch to
`hasattr` so a legitimately empty notes list is preserved as a cached
sentinel; the API call only happens once.
@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects Bot commented May 14, 2026

Persistent review updated to latest commit cd141c0

@sizickp
Author

sizickp commented May 14, 2026

Update on the bot review feedback — for the maintainer's convenience:

Resolved in cd141c00 (this push):

Resolved earlier in acd1c171:

  • None commit date crashes: dateless commits are now skipped in get_commit_range() so last_seen_commit never holds a None timestamp.
  • Submodule diffs not expanded: _expand_submodule_changes() now runs in the incremental path too, matching the full-review behavior.
  • diff_refs[...] used without guard: tightened the new incremental code (rely on _incremental_head_sha + .get() fallback). The pre-existing non-incremental branch (untouched by this PR) still uses direct indexing — same as before; happy to address in a separate PR if that's desired.

Not addressing in this PR — flagging for follow-up:

  • No-new-commits fetches full diff (formatting #5): valid finding, but it's a pre-existing pattern shared with GithubProvider.get_files() / get_diff_files() — they use the same falsy check on unreviewed_files_set, so GitHub has the same wasted work on push-with-no-new-commits. A proper fix would need to:

    1. Switch the truthy check to hasattr on both providers,
    2. Reorder the checks in PRReviewer.run() so the "Incremental Review Skipped" message still publishes before the "PR has no files" silent-skip path.

    Doing (1) in isolation here would either regress GitHub or lose the user-facing "Incremental Review Skipped" message on GitLab. I'd prefer to scope that to a dedicated follow-up that touches both providers + shared code together, if you'd like me to open it.

Full unit-test suite stays green (351 tests, including the new GitLab incremental coverage).

`_get_incremental_commits` previously returned early with
`is_incremental=True` whenever `commits_range` came back empty, which
conflated two distinct cases:

- legitimately no new commits since the previous review (last_seen_commit
  set) — the reviewer should publish "Incremental Review Skipped";
- a broken timeline (previous review's timestamp didn't parse, or every
  post-review commit had an unparseable date) — `last_seen_commit` was
  never set, but the run still silently produced "no new files".

Distinguish the two by checking whether `last_seen_commit` was anchored.
When it wasn't, disable `is_incremental` so the run falls back to a full
review instead of silently dropping it. Two new tests cover both
failure modes (unparseable review timestamp and all-dateless commits).
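
The decision described in this commit message can be summarized as a three-way branch. The function and the returned labels are hypothetical; in the PR the outcome is expressed by toggling `is_incremental` rather than by returning a string.

```python
def resolve_empty_range(commits_range, last_seen_commit):
    """Classify the outcome of computing an incremental commit range."""
    if commits_range:
        return "review_new_commits"
    if last_seen_commit is not None:
        # Anchor found and nothing newer: a genuine "nothing to review" case.
        return "skip_nothing_new"
    # No anchor at all: the timeline is broken, so do not trust the empty range.
    return "fallback_full_review"


outcome_no_new = resolve_empty_range([], last_seen_commit="abc123")
outcome_broken = resolve_empty_range([], last_seen_commit=None)
```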
@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects Bot commented May 15, 2026

Persistent review updated to latest commit 4e4b931

…ental-review

# Conflicts:
#	tests/unittest/test_gitlab_provider.py
@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects Bot commented May 15, 2026

Persistent review updated to latest commit 71280ed

Comment on lines +66 to +74
    def __init__(self, gl_commit):
        self._gl_commit = gl_commit
        self.sha = getattr(gl_commit, 'id', None)
        date = _parse_gitlab_iso_datetime(
            getattr(gl_commit, 'committed_date', None)
            or getattr(gl_commit, 'authored_date', None)
            or getattr(gl_commit, 'created_at', None)
        )
        self.commit = SimpleNamespace(author=SimpleNamespace(date=date))
Contributor

Action required

1. Commit shape mismatch 🐞 Bug ≡ Correctness

_GitlabIncrementalCommit only reads commit fields via attribute access (getattr), but
GitLabProvider.get_commit_messages accesses commit data via dict indexing (commit['message']). If MR
commits are dict-like (or only support item access), the adapter will yield sha/date as None and
get_commit_range() will skip commits, breaking incremental anchoring and forcing incorrect fallback
behavior.
Agent Prompt
## Issue description
`_GitlabIncrementalCommit` currently extracts `id` and timestamp fields via `getattr(...)` only. Elsewhere in the same provider, commit objects are treated as dict-like (`commit['message']`), meaning commit objects may not reliably expose attributes.

If commits are dict-like (or only implement item access), the incremental adapter will produce `sha=None` and `date=None`, causing `get_commit_range()` to skip commits (or fail to anchor correctly), which breaks incremental review behavior.

## Issue Context
Incremental review relies on `_GitlabIncrementalCommit.sha` and `_GitlabIncrementalCommit.commit.author.date` to determine `first_new_commit` / `last_seen_commit`.

## Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[58-75]
  - Update `_GitlabIncrementalCommit` to read fields from both attribute-shaped and mapping/item-shaped commit objects (e.g., try `getattr`, then fallback to `gl_commit.get('field')` / `gl_commit['field']` when supported).
  - Ensure `sha` and `date` are populated whenever the underlying commit object contains `id` and a date field, regardless of access style.


Sergey Petrov added 2 commits May 15, 2026 10:41
`/improve` was unaffected by the GitLab incremental work: each push to an
MR re-scanned the full diff and re-posted the same `**Suggestion:** ...`
notes on the same lines, producing duplicate inline comments. Witness
note IDs 158625 / 158636 / 158637 on MR teamtravelata/travelata!1115 —
three identical suggestions for the same `parseInt(...)` line across
three push events.

Mirror what `/review -i` does for incremental scope, but anchor on the
last /improve output rather than the last /review:

- GitLabProvider.get_incremental_commits gains a `kind="review"` kwarg.
  `kind="suggestions"` anchors on the most recent prior
  `## PR Code Suggestions ✨` summary OR inline `**Suggestion:**` note,
  whichever is newer. `kind="review"` keeps existing /review -i
  semantics.
- Extracted shared `_find_anchor_note(prefixes)` helper so review and
  suggestions paths share the cache + sort logic.
- PRCodeSuggestions parses `-i` exactly like PRReviewer.parse_incremental
  and activates `kind="suggestions"` incremental on the provider when
  the kwarg is supported. If the provider doesn't accept the kwarg, the
  flag degrades to a no-op (full /improve), so older providers keep
  working unchanged.
- `run()` short-circuits with a log line when incremental is active but
  no files changed since the last suggestions pass, instead of falling
  through to a full /improve pass.

Tests cover three new branches: anchoring on a suggestion note ahead of
a later review note, fallback when no prior suggestion exists, and the
default `kind="review"` still ignores suggestion notes.
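
How "the kwarg is supported" gets detected is not shown in this message. One possible approach, offered purely as an assumption, is to inspect the method signature before passing `kind`; `accepts_kwarg` and the three stub functions are hypothetical.

```python
import inspect


def accepts_kwarg(func, name):
    """True if `func` can be called with `name=...` as a keyword argument."""
    try:
        params = inspect.signature(func).parameters
    except (TypeError, ValueError):
        return False  # signature not introspectable; assume unsupported
    param = params.get(name)
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if param is not None and param.kind in keyword_kinds:
        return True
    # A **kwargs catch-all also accepts the keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def newer_provider(incremental, kind="review"):
    return kind


def older_provider(incremental):
    return None


def catch_all_provider(incremental, **kwargs):
    return kwargs
```

A caller could then pass `kind="suggestions"` only when `accepts_kwarg(...)` is true and otherwise run a full `/improve`, which matches the no-op degradation described above.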
When `git merge <target>` is run on the MR's source branch between two
incremental passes, `repository_compare(last_seen_sha, head_sha)` walks
through the merge commit and surfaces every file the target branch
touched since the MR's merge-base — files that are NOT part of the MR's
own contribution.

Reproduction (teamtravelata/travelata!1115):
1. MR open on `test-pr`, only frontend JS files changed.
2. /improve and /review run, post initial suggestions/review.
3. User runs `git merge master`, which brings in CI config changes
   (notably `.gitlab-ci.yml` from !1116) via a merge commit.
4. Next /review -i posts a suggestion on `.gitlab-ci.yml`, a file the MR
   has never touched.

`mr.changes()` is anchored on the MR's merge-base with target, so it
correctly excludes target-side content. Intersect the file set from
`repository_compare` with `mr.changes()` to drop these "phantom" files.

Files where the MR also has its own changes still go through (with their
full compare diff — partial overlap with target content is rare and
benign next to the original bug).
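
The intersection step might be sketched like this. It assumes both inputs are lists of dicts carrying a `new_path` key, and `drop_phantom_files` is a hypothetical name rather than the helper used in the PR.

```python
def drop_phantom_files(compare_diffs, mr_changes):
    """Keep only compare diffs for files that the MR itself changes."""
    mr_paths = {c["new_path"] for c in mr_changes if c.get("new_path")}
    return [d for d in compare_diffs if d.get("new_path") in mr_paths]


# Diff between last_seen_sha and head_sha, including target-side content
# pulled in by the merge commit.
compare_diffs = [
    {"new_path": "frontend/app.js", "diff": "@@ ..."},
    {"new_path": ".gitlab-ci.yml", "diff": "@@ ..."},
]
# Files changed by the MR relative to its merge-base with the target branch.
mr_changes = [{"new_path": "frontend/app.js", "diff": "@@ ..."}]

kept = drop_phantom_files(compare_diffs, mr_changes)
```

In the reproduction above, `.gitlab-ci.yml` would be filtered out because it appears only in the compare result and not among the MR's own changes.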
@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects Bot commented May 15, 2026

Persistent review updated to latest commit 1fc55a1

sizickp changed the title from "feat(gitlab): support incremental review via /review -i" to "feat(gitlab): incremental /review -i and /improve -i" on May 15, 2026
Comment thread pr_agent/git_providers/gitlab_provider.py Outdated
Six findings from the latest bot review:

1. `_find_anchor_note` calls `body.startswith` without checking the body
   is a string — add an `isinstance(body, str)` guard. Paranoid but cheap.

2. Anchor selection was wrong when the newest matching note's timestamp
   couldn't be parsed: the local `sorted(...)` with a `datetime.min` fallback
   demoted the newest match below older parseable ones, so incremental runs
   anchored on an older note and re-reviewed already-reviewed commits.
   Drop the local sort entirely — GitLab returns notes in created_at-DESC
   order by default, so iterating in natural order picks the newest match.
   When that match has an unparseable timestamp, `_get_incremental_commits`
   already falls back to a full run via the `last_seen_commit is None`
   branch.

3. `_parse_gitlab_iso_datetime` used single-quoted string literals —
   switched to double quotes to match the repo's Ruff convention.

4. `_get_incremental_commits` extracted `new_path` via dict access only;
   non-dict diff entries (a stricter library or a stubbed client) were
   silently dropped. Add a `getattr(diff, 'new_path', None)` fallback so
   the object-shape branch isn't dead code.

5. `pr_code_suggestions.py` import line wasn't isort-clean after adding
   `IncrementalPR` — reorder to `GitProvider, IncrementalPR, get_main_pr_language`.

6. `PRCodeSuggestions.__init__` called `get_main_pr_language(get_languages(),
   get_files())` before `run()` short-circuited the no-op incremental
   path. For GitLab, `get_files()` falls back to a full `mr.changes()`
   call when `unreviewed_files_set` is empty — that's a wasted API
   round-trip on every push that finds nothing new to review. Set
   `self._incremental_empty_scope = True` and return early from
   `__init__`; `run()` short-circuits on the same flag.
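
To illustrate findings 1 and 2, here is a rough sketch of how a local sort with a minimum-datetime fallback can demote the newest matching note, next to the natural-order lookup with the `isinstance` guard. The function names, the dict-shaped notes, and `parse_iso` are stand-ins for illustration (assumptions, not the real `_find_anchor_note` or `_parse_gitlab_iso_datetime`), and the input list is assumed to arrive newest-first.

```python
from datetime import datetime, timezone

REVIEW_PREFIXES = ("## PR Reviewer Guide", "## Incremental PR Reviewer Guide")


def parse_iso(value):
    """Return an aware datetime, or None when the value cannot be parsed."""
    try:
        return datetime.fromisoformat(value.replace("Z", "+00:00"))
    except (AttributeError, ValueError):
        return None


def find_anchor_sorted(notes, prefixes=REVIEW_PREFIXES):
    """Old behaviour: re-sort locally; unparseable timestamps sink to the end."""
    floor = datetime.min.replace(tzinfo=timezone.utc)
    ordered = sorted(
        notes,
        key=lambda n: parse_iso(n.get("created_at")) or floor,
        reverse=True,
    )
    return next(
        (n for n in ordered if str(n.get("body", "")).startswith(prefixes)),
        None,
    )


def find_anchor_natural(notes, prefixes=REVIEW_PREFIXES):
    """Fixed behaviour: trust newest-first order and guard non-string bodies."""
    for note in notes:
        body = note.get("body")
        if isinstance(body, str) and body.startswith(prefixes):
            return note
    return None


# Newest first; the newest matching note carries an unparseable timestamp.
notes = [
    {"id": 3, "body": "## Incremental PR Reviewer Guide", "created_at": "not-a-timestamp"},
    {"id": 2, "body": None, "created_at": "2026-05-15T09:00:00Z"},
    {"id": 1, "body": "## PR Reviewer Guide", "created_at": "2026-05-14T10:00:00Z"},
]

old_pick = find_anchor_sorted(notes)   # anchors on the older note (id 1)
new_pick = find_anchor_natural(notes)  # anchors on the newest match (id 3)
```

With the natural-order version, the caller still sees that the chosen note's timestamp is unparseable and can fall back to a full run, instead of silently anchoring on an older review.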

Bot finding #2 ("Commit shape mismatch") was a false positive:
python-gitlab's `RESTObject` supports both attribute and item access on the
same object, so the inconsistency the bot flagged doesn't manifest in practice.
@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects Bot commented May 15, 2026

Persistent review updated to latest commit a534a03

@sizickp
Author

sizickp commented May 15, 2026

A quick note on two findings I deliberately left alone in a534a03a, in case it's useful for the human reviewer:

#2 "Commit shape mismatch" — false positive in my read. The bot flagged that _GitlabIncrementalCommit reads commit fields via getattr(...) while get_commit_messages() elsewhere in the same file uses commit['message'], suggesting the two access styles can't both work. They can: python-gitlab's RESTObject (the base class for ProjectMergeRequestCommit) implements both __getattr__ and __getitem__, both backed by the same _attrs dict. So getattr(commit, 'id', None) and commit['id'] return identical values for any GitLab commit object the SDK produces. The "inconsistency" the bot sees is real stylistically, but it doesn't manifest as a bug in practice — and harmonising the two would mean adding fallback ladders for an object shape that doesn't exist.
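
For readers who want to see the claimed mechanism concretely, here is a small sketch of an object that serves attribute and item access from one backing dict. `DualAccessRecord` and `read_field` are hypothetical stand-ins written for this note; this is not python-gitlab's `RESTObject`, and the behaviour of the real class should be confirmed against the installed library version.

```python
class DualAccessRecord:
    """Hypothetical stand-in (NOT python-gitlab's RESTObject)."""

    def __init__(self, attrs):
        # Write through __dict__ so the backing store is a normal attribute.
        self.__dict__["_attrs"] = dict(attrs)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        try:
            return self.__dict__["_attrs"][name]
        except KeyError:
            raise AttributeError(name) from None

    def __getitem__(self, key):
        return self.__dict__["_attrs"][key]


def read_field(entry, name, default=None):
    """Read a field from either a plain dict or an attribute-style object."""
    if isinstance(entry, dict):
        return entry.get(name, default)
    return getattr(entry, name, default)


commit = DualAccessRecord({"id": "1fc55a1", "message": "fix anchor selection"})

same_value = getattr(commit, "id", None) == commit["id"]
missing = getattr(commit, "missing", None)
```

A helper like `read_field` is one way to tolerate both shapes without a fallback ladder at every call site, which is the same idea as the `getattr(diff, 'new_path', None)` fallback from finding 4.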

#14 "No-new-commits fetches full diff" — valid finding, deliberately deferred. The pattern (falsy check on unreviewed_files_set in get_files() / get_diff_files() causing fallback to the full MR file list when the incremental set is empty) is shared with GithubProvider — see github_provider.py#L197. A correct fix needs:

  1. switching the truthy check to hasattr (or is not None) on both providers, and
  2. reordering PRReviewer.run() so the "Incremental Review Skipped — no files changed" message still publishes before the earlier "PR has no files: skipping" silent-skip.

Doing only (1) here would either lose the user-facing skip message on GitLab or — worse — change GitHub's behaviour as a side-effect. Happy to open a dedicated follow-up PR for it if the maintainer wants that scope; it didn't feel right to bolt onto this one.

(For the /improve side specifically, finding #6 — addressed in this push — already short-circuits the same wasted mr.changes() call via an early-return in __init__, so the practical impact of #14 is now limited to /review -i on the "nothing new since last review" path.)
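
The difference between the two checks in #14 can be sketched as follows. The function names and the explicit `full_mr_files` argument are simplifications made up for this example (the real `get_files()` / `get_diff_files()` read from the provider instance), so treat this as an illustration of the pattern rather than the providers' code.

```python
def files_with_truthy_check(unreviewed_files_set, full_mr_files):
    """Current pattern: an empty incremental set is falsy, so it falls through."""
    if unreviewed_files_set:
        return sorted(unreviewed_files_set)
    return list(full_mr_files)


def files_with_explicit_check(unreviewed_files_set, full_mr_files):
    """Proposed pattern: only a missing (None) set means "not incremental"."""
    if unreviewed_files_set is not None:
        return sorted(unreviewed_files_set)
    return list(full_mr_files)


full_mr_files = ["src/app.py", "src/util.py"]

# Incremental run that found nothing new since the last review.
truthy_result = files_with_truthy_check(set(), full_mr_files)
explicit_result = files_with_explicit_check(set(), full_mr_files)

# Non-incremental run: both variants return the full list.
full_run = files_with_explicit_check(None, full_mr_files)
```

The explicit check returns an empty list on the "nothing new" path, which is why step (2) matters: the skip message has to be published before any earlier "no files" guard swallows the run.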
