TL;DR: Run AI code review as a GitHub Actions job that triggers on pull_request, feeds the diff (not the whole repo) to an LLM, and posts the result as a PR comment. Scope the token with contents: read and pull-requests: write, keep your API key in Actions secrets, and gate the job so it doesn't run on every trivial commit. The full workflow is below.

Every team that adopts AI code review eventually asks the same question: how do we make this run automatically on pull requests without babysitting it? GitHub Actions is the natural home for that, but a naive setup leaks secrets, floods every PR with noise, or burns tokens re-reviewing the entire repository on each push. This guide walks through a setup I actually run: one that reviews only what changed, posts at most one tidy comment per run, and stays out of the way when it has nothing useful to say.

If you want the bigger picture first — what AI review is good at, where it fails, and how to roll it out to a team — start with the AI Code Review complete guide. This post is the hands-on CI half of that story.

What we're building

The goal is a workflow that:

Triggers when a pull request is opened or updated.
Collects the diff for that PR (not the whole codebase — that wastes tokens and dilutes the review).
Sends the diff to an LLM with a focused review prompt.
Posts the review back as a PR comment.
Fails gracefully — a review job should never block a merge just because the API had a hiccup.

Step 1: Scope the permissions

The most common mistake is giving the workflow more access than it needs. A reviewer job needs to read the code and write a comment — nothing else:

permissions:
  contents: read
  pull-requests: write

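Setting contents: read explicitly matters because the workflow token's default permissions depend on your repository and organization settings, and the permissive setting grants read/write access to contents. Once a permissions block is present, any scope you don't list is set to none. So even if the review step is compromised by a malicious dependency, the token can't push to your branches. This is the single most important line in the whole file.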

Step 2: Store the API key as a secret

Never hard-code an API key in a workflow file — it lives in your git history forever the moment you push. Add it under Settings → Secrets and variables → Actions, then reference it as an environment variable:

env:
  LLM_API_KEY: ${{ secrets.LLM_API_KEY }}

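If you're on a fork-heavy public repo, be aware that secrets are not exposed to pull_request workflows triggered from forks, and the workflow token is read-only there, so the job can neither call the LLM nor post a comment. That's a security feature, not a bug, but it means AI review on external contributions needs the pull_request_target event and extra care: that event runs in the context of the base repository with access to secrets, so never check out and execute the PR's code in it. For internal repos, plain pull_request is simpler and safer.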

Step 3: Grab the diff, not the repo

Here's the part most tutorials get wrong. Feeding the whole repository to the model on every run is slow and expensive, and the model loses the plot in a sea of unchanged files. Instead, fetch just the diff between the base branch and the PR head:

git fetch origin "${BASE_SHA}" --depth=1
git diff "${BASE_SHA}" HEAD > pr.diff

For large PRs, truncate the diff to a sane token budget. A diff over a few thousand lines is usually a sign the PR is too big to review meaningfully anyway — a good reason to add a size check that comments "this PR is large; consider splitting it" instead of attempting a review.
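
The truncation idea can be sketched in a few lines of Python. This is a minimal illustration, not part of the workflow: the function name fit_diff_to_budget is hypothetical, and it uses line count as a rough stand-in for a token budget (the 3000 default simply mirrors the threshold used in the workflow later in this post).

```python
from typing import Tuple


def fit_diff_to_budget(diff: str, max_lines: int = 3000) -> Tuple[str, bool]:
    """Return (diff_text, was_truncated).

    Line count is only a proxy for tokens; a real budget would depend on
    the model's tokenizer, which this sketch does not try to model.
    """
    lines = diff.splitlines()
    if len(lines) <= max_lines:
        return diff, False

    # Keep the first max_lines lines and say how much was dropped,
    # so the model knows it is looking at a partial diff.
    kept = lines[:max_lines]
    dropped = len(lines) - max_lines
    kept.append(f"[diff truncated: {dropped} more lines not shown]")
    return "\n".join(kept) + "\n", True
```

The boolean lets the caller decide between reviewing the truncated diff and posting the "consider splitting it" note instead.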

Step 4: The review prompt

The quality of the review is mostly determined by the prompt. A vague "review this code" produces vague output. Be specific about what you want and — just as important — what you don't want:

You are reviewing a pull request diff. Focus only on:
- Correctness bugs and logic errors
- Security issues (injection, secrets, auth bypass)
- Resource leaks and error handling

Do NOT comment on formatting, style, or naming — a linter handles those.
If the diff looks correct and you have nothing substantive to say,
reply with exactly: "No blocking issues found."

Diff:
<the diff goes here>

That last instruction is what saves you from the noisy-bot problem. Without an explicit "say nothing if there's nothing to say" escape hatch, LLMs will manufacture feedback to seem useful — and a bot that cries wolf on every PR gets muted within a week. For more on writing prompts that produce signal instead of noise, see prompt engineering for developers.
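
In Python, the prompt and its escape hatch might be wired up roughly like this. The names SENTINEL, build_prompt, and is_clean are illustrative assumptions, not something a provider SDK defines; the point is that the sentinel string lives in one place, so the prompt and the "should we comment?" check cannot drift apart.

```python
SENTINEL = "No blocking issues found."

# Instructions from the prompt above; the sentinel is interpolated so the
# text the model is told to emit always matches what the caller checks for.
INSTRUCTIONS = f"""\
You are reviewing a pull request diff. Focus only on:
- Correctness bugs and logic errors
- Security issues (injection, secrets, auth bypass)
- Resource leaks and error handling

Do NOT comment on formatting, style, or naming; a linter handles those.
If the diff looks correct and you have nothing substantive to say,
reply with exactly: "{SENTINEL}"
"""


def build_prompt(diff: str) -> str:
    """Append the diff to the fixed review instructions."""
    return f"{INSTRUCTIONS}\nDiff:\n{diff}"


def is_clean(review: str) -> bool:
    """True when the model used the escape hatch (ignoring stray whitespace)."""
    return review.strip() == SENTINEL
```

Stripping whitespace before comparing is a small design choice: models often add a trailing newline, and an exact byte-for-byte match would treat that as a finding.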

Step 5: The complete workflow

Putting it together, here's the complete workflow. It calls a small helper script (scripts/ai_review.py) to query the model, then posts the comment via the GitHub CLI (gh), which is pre-installed on GitHub-hosted runners:

name: AI Code Review

on:
  pull_request:
    types: [opened, synchronize, reopened]

permissions:
  contents: read
  pull-requests: write

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

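      - name: Build diff
        id: diff
        env:
          BASE_SHA: ${{ github.event.pull_request.base.sha }}
        run: |
          # HEAD is the checked-out PR merge commit; compare it with the base
          git diff "$BASE_SHA" HEAD > pr.diff
          # Export the line count so the next step can skip empty or huge diffs
          LINES=$(wc -l < pr.diff)
          echo "lines=$LINES" >> "$GITHUB_OUTPUT"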

      - name: Run AI review
        if: steps.diff.outputs.lines != '0' && steps.diff.outputs.lines < 3000
        env:
          LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
          GH_TOKEN: ${{ github.token }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
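        run: |
          REVIEW=$(python3 scripts/ai_review.py < pr.diff)
          # Post only when the model reported something beyond the sentinel
          if [ "$REVIEW" != "No blocking issues found." ]; then
            # Pipe the body via stdin so the multi-line text stays inside the YAML block
            printf '### AI review\n\n%s\n' "$REVIEW" | gh pr comment "$PR_NUMBER" --body-file -
          fi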

The if: condition on the review step means empty diffs and oversized PRs are skipped automatically. The comment is only posted when the model actually found something — so a clean PR gets no bot noise at all.

The scripts/ai_review.py referenced here is a thin wrapper: read stdin, build the prompt from Step 4, call your LLM provider's API, print the response. Keep it under 40 lines. The exact API call depends on your provider, but the shape is always the same — send system prompt plus diff, receive text, print it.
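
As a rough sketch of the "send, receive, print" core, here is what the API call could look like using only the standard library. Everything provider-specific is an assumption: the {"prompt": ...} request body, the {"text": ...} response field, and the Bearer header are placeholders, so swap them for whatever your provider documents. The opener parameter exists only so the call can be exercised without a network.

```python
import json
import urllib.request


def request_review(prompt, api_url, api_key,
                   opener=urllib.request.urlopen, timeout=60):
    """POST the prompt and return the model's reply as plain text.

    ASSUMPTION: the request/response JSON shapes below are hypothetical
    and do not describe any real provider's API.
    """
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    request = urllib.request.Request(
        api_url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    # A timeout keeps a stalled API from hanging the CI job.
    with opener(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["text"].strip()
```

The script's entry point would then read the diff with sys.stdin.read(), combine it with the Step 4 prompt, and print the result of request_review, taking the key from the LLM_API_KEY environment variable and the endpoint from a setting of your choosing.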

Step 6: Make failures non-blocking

A review job that fails the whole check suite when the LLM API times out will train your team to ignore red checks. Add continue-on-error: true to the job, or wrap the API call so a failure prints a note and exits zero:

jobs:
  review:
    runs-on: ubuntu-latest
    continue-on-error: true
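
The second option, wrapping the API call, might look something like this inside the Python helper. The names review_or_skip and call_model are hypothetical; call_model stands for whatever function actually talks to your provider. One assumption worth stating: on failure this sketch returns the sentinel string, so the workflow's existing check posts no comment, and the reason is written to stderr where it shows up in the job log.

```python
import sys

SENTINEL = "No blocking issues found."


def review_or_skip(diff, call_model, log=sys.stderr):
    """Return the model's review, or the sentinel if the call fails.

    The job is advisory, so any exception (timeout, HTTP error, bad JSON)
    is downgraded to a log line instead of a non-zero exit.
    """
    try:
        return call_model(diff)
    except Exception as exc:
        # stderr keeps the note out of the captured review text
        print(f"AI review skipped: {exc!r}", file=log)
        return SENTINEL
```

Returning the sentinel does blur "clean" and "skipped" from the PR's point of view; if that distinction matters to your team, return a distinct marker and have the workflow handle it separately.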

AI review is advisory. It should inform the human reviewer, never gate the merge on its own. If you find yourself wanting it to block merges, that's a signal to invest in real tests instead — see AI-generated tests: a practical guide.

Common pitfalls

Reviewing on every push. Use synchronize sparingly — on a busy PR it fires on every commit. Consider only running the full review on opened and reopened, with a lighter check on synchronize.
Leaking the diff to logs. Treat PR content as untrusted input: don't echo the diff into job logs, and don't interpolate it into shell commands.
No cost ceiling. Set a monthly budget alert with your LLM provider. A misconfigured trigger loop can rack up a surprising bill overnight.
Trusting the review blindly. The model will occasionally be confidently wrong. It's a second pair of eyes, not the final word.
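
The first pitfall's suggestion can be expressed as a small decision function. This is only a sketch under assumptions: review_mode is a hypothetical helper, "light" is a placeholder for whatever cheaper check you choose (the post does not define one), and the 3000-line ceiling mirrors the workflow's size gate.

```python
def review_mode(action: str, changed_lines: int, max_lines: int = 3000) -> str:
    """Map a pull_request event action and diff size to 'full', 'light', or 'skip'."""
    # Same gate as the workflow: nothing to review, or too much to review well.
    if changed_lines == 0 or changed_lines >= max_lines:
        return "skip"
    # A fresh or reopened PR gets the full review.
    if action in ("opened", "reopened"):
        return "full"
    # Follow-up pushes get the cheaper check (placeholder for your own choice).
    if action == "synchronize":
        return "light"
    # Any other action is ignored.
    return "skip"
```

In the workflow, the action is available as github.event.action, so the helper script could read it from an environment variable and branch accordingly.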

Where to go next

Once the workflow is stable, the natural next questions are how good is this review actually? and should it replace human review? The honest answer is covered in AI code review vs human review — the short version is that they're complementary, and the CI bot's job is to clear the trivial stuff so humans can focus on architecture and intent.

If you're still choosing a tool rather than rolling your own, the best AI code review tools for developers breakdown compares the managed options that bolt onto GitHub with far less setup than the DIY workflow above.

How to Set Up AI Code Review in GitHub Actions (2026 Guide)
