How to Set Up AI Code Review in GitHub Actions (2026 Guide)
TL;DR: Run AI code review as a GitHub Actions job that triggers on pull_request, feeds the diff (not the whole repo) to an LLM, and posts the result as a single PR comment. Scope the job's token with a permissions block (contents: read, pull-requests: write), keep your API key in Actions secrets, and gate it so it doesn't run on every trivial commit. The full workflow is below.
Every team that adopts AI code review eventually asks the same question: how do we make this run automatically on pull requests without babysitting it? GitHub Actions is the natural home for that, but a naive setup leaks secrets, floods every PR with noise, or burns tokens re-reviewing the entire repository on each push. This guide walks through a setup I actually run: one that reviews only what changed, posts a single tidy comment, and stays out of the way when it has nothing useful to say.
If you want the bigger picture first — what AI review is good at, where it fails, and how to roll it out to a team — start with the AI Code Review complete guide. This post is the hands-on CI half of that story.
What we're building
The goal is a workflow that:
- Triggers when a pull request is opened or updated.
- Collects the diff for that PR (not the whole codebase — that wastes tokens and dilutes the review).
- Sends the diff to an LLM with a focused review prompt.
- Posts the review back as a PR comment.
- Fails gracefully — a review job should never block a merge just because the API had a hiccup.
Step 1: Scope the permissions
The most common mistake is giving the workflow more access than it needs. A reviewer job needs to read the code and write a comment — nothing else:
permissions:
  contents: read
  pull-requests: write
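Setting contents: read explicitly (rather than relying on the repository's default token permissions, which can still be read/write on older repositories and organizations) means that even if the review step is compromised by a malicious dependency, it can't push to your branches. Declaring a permissions block also sets every scope you don't list to none. This is the single most important line in the whole file.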
Step 2: Store the API key as a secret
Never hard-code an API key in a workflow file — it lives in your git history forever the moment you push. Add it under Settings → Secrets and variables → Actions, then reference it as an environment variable:
env:
  LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
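If you're on a fork-heavy public repo, be aware that secrets are not exposed to workflows triggered by pull requests from forks, and the GITHUB_TOKEN those runs receive is read-only by default. That's a security feature, not a bug, but it means AI review on external contributions needs the pull_request_target event and extra care: that event runs with the base repository's secrets, so the job must never check out and execute code from the fork. For internal repos, plain pull_request is simpler and safer.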
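A minimal sketch of how the review script might cope with a missing key, assuming it reads the LLM_API_KEY variable set above. On a fork-triggered pull_request run the secret is not passed, so the variable arrives empty; the helper name get_api_key is hypothetical.

```python
import os
from typing import Mapping, Optional


def get_api_key(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the API key, or None when it is missing or blank.

    Missing and empty are treated the same way, because a secret that
    is withheld from a fork PR shows up as an empty string.
    """
    key = env.get("LLM_API_KEY", "").strip()
    return key or None
```

A caller can then skip the review with a short log message when the result is None, instead of sending an unauthenticated request.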
Step 3: Grab the diff, not the repo
Here's the part most tutorials get wrong. Feeding the whole repository to the model on every run is slow and expensive, and the model loses the plot in a sea of unchanged files. Instead, fetch just the diff between the base branch and the PR head:
git fetch origin "${BASE_SHA}" --depth=1
git diff "${BASE_SHA}" HEAD > pr.diff
For large PRs, truncate the diff to a sane token budget. A diff over a few thousand lines is usually a sign the PR is too big to review meaningfully anyway — a good reason to add a size check that comments "this PR is large; consider splitting it" instead of attempting a review.
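The size check and truncation could be sketched like this. The function names, the 60,000-character budget, and the truncation marker are assumptions for illustration; the 3,000-line threshold simply mirrors "a few thousand lines".

```python
from typing import Tuple

# Wording taken from the suggestion above; adjust to taste.
TOO_LARGE_NOTE = "This PR is large; consider splitting it."


def truncate_diff(diff: str, max_chars: int) -> str:
    """Cut the diff at the last full line that fits in max_chars."""
    if len(diff) <= max_chars:
        return diff
    cut = diff.rfind("\n", 0, max_chars)
    kept = diff[: cut + 1] if cut != -1 else diff[:max_chars] + "\n"
    return kept + "[diff truncated]\n"


def plan_review(
    diff: str, max_lines: int = 3000, max_chars: int = 60_000
) -> Tuple[str, str]:
    """Decide what to do with a diff.

    Returns ("skip", ""), ("comment", note) for oversized PRs, or
    ("review", diff_to_send) with the diff truncated to the budget.
    """
    if not diff.strip():
        return ("skip", "")
    if diff.count("\n") >= max_lines:
        return ("comment", TOO_LARGE_NOTE)
    return ("review", truncate_diff(diff, max_chars))
```

Characters are used as a cheap stand-in for tokens here; a provider-specific tokenizer would give a tighter budget.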
Step 4: The review prompt
The quality of the review is mostly determined by the prompt. A vague "review this code" produces vague output. Be specific about what you want and — just as important — what you don't want:
You are reviewing a pull request diff. Focus only on:
- Correctness bugs and logic errors
- Security issues (injection, secrets, auth bypass)
- Resource leaks and error handling
Do NOT comment on formatting, style, or naming — a linter handles those.
If the diff looks correct and you have nothing substantive to say,
reply with exactly: "No blocking issues found."
Diff:
<the diff goes here>
That last instruction is what saves you from the noisy-bot problem. Without an explicit "say nothing if there's nothing to say" escape hatch, LLMs will manufacture feedback to seem useful — and a bot that cries wolf on every PR gets muted within a week. For more on writing prompts that produce signal instead of noise, see prompt engineering for developers.
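As a rough sketch, the prompt above can be kept as a constant and combined with the diff in one small function. The names SENTINEL, REVIEW_INSTRUCTIONS, and build_prompt are hypothetical; the wording is the prompt from this step.

```python
# The exact reply that means "nothing to report".
SENTINEL = "No blocking issues found."

REVIEW_INSTRUCTIONS = f"""\
You are reviewing a pull request diff. Focus only on:
- Correctness bugs and logic errors
- Security issues (injection, secrets, auth bypass)
- Resource leaks and error handling

Do NOT comment on formatting, style, or naming; a linter handles those.
If the diff looks correct and you have nothing substantive to say,
reply with exactly: "{SENTINEL}"
"""


def build_prompt(diff: str) -> str:
    """Append the diff to the fixed review instructions."""
    return f"{REVIEW_INSTRUCTIONS}\nDiff:\n{diff}"
```

Keeping the sentinel in one constant means the prompt and whatever later checks for it cannot drift apart.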
Step 5: The complete workflow
Putting it together, here's the complete workflow. It calls a small helper script to query the model and posts the comment via the GitHub CLI (gh), which is pre-installed on GitHub-hosted runners:
name: AI Code Review

on:
  pull_request:
    types: [opened, synchronize, reopened]

permissions:
  contents: read
  pull-requests: write

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Build diff
        id: diff
        run: |
          BASE="${{ github.event.pull_request.base.sha }}"
          git diff "$BASE" HEAD > pr.diff
          # Skip empty or huge diffs
          LINES=$(wc -l < pr.diff)
          echo "lines=$LINES" >> "$GITHUB_OUTPUT"

      - name: Run AI review
        if: steps.diff.outputs.lines != '0' && steps.diff.outputs.lines < 3000
        env:
          LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
          GH_TOKEN: ${{ github.token }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
        run: |
          REVIEW=$(python3 scripts/ai_review.py < pr.diff)
          if [ "$REVIEW" != "No blocking issues found." ]; then
            gh pr comment "$PR_NUMBER" --body "### AI review
          $REVIEW"
          fi
The if: condition on the review step means empty diffs and oversized PRs are skipped automatically. The comment is only posted when the model actually found something — so a clean PR gets no bot noise at all.
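The shell test in the workflow is an exact string match (command substitution strips trailing newlines but nothing else), so a reply wrapped in quotes or padded with spaces would still be posted. One possible hardening, sketched here with a hypothetical comment_body helper, is to normalise the reply before deciding whether to comment:

```python
from typing import Optional

SENTINEL = "No blocking issues found."


def comment_body(review: str) -> Optional[str]:
    """Return the comment text to post, or None when there is nothing to say.

    Surrounding whitespace, quotes, and a missing final period are
    tolerated when matching the sentinel reply.
    """
    text = review.strip()
    normalized = text.strip("\"'").strip().rstrip(".")
    if not text or normalized == SENTINEL.rstrip("."):
        return None
    return f"### AI review\n\n{text}"
```

An empty reply is also treated as "nothing to say", which avoids posting a heading with no content under it.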
The scripts/ai_review.py referenced here is a thin wrapper: read stdin, build the prompt from Step 4, call your LLM provider's API, print the response. Keep it under 40 lines. The exact API call depends on your provider, but the shape is always the same — send system prompt plus diff, receive text, print it.
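A sketch of that shape using only the standard library. Everything about the provider here is an assumption: the JSON body with a "prompt" field, the "text" field in the response, and the bearer-token header are placeholders for whatever your provider actually documents, and PROMPT_HEADER stands in for the full prompt from Step 4.

```python
import json
import urllib.request

# Placeholder: substitute the full review prompt from Step 4.
PROMPT_HEADER = "You are reviewing a pull request diff."


def build_request(prompt: str, api_key: str, url: str) -> urllib.request.Request:
    """Build a POST request for a hypothetical JSON completion endpoint."""
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_response(raw: bytes) -> str:
    """Extract the review text, assuming a {"text": "..."} response body."""
    return json.loads(raw.decode("utf-8"))["text"].strip()


def review(diff: str, api_key: str, url: str, timeout: float = 60.0) -> str:
    """Send the prompt plus diff and return the model's reply."""
    request = build_request(f"{PROMPT_HEADER}\n\nDiff:\n{diff}", api_key, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_response(response.read())
```

The script's entry point would then read the diff with sys.stdin.read(), call review() with the key and endpoint taken from the environment, and print the result for the workflow to capture.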
Step 6: Make failures non-blocking
A review job that fails the whole check suite when the LLM API times out will train your team to ignore red checks. Add continue-on-error: true to the job, or wrap the API call so a failure prints a note and exits zero:
jobs:
  review:
    runs-on: ubuntu-latest
    continue-on-error: true
AI review is advisory. It should inform the human reviewer, never gate the merge on its own. If you find yourself wanting it to block merges, that's a signal to invest in real tests instead — see AI-generated tests: a practical guide.
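The "wrap the API call" option might look like the following sketch. The safe_review name and the call_model parameter are hypothetical; call_model is whatever function sends the diff to your provider. On failure the sketch returns the sentinel reply rather than an empty string, on the assumption that the workflow compares stdout against that sentinel and would otherwise post an empty comment.

```python
import sys
from typing import Callable

SENTINEL = "No blocking issues found."


def safe_review(diff: str, call_model: Callable[[str], str]) -> str:
    """Return the model's review, or the sentinel if the call fails.

    Any provider error (timeout, HTTP error, malformed response) is
    logged to stderr and swallowed, so the step can still exit zero.
    """
    try:
        return call_model(diff)
    except Exception as exc:  # advisory job: never fail on provider errors
        print(
            f"AI review skipped: {type(exc).__name__}: {exc}",
            file=sys.stderr,
        )
        return SENTINEL
```

The note goes to stderr so it shows up in the job log without ending up in the PR comment.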
Common pitfalls
- Reviewing on every push. Use synchronize sparingly: on a busy PR it fires on every commit. Consider only running the full review on opened and reopened, with a lighter check on synchronize.
- Leaking the diff to logs. Don't echo the diff in a step that runs on untrusted input; treat PR content as untrusted.
- No cost ceiling. Set a monthly budget alert with your LLM provider. A misconfigured trigger loop can rack up a surprising bill overnight.
- Trusting the review blindly. The model will occasionally be confidently wrong. It's a second pair of eyes, not the final word.
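For the first pitfall, one way to vary behaviour by trigger is to read the action field from the event payload, which GitHub Actions writes to the JSON file named by the GITHUB_EVENT_PATH environment variable. This is a sketch only: the review_mode and action_from_event names and the "full" / "light" / "skip" labels are hypothetical, and what a "light" check does is left to you.

```python
import json
from pathlib import Path

# Actions that get the full review; synchronize gets the lighter check.
FULL_REVIEW_ACTIONS = {"opened", "reopened"}


def review_mode(action: str) -> str:
    """Map a pull_request action to "full", "light", or "skip"."""
    if action in FULL_REVIEW_ACTIONS:
        return "full"
    if action == "synchronize":
        return "light"
    return "skip"


def action_from_event(path: str) -> str:
    """Read the "action" field from a webhook payload file."""
    payload = json.loads(Path(path).read_text(encoding="utf-8"))
    return payload.get("action", "")
```

In the workflow, the script would call action_from_event(os.environ["GITHUB_EVENT_PATH"]) and branch on the result.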
Where to go next
Once the workflow is stable, the natural next questions are how good is this review actually? and should it replace human review? The honest answer is covered in AI code review vs human review — the short version is that they're complementary, and the CI bot's job is to clear the trivial stuff so humans can focus on architecture and intent.
If you're still choosing a tool rather than rolling your own, the best AI code review tools for developers breakdown compares the managed options that bolt onto GitHub with far less setup than the DIY workflow above.