
feat(core): implement truthfulness and verification integrity guardrails#20613

Open
Shafwansafi06 wants to merge 2 commits into google-gemini:main from Shafwansafi06:main

Conversation

@Shafwansafi06

Summary

Implement truthfulness and verification integrity guardrails in the system prompt to mitigate AI hallucinations, specifically addressing issues where the agent claims to have reviewed resources (files, screenshots, command outputs) without performing tool calls.

Closes #19651

Details

This PR directly addresses the "Verification Integrity" failures reported in #19651, where the agent was found to be "lying about having accessed and evaluated" screenshots and code.

The implementation introduces a new system prompt section placed near the end of the prompt, where recency gives its rules extra weight:

  1. Verification Integrity: Explicitly forbids claiming review/read/check status without corresponding tool results in the conversation history.
  2. No Assumed State: Mandates reading before asserting state, preventing "made up" metrics or code logic.
  3. Explicit Uncertainty: Forces the agent to admit when it hasn't accessed a resource, breaking the loop of false confirmations.

These rules are added to both modern (snippets.ts) and legacy (snippets.legacy.ts) prompt compositions to ensure coverage across models.
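The three rules above can be sketched as a prompt snippet. Only the section heading is confirmed by this PR's snapshot check; the function shape and the exact rule wording below are assumptions:

```typescript
// Hypothetical sketch of the renderTruthfulnessGuardrails snippet added to
// packages/core/src/prompts/snippets.ts; wording is assumed, not the PR's
// actual text.
export function renderTruthfulnessGuardrails(): string {
  return [
    '# Truthfulness & Verification Integrity',
    '',
    '- **Verification Integrity:** Never claim to have reviewed, read,',
    '  checked, or verified a resource (file, directory, command output,',
    '  URL) unless a corresponding tool call and its result appear in the',
    '  current conversation history.',
    '- **No Assumed State:** Read the contents or state of a resource',
    '  before asserting it; never infer state from a name, path, or prior',
    '  knowledge, since files can change.',
    '- **Explicit Uncertainty:** If unsure whether a resource was accessed',
    '  in this session, say so and offer to re-read it.',
  ].join('\n');
}
```

Returning a single joined string keeps the snippet easy to snapshot-test and to splice into both prompt compositions.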

How to Validate

  1. Run unit tests for snippets:
    npm test -w @google/gemini-cli-core -- src/prompts/snippets.test.ts
  2. Run integration tests for prompts:
    npm test -w @google/gemini-cli-core -- src/core/prompts.test.ts
  3. Verify snapshots in packages/core/src/core/__snapshots__/prompts.test.ts.snap include the new # Truthfulness & Verification Integrity section.

Pre-Merge Checklist

  • Updated relevant documentation and README (if needed)
  • Added/updated tests (if needed)
  • Noted breaking changes (if any)
  • Validated on required platforms/methods:
    • Linux
      • npm run

@Shafwansafi06 Shafwansafi06 requested a review from a team as a code owner February 27, 2026 21:06
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces crucial truthfulness and verification integrity guardrails into the system prompts for the AI agent. The primary goal is to significantly reduce AI hallucinations by ensuring the agent's responses are grounded in actual tool interactions and verified information, rather than fabricated claims or assumptions. This enhancement directly addresses reported issues where the agent would falsely state it had reviewed resources without performing the necessary actions.

Highlights

  • Verification Integrity: Explicitly forbids the AI agent from claiming to have reviewed, read, checked, or verified any resource (file, directory, command output, URL) unless an explicit tool call and its result are present in the current conversation history.
  • No Assumed State: Mandates that the agent must read the contents or state of a resource before asserting it, preventing assumptions based on name, path, or prior knowledge, as files can change.
  • Explicit Uncertainty: Forces the agent to explicitly admit when it is uncertain about having accessed a resource in the current session and to offer to re-read it, thereby preventing false confirmations.
Changelog
  • packages/core/src/core/__snapshots__/prompts.test.ts.snap
    • Updated various system prompt snapshots to include the newly added 'Truthfulness & Verification Integrity' section.
  • packages/core/src/core/prompts.test.ts
    • Added a new test case to confirm that the core system prompt correctly incorporates the truthfulness guardrails.
  • packages/core/src/prompts/snippets.legacy.ts
    • Introduced the renderTruthfulnessGuardrails function.
    • Integrated the renderTruthfulnessGuardrails function into the legacy prompt composition.
  • packages/core/src/prompts/snippets.test.ts
    • Added a new test file to provide dedicated unit tests for the renderTruthfulnessGuardrails function, covering its content and formatting.
  • packages/core/src/prompts/snippets.ts
    • Introduced the renderTruthfulnessGuardrails function.
    • Integrated the renderTruthfulnessGuardrails function into the modern prompt composition.
Activity
  • No human activity (comments, reviews, etc.) has been recorded for this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces important truthfulness and verification integrity guardrails to the system prompts for both modern and legacy models, aiming to reduce AI hallucinations. The changes are well-tested with new unit tests and updated snapshots. My main feedback is to refactor the duplicated renderTruthfulnessGuardrails function into a shared module to improve maintainability and adhere to the DRY principle, which also aligns with ensuring consistency for critical prompt engineering components.
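The suggested refactor could be sketched as follows. The module path ('./shared/truthfulness') and the constant name are assumptions, not the repository's actual layout:

```typescript
// Hypothetical shared module hoisting the duplicated snippet out of
// snippets.ts and snippets.legacy.ts, so the guardrail text cannot drift
// between the modern and legacy prompt compositions.
const TRUTHFULNESS_GUARDRAILS: string = [
  '# Truthfulness & Verification Integrity',
  '',
  '- Never claim review/read/check status without a tool result in history.',
].join('\n');

export function renderTruthfulnessGuardrails(): string {
  return TRUTHFULNESS_GUARDRAILS;
}

// snippets.ts and snippets.legacy.ts would then each contain only:
//   export { renderTruthfulnessGuardrails } from './shared/truthfulness';
```

With a single definition, the existing unit tests and snapshots continue to exercise both prompt compositions through one source of truth.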

@gemini-cli gemini-cli bot added the area/agent Issues related to Core Agent, Tools, Memory, Sub-Agents, Hooks, Agent Quality label Feb 27, 2026
Development

Successfully merging this pull request may close these issues.

Gemini CLI Struggles and Hallucinations
