"The simple believes everything, but the prudent gives thought to his steps."
This repository collects papers investigating the trustworthiness of retrieval-augmented generation.
Recommendations of missing papers are welcome via Issues or Pull Requests.
- 2026-03: Updated with new papers from 2025-2026!
- 2025-06: Updated with new papers!
- 2025-02: Our latest survey, Towards Trustworthy Retrieval Augmented Generation for Large Language Models: A Survey, is available on arXiv!
- 2024-06: We created this repository to maintain a paper list on the trustworthiness of the LLM retrieval-augmented generation paradigm.
## Surveys on Retrieval-Augmented Generation

- [arxiv] A Systematic Review of Key Retrieval-Augmented Generation (RAG) Systems: Progress, Gaps, and Future Directions. 2025.07
- [arxiv] Retrieval-Augmented Generation: A Comprehensive Survey of Architectures, Enhancements, and Robustness Frontiers. 2025.05
- [arxiv] Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG. 2025.01
- [arxiv] Mitigating Hallucination in Large Language Models (LLMs): An Application-Oriented Survey on RAG, Reasoning, and Agentic Systems. 2025.10
- [arxiv][Github] Retrieval-Augmented Generation for Large Language Models: A Survey. 2024.03
- [arxiv] Evaluation of Retrieval-Augmented Generation: A Survey. 2024.05
- [arxiv] Can Knowledge Graphs Reduce Hallucinations in LLMs?: A Survey. 2023.11
- [arxiv] Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity. 2023.10
- [arxiv] ChatGPT is not Enough: Enhancing Large Language Models with Knowledge Graphs for Fact-aware Language Modeling. 2023.06
- [arxiv] A Survey of Knowledge-Enhanced Pre-trained Language Models. 2023.05
- [Paper] Augmented Language Models: a Survey. 2023.02
## Trustworthiness of Large Language Models

- [arxiv][Website] DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models. 2024.02
- [arxiv] In ChatGPT We Trust? Measuring and Characterizing the Reliability of ChatGPT. 2023.04
- [arxiv] How Robust is GPT-3.5 to Predecessors? A Comprehensive Study on Language Understanding Tasks. 2023.03
- [Paper][Github] Prompting GPT-3 To Be Reliable. 2022.10
## Surveys on Trustworthy RAG

- [arxiv] Towards Trustworthy Retrieval Augmented Generation for Large Language Models: A Survey. 2025.02
- [arxiv] Trustworthiness in Retrieval-Augmented Generation Systems: A Survey. 2024.09
- [arxiv] TrustLLM: Trustworthiness in Large Language Models. 2024.09
## Privacy

- [arxiv] Benchmarking Knowledge-Extraction Attack and Defense on Retrieval-Augmented Generation. 2026.02
- [arxiv] RAG Security and Privacy: Formalizing the Threat Model and Attack Surface. 2025.09
- [arxiv] RAG with Differential Privacy. 2025.01
- [arxiv] Privacy-Preserving Retrieval Augmented Generation with Differential Privacy. 2024.12
- [arxiv] RemoteRAG: A Privacy-Preserving LLM Cloud RAG Service. 2024.12
- [arxiv] Data Extraction Attacks in Retrieval-Augmented Generation via Backdoors. 2024.11
- [arxiv] RAG-Thief: Scalable Extraction of Private Data from Retrieval-Augmented Generation Applications with Agent-based Attacks. 2024.11
- [arxiv] Mask-based Membership Inference Attacks for Retrieval-Augmented Generation. 2024.10
- [arxiv] Phantom: General Trigger Attacks on Retrieval Augmented Language Generation. 2024.10
- [arxiv] Unleashing Worms and Extracting Data: Escalating the Outcome of Attacks against RAG-based Inference in Scale and Severity Using Jailbreaking. 2024.09
- [arxiv] Is My Data in Your Retrieval Database? Membership Inference Attacks Against Retrieval Augmented Generation. 2024.06
- [arxiv] Mitigating the Privacy Issues in Retrieval-Augmented Generation (RAG) via Pure Synthetic Data. 2024.06
- [arxiv] Privacy Implications of Retrieval-Based Language Models. 2024.05
- [arxiv] The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG). 2024.02
## Reliability

- [AAAI] Towards Trustworthy Knowledge Graph Reasoning: An Uncertainty Aware Perspective. 2024.10
- [arxiv] Conformalized Answer Set Prediction for Knowledge Graph Embedding. 2024.08
- [ICLR] Conformal Language Modeling. 2024.06
- [arxiv] TRAQ: Trustworthy Retrieval Augmented Question Answering via Conformal Prediction. 2024.04
- [arxiv] Unsupervised Cross-Task Generalization via Retrieval Augmentation. 2022.04
- [arxiv] Generation-Augmented Retrieval for Open-Domain Question Answering. 2021.05
- [ICML] C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models. 2024.02
- [arxiv] CONFLARE: CONFormal LArge language model REtrieval. 2024.04
## Robustness

- [arxiv] Investigating the Robustness of Retrieval-Augmented Generation at the Query Level. 2025.07
- [arxiv] Real-Time Evaluation Models for RAG: Who Detects Hallucinations Best? 2025.04
- [arxiv] RAGSynth: Synthetic Data for Robust and Faithful RAG Component Optimization. 2025.05
- [ACL] Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models. 2025.05
- [arxiv] Worse than Zero-shot? A Fact-Checking Dataset for Evaluating the Robustness of RAG Against Misleading Retrievals. 2025.05
- [arxiv] QE-RAG: A Robust Retrieval-Augmented Generation Benchmark for Query Entry Errors. 2025.04
- [arxiv] Quantifying the Robustness of Retrieval-Augmented Language Models Against Spurious Features in Grounding Data. 2025.03
- [arxiv] Revisiting Robust RAG: Do We Still Need Complex Robust Training in the Era of Powerful LLMs? 2025.02
- [EMNLP] RAG-QA Arena: Evaluating Domain Robustness for Long-form Retrieval Augmented Question Answering. 2024.10
- [arxiv] Unveil the Duality of Retrieval-Augmented Generation: Theoretical Analysis and Practical Solution. 2024.06
- [ACL] Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training. 2024.05
- [arxiv] Certifiably Robust RAG against Retrieval Corruption. 2024.05
- [ICLR] Making Retrieval-Augmented Language Models Robust to Irrelevant Context. 2024.05
- [arxiv] Evidence-Driven Retrieval Augmented Response Generation for Online Misinformation. 2024.03
- [LREC-COLING] Tug-of-War Between Knowledge: Exploring and Resolving Knowledge Conflicts in Retrieval-Augmented Language Models. 2024.02
- [arxiv] Resolving Knowledge Conflicts in Large Language Models. 2023.10
- [Paper] Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning. 2023.07
## Safety

- [arxiv] Practical Poisoning Attacks against Retrieval-Augmented Generation. 2026.01
- [arxiv] RAG Safety: Exploring Knowledge Poisoning Attacks to Retrieval-Augmented Generation. 2025.07
- [arxiv] RAG LLMs are Not Safer: A Safety Analysis of Retrieval-Augmented Generation for Large Language Models. 2025.04
- [arxiv] Understanding Data Poisoning Attacks for RAG: Insights and Algorithms. 2025.01
- [WWW] Traceback of Poisoned Texts in Poisoning Attacks to Retrieval-Augmented Generation. 2025.02
- [arxiv] FlipedRAG: Black-Box Opinion Manipulation Attacks to Retrieval-Augmented Generation of Large Language Models. 2025.01
- [arxiv] GraphRAG under Fire. 2025.01
- [arxiv] TrustRAG: Enhancing Robustness and Trustworthiness in RAG. 2025.01
- [arxiv] Poison-RAG: Adversarial Data Poisoning Attacks on Retrieval-Augmented Generation in Recommender Systems. 2025.01
- [arxiv] RevPRAG: Revealing Poisoning Attacks in Retrieval-Augmented Generation through LLM Activation Analysis. 2024.11
- [ACL] ATM: Adversarial Tuning Multi-agent System Makes a Robust Retrieval-Augmented Generator. 2024.10
- [EMNLP] Typos that Broke the RAG's Back: Genetic Attack on RAG Pipeline by Simulating Documents in the Wild via Low-level Perturbations. 2024.08
- [arxiv] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models. 2024.06
- [arxiv] BadRAG: Identifying Vulnerabilities in Retrieval Augmented Generation of Large Language Models. 2024.06
- [arxiv] Poisoned LangChain: Jailbreak LLMs by LangChain. 2024.06
- [arxiv] Pandora: Jailbreak GPTs by Retrieval Augmented Generation Poisoning. 2024.02
- [arxiv] Backdoor Attacks on Dense Passage Retrievers for Disseminating Misinformation. 2024.02
- [arxiv] Poisoning Retrieval Corpora by Injecting Adversarial Passages. 2023.10
## Fairness

- [arxiv] After Retrieval, Before Generation: Enhancing the Trustworthiness of Large Language Models in RAG. 2025.05
- [WWW] Bias-Aware Agent: Enhancing Fairness in AI-Driven Knowledge Retrieval. 2025.05
- [arxiv] Bias Amplification in RAG: Poisoning Knowledge Retrieval to Steer LLMs. 2025.06
- [arxiv] The Other Side of the Coin: Exploring Fairness in Retrieval-Augmented Generation. 2025.04
- [arxiv] Mitigating Bias in RAG: Controlling the Embedder. 2025.02
- [arxiv] No Free Lunch: Retrieval-Augmented Generation Undermines Fairness in LLMs, Even for Vigilant Users. 2024.10
- [arxiv] Towards Fair RAG: On the Impact of Fair Ranking in Retrieval-Augmented Generation. 2024.09
- [arxiv] Does RAG Introduce Unfairness in LLMs? Evaluating Fairness in Retrieval-Augmented Generation Systems. 2024.09
- [arxiv] BadRAG: Identifying Vulnerabilities in Retrieval Augmented Generation of Large Language Models. 2024.06
- [arxiv] FairRAG: Fair Human Generation via Fair Retrieval Augmentation. 2024.04
- [arxiv] Mitigating Test-Time Bias for Fair Image Retrieval. 2023.05
## Explainability

- [arxiv] Reasoning on Graphs: Faithful and Interpretable Large Language Model Reasoning. 2024.02
- [SIGIR24] RAG-Ex: A Generic Framework for Explaining Retrieval Augmented Generation. 2024.07
- [arxiv] From Feature Importance to Natural Language Explanations Using LLMs with RAG. 2024.06
- [SIGIR24] IR-RAG @ SIGIR24: Information Retrieval's Role in RAG Systems. 2024.07
- [arxiv] RAGE Against the Machine: Retrieval-Augmented LLM Explanations. 2024.05
## Accountability

- [arxiv] A Survey of Text Watermarking in the Era of Large Language Models. 2023.12
- [EMNLP24] CODEIP: A Grammar-Guided Multi-Bit Watermark for Large Language Models of Code. 2024.11
- [NAACL24] SEMSTAMP: A Semantic Watermark with Paraphrastic Robustness for Text Generation. 2024.06
- [arxiv] On the Reliability of Watermarks for Large Language Models. 2024.05
- [arxiv] On the Learnability of Watermarks for Language Models. 2024.05
- [arxiv] Publicly-Detectable Watermarking for Language Models. 2025.01
- [arxiv] Ward: Provable RAG Dataset Inference via LLM Watermarks. 2024.10
## Benchmarks

- [arxiv] RE-RAG: Improving Open-Domain QA Performance and Interpretability with Relevance Estimator in Retrieval-Augmented Generation. 2024.06
- [arxiv] RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models. 2024.05
- [ACL24] WaterBench: Towards Holistic Evaluation of Watermarks for Large Language Models. 2024.08
## Contributing

- ✨ Add a new paper or update an existing one via Issues or Pull Requests.
- 🔧 Use the same format as existing entries to describe the work.
- A very brief explanation of why you think a paper should be added or updated is recommended (not required).

Don't worry if you get something wrong; it will be fixed for you. Just feel free to contribute and promote your awesome work here! 🤩 We'll get back to you in time ~