
# Data Integrity Policy


**June 2025**

The Data Integrity Policy outlines the principles and procedures System follows to ensure the accuracy, consistency, and reliability of its data throughout its lifecycle. This policy helps maintain trust in the organization’s data systems, supports compliance with regulatory standards, and ensures that data can be used confidently to support decision-making and reporting. It establishes guidelines for data entry, storage, access, modification, and transfer to prevent unauthorized changes, corruption, or loss.

These are the eight key steps we take to ensure the accuracy and reliability of our data:

1. **We maintain high security and compliance standards for storage and management of data.** We conduct regular audits with third-party providers to ensure compliance with strict standards. We design our processes to minimize exposure of private or personally identifiable data.
2. **We use a continuous human-in-the-loop framework to monitor and improve our extraction process.** The data in the System Graph and System Clinical Graph are based on cutting-edge text extraction, tuned to exceed industry standards for accuracy. We regularly measure accuracy with validation from subject matter experts.
3. **We use automated tools to check for data quality at every step in our processing pipeline.** We employ monitoring to ensure data integrity and quality.
4. **We ensure that data in the System Graph and System Clinical Graph are traceable directly to the original source.** We include original source metadata for every extracted relationship so that System’s products all have citations for original sources. System Syntheses are always created from extracted relationships and not from pre-summarized text. Users can track all citations found in any of System’s products back to the original source.
5. **We engineer System Synthesis to minimize the likelihood of “hallucination.”** Unlike some research summary services, we do not synthesize large amounts of unstructured text; we create summaries of relationships extracted from original sources using rule-based algorithms (not LLM-generated text). Users only see syntheses that are based on context provided to LLMs (and not based on knowledge encoded in LLMs). We further reduce the likelihood of hallucinations by running post-processing to ensure that every statement in a synthesis is traceable to one or more source statements.
6. **We regularly benchmark the accuracy of System Synthesis** using question-answering standards produced by third parties (including BioASQ and Mayo Clinic). In our latest benchmark review, we had 85-90% accuracy across different test datasets.
7. **We have created tools and processes for users to suggest revisions** to information they find inaccurate. Any data that is flagged by a user is hidden from other users while our team works to correct the information.
8. **We proactively remove retracted studies from our data.** Studies flagged by Retraction Watch and similar services are automatically removed from our data when identified; any extracted relationships from retracted studies are not seen by users.
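As an illustration only, the visibility rules in steps 4, 7, and 8 can be sketched as a single filter over extracted relationships. This is a minimal sketch under stated assumptions, not System's actual implementation: the `Relationship` record, its fields, and the `visible_relationships` function are hypothetical names introduced here for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Relationship:
    """Hypothetical record for one extracted relationship."""
    statement: str
    source_id: Optional[str]       # identifier of the original source (assumed field)
    flagged_by_user: bool = False  # set when a user reports the data as inaccurate


def visible_relationships(
    relationships: Iterable[Relationship],
    retracted_source_ids: Set[str],
) -> List[Relationship]:
    """Return only the relationships that may be shown to users."""
    visible = []
    for rel in relationships:
        if not rel.source_id:
            # Step 4: a relationship without source metadata is not traceable.
            continue
        if rel.source_id in retracted_source_ids:
            # Step 8: relationships from retracted studies are never shown.
            continue
        if rel.flagged_by_user:
            # Step 7: flagged data stays hidden while it is under review.
            continue
        visible.append(rel)
    return visible
```

In this sketch a relationship is shown only if it passes all three checks, so a single failed condition is enough to keep it out of any downstream product.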

