AI Analysis

Last Updated: March 27, 2026 (NQU-401 review)
Last Reviewed: April 1, 2026

Overview

Nquiry uses AI to analyze evidence and generate findings based on professional investigation standards. The AI applies a rigorous evidence evaluation framework derived from CIGIE (Council of the Inspectors General on Integrity and Efficiency), GAO (Government Accountability Office), and other cross-sector standards.

Analysis Types

Question Analysis

Analyzes evidence to answer a specific investigation question.

Output includes:

Direct answer with confidence level
Evidence cited with relevance assessment
Alternative explanations considered
Evidence gaps identified
Recommended follow-up actions

Best for:

Answering specific factual questions
Evaluating allegations
Determining compliance with criteria

Topic Analysis (Topic Synthesis)

Synthesizes findings across all questions in a topic using a dedicated TopicSynthesisOutput schema (NQU-472, deployed 2026-03-26).

Output includes:

Source analysis summaries with provenance (which question analyses fed this synthesis)
Cross-question patterns with convergence/divergence indicators
Inter-question contradictions (where analyses disagree)
Emergent gaps (gaps only visible when looking across questions)
Cited findings with evidence references
Topic-level conclusions

Best for:

Understanding a subject area holistically
Identifying patterns and contradictions across questions
Discovering gaps that individual analyses miss
Preparing topic sections for reports

Gap Analysis

Identifies missing evidence and unanswered questions.

Output includes:

Questions with insufficient evidence
Evidence types that would strengthen findings
Recommended evidence sources
Priority gaps to address

Best for:

Planning additional evidence collection
Assessing readiness for conclusions
Identifying risks to findings

Overall Summary

High-level synthesis of the entire investigation.

Output includes:

Executive-level summary
Key findings across all topics
Overall evidence assessment
Conclusions and recommendations

Best for:

Executive briefings
Report introductions
Stakeholder communications

Generating Analysis

From the Analysis Page

Navigate to Analysis in the sidebar
Click "Generate Analysis" to open the dialog
Select analysis type (Question, Topic, Gap, Summary)
Choose specific questions or topics (if applicable)
Optionally add investigator direction
Click "Generate Analysis"

Async generation (NQU-484): The dialog closes immediately after firing the request. A blue "Generating..." banner appears in the analysis list. When generation completes, a toast notification appears and the status updates to "Complete" or "Failed." You can navigate freely while generation runs in the background.

Batch generation: Select multiple questions and generate analyses for all of them at once. Progress shows "Submitting 1/N..." during the request phase, then "Generating analysis 1 of N..." during processing (NQU-467).

Gray-out of analyzed questions (NQU-485): Questions that already have a completed analysis are disabled (grayed out) in the Generate Analysis dialog with an "analyzed — vN" version indicator. "Select All" only selects unanalyzed questions. Questions with failed analyses remain selectable.
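The selection rules above can be sketched in a few lines. This is only an illustration: the `Question` class, its field names, and the status strings are hypothetical, and it assumes that "Select All" also picks up questions whose last analysis failed, since those remain selectable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Question:
    id: str
    # Status of the latest analysis: None (never analyzed), "complete", or "failed".
    # These status strings are illustrative, not the product's actual values.
    analysis_status: Optional[str] = None
    analysis_version: Optional[int] = None


def is_selectable(question: Question) -> bool:
    """A question is disabled only when it already has a completed analysis."""
    return question.analysis_status != "complete"


def select_all(questions: list[Question]) -> list[str]:
    """Return the IDs that 'Select All' would tick: every selectable question."""
    return [q.id for q in questions if is_selectable(q)]
```

For example, given one unanalyzed question, one with a completed v2 analysis, and one with a failed analysis, `select_all` returns the first and third IDs and skips the completed one.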

Investigator Direction

Provide specific guidance to focus the AI:

"Focus on timeline inconsistencies"
"Evaluate against the procurement policy Section 3.2"
"Consider the vendor's explanation in Exhibit 5"

Direction is included in the AI prompt but doesn't override the evaluation framework.

Regeneration via Request Revision

Once an analysis is complete, the only path to a new version is through Request Revision on the existing analysis (NQU-485/490):

Open the existing analysis
Click "Request Revision"
Select a reason from the dropdown (e.g., new evidence, different direction, quality concerns)
Add feedback text explaining what to change
Click "Regenerate"

This preserves audit trail integrity — every new version has a documented reason.

Admin/test workflow: Admins can alternatively delete an existing analysis entirely, which re-enables the question in the Generate Analysis dialog for a clean regeneration.

Quality Metrics

Every analysis includes quality metrics to help you assess reliability.

Overall Confidence

Level	Meaning
Established	High quality across all metrics
Probable	Good quality with minor gaps
Possible	Acceptable but notable limitations
Insufficient	Below thresholds, review carefully

Faithfulness Score

Measures whether AI claims are grounded in evidence:

95-100%: Nearly all claims verified
85-94%: Most claims verified
70-84%: Notable unsupported claims
<70%: Many unverified claims

Coverage Score

Measures whether the analysis addresses all aspects:

95-100%: All aspects addressed
85-94%: Minor gaps only
70-84%: Notable gaps
<70%: Major gaps
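The Faithfulness and Coverage bands share the same cut-offs (95, 85, 70), so one lookup can describe both. The sketch below is an illustration rather than the product's code: the table and function names are made up, and it assumes fractional scores fall into the band whose lower bound they meet (so 94.5 is treated as part of the 85-94% band).

```python
# (lower bound, description) pairs, highest band first.
FAITHFULNESS_BANDS = [
    (95, "Nearly all claims verified"),
    (85, "Most claims verified"),
    (70, "Notable unsupported claims"),
    (0, "Many unverified claims"),
]

COVERAGE_BANDS = [
    (95, "All aspects addressed"),
    (85, "Minor gaps only"),
    (70, "Notable gaps"),
    (0, "Major gaps"),
]


def describe_score(score: float, bands: list[tuple[int, str]]) -> str:
    """Map a 0-100 percentage to the description of the band it falls in."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    for lower_bound, description in bands:
        if score >= lower_bound:
            return description
    return bands[-1][1]  # not reached while the last band starts at 0
```

Usage: `describe_score(88, FAITHFULNESS_BANDS)` gives "Most claims verified", and `describe_score(62, COVERAGE_BANDS)` gives "Major gaps".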

Retrieval Quality

Measures how relevant the retrieved evidence was:

Strong: Average similarity > 0.85
Moderate: Average similarity 0.70-0.85
Weak: Average similarity < 0.70
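As a rough illustration of the thresholds above, the following sketch averages a list of similarity scores and returns the matching label. The function name is hypothetical, and the handling of an empty list is an assumption.

```python
from statistics import mean


def retrieval_quality(similarities: list[float]) -> str:
    """Label retrieval quality from the average similarity of retrieved evidence."""
    if not similarities:
        raise ValueError("at least one similarity score is required")
    average = mean(similarities)
    if average > 0.85:
        return "Strong"
    if average >= 0.70:
        return "Moderate"  # 0.70 through 0.85 inclusive
    return "Weak"
```

Note that an average of exactly 0.85 lands in Moderate, because Strong requires a value above 0.85.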

See AI Quality Metrics for detailed information.

How AI Analysis Works

1. Evidence Retrieval (Three-Stage Hybrid Pipeline)

The system uses a three-stage hybrid retrieval approach (NQU-379, NQU-462):

Stage 1 — Keyword Search: PostgreSQL tsvector/tsquery finds evidence containing the same terms as the question. Catches exact-match evidence (case numbers, names, policy codes) that semantic search misses.

Stage 2 — Semantic Search (Vector Embeddings): Amazon Titan Text Embeddings V2 converts questions and evidence into meaning vectors, which are stored and searched with pgvector. Catches evidence that's relevant even when it uses different words.

Stage 3 — Reranking: Cohere Rerank 3.5 re-evaluates merged results from Stages 1 and 2 by reading question and evidence together. Promotes keyword-surfaced results that semantic search missed; demotes false positives.

Results are combined using team-draft interleaving (alternating picks from vector and keyword ranked lists) before reranking (NQU-462). Query-adaptive weighting automatically boosts keyword weight for entity-heavy queries containing proper nouns, dates, and identifiers.
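The merge step can be pictured with a small sketch. It is a simplified, deterministic variant of team-draft interleaving: the two lists take strict turns, whereas the classic algorithm randomizes which list picks first in each round. The function name and the use of plain string IDs are assumptions for illustration, and query-adaptive weighting is not modeled here.

```python
from typing import Optional


def interleave(
    vector_ranked: list[str],
    keyword_ranked: list[str],
    limit: Optional[int] = None,
) -> list[str]:
    """Alternate picks from two ranked lists, skipping IDs already chosen."""
    ranked_lists = [vector_ranked, keyword_ranked]
    positions = [0, 0]
    merged: list[str] = []
    seen: set[str] = set()
    turn = 0  # 0 = vector list picks next, 1 = keyword list picks next

    while positions[0] < len(ranked_lists[0]) or positions[1] < len(ranked_lists[1]):
        if limit is not None and len(merged) >= limit:
            break
        ranked = ranked_lists[turn]
        pos = positions[turn]
        # Skip candidates the other list has already contributed.
        while pos < len(ranked) and ranked[pos] in seen:
            pos += 1
        if pos < len(ranked):
            merged.append(ranked[pos])
            seen.add(ranked[pos])
            pos += 1
        positions[turn] = pos
        turn = 1 - turn  # hand the next pick to the other list

    return merged
```

With `vector_ranked = ["a", "b", "c"]` and `keyword_ranked = ["b", "d", "a"]`, the merged candidate list is `["a", "b", "c", "d"]`: each passage appears once, and a passage found only by keyword search ("d") still reaches the reranker.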

2. Document Processing

For file attachments:

PDFs are sent directly to Claude for visual analysis
Word documents have text extracted
Excel files are converted to CSV format
Images are analyzed visually
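One way to picture the routing above is a lookup from file extension to handling step. This is a sketch only: the extension list and the route names are assumptions, since the documentation names file types but not specific extensions or internal function names.

```python
from pathlib import Path

# Hypothetical route names; the extensions are illustrative examples of each type.
ROUTES = {
    ".pdf": "visual_analysis",    # PDFs are sent directly for visual analysis
    ".docx": "extract_text",      # Word documents have text extracted
    ".xlsx": "convert_to_csv",    # Excel files are converted to CSV
    ".png": "visual_analysis",    # images are analyzed visually
    ".jpg": "visual_analysis",
    ".jpeg": "visual_analysis",
}


def processing_route(filename: str) -> str:
    """Return the handling step for an attachment, based on its extension."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ROUTES:
        raise ValueError(f"unsupported attachment type: {filename}")
    return ROUTES[suffix]
```

For instance, `processing_route("Exhibit5.PDF")` returns "visual_analysis" and `processing_route("ledger.xlsx")` returns "convert_to_csv".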

3. Analysis Generation

The AI receives:

System prompt with evaluation framework
Investigation context (title, focus, work type)
Relevant evidence chunks
Full document files
Background documents (if configured)
Framework documents (if configured)
User direction (if provided)

4. Citation Repair

The AI sometimes generates valid cited findings with correct evidence titles but hallucinated UUIDs instead of real evidence IDs. A citation repair step (repairCitationIds(), NQU-469) runs between AI output and the sanitizer, matching hallucinated IDs to real ones by title. The sanitizer remains as a safety net.
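The idea behind title-based repair can be sketched as follows. This is not the actual repairCitationIds() implementation: the dictionary keys, the title normalization, and the choice to leave ambiguous titles untouched are all assumptions made for illustration.

```python
def normalize_title(title: str) -> str:
    """Lowercase and collapse whitespace so small formatting differences still match."""
    return " ".join(title.lower().split())


def repair_citation_ids_sketch(cited_findings: list[dict], evidence: list[dict]) -> list[dict]:
    """Replace unknown evidence IDs with the real ID of the evidence that has the same title.

    cited_findings: dicts with "evidence_id" and "evidence_title" (hypothetical keys).
    evidence: dicts with "id" and "title" (hypothetical keys).
    """
    valid_ids = {item["id"] for item in evidence}

    # Index real evidence by title; titles shared by several items are ambiguous.
    id_by_title: dict[str, str] = {}
    ambiguous: set[str] = set()
    for item in evidence:
        key = normalize_title(item["title"])
        if key in id_by_title:
            ambiguous.add(key)
        id_by_title[key] = item["id"]

    repaired = []
    for finding in cited_findings:
        fixed = dict(finding)  # copy so the caller's data is not mutated
        if finding["evidence_id"] not in valid_ids:
            key = normalize_title(finding["evidence_title"])
            if key in id_by_title and key not in ambiguous:
                fixed["evidence_id"] = id_by_title[key]
            # Otherwise leave the ID as-is; the sanitizer acts as the safety net.
        repaired.append(fixed)
    return repaired
```

Citations whose IDs are already valid pass through unchanged, and anything that cannot be matched confidently is left for the sanitizer to handle.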

5. Quality Checking

After generation:

Faithfulness checker verifies claims against evidence
Coverage checker identifies gaps in question coverage
Overall confidence is calculated
Results are stored with analysis
Retrieval quality is logged (retrieved vs. cited evidence) for continuous monitoring (NQU-462)

Quality checks work for all analysis types, including gap analysis and error check (NQU-474 fixed text extraction for non-question output schemas).
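The retrieved-versus-cited comparison mentioned above could look something like the sketch below. The exact fields the system logs are not documented here, so the function name, the returned keys, and the "citation rate" ratio are assumptions.

```python
def retrieval_usage(retrieved_ids: list[str], cited_ids: list[str]) -> dict:
    """Compare the evidence that was retrieved with the evidence the analysis cited."""
    retrieved = list(dict.fromkeys(retrieved_ids))  # de-duplicate, keep order
    cited = set(cited_ids)

    used = [eid for eid in retrieved if eid in cited]
    unused = [eid for eid in retrieved if eid not in cited]
    # Cited IDs that were never retrieved; these deserve a closer look.
    unknown = sorted(cited - set(retrieved))

    return {
        "retrieved_count": len(retrieved),
        "cited_count": len(used),
        "citation_rate": len(used) / len(retrieved) if retrieved else 0.0,
        "unused": unused,
        "unknown_citations": unknown,
    }
```

A low citation rate over many analyses would suggest that retrieval is surfacing passages the AI does not find useful.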

Evidence Considered Panel

Each analysis includes an expandable "Evidence Considered" panel (NQU-486, redesigned 2026-03-26) showing which evidence passages were retrieved and how:

Plain-language retrieval story: "Found by: Keyword + Vector | Rerank confirmed" per passage, replacing raw pipeline diagnostics
Header: Shows total evidence passages used
Raw scores: Available behind an info toggle for advanced users
No misleading labels: Removed the old "Low / Below threshold" similarity labels that confused users despite the system working correctly

Evidence Considered Panel: Gap Impact Badges

Gap analysis outputs now include impact badges (Critical / Significant / Minor) with tooltip definitions explaining what each level means operationally (NQU-480).

Evidence Evaluation Framework

The AI uses 10 quality criteria from CIGIE/GAO standards as a conceptual framework when assessing evidence — they shape what the AI looks for, but they're no longer scored individually per item. Per NQU-100, the active output is a simpler 2-field assessment (confidence + reasoning) rolled up across all 10 dimensions.

The 10 framework dimensions:

Relevance: Does it address the question?
Reliability: Is the source credible?
Sufficiency: Is there enough evidence?
Validity: Does it prove what it claims?
Competence: Is the source qualified?
Completeness: Are there gaps?
Timeliness: Is it current?
Objectivity: Fact-based or opinion?
Authenticity: Is it genuine?
Consistency: Does it align with other evidence?
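The shape of the 2-field assessment can be sketched as a small data structure. The class and field names are hypothetical, and the sketch assumes the confidence field uses the same four levels as the Confidence Levels table; the ten dimensions appear only as a reference list, because they are not scored individually.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    ESTABLISHED = "Established"
    PROBABLE = "Probable"
    POSSIBLE = "Possible"
    INSUFFICIENT = "Insufficient"


# Conceptual dimensions that shape the reasoning; none is scored on its own.
FRAMEWORK_DIMENSIONS = (
    "Relevance", "Reliability", "Sufficiency", "Validity", "Competence",
    "Completeness", "Timeliness", "Objectivity", "Authenticity", "Consistency",
)


@dataclass(frozen=True)
class EvidenceAssessment:
    """One rolled-up judgment across all framework dimensions."""
    confidence: Confidence
    reasoning: str

    def __post_init__(self) -> None:
        if not self.reasoning.strip():
            raise ValueError("reasoning must not be empty")
```

An assessment might then read `EvidenceAssessment(Confidence.PROBABLE, "Two independent sources agree; one date is unverified.")`, with the reasoning text touching on whichever dimensions matter for that item.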

See Evidence Evaluation Framework for the complete historical framework.

Confidence Levels

The AI assigns confidence to conclusions:

Level	Meaning
Established	Strong, sufficient, convergent evidence
Probable	Moderate evidence or strong with minor gaps
Possible	Some evidence but significant gaps or weaknesses
Insufficient	Evidence too weak or incomplete to support reliable conclusions

Finding Status

After reviewing analysis, assign a finding status:

Status	When to Use
Pending	Not yet reviewed
Substantiated	Evidence clearly supports the finding
Not Substantiated	Evidence does not support the allegation
Inconclusive	Evidence is insufficient to determine

Finding status can be set on the Questions page or Analysis page.

Background Documents

Context documents that inform the AI:

Charge letters
Organizational charts
Scope memos
Relevant policies

Toggle "Include in Analysis" to control which documents are sent to the AI.

Framework Documents

Evaluation criteria documents:

Policies and procedures
Regulations and standards
Industry frameworks
Audit criteria

The AI uses these as the basis for evaluating compliance.

Usage and Quotas

AI Generation Limits

Plan	Generation Limit
Trial	15 total
Core	25/month
Pro	50/month

AI generations include:

Question/topic/gap/summary analysis
Report section generation
AI Agent chat messages
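To make the quota arithmetic concrete, here is a minimal sketch using the limits from the table above. The function and constant names are hypothetical, and `used` is assumed to be the number of generations counted so far in the current month (or since sign-up for Trial, whose limit is a total rather than a monthly allowance).

```python
# (limit, period) per plan, taken from the table above.
PLAN_LIMITS = {
    "Trial": (15, "total"),
    "Core": (25, "month"),
    "Pro": (50, "month"),
}


def remaining_generations(plan: str, used: int) -> int:
    """Return how many AI generations are left, never going below zero."""
    if plan not in PLAN_LIMITS:
        raise ValueError(f"unknown plan: {plan}")
    if used < 0:
        raise ValueError("used must not be negative")
    limit, _period = PLAN_LIMITS[plan]
    return max(limit - used, 0)
```

For example, a Core plan with 10 generations used this month has 15 remaining. Because report sections and AI Agent chat messages count toward the same limit, they reduce the remaining figure just as analyses do.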

Monitoring Usage

View usage in Settings → Billing:

Current period usage
Limit and remaining
Usage history

Best Practices

Before Generating

Ensure evidence is uploaded and linked
Check that embeddings are processed
Configure background/framework documents
Write a clear focus statement

Providing Direction

Be specific about what to focus on
Reference specific documents or policies
Ask for particular aspects to be addressed
Note any concerns to investigate

Reviewing Output

Check quality metrics first
Verify key claims against evidence
Note any unsupported claims
Review coverage gaps
Assess if regeneration is needed

Iterating

Add missing evidence before regenerating
Provide more specific direction
Split complex questions into parts
Try different analysis types

Troubleshooting

Low Faithfulness Score

Evidence may not directly support the AI's phrasing
Add more explicit evidence
Check if relevant evidence was retrieved

Low Coverage Score

Question may be too complex
Evidence may not address all aspects
Provide direction to focus on gaps

Weak Retrieval

Add more evidence related to the topic
Ensure evidence has been processed for embeddings
Rephrase question to match evidence terminology

Validation Failed

AI produced malformed output
Try regenerating
Contact support if persistent

Related Documentation

AI Quality Metrics - Detailed metrics explanation
Evidence Evaluation Framework - Full framework text
Investigation Workflow - Analysis in context
