AI Analysis
Last Updated: March 27, 2026 (NQU-401 review)
Last Reviewed: 2026-04-01
Overview
Nquiry uses AI to analyze evidence and generate findings based on professional investigation standards. The AI applies a rigorous evidence evaluation framework derived from CIGIE (Council of the Inspectors General on Integrity and Efficiency), GAO (Government Accountability Office), and other cross-sector standards.
Analysis Types
Question Analysis
Analyzes evidence to answer a specific investigation question.
Output includes:
- Direct answer with confidence level
- Evidence cited with relevance assessment
- Alternative explanations considered
- Evidence gaps identified
- Recommended follow-up actions
Best for:
- Answering specific factual questions
- Evaluating allegations
- Determining compliance with criteria
Topic Analysis (Topic Synthesis)
Synthesizes findings across all questions in a topic using a dedicated TopicSynthesisOutput schema (NQU-472, deployed 2026-03-26).
Output includes:
- Source analysis summaries with provenance (which question analyses fed this synthesis)
- Cross-question patterns with convergence/divergence indicators
- Inter-question contradictions (where analyses disagree)
- Emergent gaps (gaps only visible when looking across questions)
- Cited findings with evidence references
- Topic-level conclusions
Best for:
- Understanding a subject area holistically
- Identifying patterns and contradictions across questions
- Discovering gaps that individual analyses miss
- Preparing topic sections for reports
Gap Analysis
Identifies missing evidence and unanswered questions.
Output includes:
- Questions with insufficient evidence
- Evidence types that would strengthen findings
- Recommended evidence sources
- Priority gaps to address
Best for:
- Planning additional evidence collection
- Assessing readiness for conclusions
- Identifying risks to findings
Overall Summary
High-level synthesis of the entire investigation.
Output includes:
- Executive-level summary
- Key findings across all topics
- Overall evidence assessment
- Conclusions and recommendations
Best for:
- Executive briefings
- Report introductions
- Stakeholder communications
Generating Analysis
From the Analysis Page
- Navigate to Analysis in the sidebar
- Click "Generate Analysis" to open the dialog
- Select analysis type (Question, Topic, Gap, Summary)
- Choose specific questions or topics (if applicable)
- Optionally add investigator direction
- Click "Generate Analysis"
Async generation (NQU-484): The dialog closes as soon as the request is submitted. A blue "Generating..." banner appears in the analysis list. When generation completes, a toast notification appears and the status updates to "Complete" or "Failed." You can navigate freely while generation runs in the background.
Batch generation: Select multiple questions and generate analyses for all of them at once. Progress shows "Submitting 1/N..." during the request phase, then "Generating analysis 1 of N..." during processing (NQU-467).
Gray-out of analyzed questions (NQU-485): Questions that already have a completed analysis are disabled (grayed out) in the Generate Analysis dialog with an "analyzed — vN" version indicator. "Select All" only selects unanalyzed questions. Questions with failed analyses remain selectable.
Investigator Direction
Provide specific guidance to focus the AI:
- "Focus on timeline inconsistencies"
- "Evaluate against the procurement policy Section 3.2"
- "Consider the vendor's explanation in Exhibit 5"
Direction is included in the AI prompt but doesn't override the evaluation framework.
Regeneration via Request Revision
Once an analysis is complete, the only path to a new version is through Request Revision on the existing analysis (NQU-485/490):
- Open the existing analysis
- Click "Request Revision"
- Select a reason from the dropdown (e.g., new evidence, different direction, quality concerns)
- Add feedback text explaining what to change
- Click "Regenerate"
This preserves audit trail integrity — every new version has a documented reason.
Admin/test workflow: Admins can alternatively delete an existing analysis entirely, which re-enables the question in the Generate Analysis dialog for a clean regeneration.
Quality Metrics
Every analysis includes quality metrics to help you assess reliability.
Overall Confidence
| Level | Meaning |
|---|---|
| Established | High quality across all metrics |
| Probable | Good quality with minor gaps |
| Possible | Acceptable but notable limitations |
| Insufficient | Below thresholds, review carefully |
Faithfulness Score
Measures whether AI claims are grounded in evidence:
- 95-100%: Nearly all claims verified
- 85-94%: Most claims verified
- 70-84%: Notable unsupported claims
- <70%: Many unverified claims
Coverage Score
Measures whether the analysis addresses all aspects:
- 95-100%: All aspects addressed
- 85-94%: Minor gaps only
- 70-84%: Notable gaps
- <70%: Major gaps
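The faithfulness and coverage bands above can be read as a simple lookup. The sketch below is illustrative only: the function name `describe_score` and the `BANDS` structure are hypothetical, and it assumes a fractional score such as 94.5 falls into the lower band, which the lists above do not specify.

```python
# Band descriptions copied from the Faithfulness Score and Coverage Score lists.
# Each entry is (lower bound in percent, description).
BANDS = {
    "faithfulness": [
        (95, "Nearly all claims verified"),
        (85, "Most claims verified"),
        (70, "Notable unsupported claims"),
        (0, "Many unverified claims"),
    ],
    "coverage": [
        (95, "All aspects addressed"),
        (85, "Minor gaps only"),
        (70, "Notable gaps"),
        (0, "Major gaps"),
    ],
}


def describe_score(metric: str, percent: float) -> str:
    """Return the documented description for a score between 0 and 100."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Bands are ordered from highest to lowest, so the first match wins.
    for lower_bound, description in BANDS[metric]:
        if percent >= lower_bound:
            return description
    raise AssertionError("unreachable: the last band starts at 0")
```

For example, `describe_score("coverage", 85)` returns "Minor gaps only".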
Retrieval Quality
Measures how relevant the retrieved evidence was:
- Strong: Average similarity > 0.85
- Moderate: Average similarity 0.70-0.85
- Weak: Average similarity < 0.70
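As a rough illustration of the thresholds above, the following sketch averages a list of similarity scores and maps the result to a label. The function name `retrieval_quality` is hypothetical, and treating a list of per-passage scores as the input is an assumption; only the three thresholds come from this page.

```python
from statistics import mean


def retrieval_quality(similarities: list[float]) -> str:
    """Map the average similarity of retrieved passages to a quality label."""
    if not similarities:
        raise ValueError("at least one similarity score is required")
    average = mean(similarities)
    if average > 0.85:
        return "Strong"
    if average >= 0.70:
        # Covers the documented 0.70-0.85 range, including both endpoints.
        return "Moderate"
    return "Weak"
```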
See AI Quality Metrics for detailed information.
How AI Analysis Works
1. Evidence Retrieval (Three-Stage Hybrid Pipeline)
The system uses a three-stage hybrid retrieval approach (NQU-379, NQU-462):
Stage 1 — Keyword Search: PostgreSQL tsvector/tsquery finds evidence containing the same terms as the question. Catches exact-match evidence (case numbers, names, policy codes) that semantic search misses.
Stage 2 — Semantic Search (Vector Embeddings): Amazon Titan Text Embeddings V2 via pgvector converts questions and evidence into meaning vectors. Catches evidence that's relevant even when it uses different words.
Stage 3 — Reranking: Cohere Rerank 3.5 re-evaluates merged results from Stages 1 and 2 by reading question and evidence together. Promotes keyword-surfaced results that semantic search missed; demotes false positives.
Results are combined using team-draft interleaving (alternating picks from vector and keyword ranked lists) before reranking (NQU-462). Query-adaptive weighting automatically boosts keyword weight for entity-heavy queries containing proper nouns, dates, and identifiers.
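The merge step can be pictured with a small sketch. This is a simplified, deterministic version of the alternating picks described above, not Nquiry's actual code: the function name `interleave`, letting the vector list pick first, and skipping duplicates are all assumptions. The classic team-draft formulation also randomizes which list picks first in each round, which is omitted here.

```python
from typing import Optional


def interleave(
    vector_ranked: list[str],
    keyword_ranked: list[str],
    limit: Optional[int] = None,
) -> list[str]:
    """Alternate picks from two ranked lists of evidence IDs, skipping duplicates."""
    ranked_lists = [vector_ranked, keyword_ranked]
    positions = [0, 0]  # next unread index in each list
    merged: list[str] = []
    seen: set[str] = set()
    turn = 0  # 0 = vector list picks, 1 = keyword list picks

    while any(positions[i] < len(ranked_lists[i]) for i in range(2)):
        if limit is not None and len(merged) >= limit:
            break
        source = ranked_lists[turn]
        pos = positions[turn]
        # Skip items the other list has already contributed.
        while pos < len(source) and source[pos] in seen:
            pos += 1
        if pos < len(source):
            merged.append(source[pos])
            seen.add(source[pos])
            pos += 1
        positions[turn] = pos
        turn = 1 - turn  # hand the next pick to the other list

    return merged
```

With `vector_ranked = ["e1", "e2", "e3"]` and `keyword_ranked = ["e2", "e4"]`, the merged candidate list passed on to reranking would be `["e1", "e2", "e3", "e4"]`.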
2. Document Processing
For file attachments:
- PDFs are sent directly to Claude for visual analysis
- Word documents have text extracted
- Excel files are converted to CSV format
- Images are analyzed visually
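The handling rules above amount to a dispatch on file type. The sketch below shows one way to express that; the function name `handling_for` and the specific extension lists are illustrative assumptions, and the handling labels simply paraphrase the list above.

```python
from pathlib import Path

# Extension lists are assumptions for illustration; this page names only the
# file categories (PDF, Word, Excel, images).
HANDLING_BY_EXTENSION = {
    ".pdf": "send file to the model for visual analysis",
    ".docx": "extract text",
    ".doc": "extract text",
    ".xlsx": "convert to CSV",
    ".xls": "convert to CSV",
    ".png": "analyze visually",
    ".jpg": "analyze visually",
    ".jpeg": "analyze visually",
}


def handling_for(filename: str) -> str:
    """Return the handling rule for an attachment based on its extension."""
    extension = Path(filename).suffix.lower()
    return HANDLING_BY_EXTENSION.get(extension, "unsupported")
```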
3. Analysis Generation
The AI receives:
- System prompt with evaluation framework
- Investigation context (title, focus, work type)
- Relevant evidence chunks
- Full document files
- Background documents (if configured)
- Framework documents (if configured)
- User direction (if provided)
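To make the list above concrete, here is a hedged sketch of how the inputs might be grouped before a request is built. The class `AnalysisInputs`, its field names, and the helper `included_sections` are hypothetical; the sketch assumes that optional inputs are simply left out when empty.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisInputs:
    """Hypothetical container mirroring the inputs listed above."""
    system_prompt: str                 # includes the evaluation framework
    investigation_context: str         # title, focus, work type
    evidence_chunks: list[str]
    document_files: list[str] = field(default_factory=list)
    background_documents: list[str] = field(default_factory=list)  # if configured
    framework_documents: list[str] = field(default_factory=list)   # if configured
    user_direction: str = ""                                        # if provided


def included_sections(inputs: AnalysisInputs) -> list[str]:
    """List which sections would be sent, skipping empty optional ones."""
    sections = ["system_prompt", "investigation_context", "evidence_chunks"]
    optional = [
        "document_files",
        "background_documents",
        "framework_documents",
        "user_direction",
    ]
    for name in optional:
        if getattr(inputs, name):
            sections.append(name)
    return sections
```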
4. Citation Repair
The AI sometimes generates valid cited findings with correct evidence titles but hallucinated UUIDs instead of real evidence IDs. A citation repair step (repairCitationIds(), NQU-469) runs between AI output and the sanitizer, matching hallucinated IDs to real ones by title. The sanitizer remains as a safety net.
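A minimal sketch of title-based repair follows. It is not the real `repairCitationIds()` implementation: the Python name `repair_citation_ids`, the dictionary keys, the case-insensitive title normalization, and the choice to leave ambiguous or unmatched titles untouched (for the sanitizer to handle) are all assumptions.

```python
def repair_citation_ids(findings: list[dict], evidence: list[dict]) -> list[dict]:
    """Replace unknown evidence IDs with real ones when the cited title matches.

    findings: dicts with "evidence_id" and "evidence_title" (assumed keys)
    evidence: dicts with "id" and "title" (assumed keys)
    """
    def normalize(title: str) -> str:
        return title.strip().casefold()

    valid_ids = {item["id"] for item in evidence}

    # Keep only titles that identify exactly one evidence item.
    ids_by_title: dict[str, list[str]] = {}
    for item in evidence:
        ids_by_title.setdefault(normalize(item["title"]), []).append(item["id"])
    unique_id_by_title = {
        title: ids[0] for title, ids in ids_by_title.items() if len(ids) == 1
    }

    repaired = []
    for finding in findings:
        fixed = dict(finding)  # copy so the input is not mutated
        if fixed["evidence_id"] not in valid_ids:
            match = unique_id_by_title.get(normalize(fixed["evidence_title"]))
            if match is not None:
                fixed["evidence_id"] = match
            # Unmatched citations are left as-is for the sanitizer to handle.
        repaired.append(fixed)
    return repaired
```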
5. Quality Checking
After generation:
- Faithfulness checker verifies claims against evidence
- Coverage checker identifies gaps in question coverage
- Overall confidence is calculated
- Results are stored with analysis
- Retrieval quality is logged (retrieved vs. cited evidence) for continuous monitoring (NQU-462)
Quality checks now work correctly for all analysis types including gap analysis and error check (NQU-474 fixed text extraction for non-question output schemas).
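The retrieved-versus-cited comparison in the logging step can be illustrated with a short sketch. The function name `retrieval_usage` and the fields of the returned record are hypothetical; this page does not describe the actual log format.

```python
def retrieval_usage(retrieved_ids: list[str], cited_ids: list[str]) -> dict:
    """Compare the evidence that was retrieved with the evidence that was cited."""
    retrieved = set(retrieved_ids)
    cited = set(cited_ids)
    cited_and_retrieved = retrieved & cited
    return {
        "retrieved": len(retrieved),
        "cited": len(cited_and_retrieved),
        # Citations that do not correspond to any retrieved evidence.
        "cited_not_retrieved": sorted(cited - retrieved),
        # Share of retrieved evidence that the analysis actually cited.
        "citation_rate": len(cited_and_retrieved) / len(retrieved) if retrieved else 0.0,
    }
```

A low citation rate over time could suggest that retrieval is surfacing passages the analysis does not need, which is the kind of signal continuous monitoring is meant to catch.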
Evidence Considered Panel
Each analysis includes an expandable "Evidence Considered" panel (NQU-486, redesigned 2026-03-26) showing which evidence passages were retrieved and how:
- Plain-language retrieval story: "Found by: Keyword + Vector | Rerank confirmed" per passage, replacing raw pipeline diagnostics
- Header: Shows total evidence passages used
- Raw scores: Available behind an info toggle for advanced users
- No misleading labels: Removed the old "Low / Below threshold" similarity labels that confused users despite the system working correctly
Gap Impact Badges
Gap analysis outputs now include impact badges (Critical / Significant / Minor) with tooltip definitions explaining what each level means operationally (NQU-480).
Evidence Evaluation Framework
The AI uses 10 quality criteria from CIGIE/GAO standards as a conceptual framework when assessing evidence — they shape what the AI looks for, but they're no longer scored individually per item. Per NQU-100, the active output is a simpler 2-field assessment (confidence + reasoning) rolled up across all 10 dimensions.
The 10 framework dimensions:
- Relevance: Does it address the question?
- Reliability: Is the source credible?
- Sufficiency: Is there enough evidence?
- Validity: Does it prove what it claims?
- Competence: Is the source qualified?
- Completeness: Are there gaps?
- Timeliness: Is it current?
- Objectivity: Fact-based or opinion?
- Authenticity: Is it genuine?
- Consistency: Does it align with other evidence?
See Evidence Evaluation Framework for the complete historical framework.
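The two-field assessment can be pictured as a small record. The sketch below is illustrative: the names `EvidenceAssessment`, `Confidence`, and `FRAMEWORK_DIMENSIONS` are hypothetical, and the confidence values are taken from the Confidence Levels table on this page. The dimensions appear only as a reference tuple because they are no longer scored individually.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    ESTABLISHED = "Established"
    PROBABLE = "Probable"
    POSSIBLE = "Possible"
    INSUFFICIENT = "Insufficient"


# The 10 dimensions guide the assessment but carry no individual scores.
FRAMEWORK_DIMENSIONS = (
    "Relevance", "Reliability", "Sufficiency", "Validity", "Competence",
    "Completeness", "Timeliness", "Objectivity", "Authenticity", "Consistency",
)


@dataclass(frozen=True)
class EvidenceAssessment:
    """Rolled-up assessment: one confidence level plus the reasoning behind it."""
    confidence: Confidence
    reasoning: str
```

An assessment might then look like `EvidenceAssessment(Confidence.PROBABLE, "Two independent sources agree, but one is undated.")`.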
Confidence Levels
The AI assigns confidence to conclusions:
| Level | Meaning |
|---|---|
| Established | Strong, sufficient, convergent evidence |
| Probable | Moderate evidence or strong with minor gaps |
| Possible | Some evidence but significant gaps or weaknesses |
| Insufficient | Evidence too weak or incomplete to support reliable conclusions |
Finding Status
After reviewing analysis, assign a finding status:
| Status | When to Use |
|---|---|
| Pending | Not yet reviewed |
| Substantiated | Evidence clearly supports the finding |
| Not Substantiated | Evidence does not support the allegation |
| Inconclusive | Evidence is insufficient to determine |
Finding status can be set on the Questions page or Analysis page.
Background Documents
Context documents that inform the AI:
- Charge letters
- Organizational charts
- Scope memos
- Relevant policies
Toggle "Include in Analysis" to control which documents are sent to the AI.
Framework Documents
Evaluation criteria documents:
- Policies and procedures
- Regulations and standards
- Industry frameworks
- Audit criteria
The AI uses these as the basis for evaluating compliance.
Usage and Quotas
AI Generation Limits
| Plan | Generation Limit |
|---|---|
| Trial | 15 total |
| Core | 25/month |
| Pro | 50/month |
AI generations include:
- Question/topic/gap/summary analysis
- Report section generation
- AI Agent chat messages
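As a worked example of the arithmetic behind the limits, the sketch below computes how many generations remain. The names `PLAN_LIMITS` and `remaining_generations` are hypothetical; the limits are copied from the table above, and the sketch assumes every counted generation (analysis, report section, or chat message) uses one unit.

```python
# Limits from the AI Generation Limits table. The Trial limit is a total,
# while Core and Pro limits reset monthly.
PLAN_LIMITS = {"Trial": 15, "Core": 25, "Pro": 50}


def remaining_generations(plan: str, used: int) -> int:
    """Return how many AI generations are left for the current period."""
    if used < 0:
        raise ValueError("used must not be negative")
    return max(PLAN_LIMITS[plan] - used, 0)
```

For a Core plan with 20 generations used this period, `remaining_generations("Core", 20)` returns 5.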
Monitoring Usage
View usage in Settings → Billing:
- Current period usage
- Limit and remaining
- Usage history
Best Practices
Before Generating
- Ensure evidence is uploaded and linked
- Check that embeddings are processed
- Configure background/framework documents
- Write a clear focus statement
Providing Direction
- Be specific about what to focus on
- Reference specific documents or policies
- Ask for particular aspects to be addressed
- Note any concerns to investigate
Reviewing Output
- Check quality metrics first
- Verify key claims against evidence
- Note any unsupported claims
- Review coverage gaps
- Assess if regeneration is needed
Iterating
- Add missing evidence before regenerating
- Provide more specific direction
- Split complex questions into parts
- Try different analysis types
Troubleshooting
Low Faithfulness Score
- Evidence may not directly support the AI's phrasing
- Add more explicit evidence
- Check if relevant evidence was retrieved
Low Coverage Score
- Question may be too complex
- Evidence may not address all aspects
- Provide direction to focus on gaps
Weak Retrieval
- Add more evidence related to the topic
- Ensure evidence has been processed for embeddings
- Rephrase question to match evidence terminology
Validation Failed
- The AI produced malformed output
- Try regenerating
- Contact support if the problem persists
Related Documentation
- AI Quality Metrics - Detailed metrics explanation
- Evidence Evaluation Framework - Full framework text
- Investigation Workflow - Analysis in context