AI Analysis

Last Updated: March 27, 2026 (NQU-401 review)
Last Reviewed: 2026-04-01

Overview

Nquiry uses AI to analyze evidence and generate findings based on professional investigation standards. The AI applies a rigorous evidence evaluation framework derived from CIGIE (Council of the Inspectors General on Integrity and Efficiency), GAO (Government Accountability Office), and other cross-sector standards.


Analysis Types

Question Analysis

Analyzes evidence to answer a specific investigation question.

Output includes:

  • Direct answer with confidence level
  • Evidence cited with relevance assessment
  • Alternative explanations considered
  • Evidence gaps identified
  • Recommended follow-up actions

Best for:

  • Answering specific factual questions
  • Evaluating allegations
  • Determining compliance with criteria

Topic Analysis (Topic Synthesis)

Synthesizes findings across all questions in a topic using a dedicated TopicSynthesisOutput schema (NQU-472, deployed 2026-03-26).

Output includes:

  • Source analysis summaries with provenance (which question analyses fed this synthesis)
  • Cross-question patterns with convergence/divergence indicators
  • Inter-question contradictions (where analyses disagree)
  • Emergent gaps (gaps only visible when looking across questions)
  • Cited findings with evidence references
  • Topic-level conclusions
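
For readers who consume the synthesis programmatically, the output sections listed above might map onto a structure like the one below. This is an illustrative sketch only: the class and field names are hypothetical, and the actual TopicSynthesisOutput schema may be shaped differently.

```python
from dataclasses import dataclass, field


@dataclass
class CrossQuestionPattern:
    description: str
    # Assumed labels: "convergent" when analyses agree, "divergent" when they do not
    indicator: str
    question_ids: list[str] = field(default_factory=list)


@dataclass
class TopicSynthesisSketch:
    # Hypothetical field names mirroring the documented output sections
    source_analysis_ids: list[str] = field(default_factory=list)
    patterns: list[CrossQuestionPattern] = field(default_factory=list)
    contradictions: list[str] = field(default_factory=list)
    emergent_gaps: list[str] = field(default_factory=list)
    cited_findings: list[str] = field(default_factory=list)
    conclusions: list[str] = field(default_factory=list)

    def has_open_issues(self) -> bool:
        """True when the synthesis surfaced contradictions or emergent gaps."""
        return bool(self.contradictions or self.emergent_gaps)
```

A check like has_open_issues() is one way a reviewer's tooling could flag topics that need another look before report drafting.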

Best for:

  • Understanding a subject area holistically
  • Identifying patterns and contradictions across questions
  • Discovering gaps that individual analyses miss
  • Preparing topic sections for reports

Gap Analysis

Identifies missing evidence and unanswered questions.

Output includes:

  • Questions with insufficient evidence
  • Evidence types that would strengthen findings
  • Recommended evidence sources
  • Priority gaps to address

Best for:

  • Planning additional evidence collection
  • Assessing readiness for conclusions
  • Identifying risks to findings

Overall Summary

High-level synthesis of the entire investigation.

Output includes:

  • Executive-level summary
  • Key findings across all topics
  • Overall evidence assessment
  • Conclusions and recommendations

Best for:

  • Executive briefings
  • Report introductions
  • Stakeholder communications

Generating Analysis

From the Analysis Page

  1. Navigate to Analysis in the sidebar
  2. Click "Generate Analysis" to open the dialog
  3. Select analysis type (Question, Topic, Gap, Summary)
  4. Choose specific questions or topics (if applicable)
  5. Optionally add investigator direction
  6. Click "Generate Analysis"

Async generation (NQU-484): The dialog closes immediately after firing the request. A blue "Generating..." banner appears in the analysis list. When generation completes, a toast notification appears and the status updates to "Complete" or "Failed." You can navigate freely while generation runs in the background.

Batch generation: Select multiple questions and generate analyses for all of them at once. Progress shows "Submitting 1/N..." during the request phase, then "Generating analysis 1 of N..." during processing (NQU-467).

Gray-out of analyzed questions (NQU-485): Questions that already have a completed analysis are disabled (grayed out) in the Generate Analysis dialog with an "analyzed — vN" version indicator. "Select All" only selects unanalyzed questions. Questions with failed analyses remain selectable.
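
The selection rule described above can be sketched as follows. This is a minimal illustration, not the product's code: the Question type and the status strings "complete" and "failed" are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Question:
    id: str
    # Status of the latest analysis: None, "complete", or "failed" (assumed values)
    analysis_status: Optional[str] = None
    version: int = 0


def is_selectable(question: Question) -> bool:
    """Only questions with a completed analysis are disabled in the dialog."""
    return question.analysis_status != "complete"


def select_all(questions: list[Question]) -> list[str]:
    """'Select All' picks every selectable question and skips analyzed ones."""
    return [q.id for q in questions if is_selectable(q)]
```

With one unanalyzed, one completed, and one failed question, select_all returns the unanalyzed and failed questions and leaves the completed one out.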

Investigator Direction

Provide specific guidance to focus the AI:

  • "Focus on timeline inconsistencies"
  • "Evaluate against the procurement policy Section 3.2"
  • "Consider the vendor's explanation in Exhibit 5"

Direction is included in the AI prompt but doesn't override the evaluation framework.

Regeneration via Request Revision

Once an analysis is complete, the only path to a new version is through Request Revision on the existing analysis (NQU-485/490):

  1. Open the existing analysis
  2. Click "Request Revision"
  3. Select a reason from the dropdown (e.g., new evidence, different direction, quality concerns)
  4. Add feedback text explaining what to change
  5. Click "Regenerate"

This preserves audit trail integrity — every new version has a documented reason.

Admin/test workflow: Admins can alternatively delete an existing analysis entirely, which re-enables the question in the Generate Analysis dialog for a clean regeneration.


Quality Metrics

Every analysis includes quality metrics to help you assess reliability.

Overall Confidence

Level | Meaning
Established | High quality across all metrics
Probable | Good quality with minor gaps
Possible | Acceptable but notable limitations
Insufficient | Below thresholds, review carefully

Faithfulness Score

Measures whether AI claims are grounded in evidence:

  • 95-100%: Nearly all claims verified
  • 85-94%: Most claims verified
  • 70-84%: Notable unsupported claims
  • <70%: Many unverified claims

Coverage Score

Measures whether the analysis addresses all aspects:

  • 95-100%: All aspects addressed
  • 85-94%: Minor gaps only
  • 70-84%: Notable gaps
  • <70%: Major gaps
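
The faithfulness and coverage scores share the same percentage bands, so a single lookup can describe either one. The sketch below is illustrative; the function names are hypothetical, and placing fractional values such as 94.5 in the lower band is an assumption, since the documented bands use whole numbers.

```python
FAITHFULNESS_LABELS = [
    "Nearly all claims verified",
    "Most claims verified",
    "Notable unsupported claims",
    "Many unverified claims",
]
COVERAGE_LABELS = [
    "All aspects addressed",
    "Minor gaps only",
    "Notable gaps",
    "Major gaps",
]


def band_index(percent: float) -> int:
    """Return 0 for 95-100, 1 for 85-94, 2 for 70-84, 3 for below 70."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    for index, floor in enumerate((95, 85, 70)):
        if percent >= floor:
            return index
    return 3


def describe(metric: str, percent: float) -> str:
    """Look up the documented description for a score."""
    labels = {"faithfulness": FAITHFULNESS_LABELS, "coverage": COVERAGE_LABELS}[metric]
    return labels[band_index(percent)]
```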

Retrieval Quality

Measures how relevant the retrieved evidence was:

  • Strong: Average similarity > 0.85
  • Moderate: Average similarity 0.70-0.85
  • Weak: Average similarity < 0.70
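
As a worked illustration of the thresholds above, the label can be derived from the average similarity of the retrieved passages. The function name is hypothetical, and how the product actually aggregates similarity scores is not specified here.

```python
def retrieval_quality(similarities: list[float]) -> str:
    """Classify retrieval quality from the average similarity score."""
    if not similarities:
        raise ValueError("at least one similarity score is required")
    average = sum(similarities) / len(similarities)
    if average > 0.85:
        return "Strong"
    if average >= 0.70:
        return "Moderate"
    return "Weak"
```

For example, passages scoring 0.8 and 0.7 average 0.75, which falls in the Moderate range.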

See AI Quality Metrics for detailed information.


How AI Analysis Works

1. Evidence Retrieval (Three-Stage Hybrid Pipeline)

The system uses a three-stage hybrid retrieval approach (NQU-379, NQU-462):

Stage 1 — Keyword Search: PostgreSQL tsvector/tsquery finds evidence containing the same terms as the question. Catches exact-match evidence (case numbers, names, policy codes) that semantic search misses.

Stage 2 — Semantic Search (Vector Embeddings): Amazon Titan Text Embeddings V2 via pgvector converts questions and evidence into meaning vectors. Catches evidence that's relevant even when it uses different words.

Stage 3 — Reranking: Cohere Rerank 3.5 re-evaluates merged results from Stages 1 and 2 by reading question and evidence together. Promotes keyword-surfaced results that semantic search missed; demotes false positives.

Results are combined using team-draft interleaving (alternating picks from vector and keyword ranked lists) before reranking (NQU-462). Query-adaptive weighting automatically boosts keyword weight for entity-heavy queries containing proper nouns, dates, and identifiers.
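
The alternating merge can be sketched as below. This is a simplified, deterministic illustration under stated assumptions: the function name is hypothetical, the vector list always picks first, and duplicates are skipped. Team-draft interleaving as commonly described randomizes which list picks first in each round, and the production pipeline may differ in that and other details.

```python
def interleave(vector_ranked: list[str], keyword_ranked: list[str]) -> list[str]:
    """Alternate picks from two ranked lists, skipping IDs already chosen."""
    merged: list[str] = []
    seen: set[str] = set()
    sources = [iter(vector_ranked), iter(keyword_ranked)]
    exhausted = [False, False]
    turn = 0  # 0 = vector list picks, 1 = keyword list picks
    while not all(exhausted):
        if not exhausted[turn]:
            for evidence_id in sources[turn]:
                if evidence_id not in seen:
                    seen.add(evidence_id)
                    merged.append(evidence_id)
                    break
            else:
                # The list ran out without yielding a new ID
                exhausted[turn] = True
        turn = 1 - turn
    return merged
```

The merged list is what a reranker would then reorder; because each list contributes its best remaining item in turn, a keyword-only hit cannot be crowded out by a long run of vector results.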

2. Document Processing

For file attachments:

  • PDFs are sent directly to Claude for visual analysis
  • Word documents have text extracted
  • Excel files are converted to CSV format
  • Images are analyzed visually
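
The routing above amounts to a lookup by file type. The sketch below is only an illustration: the extension list and the route names are assumptions, not the product's actual handlers.

```python
from pathlib import Path

# Hypothetical route names; the extensions covered are assumptions
ROUTES = {
    ".pdf": "send_for_visual_analysis",
    ".docx": "extract_text",
    ".xlsx": "convert_to_csv",
    ".png": "analyze_image",
    ".jpg": "analyze_image",
    ".jpeg": "analyze_image",
}


def processing_route(filename: str) -> str:
    """Pick a processing route from the file extension, ignoring case."""
    suffix = Path(filename).suffix.lower()
    return ROUTES.get(suffix, "unsupported")
```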

3. Analysis Generation

The AI receives:

  • System prompt with evaluation framework
  • Investigation context (title, focus, work type)
  • Relevant evidence chunks
  • Full document files
  • Background documents (if configured)
  • Framework documents (if configured)
  • User direction (if provided)
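
One way to picture the inputs listed above is as a single bundle in which the optional parts are omitted when they are not configured. The type and field names below are hypothetical and serve only to illustrate which parts are conditional.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AnalysisInputs:
    system_prompt: str
    investigation_context: str
    evidence_chunks: list[str]
    document_files: list[str] = field(default_factory=list)
    background_documents: list[str] = field(default_factory=list)
    framework_documents: list[str] = field(default_factory=list)
    user_direction: Optional[str] = None


def included_sections(inputs: AnalysisInputs) -> list[str]:
    """Names of the sections that would be sent, skipping empty optional ones."""
    sections = ["system_prompt", "investigation_context", "evidence_chunks"]
    optional = {
        "document_files": inputs.document_files,
        "background_documents": inputs.background_documents,
        "framework_documents": inputs.framework_documents,
        "user_direction": inputs.user_direction,
    }
    sections.extend(name for name, value in optional.items() if value)
    return sections
```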

4. Citation Repair

The AI sometimes generates valid cited findings with correct evidence titles but hallucinated UUIDs instead of real evidence IDs. A citation repair step (repairCitationIds(), NQU-469) runs between AI output and the sanitizer, matching hallucinated IDs to real ones by title. The sanitizer remains as a safety net.
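
The title-matching idea can be sketched as follows. This is not the actual repairCitationIds() implementation: the Python function, the dictionary keys, the title normalization, and the choice to leave ambiguous titles unrepaired are all assumptions made for illustration.

```python
from typing import Optional


def normalize(title: str) -> str:
    """Collapse whitespace and ignore case so near-identical titles match."""
    return " ".join(title.split()).casefold()


def repair_citation_ids(findings: list[dict], evidence: list[dict]) -> list[dict]:
    """Replace unknown evidence IDs with the real ID that has the same title."""
    valid_ids = {item["id"] for item in evidence}
    by_title: dict[str, Optional[str]] = {}
    for item in evidence:
        key = normalize(item["title"])
        # A title shared by several evidence items is ambiguous: map it to None
        by_title[key] = None if key in by_title else item["id"]

    repaired = []
    for finding in findings:
        fixed = dict(finding)
        if fixed["evidence_id"] not in valid_ids:
            match = by_title.get(normalize(fixed["evidence_title"]))
            if match is not None:
                fixed["evidence_id"] = match
            # Otherwise leave the ID unchanged for the sanitizer to handle
        repaired.append(fixed)
    return repaired
```

Citations whose titles match nothing, or match more than one evidence item, pass through untouched, which is why a later sanitizer is still useful as a safety net.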

5. Quality Checking

After generation:

  • Faithfulness checker verifies claims against evidence
  • Coverage checker identifies gaps in question coverage
  • Overall confidence is calculated
  • Results are stored with analysis
  • Retrieval quality is logged (retrieved vs. cited evidence) for continuous monitoring (NQU-462)
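
The retrieved-versus-cited comparison in the last step can be illustrated with a small summary function. The function name and the fields it returns are hypothetical; what the product actually logs is not specified here.

```python
def citation_usage(retrieved_ids: list[str], cited_ids: list[str]) -> dict:
    """Summarize how much of the retrieved evidence the analysis cited."""
    retrieved = set(retrieved_ids)
    cited = set(cited_ids)
    used = retrieved & cited
    return {
        "retrieved": len(retrieved),
        "cited": len(used),
        # Cited IDs that were never retrieved deserve a closer look
        "cited_not_retrieved": sorted(cited - retrieved),
        "usage_rate": len(used) / len(retrieved) if retrieved else 0.0,
    }
```

A persistently low usage rate suggests retrieval is surfacing passages the analysis does not need, while any cited ID that was never retrieved points to a citation problem.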

Quality checks work for all analysis types, including gap analysis and error check. NQU-474 fixed text extraction for non-question output schemas, which had previously prevented the checks from running correctly on those types.


Evidence Considered Panel

Each analysis includes an expandable "Evidence Considered" panel (NQU-486, redesigned 2026-03-26) showing which evidence passages were retrieved and how:

  • Plain-language retrieval story: "Found by: Keyword + Vector | Rerank confirmed" per passage, replacing raw pipeline diagnostics
  • Header: Shows total evidence passages used
  • Raw scores: Available behind an info toggle for advanced users
  • No misleading labels: Removed the old "Low / Below threshold" similarity labels that confused users despite the system working correctly

Evidence Considered Panel: Gap Impact Badges

Gap analysis outputs now include impact badges (Critical / Significant / Minor) with tooltip definitions explaining what each level means operationally (NQU-480).


Evidence Evaluation Framework

The AI uses 10 quality criteria from CIGIE/GAO standards as a conceptual framework when assessing evidence — they shape what the AI looks for, but they're no longer scored individually per item. Per NQU-100, the active output is a simpler 2-field assessment (confidence + reasoning) rolled up across all 10 dimensions.

The 10 framework dimensions:

  1. Relevance: Does it address the question?
  2. Reliability: Is the source credible?
  3. Sufficiency: Is there enough evidence?
  4. Validity: Does it prove what it claims?
  5. Competence: Is the source qualified?
  6. Completeness: Are there gaps?
  7. Timeliness: Is it current?
  8. Objectivity: Fact-based or opinion?
  9. Authenticity: Is it genuine?
  10. Consistency: Does it align with other evidence?
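
The two-field assessment can be pictured as a small record. The sketch below is illustrative: the type names are hypothetical, and reusing the four confidence levels documented elsewhere on this page as the confidence values is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    ESTABLISHED = "Established"
    PROBABLE = "Probable"
    POSSIBLE = "Possible"
    INSUFFICIENT = "Insufficient"


# The ten dimensions guide the assessment but are not scored individually
FRAMEWORK_DIMENSIONS = (
    "Relevance", "Reliability", "Sufficiency", "Validity", "Competence",
    "Completeness", "Timeliness", "Objectivity", "Authenticity", "Consistency",
)


@dataclass(frozen=True)
class EvidenceAssessment:
    confidence: Confidence
    reasoning: str
```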

See Evidence Evaluation Framework for the complete historical framework.


Confidence Levels

The AI assigns confidence to conclusions:

Level | Meaning
Established | Strong, sufficient, convergent evidence
Probable | Moderate evidence or strong with minor gaps
Possible | Some evidence but significant gaps or weaknesses
Insufficient | Evidence too weak or incomplete to support reliable conclusions

Finding Status

After reviewing analysis, assign a finding status:

Status | When to Use
Pending | Not yet reviewed
Substantiated | Evidence clearly supports the finding
Not Substantiated | Evidence does not support the allegation
Inconclusive | Evidence is insufficient to determine

Finding status can be set on the Questions page or Analysis page.


Background Documents

Context documents that inform the AI:

  • Charge letters
  • Organizational charts
  • Scope memos
  • Relevant policies

Toggle "Include in Analysis" on each document to control which ones are sent to the AI.

Framework Documents

Evaluation criteria documents:

  • Policies and procedures
  • Regulations and standards
  • Industry frameworks
  • Audit criteria

The AI uses these as the basis for evaluating compliance.


Usage and Quotas

AI Generation Limits

Plan | Monthly Limit
Trial | 15 total
Core | 25/month
Pro | 50/month

AI generations include:

  • Question/topic/gap/summary analysis
  • Report section generation
  • AI Agent chat messages
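
To make the plan limits concrete, the remaining allowance can be computed as below. This is a sketch under assumptions: the names are hypothetical, and reading the Trial plan's "15 total" as a lifetime cap that does not reset each month is an interpretation of the table, not a confirmed billing rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanLimit:
    limit: int
    resets_monthly: bool


PLAN_LIMITS = {
    "trial": PlanLimit(15, resets_monthly=False),  # "15 total" (assumed lifetime cap)
    "core": PlanLimit(25, resets_monthly=True),
    "pro": PlanLimit(50, resets_monthly=True),
}


def remaining_generations(plan: str, used_total: int, used_this_month: int) -> int:
    """Generations left under the plan, never less than zero."""
    plan_limit = PLAN_LIMITS[plan]
    used = used_this_month if plan_limit.resets_monthly else used_total
    return max(plan_limit.limit - used, 0)
```

Because chat messages and report sections count alongside analyses, a Core account that has used 20 generations this month has 5 left regardless of which feature consumed them.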

Monitoring Usage

View usage in Settings → Billing:

  • Current period usage
  • Limit and remaining
  • Usage history

Best Practices

Before Generating

  • Ensure evidence is uploaded and linked
  • Check that embeddings are processed
  • Configure background/framework documents
  • Write a clear focus statement

Providing Direction

  • Be specific about what to focus on
  • Reference specific documents or policies
  • Ask for particular aspects to be addressed
  • Note any concerns to investigate

Reviewing Output

  • Check quality metrics first
  • Verify key claims against evidence
  • Note any unsupported claims
  • Review coverage gaps
  • Assess if regeneration is needed

Iterating

  • Add missing evidence before regenerating
  • Provide more specific direction
  • Split complex questions into parts
  • Try different analysis types

Troubleshooting

Low Faithfulness Score

  • Evidence may not directly support the AI's phrasing
  • Add more explicit evidence
  • Check if relevant evidence was retrieved

Low Coverage Score

  • Question may be too complex
  • Evidence may not address all aspects
  • Provide direction to focus on gaps

Weak Retrieval

  • Add more evidence related to the topic
  • Ensure evidence has been processed for embeddings
  • Rephrase question to match evidence terminology

Validation Failed

  • AI produced malformed output
  • Try regenerating
  • Contact support if persistent