Testing Guide
For Claude Code & Human Developers

This document is the single source of truth for testing practices in Nquiry.
Quick Reference
npm run test # Run all tests
npm run test:unit # Unit tests only
npm run test:integration # Integration tests only
npm run test:e2e # End-to-end tests (Playwright)
npm run test:e2e:ui # E2E with interactive UI
npm run test:coverage # Generate coverage report
npm run test:watch # Watch mode for development
The Golden Rule
Every PR that adds or modifies functionality MUST include corresponding tests.
No exceptions. If you're not sure what to test, see the decision tree below.
New Feature Checklist
1. Before Writing Code
- Feature has clear acceptance criteria
- Identify which entities/tables are affected
- Identify which API routes are needed
- Note any security/auth requirements
2. While Building
- Write unit tests for new utility functions
- Write unit tests for new React hooks
- Write component tests for new UI components
- Write integration tests for new API routes
- Write integration tests for new database operations
3. Before Merging
- All tests pass locally
- New code has reasonable coverage (aim for 70%+)
- Security-sensitive code has explicit tests (auth, RLS, file access)
What to Test: Decision Tree
Is it a new feature?
├── YES → Does it touch the database?
│   ├── YES → Write integration tests for DB operations
│   │         Write unit tests for any business logic
│   │         If new entity: add to sample data generators
│   └── NO → Does it have business logic?
│       ├── YES → Write unit tests
│       └── NO → Does it affect user-facing behavior?
│           ├── YES → Write component tests
│           └── NO → Document why no tests needed
│
└── NO (it's a bug fix) →
    Write a test that reproduces the bug first,
    then fix until test passes
Test Types & When to Use
Unit Tests (__tests__/unit/)
Use for: Pure functions, utilities, hooks, business logic
Don't use for: Database operations, API calls, full page rendering
// Example: Testing a utility function
// __tests__/unit/lib/formatEvidence.test.ts
import { formatEvidenceTitle } from "@/lib/formatEvidence";

describe("formatEvidenceTitle", () => {
  it("truncates titles longer than 50 characters", () => {
    const longTitle = "A".repeat(60);
    expect(formatEvidenceTitle(longTitle)).toHaveLength(53); // 50 + '...'
  });
});
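For context, here is a minimal sketch of a function that would satisfy the test above. It is an assumption for illustration, not the actual contents of `@/lib/formatEvidence`; the constant name and the exact truncation rule are hypothetical.

```typescript
// Hypothetical implementation; the real "@/lib/formatEvidence" may differ.
const MAX_TITLE_LENGTH = 50;

function formatEvidenceTitle(title: string): string {
  // Titles at or under the limit pass through unchanged.
  if (title.length <= MAX_TITLE_LENGTH) return title;
  // Longer titles are cut to the limit and marked with an ellipsis.
  return title.slice(0, MAX_TITLE_LENGTH) + "...";
}
```

A pure function like this needs no mocks or setup, which is what makes it a good fit for a unit test.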
Integration Tests (__tests__/integration/)
Use for: API routes, database operations, auth flows, S3 operations
Don't use for: UI rendering, user interactions
// Example: Testing an API route
// __tests__/integration/api/investigations.test.ts
// seedUser, seedInvestigation, and authenticatedRequest are project test helpers (imports omitted)
describe("GET /api/investigations", () => {
  it("returns only investigations owned by the authenticated user", async () => {
    const userA = await seedUser();
    const userB = await seedUser();
    const invA = await seedInvestigation({ user_id: userA.id });
    // Owned by userB; must NOT appear in userA's results
    await seedInvestigation({ user_id: userB.id });

    const response = await authenticatedRequest(
      userA,
      "GET",
      "/api/investigations",
    );

    expect(response.status).toBe(200);
    expect(response.body).toHaveLength(1);
    expect(response.body[0].id).toBe(invA.id);
  });
});
Component Tests (__tests__/unit/components/)
Use for: React components in isolation, user interactions
Don't use for: Full page flows, actual API calls
// Example: Testing a component
// __tests__/unit/components/EvidenceCard.test.tsx
import { render, screen } from '@testing-library/react';
import { EvidenceCard } from '@/components/EvidenceCard';
import { createEvidence } from '@/__fixtures__/generators';

describe('EvidenceCard', () => {
  it('displays evidence type icon and title', () => {
    const evidence = createEvidence({
      evidence_type: 'interview',
      title: 'Interview - Dr. Smith',
    });
    render(<EvidenceCard evidence={evidence} />);
    expect(screen.getByText('Interview - Dr. Smith')).toBeInTheDocument();
    expect(screen.getByTestId('icon-interview')).toBeInTheDocument();
  });
});
E2E Tests (__tests__/e2e/)
Use for: Critical user journeys, smoke tests, cross-page flows
Don't use for: Edge cases, error states (too slow)
// Example: E2E test for critical path
// __tests__/e2e/create-investigation.spec.ts
import { test, expect } from "@playwright/test";

test("user can create investigation and add first question", async ({
  page,
}) => {
  await page.goto("/dashboard");
  await page.click('[data-testid="create-investigation"]');
  await page.fill('[name="title"]', "Test Investigation");
  await page.click('[data-testid="submit"]');
  await expect(page).toHaveURL(/\/investigations\/[\w-]+/);
  await expect(page.getByText("Test Investigation")).toBeVisible();
});
E2E Test Setup
Prerequisites
- Install Playwright browsers:
  npx playwright install chromium
- Environment variables:
  # Add to .env.local
  E2E_TEST_EMAIL=test@nquiry.ai
  E2E_TEST_PASSWORD=TestPassword123!
- Test account requirements:
  - At least one personal organization
  - At least one investigation created
  - Valid subscription (for billing tests)
Running E2E Tests
# Start dev server (in separate terminal)
npm run dev
# Run tests
npm run test:e2e
# Run with UI (interactive mode)
npm run test:e2e:ui
# Run specific test file
npx playwright test auth.spec.ts
# Run in headed mode (see browser)
npx playwright test --headed
CI Configuration
E2E tests run automatically on main branch merges. Required GitHub Secrets:
| Secret | Description |
|---|---|
| E2E_TEST_EMAIL | Test user email |
| E2E_TEST_PASSWORD | Test user password |
Critical Path Tests
Located in __tests__/integration/critical-paths/:
| Path | File |
|---|---|
| User signup → org creation → first investigation | signup-to-investigation.test.ts |
| File upload → download → delete (with auth checks) | file-lifecycle.test.ts |
| Trial signup → checkout → subscription active | trial-to-subscription.test.ts |
| MFA setup → login with MFA challenge | mfa-flow.test.ts |
| AI analysis generation → quota tracking | ai-analysis-quota.test.ts |
| Account deletion (GDPR) with cascade verification | gdpr-deletion.test.ts |
| Data export (GDPR) completeness | gdpr-export.test.ts |
| Organization invitation → accept → access granted | org-invitation.test.ts |
Security-Critical Tests (Required)
These areas MUST have explicit tests. No exceptions.
Authentication
- Unauthenticated requests return 401
- Expired tokens are rejected
- Token refresh works correctly
- Logout invalidates session
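The first two checks can be reasoned about with a small pure function. The sketch below is illustrative only: the `SessionToken` shape, the `expiresAt` field (assumed to be a Unix timestamp in seconds), and both function names are hypothetical, not Nquiry's actual auth code.

```typescript
// Hypothetical token shape; `expiresAt` is assumed to be Unix time in seconds.
interface SessionToken {
  userId: string;
  expiresAt: number;
}

function isTokenExpired(token: SessionToken, nowMs: number = Date.now()): boolean {
  // A token is treated as expired at the exact expiry instant.
  return token.expiresAt * 1000 <= nowMs;
}

// Returns the HTTP status an auth guard would respond with.
function authStatus(token: SessionToken | null, nowMs: number = Date.now()): 200 | 401 {
  if (token === null || isTokenExpired(token, nowMs)) return 401;
  return 200;
}
```

Passing `nowMs` explicitly keeps the expiry tests deterministic, with no need to mock the system clock.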
Authorization (RLS)
- User cannot access other users' investigations
- User cannot access other organizations' data
- Readonly users cannot modify data
- Deleted data is inaccessible
File Access
- Signed URLs expire correctly
- Users cannot access other users' files
- File type validation prevents malicious uploads
- File size limits are enforced
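As a rough sketch of what the last two checks exercise, upload validation can be written as a pure function and tested directly. The size limit, the allowed MIME types, and the `validateUpload` name below are assumptions for illustration; use the limits the application actually enforces.

```typescript
// Assumed limits and allow-list, for illustration only.
const MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024;
const ALLOWED_MIME_TYPES = new Set(["application/pdf", "image/png", "image/jpeg"]);

type UploadCheck =
  | { ok: true }
  | { ok: false; reason: "too_large" | "type_not_allowed" };

function validateUpload(file: { mimeType: string; sizeBytes: number }): UploadCheck {
  // Reject anything outside the allow-list before looking at size.
  if (!ALLOWED_MIME_TYPES.has(file.mimeType)) {
    return { ok: false, reason: "type_not_allowed" };
  }
  if (file.sizeBytes > MAX_FILE_SIZE_BYTES) {
    return { ok: false, reason: "too_large" };
  }
  return { ok: true };
}
```

Tests should cover one accepted file, one rejected type, and a file just over the size limit. A declared MIME type can be spoofed, so server-side tests should also cover content inspection if the application performs it.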
Sample Data Generators
Location: __fixtures__/generators/
Available Generators
| Generator | Entity | Notes |
|---|---|---|
| createUser() | app_user | All roles and tiers |
| createOrganization() | organization | Team/enterprise |
| createInvestigation() | investigation | All statuses |
| createQuestion() | question | Realistic question text |
| createEvidence() | evidence | All 6 types |
| createEvidenceAttachment() | evidence_attachment | Various file types |
| createEvidenceNote() | evidence_note | All note types |
| createBackgroundDocument() | background_document | All doc types |
| createEvaluationGuide() | evaluation_guide | System/org/user |
| createInvestigatorAssessment() | investigator_assessment | With rationale |
| createAnalysis() | analysis | All analysis types |
| createReport() | report | Draft/review/final |
Scenario Builders
| Builder | Description |
|---|---|
| createFullInvestigationScenario() | Complete investigation with configurable counts |
| createPresidioScenario() | Demo scenario matching UI prototype |
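Generators like these typically follow a "defaults plus overrides" pattern: each call returns a valid entity, and the test overrides only the fields it cares about. The sketch below shows that pattern under assumed names; the `Evidence` fields and default values are hypothetical and do not reflect the real schema.

```typescript
// Hypothetical shape; the real evidence entity likely has more fields.
interface Evidence {
  id: string;
  investigation_id: string;
  evidence_type: string;
  title: string;
}

// Simple counter so every generated entity gets a distinct id.
let nextEvidenceId = 1;

function createEvidence(overrides: Partial<Evidence> = {}): Evidence {
  const id = `evidence-${nextEvidenceId++}`;
  return {
    id,
    investigation_id: "investigation-1",
    evidence_type: "interview",
    title: `Sample evidence ${id}`,
    // Caller-supplied fields win over the defaults above.
    ...overrides,
  };
}
```

Usage in a test then stays short, for example `createEvidence({ title: 'Interview - Dr. Smith' })`, and the test remains readable because only the relevant field is spelled out.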
CI/CD Integration
Tests run automatically at each stage of the pipeline:
| Stage | Tests Run | Blocking? |
|---|---|---|
| Push to any branch | Lint, Type Check, Unit | Yes |
| PR to main | All above + Integration | Yes |
| Merge to main | All above + E2E | Yes |
| Nightly | Security scan, Full E2E | Alert only |
Local Pre-commit Check
npm run lint && npm run type-check && npm run test:unit
Manual Testing on Staging
Prerequisites
- GitHub CLI installed: gh auth login
- tmux installed: brew install tmux
- Screen recording: macOS Cmd+Shift+5
Quick Start
Add to ~/.zshrc:
source ~/investigation-app/scripts/testing-aliases.sh
Then: source ~/.zshrc
Start Testing Session
ttest # Creates/attaches to tmux testing layout
When You Find a Bug
- Capture: Screen record (Cmd+Shift+5), copy console errors
- Report: bug "Bug title" "Console output or description"
- Update checklist: Mark item with <!-- FAIL: #127 -->
- Commit: tcommit
- Add evidence: Upload recording to GitHub issue
Shell Aliases
| Alias | Description |
|---|---|
| ttest | Start/attach tmux testing session |
| bug "title" "desc" | Create GitHub bug issue |
| tcoverage | Check issue coverage status |
| tcommit | Commit and push checklist |
| tstaging | Open staging URL in browser |
Manual Testing: Analysis System (2026-02-06)
Target: Staging (main branch, commit 5373415)
Test Investigation: Dr. Marcus Chen - Professional Conduct Review
Prerequisites: DB migrations applied (20260206112013, 20260206120000)
Test 1: Evidence Readiness Assessment (WS1)
The "Generate Analysis" dialog now shows an evidence readiness assessment before running the AI.
- Open Dr. Chen investigation → Analysis page
- Click "Generate Analysis"
- Select "Question Analysis" → pick Q1 (Documentation Fraud)
- Click "Generate Analysis" button
- Verify assessment panel appears (NOT the AI generation spinner):
- Readiness score (0-100) with colored bar (green/yellow/red)
- Evidence counts: Total, Relevant, High Relevance
- Source breakdown: Content / Attachments / Background
- Coverage gaps with severity badges (if any)
- Recommendations (if any)
- Assessment timing (should be < 3 seconds)
- Click "Cancel" — verify returns to dialog config, no AI call made
- Click "Generate Analysis" again → let assessment load → click "Proceed to Analysis"
- Verify AI generation starts (spinner, polling)
- Repeat with "Overall Summary" type (no question selection)
- Repeat with "Topic Analysis" → select the topic
Known edge cases:
- If no embeddings exist for the evidence, readiness score will be low
- Assessment should still work even with 0 relevant chunks (shows "insufficient")
Test 2: Async Quality Checks (WS2)
Quality checks now fire automatically after analysis generation completes.
- Generate an analysis (any type, use the one from Test 1)
- After generation completes, wait 30-60 seconds
- Refresh the page
- Look for quality confidence badge next to the analysis in the list (e.g., "verified", "established", "probable", "possible")
- Click the analysis → look for Quality Metrics Panel (expandable):
- Faithfulness score (percentage)
- Coverage score (percentage)
- Retrieval stats
- Validation status
- If no quality badge appears after 2 minutes, check browser console for errors
Test 3: Evidence Retrieval Transparency (WS3)
Shows which evidence chunks the AI considered when generating the analysis.
- Click on any completed analysis to open the detail view
- Scroll down past "Evidence Cited" section
- Look for "Evidence Considered" expandable panel
- Click to expand:
- Chunk list with ranked entries
- Source titles for each chunk
- Type badges (content / attachment / background_doc)
- Included/Excluded badges
- Similarity score bars (green ≥85%, yellow ≥70%, orange ≥60%, gray <60%)
- Summary: total retrieved, included, excluded
- If "Evidence Considered" doesn't appear, the analysis may predate the retrieval logging
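When checking the similarity score bars, it helps to have the thresholds written out as code. This is a sketch of the mapping described above, assuming scores are percentages on a 0-100 scale; the function and type names are hypothetical and the real component may be structured differently.

```typescript
type BarColor = "green" | "yellow" | "orange" | "gray";

// `score` is assumed to be a similarity percentage in the range 0-100.
function similarityBarColor(score: number): BarColor {
  if (score >= 85) return "green";
  if (score >= 70) return "yellow";
  if (score >= 60) return "orange";
  return "gray";
}
```

The boundary values (85, 70, 60) are the ones worth spot-checking manually, since off-by-one comparisons are the most likely defect.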
Test 4: User Feedback (WS5)
Accept/reject buttons and regeneration reason tracking.
- Click on a completed analysis
- Scroll to "Analysis Feedback" section
- Click "Accept":
- Button changes to green "Accepted" state
- Green "Accepted" badge appears in the analysis list
- Open a different analysis
- Click "Reject":
- Prompt appears for optional rejection reason
- Button changes to red "Rejected" state
- Red "Rejected" badge appears in the list
- Open another analysis → scroll to "Request Revision":
- Select a reason from the dropdown (e.g., "Missing evidence")
- Type revision direction
- Click "Regenerate"
- Blue "Regenerated" badge should appear after regeneration
Test 5: End-to-End Flow
Full workflow combining all features.
- Generate a new Question Analysis for Q2 (Opioid Prescribing)
- Assessment appears → note the readiness score → click "Proceed"
- AI generates the analysis → wait for completion
- Click the new analysis in the list
- Check: Quality Metrics Panel appears (may take 30-60s to populate)
- Check: "Evidence Considered" panel loads and shows chunks
- Click "Accept" to mark as trustworthy
- Verify "Accepted" badge in list
- Generate a Gap Analysis → assessment → proceed → verify it completes
Troubleshooting
Tests failing in CI but passing locally
- Check environment variables in CI
- Verify test database is seeded
- Check for timezone-dependent assertions
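A common cause of the timezone issue is formatting dates with local-time methods, which give different results on a developer laptop and on a CI runner set to UTC. One way to avoid it, sketched here with a hypothetical helper name, is to format from the UTC representation so the output does not depend on the machine's time zone.

```typescript
// Formats a date as YYYY-MM-DD using UTC, independent of the machine's time zone.
function formatDateUtc(date: Date): string {
  // toISOString() always returns UTC, e.g. "2026-02-06T23:30:00.000Z".
  return date.toISOString().slice(0, 10);
}
```

With a local-time formatter, an input such as `2026-02-06T23:30:00Z` would render as the 7th on machines east of UTC, which is exactly the kind of assertion that passes locally and fails in CI (or the reverse).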
Flaky E2E tests
- Add explicit waits: await page.waitForSelector()
- Use data-testid attributes instead of brittle class-based or structural CSS selectors
- Check for race conditions in async operations
Tests failing with auth errors
- Check that mocks are set up correctly
- Verify test user credentials are set in environment
E2E tests timing out
- Increase timeout in playwright.config.ts
- Check that dev server is running
- Verify test user can log in manually