Security Remediation Plan
Created: 2026-01-31
Status: In Progress
Target Completion: Before Production Launch
This document tracks security and reliability issues identified in the 2026-01-31 senior engineer audit. Claude Code must read this file at the start of every session and continue work on incomplete items.
Session Continuity Protocol
At the start of each session:
- Read this file
- Check the "Current Sprint" section for in-progress work
- Continue where the previous session left off
- Update checkboxes and dates as work completes
Phase 1: Blocking Issues (Week 1)
These issues must be fixed before any other feature work.
1.1 Storage Route Authorization
Risk: HIGH - Authenticated users can access files from other organizations
Files:
- app/api/storage/download/route.ts
- app/api/storage/download-url/route.ts
- app/api/storage/delete/route.ts
- app/api/storage/upload/route.ts
Required Changes:
- Extract organization_id from file path
- Verify user has org membership via hasOrgRole()
- Return 403 if not authorized
- Add audit logging for access attempts
- Add tests for authorization checks
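The checks above could be sketched as follows. This is a minimal illustration only: it assumes storage keys follow a hypothetical orgs/<organization_id>/... layout and that hasOrgRole takes a user ID and an organization ID. Neither the key layout nor the signature is confirmed by this plan, and the AccessLogger callback is a stand-in for the real audit call.

```typescript
// Hypothetical sketch: the key layout, the membership-check signature,
// and the logger callback are assumptions, not the project's real code.
type MembershipCheck = (userId: string, orgId: string) => boolean;
type AccessLogger = (event: { userId: string; filePath: string; allowed: boolean }) => void;

interface AuthzResult {
  status: 200 | 403;
  orgId: string | null;
}

// Assumed key layout: "orgs/<organization_id>/<rest of path>".
function extractOrgId(filePath: string): string | null {
  const parts = filePath.split("/");
  const [prefix, orgId] = parts;
  if (parts.length < 3 || prefix !== "orgs" || !orgId) return null;
  // Reject traversal segments so a crafted path cannot escape the org prefix.
  if (parts.some((part) => part === "..")) return null;
  return orgId;
}

// Deny by default: anything that cannot be tied to an org the user belongs to gets 403.
function authorizeStorageAccess(
  userId: string,
  filePath: string,
  hasOrgRole: MembershipCheck,
  logAccess: AccessLogger = () => {},
): AuthzResult {
  const orgId = extractOrgId(filePath);
  const allowed = orgId !== null && hasOrgRole(userId, orgId);
  // Log every attempt, allowed or not, so denied access is visible in the audit trail.
  logAccess({ userId, filePath, allowed });
  return { status: allowed ? 200 : 403, orgId };
}
```

Returning 403 for malformed paths as well as for missing membership keeps the route from revealing which organization prefixes exist.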
Verification:
# Run storage authorization tests
npm test -- storage-authorization.test
# Expected: 22 passed, 9 skipped (FormData tests skipped due to test env)
Completed: [x] Date: 2026-01-31
1.2 Audit Logging Resilience
Risk: HIGH - FedRAMP non-compliance if audit writes fail silently
File: lib/shared/audit.ts
Current Behavior: Returns false on failure, callers ignore return value
Required Behavior: Failures must be visible and recoverable
Required Changes:
- Option A: Throw on audit failure (breaks user operation)
- Option B: Queue failed audits to dead letter table for retry
- Option C: Alert on failure but don't block operation
- Decision: Option C with enhancements (tracking, metrics, recovery logging)
- Implement chosen approach
- Add in-memory failure tracking with getAuditHealth() function
- Add CloudWatch alarm for audit failures (post-launch)
- Add strict mode option for critical operations
- Add admin endpoint /api/admin/audit-health for monitoring
Implementation Details:
- Failures tracked in memory with count and recent failure history
- Console logs prefixed with [AUDIT FAILURE] include full entry JSON for recovery
- getAuditHealth() returns failure count and recent failures
- Admin endpoint at /api/admin/audit-health for monitoring
- Optional strict mode: logAudit(entry, { strict: true }) throws on failure
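The behavior listed above could be sketched roughly as follows. This is not the code in lib/shared/audit.ts: the AuditEntry shape, the history cap, and the write option (a seam for injecting the persistence call) are assumptions made for illustration, and the writer is synchronous here only to keep the sketch short.

```typescript
// Hypothetical sketch: AuditEntry, AuditWriter, the `write` option,
// and MAX_RECENT_FAILURES are assumptions, not the project's real definitions.
interface AuditEntry {
  action: string;
  userId: string;
}

interface AuditFailure {
  entry: AuditEntry;
  error: string;
  at: string;
}

// Stand-in for the real persistence call; it throws when the write fails.
type AuditWriter = (entry: AuditEntry) => void;

const MAX_RECENT_FAILURES = 50;
let failureCount = 0;
const recentFailures: AuditFailure[] = [];

function logAudit(
  entry: AuditEntry,
  options: { strict?: boolean; write?: AuditWriter } = {},
): boolean {
  const write = options.write ?? (() => {});
  try {
    write(entry);
    return true;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    failureCount += 1;
    recentFailures.push({ entry, error: message, at: new Date().toISOString() });
    // Keep only a bounded history so memory use stays flat.
    if (recentFailures.length > MAX_RECENT_FAILURES) recentFailures.shift();
    // Log the full entry so it can be recovered from logs later.
    console.error("[AUDIT FAILURE]", JSON.stringify(entry), message);
    // Strict mode surfaces the failure to the caller instead of continuing.
    if (options.strict) throw new Error(`Audit write failed: ${message}`);
    return false;
  }
}

function getAuditHealth(): { failureCount: number; recentFailures: AuditFailure[] } {
  return { failureCount, recentFailures: [...recentFailures] };
}
```

Because the counters live in process memory, they reset on restart; the log lines are what make recovery possible in this sketch.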
Verification:
- Audit failures are tracked and logged with full details
- Check CloudWatch alarm fires (requires CloudWatch setup)
Completed: [x] Date: 2026-01-31
1.3 Redis Rate Limiting
Risk: HIGH - Current in-memory rate limiting resets on cold start
File: lib/shared/rate-limit.ts
Status: CODE COMPLETE - AWAITING INFRASTRUCTURE
Infrastructure Required:
- AWS ElastiCache Redis cluster in same VPC as Amplify
- Security group allowing Amplify → ElastiCache access
- VPC configuration for Amplify to access private subnets
- Terraform updates for ElastiCache provisioning
Code Changes:
- Install ioredis package
- Create lib/shared/redis.ts client factory
- Update checkRateLimit() to use Redis INCR + EXPIRE
- Add checkRateLimitAsync() for Redis support
- Add graceful fallback to in-memory if Redis unavailable
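The INCR + EXPIRE pattern with an in-memory fallback could be sketched as below. This is an illustration, not the contents of lib/shared/rate-limit.ts: CounterStore is a hypothetical, synchronous stand-in for a Redis client (a real ioredis client is asynchronous), and checkFixedWindow is a name chosen for this sketch.

```typescript
// Hypothetical sketch: CounterStore is a synchronous stand-in for a Redis client,
// and all names here are assumptions made for illustration.
interface CounterStore {
  incr(key: string): number; // like Redis INCR: returns the new count
  expire(key: string, seconds: number): void; // like Redis EXPIRE
}

interface RateLimitResult {
  allowed: boolean;
  count: number;
  source: "store" | "memory";
}

const memoryCounts = new Map<string, { count: number; resetAt: number }>();

function checkInMemory(key: string, limit: number, windowSec: number, now: number): RateLimitResult {
  const current = memoryCounts.get(key);
  if (!current || current.resetAt <= now) {
    // First hit in a new window: start counting and record when the window ends.
    memoryCounts.set(key, { count: 1, resetAt: now + windowSec * 1000 });
    return { allowed: limit >= 1, count: 1, source: "memory" };
  }
  current.count += 1;
  return { allowed: current.count <= limit, count: current.count, source: "memory" };
}

function checkFixedWindow(
  store: CounterStore | null,
  key: string,
  limit: number,
  windowSec: number,
  now: number = Date.now(),
): RateLimitResult {
  if (store) {
    try {
      const count = store.incr(key);
      // Only the first hit in a window sets the TTL, so the window length is fixed.
      if (count === 1) store.expire(key, windowSec);
      return { allowed: count <= limit, count, source: "store" };
    } catch {
      // Store unreachable: fall through to the in-memory limiter.
    }
  }
  return checkInMemory(key, limit, windowSec, now);
}
```

Because INCR and EXPIRE are separate commands, a crash between them can leave a key without a TTL; running both in a Lua script or a MULTI transaction is a common way to close that gap.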
Terraform Tasks:
- Add ElastiCache module (infrastructure/terraform/modules/elasticache/)
- Add security group for ElastiCache
- Add ElastiCache to dev environment
To Deploy:
cd infrastructure/terraform/environments/dev
terraform init
terraform plan -var="redis_auth_token=YOUR_SECURE_TOKEN"
terraform apply -var="redis_auth_token=YOUR_SECURE_TOKEN"
Environment Variables (after Terraform apply):
REDIS_URL=rediss://<endpoint>:6379
REDIS_AUTH_TOKEN=<your_auth_token>
Verification:
# Hit rate limit, restart server, verify limit still enforced
Completed: [x] Date: 2026-01-31 (code + Terraform, pending infrastructure deploy)
1.4 Fix Failing Tests
Risk: MEDIUM - Failing tests indicate test maintenance is not keeping pace with the implementation
File: __tests__/integration/api/billing.test.ts
Issues:
- createCheckoutSession mock expects 4 params, implementation passes 5
- teamMembers undefined access in free plan test
Required Changes:
- Update mock to include plan parameter
- Fix undefined access in free plan test
- Run full test suite, verify all passing
- Mock server-only package in vitest.setup.ts
Verification:
npm test
# Result: 217 passed, 9 skipped, 166 todo
Completed: [x] Date: 2026-01-31
Phase 2: Technical Enforcement (Week 2)
Automated safeguards to prevent future issues.
2.1 API Route Wrapper
Purpose: Make secure patterns the default
File to create: lib/api/route-wrapper.ts
Features:
- Automatic authentication check
- Organization context extraction
- Consistent error handling
- Automatic audit logging for failures
- Request timing/logging
Implementation:
// Pattern for all routes to use
export function createProtectedRoute(
  handler: ProtectedRouteHandler,
  options?: { requiredRole?: OrgRole; rateLimit?: RateLimitKey },
): RouteHandler;
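One way the wrapper's body could work is sketched below. Every type here is an assumption standing in for the project's real definitions, the deps parameter is added only so the sketch is self-contained, handlers are synchronous to keep it short, and the rateLimit option is omitted. It is not the code in lib/api/route-wrapper.ts.

```typescript
// Hypothetical sketch: all types, error codes, and the `deps` parameter are assumptions.
type OrgRole = "member" | "admin";

interface RouteRequest {
  userId: string | null; // null when the caller is not authenticated
  orgId: string | null; // null when no organization context was supplied
}

interface RouteResponse {
  status: number;
  body: unknown;
}

interface RouteContext {
  userId: string;
  orgId: string;
}

type ProtectedRouteHandler = (req: RouteRequest, ctx: RouteContext) => RouteResponse;
type RouteHandler = (req: RouteRequest) => RouteResponse;

interface RouteDeps {
  hasOrgRole: (userId: string, orgId: string, role: OrgRole) => boolean;
}

// Uses the plan's standard error shape: { error: { code, message } }.
function errorResponse(status: number, code: string, message: string): RouteResponse {
  return { status, body: { error: { code, message } } };
}

function createProtectedRoute(
  handler: ProtectedRouteHandler,
  deps: RouteDeps,
  options: { requiredRole?: OrgRole } = {},
): RouteHandler {
  return (req) => {
    if (req.userId === null) return errorResponse(401, "UNAUTHORIZED", "Authentication required");
    if (req.orgId === null) return errorResponse(400, "MISSING_ORG", "Organization context required");
    const role = options.requiredRole ?? "member";
    if (!deps.hasOrgRole(req.userId, req.orgId, role)) {
      return errorResponse(403, "FORBIDDEN", "Not a member of this organization");
    }
    try {
      // The handler only runs once authentication and org membership are established.
      return handler(req, { userId: req.userId, orgId: req.orgId });
    } catch {
      // Never leak internal error details to the client.
      return errorResponse(500, "INTERNAL_ERROR", "Unexpected error");
    }
  };
}
```

Putting the checks in the wrapper means a route that forgets them cannot exist: the handler never sees a request without a verified user and organization.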
Migration:
- Create wrapper
- Migrate 5 highest-risk routes first (storage/*)
- Migrate remaining routes incrementally
- Add CI check that all routes use wrapper (Phase 2.3)
Completed: [x] Date: 2026-01-31 (wrapper + storage routes)
2.2 Semgrep Security Scanning
Purpose: Catch security issues automatically in CI
File: .github/workflows/ci.yml
Required Changes:
- Add Semgrep step to CI workflow
- Configure rules for:
  - Missing auth checks
  - Unvalidated input
  - SQL injection patterns
  - Hardcoded secrets
- Run initial scan, fix any findings (requires SEMGREP_APP_TOKEN)
- Block PRs on security findings (job fails on findings)
Configuration:
- name: Security scan
  uses: returntocorp/semgrep-action@v1
  with:
    config: >-
      p/typescript
      p/security-audit
      p/owasp-top-ten
      p/secrets
Completed: [x] Date: 2026-01-31 (CI config added, needs token for full scan)
2.3 CI Enforcement Gates
Purpose: Automated quality gates
File: .github/workflows/ci.yml
Required Changes:
- Test failure blocks merge (verified - test job fails CI)
- Add coverage threshold (70% minimum - warns, enforcement pending)
- Add route wrapper check (commented out, enable when more routes migrated)
- Configure Amplify to only deploy on CI success (infrastructure task)
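The route wrapper check in the list above could be as simple as the following sketch. It assumes route files match app/api/.../route.ts and that using the wrapper means the source contains a createProtectedRoute( call; both rules are assumptions, and the function takes file contents as input so it can be tested without touching the file system.

```typescript
// Hypothetical sketch: the path pattern and the text-match rule are assumptions.
// A real CI script would read the files from disk and exit non-zero on offenders.
function findUnwrappedRoutes(files: Record<string, string>): string[] {
  const routePattern = /^app\/api\/.+\/route\.ts$/;
  return Object.keys(files)
    .filter((path) => routePattern.test(path))
    .filter((path) => !(files[path] ?? "").includes("createProtectedRoute("))
    .sort();
}
```

A plain text match is coarse (a commented-out call would pass), so a Semgrep rule or an AST-based check may be a better fit once most routes are migrated.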
Completed: [x] Date: 2026-01-31 (coverage check added with warning)
2.4 Error Handling Standardization
Purpose: Consistent error responses across all routes
File: lib/api/errors.ts
Standard Format: { error: { code: "ERROR_CODE", message: "Human readable message" } }
Required Changes:
- Audit all routes for error handling patterns
- Create migration checklist of routes to update
- Update routes to use standardized errors
- Add error codes to all responses for debugging
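A helper that produces the standard format could look like the sketch below. The error codes, status mapping, and the apiError name are assumptions for illustration; only the { error: { code, message } } shape comes from this plan, and the real lib/api/errors.ts may differ.

```typescript
// Hypothetical sketch: the code list, status mapping, and helper name are assumptions.
type ErrorCode =
  | "UNAUTHORIZED"
  | "FORBIDDEN"
  | "NOT_FOUND"
  | "VALIDATION_ERROR"
  | "INTERNAL_ERROR";

interface ApiErrorBody {
  error: { code: ErrorCode; message: string };
}

// One place that decides which HTTP status goes with which error code.
const STATUS_BY_CODE: Record<ErrorCode, number> = {
  UNAUTHORIZED: 401,
  FORBIDDEN: 403,
  NOT_FOUND: 404,
  VALIDATION_ERROR: 400,
  INTERNAL_ERROR: 500,
};

function apiError(code: ErrorCode, message: string): { status: number; body: ApiErrorBody } {
  return { status: STATUS_BY_CODE[code], body: { error: { code, message } } };
}
```

Typing the codes as a union means a misspelled code fails type-check instead of reaching a client.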
Routes Migrated:
- Storage routes (using route wrapper)
- Billing routes (checkout, portal, usage)
- Auth routes (mfa/challenge, webauthn/credentials)
- Analysis/AI routes (analysis/generate, ai-agent/chat, reports/generate-section)
- Search route
All routes migrated: Error library now used across all API routes (storage, billing, admin, evidence, db/query, db/mutate, etc.)
Completed: [x] Date: 2026-01-31 (initial), 2026-02-04 (all routes)
Phase 3: Testing Debt (Weeks 3-4)
Critical path coverage.
3.1 Integration Test Infrastructure
Purpose: Test real service interactions
Requirement: Separate test database, test S3 bucket
Required Changes:
- Create test environment configuration
- Set up test database (can be same RDS, different schema)
- Set up test S3 bucket
- Create test user in Cognito (test@nquiry.ai exists)
- Add integration test npm script (npm run test:integration)
- Document test environment setup (docs/test-environment-setup.md)
Completed: [x] Date: 2026-01-31 (partial - docs and config complete, infra pending)
3.2 Critical Path Integration Tests
One test per week until launch:
| Week | Critical Path | Status | File |
|---|---|---|---|
| 1 | [x] User signup → org creation → first investigation | COMPLETE | signup-to-investigation.test.ts |
| 2 | [x] File upload → download → delete (with auth checks) | COMPLETE | file-lifecycle.test.ts |
| 3 | [x] Trial signup → checkout → subscription active | COMPLETE | trial-to-subscription.test.ts |
| 4 | [x] MFA setup → login with MFA challenge | COMPLETE | mfa-flow.test.ts |
| 5 | [x] AI analysis generation → quota tracking | COMPLETE | ai-analysis-quota.test.ts |
| 6 | [x] Account deletion (GDPR) with cascade verification | COMPLETE | gdpr-deletion.test.ts |
| 7 | [x] Data export (GDPR) completeness | COMPLETE | gdpr-export.test.ts |
| 8 | [x] Organization invitation → accept → access granted | COMPLETE | org-invitation.test.ts |
Test results: __tests__/integration/critical-paths/ (140 tests passing, 4 skipped)
Completed: [x] Date: 2026-01-31
3.3 E2E Tests in CI
Purpose: Run Playwright tests automatically
File: .github/workflows/ci.yml
Required Changes:
- Add E2E test credentials to GitHub Secrets (E2E_TEST_EMAIL, E2E_TEST_PASSWORD)
- Add E2E test stage to CI workflow
- Configure to run on main branch merges (not every PR - too slow)
CI Job Details:
- Runs after lint-and-build and test jobs pass
- Only triggers on push to main branch
- Installs Playwright browsers
- Uploads Playwright report as artifact on failure
Completed: [x] Date: 2026-01-31 (CI config added, secrets needed)
Phase 4: Process & Documentation (Ongoing)
4.1 CLAUDE.md Updates
Required Additions:
- Reference this remediation plan in session protocol
- Add API route requirements checklist
- Add non-negotiable security requirements
- Add "before marking complete" verification steps
Completed: [x] Date: 2026-01-31
4.2 Production Blockers Tracking
File: docs/production-blockers.md
Purpose: Track known technical debt that must be resolved before launch
Current Status:
- 0 open blockers
- 8 resolved blockers (PB-001 through PB-008)
Rule: Any TODO comment or known limitation gets added here immediately.
Completed: [x] Date: 2026-01-31
4.3 Monthly Security Review Schedule
File: docs/admin/security/review-schedule.md
| Week | Focus Area | Checklist |
|---|---|---|
| 1st of month | Auth & Authorization | All routes have auth, org checks enforced |
| 2nd of month | Error Handling | Failures logged, no silent swallowing |
| 3rd of month | Test Coverage | Coverage stable, critical paths tested |
| 4th of month | Compliance | Audit logs complete, GDPR flows working |
Also includes: Quarterly deep dive checklists, escalation procedures, review log
Completed: [x] Date: 2026-01-31
Phase 5: Pre-Launch Gate
5.1 Pre-Launch Checklist
File: docs/pre-launch-checklist.md
Comprehensive checklist covering:
Security (6 sections):
- Authentication & Authorization checklist
- Infrastructure Security checklist
- Code Security checklist
Testing (2 sections):
- Test Suite Health checklist
- Critical Path Coverage tracking (8 paths)
Compliance (3 sections):
- Audit Logging verification
- GDPR compliance checklist
- FedRAMP Readiness checklist
Operations (3 sections):
- Monitoring & Alerting checklist
- Backup & Recovery checklist
- Deployment checklist
Documentation tracking + Final Sign-Off section
Completed: [x] Date: 2026-01-31 (checklist created, items pending verification)
5.2 External Security Review
Tracked in: docs/pre-launch-checklist.md Section 6
Before production launch:
- Budget approved for penetration test
- Vendor selected
- Scope defined (Web app, API, AWS infrastructure)
- Test scheduled
- Test completed
- Critical/high findings remediated
- Re-test passed
Note: External security review is required before FedRAMP pursuit.
Completed: [ ] Date: ____ (requires vendor engagement)
Current Sprint
Active Work (update each session):
| Task | Started | Assignee | Status |
|---|---|---|---|
| 1.4 Fix failing tests | 2026-01-31 | Claude | COMPLETE (217 passing) |
| 1.1 Storage route auth | 2026-01-31 | Claude | COMPLETE (with tests) |
| 1.2 Audit resilience | 2026-01-31 | Claude | COMPLETE |
| 1.3 Redis rate limiting | 2026-01-31 | Claude | COMPLETE (needs deploy) |
| 2.1 API Route Wrapper | 2026-01-31 | Claude | COMPLETE |
| 2.2 Semgrep scanning | 2026-01-31 | Claude | COMPLETE (CI config) |
| 2.3 CI gates | 2026-01-31 | Claude | COMPLETE (coverage) |
| 3.1 Test infrastructure | 2026-01-31 | Claude | COMPLETE (docs/config) |
| 3.2 Critical path tests | 2026-01-31 | Claude | COMPLETE (140 tests) |
| 3.3 E2E tests in CI | 2026-01-31 | Claude | COMPLETE (needs secrets) |
| 4.1 CLAUDE.md updates | 2026-01-31 | Claude | COMPLETE |
| 4.2 Production blockers | 2026-01-31 | Claude | COMPLETE |
| 4.3 Security review sched | 2026-01-31 | Claude | COMPLETE |
| 5.1 Pre-launch checklist | 2026-01-31 | Claude | COMPLETE (doc created) |
| 5.2 External security | - | Joe | PENDING (vendor needed) |
| SEC-001 Encryption key | 2026-02-04 | Claude | COMPLETE (Secrets Mgr) |
| SEC-003 Audit alarms | 2026-02-04 | Claude | COMPLETE (CloudWatch) |
| SEC-004 CSRF protection | 2026-02-04 | Claude | COMPLETE (middleware) |
| SEC-005 Upload validation | 2026-02-04 | Claude | COMPLETE (MIME+magic) |
| SEC-006 Security headers | 2026-02-04 | Claude | COMPLETE (next.config.ts) |
| Security documentation | 2026-02-04 | Claude | COMPLETE (docs/security) |
| Bedrock Guardrails | 2026-02-04 | Claude | COMPLETE (lib/ai/guardrails.ts + terraform) |
| HIPAA risk assessment | 2026-02-04 | Claude | COMPLETE (docs/admin/security/) |
| Asset inventory | 2026-02-04 | Claude | COMPLETE (docs/admin/security/) |
| BAA template | 2026-02-04 | Claude | COMPLETE (docs/admin/legal/) |
| E2E happy path tests | 2026-02-04 | Claude | COMPLETE (tests/e2e/happy-path.spec.ts) |
| Load testing baseline | 2026-02-04 | Claude | COMPLETE (tests/load/) |
| Evidence route org auth | 2026-02-06 | Claude | COMPLETE (hasOrgRole + audit logging) |
| NQU-166 Tenant isolation | 2026-02-13 | Claude | COMPLETE (all 3 batches, 12 new tests) |
Last Updated: 2026-02-14
Last Session Summary: NQU-179 Part 1: Enabled ECS Exec on the dev ECS service (enable_execute_command + SSM Messages IAM policy), which unblocks running diagnostic scripts against RDS from inside the VPC. Also fixed tsconfig to exclude scripts/dump-prompts.ts (its missing supabase dependency was breaking type-check). 1993 tests passing.
Verification Log
Record of completed verifications:
| Date | Item | Verified By | Method | Result |
|---|---|---|---|---|
Notes
- This plan was generated from a comprehensive security audit on 2026-01-31
- All blocking issues (Phase 1) must be complete before feature work resumes
- Phases 2-4 can proceed in parallel with careful prioritization
- External security review is non-negotiable before FedRAMP pursuit