Confidential · Internal Strategy Document

GovEase DCW
Project Master Plan

Digital Case Worker — AI-Powered Government Form Assistance Platform
Built on Google Cloud · Vertex AI · Security-First Architecture

Version: v1.0 (Phase 1 Scope)
Motto: "The expertise of a case worker, the speed of AI, the heart of a navigator."
Infrastructure: Google Cloud / Vertex AI
MVP Focus: SSA / VA · 1–3 Form Types
⚠ Confirmed Out of Scope: The "Pulse" portal login / automated government status scraping feature (originally Suggestion #5) has been removed from all build phases following client discussion. No portal credentials will be stored, and no RPA/bot check-ins will occur. Status notifications are limited to internal case stage updates.

Four Conversations We Must Have First

Yes — we absolutely need a design and strategy discussion before anything is built. Here are the four mandatory pre-build conversations, what needs to be resolved in each, and why skipping any of them creates expensive rework downstream.

🎨

Design System Discussion

We need a dedicated design session before the first pixel is placed. This platform handles sensitive personal data for vulnerable users — every design decision carries trust implications.

  • Define brand voice and visual language (not just a logo)
  • Establish accessibility baseline: WCAG 2.1 AA minimum
  • Agree on mobile-first vs. responsive priorities
  • Bilingual UI behavior: language switcher placement, RTL fallback plans
  • Decide on component library (recommendation: Material Design + custom tokens)
  • Define "anxiety reduction" principles — this UI serves people under stress
  • Prototype intake tone: formal government-adjacent vs. warm conversational
  • Define what "trust" looks like visually — disclaimers, consent screens, progress indicators
Milestone 0 prerequisite
🗺

User Journey Mapping

The full journey from first visit to delivered form must be agreed on paper before architecture begins. It determines auth flows, data persistence points, notification triggers, and hand-off protocols.

  • Map every touchpoint: onboarding, intake, document upload, review, delivery
  • Define "save and resume" behavior and session timeout policies
  • Clarify how a reviewer requests more info from the client
  • Define what "form delivered" means: PDF download, secure link, email, all three?
  • Map edge cases: incomplete documents, failed validation, reviewer escalation
  • Define the language selection UX — when and how does the user choose?
  • Clarify multi-case scenarios: can one user have multiple active applications?
Informs data model design
💰

Cost Analysis with Client

The client needs a realistic cost conversation now — not after build. GCP infrastructure, Vertex AI token consumption, document storage, OCR processing, and per-case economics all need to be modeled before commit.

  • GCP project setup + VPC / security services baseline (~$800–1,200/mo MVP)
  • Vertex AI token cost: ~$0.0005–0.002 per 1K tokens depending on model
  • Cloud Storage + KMS encryption for documents
  • Cloud Vision API for OCR: ~$1.50 per 1,000 pages
  • Cloud Run or GKE for backend services
  • Estimate per-case cost (target: under $0.50/case at scale)
  • Development budget vs. infrastructure budget split
  • Staging vs. production environment cost separation
Client approval needed
🤖

Why Vertex AI, Not OpenAI/Anthropic API

The client must understand this decision — it's not a preference, it's a security and compliance imperative for a platform handling PII, disability records, and immigration data.

  • Data never leaves Google's infrastructure boundary
  • Customer Data Processing terms prevent training on PII inputs
  • VPC Service Controls: AI calls stay within the private network perimeter
  • SOC2/HIPAA-eligible infrastructure out of the box
  • Gemini fine-tuning capability for SSA/VA-specific language without data leaving GCP
  • Explainability logging: every AI decision is auditable
  • Cost at scale: committed use discounts, no OpenAI premium
  • Single vendor accountability: GCP, storage, AI, IAM all in one audit scope
Architecture decision locked

From First Login to Delivered Form

This is the canonical user journey the entire platform must be designed around. Every step is a distinct security checkpoint, AI interaction, or human handoff point.

GovEase DCW — End-to-End Case Flow

Applicant (end user) perspective · SSA/VA Initial Claim · English or Spanish intake

1

Landing & Language Selection

User arrives at GovEase. Platform immediately presents language choice (English / Español) before any account creation. This selection is stored in session and user profile.

No PII collected yet | Language token stored
2

Account Creation & Identity Verification

User creates account with email + password (bcrypt hashed, salted). MFA enforced via TOTP or SMS. Google Identity Platform handles auth tokens. SSO option for returning users.

MFA enforced | AES-256 at rest from this point | JWT session tokens
All data encrypted in transit via TLS 1.3 minimum. Cloud Armor DDoS protection active.
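To make the TOTP factor in this step concrete, here is a minimal sketch of RFC 6238 code generation and verification using only the Python standard library. In the platform itself Google Identity Platform is expected to handle MFA; the function names `totp` and `verify_totp`, the 30-second step, and the one-step drift window are illustrative assumptions, not decisions recorded in this plan.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 time-based one-time password (HMAC-SHA1)."""
    counter = unix_time // step                  # number of elapsed time steps
    msg = struct.pack(">Q", counter)             # 8-byte big-endian counter
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                   # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_totp(secret: bytes, submitted: str, unix_time: int,
                step: int = 30, window: int = 1) -> bool:
    """Accept a code from the current step or `window` steps either side."""
    return any(
        hmac.compare_digest(totp(secret, unix_time + drift * step, step), submitted)
        for drift in range(-window, window + 1)
    )
```

The drift window tolerates small clock differences between the user's authenticator app and the server; a wider window is more forgiving but weakens the factor.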
3

DCW Disclaimer & Consent

Full-screen "Digital Case Worker" disclaimer presented in selected language. User must actively check boxes (no pre-checked). Consent event logged with timestamp, IP, user ID, and version of consent text. Cannot proceed without completion.

Consent logged immutably | Version-controlled disclaimer text
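The consent event described in this step could be captured roughly as follows. This is a sketch under stated assumptions: the `ConsentRecord` fields mirror the items listed above (timestamp, IP, user ID, consent text version), while the checkbox names and the SHA-256 fingerprint of the consent text are hypothetical additions for illustration.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class ConsentRecord:
    user_id: str
    ip_address: str
    consent_version: str
    consent_text_sha256: str  # fingerprint of the exact text the user saw
    accepted_at: str          # ISO 8601 timestamp in UTC


def record_consent(user_id, ip_address, consent_version, consent_text,
                   checked_boxes, required_boxes) -> ConsentRecord:
    """Build a consent record only if every required box was actively checked."""
    missing = set(required_boxes) - set(checked_boxes)
    if missing:
        raise ValueError(f"consent incomplete, unchecked: {sorted(missing)}")
    return ConsentRecord(
        user_id=user_id,
        ip_address=ip_address,
        consent_version=consent_version,
        consent_text_sha256=hashlib.sha256(consent_text.encode("utf-8")).hexdigest(),
        accepted_at=datetime.now(timezone.utc).isoformat(),
    )
```

A frozen dataclass only prevents accidental mutation in application code; true immutability still depends on append-only storage for the record.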
4

Case Type Selection & Pre-Flight Eligibility Check

User selects benefit type (SSDI / SSI / VA Disability / VA Healthcare). DCW runs a 5-question "Pre-Flight" eligibility screen via Vertex AI. If an obvious disqualifier is detected (e.g., active work income above the SGA limit for SSDI), the user is advised before investing time in a full application.

Vertex AI · Gemini | Eligibility logic branching
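The SGA example above is a deterministic rule, so it can be sketched without any AI call. The threshold below is a labeled placeholder, not a figure from this plan: SSA revises the SGA amount every year and the real value should come from configuration. The names `ssdi_preflight` and `PreFlightResult` are hypothetical.

```python
from dataclasses import dataclass

# ASSUMPTION: placeholder monthly SGA limit in USD. SSA revises the real
# figure every year, so production code should read it from configuration.
ASSUMED_SGA_MONTHLY_USD = 1620.0


@dataclass
class PreFlightResult:
    likely_disqualified: bool
    advisory: str


def ssdi_preflight(monthly_work_income: float,
                   sga_limit: float = ASSUMED_SGA_MONTHLY_USD) -> PreFlightResult:
    """Flag the one disqualifier named in the plan: work income above SGA."""
    if monthly_work_income > sga_limit:
        return PreFlightResult(
            likely_disqualified=True,
            advisory="Reported work income is above the SGA limit; "
                     "review this before starting a full SSDI application.",
        )
    return PreFlightResult(False, "No obvious disqualifier found.")
```

The result advises rather than blocks, matching the step's intent that the user is warned, not turned away.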
5

Smart Intake — Conversational Q&A

DCW conducts a branching, conversational intake in the user's chosen language. Questions adapt based on prior answers (the "Smart Questions" engine). For SSA, this maps to the Sequential Evaluation Process (SGA → Duration → Blue Book → Vocational). For VA, this runs the Presumptive/PACT Act logic. All answers stored encrypted in Firestore.

Vertex AI · Branching logic | Encrypted session storage | Save & resume supported
No raw PII is sent to the AI: a tokenization layer strips identifiers before every call.
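A minimal sketch of the tokenization idea, assuming SSNs and dates appear in the common `123-45-6789` and `MM/DD/YYYY` formats. The regular expressions, token format, and function names are illustrative assumptions; names and free-text identifiers would need structured fields or entity recognition, which this sketch does not attempt.

```python
import re

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
DOB_PATTERN = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")


def tokenize_pii(text: str) -> tuple[str, dict[str, str]]:
    """Replace SSNs and dates with opaque tokens; return text and token vault."""
    vault: dict[str, str] = {}

    def swap(kind: str):
        def repl(match: re.Match) -> str:
            token = f"[{kind}_{len(vault) + 1}]"
            vault[token] = match.group(0)   # original value never leaves the vault
            return token
        return repl

    text = SSN_PATTERN.sub(swap("SSN"), text)
    text = DOB_PATTERN.sub(swap("DOB"), text)
    return text, vault


def detokenize(text: str, vault: dict[str, str]) -> str:
    """Restore original values after the AI response comes back."""
    for token, original in vault.items():
        text = text.replace(token, original)
    return text
```

Only the tokenized text would be sent to Vertex AI; the vault stays inside the application and is used to restore values in the response.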
6

Document Upload & Validation

System presents a dynamic document checklist based on case type and intake answers. User uploads documents (PDF, JPG, PNG — max 10MB each). Cloud Vision API performs OCR. Automated validation checks: file type, image quality/blur score, expiry dates, document type recognition. Blurry or incomplete documents trigger conversational re-upload guidance ("The registrar's seal appears cut off — please retake with the full page in view").

Cloud Vision OCR | Cloud Storage · Encrypted | Virus/malware scan on upload
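All uploaded files are malware-scanned before storage, with findings surfaced in Cloud Security Command Center (which reports findings but does not itself scan uploaded objects). Original filenames are sanitized.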
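The file-type, size, and filename checks in this step can be sketched as below. The 10MB limit and the three accepted formats come from the step above; the helper names and the replacement character set are assumptions. Blur scoring, expiry extraction, and malware scanning are separate pipeline stages and are not shown.

```python
import re

MAX_BYTES = 10 * 1024 * 1024  # 10MB per-file limit from the intake spec
MAGIC_NUMBERS = {
    "pdf": b"%PDF-",
    "jpg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
}


def sniff_file_type(data: bytes):
    """Identify the file type from its leading bytes, ignoring the extension."""
    for kind, magic in MAGIC_NUMBERS.items():
        if data.startswith(magic):
            return kind
    return None


def sanitize_filename(name: str) -> str:
    """Drop any directory part and keep only a conservative character set."""
    base = name.replace("\\", "/").rsplit("/", 1)[-1]
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", base).lstrip(".")
    return cleaned or "upload"


def validate_upload(name: str, data: bytes) -> tuple[str, str]:
    """Return (safe filename, detected type) or raise ValueError."""
    if len(data) > MAX_BYTES:
        raise ValueError("file exceeds 10MB limit")
    kind = sniff_file_type(data)
    if kind is None:
        raise ValueError("unsupported file type")
    return sanitize_filename(name), kind
```

For example, `validate_upload("../../etc/passwd.pdf", pdf_bytes)` keeps only `passwd.pdf` as the stored name, which is the path traversal case the sanitization is meant to cover.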
7

AI Consistency Check & Data Validation

Before generating the form, Vertex AI runs a consistency scan across all intake answers and OCR-extracted document data. Contradictions are flagged (e.g., stated lifting limit vs. described daily activities). User is presented with specific clarification prompts — not generic error messages. Internal Adjudicator logic applied for SSA Blue Book matching.

Vertex AI · Consistency engine | Contradiction detection | Pre-fill validation
8

Draft Form Generation

Vertex AI maps validated intake data to government form fields. Dual-language handling: intake in Spanish, form output in English (or as configured). System generates a draft PDF/structured data package. Translation of user-provided narrative is handled by Cloud Translation API with human review flag for all medical/legal statements.

Vertex AI · Form mapping | Cloud Translation API | Draft PDF generated
9

Mandatory Human Review

All cases enter human review queue — no auto-submission is possible. Reviewer sees the "Diff Dashboard" (discrepancies between OCR data and user input highlighted). Reviewer can approve, request clarification from client (in-platform messaging), or escalate to senior reviewer. Full audit trail of all reviewer actions logged.

Human in the loop (mandatory) | Reviewer access logged | Diff view
RBAC enforced — reviewers only see cases assigned to them. No bulk access to all records.
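The comparison behind the Diff Dashboard can be sketched as a field-by-field check between what the applicant stated and what OCR extracted. The field names and the whitespace/case normalization are assumptions for illustration; the real comparison rules would be agreed during design.

```python
def diff_fields(stated: dict, extracted: dict) -> list[dict]:
    """Return one entry per shared field where OCR disagrees with the applicant."""
    def norm(value) -> str:
        # Ignore case and repeated whitespace so trivial differences are not flagged.
        return " ".join(str(value).split()).casefold()

    discrepancies = []
    for field in sorted(stated.keys() & extracted.keys()):
        if norm(stated[field]) != norm(extracted[field]):
            discrepancies.append(
                {"field": field, "stated": stated[field], "ocr": extracted[field]}
            )
    return discrepancies
```

Fields present on only one side are skipped here; whether a missing field should itself be flagged for the reviewer is a design decision this sketch leaves open.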
10

Client Review & Acknowledgment

Approved draft is returned to the applicant for final review. Bilingual summary shown. User must actively acknowledge accuracy of all information before finalization. E-signature or digital acknowledgment captured (via DocuSign API or Google Workspace integration). The timestamp and the version of the acknowledged form are recorded.

Digital acknowledgment captured | Bilingual summary | Acknowledgment versioned
11

Finalized Form Delivery

Finalized form package delivered via: (a) secure download link (time-limited, signed URL from Cloud Storage), (b) email notification with instructions. Form package includes the completed government form, a document checklist summary, and filing instructions. The platform does NOT submit to government agencies — user/caseworker submits manually.

Signed URL delivery | Filing instructions included | Link expires in 72h
12

Data Retention & Case Archival

Case enters retention policy: active cases retained for agreed period (TBD with client), then moved to encrypted cold archive. Data deletion workflow triggered on user request (GDPR/CCPA compliance). Audit logs retained separately per compliance requirements. Deletion confirmed via email to user.

Data retention policy enforced | Right to deletion workflow | Audit log preserved

Why Vertex AI, Not OpenAI or Anthropic

This platform handles Social Security Numbers, disability records, military service histories, and immigration status. The AI vendor choice is not a preference — it is a security and compliance decision that must be explained clearly to the client.

🚨
Critical Issue with Commercial LLM APIs: Sending user PII, disability narratives, or immigration history to OpenAI or Anthropic via their standard commercial APIs means that data leaves your infrastructure boundary and is processed on their servers. This creates HIPAA-equivalent exposure, breaks most government data handling policies, and introduces a third-party data processing relationship that requires explicit contractual coverage — which commercial APIs do not provide at startup pricing tiers.
Comparison by consideration: Vertex AI (Gemini) on GCP vs. OpenAI / Anthropic API

Data sovereignty
  • Vertex AI (Gemini) on GCP: Data stays inside your GCP project boundary. VPC Service Controls prevent any exfiltration.
  • OpenAI / Anthropic API: Data is sent to and processed on third-party servers. Standard API terms do not guarantee zero retention.

PII handling guarantee
  • Vertex AI (Gemini) on GCP: Google's Customer Data Processing Addendum explicitly prohibits using customer data to train models.
  • OpenAI / Anthropic API: ~ Enterprise agreements exist but are expensive. Not available on standard API tiers used by startups.

Compliance posture
  • Vertex AI (Gemini) on GCP: GCP is SOC 2 Type II and HIPAA BAA-eligible at infrastructure level. Single audit scope for the infrastructure layer.
  • OpenAI / Anthropic API: ~ OpenAI / Anthropic offer enterprise agreements but these are separate vendor relationships requiring their own DPA and security review every year.

Network-level isolation
  • Vertex AI (Gemini) on GCP: AI API calls can be routed entirely through private VPC with no public internet egress. Cloud Armor + IAP enforced.
  • OpenAI / Anthropic API: Calls must traverse the public internet to reach their API endpoints. No private peering available.

Fine-tuning on your data
  • Vertex AI (Gemini) on GCP: Vertex AI supports supervised fine-tuning with data that never leaves your GCP project. Ideal for SSA/VA-specific terminology and form logic.
  • OpenAI / Anthropic API: ~ Fine-tuning available but sends your training data to their platform. Inappropriate for sensitive government domain data.

Audit & explainability
  • Vertex AI (Gemini) on GCP: Full Cloud Audit Logs for every AI invocation. Vertex Explainable AI provides decision attribution. Required for cybersecurity review.
  • OpenAI / Anthropic API: Black box API calls. No per-call audit log accessible to you. Impossible to trace AI decisions during a security audit.

Cost at scale
  • Vertex AI (Gemini) on GCP: Committed Use Discounts, custom model hosting, per-token pricing. Single billing account with all other GCP services.
  • OpenAI / Anthropic API: ~ Competitive at small scale but no discount mechanism ties to your existing infrastructure spend. Separate billing.

Single vendor accountability
  • Vertex AI (Gemini) on GCP: GCP, Cloud Storage, Cloud Vision, Cloud Translation, Vertex AI: one vendor, one BAA, one audit scope, one support contract.
  • OpenAI / Anthropic API: Introduces a separate vendor relationship requiring its own security questionnaire, legal review, and DPA every year.
Bottom line for the client: Vertex AI is chosen because it keeps all data inside a single, auditable GCP boundary — which simplifies the security posture significantly for a platform handling sensitive personal data. It does not make us HIPAA-certified or SOC2-certified. It means our infrastructure layer has fewer moving parts, fewer vendor relationships, and a cleaner story to tell during any security review.

Cost Analysis Framework

These are the cost dimensions that must be discussed with the client before commitment. Numbers are estimates based on GCP pricing as of 2025/2026 and should be modeled against the client's projected case volume.

Monthly Infrastructure (MVP Stage — ~500 cases/month)

Estimated: $1,200 – $2,800 / month
  • Cloud Run (Backend API): Auto-scaling, 2 vCPU / 4GB RAM baseline (~$120–300 / mo)
  • Cloud SQL (PostgreSQL): Primary + replica, 2 vCPU, 7.5GB RAM, encrypted (~$200–350 / mo)
  • Vertex AI (Gemini Pro): ~150K tokens per case × 500 cases = 75M tokens (~$150–300 / mo)
  • Cloud Vision API (OCR): ~8 document pages per case × 500 = 4,000 pages (~$6 / mo)
  • Cloud Storage (Documents): ~10MB per case × 500 cases + archive. Encrypted with CMEK. (~$15–40 / mo)
  • Cloud Translation API: Spanish ↔ English, ~5K characters per case (~$12 / mo)
  • Cloud KMS (Key Management): CMEK for document encryption, audit key access (~$6 / mo)
  • Cloud Armor + Load Balancer: DDoS protection, WAF rules, managed SSL (~$200 / mo)
  • Cloud Logging / Monitoring / SIEM: Security audit logs, Chronicle integration (~$100–200 / mo)
  • Cloud Identity Platform (Auth): MFA, session management, up to 50K MAU free tier (~$0–50 / mo)
  • Email / SMS Notifications: SendGrid or Google Workspace + Twilio (~$50–100 / mo)
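As a quick arithmetic check, the line items above can be totaled directly. The sketch below only adds up the figures already listed; the dictionary keys are shortened labels, and the 500 cases/month volume is the MVP assumption stated in this section. The itemized figures sum to about $859 to $1,564 per month, which sits below the upper part of the headline estimate; the difference is not itemized here and should be confirmed during the cost conversation.

```python
# (low, high) monthly estimates in USD, copied from the line items above.
LINE_ITEMS = {
    "Cloud Run": (120, 300),
    "Cloud SQL": (200, 350),
    "Vertex AI": (150, 300),
    "Cloud Vision OCR": (6, 6),
    "Cloud Storage": (15, 40),
    "Cloud Translation": (12, 12),
    "Cloud KMS": (6, 6),
    "Cloud Armor + Load Balancer": (200, 200),
    "Logging / Monitoring / SIEM": (100, 200),
    "Identity Platform": (0, 50),
    "Email / SMS": (50, 100),
}


def monthly_range(items: dict) -> tuple:
    """Sum the low and high ends of every line item."""
    low = sum(lo for lo, _ in items.values())
    high = sum(hi for _, hi in items.values())
    return low, high


low, high = monthly_range(LINE_ITEMS)
cases = 500  # MVP volume assumed in this section
print(f"Itemized total: ${low:,} to ${high:,} per month")
print(f"Infrastructure per case at {cases} cases: ${low / cases:.2f} to ${high / cases:.2f}")
```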

Per-Case Unit Economics

Target: under $0.50 / case at 1,000+ cases/month
  • AI processing (intake + validation + mapping): Vertex AI token cost, amortized (~$0.15 – $0.35)
  • OCR document processing: Cloud Vision, ~8 pages (~$0.012)
  • Storage (per case, per year): Documents + audit logs in Cloud Storage (~$0.024)
  • Translation (if bilingual): Cloud Translation API (~$0.01)
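The four per-case items can be summed the same way. This is only the variable cost listed above (roughly $0.20 to $0.40 per case, inside the $0.50 target); fixed monthly infrastructure and human review labor are not part of this figure. The labels are shortened for the sketch.

```python
# (low, high) per-case estimates in USD, copied from the items above.
PER_CASE_USD = {
    "AI processing": (0.15, 0.35),
    "OCR, ~8 pages": (0.012, 0.012),
    "Storage, per year": (0.024, 0.024),
    "Translation": (0.01, 0.01),
}

TARGET_USD = 0.50

per_case_low = round(sum(lo for lo, _ in PER_CASE_USD.values()), 3)
per_case_high = round(sum(hi for _, hi in PER_CASE_USD.values()), 3)

print(f"Variable cost per case: ${per_case_low} to ${per_case_high}")
print("Within target" if per_case_high < TARGET_USD else "Over target")
```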
💡
Client pricing model to discuss: Consider a per-case fee (e.g., $15–50/case depending on form complexity), a monthly subscription per reviewer seat, or a hybrid model. At 500 cases/month with $25/case, gross revenue covers infrastructure + human review labor with margin. The unit economics improve significantly above 1,000 cases/month due to GCP committed use discounts and fixed infrastructure baseline costs.

Project Milestones

Ten milestones from discovery lock to controlled launch. Milestone 1 is UI first — no backend work begins until the design system, user journey, and core screens are approved. Security requirements are embedded into every milestone, not bolted on at the end.

00
Pre-Build
Discovery Lock & Client Alignment
2 weeks
Deliverables
  • Confirm 1–3 MVP form types (SSA-16, SSA-3368, VA 21-526EZ recommended)
  • Lock supported languages: English + Spanish for MVP
  • Define user roles: Applicant, Reviewer, Admin, Super Admin
  • Client sign-off on out-of-scope items (portal login confirmed removed)
  • User journey map approved by client and tech lead
  • Cost model presented and agreed
  • GCP project provisioned, IAM baseline configured
  • Signed Data Processing Agreement in place
Security Setup
  • GCP organization policy constraints defined
  • VPC and private subnet topology designed
  • Security controls baseline documented (OWASP-aligned, not a certification claim)
  • Threat model first draft — STRIDE analysis
  • Pen test scope agreed with cyber team
01
UI First — Priority
Design System & Core UI
3–4 weeks
Deliverables
  • Design system: color tokens, typography scale, component library
  • Language selection + onboarding screen (English/Spanish)
  • Account creation + MFA setup flow
  • DCW Disclaimer / Consent screen (bilingual)
  • Pre-flight eligibility questionnaire UI
  • Smart intake conversational interface (chat-like, progress-tracked)
  • Document upload UI with drag-and-drop, quality preview, error states
  • Case dashboard for applicants (status, progress, messages)
  • Reviewer dashboard wireframe (Diff view, queue management)
  • Form preview / final review screen
  • Figma prototype linked for all critical flows — signed off by client before dev starts
Security in UI
  • Session timeout UI behavior designed (15 min inactivity warning)
  • Sensitive field masking patterns defined (SSN, DOB display rules)
  • Consent UX reviewed for GDPR/CCPA compliance
  • Accessibility audit: WCAG 2.1 AA minimum across all screens
  • No sensitive data in browser local storage — design constraint documented
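One possible shape for the sensitive field masking rules is sketched below. The actual display rules are still to be defined in this milestone, so "last four digits of the SSN" and "birth year only" are assumptions for illustration, as are the function names.

```python
def mask_ssn(ssn: str) -> str:
    """Show only the last four digits of a 9-digit SSN."""
    digits = [c for c in ssn if c.isdigit()]
    if len(digits) != 9:
        raise ValueError("expected a 9-digit SSN")
    return "***-**-" + "".join(digits[-4:])


def mask_dob(dob_iso: str) -> str:
    """Show only the birth year of an ISO date (YYYY-MM-DD)."""
    year = dob_iso.split("-")[0]
    return f"{year}-**-**"
```

Masking should happen on the server so the full value never reaches the browser, which also supports the constraint that no sensitive data is kept in local storage.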
02
Infrastructure
Secure GCP Architecture & Data Model
2–3 weeks
Deliverables
  • VPC with private subnets, no public IPs on application servers
  • Cloud SQL (PostgreSQL) with Private Service Connect
  • Cloud Storage buckets with CMEK, versioning, lifecycle policies
  • Cloud Run services with VPC connector
  • Terraform IaC for all infrastructure (reproducible, auditable)
  • Staging and production environments fully separated
  • CI/CD pipeline with Cloud Build, artifact registry, secret scanning
  • Data model: Cases, Documents, Users, AuditEvents, ConsentRecords
Security Controls
  • VPC Service Controls perimeter around all sensitive services
  • Cloud Armor WAF rules deployed (OWASP Top 10 rulesets)
  • Secret Manager for all credentials — zero hardcoded secrets in code
  • Binary Authorization for container image signing
  • Cloud KMS CMEK for all data at rest
  • Cloud Audit Logs: Data Access logs enabled on all services
03
Backend Core
Authentication, Authorization & Case Management API
3 weeks
Deliverables
  • Google Identity Platform integration (MFA, session management)
  • RBAC: Applicant / Reviewer / Admin / Super Admin roles enforced at API layer
  • Case CRUD API: create, read (own), update (draft only), status transitions
  • Consent record creation and retrieval API
  • Audit event logging service (every state change recorded)
  • Notification service (email via SendGrid, in-app notifications)
  • API Gateway with rate limiting and authentication enforcement
Security Controls
  • JWT validation on every API endpoint
  • Row-level security: users cannot query other users' cases at DB level
  • Input validation and sanitization on all endpoints (prevent SQLi/XSS)
  • Rate limiting: 100 req/min per authenticated user
  • Brute-force protection on auth endpoints (lockout after 5 failures)
  • OWASP API Security Top 10 checklist reviewed against every endpoint
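The rate limit and lockout controls above can be sketched with in-memory state. The limits (100 requests per minute, lockout after 5 failures) come from the list; the class names and the in-memory storage are assumptions, and a real deployment would more likely keep this state in a shared store such as Redis so that all instances agree.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per user within the last `window` seconds."""

    def __init__(self, limit: int = 100, window: float = 60.0):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)

    def allow(self, user_id: str, now: float) -> bool:
        hits = self._hits[user_id]
        while hits and now - hits[0] >= self.window:
            hits.popleft()            # forget requests older than the window
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


class LoginLockout:
    """Lock an account after `max_failures` consecutive failed logins."""

    def __init__(self, max_failures: int = 5):
        self.max_failures = max_failures
        self._failures = defaultdict(int)

    def record(self, user_id: str, success: bool) -> None:
        # Callers should check is_locked() before attempting authentication.
        self._failures[user_id] = 0 if success else self._failures[user_id] + 1

    def is_locked(self, user_id: str) -> bool:
        return self._failures[user_id] >= self.max_failures
```

How a locked account is released (timed expiry, admin reset, or email verification) is not specified in this plan and is left out of the sketch.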
04
AI Engine
Vertex AI Smart Question & Eligibility Engine
4 weeks
Deliverables
  • Pre-flight eligibility engine (5-question screen, per agency)
  • SSA Sequential Evaluation Process logic (Steps 1–5, SGA/Blue Book/Vocational)
  • VA Presumptive logic (PACT Act cross-reference, priority group calculator)
  • Conversational intake engine: branching Q&A with context memory per session
  • Consistency checker: cross-validates answers for contradictions before form generation
  • Prompt engineering for each form type — tested and version-controlled
  • Vertex AI integration layer with PII tokenization before every API call
Security Controls
  • PII tokenization: SSN, DOB, name stripped and replaced with tokens before Vertex AI calls
  • AI response validation: output sanitized before storing or displaying
  • Prompt injection detection layer on all user-facing inputs to AI
  • Every Vertex AI call logged with case ID, token count, prompt hash
  • AI outputs never auto-applied to form fields without validation gate
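The per-call logging control (case ID, token count, prompt hash) might look like the sketch below. Hashing the prompt lets an auditor confirm which prompt was sent without the log ever holding its text. The field names, the `event` label, and the `model` parameter are assumptions for illustration.

```python
import hashlib
import json


def ai_call_log_entry(case_id: str, prompt: str, token_count: int, model: str) -> str:
    """Build a JSON log line that identifies the prompt without storing its text."""
    entry = {
        "event": "vertex_ai_call",
        "case_id": case_id,
        "model": model,
        "token_count": token_count,
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
    }
    return json.dumps(entry, sort_keys=True)
```

Because the prompt is already tokenized before the call, the hash covers the tokenized text, not the raw PII.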
05
Document Engine
Secure Document Upload, OCR & Validation
3 weeks
Deliverables
  • Secure signed URL upload flow (files never pass through application server)
  • Cloud Vision OCR pipeline for all accepted document types
  • Document type recognition (ID, birth certificate, DD-214, medical record, etc.)
  • Quality scoring: blur detection, completeness check, expiry date extraction
  • Conversational re-upload guidance ("seal appears cut off" not "error 422")
  • Dynamic checklist: required documents per case type + intake answers
  • Document-to-intake data cross-reference (OCR vs. stated info)
Security Controls
  • Malware scanning on every upload, with findings surfaced in Cloud Security Command Center
  • File type validation at byte-level (not just extension)
  • Original filenames sanitized before storage (prevent path traversal)
  • Documents stored in isolated bucket — no public access, signed URLs only
  • Max file size enforced at CDN layer (not just application)
  • OCR data encrypted in transit and at rest, purged after form generation
06
Form Generation
Form Mapping, Translation & PDF Generation
3 weeks
Deliverables
  • Field mapping engine: validated intake answers → government form fields
  • Cloud Translation API integration for Spanish narrative → English form output
  • Human-review flag on all translated medical/legal narrative statements
  • PDF generation from field-mapped data (fillable PDF or server-side render)
  • Form versioning: system tracks which version of a government form was used
  • Jurisdiction-specific logic: agency-by-agency field requirement enforcement
Security Controls
  • Generated PDFs watermarked as "DRAFT — Not for Submission" until approved
  • Form data encrypted in transit; draft PDF stored encrypted
  • Field mapping logic is deterministic and auditable — no AI in final mapping step
  • Translation confidence scores logged; low-confidence outputs flagged for review
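A deterministic mapping step, as required above, can be as simple as a versioned lookup table. The intake keys and form field names below are hypothetical placeholders; real field names would come from the published PDF of each government form version.

```python
# Hypothetical mapping for illustration only; real field names come from the
# published PDF of each government form version.
FIELD_MAP_V1 = {
    "applicant_full_name": "Field_1_Name",
    "date_of_birth": "Field_2_DOB",
    "condition_summary": "Field_3_Conditions",
}


def map_to_form(intake: dict, field_map: dict, form_version: str) -> dict:
    """Map validated intake answers to form fields with no AI involvement."""
    missing = sorted(key for key in field_map if key not in intake)
    if missing:
        raise KeyError(f"missing intake answers: {missing}")
    fields = {form_field: str(intake[key]) for key, form_field in field_map.items()}
    # Every generated package starts as a draft until a reviewer approves it.
    return {"form_version": form_version, "status": "DRAFT", "fields": fields}
```

Because the same intake answers and the same table always yield the same output, a reviewer or auditor can reproduce any generated form from the stored inputs and the recorded form version.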
07
Review Dashboard
Human Reviewer Interface & Case Management
2 weeks
Deliverables
  • Reviewer queue: assigned cases, priority sorting, SLA indicators
  • Diff Dashboard: OCR vs. stated data discrepancies highlighted in yellow
  • In-platform messaging: reviewer can request clarification from applicant
  • Approve / Request Changes / Escalate actions with mandatory comment
  • Full case history timeline: every action, AI decision, and edit visible
  • Admin dashboard: case volume, average review time, backlog metrics
Security Controls
  • Reviewers see only assigned cases — enforced at API and DB layer
  • Every reviewer action logged with timestamp and user ID
  • Escalation workflow requires Super Admin acknowledgment
  • Session-level access logging: when reviewer views a case file, it is recorded
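The "assigned cases only" rule and the access logging control can be sketched together. The dictionary shapes, the role strings, and the in-memory `ACCESS_LOG` are assumptions standing in for the real data model and audit service. Admin roles are denied by default here, on the assumption that elevated access would go through a separate just-in-time workflow.

```python
from datetime import datetime, timezone

ACCESS_LOG = []  # stand-in for the audit event service


def can_view_case(user: dict, case: dict) -> bool:
    """Applicants see their own cases; reviewers see only assigned cases."""
    if user["role"] == "applicant":
        return case["applicant_id"] == user["id"]
    if user["role"] == "reviewer":
        return user["id"] in case["assigned_reviewer_ids"]
    # No standing access for other roles in this sketch.
    return False


def view_case(user: dict, case: dict) -> dict:
    """Return the case if allowed, logging every attempt either way."""
    allowed = can_view_case(user, case)
    ACCESS_LOG.append({
        "user_id": user["id"],
        "case_id": case["id"],
        "allowed": allowed,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    if not allowed:
        raise PermissionError("case not assigned to this user")
    return case
```

This check belongs at the API layer; the plan also calls for the same rule at the database layer, which this sketch does not cover.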
08
Security Testing
Penetration Testing, Hardening & Compliance Review
3 weeks
Deliverables
  • External penetration test on all public-facing endpoints
  • Internal network penetration test on private services
  • OWASP Top 10 and OWASP API Security Top 10 formal checklist
  • SAST (static analysis) and DAST (dynamic analysis) run on full codebase
  • Dependency vulnerability scan (all npm/pip packages)
  • Cloud Security Command Center findings reviewed and remediated
  • IAM permissions audit — principle of least privilege enforced
  • Compliance gap analysis against SOC2 Type II trust service criteria
Key Test Areas
  • Auth bypass and privilege escalation attempts
  • Prompt injection attacks on all AI input surfaces
  • Document upload abuse (malware, SSRF via uploaded files)
  • Data exfiltration via API (BOLA/IDOR testing)
  • PII exposure in logs, error messages, and API responses
  • Session hijacking and CSRF scenarios
  • Encryption-at-rest verification on all storage resources
09
Launch
UAT, Controlled Launch & Monitoring
2 weeks
Deliverables
  • User acceptance testing with 5–10 real users (non-sensitive test cases)
  • End-to-end smoke test: all 12 journey steps validated
  • Load testing: 10× expected peak case volume sustained for 30 minutes
  • Disaster recovery test: backup restoration verified
  • Runbook for incident response and data breach protocol
  • Controlled launch: invite-only, max 50 cases for first 2 weeks
  • Real-time monitoring dashboard (Uptime, error rate, AI cost per case)
Go-Live Security Gates
  • All critical and high severity pen test findings remediated before go-live (medium/low tracked in backlog)
  • SAST/DAST high-severity issues resolved — findings documented honestly, not hidden
  • Data retention policy active and tested
  • Incident response plan reviewed by client
  • Cloud Security Command Center set to alert on critical findings in real time
📅
Total estimated timeline: 25–35 weeks from Discovery Lock to controlled launch, depending on client feedback cycle times and pen test remediation complexity. Phase 4 (expansion to more forms and jurisdictions) begins after 60 days of stable controlled launch with no critical security findings.
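For reference, adding the milestone durations back to back gives 27 to 29 weeks, which falls inside the 25–35 week range quoted above. The sketch below only totals the durations listed under each milestone; the shortened labels are for readability, and it assumes no overlap between milestones and no slack for feedback cycles or remediation.

```python
# (label, min weeks, max weeks) taken from the milestone headers above.
MILESTONE_WEEKS = [
    ("00 Discovery Lock", 2, 2),
    ("01 Design System & Core UI", 3, 4),
    ("02 Secure GCP Architecture", 2, 3),
    ("03 Auth & Case Management API", 3, 3),
    ("04 AI Engine", 4, 4),
    ("05 Document Engine", 3, 3),
    ("06 Form Generation", 3, 3),
    ("07 Review Dashboard", 2, 2),
    ("08 Security Testing", 3, 3),
    ("09 Launch", 2, 2),
]

min_weeks = sum(lo for _, lo, _ in MILESTONE_WEEKS)
max_weeks = sum(hi for _, _, hi in MILESTONE_WEEKS)
print(f"Sequential total: {min_weeks} to {max_weeks} weeks")
```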

Security Framework

These are the technical controls we are building into the platform. They are designed to pass OWASP Top 10 and API Security Top 10 pen testing. We are not claiming certifications — we are implementing verifiable, testable controls. Where a control is a target rather than confirmed, it is marked as such.

🔐

Identity & Access

  • MFA enforced for all users
  • RBAC at API + database layer
  • Zero standing admin access (JIT)
  • Service account least privilege
  • Session timeout: 15 min idle
  • Concurrent session limits
🛡

Network Security

  • VPC Service Controls perimeter
  • Cloud Armor WAF (OWASP rules)
  • No public IPs on app servers
  • Private Service Connect for DB
  • TLS 1.3 minimum in transit
  • DDoS protection active
🔒

Data Encryption

  • AES-256 at rest (CMEK/Cloud KMS)
  • TLS 1.3 in transit
  • PII tokenized before AI calls
  • Encrypted backups with separate keys
  • Key rotation policy: 90 days
  • No sensitive data in logs
📋

Audit & Logging

  • Immutable audit log (every state change)
  • Data Access logs on all GCP services
  • Every AI invocation logged
  • Reviewer document access logged
  • Log retention: 7 years minimum
  • SIEM integration (Chronicle)
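One common way to make an audit log tamper-evident is to chain entries by hash, so that changing any past event breaks every later hash. The sketch below illustrates that idea only; it is an assumption about how "immutable" could be approached in application code and does not replace write-once storage or Cloud Audit Logs.

```python
import hashlib
import json

GENESIS = "0" * 64  # previous-hash value used for the first entry


def _entry_hash(prev_hash: str, event: dict) -> str:
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_event(chain: list, event: dict) -> dict:
    """Append an event whose hash covers the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {"prev": prev_hash, "event": event, "hash": _entry_hash(prev_hash, event)}
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every hash; any edited or reordered entry fails the check."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash or entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Periodically anchoring the latest hash somewhere outside the application (for example in a separate log sink) would make truncation of the chain detectable as well.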
🧪

Application Security

  • SAST on every pull request
  • Dependency scanning (Snyk/Dependabot)
  • Prompt injection detection layer
  • Input validation at every layer
  • Content Security Policy enforced
  • CORS policy locked to known origins
🚨

Incident Response

  • Runbook for data breach scenarios
  • 72-hour breach notification plan
  • Automated alerting on anomalous access
  • User data deletion workflow tested
  • DR/backup restoration tested monthly
  • Annual pen test + quarterly scans

Google Cloud Infrastructure Stack

Every component runs within Google's infrastructure. No third-party AI vendors. No data leaving the GCP project boundary. Single audit scope, single compliance framework, single vendor accountability.

Service → Purpose Mapping

All services within single GCP project / VPC
Vertex AI (Gemini) · AI / ML Layer
Powers all conversational intake, eligibility logic, consistency checking, and form field mapping. Called exclusively via private VPC endpoint. PII tokenized before every call.
Cloud Vision API · Document OCR
Extracts text from uploaded documents (IDs, medical records, DD-214). Identifies document type, quality issues, and expiry dates. Results encrypted before storage.
Cloud Translation · Multilingual
Handles Spanish ↔ English translation for intake narratives and form field population. All translated medical/legal content flagged for human review.
Cloud Run · Application Layer
Serverless containerized backend (Node.js / Python). Auto-scales. Connected to VPC via VPC connector. No public IP. Accessed via load balancer only.
Cloud SQL · Relational Database
PostgreSQL for case data, user records, audit events, consent records. Private Service Connect — no public IP ever. Encrypted with CMEK. Daily automated backups.
Cloud Storage · Document Vault
Encrypted document storage (uploaded files + generated PDFs). CMEK encryption, versioning enabled, lifecycle policies for retention/deletion. Signed URL access only — no public bucket.
Cloud KMS · Key Management
Customer-managed encryption keys for all storage. Key rotation every 90 days. Separate key rings for documents, database, and audit logs. Access to keys logged via Cloud Audit Logs.
Cloud Identity Platform · Authentication
User authentication, MFA (TOTP + SMS), session management, JWT issuance. Replaces need for a custom auth system. Integrates directly with IAM for RBAC.
Cloud Armor · WAF / DDoS
Web Application Firewall with OWASP Top 10 managed rules. DDoS protection at network layer. Rate limiting enforced at CDN edge before requests reach application.
Secret Manager · Secrets
All API keys, database passwords, and credentials stored here. Zero hardcoded secrets in source code or environment variables. Rotation alerts configured.
Cloud Logging / Chronicle · SIEM
Centralized security logging. Data Access logs, Admin Activity logs, application logs. Chronicle SIEM for threat detection and incident investigation. 7-year log retention.
Cloud Build + Artifact Registry · CI/CD
Automated build, test, SAST, and deployment pipeline. Binary Authorization enforces that only signed, verified container images can be deployed to production. No manual deployments.

Structure & Process Flow

The canonical system flow — from client arrival to case closed. Every node maps to a built component. The Security & Audit Layer is not a final step; it wraps the entire platform and is active throughout every stage.

GovEase DCW — Complete Case Flow

Client-facing journey + internal processing + security perimeter

Legend: Start / End · Process · Decision · System Layer

[Flowchart: GovEase DCW complete system process flow, from client arrival through authentication, smart intake, document validation, AI processing, human review, and secure form delivery to case closed.]

Security & Audit Layer (active throughout the entire flow): Audit Logs · Encryption at Rest / Transit · Secrets Vault · Cloud KMS

Nodes, in order:
  • Client Arrival
  • Language Selection: English / Español, chosen before any account creation
  • 🔐 Secure Authentication: Google Identity Platform · MFA enforced
  • New User: MFA Setup & Registration
  • Existing User: MFA Login
  • 💬 Client Intake Layer: Smart Questions · Branching Logic · Save & Resume
  • Rules Engine Logic Check
  • Determine Form Type
  • Document Checklist Generation: Dynamic per case type + intake answers
  • 📎 Document Collection Layer: Secure Uploads · Cloud Vision OCR · Virus Scan
  • Automated Validation
  • 🤖 Form Preparation Engine: Vertex AI · Field Mapping · Translation Layer
  • Generate Draft Forms / Templates: Watermarked DRAFT · Not for submission
  • 👁 Internal Reviewer Layer: Diff Dashboard · Queue · In-Platform Messaging
  • Staff Verification
  • 📦 Prepare Final Export Packet: Form + Checklist + Filing Instructions
  • 🔒 Controlled Secure Export: Signed URL · 72h expiry · Download only
  • Apply Data Retention Policy: Archive · Schedule deletion · GDPR/CCPA principles
  • ✓ Case Closed

Side branch: Notify Client (Missing / Invalid Docs)
Branch labels: new user, existing, Failed, Corrections needed, Success, Changes requested, Approved
Portal Status Checking: Removed from scope. No automated login to government portals. No credential storage.
Human Review (Mandatory): Every case passes through Staff Verification. No auto-approval path exists in the system.
Agency Submission: Platform generates a complete packet only. The applicant or caseworker submits to the agency manually.

Preferred Tech Stack

Every choice is made with security, auditability, and long-term maintainability in mind. Where alternatives exist, they are noted. All AI/ML infrastructure runs on Google Cloud — no data leaves the GCP project boundary.

Frontend

Client Layer
Next.js 14 · Primary
React framework with App Router. SSR for SEO + security (no sensitive state in client JS). TypeScript enforced.
TypeScript
Strict type safety across frontend and API contracts. Reduces entire categories of runtime bugs.
Tailwind CSS
Utility-first, no CSS-in-JS runtime cost. Design tokens via CSS variables for brand consistency + dark mode.
i18next
Industry standard for EN/ES bilingual UI. Namespace-based — legal/medical strings managed separately with version control.
React Hook Form
Performant form validation. Uncontrolled inputs reduce re-renders on long intake forms.
Zod
Schema validation shared between frontend and backend. Single source of truth for input rules.

Backend

API Layer
Node.js / NestJS Primary
Structured, opinionated framework with built-in DI, guards, interceptors. Enforces security patterns at architecture level.
Python / FastAPI Alt (AI Services)
Python used for AI microservices (Vertex AI SDK, OCR processing). FastAPI for async, high-performance endpoints.
REST + OpenAPI 3.0
Fully documented API spec. Enables automated contract testing and security scanning against declared schema.
Prisma ORM
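Type-safe database access. Queries built through the Prisma Client API are parameterized by default, which removes the most common SQL injection paths. Raw-query escape hatches such as $queryRawUnsafe bypass that protection and should be banned or reviewed case by case.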
BullMQ (Redis)
Async job queue for OCR processing, form generation, and notifications — keeps API response times fast.
Helmet.js + CORS
HTTP security headers enforced on all responses. CORS restricted to verified origin domains only.

Data & Storage

Persistence Layer
Cloud SQL (PostgreSQL) GCP
Primary relational database. Row-level security enforced. No public IP. CMEK encryption. Automated daily backups.
Cloud Storage GCP
Document vault. CMEK-encrypted buckets. Signed URLs for time-limited access. Versioning + lifecycle policies.
Firestore GCP
Real-time session state + case status updates pushed to frontend. Document-model fits case object structure.
Cloud Memorystore (Redis)
Session cache + BullMQ job queue backend. Private VPC only. No public endpoint.
Cloud KMS GCP
Customer-managed encryption keys for all data stores. Key ring separation by data sensitivity tier. 90-day rotation.
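
The time-limited access described for Cloud Storage can be illustrated without any cloud dependency. This is a simplified HMAC-based sketch of the idea, not Google's V4 signing process; in a deployed system the Cloud Storage client library would generate the URL. The names `sign_link` and `verify_link` and the link format are hypothetical. The 72-hour lifetime is the value shown in the process flow.

```python
import hashlib
import hmac

EXPIRY_SECONDS = 72 * 60 * 60  # 72-hour lifetime from the process flow


def sign_link(object_path: str, issued_at: int, key: bytes) -> str:
    """Return 'path?expires=<ts>&sig=<hex>' for a download-only (GET) link."""
    expires = issued_at + EXPIRY_SECONDS
    message = f"GET\n{object_path}\n{expires}".encode()
    sig = hmac.new(key, message, hashlib.sha256).hexdigest()
    return f"{object_path}?expires={expires}&sig={sig}"


def verify_link(link: str, now: int, key: bytes) -> bool:
    """Accept the link only if the signature matches and it has not expired."""
    path, _, query = link.partition("?")
    params = dict(pair.split("=", 1) for pair in query.split("&") if "=" in pair)
    try:
        expires = int(params["expires"])
    except (KeyError, ValueError):
        return False
    message = f"GET\n{path}\n{expires}".encode()
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how much of the signature matched via timing.
    return hmac.compare_digest(expected, params.get("sig", "")) and now < expires
```

The expiry timestamp is part of the signed message, so a client cannot extend a link by editing the `expires` parameter; any change to the path or the timestamp invalidates the signature.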

AI / ML Services

Intelligence Layer
Vertex AI — Gemini Pro GCP
Core LLM for intake logic, eligibility scoring, consistency checking, and form field narrative mapping. Private VPC endpoint. PII tokenized before every call.
Cloud Vision API GCP
OCR for all uploaded documents. Document type classification. Quality and completeness scoring. Expiry date extraction.
Cloud Translation API GCP
ES→EN translation for intake narratives and form outputs. Confidence scores logged. Low confidence outputs flagged for human review.
Vertex AI Explainability
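Decision attribution for every AI output used in a case. Required for cybersecurity audit trail and regulatory review. Vertex Explainable AI produces feature attributions for custom-trained and AutoML models, so attribution for Gemini outputs will likely need to be captured separately (logged prompt version, inputs, and output per call). Confirm the approach during architecture review.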
LangChain (Python)
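Orchestration layer for multi-step AI workflows (intake → validation → mapping). Prompt versioning and A/B testing are not part of the core library and need companion tooling or an in-house prompt registry kept inside the GCP boundary.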
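
The Gemini entry above states that PII is tokenized before every call. That step can be sketched as a reversible substitution wrapped around the model call. This is a minimal illustration that assumes only US Social Security numbers in the NNN-NN-NNNN form; the `tokenize_pii` and `restore_pii` helpers and the `[[SSN_n]]` placeholder format are hypothetical, and a production system would need to cover many more identifier types.

```python
import re

# Matches SSNs written as NNN-NN-NNNN; other formats are out of scope here.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def tokenize_pii(text: str) -> tuple[str, dict[str, str]]:
    """Replace each distinct SSN with a placeholder; return text and the mapping."""
    vault: dict[str, str] = {}  # placeholder -> original value
    seen: dict[str, str] = {}   # original value -> placeholder

    def swap(match: re.Match) -> str:
        value = match.group(0)
        if value not in seen:
            token = f"[[SSN_{len(seen) + 1}]]"
            seen[value] = token
            vault[token] = value
        return seen[value]

    return SSN_PATTERN.sub(swap, text), vault


def restore_pii(text: str, vault: dict[str, str]) -> str:
    """Put the original values back into text returned by the model."""
    for token, value in vault.items():
        text = text.replace(token, value)
    return text
```

Only the tokenized text would be sent to the model; the mapping stays in the calling service and is applied to the response afterwards. Reusing one placeholder per distinct value lets the model still recognise that two mentions refer to the same identifier.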

Security Infrastructure

Protection Layer
Cloud Identity Platform GCP
Auth, MFA (TOTP + SMS), JWT issuance, session management. Replaces custom auth — reduces attack surface.
Cloud Armor GCP
WAF (OWASP Top 10 managed rules), DDoS protection, rate limiting at CDN edge before requests hit the app.
VPC Service Controls GCP
Security perimeter around all GCP resources. Prevents data exfiltration even if credentials are compromised.
Secret Manager GCP
All credentials and API keys stored with audit log on every access. Zero hardcoded secrets anywhere in codebase.
Chronicle SIEM GCP
Centralized security event log. Real-time threat detection, anomalous access alerting, 7-year log retention.
Snyk + OWASP ZAP
SAST/SCA on every PR (Snyk). Automated DAST in staging (OWASP ZAP). Any critical finding blocks deployment.
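
The deployment block on critical findings can be expressed as a small gate step in the pipeline. The sketch below assumes scanner results have already been normalised into a list of dictionaries with a `severity` key; that format, and the `blocking_findings` and `gate` names, are hypothetical. Snyk and OWASP ZAP each have their own report schema, which is not reproduced here.

```python
# Severities that stop a deployment. The document names "critical" only.
BLOCKING_SEVERITIES = {"critical"}


def blocking_findings(findings: list[dict]) -> list[dict]:
    """Return the findings whose severity is in the blocking set."""
    return [
        f for f in findings
        if str(f.get("severity", "")).lower() in BLOCKING_SEVERITIES
    ]


def gate(findings: list[dict]) -> int:
    """Return a process exit code: 0 lets the pipeline continue, 1 stops it."""
    return 1 if blocking_findings(findings) else 0
```

A CI step could pass the return value of `gate` to `sys.exit`, so that a non-zero code fails the build before any deploy stage runs.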

DevOps & Infrastructure

Platform Layer
Terraform IaC
All GCP infrastructure defined as code. Reproducible, version-controlled, auditable. No manual console changes in production.
Cloud Run GCP
Serverless containers. Auto-scales to zero. VPC-connected. No VM management overhead. Binary Authorization enforced.
Cloud Build + Artifact Registry GCP
Automated CI/CD pipeline. SAST integrated. Signed container images only — Binary Authorization blocks unsigned deployments.
GitHub + Branch Protection
Source control. Required PR reviews. Secret scanning on every push. Main branch requires 2 approvals + passing CI.
Cloud Monitoring + Uptime
SLA monitoring, error rate alerting, AI cost-per-case dashboards, reviewer queue depth tracking.
DocuSign / HelloSign API TBD
E-signature and digital acknowledgment capture. Decision pending compliance review — Google Workspace eSign is an alternative.
📌 PDF Generation: pdf-lib (Node.js) for server-side fillable PDF generation from mapped form data, with Puppeteer as a fallback for complex layouts. Generated PDFs are watermarked "DRAFT" until human review approval is recorded. All generation happens server-side, so there is no client-side PDF construction that could be tampered with.

Compliance Framework

GovEase DCW handles sensitive personal data — disability records, military service history, and Social Security numbers. The table below maps what we will implement as controls, what we inherit from GCP's infrastructure, and what we are aligned to as a framework. We are not claiming certifications we do not yet hold.

Cyber Test Baseline
OWASP Top 10
Open Web Application Security Project — Web + API
  • This is what a pen tester will run against us — it is the primary test target
  • Injection (SQL, prompt): parameterized queries + AI input sanitization layer
  • Broken Access Control: RBAC enforced at API, DB, and UI layers
  • Cryptographic Failures: AES-256 at rest, TLS 1.3 in transit, no weak ciphers
  • OWASP API Security Top 10 applied separately to every backend endpoint
  • DAST scan (OWASP ZAP) in staging before any production deployment
  • We make no certification claim — we implement the controls
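
The injection control listed above relies on parameter binding. The following sketch uses Python's built-in `sqlite3` module as a stand-in for the PostgreSQL and Prisma stack named in this plan; the `cases` table and the `find_cases` and `demo` functions are hypothetical. It shows that a bound parameter is compared as a literal value, so a classic injection string matches nothing.

```python
import sqlite3


def find_cases(conn: sqlite3.Connection, owner: str) -> list[tuple]:
    # The "?" placeholder sends `owner` as a bound value, never as SQL text.
    cur = conn.execute("SELECT id, owner FROM cases WHERE owner = ?", (owner,))
    return cur.fetchall()


def demo() -> tuple[list, list]:
    """Run one legitimate lookup and one hostile lookup against a toy table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cases (id INTEGER PRIMARY KEY, owner TEXT)")
    conn.executemany("INSERT INTO cases (owner) VALUES (?)", [("alice",), ("bob",)])
    legit = find_cases(conn, "alice")
    # With string concatenation this input would return every row.
    hostile = find_cases(conn, "alice' OR '1'='1")
    conn.close()
    return legit, hostile
```

Calling `demo()` returns one row for the legitimate lookup and an empty list for the hostile one, because no owner is literally named `alice' OR '1'='1`.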
Controls Implemented
NIST CSF 2.0
NIST Cybersecurity Framework — used as a reference structure
  • Used as an internal design reference, not claimed as a certification
  • Identify: asset inventory, data classification, threat model (STRIDE)
  • Protect: access controls, encryption, secure CI/CD pipeline
  • Detect: Cloud Logging, Chronicle SIEM, anomaly alerts
  • Respond: incident response runbook written before go-live
  • Recover: backup restoration tested, DR procedure documented
GCP Infrastructure
HIPAA-Eligible Architecture
Health Insurance Portability and Accountability Act
  • We are not claiming HIPAA certification — the platform is not a covered entity
  • We run on GCP services that are HIPAA-eligible (Google BAA available)
  • PHI-handling principles applied: minimum necessary, encryption, audit logs
  • No PHI sent to any service outside the GCP project boundary
  • If the client's use case requires formal HIPAA compliance, a BAA with Google must be signed and a formal risk assessment conducted — not included in MVP scope
GCP Infrastructure
SOC 2 — Aligned Controls
Service Organization Control 2
  • We do not hold a SOC 2 certificate at MVP — this takes 6–12 months to audit
  • GCP is SOC 2 Type II audited; our infrastructure layer runs on those audited controls, but the report is Google's, not ours
  • Platform-level controls are designed to be SOC 2 aligned (Security + Availability criteria)
  • SOC 2 audit is a Year 2 target once the platform has operational history
  • Security controls documented now so evidence collection is ready when audit begins
By Design (MVP)
GDPR Principles
General Data Protection Regulation — design principles applied
  • Data minimisation: only fields required per form type are collected
  • Consent: explicit opt-in captured, timestamped, version-controlled
  • Right to erasure: deletion workflow built and tested before go-live
  • We are not making a formal GDPR certification claim — we apply the principles
  • If EU users are in scope, a formal DPIA and legal basis review is required — this is a client decision, not included in build cost
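
Two of the principles above, data minimisation and timestamped, versioned consent, lend themselves to a short sketch. The form type keys, field names, and the `minimise` and `record_consent` helpers below are all hypothetical placeholders; the real allow-list would come from the form types chosen for the MVP.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Hypothetical allow-list: form type -> fields that form actually needs.
REQUIRED_FIELDS = {
    "form_a": {"full_name", "date_of_birth"},
    "form_b": {"full_name", "service_branch"},
}


@dataclass(frozen=True)
class ConsentRecord:
    user_id: str
    policy_version: str  # version of the consent text the user saw
    granted_at: datetime  # always stored in UTC


def minimise(form_type: str, answers: dict) -> dict:
    """Keep only the fields on the allow-list for this form type."""
    allowed = REQUIRED_FIELDS[form_type]
    return {k: v for k, v in answers.items() if k in allowed}


def record_consent(
    user_id: str, policy_version: str, now: Optional[datetime] = None
) -> ConsentRecord:
    """Capture an explicit opt-in with a UTC timestamp and the policy version."""
    return ConsentRecord(user_id, policy_version, now or datetime.now(timezone.utc))
```

Filtering at the point of persistence means an extra field submitted by a client is dropped rather than stored, and the frozen dataclass keeps a consent record from being edited after capture.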
By Design (MVP)
CCPA / CPRA Principles
California Consumer Privacy Act / Privacy Rights Act
  • Right to know and right to delete workflows built into case management
  • No user data sold or shared with third parties for advertising purposes
  • Privacy policy covers all collected data categories
  • Formal legal review of the privacy policy is the client's responsibility before launch
Design Target
WCAG 2.1 Level AA
Web Content Accessibility Guidelines
  • AA is the design target — not a certified claim at MVP launch
  • Screen reader compatibility (ARIA labels, semantic HTML structure)
  • Color contrast ratio ≥ 4.5:1 for body text enforced in design system
  • Keyboard navigation operable without mouse on all critical flows
  • Full independent WCAG audit recommended before any public-sector rollout
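
The 4.5:1 body-text threshold above can be checked automatically in the design system. The sketch below follows the relative luminance and contrast ratio definitions from WCAG 2.1; the function names are hypothetical, and colours are assumed to be opaque sRGB triples with channels from 0 to 255.

```python
def _linear(channel: int) -> float:
    """Convert one 0-255 sRGB channel to its linear-light value."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple[int, int, int]) -> float:
    """WCAG relative luminance of an opaque sRGB colour."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: tuple[int, int, int], bg: tuple[int, int, int]) -> float:
    """Ratio of the lighter to the darker colour, each offset by 0.05."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa_body_text(fg: tuple[int, int, int], bg: tuple[int, int, int]) -> bool:
    """True when the pair meets the 4.5:1 minimum for body text."""
    return contrast_ratio(fg, bg) >= 4.5
```

As a sanity check, grey `#767676` on white passes the body-text threshold while the slightly lighter `#777777` falls just short, which is why token-level checks are more reliable than judging by eye.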
GCP-Inherited Only
ISO 27001 / 27017
Information Security Management (Cloud-Specific)
  • GCP holds ISO 27001 and 27017 certifications — this covers the infrastructure we run on
  • These are Google's certifications, not ours — we do not inherit the certificate
  • We can reference GCP's ISO attestation in security questionnaires as infrastructure evidence
  • Platform-level ISO 27001 certification is not in MVP or near-term scope
Not In Scope
FedRAMP
Federal Risk and Authorization Management Program
  • Not applicable to the current product — GovEase is a private platform, not a federal system
  • GCP services used are FedRAMP-authorized, which is the relevant infrastructure fact
  • If future contracts require FedRAMP authorization for the platform itself, this is a multi-year, multi-hundred-thousand-dollar effort — not a Phase 1–4 commitment
  • Removed from commitments entirely

What We Actually Claim vs. What We Inherit

Key Principle: A cyber tester will test your controls, not your paperwork. Claiming a certification you don't hold is a documentation risk. We claim only what is technically verifiable in our deployed system. Pen test findings are addressed openly — not hidden or disputed.
Area | What We Claim (Testable) | What We Do NOT Claim
Encryption | AES-256 at rest (CMEK), TLS 1.3 in transit; verifiable in GCP config | No encryption certification; it's a control, not a certificate
Authentication | MFA enforced, JWT validation on all endpoints, brute-force lockout; pen-testable | Not claiming zero auth vulnerabilities; that's what pen testing determines
HIPAA | GCP HIPAA-eligible infrastructure; PHI principles applied; no data outside GCP boundary | Not HIPAA certified; platform is not a covered entity at MVP. BAA with Google requires separate agreement
SOC 2 | Controls aligned to SOC 2 Security criteria; GCP infrastructure is SOC 2 Type II certified | No platform-level SOC 2 certificate; audit not conducted. Target Year 2
ISO 27001 | GCP holds ISO 27001/27017; relevant as infrastructure evidence in questionnaires | Not our certificate; it is Google's. Cannot be cited as a platform-level claim
GDPR / CCPA | Design principles applied: consent, deletion workflow, data minimisation built in | Not certified compliant; formal legal review is client's responsibility pre-launch
OWASP Top 10 | All controls implemented and DAST-tested; this is the primary pen test target | No blanket "OWASP compliant" claim; findings from testing are tracked and remediated
FedRAMP | GCP services are FedRAMP-authorized (Google's authorization for their infrastructure) | No FedRAMP authorization for the platform; removed from all commitments