GovEase DCW — Project Master Plan

Before We Write a Single Line of Code

Four Conversations We Must Have First

We need a design and strategy discussion before anything is built. Below are the four mandatory pre-build conversations, what each one must resolve, and why skipping any of them creates expensive rework downstream.

🎨

Design System Discussion

We need a dedicated design session before the first pixel is placed. This platform handles sensitive personal data for vulnerable users — every design decision carries trust implications.

Define brand voice and visual language (not just a logo)
Establish accessibility baseline: WCAG 2.1 AA minimum
Agree on mobile-first vs. responsive priorities
Bilingual UI behavior: language switcher placement, RTL fallback plans for future languages (English and Spanish are both left-to-right)
Decide on component library (recommendation: Material Design + custom tokens)
Define "anxiety reduction" principles — this UI serves people under stress
Prototype intake tone: formal government-adjacent vs. warm conversational
Define what "trust" looks like visually — disclaimers, consent screens, progress indicators

Milestone 0 prerequisite

🗺

User Journey Mapping

The full journey from first visit to delivered form must be agreed on paper before architecture begins. It determines auth flows, data persistence points, notification triggers, and hand-off protocols.

Map every touchpoint: onboarding, intake, document upload, review, delivery
Define "save and resume" behavior and session timeout policies
Clarify how a reviewer requests more info from the client
Define what "form delivered" means: PDF download, secure link, email, all three?
Map edge cases: incomplete documents, failed validation, reviewer escalation
Define the language selection UX — when and how does the user choose?
Clarify multi-case scenarios: can one user have multiple active applications?

Informs data model design

💰

Cost Analysis with Client

The client needs a realistic cost conversation now, not after the build. GCP infrastructure, Vertex AI token consumption, document storage, OCR processing, and per-case economics all need to be modeled before any commitment.

GCP project setup + VPC / security services baseline (~$800–1,200/mo MVP)
Vertex AI token cost: ~$0.0005–0.002 per 1K tokens depending on model
Cloud Storage + KMS encryption for documents
Cloud Vision API for OCR: ~$1.50 per 1,000 pages
Cloud Run or GKE for backend services
Estimate per-case cost (target: under $0.50/case at scale)
Development budget vs. infrastructure budget split
Staging vs. production environment cost separation

Client approval needed

🤖

Why Vertex AI, Not OpenAI/Anthropic API

The client must understand this decision — it's not a preference, it's a security and compliance imperative for a platform handling PII, disability records, and immigration data.

Data never leaves Google's infrastructure boundary
Customer Data Processing terms prevent training on PII inputs
VPC Service Controls: AI calls stay within the private network perimeter
SOC2/HIPAA-eligible infrastructure out of the box
Gemini fine-tuning capability for SSA/VA-specific language without data leaving GCP
Explainability logging: every AI decision is auditable
Cost at scale: committed use discounts, no OpenAI premium
Single vendor accountability: GCP, storage, AI, IAM all in one audit scope

Architecture decision locked

The Complete User Journey

From First Login to Delivered Form

This is the canonical user journey the entire platform must be designed around. Every step is a distinct security checkpoint, AI interaction, or human handoff point.

GovEase DCW — End-to-End Case Flow

Applicant (end user) perspective · SSA/VA Initial Claim · English or Spanish intake

Landing & Language Selection

User arrives at GovEase. Platform immediately presents language choice (English / Español) before any account creation. This selection is stored in session and user profile.

No PII collected yet | Language token stored

Account Creation & Identity Verification

User creates account with email + password (bcrypt hashed, salted). MFA enforced via TOTP or SMS. Google Identity Platform handles auth tokens. SSO option for returning users.

MFA enforced | AES-256 at rest from this point | JWT session tokens

All data encrypted in transit via TLS 1.3 minimum. Cloud Armor DDoS protection active.

DCW Disclaimer & Consent

Full-screen "Digital Case Worker" disclaimer presented in selected language. User must actively check boxes (no pre-checked). Consent event logged with timestamp, IP, user ID, and version of consent text. Cannot proceed without completion.

Consent logged immutably | Version-controlled disclaimer text
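The consent step above can be sketched as an append-only log in which each record carries the hash of its predecessor, so later tampering is detectable. This is a minimal illustration, not the platform's implementation: `ConsentRecord`, `log_consent`, and `verify_chain` are hypothetical names, and an in-memory list stands in for whatever durable store is chosen.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class ConsentRecord:
    user_id: str
    ip_address: str
    consent_text_version: str
    language: str
    accepted_at: str  # ISO-8601 UTC timestamp
    prev_hash: str    # hash of the previous record ("" for the first one)

    def record_hash(self) -> str:
        """SHA-256 over the canonical JSON form of the record."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def log_consent(log, user_id, ip_address, version, language, boxes_checked):
    """Append a consent record; refuse unless every box was actively checked."""
    if not boxes_checked or not all(boxes_checked):
        raise ValueError("all consent boxes must be actively checked")
    prev = log[-1].record_hash() if log else ""
    record = ConsentRecord(
        user_id, ip_address, version, language,
        datetime.now(timezone.utc).isoformat(), prev,
    )
    log.append(record)
    return record


def verify_chain(log) -> bool:
    """True when every record still points at the hash of its predecessor."""
    expected = ""
    for record in log:
        if record.prev_hash != expected:
            return False
        expected = record.record_hash()
    return True
```

Chaining hashes only makes edits detectable; actual immutability would still depend on storage-level controls such as retention locks.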

Case Type Selection & Pre-Flight Eligibility Check

User selects benefit type (SSDI / SSI / VA Disability / VA Healthcare). The DCW runs a 5-question "Pre-Flight" eligibility screen via Vertex AI. If an obvious disqualifier is detected (e.g., active work income above the Substantial Gainful Activity (SGA) threshold for SSDI), the user is advised before spending time on a full application.

Vertex AI · Gemini | Eligibility logic branching
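The SGA example above is simple enough that a deterministic guard could run alongside the Vertex AI screen. The sketch below assumes a hypothetical `PreFlightAnswers` structure and takes the SGA limit as a caller-supplied parameter, since SSA revises that figure and no value is fixed in this plan.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PreFlightAnswers:
    benefit_type: str            # e.g. "SSDI", "SSI", "VA_DISABILITY", "VA_HEALTHCARE"
    currently_working: bool
    monthly_work_income: float   # gross monthly earnings in dollars


def preflight_advisory(answers: PreFlightAnswers,
                       sga_monthly_limit: float) -> Optional[str]:
    """Return an advisory message for an obvious SSDI disqualifier, else None.

    Only the SGA income check is modeled here; other benefit types and
    other screening questions are out of scope for this sketch.
    """
    if answers.benefit_type != "SSDI":
        return None
    if answers.currently_working and answers.monthly_work_income > sga_monthly_limit:
        return ("Your reported work income is above the current SGA threshold, "
                "which usually rules out SSDI. You can still continue if you wish.")
    return None
```

The function advises rather than blocks, matching the journey step's wording that the user "is advised" before continuing.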

Smart Intake — Conversational Q&A

DCW conducts a branching, conversational intake in the user's chosen language. Questions adapt based on prior answers (the "Smart Questions" engine). For SSA, this maps to the Sequential Evaluation Process (SGA → Duration → Blue Book → Vocational). For VA, this runs the Presumptive/PACT Act logic. All answers stored encrypted in Firestore.

Vertex AI · Branching logic | Encrypted session storage | Save & resume supported

No raw PII is sent to the AI: a tokenization layer strips identifiers before every call.
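A tokenization layer of this kind can be sketched as a reversible substitution: identifiers are swapped for opaque tokens before the prompt leaves the service, and swapped back when the response returns. The class below is an assumption-laden illustration (`PiiTokenizer` is a hypothetical name). It only catches SSNs and dates in two fixed formats; names and free-text identifiers would need a proper detection service.

```python
import re
import secrets

# Assumed formats for this sketch: 123-45-6789 and MM/DD/YYYY.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
DOB_PATTERN = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")


class PiiTokenizer:
    """Replaces matched identifiers with random tokens and can restore them."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def _token_for(self, kind: str, value: str) -> str:
        # Reuse the token for a repeated value so the model sees consistent text.
        if value not in self._value_to_token:
            token = f"[{kind}_{secrets.token_hex(4)}]"
            self._value_to_token[value] = token
            self._token_to_value[token] = value
        return self._value_to_token[value]

    def tokenize(self, text: str) -> str:
        text = SSN_PATTERN.sub(lambda m: self._token_for("SSN", m.group()), text)
        text = DOB_PATTERN.sub(lambda m: self._token_for("DOB", m.group()), text)
        return text

    def detokenize(self, text: str) -> str:
        for token, value in self._token_to_value.items():
            text = text.replace(token, value)
        return text
```

The token-to-value map is itself sensitive and would need the same encryption and access controls as the source data.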

Document Upload & Validation

System presents a dynamic document checklist based on case type and intake answers. User uploads documents (PDF, JPG, PNG — max 10MB each). Cloud Vision API performs OCR. Automated validation checks: file type, image quality/blur score, expiry dates, document type recognition. Blurry or incomplete documents trigger conversational re-upload guidance ("The registrar's seal appears cut off — please retake with the full page in view").

Cloud Vision OCR | Cloud Storage · Encrypted | Virus/malware scan on upload

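All uploaded files are malware-scanned before storage. Security Command Center aggregates findings but does not itself scan uploaded objects, so a dedicated scanning step is required. Original filenames are sanitized.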

AI Consistency Check & Data Validation

Before generating the form, Vertex AI runs a consistency scan across all intake answers and OCR-extracted document data. Contradictions are flagged (e.g., stated lifting limit vs. described daily activities). User is presented with specific clarification prompts — not generic error messages. Internal Adjudicator logic applied for SSA Blue Book matching.

Vertex AI · Consistency engine | Contradiction detection | Pre-fill validation

Draft Form Generation

Vertex AI proposes the mapping of validated intake data to government form fields; the final field mapping is applied by deterministic, auditable rules. Dual-language handling: intake in Spanish, form output in English (or as configured). System generates a draft PDF/structured data package. Translation of user-provided narrative is handled by Cloud Translation API with human review flag for all medical/legal statements.

Vertex AI · Form mapping | Cloud Translation API | Draft PDF generated

Mandatory Human Review

All cases enter human review queue — no auto-submission is possible. Reviewer sees the "Diff Dashboard" (discrepancies between OCR data and user input highlighted). Reviewer can approve, request clarification from client (in-platform messaging), or escalate to senior reviewer. Full audit trail of all reviewer actions logged.

Human in the loop (mandatory) | Reviewer access logged | Diff view

RBAC enforced — reviewers only see cases assigned to them. No bulk access to all records.
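The "Diff Dashboard" comparison in this step reduces to a field-by-field check of applicant-stated values against OCR-extracted ones. A minimal sketch, assuming both sides have already been reduced to flat dictionaries of strings (the field names and the `diff_fields` helper are hypothetical):

```python
def normalize(value: str) -> str:
    """Case-fold and collapse whitespace so trivial differences are ignored."""
    return " ".join(value.strip().lower().split())


def diff_fields(stated: dict, ocr: dict) -> list:
    """List discrepancies between stated answers and OCR-extracted data.

    Each entry records the field, both values, and whether the problem is
    a value mismatch or a field present on only one side.
    """
    discrepancies = []
    for field in sorted(set(stated) | set(ocr)):
        s, o = stated.get(field), ocr.get(field)
        if s is None or o is None:
            kind = "missing"
        elif normalize(s) != normalize(o):
            kind = "mismatch"
        else:
            continue
        discrepancies.append({"field": field, "stated": s, "ocr": o, "kind": kind})
    return discrepancies
```

Real documents would need type-aware comparison (dates, names with accents, address abbreviations); plain string normalization is only a starting point.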

Client Review & Acknowledgment

Approved draft is returned to the applicant for final review. Bilingual summary shown. User must actively acknowledge accuracy of all information before finalization. E-signature or digital acknowledgment captured (via DocuSign API or Google Workspace integration). Timestamp and version of form acknowledged is recorded.

Digital acknowledgment captured | Bilingual summary | Acknowledgment versioned

Finalized Form Delivery

Finalized form package delivered via: (a) secure download link (time-limited, signed URL from Cloud Storage), (b) email notification with instructions. Form package includes the completed government form, a document checklist summary, and filing instructions. The platform does NOT submit to government agencies — user/caseworker submits manually.

Signed URL delivery | Filing instructions included | Link expires in 72h
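The idea behind a time-limited signed link can be illustrated with the standard library alone. The sketch below is a stand-in for Cloud Storage signed URLs, not their actual format: it signs the path and expiry with an HMAC and rejects the link once 72 hours have passed. `sign_link`, `verify_link`, and the URL layout are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlsplit

LINK_TTL_SECONDS = 72 * 60 * 60  # 72-hour expiry, as stated in the delivery step


def _signature(path: str, expires: int, secret: bytes) -> str:
    message = f"{path}|{expires}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_link(path: str, secret: bytes, now: Optional[float] = None) -> str:
    """Return the path with an expiry timestamp and HMAC signature appended."""
    issued = time.time() if now is None else now
    expires = int(issued + LINK_TTL_SECONDS)
    return f"{path}?expires={expires}&sig={_signature(path, expires, secret)}"


def verify_link(url: str, secret: bytes, now: Optional[float] = None) -> bool:
    """Accept only an untampered link that has not yet expired."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    try:
        expires = int(query["expires"][0])
        supplied = query["sig"][0]
    except (KeyError, IndexError, ValueError):
        return False
    current = time.time() if now is None else now
    expected = _signature(parts.path, expires, secret)
    # compare_digest avoids leaking signature bytes through timing differences.
    return hmac.compare_digest(supplied, expected) and current < expires
```

In the planned architecture the storage service would issue and verify the signature; the sketch only shows why changing the path or waiting past the expiry invalidates the link.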

Data Retention & Case Archival

Case enters retention policy: active cases retained for agreed period (TBD with client), then moved to encrypted cold archive. Data deletion workflow triggered on user request (GDPR/CCPA compliance). Audit logs retained separately per compliance requirements. Deletion confirmed via email to user.

Data retention policy enforced | Right to deletion workflow | Audit log preserved

Architecture Decision · AI Infrastructure

Why Vertex AI, Not OpenAI or Anthropic

This platform handles Social Security Numbers, disability records, military service histories, and immigration status. The AI vendor choice is not a preference — it is a security and compliance decision that must be explained clearly to the client.

🚨

Critical Issue with Commercial LLM APIs: Sending user PII, disability narratives, or immigration history to OpenAI or Anthropic via their standard commercial APIs means that data leaves your infrastructure boundary and is processed on their servers. This creates HIPAA-equivalent exposure, breaks most government data handling policies, and introduces a third-party data processing relationship that requires explicit contractual coverage — which commercial APIs do not provide at startup pricing tiers.

Consideration	Vertex AI (Gemini) on GCP	OpenAI / Anthropic API
Data sovereignty	✔ Data stays inside your GCP project boundary. VPC Service Controls restrict data movement out of the perimeter, which sharply limits exfiltration paths.	✘ Data is sent to and processed on third-party servers. Standard API terms do not guarantee zero retention.
PII handling guarantee	✔ Google's Customer Data Processing Addendum explicitly prohibits using customer data to train models.	~ Enterprise agreements exist but are expensive. Not available on standard API tiers used by startups.
Compliance posture	✔ GCP is SOC 2 Type II and HIPAA BAA-eligible at infrastructure level. Single audit scope for the infrastructure layer.	~ OpenAI / Anthropic offer enterprise agreements but these are separate vendor relationships requiring their own DPA and security review every year.
Network-level isolation	✔ AI API calls can be routed entirely through private VPC with no public internet egress. Cloud Armor + IAP enforced.	✘ Calls must traverse the public internet to reach their API endpoints. No private peering available.
Fine-tuning on your data	✔ Vertex AI supports supervised fine-tuning with data that never leaves your GCP project. Ideal for SSA/VA-specific terminology and form logic.	~ Fine-tuning available but sends your training data to their platform. Inappropriate for sensitive government domain data.
Audit & explainability	✔ Cloud Audit Logs can record every AI invocation once Data Access logs are enabled. Decision attribution for Gemini outputs relies on logged prompts and responses, since Vertex Explainable AI targets custom-trained and AutoML models. Required for cybersecurity review.	✘ Black box API calls. No per-call audit log accessible to you. Impossible to trace AI decisions during a security audit.
Cost at scale	✔ Committed Use Discounts, custom model hosting, per-token pricing. Single billing account with all other GCP services.	~ Competitive at small scale but no discount mechanism ties to your existing infrastructure spend. Separate billing.
Single vendor accountability	✔ GCP, Cloud Storage, Cloud Vision, Cloud Translation, Vertex AI — one vendor, one BAA, one audit scope, one support contract.	✘ Introduces a separate vendor relationship requiring its own security questionnaire, legal review, and DPA every year.

✅

Bottom line for the client: Vertex AI is chosen because it keeps all data inside a single, auditable GCP boundary — which simplifies the security posture significantly for a platform handling sensitive personal data. It does not make us HIPAA-certified or SOC2-certified. It means our infrastructure layer has fewer moving parts, fewer vendor relationships, and a cleaner story to tell during any security review.

Financial Planning · Client Discussion

Cost Analysis Framework

These are the cost dimensions that must be discussed with the client before commitment. Numbers are estimates based on GCP pricing as of 2025/2026 and should be modeled against the client's projected case volume.

Monthly Infrastructure (MVP Stage — ~500 cases/month)

Estimated: $1,200 – $2,800 / month

Service	Sizing basis	Estimated monthly cost
Cloud Run (Backend API)	Auto-scaling, 2 vCPU / 4GB RAM baseline	~$120–300 / mo
Cloud SQL (PostgreSQL)	Primary + replica, 2 vCPU, 7.5GB RAM, encrypted	~$200–350 / mo
Vertex AI (Gemini Pro)	~150K tokens per case × 500 cases = 75M tokens	~$150–300 / mo
Cloud Vision API (OCR)	~8 document pages per case × 500 = 4,000 pages	~$6 / mo
Cloud Storage (Documents)	~10MB per case × 500 cases + archive. Encrypted with CMEK.	~$15–40 / mo
Cloud Translation API	Spanish ↔ English, ~5K characters per case	~$12 / mo
Cloud KMS (Key Management)	CMEK for document encryption, audit key access	~$6 / mo
Cloud Armor + Load Balancer	DDoS protection, WAF rules, managed SSL	~$200 / mo
Cloud Logging / Monitoring / SIEM	Security audit logs, Chronicle integration	~$100–200 / mo
Cloud Identity Platform (Auth)	MFA, session management, up to 50K MAU free tier	~$0–50 / mo
Email / SMS Notifications	SendGrid or Google Workspace + Twilio	~$50–100 / mo
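As a sanity check before the client conversation, the itemized monthly ranges can be totaled directly. The short script below uses only the figures listed in this section. The totals come to roughly $859–1,564 per month, below the headline estimate of $1,200–2,800, so the headline presumably covers costs not itemized here; that gap is worth reconciling before the figures are presented.

```python
# (low, high) monthly estimates in USD, copied from the line items in this section.
LINE_ITEMS = {
    "Cloud Run (Backend API)": (120, 300),
    "Cloud SQL (PostgreSQL)": (200, 350),
    "Vertex AI (Gemini Pro)": (150, 300),
    "Cloud Vision API (OCR)": (6, 6),
    "Cloud Storage (Documents)": (15, 40),
    "Cloud Translation API": (12, 12),
    "Cloud KMS": (6, 6),
    "Cloud Armor + Load Balancer": (200, 200),
    "Cloud Logging / Monitoring / SIEM": (100, 200),
    "Cloud Identity Platform": (0, 50),
    "Email / SMS Notifications": (50, 100),
}

CASES_PER_MONTH = 500

total_low = sum(low for low, _ in LINE_ITEMS.values())
total_high = sum(high for _, high in LINE_ITEMS.values())

# Cross-check of the OCR line: 8 pages per case at $1.50 per 1,000 pages.
ocr_monthly = 8 * CASES_PER_MONTH * 1.50 / 1000

print(f"Itemized total: ${total_low:,} to ${total_high:,} per month")
print(f"Fully loaded cost per case at {CASES_PER_MONTH} cases: "
      f"${total_low / CASES_PER_MONTH:.2f} to ${total_high / CASES_PER_MONTH:.2f}")
print(f"OCR cross-check: ${ocr_monthly:.2f} per month")
```

At 500 cases the fully loaded infrastructure cost per case is well above the $0.50 target, which is consistent with that target applying to variable costs at higher volume rather than to the fixed baseline.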

Per-Case Unit Economics

Target: under $0.50 / case at 1,000+ cases/month

Cost component	Basis	Estimated cost per case
AI processing (intake + validation + mapping)	Vertex AI token cost, amortized	~$0.15 – $0.35
OCR document processing	Cloud Vision, ~8 pages	~$0.012
Storage (per case, per year)	Documents + audit logs in Cloud Storage	~$0.024
Translation (if bilingual)	Cloud Translation API	~$0.01

💡

Client pricing model to discuss: Consider a per-case fee (e.g., $15–50/case depending on form complexity), a monthly subscription per reviewer seat, or a hybrid model. At 500 cases/month with $25/case, gross revenue covers infrastructure + human review labor with margin. The unit economics improve significantly above 1,000 cases/month due to GCP committed use discounts and fixed infrastructure baseline costs.
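The per-case and revenue figures in this section can be checked with a few lines of arithmetic. The script below sums the listed variable costs and works through the $25/case example at 500 cases per month; the infrastructure range is the headline monthly estimate from this plan, and human review labor is deliberately left out because no figure is given for it.

```python
# Variable cost per case (low, high) in USD, from the unit economics above.
PER_CASE_COSTS = {
    "AI processing": (0.15, 0.35),
    "OCR document processing": (0.012, 0.012),
    "Storage (per case, per year)": (0.024, 0.024),
    "Translation (if bilingual)": (0.01, 0.01),
}

variable_low = round(sum(low for low, _ in PER_CASE_COSTS.values()), 3)
variable_high = round(sum(high for _, high in PER_CASE_COSTS.values()), 3)

CASES_PER_MONTH = 500
FEE_PER_CASE = 25                    # example fee from the pricing discussion
INFRA_MONTHLY = (1200, 2800)         # headline monthly infrastructure estimate

monthly_revenue = CASES_PER_MONTH * FEE_PER_CASE
left_for_labor_low = monthly_revenue - INFRA_MONTHLY[1]
left_for_labor_high = monthly_revenue - INFRA_MONTHLY[0]

print(f"Variable cost per case: ${variable_low} to ${variable_high}")
print(f"Monthly revenue: ${monthly_revenue:,}")
print(f"Left for review labor and margin: "
      f"${left_for_labor_low:,} to ${left_for_labor_high:,}")
```

The variable cost lands at about $0.20–0.40 per case, inside the $0.50 target. Whether the remaining revenue leaves a margin depends on reviewer labor, which needs its own estimate.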

Build Plan · Phase 1 MVP

Project Milestones

Ten milestones from discovery lock to controlled launch. Milestone 1 is UI first — no backend work begins until the design system, user journey, and core screens are approved. Security requirements are embedded into every milestone, not bolted on at the end.

Pre-Build

Discovery Lock & Client Alignment

2 weeks

Deliverables

Confirm 1–3 MVP form types (SSA-16, SSA-3368, VA 21-526EZ recommended)
Lock supported languages: English + Spanish for MVP
Define user roles: Applicant, Reviewer, Admin, Super Admin
Client sign-off on out-of-scope items (portal login confirmed removed)
User journey map approved by client and tech lead
Cost model presented and agreed
GCP project provisioned, IAM baseline configured
Signed Data Processing Agreement in place

Security Setup

GCP organization policy constraints defined
VPC and private subnet topology designed
Security controls baseline documented (OWASP-aligned, not a certification claim)
Threat model first draft — STRIDE analysis
Pen test scope agreed with cyber team

UI First — Priority

Design System & Core UI

3–4 weeks

Deliverables

Design system: color tokens, typography scale, component library
Language selection + onboarding screen (English/Spanish)
Account creation + MFA setup flow
DCW Disclaimer / Consent screen (bilingual)
Pre-flight eligibility questionnaire UI
Smart intake conversational interface (chat-like, progress-tracked)
Document upload UI with drag-and-drop, quality preview, error states
Case dashboard for applicants (status, progress, messages)
Reviewer dashboard wireframe (Diff view, queue management)
Form preview / final review screen
Figma prototype linked for all critical flows — signed off by client before dev starts

Security in UI

Session timeout UI behavior designed (15 min inactivity warning)
Sensitive field masking patterns defined (SSN, DOB display rules)
Consent UX reviewed for GDPR/CCPA compliance
Accessibility audit: WCAG 2.1 AA minimum across all screens
No sensitive data in browser local storage — design constraint documented

Infrastructure

Secure GCP Architecture & Data Model

2–3 weeks

Deliverables

VPC with private subnets, no public IPs on application servers
Cloud SQL (PostgreSQL) with Private Service Connect
Cloud Storage buckets with CMEK, versioning, lifecycle policies
Cloud Run services with VPC connector
Terraform IaC for all infrastructure (reproducible, auditable)
Staging and production environments fully separated
CI/CD pipeline with Cloud Build, artifact registry, secret scanning
Data model: Cases, Documents, Users, AuditEvents, ConsentRecords

Security Controls

VPC Service Controls perimeter around all sensitive services
Cloud Armor WAF rules deployed (OWASP Top 10 rulesets)
Secret Manager for all credentials — zero hardcoded secrets in code
Binary Authorization for container image signing
Cloud KMS CMEK for all data at rest
Cloud Audit Logs: Data Access logs enabled on all services

Backend Core

Authentication, Authorization & Case Management API

3 weeks

Deliverables

Google Identity Platform integration (MFA, session management)
RBAC: Applicant / Reviewer / Admin / Super Admin roles enforced at API layer
Case CRUD API: create, read (own), update (draft only), status transitions
Consent record creation and retrieval API
Audit event logging service (every state change recorded)
Notification service (email via SendGrid, in-app notifications)
API Gateway with rate limiting and authentication enforcement
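The "status transitions" and "update (draft only)" rules in the case API lend themselves to an explicit state machine, which also makes the mandatory human review structurally unavoidable. The sketch below is one possible shape; the status names and the `transition` helper are assumptions, not an agreed data model.

```python
from enum import Enum


class CaseStatus(Enum):
    DRAFT = "draft"
    IN_REVIEW = "in_review"
    CHANGES_REQUESTED = "changes_requested"
    ESCALATED = "escalated"
    APPROVED = "approved"
    ACKNOWLEDGED = "acknowledged"
    DELIVERED = "delivered"


# No edge leads from DRAFT to APPROVED or DELIVERED: every case must pass review.
ALLOWED_TRANSITIONS = {
    CaseStatus.DRAFT: {CaseStatus.IN_REVIEW},
    CaseStatus.IN_REVIEW: {CaseStatus.APPROVED, CaseStatus.CHANGES_REQUESTED,
                           CaseStatus.ESCALATED},
    CaseStatus.CHANGES_REQUESTED: {CaseStatus.DRAFT},
    CaseStatus.ESCALATED: {CaseStatus.APPROVED, CaseStatus.CHANGES_REQUESTED},
    CaseStatus.APPROVED: {CaseStatus.ACKNOWLEDGED},
    CaseStatus.ACKNOWLEDGED: {CaseStatus.DELIVERED},
    CaseStatus.DELIVERED: set(),
}


def can_update(status: CaseStatus) -> bool:
    """Case content is editable only while the case is a draft."""
    return status is CaseStatus.DRAFT


def transition(current: CaseStatus, target: CaseStatus,
               actor_id: str, audit_log: list) -> CaseStatus:
    """Apply a status change if allowed and record it as an audit event."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    audit_log.append({"from": current.value, "to": target.value, "actor": actor_id})
    return target
```

Keeping the transition table as data means the audit team can review the allowed paths without reading handler code.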

Security Controls

JWT validation on every API endpoint
Row-level security: users cannot query other users' cases at DB level
Input validation and sanitization on all endpoints (prevent SQLi/XSS)
Rate limiting: 100 req/min per authenticated user
Brute-force protection on auth endpoints (lockout after 5 failures)
OWASP API Security Top 10 checklist reviewed against every endpoint
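Two of the controls above, the 100 requests per minute limit and the lockout after 5 failed logins, can be sketched in a few lines. This is an in-memory illustration with hypothetical class names; a deployed version would keep its counters in a shared store so that limits hold across service instances.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowRateLimiter:
    """Allows at most max_requests per user within any window_seconds span."""

    def __init__(self, max_requests: int = 100, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # user_id -> timestamps of accepted requests

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[user_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


class LoginLockout:
    """Locks an account after max_failures consecutive failed logins."""

    def __init__(self, max_failures: int = 5):
        self.max_failures = max_failures
        self._failures = {}

    def record_failure(self, account: str) -> None:
        self._failures[account] = self._failures.get(account, 0) + 1

    def record_success(self, account: str) -> None:
        self._failures.pop(account, None)  # a successful login resets the count

    def is_locked(self, account: str) -> bool:
        return self._failures.get(account, 0) >= self.max_failures
```

The lockout here never expires on its own; whether to unlock after a delay or require an admin or email reset is a policy decision this plan does not yet state.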

AI Engine

Vertex AI Smart Question & Eligibility Engine

4 weeks

Deliverables

Pre-flight eligibility engine (5-question screen, per agency)
SSA Sequential Evaluation Process logic (Steps 1–5, SGA/Blue Book/Vocational)
VA Presumptive logic (PACT Act cross-reference, priority group calculator)
Conversational intake engine: branching Q&A with context memory per session
Consistency checker: cross-validates answers for contradictions before form generation
Prompt engineering for each form type — tested and version-controlled
Vertex AI integration layer with PII tokenization before every API call

Security Controls

PII tokenization: SSN, DOB, name stripped and replaced with tokens before Vertex AI calls
AI response validation: output sanitized before storing or displaying
Prompt injection detection layer on all user-facing inputs to AI
Every Vertex AI call logged with case ID, token count, prompt hash
AI outputs never auto-applied to form fields without validation gate
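The logging control above (case ID, token count, prompt hash) can be sketched as a small record builder that stores a SHA-256 digest of the prompt instead of the prompt itself. `build_ai_call_record` and the field names are hypothetical; the model name is passed in rather than assumed.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_ai_call_record(case_id: str, prompt: str,
                         token_count: int, model: str) -> dict:
    """Build an audit record for one AI call without storing the prompt text.

    The hash lets an auditor confirm which prompt was sent (given the
    original) while keeping the prompt content out of the log stream.
    """
    return {
        "case_id": case_id,
        "model": model,
        "token_count": token_count,
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }


def to_log_line(record: dict) -> str:
    """Serialize with sorted keys so log lines are stable and easy to diff."""
    return json.dumps(record, sort_keys=True)
```

Hashing the already tokenized prompt, rather than the raw one, keeps this control consistent with the rule that no raw PII reaches the AI layer.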

Document Engine

Secure Document Upload, OCR & Validation

3 weeks

Deliverables

Secure signed URL upload flow (files never pass through application server)
Cloud Vision OCR pipeline for all accepted document types
Document type recognition (ID, birth certificate, DD-214, medical record, etc.)
Quality scoring: blur detection, completeness check, expiry date extraction
Conversational re-upload guidance ("seal appears cut off" not "error 422")
Dynamic checklist: required documents per case type + intake answers
Document-to-intake data cross-reference (OCR vs. stated info)

Security Controls

Malware scanning on every upload through a dedicated scanning step, with findings reported to Cloud Security Command Center
File type validation at byte-level (not just extension)
Original filenames sanitized before storage (prevent path traversal)
Documents stored in isolated bucket — no public access, signed URLs only
Max file size enforced at CDN layer (not just application)
OCR data encrypted in transit and at rest, purged after form generation
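Three of these controls (byte-level type validation, the 10 MB limit, and filename sanitization) can be sketched together. The signatures below are the well-known leading bytes for PDF, PNG, and JPEG files; the helper names are hypothetical, and a real pipeline would add the malware scan and deeper content checks.

```python
import posixpath
import re

MAX_BYTES = 10 * 1024 * 1024  # 10 MB limit for uploaded documents

# Leading "magic" bytes for the accepted formats.
SIGNATURES = {
    "pdf": b"%PDF-",
    "png": b"\x89PNG\r\n\x1a\n",
    "jpg": b"\xff\xd8\xff",
}


def sniff_type(data: bytes):
    """Identify the file type from its first bytes, ignoring the extension."""
    for kind, signature in SIGNATURES.items():
        if data.startswith(signature):
            return kind
    return None


def sanitize_stem(name: str) -> str:
    """Drop directory parts and unsafe characters from a client filename."""
    base = posixpath.basename(name.replace("\\", "/"))
    stem, _ = posixpath.splitext(base)
    stem = re.sub(r"[^A-Za-z0-9_-]+", "_", stem).strip("_")
    return stem[:80] or "upload"


def validate_upload(name: str, data: bytes):
    """Return (storage_name, kind) or raise ValueError for a rejected file."""
    if len(data) > MAX_BYTES:
        raise ValueError("file exceeds 10 MB limit")
    kind = sniff_type(data)
    if kind is None:
        raise ValueError("unsupported file type")
    # The stored extension comes from the detected type, not the client's claim.
    return f"{sanitize_stem(name)}.{kind}", kind
```

For example, `validate_upload("../../etc/passwd.pdf", b"%PDF-1.7 ...")` yields the storage name `passwd.pdf`, with the traversal components removed.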

Form Generation

Form Mapping, Translation & PDF Generation

3 weeks

Deliverables

Field mapping engine: validated intake answers → government form fields
Cloud Translation API integration for Spanish narrative → English form output
Human-review flag on all translated medical/legal narrative statements
PDF generation from field-mapped data (fillable PDF or server-side render)
Form versioning: system tracks which version of a government form was used
Jurisdiction-specific logic: agency-by-agency field requirement enforcement

Security Controls

Generated PDFs watermarked as "DRAFT — Not for Submission" until approved
Form data encrypted in transit; draft PDF stored encrypted
Field mapping logic is deterministic and auditable — no AI in final mapping step
Translation confidence scores logged; low-confidence outputs flagged for review

Review Dashboard

Human Reviewer Interface & Case Management

2 weeks

Deliverables

Reviewer queue: assigned cases, priority sorting, SLA indicators
Diff Dashboard: OCR vs. stated data discrepancies highlighted in yellow
In-platform messaging: reviewer can request clarification from applicant
Approve / Request Changes / Escalate actions with mandatory comment
Full case history timeline: every action, AI decision, and edit visible
Admin dashboard: case volume, average review time, backlog metrics

Security Controls

Reviewers see only assigned cases — enforced at API and DB layer
Every reviewer action logged with timestamp and user ID
Escalation workflow requires Super Admin acknowledgment
Session-level access logging: when reviewer views a case file, it is recorded

Security Testing

Penetration Testing, Hardening & Compliance Review

3 weeks

Deliverables

External penetration test on all public-facing endpoints
Internal network penetration test on private services
OWASP Top 10 and OWASP API Security Top 10 formal checklist
SAST (static analysis) and DAST (dynamic analysis) run on full codebase
Dependency vulnerability scan (all npm/pip packages)
Cloud Security Command Center findings reviewed and remediated
IAM permissions audit — principle of least privilege enforced
Compliance gap analysis against SOC2 Type II trust service criteria

Key Test Areas

Auth bypass and privilege escalation attempts
Prompt injection attacks on all AI input surfaces
Document upload abuse (malware, SSRF via uploaded files)
Data exfiltration via API (BOLA/IDOR testing)
PII exposure in logs, error messages, and API responses
Session hijacking and CSRF scenarios
Encryption-at-rest verification on all storage resources

Launch

UAT, Controlled Launch & Monitoring

2 weeks

Deliverables

User acceptance testing with 5–10 real users (non-sensitive test cases)
End-to-end smoke test: all 12 journey steps validated
Load testing: 10× expected peak case volume sustained for 30 minutes
Disaster recovery test: backup restoration verified
Runbook for incident response and data breach protocol
Controlled launch: invite-only, max 50 cases for first 2 weeks
Real-time monitoring dashboard (Uptime, error rate, AI cost per case)

Go-Live Security Gates

All critical and high severity pen test findings remediated before go-live (medium/low tracked in backlog)
SAST/DAST high-severity issues resolved — findings documented honestly, not hidden
Data retention policy active and tested
Incident response plan reviewed by client
Cloud Security Command Center set to alert on critical findings in real time

📅

Total estimated timeline: 25–35 weeks from Discovery Lock to controlled launch, depending on client feedback cycle times and pen test remediation complexity. Phase 4 (expansion to more forms and jurisdictions) begins after 60 days of stable controlled launch with no critical security findings.
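The stated range can be cross-checked against the individual milestone durations. Summed strictly in sequence, the ten milestones come to 27–29 weeks, which sits inside the 25–35 week estimate; the wider band leaves room for overlapping work at the low end and feedback or remediation delays at the high end.

```python
# (name, minimum weeks, maximum weeks), taken from the milestone list in this plan.
MILESTONES = [
    ("Discovery Lock & Client Alignment", 2, 2),
    ("Design System & Core UI", 3, 4),
    ("Secure GCP Architecture & Data Model", 2, 3),
    ("Authentication, Authorization & Case Management API", 3, 3),
    ("Vertex AI Smart Question & Eligibility Engine", 4, 4),
    ("Secure Document Upload, OCR & Validation", 3, 3),
    ("Form Mapping, Translation & PDF Generation", 3, 3),
    ("Human Reviewer Interface & Case Management", 2, 2),
    ("Penetration Testing, Hardening & Compliance Review", 3, 3),
    ("UAT, Controlled Launch & Monitoring", 2, 2),
]

weeks_min = sum(low for _, low, _ in MILESTONES)
weeks_max = sum(high for _, _, high in MILESTONES)

print(f"{len(MILESTONES)} milestones, {weeks_min} to {weeks_max} weeks if run in sequence")
```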

Non-Negotiable · Embedded in Every Milestone

Security Framework

These are the technical controls we are building into the platform. They are designed to pass OWASP Top 10 and API Security Top 10 pen testing. We are not claiming certifications — we are implementing verifiable, testable controls. Where a control is a target rather than confirmed, it is marked as such.

🔐

Identity & Access

MFA enforced for all users
RBAC at API + database layer
Zero standing admin access (JIT)
Service account least privilege
Session timeout: 15 min idle
Concurrent session limits

🛡

Network Security

VPC Service Controls perimeter
Cloud Armor WAF (OWASP rules)
No public IPs on app servers
Private Service Connect for DB
TLS 1.3 minimum in transit
DDoS protection active

🔒

Data Encryption

AES-256 at rest (CMEK/Cloud KMS)
TLS 1.3 in transit
PII tokenized before AI calls
Encrypted backups with separate keys
Key rotation policy: 90 days
No sensitive data in logs

📋

Audit & Logging

Immutable audit log (every state change)
Data Access logs on all GCP services
Every AI invocation logged
Reviewer document access logged
Log retention: 7 years minimum
SIEM integration (Chronicle)

🧪

Application Security

SAST on every pull request
Dependency scanning (Snyk/Dependabot)
Prompt injection detection layer
Input validation at every layer
Content Security Policy enforced
CORS policy locked to known origins

🚨

Incident Response

Runbook for data breach scenarios
72-hour breach notification plan
Automated alerting on anomalous access
User data deletion workflow tested
DR/backup restoration tested monthly
Annual pen test + quarterly scans

Technical Architecture · Google Cloud Platform

Google Cloud Infrastructure Stack

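Every component that stores or processes case data runs within Google's infrastructure. No third-party AI vendors. No case data leaving the GCP project boundary. Single audit scope, single compliance framework, single vendor accountability for that layer; the notification and e-signature integrations named elsewhere in this plan (SendGrid, Twilio, DocuSign) sit outside it.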

Service → Purpose Mapping

All services within single GCP project / VPC

Vertex AI (Gemini) · AI / ML Layer

Powers all conversational intake, eligibility logic, consistency checking, and form field mapping. Called exclusively via private VPC endpoint. PII tokenized before every call.

Cloud Vision API · Document OCR

Extracts text from uploaded documents (IDs, medical records, DD-214). Identifies document type, quality issues, and expiry dates. Results encrypted before storage.

Cloud Translation · Multilingual

Handles Spanish ↔ English translation for intake narratives and form field population. All translated medical/legal content flagged for human review.

Cloud Run · Application Layer

Serverless containerized backend (Node.js / Python). Auto-scales. Connected to VPC via VPC connector. No public IP. Accessed via load balancer only.

Cloud SQL · Relational Database

PostgreSQL for case data, user records, audit events, consent records. Private Service Connect — no public IP ever. Encrypted with CMEK. Daily automated backups.

Cloud Storage · Document Vault

Encrypted document storage (uploaded files + generated PDFs). CMEK encryption, versioning enabled, lifecycle policies for retention/deletion. Signed URL access only — no public bucket.

Cloud KMS · Key Management

Customer-managed encryption keys for all storage. Key rotation every 90 days. Separate key rings for documents, database, and audit logs. Access to keys logged via Cloud Audit Logs.

Cloud Identity Platform · Authentication

User authentication, MFA (TOTP + SMS), session management, JWT issuance. Replaces the need for a custom auth system. Application roles for RBAC are carried as custom claims on the issued tokens and enforced at the API layer.

Cloud Armor · WAF / DDoS

Web Application Firewall with OWASP Top 10 managed rules. DDoS protection at network layer. Rate limiting enforced at CDN edge before requests reach application.

Secret Manager · Secrets

All API keys, database passwords, and credentials stored here. Zero hardcoded secrets in source code or environment variables. Rotation alerts configured.

Cloud Logging / Chronicle · SIEM

Centralized security logging. Data Access logs, Admin Activity logs, application logs. Chronicle SIEM for threat detection and incident investigation. 7-year log retention.

Cloud Build + Artifact Registry · CI/CD

Automated build, test, SAST, and deployment pipeline. Binary Authorization enforces that only signed, verified container images can be deployed to production. No manual deployments.

System Architecture · Validated Flow Diagram

Structure & Process Flow

The canonical system flow — from client arrival to case closed. Every node maps to a built component. The Security & Audit Layer is not a final step; it wraps the entire platform and is active throughout every stage.

GovEase DCW — Complete Case Flow

Client-facing journey + internal processing + security perimeter

Diagram legend: Start / End · Process · Decision · System Layer

Portal Status Checking Removed from scope. No automated login to government portals. No credential storage.

Human Review — Mandatory Every case passes through Staff Verification. No auto-approval path exists in the system.

Agency Submission Platform generates a complete packet only. The applicant or caseworker submits to the agency manually.

Technical Choices · Rationale Included

Preferred Tech Stack

Every choice is made with security, auditability, and long-term maintainability in mind. Where alternatives exist, they are noted. All AI/ML infrastructure runs on Google Cloud — no data leaves the GCP project boundary.

Frontend

Client Layer

Next.js 14 Primary

React framework with App Router. SSR for SEO + security (no sensitive state in client JS). TypeScript enforced.

TypeScript

Strict type safety across frontend and API contracts. Reduces entire categories of runtime bugs.

Tailwind CSS

Utility-first, no CSS-in-JS runtime cost. Design tokens via CSS variables for brand consistency + dark mode.

i18next

Industry standard for EN/ES bilingual UI. Namespace-based — legal/medical strings managed separately with version control.

React Hook Form

Performant form validation. Uncontrolled inputs reduce re-renders on long intake forms.

Zod

Schema validation shared between frontend and backend. Single source of truth for input rules.

Backend

API Layer

Node.js / NestJS Primary

Structured, opinionated framework with built-in DI, guards, interceptors. Enforces security patterns at architecture level.

Python / FastAPI Alt (AI Services)

Python used for AI microservices (Vertex AI SDK, OCR processing). FastAPI for async, high-performance endpoints.

REST + OpenAPI 3.0

Fully documented API spec. Enables automated contract testing and security scanning against declared schema.

Prisma ORM

Type-safe database access. Queries are parameterized by default, which removes most SQL injection risk; raw-query escape hatches still need review.

BullMQ (Redis)

Async job queue for OCR processing, form generation, and notifications — keeps API response times fast.

Helmet.js + CORS

HTTP security headers enforced on all responses. CORS restricted to verified origin domains only.

Data & Storage

Persistence Layer

Cloud SQL (PostgreSQL) GCP

Primary relational database. Row-level security enforced. No public IP. CMEK encryption. Automated daily backups.

Cloud Storage GCP

Document vault. CMEK-encrypted buckets. Signed URLs for time-limited access. Versioning + lifecycle policies.

Firestore GCP

Real-time session state + case status updates pushed to frontend. Document-model fits case object structure.

Cloud Memorystore (Redis)

Session cache + BullMQ job queue backend. Private VPC only. No public endpoint.

Cloud KMS GCP

Customer-managed encryption keys for all data stores. Key ring separation by data sensitivity tier. 90-day rotation.

AI / ML Services

Intelligence Layer

Vertex AI — Gemini Pro GCP

Core LLM for intake logic, eligibility scoring, consistency checking, and form field narrative mapping. Private VPC endpoint. PII tokenized before every call.

Cloud Vision API GCP

OCR for all uploaded documents. Document type classification. Quality and completeness scoring. Expiry date extraction.

Cloud Translation API GCP

ES→EN translation for intake narratives and form outputs. Confidence scores logged. Low-confidence outputs flagged for human review.

Vertex AI Explainability

Decision attribution for every AI output used in a case. Required for cybersecurity audit trail and regulatory review.

LangChain (Python)

Orchestration layer for multi-step AI workflows (intake → validation → mapping). Prompt versioning and A/B testing are managed at this layer.

Security Infrastructure

Protection Layer

Cloud Identity Platform GCP

Auth, MFA (TOTP + SMS), JWT issuance, session management. Replaces custom auth — reduces attack surface.

Cloud Armor GCP

WAF (OWASP Top 10 managed rules), DDoS protection, rate limiting at CDN edge before requests hit the app.

VPC Service Controls GCP

Security perimeter around all GCP resources. Prevents data exfiltration even if credentials are compromised.

Secret Manager GCP

All credentials and API keys stored with audit log on every access. Zero hardcoded secrets anywhere in codebase.

Chronicle SIEM GCP

Centralized security event log. Real-time threat detection, anomalous access alerting, 7-year log retention.

Snyk + OWASP ZAP

SAST/SCA on every PR (Snyk). Automated DAST in staging (OWASP ZAP). Critical findings block deployment.
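The "critical findings block deployment" rule amounts to a small gate in the pipeline. The sketch below is a minimal illustration under assumptions: the `Finding` shape, the severity labels, and `evaluateDeployGate` are hypothetical, and the parsing of real Snyk or OWASP ZAP report formats is deliberately left out.

```typescript
// Hypothetical deploy gate: any critical finding from either scanner blocks the release.
// The Finding shape is an assumption, not the scanners' actual report schema.

type Severity = "low" | "medium" | "high" | "critical";

interface Finding {
  source: "snyk" | "zap";
  id: string;
  severity: Severity;
}

interface GateDecision {
  allowed: boolean;
  blocking: Finding[]; // findings that caused the block, for the pipeline log
}

function evaluateDeployGate(findings: Finding[]): GateDecision {
  const blocking = findings.filter((finding) => finding.severity === "critical");
  return { allowed: blocking.length === 0, blocking };
}
```

Returning the blocking findings alongside the decision lets the CI job print exactly what must be fixed instead of a bare pass or fail.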

DevOps & Infrastructure

Platform Layer

Terraform IaC

All GCP infrastructure defined as code. Reproducible, version-controlled, auditable. No manual console changes in production.

Cloud Run GCP

Serverless containers. Auto-scales to zero. VPC-connected. No VM management overhead. Binary Authorization enforced.

Cloud Build + Artifact Registry GCP

Automated CI/CD pipeline. SAST integrated. Signed container images only — Binary Authorization blocks unsigned deployments.

GitHub + Branch Protection

Source control. Required PR reviews. Secret scanning on every push. Main branch requires 2 approvals + passing CI.

Cloud Monitoring + Uptime

SLA monitoring, error rate alerting, AI cost-per-case dashboards, reviewer queue depth tracking.

DocuSign / HelloSign API TBD

E-signature and digital acknowledgment capture. Decision pending compliance review — Google Workspace eSign is an alternative.

📌

PDF Generation: PDFLib (Node.js) for server-side fillable PDF generation from mapped form data. Puppeteer as fallback for complex layouts. Generated PDFs are watermarked "DRAFT" until human review approval is recorded. All generation happens server-side — no client-side PDF construction that could be tampered with.
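The "DRAFT until approved" rule can be expressed as a small server-side check. The sketch below is an assumption about shape, not the actual generator: `FormCase`, `ReviewApproval`, and `watermarkFor` are hypothetical names, and the PDF library call that would stamp the watermark is omitted.

```typescript
// Hypothetical server-side rule: a generated PDF carries a "DRAFT" watermark
// unless a human review approval has been recorded for the case.

interface ReviewApproval {
  reviewerId: string;
  approvedAt: string; // ISO 8601 timestamp of the recorded approval
}

interface FormCase {
  caseId: string;
  approval: ReviewApproval | null; // null until a reviewer signs off
}

function watermarkFor(formCase: FormCase): "DRAFT" | null {
  const approval = formCase.approval;
  // Treat an approval without a reviewer identity as not recorded.
  if (approval === null || approval.reviewerId.trim() === "") {
    return "DRAFT";
  }
  return null;
}
```

Because the decision depends only on server-held case state, the client has no input that could remove the watermark early.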

Regulatory & Standards Compliance

Compliance Framework

GovEase DCW handles sensitive personal data — disability records, military service history, and Social Security numbers. The table below maps what we will implement as controls, what we inherit from GCP's infrastructure, and what we are aligned to as a framework. We are not claiming certifications we do not yet hold.

Cyber Test Baseline

OWASP Top 10

Open Web Application Security Project — Web + API

This is what a pen tester will run against us — it is the primary test target
Injection (SQL, prompt): parameterized queries + AI input sanitization layer
Broken Access Control: RBAC enforced at API, DB, and UI layers
Cryptographic Failures: AES-256 at rest, TLS 1.3 in transit, no weak ciphers
OWASP API Security Top 10 applied separately to every backend endpoint
DAST scan (OWASP ZAP) in staging before any production deployment
We make no certification claim — we implement the controls
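For the Broken Access Control item, an API-layer check might look like the following deny-by-default sketch. The role names, action strings, and the `canReadCase` helper are assumptions for illustration; the plan also calls for equivalent enforcement at the DB and UI layers, which this does not show.

```typescript
// Hypothetical deny-by-default RBAC check for the API layer.
// Roles and actions are illustrative, not the project's final permission model.

type Role = "client" | "reviewer" | "admin";
type Action = "case:read_own" | "case:read_any" | "case:approve" | "user:manage";

const PERMISSIONS: Record<Role, Action[]> = {
  client: ["case:read_own"],
  reviewer: ["case:read_own", "case:read_any", "case:approve"],
  admin: ["case:read_any", "user:manage"],
};

interface User {
  id: string;
  role: Role;
}

interface CaseRecord {
  id: string;
  ownerId: string;
}

// Anything not listed for the role is denied.
function isAllowed(role: Role, action: Action): boolean {
  return PERMISSIONS[role].indexOf(action) !== -1;
}

// Object-level check: role permission alone is not enough for "own" access,
// the case must also belong to the requesting user.
function canReadCase(user: User, record: CaseRecord): boolean {
  if (isAllowed(user.role, "case:read_any")) {
    return true;
  }
  return isAllowed(user.role, "case:read_own") && record.ownerId === user.id;
}
```

The object-level ownership check is the part a pen tester typically probes first, by requesting another user's case ID with a valid session.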

Controls Implemented

NIST CSF 2.0

NIST Cybersecurity Framework — used as a reference structure

Used as an internal design reference, not claimed as a certification
Identify: asset inventory, data classification, threat model (STRIDE)
Protect: access controls, encryption, secure CI/CD pipeline
Detect: Cloud Logging, Chronicle SIEM, anomaly alerts
Respond: incident response runbook written before go-live
Recover: backup restoration tested, DR procedure documented

GCP Infrastructure

HIPAA-Eligible Architecture

Health Insurance Portability and Accountability Act

We are not claiming HIPAA certification — the platform is not a covered entity
We run on GCP services that are HIPAA-eligible (Google BAA available)
PHI-handling principles applied: minimum necessary, encryption, audit logs
No PHI sent to any service outside the GCP project boundary
If the client's use case requires formal HIPAA compliance, a BAA with Google must be signed and a formal risk assessment conducted — not included in MVP scope

GCP Infrastructure

SOC 2 — Aligned Controls

Service Organization Control 2

We do not hold a SOC 2 certificate at MVP — this takes 6–12 months to audit
GCP is SOC 2 Type II certified. Our infrastructure layer runs on those audited controls, but the certification is Google's, not ours
Platform-level controls are designed to be SOC 2 aligned (Security + Availability criteria)
SOC 2 audit is a Year 2 target once the platform has operational history
Security controls documented now so evidence collection is ready when audit begins

By Design (MVP)

GDPR Principles

General Data Protection Regulation — design principles applied

Data minimisation: only fields required per form type are collected
Consent: explicit opt-in captured, timestamped, version-controlled
Right to erasure: deletion workflow built and tested before go-live
We are not making a formal GDPR certification claim — we apply the principles
If EU users are in scope, a formal DPIA and legal basis review is required — this is a client decision, not included in build cost
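The consent principle above (explicit opt-in, timestamped, version-controlled) suggests an append-only consent history. The sketch below is one possible shape under assumptions: `ConsentRecord`, `recordConsent`, and `hasValidConsent` are hypothetical names, and the history is assumed to be stored in chronological order.

```typescript
// Hypothetical append-only consent log. Each entry captures who consented,
// to which policy version, and when. Nothing is updated in place.

interface ConsentRecord {
  userId: string;
  policyVersion: string;
  optedIn: boolean;
  recordedAt: string; // ISO 8601 UTC timestamp
}

function recordConsent(userId: string, policyVersion: string, optedIn: boolean, now: Date): ConsentRecord {
  return { userId, policyVersion, optedIn, recordedAt: now.toISOString() };
}

// Consent is valid only if the user's most recent entry is an opt-in
// for the policy version currently in force.
function hasValidConsent(history: ConsentRecord[], userId: string, currentPolicyVersion: string): boolean {
  for (let i = history.length - 1; i >= 0; i--) {
    const record = history[i];
    if (record.userId !== userId) {
      continue;
    }
    return record.optedIn && record.policyVersion === currentPolicyVersion;
  }
  return false; // no record means no consent
}
```

With this shape, publishing a new policy version automatically invalidates earlier consent, so the UI can prompt the user again before intake continues.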

By Design (MVP)

CCPA / CPRA Principles

California Consumer Privacy Act / Privacy Rights Act

Right to know and right to delete workflows built into case management
No user data sold or shared with third parties for advertising purposes
Privacy policy covers all collected data categories
Formal legal review of the privacy policy is the client's responsibility before launch

Design Target

WCAG 2.1 Level AA

Web Content Accessibility Guidelines

AA is the design target — not a certified claim at MVP launch
Screen reader compatibility (ARIA labels, semantic HTML structure)
Color contrast ratio ≥ 4.5:1 for body text enforced in design system
Keyboard navigation operable without mouse on all critical flows
Full independent WCAG audit recommended before any public-sector rollout
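The 4.5:1 body-text threshold can be checked mechanically when design tokens are defined. The sketch below follows the WCAG 2.1 definitions of relative luminance and contrast ratio; the function names and the hex-string input format are our own assumptions, and it handles only opaque 6-digit hex colours.

```typescript
// Contrast check based on the WCAG 2.1 relative luminance and contrast ratio definitions.
// Function names are illustrative; only opaque 6-digit hex colours are supported.

function channelToLinear(value8bit: number): number {
  const c = value8bit / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

function relativeLuminance(hex: string): number {
  const match = /^#?([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i.exec(hex);
  if (match === null) {
    throw new Error("Expected a 6-digit hex colour, got: " + hex);
  }
  const r = channelToLinear(parseInt(match[1], 16));
  const g = channelToLinear(parseInt(match[2], 16));
  const b = channelToLinear(parseInt(match[3], 16));
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

function contrastRatio(foreground: string, background: string): number {
  const l1 = relativeLuminance(foreground);
  const l2 = relativeLuminance(background);
  const lighter = Math.max(l1, l2);
  const darker = Math.min(l1, l2);
  return (lighter + 0.05) / (darker + 0.05);
}

// AA threshold for normal-size body text.
function meetsAaBodyText(foreground: string, background: string): boolean {
  return contrastRatio(foreground, background) >= 4.5;
}
```

A check like this could run in CI against the design system's colour tokens, so a failing text and background pair is caught before it reaches a screen. It does not replace the independent audit recommended above.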

GCP-Inherited Only

ISO 27001 / 27017

Information Security Management (Cloud-Specific)

GCP holds ISO 27001 and 27017 certifications — this covers the infrastructure we run on
These are Google's certifications, not ours — we do not inherit the certificate
We can reference GCP's ISO attestation in security questionnaires as infrastructure evidence
Platform-level ISO 27001 certification is not in MVP or near-term scope

Not In Scope

FedRAMP

Federal Risk and Authorization Management Program

Not applicable to the current product — GovEase is a private platform, not a federal system
GCP services used are FedRAMP-authorized, which is the relevant infrastructure fact
If future contracts require FedRAMP authorization for the platform itself, this is a multi-year, multi-hundred-thousand-dollar effort — not a Phase 1–4 commitment
Removed from commitments entirely

What We Actually Claim vs. What We Inherit

⚠

Key Principle: A cyber tester will test your controls, not your paperwork. Claiming a certification you don't hold is a documentation risk. We claim only what is technically verifiable in our deployed system. Pen test findings are addressed openly — not hidden or disputed.

Area	What We Claim (Testable)	What We Do NOT Claim
Encryption	AES-256 at rest (CMEK), TLS 1.3 in transit — verifiable in GCP config	No encryption certification — it's a control, not a certificate
Authentication	MFA enforced, JWT validation on all endpoints, brute-force lockout — pen-testable	Not claiming zero auth vulnerabilities — that's what pen testing determines
HIPAA	GCP HIPAA-eligible infrastructure; PHI principles applied; no data outside GCP boundary	Not HIPAA certified — platform is not a covered entity at MVP. BAA with Google requires separate agreement
SOC 2	Controls aligned to SOC 2 Security criteria; GCP infrastructure is SOC 2 Type II certified	No platform-level SOC 2 certificate — audit not conducted. Target Year 2
ISO 27001	GCP holds ISO 27001/27017 — relevant as infrastructure evidence in questionnaires	Not our certificate — Google's. Cannot be cited as a platform-level claim
GDPR / CCPA	Design principles applied: consent, deletion workflow, data minimisation built in	Not certified compliant — formal legal review is client's responsibility pre-launch
OWASP Top 10	All controls implemented and DAST-tested — this is the primary pen test target	No blanket "OWASP compliant" claim — findings from testing are tracked and remediated
FedRAMP	GCP services are FedRAMP-authorized (Google's authorization for their infrastructure)	No FedRAMP authorization for the platform — removed from all commitments
