Module 1 — Site & Investigator Intelligence
sites— Clinical trial sites sourced from ClinicalTrials.gov| Column | Type | Description |
|---|---|---|
| id | UUID | PK, auto-generated |
| org_id | VARCHAR(50) | Tenant isolation key, indexed |
| external_id | VARCHAR(100) | ClinicalTrials.gov NCT number, indexed |
| name | VARCHAR(500) | Site/facility name |
| institution | VARCHAR(500) | Parent institution |
| city | VARCHAR(200) | City |
| country | VARCHAR(100) | Country, indexed |
| region | VARCHAR(100) | Geographic region |
| latitude | FLOAT | GPS latitude |
| longitude | FLOAT | GPS longitude |
| site_type | VARCHAR(50) | hospital, clinic, research_center |
| capacity | INTEGER | Estimated patient capacity |
| specialties | TEXT | JSON array of therapeutic areas |
| total_trials | INTEGER | Cumulative trial count |
| active_trials | INTEGER | Currently active trials |
| created_at | TIMESTAMP | Record creation time |
| updated_at | TIMESTAMP | Last update time |
investigators— Investigator profiles enriched from PubMed| Column | Type | Description |
|---|---|---|
| id | UUID | PK |
| org_id | VARCHAR(50) | Tenant key |
| external_id | VARCHAR(100) | External identifier |
| name | VARCHAR(300) | Full name |
| affiliation | VARCHAR(500) | Institution affiliation |
| country | VARCHAR(100) | Country, indexed |
| specialty | VARCHAR(200) | Primary specialty |
| h_index | INTEGER | H-index from PubMed |
| publication_count | INTEGER | Total publications |
| trial_count | INTEGER | Total trials participated |
| active_trials | INTEGER | Currently active |
| VARCHAR(255) | Contact email | |
| bio_summary | TEXT | AI-generated profile summary |
| bio_summary_encrypted | BYTEA | pgcrypto encrypted copy |
site_scores— Computed scores with 5-dimension breakdown| Column | Type | Description |
|---|---|---|
| id | UUID | PK |
| org_id | VARCHAR(50) | Tenant key |
| site_id | UUID | FK to sites.id, indexed |
| project_id | UUID | FK to projects.id (optional) |
| score_total | FLOAT | Composite 0-100 |
| dim_recruitment | FLOAT | Recruitment dimension 0-1 |
| dim_experience | FLOAT | Experience dimension 0-1 |
| dim_publications | FLOAT | Publications dimension 0-1 |
| dim_infrastructure | FLOAT | Infrastructure dimension 0-1 |
| dim_regulatory | FLOAT | Regulatory dimension 0-1 |
| explanation | TEXT | LLM-generated explanation |
audit_log— Immutable audit trail (21 CFR Part 11)| Column | Type | Description |
|---|---|---|
| id | UUID | PK |
| org_id | VARCHAR(50) | Tenant key |
| user_id | UUID | FK to users.id |
| action | VARCHAR(20) | HTTP method |
| entity_type | VARCHAR(500) | URL path |
| entity_id | VARCHAR(100) | Affected entity |
| details | JSON | Action details |
| ip_address | VARCHAR(64) | SHA-256 hashed IP |
| record_hash | VARCHAR(64) | SHA-256 of record |
| prev_hash | VARCHAR(64) | Previous record hash |
| created_at | TIMESTAMP | Immutable timestamp |
| Method | Endpoint | Auth | Description |
|---|---|---|---|
| POST | /api/v1/sites/search | JWT | Search sites with filters |
| POST | /api/v1/sites/search/cursor | JWT | Cursor-based pagination |
| GET | /api/v1/sites/{id} | JWT | Site detail with score |
| GET | /api/v1/sites/{id}/score | JWT | Score breakdown |
| GET | /api/v1/sites/{id}/score/explain | JWT | Score explainability |
| POST | /api/v1/sites/{id}/score/customize | JWT | Custom weight scoring |
| GET | /api/v1/sites/therapeutic-areas | JWT | Filter options |
| GET | /api/v1/sites/phases | JWT | Phase filter options |
| GET | /api/v1/investigators | JWT | Search investigators |
| GET | /api/v1/investigators/{id} | JWT | Investigator detail |
| POST | /api/v1/investigators/compare | JWT | Compare 2-5 investigators |
| POST | /api/v1/site-agent | JWT | Conversational AI query |
| POST | /api/v1/predictions/recruitment | JWT | Recruitment prediction |
| POST | /api/v1/exports/sites/pdf | JWT | PDF export |
| POST | /api/v1/exports/sites/excel | JWT | Excel export |
| POST | /api/v1/ingestion/trigger | Admin | Trigger data ingestion |
| GET | /api/v1/ingestion/status | Admin | Ingestion status |
Region: europe-west6 (Zurich, Switzerland) — GDPR/nLPD compliance
Backend: Cloud Run (auto-scaling, min 0 / max 100 instances)
Database: Cloud SQL PostgreSQL 16 (HA, 34.65.122.178, private IP)
Frontend: Cloud Run (Next.js SSR, Docker container)
Secrets: GCP Secret Manager (DATABASE_URL, JWT_SECRET_KEY, etc.)
CDN: Cloudflare (310 PoPs, TLS 1.3 termination)
CI/CD: GitHub Actions → Cloud Build → Cloud Run deploy
Monitoring: Cloud Run metrics, Cloud Logging, custom health endpoint
Source files: All paths relative to src/backend/app/
Database version: Migration 014 (current)