Security Practices

Security implementation details and best practices for Noumaris.

Overview

This document covers practical security implementation. For HIPAA compliance requirements, see HIPAA Documentation.

Authentication & Authorization

JWT Token Validation

Implementation:

All endpoints use Depends(get_current_user) for authentication
JWT validated against Keycloak public key (RS256 algorithm)
Token includes user ID, roles, expiration

python

# backend/src/noumaris_backend/api/auth.py
def get_current_user(token: str = Depends(oauth2_scheme)):
    try:
        payload = jwt.decode(
            token,
            keycloak_public_key,
            algorithms=["RS256"],
            audience="account"
        )
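        user_id = payload.get("sub")
        if user_id is None:
            raise HTTPException(status_code=401, detail="Invalid token")
        # Fetch user from database; get_user_by_id is a placeholder for the actual lookup
        user = get_user_by_id(user_id)
        return user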
    except JWTError:
        raise HTTPException(status_code=401, detail="Invalid token")

Role-Based Access Control (RBAC)

4 Roles:

superadmin: System-wide administration
institution_admin: Manages institution users and permissions
resident: Institution-controlled feature access
user: Standard physician role

Endpoint Protection:

python

# Superadmin only
@router.post("/admin/institutions")
async def create_institution(
    current_user: User = Depends(get_current_user)
):
    if "superadmin" not in current_user.roles:
        raise HTTPException(status_code=403, detail="Insufficient permissions")
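
The inline role check can be factored into a small helper so that each endpoint states its accepted roles in one place. The following is a minimal sketch, assuming roles are exposed as a collection of strings on the user object. The names `require_any_role` and `InsufficientPermissions` are hypothetical and not part of the Noumaris codebase; in an endpoint, the exception would be translated into the 403 response.

```python
from typing import Iterable

# Role names taken from the role list in this section.
KNOWN_ROLES = {"superadmin", "institution_admin", "resident", "user"}


class InsufficientPermissions(Exception):
    """Raised when the caller holds none of the required roles (hypothetical)."""


def require_any_role(user_roles: Iterable[str], *required: str) -> None:
    """Raise unless the user holds at least one of the required roles."""
    unknown = set(required) - KNOWN_ROLES
    if unknown:
        # Catch typos in role names at the call site rather than silently denying.
        raise ValueError(f"Unknown role(s): {sorted(unknown)}")
    if not set(user_roles) & set(required):
        raise InsufficientPermissions("Insufficient permissions")


# Passes silently: the user holds one of the accepted roles.
require_any_role(["institution_admin"], "superadmin", "institution_admin")
```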

Session Management

Token Expiration:

Access token: 30 minutes
Refresh token: 7 days (Keycloak default)
Frontend shows a timeout warning 2 minutes before expiry

Auto-Logout:

Frontend AuthContext monitors token expiration
Redirects to login when token expires
LocalStorage cleared on logout
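
The expiry monitoring described above comes down to comparing the token's `exp` claim with the current time. The sketch below shows that timing logic in Python for consistency with the other examples; the real frontend code is JavaScript and may differ. It reads the claim without verifying the signature, which is acceptable only for UI timing and never for authorization. The function names are hypothetical.

```python
import base64
import json
import time
from typing import Optional

WARNING_WINDOW_SECONDS = 2 * 60  # warn 2 minutes before expiry


def read_exp_claim(token: str) -> int:
    """Return the `exp` claim of a JWT without verifying its signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url-encoded with padding stripped; restore it.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return int(json.loads(base64.urlsafe_b64decode(payload_b64))["exp"])


def session_state(token: str, now: Optional[float] = None) -> str:
    """Classify the session as 'active', 'warning', or 'expired'."""
    now = time.time() if now is None else now
    remaining = read_exp_claim(token) - now
    if remaining <= 0:
        return "expired"
    if remaining <= WARNING_WINDOW_SECONDS:
        return "warning"
    return "active"
```

A client would poll `session_state` on a timer, show the warning on `"warning"`, and clear stored tokens and redirect to login on `"expired"`.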

WebSocket Authentication

Special handling (no middleware):

python

# Token passed as query parameter:
#   ws://localhost:8000/transcribe?token=eyJhbGci...

# Inline validation in endpoint
@app.websocket("/transcribe")
async def transcribe(websocket: WebSocket, token: str):
    try:
        user = validate_jwt(token)
    except Exception:
        # Invalid or expired token: close with policy-violation code
        await websocket.close(code=1008)
        return
    # Check rate limit (3 concurrent connections)
    if not check_rate_limit(user.id):
        await websocket.close(code=1008)
        return
    # Authenticated and within limits: accept the connection
    await websocket.accept()

Why not middleware? - WebSocket middleware causes connection issues with FastAPI

Input Validation

Pydantic Models

All endpoints validate input via Pydantic:

python

from pydantic import BaseModel, EmailStr, validator

class CreateUserRequest(BaseModel):
    email: EmailStr  # Validates email format
    name: str
    role: str

    @validator('role')
    def validate_role(cls, v):
        allowed = ['superadmin', 'institution_admin', 'resident', 'user']
        if v not in allowed:
            raise ValueError(f'Role must be one of {allowed}')
        return v

Benefits:

Type safety at runtime
Automatic validation errors (422 Unprocessable Entity by default in FastAPI)
Reduces injection risk by rejecting malformed input before it reaches application logic

SQL Injection Prevention

✅ Safe - SQLAlchemy ORM with parameterized queries:

python

# Values are sent as bound parameters, never interpolated into the SQL text
user = session.query(User).filter_by(email=email).first()

❌ Unsafe - Raw SQL with string interpolation (never do this):

python

# Vulnerable to SQL injection
query = f"SELECT * FROM users WHERE email = '{email}'"
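
To see the difference concretely, the sketch below runs both styles against an in-memory SQLite database from the standard library. SQLite is only a stand-in for the production database, and the table and sample data are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT)")
conn.execute("INSERT INTO users VALUES ('alice@example.com')")

malicious = "nobody@example.com' OR '1'='1"

# Unsafe: the input becomes part of the SQL text, so the OR clause matches every row.
unsafe_rows = conn.execute(
    f"SELECT * FROM users WHERE email = '{malicious}'"
).fetchall()

# Safe: the input is bound as a parameter and compared as a plain string.
safe_rows = conn.execute(
    "SELECT * FROM users WHERE email = ?", (malicious,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))  # 1 0
```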

XSS Prevention

Frontend:

React automatically escapes HTML in JSX
Use dangerouslySetInnerHTML only for trusted content

jsx

// ✅ Safe - React escapes automatically
<div>{user.name}</div>

// ❌ Unsafe - bypasses escaping
<div dangerouslySetInnerHTML={{__html: user.name}} />

Backend:

Pydantic validates string inputs
No HTML stored in database (TipTap uses JSON format)

CSRF Protection

Not needed - API uses JWT in Authorization header (not cookies)

CSRF attacks rely on credentials the browser attaches automatically, such as cookies
JWTs are sent explicitly in headers, not automatically like cookies

Rate Limiting

API Rate Limits

SlowAPI implementation:

python

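from fastapi import Depends, Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter  # SlowAPI looks up the limiter on app.state
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)  # returns 429

@app.get("/documents")
@limiter.limit("10/minute")  # 10 requests per minute per IP
async def get_documents(request: Request, current_user: User = Depends(get_current_user)):
    pass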

Rate Limits by Endpoint:

Endpoint	Limit	Reason
`/health`	100/minute	Health checks
`/documents`	10/minute	Prevent data scraping
`/summarize_transcription`	5/minute	Expensive LLM calls
`/admin/*`	20/minute	Administrative operations
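
To make the meaning of a limit such as 10/minute concrete, here is a minimal sliding-window limiter using only the standard library. It is an illustration of per-key limiting, not SlowAPI's actual algorithm, and the class name is hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each key."""

    def __init__(self, limit: int, window: float = 60.0):
        self.limit = limit
        self.window = window
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


documents_limiter = SlidingWindowLimiter(limit=10)  # the /documents limit
```

Each call to `allow(client_ip)` returns `False` once the key has used its quota within the last 60 seconds, which is the point at which an API would answer with 429.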

WebSocket Rate Limiting

Custom limiter (3 concurrent connections per user):

python

# backend/src/noumaris_backend/api/websocket_auth.py
from typing import Dict

class WebSocketRateLimiter:
    def __init__(self, max_connections: int = 3):
        self.max_connections = max_connections
        self.active_connections: Dict[str, int] = {}

    def check_limit(self, user_id: str) -> bool:
        current = self.active_connections.get(user_id, 0)
        return current < self.max_connections

    def increment(self, user_id: str):
        self.active_connections[user_id] = self.active_connections.get(user_id, 0) + 1

    def decrement(self, user_id: str):
        if user_id in self.active_connections:
            self.active_connections[user_id] -= 1
            # Drop empty entries so the dict does not grow without bound
            if self.active_connections[user_id] <= 0:
                del self.active_connections[user_id]
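
A connection counter is only reliable if every increment is paired with a decrement, including when the handler raises. One way to guarantee that is a context manager. The sketch below uses a hypothetical `ConnectionSlots` class as a self-contained stand-in for the limiter; it is not the project's implementation.

```python
from contextlib import contextmanager
from typing import Dict, Iterator


class ConnectionSlots:
    """Minimal stand-in for the limiter: counts open connections per user."""

    def __init__(self, max_connections: int = 3):
        self.max_connections = max_connections
        self.active: Dict[str, int] = {}

    @contextmanager
    def acquire(self, user_id: str) -> Iterator[None]:
        if self.active.get(user_id, 0) >= self.max_connections:
            raise RuntimeError("Too many concurrent connections")
        self.active[user_id] = self.active.get(user_id, 0) + 1
        try:
            yield
        finally:
            # Always release the slot, even if the handler raised.
            self.active[user_id] -= 1
            if self.active[user_id] == 0:
                del self.active[user_id]


slots = ConnectionSlots(max_connections=3)
with slots.acquire("user-1"):
    pass  # handle the WebSocket session here
```

The check and the increment run without an intervening `await`, so within a single asyncio event loop they cannot interleave; a multi-process deployment would need a shared store instead of an in-memory dict.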

CORS Configuration

Backend (main.py):

python

from fastapi.middleware.cors import CORSMiddleware

app.add_middleware(
    CORSMiddleware,
    allow_origins=[
        "http://localhost:5173",  # Local frontend
        "https://noumaris.com",   # Production frontend
    ],
    allow_credentials=True,  # Allow cookies and auth headers
    allow_methods=["*"],     # All HTTP methods (GET, POST, PUT, DELETE, ...)
    allow_headers=["*"],     # Authorization, Content-Type, etc.
)

Security notes:

Never use allow_origins=["*"] in production
allow_credentials=True requires specific origins (not wildcard)
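
Both notes can be enforced at startup rather than left to review. The sketch below assumes the origin list comes from a comma-separated setting (for example a hypothetical `CORS_ALLOWED_ORIGINS` environment variable); the function name is also an assumption.

```python
from typing import List


def parse_allowed_origins(raw: str, allow_credentials: bool = True) -> List[str]:
    """Split a comma-separated origin list and reject unsafe combinations."""
    origins = [o.strip().rstrip("/") for o in raw.split(",") if o.strip()]
    if not origins:
        raise ValueError("At least one allowed origin is required")
    if allow_credentials and "*" in origins:
        raise ValueError("Wildcard origin cannot be combined with credentials")
    for origin in origins:
        if origin != "*" and not origin.startswith(("http://", "https://")):
            raise ValueError(f"Origin must include a scheme: {origin!r}")
    return origins


origins = parse_allowed_origins("http://localhost:5173, https://noumaris.com")
print(origins)  # ['http://localhost:5173', 'https://noumaris.com']
```

The returned list can be passed directly as `allow_origins` when registering the CORS middleware.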

Security Headers

Recommended headers (add to Cloud Run or load balancer):

yaml

# Illustrative list of response headers (not a literal Cloud Run config schema)
headers:
  - "Strict-Transport-Security: max-age=31536000; includeSubDomains"  # HSTS
  - "X-Content-Type-Options: nosniff"  # Prevent MIME sniffing
  - "X-Frame-Options: DENY"  # Prevent clickjacking
  - "Content-Security-Policy: default-src 'self'"  # CSP
  - "Referrer-Policy: strict-origin-when-cross-origin"

Implementation status: ⚠️ Not yet implemented - add in Q1 2026
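
Until the headers are configured at the infrastructure level, one option is to set them in the application. The sketch below is a plain ASGI middleware with no framework imports, written under the assumption that the backend is served as a standard ASGI app; the class name is hypothetical and this is not the project's planned implementation.

```python
from typing import Awaitable, Callable, Dict, List, Tuple

# Header values copied from the recommended list.
SECURITY_HEADERS: Dict[str, str] = {
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "Content-Security-Policy": "default-src 'self'",
    "Referrer-Policy": "strict-origin-when-cross-origin",
}


class SecurityHeadersMiddleware:
    """Plain ASGI middleware that appends the headers to every HTTP response."""

    def __init__(self, app: Callable[..., Awaitable[None]]):
        self.app = app

    async def __call__(self, scope: dict, receive: Callable, send: Callable) -> None:
        if scope["type"] != "http":
            # Leave WebSocket and lifespan traffic untouched.
            await self.app(scope, receive, send)
            return

        async def send_with_headers(message: dict) -> None:
            if message["type"] == "http.response.start":
                headers: List[Tuple[bytes, bytes]] = list(message.get("headers", []))
                present = {name.lower() for name, _ in headers}
                for name, value in SECURITY_HEADERS.items():
                    # Do not override a header the endpoint set explicitly.
                    if name.lower().encode() not in present:
                        headers.append((name.lower().encode(), value.encode()))
                message = {**message, "headers": headers}
            await send(message)

        await self.app(scope, receive, send_with_headers)
```

With FastAPI this would be registered via `app.add_middleware(SecurityHeadersMiddleware)`. A strict `default-src 'self'` policy may need loosening for any API documentation pages that load assets from a CDN.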

Secrets Management

Google Secret Manager

All API keys stored in Secret Manager:

anthropic-api-key
deepgram-api-key
database-url
keycloak-admin-password

Access:

bash

# View secret
gcloud secrets versions access latest --secret anthropic-api-key

# Grant Cloud Run access
gcloud secrets add-iam-policy-binding anthropic-api-key \
  --member=serviceAccount:SERVICE_ACCOUNT_EMAIL \
  --role=roles/secretmanager.secretAccessor

Environment Variables

Never commit:

.env files (backend)
.env.localhost files (frontend)
API keys in code

Gitignore:

.env
.env.*
!.env.example

Use .env.example for documentation:

bash

# .env.example (safe to commit)
DATABASE_URL=postgresql://user:password@localhost:5433/dbname
ANTHROPIC_API_KEY=your_key_here
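
A placeholder copied from `.env.example` into a real `.env` file is easy to miss. The sketch below shows a startup check that fails fast when a required variable is unset or still holds the placeholder. The function names are hypothetical, and the list of required variables is an assumption based on the example file.

```python
import os
from typing import Iterable, List, Mapping

# Variable names taken from the .env.example file.
REQUIRED_VARS = ("DATABASE_URL", "ANTHROPIC_API_KEY")
PLACEHOLDERS = {"", "your_key_here"}


def missing_settings(
    env: Mapping[str, str], required: Iterable[str] = REQUIRED_VARS
) -> List[str]:
    """Return the required variables that are unset or still hold a placeholder."""
    return [name for name in required if env.get(name, "").strip() in PLACEHOLDERS]


def check_environment() -> None:
    """Fail fast at startup instead of on the first request that needs a secret."""
    missing = missing_settings(os.environ)
    if missing:
        raise RuntimeError(f"Missing required settings: {', '.join(missing)}")
```

The error message lists only variable names, never values, so it is safe to log.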

Audit Logging

Application Logs

Logged actions:

API requests (user ID, endpoint, timestamp)
WebSocket connections (connection ID, user ID)
Permission changes (PermissionChangeLog table)
Failed authentication attempts

python

# Application logging (Cloud Logging timestamps each entry on ingestion)
import logging
logger = logging.getLogger(__name__)

logger.info("User %s accessed /documents", user.id)
logger.error("Failed login attempt for email %s", email)
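
For fields such as `user_id` to be searchable on their own, each log line needs to be emitted as JSON rather than as a formatted string. The sketch below uses only the standard library. The logger name and the set of extra fields are assumptions, and it relies on Cloud Run parsing JSON lines written to stdout or stderr into `jsonPayload`.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "severity": record.levelname,
            "message": record.getMessage(),
            "timestamp": datetime.fromtimestamp(record.created, timezone.utc).isoformat(),
            "logger": record.name,
        }
        # Fields passed via `extra=` become attributes on the record.
        for field in ("user_id", "endpoint"):
            if hasattr(record, field):
                entry[field] = getattr(record, field)
        return json.dumps(entry)


handler = logging.StreamHandler()  # writes to stderr
handler.setFormatter(JsonFormatter())
audit_logger = logging.getLogger("noumaris.audit")  # hypothetical logger name
audit_logger.addHandler(handler)
audit_logger.setLevel(logging.INFO)

audit_logger.info("documents accessed", extra={"user_id": "abc-123", "endpoint": "/documents"})
```

Only identifiers go into the extra fields; passwords and PHI should never be passed to the logger.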

Database Audit Trail

PermissionChangeLog:

python

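from datetime import datetime

from sqlalchemy import Column, DateTime, String
from sqlalchemy.dialects.postgresql import UUID

class PermissionChangeLog(Base):
    __tablename__ = "permission_change_log"  # assumed name; declarative models require one
    id = Column(UUID, primary_key=True)
    user_id = Column(UUID)  # Affected user
    changed_by_id = Column(UUID)  # Admin who made change
    change_type = Column(String)  # 'grant', 'revoke', 'bulk_grant', etc.
    old_value = Column(String)
    new_value = Column(String)
    change_reason = Column(String)
    changed_at = Column(DateTime, default=datetime.utcnow)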

Cloud Logging

Google Cloud Logging:

All Cloud Run logs automatically captured
Retention: 30 days
Searchable by severity, timestamp, user_id

bash

# Search logs for specific user
gcloud logging read "jsonPayload.user_id='abc-123'" --limit 100

# Search for errors
gcloud logging read "severity>=ERROR" --limit 50

Password Policies

Keycloak configuration:

Minimum length: 8 characters
Require uppercase: Yes
Require lowercase: Yes
Require numbers: Yes
Require special characters: No (for user convenience)
Expire passwords: 90 days (optional, not enforced by default)
Account lockout: 5 failed attempts, 30-minute lockout

Configure via Terraform:

hcl

resource "keycloak_realm" "noumaris" {
  # ...
  password_policy = "length(8) and upperCase(1) and lowerCase(1) and digits(1)"
}
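
The same rules can be mirrored in application code, for example to give users feedback before a password is submitted. The sketch below is a convenience check only, with a hypothetical function name; Keycloak remains the enforcement point.

```python
from typing import List

MIN_LENGTH = 8  # mirrors length(8) in the realm policy


def policy_violations(password: str) -> List[str]:
    """Return the policy rules the password fails; an empty list means it passes."""
    problems = []
    if len(password) < MIN_LENGTH:
        problems.append(f"at least {MIN_LENGTH} characters")
    if not any(c.isupper() for c in password):
        problems.append("an uppercase letter")
    if not any(c.islower() for c in password):
        problems.append("a lowercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("a digit")
    return problems


print(policy_violations("password"))  # ['an uppercase letter', 'a digit']
```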

Network Security

VPC Isolation

Database isolation:

Cloud SQL has only private IP (no public IP)
Backend connects via VPC connector
Database not accessible from internet

Internet → [Cloud Run] → [VPC Connector] → [Cloud SQL Private IP]
                ↑                              ↑
           Public access              No public access

Firewall Rules

Cloud SQL:

Only accepts connections from VPC connector IP range
No direct internet connections

Cloud Run:

Public endpoints (require authentication)
Health check endpoint public (no sensitive data)

Security Checklist for New Features

Before deploying new features:

Authentication

[ ] Endpoint uses Depends(get_current_user)?
[ ] WebSocket validates JWT token?
[ ] Correct role required (superadmin, institution_admin, etc.)?

Input Validation

[ ] Pydantic model validates all inputs?
[ ] No raw SQL queries (use ORM)?
[ ] Email addresses validated with EmailStr?

Authorization

[ ] User can only access their own data?
[ ] Institution admin can only manage their institution?
[ ] Resident permissions checked via permission_service?

Rate Limiting

[ ] Appropriate rate limit applied?
[ ] Expensive operations limited (LLM, transcription)?

Logging

[ ] Important actions logged with user ID?
[ ] No sensitive data (passwords, PHI) in logs?

Data Handling

[ ] No hardcoded secrets or API keys?
[ ] Sensitive data encrypted in transit (HTTPS)?
[ ] Database sessions use context manager?

Frontend

[ ] No dangerouslySetInnerHTML without sanitization?
[ ] JWT token in Authorization header?
[ ] No sensitive data in localStorage (only token)?

Vulnerability Management

Dependency Scanning

Automated:

Dependabot enabled on GitHub
Alerts for security vulnerabilities
Auto-PR for patch updates

Manual:

bash

# Frontend
npm audit
npm audit fix

# Backend
poetry show --outdated
poetry update

Security Updates

Policy:

Critical (CVSS score 9-10): Patch within 24 hours
High (CVSS score 7-8.9): Patch within 7 days
Medium (CVSS score 4-6.9): Patch within 30 days
Low (CVSS score 0-3.9): Patch in next release
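
The policy maps a severity score to a deadline, which is easy to encode for use in triage tooling. The sketch below is a minimal version with a hypothetical function name; it treats "within 24 hours" as one calendar day and returns `None` for the "next release" tier.

```python
from datetime import date, timedelta
from typing import Optional


def patch_deadline(score: float, reported: date) -> Optional[date]:
    """Return the patch-by date for a severity score, or None for 'next release'."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("score must be between 0 and 10")
    if score >= 9.0:
        return reported + timedelta(days=1)   # critical: 24 hours
    if score >= 7.0:
        return reported + timedelta(days=7)   # high
    if score >= 4.0:
        return reported + timedelta(days=30)  # medium
    return None                               # low: next release


print(patch_deadline(7.5, date(2026, 1, 1)))  # 2026-01-08
```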

Penetration Testing

Planned:

Q2 2026: Third-party penetration test
Annual thereafter

Incident Response

See HIPAA Documentation for detailed incident response plan.

Quick response:

Detect: Monitor logs, user reports
Contain: Revoke tokens, disable accounts
Investigate: Audit logs, identify scope
Remediate: Patch vulnerabilities
Document: Incident report, post-mortem

Security Training

Required for all team members:

HIPAA awareness training
Secure coding practices
Phishing awareness
Incident reporting procedures

Annual refresh required

Known Limitations

Current gaps:

⚠️ No MFA enforcement (available but not required)
⚠️ No automated security scanning (SAST/DAST)
⚠️ No intrusion detection system
⚠️ No Web Application Firewall (WAF)
⚠️ Security headers not configured

Planned improvements (Q1-Q2 2026):

Enforce MFA for all production users
Implement automated security scanning in CI/CD
Add security headers to Cloud Run
Consider Google Cloud Armor (WAF)

Resources

Next Steps

HIPAA Compliance - HIPAA requirements
Infrastructure Documentation - Technical security architecture
Deployment Guide - Secure deployment procedures
