Security Practices

Security implementation details and best practices for Noumaris.

Overview

This document covers practical security implementation. For HIPAA compliance requirements, see HIPAA Documentation.

Authentication & Authorization

JWT Token Validation

Implementation:

  • All endpoints use Depends(get_current_user) for authentication
  • JWT validated against Keycloak public key (RS256 algorithm)
  • Token includes user ID, roles, expiration
python
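# backend/src/noumaris_backend/api/auth.py
from fastapi import Depends, HTTPException
from jose import JWTError, jwt  # python-jose, inferred from the JWTError name

def get_current_user(token: str = Depends(oauth2_scheme)):
    try:
        payload = jwt.decode(
            token,
            keycloak_public_key,
            algorithms=["RS256"],  # Pin the algorithm instead of trusting the token header
            audience="account",
        )
    except JWTError:
        raise HTTPException(status_code=401, detail="Invalid token")

    user_id = payload.get("sub")
    if user_id is None:
        # A token without a subject cannot be mapped to a user
        raise HTTPException(status_code=401, detail="Invalid token")
    # Fetch user from database; get_user_by_id is a placeholder for that lookup
    user = get_user_by_id(user_id)
    return user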

Role-Based Access Control (RBAC)

4 Roles:

  1. superadmin: System-wide administration
  2. institution_admin: Manages institution users and permissions
  3. resident: Institution-controlled feature access
  4. user: Standard physician role

Endpoint Protection:

python
# Superadmin only
@router.post("/admin/institutions")
async def create_institution(
    current_user: User = Depends(get_current_user)
):
    if "superadmin" not in current_user.roles:
        raise HTTPException(status_code=403, detail="Insufficient permissions")
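
The inline check above tends to be repeated in every restricted endpoint. The following is a minimal sketch of moving it into one helper, using only the standard library. `require_any_role` and `ForbiddenError` are hypothetical names that do not appear in the codebase; in the real application the failure would presumably be raised as the same `HTTPException(status_code=403)` shown above.

```python
from typing import Iterable

# The four roles listed in this section.
KNOWN_ROLES = frozenset({"superadmin", "institution_admin", "resident", "user"})


class ForbiddenError(Exception):
    """Hypothetical stand-in for HTTPException(status_code=403)."""


def require_any_role(user_roles: Iterable[str], *allowed: str) -> None:
    """Raise ForbiddenError unless the user holds at least one allowed role."""
    unknown = set(allowed) - KNOWN_ROLES
    if unknown:
        # A typo in a role name should fail loudly instead of denying everyone.
        raise ValueError(f"Unknown role(s): {sorted(unknown)}")
    if not set(user_roles) & set(allowed):
        raise ForbiddenError("Insufficient permissions")
```

An endpoint would call `require_any_role(current_user.roles, "superadmin")` as its first statement, which keeps the role names in one place and catches misspelled roles early.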

Session Management

Token Expiration:

  • Access token: 30 minutes
  • Refresh token: 7 days (Keycloak default)
  • Frontend shows timeout warning at 2 minutes before expiry
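
The two-minute warning comes down to reading the token's `exp` claim and subtracting a lead time. The frontend code is not shown in this document, so the following is only a sketch of that arithmetic, written in Python to match the backend examples. `read_exp_claim` and `seconds_until_warning` are hypothetical names.

```python
import base64
import json
import time
from typing import Optional

WARNING_LEAD_SECONDS = 120  # "2 minutes before expiry", from the list above


def read_exp_claim(token: str) -> int:
    """Return the `exp` claim (Unix seconds) WITHOUT verifying the signature.

    Suitable only for scheduling UI timers, never for authorization.
    """
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore base64 padding
    return int(json.loads(base64.urlsafe_b64decode(padded))["exp"])


def seconds_until_warning(token: str, now: Optional[float] = None) -> float:
    """Seconds from `now` until the timeout warning is due (0.0 if already due)."""
    now = time.time() if now is None else now
    return max(0.0, read_exp_claim(token) - WARNING_LEAD_SECONDS - now)
```

Reading the claim client-side is acceptable here because the result only drives a timer; the backend still verifies every token before trusting it.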

Auto-Logout:

  • Frontend AuthContext monitors token expiration
  • Redirects to login when token expires
  • LocalStorage cleared on logout

WebSocket Authentication

Special handling (no middleware):

python
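# Token passed as query parameter:
#   ws://localhost:8000/transcribe?token=eyJhbGci...

# Inline validation in endpoint
@app.websocket("/transcribe")
async def transcribe(websocket: WebSocket, token: str):
    try:
        user = validate_jwt(token)
    except Exception:
        # Invalid or expired token; 1008 is the "policy violation" close code
        await websocket.close(code=1008)
        return
    # Check rate limit (3 concurrent connections)
    if not check_rate_limit(user.id):
        await websocket.close(code=1008)
        return  # Stop here so a rejected connection is never processed
    # ... continue with the transcription session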

Why not middleware? - WebSocket middleware causes connection issues with FastAPI

Input Validation

Pydantic Models

All endpoints validate input via Pydantic:

python
from pydantic import BaseModel, EmailStr, validator

class CreateUserRequest(BaseModel):
    email: EmailStr  # Validates email format
    name: str
    role: str

    @validator('role')
    def validate_role(cls, v):
        allowed = ['superadmin', 'institution_admin', 'resident', 'user']
        if v not in allowed:
            raise ValueError(f'Role must be one of {allowed}')
        return v

Benefits:

  • Type safety at runtime
  • Automatic validation errors (FastAPI returns 422 Unprocessable Entity by default)
  • Rejects malformed input early, which reduces injection risk (parameterized queries remain the main defence)

SQL Injection Prevention

✅ Safe - SQLAlchemy ORM with parameterized queries:

python
# Values are sent as bound parameters, never interpolated into the SQL string
user = session.query(User).filter_by(email=email).first()

❌ Unsafe - Raw SQL with string interpolation (never do this):

python
# Vulnerable to SQL injection
query = f"SELECT * FROM users WHERE email = '{email}'"
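
The difference between the two forms can be seen with the standard library's `sqlite3` module. This is an illustrative sketch only: the in-memory table and sample addresses are made up, and production code goes through SQLAlchemy as shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?)",
    [("a@example.com",), ("b@example.com",)],
)

malicious = "nobody' OR '1'='1"

# Unsafe: the input becomes part of the SQL text, so the OR clause matches every row.
unsafe_rows = conn.execute(
    f"SELECT email FROM users WHERE email = '{malicious}'"
).fetchall()

# Safe: the input is bound as a value and compared literally, so nothing matches.
safe_rows = conn.execute(
    "SELECT email FROM users WHERE email = ?", (malicious,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))  # prints: 2 0
```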

XSS Prevention

Frontend:

  • React automatically escapes HTML in JSX
  • Use dangerouslySetInnerHTML only for trusted content
jsx
// ✅ Safe - React escapes automatically
<div>{user.name}</div>

// ❌ Unsafe - bypasses escaping
<div dangerouslySetInnerHTML={{__html: user.name}} />

Backend:

  • Pydantic validates string inputs
  • No HTML stored in database (TipTap uses JSON format)

CSRF Protection

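Not needed - the API reads the JWT from the Authorization header, not from cookies

  • CSRF relies on credentials the browser attaches automatically, such as cookies
  • JWTs are added to the header explicitly by frontend code, so a forged cross-site request carries no token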

Rate Limiting

API Rate Limits

SlowAPI implementation:

python
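from fastapi import Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter  # SlowAPI looks the limiter up on app.state
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)  # 429 when exceeded

@app.get("/documents")
@limiter.limit("10/minute")  # 10 requests per minute per IP
async def get_documents(request: Request, current_user: User = Depends(get_current_user)):
    pass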

Rate Limits by Endpoint:

Endpoint                   Limit        Reason
/health                    100/minute   Health checks
/documents                 10/minute    Prevent data scraping
/summarize_transcription   5/minute     Expensive LLM calls
/admin/*                   20/minute    Administrative operations

WebSocket Rate Limiting

Custom limiter (3 concurrent connections per user):

python
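# backend/src/noumaris_backend/api/websocket_auth.py
from typing import Dict

class WebSocketRateLimiter:
    def __init__(self, max_connections: int = 3):
        self.max_connections = max_connections  # Read by check_limit
        self.active_connections: Dict[str, int] = {}

    def check_limit(self, user_id: str) -> bool:
        """Return True if the user may open another connection."""
        current = self.active_connections.get(user_id, 0)
        return current < self.max_connections

    def increment(self, user_id: str):
        self.active_connections[user_id] = self.active_connections.get(user_id, 0) + 1

    def decrement(self, user_id: str):
        # Remove the entry at zero so counts never go negative and the dict stays small
        remaining = self.active_connections.get(user_id, 0) - 1
        if remaining > 0:
            self.active_connections[user_id] = remaining
        else:
            self.active_connections.pop(user_id, None)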
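
A counter like the one above only stays accurate if every increment is paired with a decrement, including when the connection handler raises. One way to guarantee that pairing is an async context manager. The sketch below is a self-contained variant under that assumption; `ConnectionSlots`, `hold`, and `demo` are hypothetical names, not code from `websocket_auth.py`.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator, Dict


class ConnectionSlots:
    """Hypothetical per-user counter with the same 3-connection cap."""

    def __init__(self, max_connections: int = 3) -> None:
        self.max_connections = max_connections
        self.active: Dict[str, int] = {}

    @asynccontextmanager
    async def hold(self, user_id: str) -> AsyncIterator[bool]:
        """Yield True if a slot was acquired; release it on exit, even on error."""
        acquired = self.active.get(user_id, 0) < self.max_connections
        if acquired:
            self.active[user_id] = self.active.get(user_id, 0) + 1
        try:
            yield acquired
        finally:
            if acquired:
                remaining = self.active[user_id] - 1
                if remaining:
                    self.active[user_id] = remaining
                else:
                    del self.active[user_id]  # do not keep idle users around


async def demo() -> list:
    """Open four simulated connections for one user against a cap of three."""
    slots = ConnectionSlots(max_connections=3)
    results = []

    async def connect() -> None:
        async with slots.hold("user-1") as ok:
            results.append(ok)
            await asyncio.sleep(0.01)  # simulate an open connection

    await asyncio.gather(*(connect() for _ in range(4)))
    return results + [slots.active]
```

Running `asyncio.run(demo())` returns `[True, True, True, False, {}]`: the fourth connection is refused and all slots are released afterwards. Because the counts live in process memory, a limit of this kind applies per server instance, not across instances.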

CORS Configuration

Backend (main.py):

python
from fastapi.middleware.cors import CORSMiddleware

app.add_middleware(
    CORSMiddleware,
    allow_origins=[
        "http://localhost:5173",  # Local frontend
        "https://noumaris.com",   # Production frontend
    ],
    allow_credentials=True,  # Allow cookies and auth headers
    allow_methods=["*"],     # All HTTP methods (GET, POST, PUT, DELETE, ...)
    allow_headers=["*"],     # Authorization, Content-Type, etc.
)

Security notes:

  • Never use allow_origins=["*"] in production
  • allow_credentials=True requires specific origins (not wildcard)
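
Both notes can be checked mechanically, for example in a unit test that inspects the CORS settings before deployment. A minimal sketch, assuming the origins and the credentials flag are available as plain values; `cors_config_problems` is a hypothetical helper.

```python
from typing import List, Sequence


def cors_config_problems(allow_origins: Sequence[str], allow_credentials: bool) -> List[str]:
    """Return human-readable problems; an empty list means both rules hold."""
    problems: List[str] = []
    if "*" in allow_origins:
        problems.append('allow_origins contains "*"')
        if allow_credentials:
            problems.append("allow_credentials=True requires explicit origins")
    return problems
```

With the configuration shown above, `cors_config_problems(["http://localhost:5173", "https://noumaris.com"], True)` returns an empty list.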

Security Headers

Recommended headers (add to Cloud Run or load balancer):

yaml
# Illustrative header list, not a literal Cloud Run configuration schema
headers:
  - "Strict-Transport-Security: max-age=31536000; includeSubDomains"  # HSTS
  - "X-Content-Type-Options: nosniff"  # Prevent MIME sniffing
  - "X-Frame-Options: DENY"  # Prevent clickjacking
  - "Content-Security-Policy: default-src 'self'"  # CSP
  - "Referrer-Policy: strict-origin-when-cross-origin"

Implementation status: ⚠️ Not yet implemented - add in Q1 2026
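
Until the headers are set at the platform level, one option is to add them in the application itself. The sketch below is a plain ASGI middleware with no third-party imports, using the header values recommended above. `SecurityHeadersMiddleware` is a hypothetical name and this is not part of the current codebase.

```python
# Header names are lowercase bytes, as the ASGI spec requires.
SECURITY_HEADERS = [
    (b"strict-transport-security", b"max-age=31536000; includeSubDomains"),
    (b"x-content-type-options", b"nosniff"),
    (b"x-frame-options", b"DENY"),
    (b"content-security-policy", b"default-src 'self'"),
    (b"referrer-policy", b"strict-origin-when-cross-origin"),
]


class SecurityHeadersMiddleware:
    """Append the recommended headers to every HTTP response."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Leave WebSocket and lifespan traffic untouched.
            await self.app(scope, receive, send)
            return

        async def send_with_headers(message):
            if message["type"] == "http.response.start":
                headers = list(message.get("headers", []))
                present = {name.lower() for name, _ in headers}
                # Do not override a header the endpoint set deliberately.
                headers.extend(h for h in SECURITY_HEADERS if h[0] not in present)
                message = {**message, "headers": headers}
            await send(message)

        await self.app(scope, receive, send_with_headers)
```

Since FastAPI accepts ASGI middleware classes, this could presumably be registered with `app.add_middleware(SecurityHeadersMiddleware)`, next to the CORS middleware.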

Secrets Management

Google Secret Manager

All API keys stored in Secret Manager:

  • anthropic-api-key
  • deepgram-api-key
  • database-url
  • keycloak-admin-password

Access:

bash
# View secret
gcloud secrets versions access latest --secret anthropic-api-key

# Grant Cloud Run access (SERVICE_ACCOUNT_EMAIL is a placeholder for the service account address)
gcloud secrets add-iam-policy-binding anthropic-api-key \
  --member="serviceAccount:SERVICE_ACCOUNT_EMAIL" \
  --role=roles/secretmanager.secretAccessor

Environment Variables

Never commit:

  • .env files (backend)
  • .env.localhost files (frontend)
  • API keys in code

Gitignore:

.env
.env.*
!.env.example

Use .env.example for documentation:

bash
# .env.example (safe to commit)
DATABASE_URL=postgresql://user:password@localhost:5433/dbname
ANTHROPIC_API_KEY=your_key_here

Audit Logging

Application Logs

Logged actions:

  • API requests (user ID, endpoint, timestamp)
  • WebSocket connections (connection ID, user ID)
  • Permission changes (PermissionChangeLog table)
  • Failed authentication attempts
python
# Logging with lazy %-style arguments; Cloud Logging timestamps each entry,
# so the message does not need to repeat the time
import logging
logger = logging.getLogger(__name__)

logger.info("User %s accessed /documents", user.id)
logger.warning("Failed login attempt for email %s", email)
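
Messages such as the failed-login line above carry an email address into the logs. If that is not wanted, a `logging.Filter` can mask addresses before a record is emitted. This is a minimal sketch using only the standard library; `RedactEmailFilter`, the regular expression, and the logger name are assumptions for illustration, not existing Noumaris code.

```python
import logging
import re

# Deliberately simple pattern; it is meant for masking, not for validating addresses.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


class RedactEmailFilter(logging.Filter):
    """Mask email addresses in the final message text."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # merges msg with its args
        record.msg = EMAIL_RE.sub("[redacted-email]", message)
        record.args = None  # args are already merged into msg
        return True  # keep the record


logger = logging.getLogger("noumaris.example")  # hypothetical logger name
logger.addFilter(RedactEmailFilter())
```

A filter attached to a logger only sees records created through that logger. Attaching it to the handler instead covers every logger that writes to that handler.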

Database Audit Trail

PermissionChangeLog:

python
class PermissionChangeLog(Base):
    id = Column(UUID, primary_key=True)
    user_id = Column(UUID)  # Affected user
    changed_by_id = Column(UUID)  # Admin who made change
    change_type = Column(String)  # 'grant', 'revoke', 'bulk_grant', etc.
    old_value = Column(String)
    new_value = Column(String)
    change_reason = Column(String)
    changed_at = Column(DateTime, default=datetime.utcnow)

Cloud Logging

Google Cloud Logging:

  • All Cloud Run logs automatically captured
  • Retention: 30 days
  • Searchable by severity, timestamp, user_id
bash
# Search logs for specific user
gcloud logging read 'jsonPayload.user_id="abc-123"' --limit 100

# Search for errors
gcloud logging read "severity>=ERROR" --limit 50

Password Policies

Keycloak configuration:

  • Minimum length: 8 characters
  • Require uppercase: Yes
  • Require lowercase: Yes
  • Require numbers: Yes
  • Require special characters: No (for user convenience)
  • Expire passwords: 90 days (optional, not enforced by default)
  • Account lockout: 5 failed attempts, 30-minute lockout

Configure via Terraform:

hcl
resource "keycloak_realm" "noumaris" {
  # ...
  password_policy = "length(8) and upperCase(1) and lowerCase(1) and digits(1)"
}
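
Keycloak enforces this policy on the server. If the same rules need to be mirrored elsewhere, for instance in tests or for inline form hints, they can be approximated as below. This is a sketch: `password_policy_violations` is a hypothetical helper, and its character classification may differ from Keycloak's in edge cases.

```python
from typing import List

MIN_LENGTH = 8  # length(8)


def password_policy_violations(password: str) -> List[str]:
    """List the policy rules the password fails; an empty list means it passes."""
    violations: List[str] = []
    if len(password) < MIN_LENGTH:
        violations.append(f"shorter than {MIN_LENGTH} characters")
    if not any(ch.isupper() for ch in password):
        violations.append("no uppercase letter")  # upperCase(1)
    if not any(ch.islower() for ch in password):
        violations.append("no lowercase letter")  # lowerCase(1)
    if not any(ch.isdigit() for ch in password):
        violations.append("no digit")  # digits(1)
    return violations
```

For example, `password_policy_violations("password")` returns `["no uppercase letter", "no digit"]`, while `"Passw0rd"` satisfies all four rules.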

Network Security

VPC Isolation

Database isolation:

  • Cloud SQL has only private IP (no public IP)
  • Backend connects via VPC connector
  • Database not accessible from internet
Internet → [Cloud Run] → [VPC Connector] → [Cloud SQL Private IP]
                ↑                              ↑
           Public access              No public access

Firewall Rules

Cloud SQL:

  • Only accepts connections from VPC connector IP range
  • No direct internet connections

Cloud Run:

  • Public endpoints (require authentication)
  • Health check endpoint public (no sensitive data)

Security Checklist for New Features

Before deploying new features:

Authentication

  • [ ] Endpoint uses Depends(get_current_user)?
  • [ ] WebSocket validates JWT token?
  • [ ] Correct role required (superadmin, institution_admin, etc.)?

Input Validation

  • [ ] Pydantic model validates all inputs?
  • [ ] No raw SQL queries (use ORM)?
  • [ ] Email addresses validated with EmailStr?

Authorization

  • [ ] User can only access their own data?
  • [ ] Institution admin can only manage their institution?
  • [ ] Resident permissions checked via permission_service?

Rate Limiting

  • [ ] Appropriate rate limit applied?
  • [ ] Expensive operations limited (LLM, transcription)?

Logging

  • [ ] Important actions logged with user ID?
  • [ ] No sensitive data (passwords, PHI) in logs?

Data Handling

  • [ ] No hardcoded secrets or API keys?
  • [ ] Sensitive data encrypted in transit (HTTPS)?
  • [ ] Database sessions use context manager?

Frontend

  • [ ] No dangerouslySetInnerHTML without sanitization?
  • [ ] JWT token in Authorization header?
  • [ ] No sensitive data in localStorage (only token)?

Vulnerability Management

Dependency Scanning

Automated:

  • Dependabot enabled on GitHub
  • Alerts for security vulnerabilities
  • Auto-PR for patch updates

Manual:

bash
# Frontend
npm audit
npm audit fix

# Backend (lists and applies updates; this is not a vulnerability scan)
poetry show --outdated
poetry update

Security Updates

Policy:

  1. Critical (CVSS score 9.0-10.0): Patch within 24 hours
  2. High (CVSS score 7.0-8.9): Patch within 7 days
  3. Medium (CVSS score 4.0-6.9): Patch within 30 days
  4. Low (CVSS score 0.1-3.9): Patch in next release
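
The policy maps a severity score to a patch window, which is simple enough to encode for use in tooling such as alert triage. A minimal sketch, assuming scores on the 0-10 CVSS scale; `patch_window` is a hypothetical helper, and `None` stands for "next release".

```python
from datetime import timedelta
from typing import Optional, Tuple


def patch_window(cvss_score: float) -> Tuple[str, Optional[timedelta]]:
    """Map a CVSS score to (severity, patch window) per the policy above."""
    if not 0.0 <= cvss_score <= 10.0:
        raise ValueError("CVSS scores range from 0.0 to 10.0")
    if cvss_score >= 9.0:
        return "critical", timedelta(hours=24)
    if cvss_score >= 7.0:
        return "high", timedelta(days=7)
    if cvss_score >= 4.0:
        return "medium", timedelta(days=30)
    return "low", None  # no fixed deadline: patch in the next release
```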

Penetration Testing

Planned:

  • Q2 2026: Third-party penetration test
  • Annual thereafter

Incident Response

See HIPAA Documentation for detailed incident response plan.

Quick response:

  1. Detect: Monitor logs, user reports
  2. Contain: Revoke tokens, disable accounts
  3. Investigate: Audit logs, identify scope
  4. Remediate: Patch vulnerabilities
  5. Document: Incident report, post-mortem

Security Training

Required for all team members:

  • HIPAA awareness training
  • Secure coding practices
  • Phishing awareness
  • Incident reporting procedures

Annual refresh required

Known Limitations

Current gaps:

  1. ⚠️ No MFA enforcement (available but not required)
  2. ⚠️ No automated security scanning (SAST/DAST)
  3. ⚠️ No intrusion detection system
  4. ⚠️ No Web Application Firewall (WAF)
  5. ⚠️ Security headers not configured

Planned improvements (Q1-Q2 2026):

  • Enforce MFA for all production users
  • Implement automated security scanning in CI/CD
  • Add security headers to Cloud Run
  • Consider Google Cloud Armor (WAF)

Internal documentation for Noumaris platform