Migrate to Supabase: Replace custom backend with hosted BaaS #2

Summary

Migrate CoherenceLab from the custom Node.js/Fastify backend to hosted Supabase, so that both the web app and devices talk to Supabase directly over the internet. This eliminates apps/backend-collector while keeping apps/analyzer as a separate Python service.

Motivation

  • Faster development: Supabase provides Auth, REST API, Realtime, and Storage out of the box
  • Simpler operations: No custom backend to deploy/maintain
  • Better DX: Auto-generated TypeScript types, built-in auth UI components available
  • Cost-effective: Pay-per-use for a research project with variable load

Architecture Change

BEFORE:                              AFTER:
┌─────────┐    ┌──────────────┐     ┌─────────┐    ┌─────────────────┐
│ Web App │───▶│ backend-     │     │ Web App │───▶│ Supabase        │
└─────────┘    │ collector    │     └─────────┘    │ (Auth, DB, API, │
               │ (Node.js)    │                    │  Realtime,      │
┌─────────┐    └──────┬───────┘     ┌─────────┐    │  Storage)       │
│ Devices │───────────┤             │ Devices │───▶│                 │
└─────────┘           ▼             └─────────┘    └────────┬────────┘
               ┌──────────────┐                             │
               │ PostgreSQL   │     ┌──────────┐            │
               └──────────────┘     │ Analyzer │────────────┘
                                    │ (Python) │
               ┌──────────────┐     └──────────┘
               │ Storj        │
               └──────────────┘     (Supabase Storage for MVP)

Key Decisions

Component         Decision
----------------  --------------------------------------------------------------
Analyzer service  Keep as a separate Python service; connects to Supabase PostgreSQL
Device ingestion  Supabase Realtime + an Edge Function for validation
Object storage    Supabase Storage for the MVP; upgrade to Storj later

Prerequisites

Before starting implementation:

  1. Create Supabase project at https://supabase.com
  2. Note the project URL and anon key
  3. Get the PostgreSQL connection string for the analyzer

Implementation Checklist

Phase 1: Supabase Project Setup

  • Create Supabase project
  • Run schema SQL in SQL Editor (see Schema section below)
  • Run RLS policies SQL (see RLS section below)
  • Configure email/password auth in Auth settings
  • Create entropy-chunks storage bucket (scriptable; see the sketch after this list)
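
The bucket can also be created programmatically instead of via the dashboard. A minimal sketch, assuming a service-role client (bucket creation is not permitted with the anon key; SUPABASE_URL and SERVICE_ROLE_KEY are placeholder constants):

import { createClient } from '@supabase/supabase-js';

// Service-role client: storage admin operations require elevated privileges.
const supabase = createClient(SUPABASE_URL, SERVICE_ROLE_KEY);

const { error } = await supabase.storage.createBucket('entropy-chunks', {
  public: false, // chunks stay private; access via signed URLs or policies
});
if (error) throw error;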

Phase 2: Web App Migration

New files to create:

  • apps/web/src/lib/supabase.ts - Supabase client instance (sketched after this list)
  • apps/web/src/lib/database.types.ts - Generated types (run supabase gen types typescript)
  • apps/web/src/hooks/useDevices.ts - Device CRUD hooks
  • apps/web/src/hooks/useProfiles.ts - Sampling profile hooks
  • apps/web/src/hooks/useAnalysis.ts - Analysis job hooks
  • apps/web/.env - Supabase URL and anon key
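
As referenced above, a minimal sketch of apps/web/src/lib/supabase.ts, assuming the env var names listed under "Environment Variables Needed" below:

// apps/web/src/lib/supabase.ts
import { createClient } from '@supabase/supabase-js';
import type { Database } from './database.types';

// Typed client: Database comes from `supabase gen types typescript`.
export const supabase = createClient<Database>(
  import.meta.env.VITE_SUPABASE_URL,
  import.meta.env.VITE_SUPABASE_ANON_KEY,
);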

Files to modify:

  • apps/web/package.json - Add @supabase/supabase-js
  • apps/web/src/hooks/useAuth.ts - Replace with Supabase Auth (see the sketch after this list)
  • apps/web/src/pages/Auth/LoginPage.tsx - Use Supabase auth
  • apps/web/src/pages/Auth/SignupPage.tsx - Use Supabase auth
  • apps/web/src/pages/Devices/DevicesPage.tsx - Use new hooks
  • apps/web/src/pages/Devices/DeviceDetailPage.tsx - Use new hooks
  • apps/web/src/pages/Analysis/AnalysisPage.tsx - Use new hooks
  • apps/web/src/pages/Analysis/AnalysisDetailPage.tsx - Use new hooks
  • apps/web/src/App.tsx - Update ProtectedRoute for Supabase session
  • apps/web/src/components/Layout.tsx - Update logout
  • apps/web/vite.config.ts - Remove API proxy
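
A sketch of the Supabase Auth calls that replace the custom useAuth logic. The exact hook shape is an implementation choice; the client calls themselves are standard supabase-js v2:

import { useEffect, useState } from 'react';
import type { Session } from '@supabase/supabase-js';
import { supabase } from '../lib/supabase';

export function useAuth() {
  const [session, setSession] = useState<Session | null>(null);

  useEffect(() => {
    // Load the current session once, then subscribe to auth state changes.
    supabase.auth.getSession().then(({ data }) => setSession(data.session));
    const { data: { subscription } } = supabase.auth.onAuthStateChange(
      (_event, newSession) => setSession(newSession),
    );
    return () => subscription.unsubscribe();
  }, []);

  return {
    session,
    signIn: (email: string, password: string) =>
      supabase.auth.signInWithPassword({ email, password }),
    signUp: (email: string, password: string) =>
      supabase.auth.signUp({ email, password }),
    signOut: () => supabase.auth.signOut(),
  };
}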

Files to delete:

  • apps/web/src/lib/api.ts - Replaced by Supabase client
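
The REST wrapper calls in api.ts become direct, RLS-scoped queries. A hypothetical example for the devices listing:

// RLS limits the result to the signed-in user's own devices.
const { data: devices, error } = await supabase
  .from('devices')
  .select('*')
  .order('created_at', { ascending: false });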

Phase 3: Analyzer Service Update

  • Update apps/analyzer/src/config.py - Add Supabase DB URL option
  • Update apps/analyzer/src/database.py - Use Supabase connection pooler
  • Update apps/analyzer/.env - Set DATABASE_URL to Supabase

Phase 4: Device Ingestion (Edge Functions)

  • Create device-auth Edge Function - Validates device_token, issues JWT
  • Create ingest-entropy Edge Function - Receives and stores samples (sketched below)
  • Create insert_entropy_sample database function for validation
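
A sketch of ingest-entropy as a Deno Edge Function. The request body shape and the insert_entropy_sample parameter names (p_device_id etc.) are assumptions to be settled when the database function is written; the supabase-js calls and the auto-injected SUPABASE_URL / SUPABASE_SERVICE_ROLE_KEY env vars are standard. Verification of the device JWT issued by device-auth is elided here.

import { createClient } from 'npm:@supabase/supabase-js@2';

Deno.serve(async (req) => {
  // Service-role client: bypasses RLS for server-side inserts.
  const supabase = createClient(
    Deno.env.get('SUPABASE_URL')!,
    Deno.env.get('SUPABASE_SERVICE_ROLE_KEY')!,
  );

  const { device_id, profile_id, sequence, sample_base64 } = await req.json();

  // Delegate validation + insert to the database function from this phase.
  const { error } = await supabase.rpc('insert_entropy_sample', {
    p_device_id: device_id,
    p_profile_id: profile_id,
    p_sequence: sequence,
    p_sample_base64: sample_base64,
  });

  if (error) {
    return new Response(JSON.stringify({ error: error.message }), { status: 400 });
  }
  return new Response(JSON.stringify({ ok: true }), { status: 201 });
});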

Phase 5: Cleanup

  • Archive backend-collector to archive/backend-collector-pre-supabase branch
  • Remove apps/backend-collector directory
  • Update pnpm-workspace.yaml
  • Update turbo.json
  • Update root package.json scripts

Database Schema SQL

-- Enable required extensions
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- Enums
CREATE TYPE user_role AS ENUM ('user', 'admin');
CREATE TYPE device_client_type AS ENUM ('raspi', 'android', 'ios', 'linux', 'unknown');
CREATE TYPE device_status AS ENUM ('active', 'inactive', 'error');
CREATE TYPE analysis_job_status AS ENUM ('queued', 'running', 'completed', 'failed');
CREATE TYPE storj_chunk_status AS ENUM ('pending', 'uploading', 'uploaded', 'failed');

-- Profiles table (links to auth.users)
CREATE TABLE profiles (
  id UUID PRIMARY KEY REFERENCES auth.users(id) ON DELETE CASCADE,
  email TEXT NOT NULL,
  role user_role NOT NULL DEFAULT 'user',
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Devices table
CREATE TABLE devices (
  id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  user_id UUID NOT NULL REFERENCES profiles(id) ON DELETE CASCADE,
  name TEXT NOT NULL,
  description TEXT,
  client_type device_client_type NOT NULL DEFAULT 'unknown',
  device_token TEXT NOT NULL UNIQUE,
  status device_status NOT NULL DEFAULT 'active',
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  last_seen_at TIMESTAMPTZ
);
CREATE INDEX idx_devices_user_id ON devices(user_id);
CREATE INDEX idx_devices_device_token ON devices(device_token);

-- Sampling profiles table
CREATE TABLE sampling_profiles (
  id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  device_id UUID NOT NULL REFERENCES devices(id) ON DELETE CASCADE,
  label TEXT NOT NULL,
  sample_rate_hz DECIMAL(10, 6) NOT NULL,
  sample_size_bytes INTEGER NOT NULL,
  metadata JSONB NOT NULL DEFAULT '{}',
  is_active BOOLEAN NOT NULL DEFAULT TRUE,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_sampling_profiles_device_id ON sampling_profiles(device_id);

-- Storj chunks table
CREATE TABLE storj_chunks (
  id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  bucket TEXT NOT NULL,
  object_key TEXT NOT NULL,
  size_bytes BIGINT NOT NULL,
  byte_count BIGINT NOT NULL,
  sample_count INTEGER NOT NULL,
  min_timestamp TIMESTAMPTZ NOT NULL,
  max_timestamp TIMESTAMPTZ NOT NULL,
  device_ids TEXT[] NOT NULL,
  uploaded_at TIMESTAMPTZ,
  status storj_chunk_status NOT NULL DEFAULT 'pending',
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_storj_chunks_status ON storj_chunks(status);
CREATE INDEX idx_storj_chunks_timestamps ON storj_chunks(min_timestamp, max_timestamp);

-- Entropy samples table
CREATE TABLE entropy_samples (
  id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  device_id UUID NOT NULL REFERENCES devices(id) ON DELETE CASCADE,
  profile_id UUID NOT NULL REFERENCES sampling_profiles(id) ON DELETE CASCADE,
  timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  sequence BIGINT NOT NULL,
  byte_length INTEGER NOT NULL,
  sample_data BYTEA NOT NULL,
  storj_chunk_id UUID REFERENCES storj_chunks(id) ON DELETE SET NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_entropy_samples_device_timestamp ON entropy_samples(device_id, timestamp);
CREATE INDEX idx_entropy_samples_storj_chunk_id ON entropy_samples(storj_chunk_id);
CREATE INDEX idx_entropy_samples_timestamp ON entropy_samples(timestamp);

-- Analysis jobs table
CREATE TABLE analysis_jobs (
  id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  user_id UUID NOT NULL REFERENCES profiles(id) ON DELETE CASCADE,
  device_ids TEXT[] NOT NULL DEFAULT '{}',
  time_range_start TIMESTAMPTZ NOT NULL,
  time_range_end TIMESTAMPTZ NOT NULL,
  test_names TEXT[] NOT NULL,
  parameters JSONB NOT NULL DEFAULT '{}',
  status analysis_job_status NOT NULL DEFAULT 'queued',
  error_message TEXT,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  started_at TIMESTAMPTZ,
  completed_at TIMESTAMPTZ
);
CREATE INDEX idx_analysis_jobs_user_id ON analysis_jobs(user_id);
CREATE INDEX idx_analysis_jobs_status ON analysis_jobs(status);

-- Analysis results table
CREATE TABLE analysis_results (
  id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  job_id UUID NOT NULL REFERENCES analysis_jobs(id) ON DELETE CASCADE,
  test_name TEXT NOT NULL,
  device_id TEXT,
  summary JSONB NOT NULL,
  metrics JSONB NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_analysis_results_job_id ON analysis_results(job_id);

-- Chunk analysis table
CREATE TABLE chunk_analysis (
  id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  chunk_id UUID NOT NULL REFERENCES storj_chunks(id) ON DELETE CASCADE,
  tests JSONB NOT NULL,
  metrics JSONB NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_chunk_analysis_chunk_id ON chunk_analysis(chunk_id);

-- Device timeslice stats table
CREATE TABLE device_timeslice_stats (
  id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  device_id UUID NOT NULL,
  bucket_start TIMESTAMPTZ NOT NULL,
  bucket_end TIMESTAMPTZ NOT NULL,
  sample_count BIGINT NOT NULL,
  byte_count BIGINT NOT NULL,
  entropy_estimate DOUBLE PRECISION,
  uniformity_chi_square JSONB,
  runs_test JSONB,
  autocorrelation JSONB,
  extra JSONB NOT NULL DEFAULT '{}',
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_device_timeslice_stats_device_bucket ON device_timeslice_stats(device_id, bucket_start);

-- Coherence stats table
CREATE TABLE coherence_stats (
  id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  bucket_start TIMESTAMPTZ NOT NULL,
  bucket_end TIMESTAMPTZ NOT NULL,
  device_a_id UUID NOT NULL,
  device_b_id UUID NOT NULL,
  correlation_metrics JSONB NOT NULL,
  tests JSONB NOT NULL DEFAULT '{}',
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_coherence_stats_devices_bucket ON coherence_stats(device_a_id, device_b_id, bucket_start);

-- Maintenance log table
CREATE TABLE maintenance_log (
  id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  operation_type TEXT NOT NULL,
  target_table TEXT NOT NULL,
  details JSONB NOT NULL DEFAULT '{}',
  rows_affected BIGINT,
  bytes_freed BIGINT,
  executed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  duration_ms INTEGER
);

-- Trigger to auto-create profile on signup
CREATE OR REPLACE FUNCTION handle_new_user()
RETURNS TRIGGER AS $$
BEGIN
  INSERT INTO public.profiles (id, email, role)
  VALUES (NEW.id, NEW.email, 'user');
  RETURN NEW;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

CREATE TRIGGER on_auth_user_created
  AFTER INSERT ON auth.users
  FOR EACH ROW EXECUTE FUNCTION handle_new_user();

-- Trigger to auto-update updated_at
CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
  NEW.updated_at = NOW();
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER update_profiles_updated_at
  BEFORE UPDATE ON profiles
  FOR EACH ROW EXECUTE FUNCTION update_updated_at_column();

CREATE TRIGGER update_sampling_profiles_updated_at
  BEFORE UPDATE ON sampling_profiles
  FOR EACH ROW EXECUTE FUNCTION update_updated_at_column();
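
Once types are generated from this schema, client queries are fully typed. For example, a hypothetical analysis-job insert (session, deviceId, start, and end are assumed to be in scope; 'runs_test' is an illustrative test name):

const { data: job, error } = await supabase
  .from('analysis_jobs')
  .insert({
    user_id: session.user.id,
    device_ids: [deviceId],      // TEXT[] per the schema above
    time_range_start: start.toISOString(),
    time_range_end: end.toISOString(),
    test_names: ['runs_test'],   // illustrative test name
  })
  .select()
  .single();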

Row Level Security (RLS) Policies

-- Enable RLS on all tables
ALTER TABLE profiles ENABLE ROW LEVEL SECURITY;
ALTER TABLE devices ENABLE ROW LEVEL SECURITY;
ALTER TABLE sampling_profiles ENABLE ROW LEVEL SECURITY;
ALTER TABLE entropy_samples ENABLE ROW LEVEL SECURITY;
ALTER TABLE analysis_jobs ENABLE ROW LEVEL SECURITY;
ALTER TABLE analysis_results ENABLE ROW LEVEL SECURITY;
ALTER TABLE storj_chunks ENABLE ROW LEVEL SECURITY;
ALTER TABLE chunk_analysis ENABLE ROW LEVEL SECURITY;
ALTER TABLE device_timeslice_stats ENABLE ROW LEVEL SECURITY;
ALTER TABLE coherence_stats ENABLE ROW LEVEL SECURITY;
ALTER TABLE maintenance_log ENABLE ROW LEVEL SECURITY;

-- Profiles: Users can only see/update their own profile
CREATE POLICY "Users can view own profile"
  ON profiles FOR SELECT USING (auth.uid() = id);

CREATE POLICY "Users can update own profile"
  ON profiles FOR UPDATE USING (auth.uid() = id);

-- Devices: Users can CRUD their own devices
CREATE POLICY "Users can view own devices"
  ON devices FOR SELECT USING (auth.uid() = user_id);

CREATE POLICY "Users can create own devices"
  ON devices FOR INSERT WITH CHECK (auth.uid() = user_id);

CREATE POLICY "Users can update own devices"
  ON devices FOR UPDATE USING (auth.uid() = user_id);

CREATE POLICY "Users can delete own devices"
  ON devices FOR DELETE USING (auth.uid() = user_id);

-- Sampling Profiles: Access through device ownership
CREATE POLICY "Users can view own sampling profiles"
  ON sampling_profiles FOR SELECT
  USING (EXISTS (
    SELECT 1 FROM devices
    WHERE devices.id = sampling_profiles.device_id
    AND devices.user_id = auth.uid()
  ));

CREATE POLICY "Users can create sampling profiles for own devices"
  ON sampling_profiles FOR INSERT
  WITH CHECK (EXISTS (
    SELECT 1 FROM devices
    WHERE devices.id = sampling_profiles.device_id
    AND devices.user_id = auth.uid()
  ));

CREATE POLICY "Users can update own sampling profiles"
  ON sampling_profiles FOR UPDATE
  USING (EXISTS (
    SELECT 1 FROM devices
    WHERE devices.id = sampling_profiles.device_id
    AND devices.user_id = auth.uid()
  ));

CREATE POLICY "Users can delete own sampling profiles"
  ON sampling_profiles FOR DELETE
  USING (EXISTS (
    SELECT 1 FROM devices
    WHERE devices.id = sampling_profiles.device_id
    AND devices.user_id = auth.uid()
  ));

-- Entropy Samples: Users can view samples from their devices
CREATE POLICY "Users can view own entropy samples"
  ON entropy_samples FOR SELECT
  USING (EXISTS (
    SELECT 1 FROM devices
    WHERE devices.id = entropy_samples.device_id
    AND devices.user_id = auth.uid()
  ));

-- Analysis Jobs: Users can CRUD their own jobs
CREATE POLICY "Users can view own analysis jobs"
  ON analysis_jobs FOR SELECT USING (auth.uid() = user_id);

CREATE POLICY "Users can create own analysis jobs"
  ON analysis_jobs FOR INSERT WITH CHECK (auth.uid() = user_id);

CREATE POLICY "Users can update own analysis jobs"
  ON analysis_jobs FOR UPDATE USING (auth.uid() = user_id);

CREATE POLICY "Users can delete own analysis jobs"
  ON analysis_jobs FOR DELETE USING (auth.uid() = user_id);

-- Analysis Results: Access through job ownership
CREATE POLICY "Users can view own analysis results"
  ON analysis_results FOR SELECT
  USING (EXISTS (
    SELECT 1 FROM analysis_jobs
    WHERE analysis_jobs.id = analysis_results.job_id
    AND analysis_jobs.user_id = auth.uid()
  ));

-- Storj Chunks: Users can view chunks containing their devices
CREATE POLICY "Users can view chunks with their devices"
  ON storj_chunks FOR SELECT
  USING (EXISTS (
    SELECT 1 FROM devices
    WHERE devices.user_id = auth.uid()
    AND devices.id::text = ANY(storj_chunks.device_ids)
  ));

-- Device Timeslice Stats: Access through device ownership
CREATE POLICY "Users can view own device timeslice stats"
  ON device_timeslice_stats FOR SELECT
  USING (EXISTS (
    SELECT 1 FROM devices
    WHERE devices.id = device_timeslice_stats.device_id
    AND devices.user_id = auth.uid()
  ));

-- Coherence Stats: Access if user owns either device
CREATE POLICY "Users can view coherence stats for their devices"
  ON coherence_stats FOR SELECT
  USING (EXISTS (
    SELECT 1 FROM devices
    WHERE devices.user_id = auth.uid()
    AND (devices.id = coherence_stats.device_a_id OR devices.id = coherence_stats.device_b_id)
  ));

-- Maintenance Log: Admin only
CREATE POLICY "Admins can view maintenance log"
  ON maintenance_log FOR SELECT
  USING (EXISTS (
    SELECT 1 FROM profiles
    WHERE profiles.id = auth.uid()
    AND profiles.role = 'admin'
  ));
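
These policies also govern what the web app can see over Realtime: postgres_changes subscriptions only deliver rows the session is allowed to SELECT (assuming entropy_samples has been added to the supabase_realtime publication). A sketch:

// Live feed of new samples; RLS filters to the user's own devices.
const channel = supabase
  .channel('entropy-feed')
  .on(
    'postgres_changes',
    { event: 'INSERT', schema: 'public', table: 'entropy_samples' },
    (payload) => console.log('new sample', payload.new),
  )
  .subscribe();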

Environment Variables Needed

Web App (apps/web/.env):

VITE_SUPABASE_URL=https://[PROJECT_REF].supabase.co
VITE_SUPABASE_ANON_KEY=eyJ...

Analyzer (apps/analyzer/.env):

DATABASE_URL=postgresql://postgres.[PROJECT_REF]:[PASSWORD]@aws-0-[REGION].pooler.supabase.com:6543/postgres
