Skip to main content

Neon Database Migration Guide

Step-by-step guide to migrate StemBlock AI from DigitalOcean PostgreSQL to Neon, including RAG/pgvector migration, directUrl support, and HNSW index upgrade.

Prerequisites

  • Neon account (sign up at https://neon.tech)
  • Access to DigitalOcean database credentials
  • PostgreSQL client tools (psql, pg_dump)
  • Node.js and npm installed

Step 1: Create Neon Project

  1. Go to https://console.neon.tech
  2. Click "Create a project"
  3. Configure:
    • Project name: stemblock-production (or stemblock-staging)
    • Postgres version: 16 (or latest available)
    • Region: Choose closest to your users
  4. Click "Create project"
  5. Copy both connection strings — pooled and direct

Neon Connection String Formats

# Pooled (for application runtime — goes through PgBouncer)
postgresql://[user]:[password]@[endpoint]-pooler.[region].aws.neon.tech/[database]?sslmode=require

# Direct (for migrations and schema changes — bypasses PgBouncer)
postgresql://[user]:[password]@[endpoint].[region].aws.neon.tech/[database]?sslmode=require

You need both. The pooled connection is for app runtime; the direct connection is for Prisma migrations.


Step 2: Configure Cost Controls

Before migrating data, set spending limits:

  1. In Neon Console, go to Settings > Billing
  2. Set Autoscaling limits:
    • Minimum: 0.25 CU (allows scale-to-zero)
    • Maximum: 2 CU (cost ceiling for dev/staging)
    • For production: 4 CU recommended
  3. Set Auto-suspend delay: 5 minutes (default)
  4. Enable Spending alerts at your budget threshold
EnvironmentMin CUMax CUPlan
Development0.251Free
Staging0.252Launch ($19/mo)
Production0.54Scale ($69/mo)

Step 3: Export Data from DigitalOcean

3.1 Get DigitalOcean Connection Details

From DigitalOcean dashboard, get: Host, Port, Database name, Username, Password.

3.2 Export Schema and Data

# Set environment variables
export DO_HOST="your-do-host.db.ondigitalocean.com"
export DO_PORT="25060"
export DO_USER="stemblock"
export DO_DB="stemblock_db"
export DO_PASSWORD="your-password"

# Export full database (schema + data)
PGPASSWORD=$DO_PASSWORD pg_dump \
-h $DO_HOST \
-p $DO_PORT \
-U $DO_USER \
-d $DO_DB \
-F c \
-f stemblock_backup.dump

# Verify export size
ls -lh stemblock_backup.dump

3.3 Alternative: Export as SQL

PGPASSWORD=$DO_PASSWORD pg_dump \
-h $DO_HOST \
-p $DO_PORT \
-U $DO_USER \
-d $DO_DB \
--no-owner \
--no-privileges \
> stemblock_backup.sql

Step 4: Import Data to Neon

  1. Go to Neon Console > Your Project > Import
  2. Upload your dump file or provide source connection string
  3. Follow the guided import process

4.2 Using pg_restore (Custom format)

# Use DIRECT connection (not pooled) for data import
export NEON_DIRECT_URL="postgresql://user:pass@ep-xyz.us-east-2.aws.neon.tech/dbname?sslmode=require"

pg_restore \
-d "$NEON_DIRECT_URL" \
--no-owner \
--no-privileges \
--clean \
--if-exists \
stemblock_backup.dump

4.3 Using psql (SQL format)

psql "$NEON_DIRECT_URL" < stemblock_backup.sql

Step 5: Verify Core Migration

5.1 Check Table Counts

psql "$NEON_DIRECT_URL" -c "
SELECT schemaname, tablename, n_live_tup
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;
"

5.2 Verify ENUMs Created

psql "$NEON_DIRECT_URL" -c "
SELECT typname FROM pg_type WHERE typtype = 'e' ORDER BY typname;
"

5.3 Verify Indexes

psql "$NEON_DIRECT_URL" -c "
SELECT indexname, tablename
FROM pg_indexes
WHERE schemaname = 'public'
ORDER BY tablename;
"

Step 6: Verify RAG / pgvector Migration

Neon has native pgvector support — no extra installation needed.

6.1 Verify pgvector Extension

psql "$NEON_DIRECT_URL" -c "SELECT * FROM pg_extension WHERE extname = 'vector';"

If missing (shouldn't be after pg_restore):

psql "$NEON_DIRECT_URL" -c "CREATE EXTENSION IF NOT EXISTS vector;"

6.2 Verify RAG Tables

psql "$NEON_DIRECT_URL" -c "
SELECT 'document_embeddings' AS table_name, COUNT(*) FROM document_embeddings
UNION ALL
SELECT 'ai_feedback', COUNT(*) FROM ai_feedback
UNION ALL
SELECT 'fine_tuning_jobs', COUNT(*) FROM fine_tuning_jobs;
"

6.3 Verify Vector Index

psql "$NEON_DIRECT_URL" -c "
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'document_embeddings'
AND indexdef LIKE '%vector%';
"
psql "$NEON_DIRECT_URL" -c "
SELECT id, 1 - (embedding <=> (SELECT embedding FROM document_embeddings LIMIT 1)) AS score
FROM document_embeddings
ORDER BY embedding <=> (SELECT embedding FROM document_embeddings LIMIT 1)
LIMIT 5;
"

6.5 Upgrade IVFFlat → HNSW Index

The codebase includes a migration (20260227100000_upgrade_ivfflat_to_hnsw_index) that upgrades the vector index. If running Prisma migrations, this runs automatically. To apply manually:

psql "$NEON_DIRECT_URL" -c "
-- Drop old IVFFlat index
DROP INDEX IF EXISTS document_embeddings_embedding_idx;

-- Create HNSW index (better recall for dynamic datasets)
CREATE INDEX document_embeddings_embedding_idx ON document_embeddings
USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);
"

Why HNSW over IVFFlat:

FactorIVFFlat (old)HNSW (new)
Recall~95%~99%
Dynamic insertsNeeds periodic re-indexingHandles inserts well
Best forStatic datasetsGrowing datasets (RAG + feedback loop)
Build timeFasterSlower (one-time cost)

Step 7: Update Application Configuration

7.1 Prisma Schema Change (Already Applied)

The Prisma schema now supports Neon's dual-connection architecture:

datasource db {
provider = "postgresql"
url = env("DATABASE_URL") // Pooled (app runtime)
directUrl = env("DIRECT_DATABASE_URL") // Direct (migrations)
}

How it works:

  • DATABASE_URL (pooled) — used by the running application for all queries, goes through PgBouncer
  • DIRECT_DATABASE_URL (direct) — used only by prisma migrate and prisma db push, bypasses PgBouncer

7.2 Update Environment Variables

# Before (DigitalOcean — single connection)
# DATABASE_URL="postgresql://stemblock:password@host.db.ondigitalocean.com:25060/stemblock_db?sslmode=require"

# After (Neon — dual connections)
DATABASE_URL="postgresql://user:pass@ep-xyz-pooler.us-east-2.aws.neon.tech/stemblock_db?sslmode=require"
DIRECT_DATABASE_URL="postgresql://user:pass@ep-xyz.us-east-2.aws.neon.tech/stemblock_db?sslmode=require"

Note the difference: pooled URL has -pooler in the hostname, direct URL does not.

7.3 Update All Environments

Update these environment variables in:

  • Local .env file
  • Staging deployment platform
  • Production deployment platform
  • CI/CD pipeline (for migration jobs)

Step 8: Run Prisma Migrations on Neon

cd stemblockai-backend

# Verify Prisma can connect via direct URL
npx prisma db pull

# Deploy all pending migrations (including HNSW index upgrade)
npx prisma migrate deploy

# Generate fresh Prisma client
npx prisma generate

Verify Migration Applied

# Check HNSW index was created
psql "$NEON_DIRECT_URL" -c "
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'document_embeddings'
AND indexdef LIKE '%hnsw%';
"

Expected output:

document_embeddings_embedding_idx | CREATE INDEX ... USING hnsw (embedding vector_cosine_ops) WITH (m=16, ef_construction=64)

Step 9: Test Application

# Start the application
npm run start:dev

# Test health endpoint
curl http://localhost:3001/health

# Test a database query via API
curl http://localhost:3001/api/v1/users/me \
-H "Authorization: Bearer YOUR_JWT_TOKEN"

RAG-Specific Verification

Trigger an AI evaluation to verify the full RAG pipeline works:

  1. Create a test submission
  2. Call POST /api/v1/evaluations/generate/\{submissionId\}
  3. Check logs for RAG context retrieval (or graceful skip if no documents ingested)

Step 10: Handle Cold Starts

Neon scales to zero after 5 minutes of inactivity. First connection after idle takes 500ms-2s.

Impact on RAG

The first RAG query after cold start will:

  1. Wake the database (~500ms-2s)
  2. Load pgvector extension
  3. Execute vector search

This is acceptable for background AI evaluation tasks.

Option A: Accept Cold Starts (Dev/Staging)

No action needed. First request will be slower.

Option B: Keep-Alive (Production)

Ensure your monitoring pings /health every 4 minutes:

# Example cron or uptime monitor
curl https://api.stemblock.ai/health

Option C: Minimum Compute (Premium)

Set minimum CU to 0.5 in Neon Console (prevents scale-to-zero, higher cost).


Step 11: Storage Planning for RAG

RAG data storage per document chunk:

  • content: TEXT (~1KB)
  • metadata: JSONB (~200 bytes)
  • embedding: vector(768) = 3,072 bytes
  • Total per chunk: ~4.3 KB
Neon PlanStorageEst. RAG Capacity
Free512 MB~120K chunks
Launch ($19/mo)10 GB~2.3M chunks
Scale ($69/mo)50 GB~11.5M chunks

For curriculum standards + rubrics + feedback corrections, even the Free tier is sufficient for early stages.


Step 12: Validate and Cutover

12.1 Run Full Test Suite

cd stemblockai-backend
npm run test
npm run test:e2e

12.2 Smoke Test Critical Paths

  • User login/registration
  • Assignment creation
  • Submission upload
  • AI evaluation (verifies RAG pipeline + Neon + pgvector)
  • Parent view
  • English writing evaluation

12.3 Production Cutover Checklist

  • All tests pass on Neon
  • Cold start latency acceptable
  • HNSW index verified
  • DATABASE_URL (pooled) set in production
  • DIRECT_DATABASE_URL (direct) set in production
  • Environment variables updated in all deployments
  • Team notified of migration
  • Rollback plan documented (keep DO running 1 week)

Step 13: Cleanup

After 1 week of successful operation:

  1. Verify no issues in production logs
  2. Delete DigitalOcean database to stop charges
  3. Update documentation with new connection info
  4. Archive backup file to secure storage

Rollback Plan

If issues occur:

  1. Change DATABASE_URL back to DigitalOcean connection string
  2. Remove DIRECT_DATABASE_URL (not needed for DO)
  3. Redeploy application

Note: Data created in Neon after migration won't be in DigitalOcean. Plan cutover during low-traffic period.


Troubleshooting

Connection Timeout

Error: Connection timeout

Solution: Add pool timeout to the pooled connection string:

?pool_timeout=30&connection_limit=10

Prepared Statement Error

Error: prepared statement 's0' already exists

Solution: This occurs with PgBouncer in session pooling mode. Neon's default pooled connection should handle this. Ensure you're using the -pooler URL for DATABASE_URL.

SSL Certificate Error

Error: self signed certificate

Solution: Ensure ?sslmode=require is in both connection strings.

Migration Failed

Error: Migration failed

Solution:

  1. Ensure you're using DIRECT_DATABASE_URL (not pooled) for migrations
  2. Check Neon Console — compute may be suspended
  3. Try: npx prisma migrate reset (dev only!)

pgvector Extension Missing

Error: type "vector" does not exist

Solution:

psql "$NEON_DIRECT_URL" -c "CREATE EXTENSION IF NOT EXISTS vector;"
npx prisma migrate deploy

Vector Search Returns No Results

Possible causes:

  1. No documents ingested yet — RAG gracefully handles this
  2. HNSW index not created — check with \di in psql
  3. Embedding dimensions mismatch — verify vector(768) matches outputDimensionality: 768

Cost Monitoring

Check Usage in Neon Console

  1. Go to Usage tab
  2. Monitor:
    • Compute hours used
    • Storage GB (watch RAG document growth)
    • Data transfer

Set Up Billing Alerts

  1. Go to Settings > Billing
  2. Set alert threshold (e.g., 80% of budget)
  3. Add email notification

References


Document Version: 2.0 Last Updated: February 27, 2026 Previous Location: stemblockai-docs/NEON_MIGRATION_GUIDE.md