Neon Database Migration Guide

Step-by-step guide to migrate StemBlock AI from DigitalOcean PostgreSQL to Neon, including RAG/pgvector migration, directUrl support, and HNSW index upgrade.

Prerequisites

Neon account (sign up at https://neon.tech)
Access to DigitalOcean database credentials
PostgreSQL client tools (psql, pg_dump)
Node.js and npm installed

Step 1: Create Neon Project

Go to https://console.neon.tech
Click "Create a project"
Configure:
- Project name: stemblock-production (or stemblock-staging)
- Postgres version: 16 (or latest available)
- Region: Choose closest to your users
Click "Create project"
Copy both connection strings — pooled and direct

Neon Connection String Formats

# Pooled (for application runtime — goes through PgBouncer)
postgresql://[user]:[password]@[endpoint]-pooler.[region].aws.neon.tech/[database]?sslmode=require

# Direct (for migrations and schema changes — bypasses PgBouncer)
postgresql://[user]:[password]@[endpoint].[region].aws.neon.tech/[database]?sslmode=require

You need both. The pooled connection is for app runtime; the direct connection is for Prisma migrations.

Step 2: Configure Cost Controls

Before migrating data, set spending limits:

In Neon Console, go to Settings > Billing
Set Autoscaling limits:
- Minimum: 0.25 CU (allows scale-to-zero)
- Maximum: 2 CU (cost ceiling for dev/staging)
- For production: 4 CU recommended
Set Auto-suspend delay: 5 minutes (default)
Enable Spending alerts at your budget threshold

Recommended Settings by Environment

Environment	Min CU	Max CU	Plan
Development	0.25	1	Free
Staging	0.25	2	Launch ($19/mo)
Production	0.5	4	Scale ($69/mo)

Step 3: Export Data from DigitalOcean

3.1 Get DigitalOcean Connection Details

From DigitalOcean dashboard, get: Host, Port, Database name, Username, Password.

3.2 Export Schema and Data

# Set environment variables
export DO_HOST="your-do-host.db.ondigitalocean.com"
export DO_PORT="25060"
export DO_USER="stemblock"
export DO_DB="stemblock_db"
export DO_PASSWORD="your-password"

# Export full database (schema + data)
PGPASSWORD=$DO_PASSWORD pg_dump \
  -h $DO_HOST \
  -p $DO_PORT \
  -U $DO_USER \
  -d $DO_DB \
  -F c \
  -f stemblock_backup.dump

# Verify export size
ls -lh stemblock_backup.dump

3.3 Alternative: Export as SQL

PGPASSWORD=$DO_PASSWORD pg_dump \
  -h $DO_HOST \
  -p $DO_PORT \
  -U $DO_USER \
  -d $DO_DB \
  --no-owner \
  --no-privileges \
  > stemblock_backup.sql

Step 4: Import Data to Neon

4.1 Using Neon Import Tool (Recommended for large DBs)

Go to Neon Console > Your Project > Import
Upload your dump file or provide source connection string
Follow the guided import process

4.2 Using pg_restore (Custom format)

# Use DIRECT connection (not pooled) for data import
export NEON_DIRECT_URL="postgresql://user:pass@ep-xyz.us-east-2.aws.neon.tech/dbname?sslmode=require"

pg_restore \
  -d "$NEON_DIRECT_URL" \
  --no-owner \
  --no-privileges \
  --clean \
  --if-exists \
  stemblock_backup.dump

4.3 Using psql (SQL format)

psql "$NEON_DIRECT_URL" < stemblock_backup.sql

Step 5: Verify Core Migration

5.1 Check Table Counts

psql "$NEON_DIRECT_URL" -c "
SELECT schemaname, tablename, n_live_tup
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;
"

5.2 Verify ENUMs Created

psql "$NEON_DIRECT_URL" -c "
SELECT typname FROM pg_type WHERE typtype = 'e' ORDER BY typname;
"

5.3 Verify Indexes

psql "$NEON_DIRECT_URL" -c "
SELECT indexname, tablename
FROM pg_indexes
WHERE schemaname = 'public'
ORDER BY tablename;
"

Step 6: Verify RAG / pgvector Migration

Neon has native pgvector support — no extra installation needed.

6.1 Verify pgvector Extension

psql "$NEON_DIRECT_URL" -c "SELECT * FROM pg_extension WHERE extname = 'vector';"

If missing (shouldn't be after pg_restore):

psql "$NEON_DIRECT_URL" -c "CREATE EXTENSION IF NOT EXISTS vector;"

6.2 Verify RAG Tables

psql "$NEON_DIRECT_URL" -c "
SELECT 'document_embeddings' AS table_name, COUNT(*) FROM document_embeddings
UNION ALL
SELECT 'ai_feedback', COUNT(*) FROM ai_feedback
UNION ALL
SELECT 'fine_tuning_jobs', COUNT(*) FROM fine_tuning_jobs;
"

6.3 Verify Vector Index

psql "$NEON_DIRECT_URL" -c "
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'document_embeddings'
  AND indexdef LIKE '%vector%';
"

6.4 Test Vector Similarity Search

psql "$NEON_DIRECT_URL" -c "
SELECT id, 1 - (embedding <=> (SELECT embedding FROM document_embeddings LIMIT 1)) AS score
FROM document_embeddings
ORDER BY embedding <=> (SELECT embedding FROM document_embeddings LIMIT 1)
LIMIT 5;
"

6.5 Upgrade IVFFlat → HNSW Index

The codebase includes a migration (20260227100000_upgrade_ivfflat_to_hnsw_index) that upgrades the vector index. If running Prisma migrations, this runs automatically. To apply manually:

psql "$NEON_DIRECT_URL" -c "
-- Drop old IVFFlat index
DROP INDEX IF EXISTS document_embeddings_embedding_idx;

-- Create HNSW index (better recall for dynamic datasets)
CREATE INDEX document_embeddings_embedding_idx ON document_embeddings
USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);
"

Why HNSW over IVFFlat:

Factor	IVFFlat (old)	HNSW (new)
Recall	~95%	~99%
Dynamic inserts	Needs periodic re-indexing	Handles inserts well
Best for	Static datasets	Growing datasets (RAG + feedback loop)
Build time	Faster	Slower (one-time cost)

Step 7: Update Application Configuration

7.1 Prisma Schema Change (Already Applied)

The Prisma schema now supports Neon's dual-connection architecture:

datasource db {
  provider  = "postgresql"
  url       = env("DATABASE_URL")          // Pooled (app runtime)
  directUrl = env("DIRECT_DATABASE_URL")   // Direct (migrations)
}

How it works:

DATABASE_URL (pooled) — used by the running application for all queries, goes through PgBouncer
DIRECT_DATABASE_URL (direct) — used only by prisma migrate and prisma db push, bypasses PgBouncer

7.2 Update Environment Variables

# Before (DigitalOcean — single connection)
# DATABASE_URL="postgresql://stemblock:password@host.db.ondigitalocean.com:25060/stemblock_db?sslmode=require"

# After (Neon — dual connections)
DATABASE_URL="postgresql://user:pass@ep-xyz-pooler.us-east-2.aws.neon.tech/stemblock_db?sslmode=require"
DIRECT_DATABASE_URL="postgresql://user:pass@ep-xyz.us-east-2.aws.neon.tech/stemblock_db?sslmode=require"

Note the difference: pooled URL has -pooler in the hostname, direct URL does not.

7.3 Update All Environments

Update these environment variables in:

Local .env file
Staging deployment platform
Production deployment platform
CI/CD pipeline (for migration jobs)

Step 8: Run Prisma Migrations on Neon

cd stemblockai-backend

# Verify Prisma can connect via direct URL
npx prisma db pull

# Deploy all pending migrations (including HNSW index upgrade)
npx prisma migrate deploy

# Generate fresh Prisma client
npx prisma generate

Verify Migration Applied

# Check HNSW index was created
psql "$NEON_DIRECT_URL" -c "
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'document_embeddings'
  AND indexdef LIKE '%hnsw%';
"

Expected output:

document_embeddings_embedding_idx | CREATE INDEX ... USING hnsw (embedding vector_cosine_ops) WITH (m=16, ef_construction=64)

Step 9: Test Application

# Start the application
npm run start:dev

# Test health endpoint
curl http://localhost:3001/health

# Test a database query via API
curl http://localhost:3001/api/v1/users/me \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

RAG-Specific Verification

Trigger an AI evaluation to verify the full RAG pipeline works:

Create a test submission
Call POST /api/v1/evaluations/generate/\{submissionId\}
Check logs for RAG context retrieval (or graceful skip if no documents ingested)

Step 10: Handle Cold Starts

Neon scales to zero after 5 minutes of inactivity. First connection after idle takes 500ms-2s.

Impact on RAG

The first RAG query after cold start will:

Wake the database (~500ms-2s)
Load pgvector extension
Execute vector search

This is acceptable for background AI evaluation tasks.

Option A: Accept Cold Starts (Dev/Staging)

No action needed. First request will be slower.

Option B: Keep-Alive (Production)

Ensure your monitoring pings /health every 4 minutes:

# Example cron or uptime monitor
curl https://api.stemblock.ai/health

Option C: Minimum Compute (Premium)

Set minimum CU to 0.5 in Neon Console (prevents scale-to-zero, higher cost).

Step 11: Storage Planning for RAG

RAG data storage per document chunk:

content: TEXT (~1KB)
metadata: JSONB (~200 bytes)
embedding: vector(768) = 3,072 bytes
Total per chunk: ~4.3 KB

Neon Plan	Storage	Est. RAG Capacity
Free	512 MB	~120K chunks
Launch ($19/mo)	10 GB	~2.3M chunks
Scale ($69/mo)	50 GB	~11.5M chunks

For curriculum standards + rubrics + feedback corrections, even the Free tier is sufficient for early stages.

Step 12: Validate and Cutover

12.1 Run Full Test Suite

cd stemblockai-backend
npm run test
npm run test:e2e

12.2 Smoke Test Critical Paths

12.3 Production Cutover Checklist

All tests pass on Neon
Cold start latency acceptable
HNSW index verified
DATABASE_URL (pooled) set in production
DIRECT_DATABASE_URL (direct) set in production
Environment variables updated in all deployments
Team notified of migration
Rollback plan documented (keep DO running 1 week)

Step 13: Cleanup

After 1 week of successful operation:

Verify no issues in production logs
Delete DigitalOcean database to stop charges
Update documentation with new connection info
Archive backup file to secure storage

Rollback Plan

If issues occur:

Change DATABASE_URL back to DigitalOcean connection string
Remove DIRECT_DATABASE_URL (not needed for DO)
Redeploy application

Note: Data created in Neon after migration won't be in DigitalOcean. Plan cutover during low-traffic period.

Troubleshooting

Connection Timeout

Error: Connection timeout

Solution: Add pool timeout to the pooled connection string:

?pool_timeout=30&connection_limit=10

Prepared Statement Error

Error: prepared statement 's0' already exists

Solution: This occurs with PgBouncer in session pooling mode. Neon's default pooled connection should handle this. Ensure you're using the -pooler URL for DATABASE_URL.

SSL Certificate Error

Error: self signed certificate

Solution: Ensure ?sslmode=require is in both connection strings.

Migration Failed

Error: Migration failed

Solution:

Ensure you're using DIRECT_DATABASE_URL (not pooled) for migrations
Check Neon Console — compute may be suspended
Try: npx prisma migrate reset (dev only!)

pgvector Extension Missing

Error: type "vector" does not exist

Solution:

psql "$NEON_DIRECT_URL" -c "CREATE EXTENSION IF NOT EXISTS vector;"
npx prisma migrate deploy

Vector Search Returns No Results

Possible causes:

No documents ingested yet — RAG gracefully handles this
HNSW index not created — check with \di in psql
Embedding dimensions mismatch — verify vector(768) matches outputDimensionality: 768

Cost Monitoring

Check Usage in Neon Console

Go to Usage tab
Monitor:
- Compute hours used
- Storage GB (watch RAG document growth)
- Data transfer

Set Up Billing Alerts

Go to Settings > Billing
Set alert threshold (e.g., 80% of budget)
Add email notification

References

Document Version: 2.0 Last Updated: February 27, 2026 Previous Location: stemblockai-docs/NEON_MIGRATION_GUIDE.md

Prerequisites​

Step 1: Create Neon Project​

Neon Connection String Formats​

Step 2: Configure Cost Controls​

Recommended Settings by Environment​

Step 3: Export Data from DigitalOcean​

3.1 Get DigitalOcean Connection Details​

3.2 Export Schema and Data​

3.3 Alternative: Export as SQL​

Step 4: Import Data to Neon​

4.1 Using Neon Import Tool (Recommended for large DBs)​

4.2 Using pg_restore (Custom format)​

4.3 Using psql (SQL format)​

Step 5: Verify Core Migration​

5.1 Check Table Counts​

5.2 Verify ENUMs Created​

5.3 Verify Indexes​

Step 6: Verify RAG / pgvector Migration​

6.1 Verify pgvector Extension​

6.2 Verify RAG Tables​

6.3 Verify Vector Index​

6.4 Test Vector Similarity Search​

6.5 Upgrade IVFFlat → HNSW Index​

Step 7: Update Application Configuration​

7.1 Prisma Schema Change (Already Applied)​

7.2 Update Environment Variables​

7.3 Update All Environments​

Step 8: Run Prisma Migrations on Neon​

Verify Migration Applied​

Step 9: Test Application​

RAG-Specific Verification​

Step 10: Handle Cold Starts​

Impact on RAG​

Option A: Accept Cold Starts (Dev/Staging)​

Option B: Keep-Alive (Production)​

Option C: Minimum Compute (Premium)​

Step 11: Storage Planning for RAG​

Step 12: Validate and Cutover​

12.1 Run Full Test Suite​

12.2 Smoke Test Critical Paths​

12.3 Production Cutover Checklist​

Step 13: Cleanup​

Rollback Plan​

Troubleshooting​

Connection Timeout​

Prepared Statement Error​

SSL Certificate Error​

Migration Failed​

pgvector Extension Missing​

Vector Search Returns No Results​

Cost Monitoring​

Check Usage in Neon Console​

Set Up Billing Alerts​

References​