Neon Database Migration Guide
Step-by-step guide to migrate StemBlock AI from DigitalOcean PostgreSQL to Neon, including RAG/pgvector migration, directUrl support, and HNSW index upgrade.
Prerequisites
- Neon account (sign up at https://neon.tech)
- Access to DigitalOcean database credentials
- PostgreSQL client tools (
psql,pg_dump) - Node.js and npm installed
Step 1: Create Neon Project
- Go to https://console.neon.tech
- Click "Create a project"
- Configure:
- Project name:
stemblock-production(orstemblock-staging) - Postgres version: 16 (or latest available)
- Region: Choose closest to your users
- Project name:
- Click "Create project"
- Copy both connection strings — pooled and direct
Neon Connection String Formats
# Pooled (for application runtime — goes through PgBouncer)
postgresql://[user]:[password]@[endpoint]-pooler.[region].aws.neon.tech/[database]?sslmode=require
# Direct (for migrations and schema changes — bypasses PgBouncer)
postgresql://[user]:[password]@[endpoint].[region].aws.neon.tech/[database]?sslmode=require
You need both. The pooled connection is for app runtime; the direct connection is for Prisma migrations.
Step 2: Configure Cost Controls
Before migrating data, set spending limits:
- In Neon Console, go to Settings > Billing
- Set Autoscaling limits:
- Minimum: 0.25 CU (allows scale-to-zero)
- Maximum: 2 CU (cost ceiling for dev/staging)
- For production: 4 CU recommended
- Set Auto-suspend delay: 5 minutes (default)
- Enable Spending alerts at your budget threshold
Recommended Settings by Environment
| Environment | Min CU | Max CU | Plan |
|---|---|---|---|
| Development | 0.25 | 1 | Free |
| Staging | 0.25 | 2 | Launch ($19/mo) |
| Production | 0.5 | 4 | Scale ($69/mo) |
Step 3: Export Data from DigitalOcean
3.1 Get DigitalOcean Connection Details
From DigitalOcean dashboard, get: Host, Port, Database name, Username, Password.
3.2 Export Schema and Data
# Set environment variables
export DO_HOST="your-do-host.db.ondigitalocean.com"
export DO_PORT="25060"
export DO_USER="stemblock"
export DO_DB="stemblock_db"
export DO_PASSWORD="your-password"
# Export full database (schema + data)
PGPASSWORD=$DO_PASSWORD pg_dump \
-h $DO_HOST \
-p $DO_PORT \
-U $DO_USER \
-d $DO_DB \
-F c \
-f stemblock_backup.dump
# Verify export size
ls -lh stemblock_backup.dump
3.3 Alternative: Export as SQL
PGPASSWORD=$DO_PASSWORD pg_dump \
-h $DO_HOST \
-p $DO_PORT \
-U $DO_USER \
-d $DO_DB \
--no-owner \
--no-privileges \
> stemblock_backup.sql
Step 4: Import Data to Neon
4.1 Using Neon Import Tool (Recommended for large DBs)
- Go to Neon Console > Your Project > Import
- Upload your dump file or provide source connection string
- Follow the guided import process
4.2 Using pg_restore (Custom format)
# Use DIRECT connection (not pooled) for data import
export NEON_DIRECT_URL="postgresql://user:pass@ep-xyz.us-east-2.aws.neon.tech/dbname?sslmode=require"
pg_restore \
-d "$NEON_DIRECT_URL" \
--no-owner \
--no-privileges \
--clean \
--if-exists \
stemblock_backup.dump
4.3 Using psql (SQL format)
psql "$NEON_DIRECT_URL" < stemblock_backup.sql
Step 5: Verify Core Migration
5.1 Check Table Counts
psql "$NEON_DIRECT_URL" -c "
SELECT schemaname, tablename, n_live_tup
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;
"
5.2 Verify ENUMs Created
psql "$NEON_DIRECT_URL" -c "
SELECT typname FROM pg_type WHERE typtype = 'e' ORDER BY typname;
"
5.3 Verify Indexes
psql "$NEON_DIRECT_URL" -c "
SELECT indexname, tablename
FROM pg_indexes
WHERE schemaname = 'public'
ORDER BY tablename;
"
Step 6: Verify RAG / pgvector Migration
Neon has native pgvector support — no extra installation needed.
6.1 Verify pgvector Extension
psql "$NEON_DIRECT_URL" -c "SELECT * FROM pg_extension WHERE extname = 'vector';"
If missing (shouldn't be after pg_restore):
psql "$NEON_DIRECT_URL" -c "CREATE EXTENSION IF NOT EXISTS vector;"
6.2 Verify RAG Tables
psql "$NEON_DIRECT_URL" -c "
SELECT 'document_embeddings' AS table_name, COUNT(*) FROM document_embeddings
UNION ALL
SELECT 'ai_feedback', COUNT(*) FROM ai_feedback
UNION ALL
SELECT 'fine_tuning_jobs', COUNT(*) FROM fine_tuning_jobs;
"
6.3 Verify Vector Index
psql "$NEON_DIRECT_URL" -c "
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'document_embeddings'
AND indexdef LIKE '%vector%';
"
6.4 Test Vector Similarity Search
psql "$NEON_DIRECT_URL" -c "
SELECT id, 1 - (embedding <=> (SELECT embedding FROM document_embeddings LIMIT 1)) AS score
FROM document_embeddings
ORDER BY embedding <=> (SELECT embedding FROM document_embeddings LIMIT 1)
LIMIT 5;
"
6.5 Upgrade IVFFlat → HNSW Index
The codebase includes a migration (20260227100000_upgrade_ivfflat_to_hnsw_index) that upgrades the vector index. If running Prisma migrations, this runs automatically. To apply manually:
psql "$NEON_DIRECT_URL" -c "
-- Drop old IVFFlat index
DROP INDEX IF EXISTS document_embeddings_embedding_idx;
-- Create HNSW index (better recall for dynamic datasets)
CREATE INDEX document_embeddings_embedding_idx ON document_embeddings
USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);
"
Why HNSW over IVFFlat:
| Factor | IVFFlat (old) | HNSW (new) |
|---|---|---|
| Recall | ~95% | ~99% |
| Dynamic inserts | Needs periodic re-indexing | Handles inserts well |
| Best for | Static datasets | Growing datasets (RAG + feedback loop) |
| Build time | Faster | Slower (one-time cost) |
Step 7: Update Application Configuration
7.1 Prisma Schema Change (Already Applied)
The Prisma schema now supports Neon's dual-connection architecture:
datasource db {
provider = "postgresql"
url = env("DATABASE_URL") // Pooled (app runtime)
directUrl = env("DIRECT_DATABASE_URL") // Direct (migrations)
}
How it works:
DATABASE_URL(pooled) — used by the running application for all queries, goes through PgBouncerDIRECT_DATABASE_URL(direct) — used only byprisma migrateandprisma db push, bypasses PgBouncer
7.2 Update Environment Variables
# Before (DigitalOcean — single connection)
# DATABASE_URL="postgresql://stemblock:password@host.db.ondigitalocean.com:25060/stemblock_db?sslmode=require"
# After (Neon — dual connections)
DATABASE_URL="postgresql://user:pass@ep-xyz-pooler.us-east-2.aws.neon.tech/stemblock_db?sslmode=require"
DIRECT_DATABASE_URL="postgresql://user:pass@ep-xyz.us-east-2.aws.neon.tech/stemblock_db?sslmode=require"
Note the difference: pooled URL has -pooler in the hostname, direct URL does not.
7.3 Update All Environments
Update these environment variables in:
- Local
.envfile - Staging deployment platform
- Production deployment platform
- CI/CD pipeline (for migration jobs)
Step 8: Run Prisma Migrations on Neon
cd stemblockai-backend
# Verify Prisma can connect via direct URL
npx prisma db pull
# Deploy all pending migrations (including HNSW index upgrade)
npx prisma migrate deploy
# Generate fresh Prisma client
npx prisma generate
Verify Migration Applied
# Check HNSW index was created
psql "$NEON_DIRECT_URL" -c "
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'document_embeddings'
AND indexdef LIKE '%hnsw%';
"
Expected output:
document_embeddings_embedding_idx | CREATE INDEX ... USING hnsw (embedding vector_cosine_ops) WITH (m=16, ef_construction=64)
Step 9: Test Application
# Start the application
npm run start:dev
# Test health endpoint
curl http://localhost:3001/health
# Test a database query via API
curl http://localhost:3001/api/v1/users/me \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
RAG-Specific Verification
Trigger an AI evaluation to verify the full RAG pipeline works:
- Create a test submission
- Call
POST /api/v1/evaluations/generate/\{submissionId\} - Check logs for RAG context retrieval (or graceful skip if no documents ingested)
Step 10: Handle Cold Starts
Neon scales to zero after 5 minutes of inactivity. First connection after idle takes 500ms-2s.
Impact on RAG
The first RAG query after cold start will:
- Wake the database (~500ms-2s)
- Load pgvector extension
- Execute vector search
This is acceptable for background AI evaluation tasks.
Option A: Accept Cold Starts (Dev/Staging)
No action needed. First request will be slower.
Option B: Keep-Alive (Production)
Ensure your monitoring pings /health every 4 minutes:
# Example cron or uptime monitor
curl https://api.stemblock.ai/health
Option C: Minimum Compute (Premium)
Set minimum CU to 0.5 in Neon Console (prevents scale-to-zero, higher cost).
Step 11: Storage Planning for RAG
RAG data storage per document chunk:
content: TEXT (~1KB)metadata: JSONB (~200 bytes)embedding: vector(768) = 3,072 bytes- Total per chunk: ~4.3 KB
| Neon Plan | Storage | Est. RAG Capacity |
|---|---|---|
| Free | 512 MB | ~120K chunks |
| Launch ($19/mo) | 10 GB | ~2.3M chunks |
| Scale ($69/mo) | 50 GB | ~11.5M chunks |
For curriculum standards + rubrics + feedback corrections, even the Free tier is sufficient for early stages.
Step 12: Validate and Cutover
12.1 Run Full Test Suite
cd stemblockai-backend
npm run test
npm run test:e2e
12.2 Smoke Test Critical Paths
- User login/registration
- Assignment creation
- Submission upload
- AI evaluation (verifies RAG pipeline + Neon + pgvector)
- Parent view
- English writing evaluation
12.3 Production Cutover Checklist
- All tests pass on Neon
- Cold start latency acceptable
- HNSW index verified
-
DATABASE_URL(pooled) set in production -
DIRECT_DATABASE_URL(direct) set in production - Environment variables updated in all deployments
- Team notified of migration
- Rollback plan documented (keep DO running 1 week)
Step 13: Cleanup
After 1 week of successful operation:
- Verify no issues in production logs
- Delete DigitalOcean database to stop charges
- Update documentation with new connection info
- Archive backup file to secure storage
Rollback Plan
If issues occur:
- Change
DATABASE_URLback to DigitalOcean connection string - Remove
DIRECT_DATABASE_URL(not needed for DO) - Redeploy application
Note: Data created in Neon after migration won't be in DigitalOcean. Plan cutover during low-traffic period.
Troubleshooting
Connection Timeout
Error: Connection timeout
Solution: Add pool timeout to the pooled connection string:
?pool_timeout=30&connection_limit=10
Prepared Statement Error
Error: prepared statement 's0' already exists
Solution: This occurs with PgBouncer in session pooling mode. Neon's default pooled connection should handle this. Ensure you're using the -pooler URL for DATABASE_URL.
SSL Certificate Error
Error: self signed certificate
Solution: Ensure ?sslmode=require is in both connection strings.
Migration Failed
Error: Migration failed
Solution:
- Ensure you're using
DIRECT_DATABASE_URL(not pooled) for migrations - Check Neon Console — compute may be suspended
- Try:
npx prisma migrate reset(dev only!)
pgvector Extension Missing
Error: type "vector" does not exist
Solution:
psql "$NEON_DIRECT_URL" -c "CREATE EXTENSION IF NOT EXISTS vector;"
npx prisma migrate deploy
Vector Search Returns No Results
Possible causes:
- No documents ingested yet — RAG gracefully handles this
- HNSW index not created — check with
\diin psql - Embedding dimensions mismatch — verify
vector(768)matchesoutputDimensionality: 768
Cost Monitoring
Check Usage in Neon Console
- Go to Usage tab
- Monitor:
- Compute hours used
- Storage GB (watch RAG document growth)
- Data transfer
Set Up Billing Alerts
- Go to Settings > Billing
- Set alert threshold (e.g., 80% of budget)
- Add email notification
References
- Neon Documentation
- Prisma + Neon Guide
- Neon Connection Pooling
- Neon pgvector Guide
- pgvector HNSW Indexing
- pg_dump Documentation
Document Version: 2.0
Last Updated: February 27, 2026
Previous Location: stemblockai-docs/NEON_MIGRATION_GUIDE.md