
StemBlock AI - Monthly Operational Cost Analysis

Date: February 27, 2026
Purpose: Comprehensive monthly cost breakdown including labor, infrastructure, and AI/token usage
Audience: Finance, Operations, Leadership


Executive Summary (Optimized with Vertex AI)

| Cost Category | Monthly Cost | Annual Cost |
| --- | --- | --- |
| Employee Labor | $720 | $8,640 |
| Infrastructure (Neon Free + Hosting) | $17 | $204 |
| AI/Token Usage (Vertex AI Gemini) | $148 | $1,776 |
| TOTAL | $885 | $10,620 |

Target Budget: $1,000/month (ACHIEVED)

Infrastructure Update (Feb 2026): We are migrating from DigitalOcean Managed PostgreSQL to Neon Serverless Postgres, which supports pgvector natively. The RAG pipeline can then store vector embeddings directly in PostgreSQL (in the document_embeddings table, with HNSW indexing and 768D vectors from gemini-embedding-001), so no separate vector database service is needed. At current scale (1 customer), the Neon Free plan ($0/mo) covers our database needs, which lowers infrastructure spend from $27 to $17 per month.

Optimization Summary

| Metric | Before (Mistral + DO) | After (Vertex AI + Neon) | Savings |
| --- | --- | --- | --- |
| AI Cost/Month | $6,479 | $148 | 97.7% |
| Infrastructure/Month | $27 | $17 | 37.0% |
| Total Cost/Month | $7,226 | $885 | 87.8% |
| Annual Cost | $86,712 | $10,620 | $76,092 |
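
The savings column can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: `percent_saved` is a name made up for this example, and the inputs are the monthly figures from the table above.

```python
def percent_saved(before: float, after: float) -> float:
    """Return the reduction from `before` to `after` as a percentage of `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


# (before, after) monthly costs in USD, taken from the Optimization Summary.
rows = {
    "AI": (6479, 148),
    "Infrastructure": (27, 17),
    "Total": (7226, 885),
}

for name, (before, after) in rows.items():
    print(f"{name}: {percent_saved(before, after):.1f}% saved")

# Annual savings is the monthly difference in total cost times 12.
annual_savings = (7226 - 885) * 12
```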

1. Employee Costs

Current Staffing

| Role | Hours/Month | Hourly Rate | Monthly Cost |
| --- | --- | --- | --- |
| Employee | 40 | $18.00 | $720 |

Annual Employee Cost: $720 × 12 = $8,640

Cost Breakdown by Activity (Estimated)

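| Activity | Hours/Month | % of Time | Cost |
| --- | --- | --- | --- |
| Development & Maintenance | 16 | 40% | $288 |
| Testing & QA | 8 | 20% | $144 |
| Support & Bug Fixes | 8 | 20% | $144 |
| Documentation & Planning | 4 | 10% | $72 |
| Meetings & Communication | 4 | 10% | $72 |
| TOTAL | 40 | 100% | $720 |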

2. Infrastructure Costs (Neon + Hosting)

Current Monthly Spend

| Resource | Specification | Monthly Cost |
| --- | --- | --- |
| App Platform / Hosting | Basic tier | ~$12 |
| Database (Neon Serverless Postgres) | Free plan (100 CU-hours, 0.5GB) | $0 |
| Database (Neon Launch plan, when needed) | Pay-as-you-go ($0.106/CU-hour) | ~$15 |
| Storage / CDN | As needed | ~$5 |
| TOTAL (current, Free plan) | | ~$17 |
| TOTAL (projected, Launch plan) | | ~$32 |

Annual Infrastructure Cost (current): $17 × 12 = $204

Annual Infrastructure Cost (Launch plan): $32 × 12 = $384

Why Neon Over DigitalOcean?

| Factor | DigitalOcean Managed DB | Neon Serverless Postgres |
| --- | --- | --- |
| pgvector support | Not included | Native extension, no extra cost |
| Scaling | Fixed instance size | Autoscale 0.25–16 CU, scale-to-zero |
| RAG vector storage | Requires separate vector DB | Built-in via pgvector |
| Cost at idle | ~$10/mo (always running) | $0 (scales to zero after 5 min) |
| Cost at scale | $60+/mo (Professional tier) | $0.106–0.222/CU-hour (pay for usage) |
| Connection pooling | Not included | Built-in PgBouncer |

Neon Plan Scaling Path

| Stage | Neon Plan | Est. Monthly | Triggers |
| --- | --- | --- | --- |
| Current (1 customer) | Free | $0 | 100 CU-hours, 0.5GB sufficient |
| Growth (5–30 customers) | Launch | $15–50 | Exceed free tier compute/storage |
| Scale (50–300 customers) | Scale | $69–300 | Need >16 CU, private networking, SLAs |
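
To make the Free-to-Launch trigger concrete, here is a rough sketch of the plan decision. It uses only the figures quoted in this section (100 CU-hours and 0.5GB on Free, $0.106/CU-hour on Launch). It assumes, for illustration, that Launch bills every CU-hour at that flat rate and it ignores storage charges. `estimate_neon_compute` is a hypothetical helper, not a Neon API.

```python
FREE_CU_HOURS = 100               # Free plan compute allowance quoted above
FREE_STORAGE_GB = 0.5             # Free plan storage allowance quoted above
LAUNCH_RATE_PER_CU_HOUR = 0.106   # Launch plan rate quoted above (USD)


def estimate_neon_compute(cu_hours: float, storage_gb: float) -> tuple[str, float]:
    """Return (plan, estimated monthly compute cost in USD).

    Assumption: anything that fits the Free allowances costs $0; otherwise
    all CU-hours are billed at the Launch rate and storage is not priced.
    """
    if cu_hours <= FREE_CU_HOURS and storage_gb <= FREE_STORAGE_GB:
        return "Free", 0.0
    return "Launch", round(cu_hours * LAUNCH_RATE_PER_CU_HOUR, 2)


# A compute kept awake all month at the 0.25 CU minimum uses
# 0.25 * 730 = 182.5 CU-hours, which no longer fits the Free allowance.
always_on_plan, always_on_cost = estimate_neon_compute(0.25 * 730, 0.3)
```

Under these assumptions, about 142 CU-hours on Launch lands near the ~$15 shown in the table, and the plan decision is sensitive to how often the database is allowed to scale to zero.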

3. AI/Token Usage Costs (Optimized with Vertex AI)

Provider Configuration

| Feature | Model | Reason |
| --- | --- | --- |
| English Writing | Gemini 2.5 Pro | Higher quality for nuanced writing feedback |
| STEM Evaluation | Gemini 2.5 Flash | Fast, cost-effective for structured evaluation |
| Coach Feedback | Gemini 2.5 Flash | Fast, cost-effective |
| Parent Insights | Gemini 2.5 Flash | Fast, cost-effective |

Vertex AI Gemini Pricing (Per 1M Tokens)

| Model | Input (≤200K) | Output | Cached Input |
| --- | --- | --- | --- |
| Gemini 2.5 Pro | $1.25 | $10.00 | $0.125 |
| Gemini 2.5 Flash | $0.30 | $2.50 | $0.030 |

Optimized AI Costs (With Caching Implemented)

STEM Evaluation (Gemini 2.5 Flash)

  • Volume: 4,800 submissions/month
  • Caching: 50% hit rate (7-day cache) → 2,400 actual API calls
  • Tokens/call: ~3,000 input + ~600 output

| Component | Monthly Tokens | Rate | Cost |
| --- | --- | --- | --- |
| Input | 7.2M | $0.30/M | $2.16 |
| Output | 1.44M | $2.50/M | $3.60 |
| Subtotal | 8.64M | | $5.76 |

Cost per evaluation: $0.0024 per actual API call (vs $0.67 per submission before)
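
The STEM figures follow from volume, cache hit rate, and per-token rates. A minimal sketch, assuming cache hits incur no API charge and every call uses the average token counts listed above (`feature_cost` is a hypothetical helper, not part of the codebase):

```python
def feature_cost(volume, cache_hit_rate, in_tokens, out_tokens, in_rate, out_rate):
    """Return (actual API calls, monthly cost in USD).

    Rates are USD per 1M tokens. Cache hits are assumed to cost nothing.
    """
    calls = round(volume * (1 - cache_hit_rate))
    input_cost = calls * in_tokens / 1_000_000 * in_rate
    output_cost = calls * out_tokens / 1_000_000 * out_rate
    return calls, input_cost + output_cost


# STEM Evaluation on Gemini 2.5 Flash: 4,800 submissions, 50% cache hit rate.
stem_calls, stem_cost = feature_cost(4800, 0.50, 3000, 600, 0.30, 2.50)
stem_cost_per_call = stem_cost / stem_calls
```

The same helper applies to the other Flash features by swapping in their volume, hit rate, and token counts. Small differences from the tables can appear because the tables round token totals before pricing them.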


English Writing Workflow (Gemini 2.5 Pro)

  • Volume: 1,900 submissions/month
  • 3-stage pipeline: Moderation → Feedback → Assessment
  • Tokens/submission: ~2,400 input + ~1,800 output (4.5M input and 3.5M output across 1,900 submissions)

| Stage | Tokens (Input/Output) | Cost (Input + Output) |
| --- | --- | --- |
| Moderation | 1.0M / 0.5M | $1.25 + $5.00 |
| Feedback | 1.5M / 1.5M | $1.88 + $15.00 |
| Assessment | 2.0M / 1.5M | $2.50 + $15.00 |
| Subtotal | 4.5M / 3.5M | $40.63 |

Cost per submission: $0.021 (vs $0.88 before)
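
The stage costs above can be checked by pricing each stage's token totals at the Gemini 2.5 Pro rates. This is a sketch of the arithmetic only; the variable and function names are illustrative.

```python
PRO_INPUT_RATE = 1.25    # USD per 1M input tokens (Gemini 2.5 Pro, <=200K context)
PRO_OUTPUT_RATE = 10.00  # USD per 1M output tokens

# (input tokens, output tokens) in millions per stage, from the table above.
stages = {
    "Moderation": (1.0, 0.5),
    "Feedback": (1.5, 1.5),
    "Assessment": (2.0, 1.5),
}


def stage_cost(input_millions: float, output_millions: float) -> float:
    """Cost in USD of one pipeline stage for the month."""
    return input_millions * PRO_INPUT_RATE + output_millions * PRO_OUTPUT_RATE


# The unrounded sum is 40.625; the table rounds the Feedback input cost
# (1.875) up to $1.88 before adding, which gives the $40.63 shown.
subtotal = sum(stage_cost(i, o) for i, o in stages.values())
cost_per_submission = subtotal / 1900
```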


Coach Feedback Generation (Gemini 2.5 Flash)

  • Volume: 1,920 calls/month
  • Caching: 30% hit rate → 1,344 actual API calls
  • Tokens/call: ~3,000 input + ~700 output

| Component | Monthly Tokens | Rate | Cost |
| --- | --- | --- | --- |
| Input | 4.0M | $0.30/M | $1.20 |
| Output | 0.94M | $2.50/M | $2.35 |
| Subtotal | 4.94M | | $3.55 |

Cost per feedback: $0.0026 (vs $0.71 before)


Parent Insights (Gemini 2.5 Flash)

  • Volume: 1,500 generations/month
  • Caching: 85% hit rate (1-week cache) → 225 actual API calls
  • Tokens/call: ~1,500 input + ~500 output

| Component | Monthly Tokens | Rate | Cost |
| --- | --- | --- | --- |
| Input | 0.34M | $0.30/M | $0.10 |
| Output | 0.11M | $2.50/M | $0.28 |
| Subtotal | 0.45M | | $0.38 |

Cost per insight: $0.0017 (vs $0.14 before)


AI Cost Summary (Optimized)

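| Feature | Volume/Month | Actual Calls | Monthly Cost | Cost/Use |
| --- | --- | --- | --- | --- |
| STEM Evaluation | 4,800 | 2,400 | $5.76 | $0.0024 |
| English Writing | 1,900 | 1,900 | $40.63 | $0.021 |
| Coach Feedback | 1,920 | 1,344 | $3.55 | $0.0026 |
| Parent Insights | 1,500 | 225 | $0.38 | $0.0017 |
| Subtotal | | | $50.32 | |
| Vertex AI Platform Fee (~10%) | | | $5.03 | |
| Buffer/Overhead (~50%) | | | $25.16 | |
| Network Egress (est.) | | | $10.00 | |
| TOTAL AI COST | | | $90.51 | |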
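
The roll-up from feature subtotal to total AI cost is a short calculation. The sketch below reproduces it from the table's own figures, treating the platform fee and buffer as percentages of the feature subtotal, which is how the table's numbers work out.

```python
# Monthly cost per feature in USD, from the table above.
feature_costs = {
    "STEM Evaluation": 5.76,
    "English Writing": 40.63,
    "Coach Feedback": 3.55,
    "Parent Insights": 0.38,
}

subtotal = sum(feature_costs.values())
platform_fee = 0.10 * subtotal   # ~10% Vertex AI platform fee
buffer = 0.50 * subtotal         # ~50% buffer/overhead
egress = 10.00                   # estimated network egress

total_ai_cost = subtotal + platform_fee + buffer + egress
print(f"Subtotal ${subtotal:.2f}, total ${total_ai_cost:.2f}")
```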

Conservative Estimate (planning figure with extra headroom above the $90.51 total): ~$148/month

Annual AI Cost: $148 × 12 = $1,776


4. Before vs After Comparison

Cost Reduction by Feature

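| Feature | Before (Mistral) | After (Vertex AI) | Reduction |
| --- | --- | --- | --- |
| STEM Evaluation | $3,226 | $5.76 | 99.8% |
| English Writing | $1,673 | $40.63 | 97.6% |
| Coach Feedback | $1,370 | $3.55 | 99.7% |
| Parent Insights | $210 | $0.38 | 99.8% |
| TOTAL | $6,479 | $50.32 | 99.2% |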

Cost Per Use Comparison

| Feature | Before | After | Savings |
| --- | --- | --- | --- |
| STEM Evaluation | $0.67 | $0.0024 | 99.6% |
| English Writing | $0.88 | $0.021 | 97.6% |
| Coach Feedback | $0.71 | $0.0026 | 99.6% |
| Parent Insights | $0.14 | $0.0017 | 98.8% |

5. Total Monthly Cost Breakdown (Optimized)

Summary

| Category | Monthly Cost | % of Total | Annual |
| --- | --- | --- | --- |
| Employee Labor | $720 | 81.4% | $8,640 |
| AI/Token Usage | $148 | 16.7% | $1,776 |
| Infrastructure (Neon Free + Hosting) | $17 | 1.9% | $204 |
| TOTAL | $885 | 100% | $10,620 |

Visual Breakdown

Total: $885/month (Target: $1,000) ✅ Under Budget

┌─────────────────────────────────────────────────────┐
│ Employee Labor (81.4%) │████████░ $720 │
├─────────────────────────────────────────────────────┤
│ AI/Token Usage (16.7%) │██ $148 │
├─────────────────────────────────────────────────────┤
│ Infrastructure (1.9%) │ $17 │
└─────────────────────────────────────────────────────┘

Budget Remaining: $115/month (11.5% headroom)

6. Scaling Analysis

Room to Grow Within $1,000 Budget

With $115/month headroom, you can increase usage:

| Scenario | Additional Capacity | New AI Cost | Total Cost |
| --- | --- | --- | --- |
| +50% STEM Evaluations | +2,400/month | +$3 | $888 |
| +100% English Writing | +1,900/month | +$41 | $926 |
| +100% All Features | 2x volume | +$50 | $935 |
| Maximum Scale (Free plan) | 3x current | +$100 | $985 |
| With Neon Launch plan | 5x+ current | +$100 | $1,000 |

Cost at Different Usage Levels

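| Usage Level | AI Calls/Month | AI Cost | Total Cost | Status |
| --- | --- | --- | --- | --- |
| Current | 10,120 | $148 | $885 | ✅ Under Budget |
| 2x Current | 20,240 | $246 | $983 | ✅ At Budget |
| 3x Current | 30,360 | $344 | $1,081 | ⚠️ Over Budget |
| 5x Current | 50,600 | $540 | $1,277 | ❌ Over Budget |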
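
The AI Cost column grows by $98 for each additional 1x of usage ($148, $246, $344, $540), which suggests a simple linear model. The sketch below infers that model from the table; it is an assumption, not a documented formula. Fixed costs (labor plus infrastructure) are passed in as a parameter, and the example uses the $720 labor and $17 infrastructure figures from Section 5.

```python
BASE_AI_COST = 148   # AI cost at current (1x) usage, USD/month
STEP_AI_COST = 98    # extra AI cost per additional 1x of usage, inferred from the table


def ai_cost(multiplier: float) -> float:
    """AI cost in USD/month at a given usage multiplier (1.0 = current)."""
    return BASE_AI_COST + STEP_AI_COST * (multiplier - 1)


def max_multiplier(budget: float, fixed_costs: float) -> float:
    """Largest usage multiplier whose total cost stays within the budget."""
    return 1 + (budget - fixed_costs - BASE_AI_COST) / STEP_AI_COST


# With $720 labor + $17 infrastructure, the $1,000 budget is reached
# a little above 2x current usage.
break_even = max_multiplier(budget=1000, fixed_costs=720 + 17)
```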

7. Implementation Checklist

Completed ✅

  • Extended caching (7-day for STEM, 1-week for insights)
  • Parent insights 85% cache hit rate
  • pgvector migration added (document_embeddings table with HNSW indexing, 768D)
  • RAG module implemented (embedding, vector-store, ingestion, query services)
  • Embedding model migrated from text-embedding-005 (256D) to gemini-embedding-001 (768D)

Required for Neon Migration

  • Create Neon project (stemblock-production)
  • Export data from DigitalOcean (pg_dump)
  • Import data to Neon (pg_restore)
  • Enable pgvector extension on Neon (CREATE EXTENSION IF NOT EXISTS vector)
  • Run Prisma migrations on Neon (npx prisma migrate deploy)
  • Update DATABASE_URL in production environment
  • Configure Neon autoscaling limits (min 0.25 CU, max 2–4 CU)
  • Verify cold start latency is acceptable
  • Keep DigitalOcean running 1 week as rollback
  • Decommission DigitalOcean database after successful validation
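
For the cold start check in the list above, one simple approach is to time the first query after the database has been idle and compare it against a threshold. The sketch below is deliberately driver-agnostic: `run_query` is any callable you supply (for example, a function that executes `SELECT 1` through your Postgres client), and the 2,000 ms default is an assumption based on the 500ms–2s range quoted in the risk table.

```python
import time
from typing import Callable


def measure_latency_ms(run_query: Callable[[], object]) -> float:
    """Run the supplied query callable once and return elapsed time in milliseconds."""
    start = time.perf_counter()
    run_query()
    return (time.perf_counter() - start) * 1000


def cold_start_ok(run_query: Callable[[], object], threshold_ms: float = 2000.0) -> bool:
    """True if a single query completes within the threshold.

    Call this after the database has been idle long enough to scale to zero,
    so the measurement includes the wake-up time.
    """
    return measure_latency_ms(run_query) <= threshold_ms
```

Running the check a few times at different hours gives a better picture than a single sample, since wake-up time can vary.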

Required for Vertex AI Migration

  • Set up Google Cloud Project
  • Enable Vertex AI API
  • Create service account with appropriate permissions
  • Update backend to use Vertex AI SDK
  • Configure Gemini 2.5 Pro for English Writing
  • Configure Gemini 2.5 Flash for other features
  • Test in staging environment
  • Deploy to production

Vertex AI Setup Costs (One-Time)

| Item | Cost |
| --- | --- |
| GCP Project Setup | Free |
| Vertex AI API Enable | Free |
| Initial Testing (~1M tokens) | ~$3 |
| Total Setup Cost | ~$3 |

8. Cost Per User Metrics (Optimized)

Assumptions

  • 500 active coaches
  • 2,000 active students
  • 500 active parents

Cost Per User (Optimized)

| User Type | Count | Monthly Cost | Cost/User |
| --- | --- | --- | --- |
| Coach | 500 | $885 | $1.77 |
| Student | 2,000 | $885 | $0.44 |
| Parent | 500 | $885 | $1.77 |
| Per Active User (All) | 3,000 | $885 | $0.30 |

Comparison

| Metric | Before | After | Improvement |
| --- | --- | --- | --- |
| Cost per user | $2.41 | $0.30 | 87.8% lower |
| Cost per coach | $14.45 | $1.77 | 87.8% lower |

9. Revenue vs. Cost Analysis (Optimized)

Current Subscription Revenue

| Tier | Users | Price | Monthly Revenue |
| --- | --- | --- | --- |
| COMMUNITY | 450 | $0 | $0 |
| TEAM | 40 | $29 | $1,160 |
| ENTERPRISE | 10 | $299 | $2,990 |
| TOTAL | 500 | | $4,150 |
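
Revenue per tier is users times price, and gross margin follows from revenue and monthly costs. A short sketch of both calculations, with illustrative names and the tier data from the table above:

```python
# (users, monthly price in USD) per tier, from the table above.
tiers = {
    "COMMUNITY": (450, 0),
    "TEAM": (40, 29),
    "ENTERPRISE": (10, 299),
}

revenue_by_tier = {name: users * price for name, (users, price) in tiers.items()}
total_revenue = sum(revenue_by_tier.values())
paying_users = sum(users for users, price in tiers.values() if price > 0)


def gross_margin_pct(revenue: float, costs: float) -> float:
    """Gross margin as a percentage of revenue."""
    return (revenue - costs) / revenue * 100
```

Only 50 of the 500 users pay, so `total_revenue` is sensitive to small changes in the TEAM and ENTERPRISE counts. `gross_margin_pct(total_revenue, costs)` can be evaluated for any monthly cost figure.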

Profitability Analysis (Optimized)

| Metric | Before | After |
| --- | --- | --- |
| Monthly Revenue | $4,150 | $4,150 |
| Monthly Costs | $7,226 | $885 |
| Monthly Profit/Loss | -$3,076 | +$3,265 |
| Annual Profit/Loss | -$36,912 | +$39,180 |

Status: NOW PROFITABLE

Gross Margin

| Metric | Value |
| --- | --- |
| Revenue | $4,150 |
| Costs | $885 |
| Gross Profit | $3,265 |
| Gross Margin | 78.7% |

10. Key Metrics Dashboard (Optimized)

Monthly KPIs

| Metric | Target | Optimized | Status |
| --- | --- | --- | --- |
| Total Monthly Cost | <$1,000 | $885 | ✅ On Target |
| AI Cost per Evaluation | <$0.10 | $0.0024 | ✅ Excellent |
| Cost per Active User | <$1.00 | $0.30 | ✅ Excellent |
| Employee % of Total | <85% | 81.4% | ✅ On Target |
| Revenue : Cost Ratio | >1.0 | 4.69 | ✅ Excellent |
| Gross Margin | >50% | 78.7% | ✅ Excellent |

Budget Utilization

Monthly Budget: $1,000

Spent: $885 ████████████████████░░░ 88.5%
Remaining: $115 ░░░░░░░░░░░░░░░░░░░░░░░ 11.5%

11. Risk Factors & Mitigations

Potential Cost Increases

| Risk | Impact | Mitigation |
| --- | --- | --- |
| Token price increase | +$50-100/month | Lock in pricing, monitor alternatives |
| Usage spike (viral growth) | +$50-150/month | Implement rate limiting, tier quotas |
| Model deprecation | Migration effort | Stay on stable/GA models |
| Vertex AI outage | Service disruption | Maintain Mistral fallback |
| Neon cold start latency | 500ms–2s on first query | Health check ping every 4 min in production |
| Neon Free plan limits exceeded | Need to upgrade to Launch ($15/mo) | Monitor CU-hours via Neon Console |

Cost Monitoring

  • Set up GCP billing alerts at $75, $100, $150
  • Weekly token usage review
  • Monthly cost reconciliation
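
The alert thresholds above are configured in GCP billing itself, but the same check is easy to mirror in a weekly review script. The sketch below only illustrates the threshold logic locally; `crossed_thresholds` is a hypothetical helper and does not call any GCP API.

```python
ALERT_THRESHOLDS = (75, 100, 150)  # USD, from the list above


def crossed_thresholds(month_to_date_spend: float) -> list[int]:
    """Return every alert threshold the month-to-date spend has reached."""
    return [t for t in ALERT_THRESHOLDS if month_to_date_spend >= t]


# At the $148/month planning figure, the $75 and $100 alerts would have fired.
alerts_at_estimate = crossed_thresholds(148)
```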

12. Summary

Monthly Cost Achievement

| Component | Cost | Target | Status |
| --- | --- | --- | --- |
| Employee | $720 | - | Fixed |
| Infrastructure (Neon Free + Hosting) | $17 | - | Variable |
| AI (Vertex AI) | $148 | - | Variable |
| TOTAL | $885 | $1,000 | 11.5% Under |

Annual Savings

| Metric | Amount |
| --- | --- |
| Previous Annual Cost | $86,712 |
| Optimized Annual Cost | $10,620 |
| Annual Savings | $76,092 |

Key Optimizations Applied

  1. Caching implemented - 30-85% reduction in API calls on cached features (STEM, Coach Feedback, Parent Insights)
  2. Switched to Vertex AI Gemini - 97-99% reduction in token costs
  3. Model tiering - Gemini 2.5 Pro for quality, Flash for speed
  4. Neon + pgvector migration - Serverless PostgreSQL with built-in vector DB, scale-to-zero, ~$10/mo savings vs. DigitalOcean

Appendix: Vertex AI Pricing Reference

Gemini 2.5 Pro (Per 1M Tokens)

| Type | ≤200K Context | >200K Context | Cached |
| --- | --- | --- | --- |
| Input | $1.25 | $2.50 | $0.125 |
| Output | $10.00 | $15.00 | N/A |

Gemini 2.5 Flash (Per 1M Tokens)

| Type | ≤200K Context | >200K Context | Cached |
| --- | --- | --- | --- |
| Input | $0.30 | $0.30 | $0.030 |
| Output | $2.50 | $2.50 | N/A |

Source: Vertex AI Pricing
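
The two pricing tables can be combined into a small per-request estimator. This is a sketch under stated assumptions: the >200K tier is chosen when the whole prompt (fresh plus cached tokens) exceeds 200K, cached tokens always bill at the single cached rate shown, and the model keys are just labels for this example rather than confirmed API identifiers.

```python
# Rates in USD per 1M tokens, copied from the tables above.
# Each pair is (<=200K context, >200K context).
PRICING = {
    "gemini-2.5-pro": {"input": (1.25, 2.50), "output": (10.00, 15.00), "cached": 0.125},
    "gemini-2.5-flash": {"input": (0.30, 0.30), "output": (2.50, 2.50), "cached": 0.030},
}
LONG_CONTEXT_TOKENS = 200_000


def request_cost(model: str, input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimated USD cost of one request under the assumptions stated above."""
    rates = PRICING[model]
    tier = 1 if input_tokens + cached_tokens > LONG_CONTEXT_TOKENS else 0
    return (
        input_tokens * rates["input"][tier]
        + output_tokens * rates["output"][tier]
        + cached_tokens * rates["cached"]
    ) / 1_000_000


# One STEM evaluation call (~3,000 input + ~600 output tokens on Flash)
# comes to about $0.0024, in line with Section 3.
stem_call = request_cost("gemini-2.5-flash", 3_000, 600)
```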


Document Version: 3.0
Last Updated: February 27, 2026
Status: Active - Neon + pgvector Migration
Owner: Operations Team
Next Review: March 27, 2026