CI Testing Architecture
Overview
This document describes the monorepo CI testing architecture, designed to handle multiple services independently with full rollback safety.
Architecture Diagram
┌───────────────────────────────────────────────────────────────┐
│ GitHub Actions CI │
├───────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ │
│ │ Lint (Root) │ ← Runs once for entire monorepo │
│ └──────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Test Standalone (Matrix - No Dependencies) │ │
│ ├────────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ ┌─────────┐ ┌──────────┐ ┌───────┐ ┌────────┐ │ │
│ │ │ Catalog │ │ User │ │ Media │ │ Search │ │ │
│ │ │ test │ │ test │ │ test │ │test:ci │ │ │
│ │ └─────────┘ └──────────┘ └───────┘ └────────┘ │ │
│ │ │ │
│ │ ✅ No service containers - fast startup │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Test with MongoDB (Matrix) │ │
│ ├────────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ ┌─────────────────┐ │ │
│ │ │ Payment │ │ │
│ │ │ test │ │ │
│ │ │ +MongoDB:7 │ │ │
│ │ └─────────────────┘ │ │
│ │ │ │
│ │ ✅ Only 1 MongoDB container (not 5) │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Future: Test with Elasticsearch (When Needed) │ │
│ │ For services that require Elasticsearch │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
│ │
│ │
└───────────────────────────────────────────────────────────────┘
Design Principles
1. Job-Level Isolation
Services are grouped into separate jobs based on their infrastructure dependencies:
- test-standalone: Services with no external dependencies (catalog, user, media, search)
- test-with-mongodb: Services requiring MongoDB (payment)
- Future: test-with-elasticsearch when needed
Why separate jobs?
- GitHub Actions service containers are job-level, not matrix-level
- Each matrix iteration would get ALL service containers, regardless of conditional logic
- With 5 services and 2 containers, that's 10 containers (wasteful)
- Separate jobs = only start containers when needed
2. Working Directory Approach
The architecture uses working-directory in the test step to ensure Bun's test runner only scans that service's directory:
- name: Run tests
  working-directory: ${{ matrix.service.path }}
  run: ${{ matrix.service.test_command || 'bun test' }}
Why this works:
- Bun's test runner scans from the current working directory using the **/*.test.ts glob
- When CWD is services/catalog, it only finds services/catalog/**/*.test.ts
- This bypasses Bun's limitation with the --exclude flag
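The effect of scoping discovery to the working directory can be illustrated with a simplified model. This is a sketch only, not Bun's actual implementation: `discoverTests` and the file list are hypothetical, and the matching is reduced to a prefix check plus the `.test.ts` suffix.

```typescript
// Simplified model of CWD-scoped test discovery (NOT Bun's real algorithm).
// `repoFiles` is a hypothetical flat list of repo-relative paths.
function discoverTests(repoFiles: string[], cwd: string): string[] {
  // Normalize so "services/catalog" and "services/catalog/" behave the same.
  const root = cwd.replace(/\/+$/, "") + "/";
  return repoFiles.filter(
    (file) => file.startsWith(root) && file.endsWith(".test.ts"),
  );
}

// Hypothetical monorepo contents, for illustration only.
const repoFiles: string[] = [
  "services/catalog/src/items.test.ts",
  "services/catalog/src/items.ts",
  "services/user/src/auth.test.ts",
  "services/payment/src/charge.test.ts",
];

// With the working directory set to services/catalog, only that
// service's test file is picked up.
const catalogTests = discoverTests(repoFiles, "services/catalog");
console.log(catalogTests);
```

Tests belonging to other services never enter the candidate set, so no exclude pattern is needed.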
3. Resource Efficiency
Before (conditional services approach - doesn't work):
- 5 matrix services × 2 containers each = 10 containers started
- All containers idle except for payment job
- Wasted ~90% of container resources
After (separate jobs):
- 4 standalone services × 0 containers = 0
- 1 MongoDB service × 1 container = 1
- Result: 90% reduction in container starts
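The arithmetic behind these figures can be checked with a short sketch. The `JobPlan` shape and `containerStarts` helper are hypothetical names introduced for illustration, and the "before" plan assumes a single matrix job declaring two service containers, as described above.

```typescript
// Hypothetical description of one CI job: how many matrix entries it fans
// out to, and how many service containers each entry starts.
interface JobPlan {
  name: string;
  matrixEntries: number;
  serviceContainers: number;
}

// Every matrix entry starts every service container declared on its job.
function containerStarts(jobs: JobPlan[]): number {
  return jobs.reduce(
    (total, job) => total + job.matrixEntries * job.serviceContainers,
    0,
  );
}

// Before: one job, 5 matrix entries, 2 service containers declared.
const before = containerStarts([
  { name: "test", matrixEntries: 5, serviceContainers: 2 },
]);

// After: containers are declared only on the job that needs them.
const after = containerStarts([
  { name: "test-standalone", matrixEntries: 4, serviceContainers: 0 },
  { name: "test-with-mongodb", matrixEntries: 1, serviceContainers: 1 },
]);

const reductionPercent = Math.round(((before - after) * 100) / before);
console.log({ before, after, reductionPercent });
```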
4. Fail-Fast Disabled
strategy:
  fail-fast: false
This ensures every service is tested even if another one fails, so a single run reports results for all services.
5. Custom Test Commands
Services can override the default bun test command:
- name: search
  test_command: bun run test:ci # Excludes integration tests
6. Common Pitfall: Conditional Services Don't Work
⚠️ Incorrect approach (looks valid, but wastes resources):
services:
  mongodb:
    # ❌ This does NOT prevent container from starting
    options: ${{ matrix.service.needs_mongodb && '--health-cmd=...' || '' }}
Why it fails:
- options only passes Docker CLI flags
- Container ALWAYS starts regardless of options value
- Empty options = container starts without health checks (still wasteful)
Correct approach: Use separate jobs (as implemented)
Service Configuration
Job: test-standalone
Services with no external dependencies (fast startup, no containers):
| Service | Path | Test Command | Notes |
|---|---|---|---|
| catalog | services/catalog | bun test | Example service |
| user | services/user | bun test | Example service |
| media | services/media | bun test | Example service |
| search | services/search | bun run test:ci | Excludes integration tests |
Job: test-with-mongodb
Services requiring MongoDB:
| Service | Path | Test Command | Container |
|---|---|---|---|
| payment | services/payment | bun test | MongoDB 7 |
Adding a New Service
Step 1: Determine Dependencies
Does your service need external infrastructure?
- No dependencies → Add to test-standalone job
- Needs MongoDB → Add to test-with-mongodb job
- Needs Elasticsearch → Create test-with-elasticsearch job (see template below)
- Needs multiple → Create dedicated job
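The decision above can be written as a small lookup. The sketch below is illustrative only: the `Dependency` type and the `chooseJob` helper are hypothetical names, not part of the workflow, and the returned strings simply echo the job names used in this document.

```typescript
// Infrastructure a service may need (hypothetical type for this sketch).
type Dependency = "mongodb" | "elasticsearch";

// Map a service's dependencies to the CI job it should be added to.
function chooseJob(deps: Dependency[]): string {
  // Drop duplicates so ["mongodb", "mongodb"] counts as one dependency.
  const unique = deps.filter((dep, index) => deps.indexOf(dep) === index);
  if (unique.length === 0) return "test-standalone";
  if (unique.length > 1) return "dedicated job";
  return unique[0] === "mongodb"
    ? "test-with-mongodb"
    : "test-with-elasticsearch";
}

console.log(chooseJob([])); // no dependencies
console.log(chooseJob(["mongodb"])); // MongoDB only
console.log(chooseJob(["mongodb", "elasticsearch"])); // multiple dependencies
```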
Step 2: Update Workflow
Example: Service with no dependencies
Add to test-standalone matrix in .github/workflows/ci.yml:
test-standalone:
  name: Test - ${{ matrix.service.name }}
  strategy:
    matrix:
      service:
        # ... existing services ...
        - name: your-new-service
          path: services/your-new-service
          # Optional: custom test command
          test_command: bun run test:custom
Example: Service needing MongoDB
Add to test-with-mongodb matrix:
test-with-mongodb:
  name: Test - ${{ matrix.service.name }}
  strategy:
    matrix:
      service:
        # ... existing services ...
        - name: your-new-service
          path: services/your-new-service
Example: Service needing Elasticsearch (create new job)
test-with-elasticsearch:
  name: Test - ${{ matrix.service.name }}
  runs-on: ubuntu-latest
  strategy:
    fail-fast: false
    matrix:
      service:
        - name: your-new-service
          path: services/your-new-service
  services:
    elasticsearch:
      image: docker.elastic.co/elasticsearch/elasticsearch:8.11.0
      env:
        discovery.type: single-node
        xpack.security.enabled: false
      ports:
        - 9200:9200
      options: --health-cmd="curl -f http://localhost:9200/_cluster/health" --health-interval=10s
  steps:
    - uses: actions/checkout@v4
    - uses: oven-sh/setup-bun@v2
    - run: bun install --frozen-lockfile
    - name: Run tests
      working-directory: ${{ matrix.service.path }}
      run: ${{ matrix.service.test_command || 'bun test' }}
      env:
        ELASTICSEARCH_URL: http://localhost:9200
Step 3: Add Environment Variables (if needed)
If your service needs specific env vars, add them to the env: section:
env:
  YOUR_VAR: ${{ matrix.service.name == 'your-new-service' && 'value' || '' }}
Step 4: Ensure Service Has Test Script
In services/your-new-service/package.json:
{
  "scripts": {
    "test": "bun test"
  }
}
Step 5: Test Locally
cd services/your-new-service
bun test
Rollback Procedures
Scenario A: CI Changes Break Tests (Before Merge)
If tests fail on your PR:
- Fix the workflow:
  vim .github/workflows/ci.yml
  git add .github/workflows/ci.yml
  git commit -m "fix(ci): correct service configuration"
  git push
- PR updates automatically, CI re-runs
- Main branch is unaffected - no rollback needed
Scenario B: CI Changes Merged and Breaking Main
If CI breaks after merging to main:
Option 1: Quick Revert (Recommended)
# Find the commit that broke CI
git log --oneline -10
# Revert the commit
git revert <commit-sha>
git push origin main
Option 2: Restore from Backup
# Use the backup file committed on 2025-11-11
git checkout main
cp .github/workflows/ci.yml.backup-20251111 .github/workflows/ci.yml
git add .github/workflows/ci.yml
git commit -m "revert(ci): restore working CI configuration from backup"
git push origin main
Scenario C: Emergency Rollback (Nuclear Option)
If git history is complex:
# Force reset main to last known good commit
git checkout main
git reset --hard <last-good-commit-sha>
git push --force origin main # ⚠️ USE WITH CAUTION
⚠️ Warning: Only use force push if:
- No one else is working on main
- You have team coordination
- No better option exists
Testing the Rollback
Before deploying changes, verify rollback procedure works:
# Save current state
cp .github/workflows/ci.yml /tmp/new-ci.yml
# Test rollback to main version (this updates both the index and the working tree)
git checkout main -- .github/workflows/ci.yml
git diff --cached # Should show revert to main version
# Restore new version and unstage the rollback
cp /tmp/new-ci.yml .github/workflows/ci.yml
git reset -q -- .github/workflows/ci.yml
Debugging CI Failures
Check Specific Service Job
- Go to GitHub Actions run
- Click on failed service in matrix (e.g., "Test - payment")
- Expand "Run tests" step
- Check error messages
Run Locally
cd services/<service-name>
bun test
Check Environment Variables
Verify service has required env vars in workflow:
env:
  MONGODB_URI: ${{ matrix.service.needs_mongodb && 'mongodb://localhost:27017/crop-test' || '' }}
Common Issues
Issue: Service not found
- Check path in matrix matches actual directory
- Verify service has package.json with test script
Issue: Tests pass locally but fail in CI
- Check environment variables
- Verify dependencies are installed (bun install --frozen-lockfile)
- Check if service needs MongoDB/Elasticsearch
Issue: Wrong tests running
- Verify working-directory is set correctly
- Check test_command matches package.json script
- Ensure service uses correct test script
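Several of these checks (missing package.json, a test_command that names a script that does not exist) can be automated. The following is a minimal sketch under stated assumptions: `MatrixEntry`, `ReadFile` and `checkEntry` are hypothetical names, and the file reader is injected so the example runs against an in-memory stand-in; a real script could back it with the file system instead.

```typescript
// One entry of the workflow matrix, mirroring the fields used in this document.
interface MatrixEntry {
  name: string;
  path: string;
  test_command?: string;
}

// Returns the file's text, or undefined when the file does not exist.
type ReadFile = (file: string) => string | undefined;

// Report configuration problems for a single matrix entry.
function checkEntry(entry: MatrixEntry, read: ReadFile): string[] {
  const raw = read(`${entry.path}/package.json`);
  if (raw === undefined) {
    return [`${entry.name}: no package.json under ${entry.path}`];
  }
  const scripts = JSON.parse(raw).scripts ?? {};
  // "bun run <script>" must name an existing script; otherwise expect "test".
  const match = /^bun run (\S+)/.exec(entry.test_command ?? "");
  const script = match?.[1] ?? "test";
  return script in scripts
    ? []
    : [`${entry.name}: package.json has no "${script}" script`];
}

// In-memory stand-in for the repository (hypothetical contents).
const files: Record<string, string> = {
  "services/search/package.json": JSON.stringify({
    scripts: { test: "bun test" },
  }),
};
const read: ReadFile = (file) => files[file];

const searchProblems = checkEntry(
  { name: "search", path: "services/search", test_command: "bun run test:ci" },
  read,
);
const mediaProblems = checkEntry(
  { name: "media", path: "services/media" },
  read,
);
console.log(searchProblems, mediaProblems);
```

In this sample, search is flagged because its package.json lacks a test:ci script, and media is flagged because its path has no package.json at all.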
Performance Characteristics
Current Setup (Matrix Strategy)
- Lint job: ~1-2 minutes
- Test jobs (parallel): ~1-3 minutes each
- Total CI time: ~3-4 minutes (limited by slowest service)
Benefits
- Parallel execution: All services test simultaneously
- Fast feedback: Failures visible within minutes
- Isolated failures: One service failure doesn't block others
- Clear attribution: Easy to see which service failed
Migration History
Previous Approaches (Failed)
- Attempt 1: bun test --exclude='**/service/**'
  - Bun ignores --exclude flag
- Attempt 2: bun test services/catalog services/user ...
  - ❌ Failed: Bun ignores explicit paths
  - Still scanned entire monorepo
Current Approach (Successful)
Matrix strategy with working-directory
- ✅ Works: Bun only scans from CWD
- ✅ Reliable: Each service isolated
- ✅ Maintainable: Clear service list in matrix
Future Improvements
Potential Enhancements
- Cache optimization: Cache node_modules per service
- Path-based triggers: Only test changed services
- Service-specific workflows: Migrate to separate workflow files
- Test coverage tracking: Integrate coverage reporting
- Performance monitoring: Track test execution times
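As an illustration of the path-based trigger idea, the sketch below maps a list of changed files to the services that would need testing. It is an assumption-laden example, not an existing part of the workflow: `servicesToTest` is a hypothetical helper, and treating any change outside services/<name>/ as a reason to test everything is one conservative policy among several.

```typescript
// Decide which services to test for a given set of changed files.
// Assumes each service lives under services/<name>/ as in this document.
function servicesToTest(changedFiles: string[], services: string[]): string[] {
  const belongsTo = (file: string, name: string): boolean =>
    file.startsWith(`services/${name}/`);

  // A change outside every service directory (lockfile, workflow, shared
  // config) may affect any service, so fall back to testing all of them.
  const touchesSharedFiles = changedFiles.some(
    (file) => !services.some((name) => belongsTo(file, name)),
  );
  if (touchesSharedFiles) return services;

  // Otherwise keep only services with at least one changed file.
  return services.filter((name) =>
    changedFiles.some((file) => belongsTo(file, name)),
  );
}

const allServices = ["catalog", "user", "media", "search", "payment"];

const onlyUser = servicesToTest(["services/user/src/auth.ts"], allServices);
const everything = servicesToTest(
  ["services/user/src/auth.ts", ".github/workflows/ci.yml"],
  allServices,
);
console.log(onlyUser, everything);
```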
Separate Workflows Migration
If matrix becomes too complex, consider migrating to per-service workflows:
.github/workflows/
  catalog-ci.yml # triggers: services/catalog/**
  user-ci.yml # triggers: services/user/**
  payment-ci.yml # triggers: services/payment/**
Benefits:
- Complete isolation
- Can deploy services independently
- Clear ownership per service
Migration path:
- Create new workflow for one service
- Remove from matrix
- Test for a week
- Repeat for other services
Verification Checklist
Before merging CI changes:
- All services have tests that pass locally
- Matrix strategy runs all services in parallel
- Failed service doesn't block others (fail-fast: false)
- Each service has correct working directory
- Environment variables are service-specific
- Documentation is updated
- Rollback procedure is documented and tested
- Backup configuration is committed
- PR has successful CI run
Monitoring
GitHub Actions Status Checks
Set up branch protection rules:
- Go to repo Settings → Branches → main
- Add rule: "Require status checks to pass before merging"
- Select: lint and test jobs
- Enable: "Require branches to be up to date before merging"
Metrics to Track
- Total CI duration (target: < 5 minutes)
- Service-specific test times
- Failure rates per service
- Flaky test incidents
Support
Questions?
- Check this documentation first
- Review backup file: .github/workflows/ci.yml.backup-20251111
- Test locally: cd services/<service> && bun test
- Check GitHub Actions logs for detailed errors
Need to Rollback?
- Use backup file (fastest)
- Revert commit (cleanest)
- Reset to last good commit (last resort)
Related Documentation
- PRODUCTION_URLS.md - Production service URLs
- services/search/CLAUDE.md - Search service development guide
- services/search/docs/TESTCONTAINERS_BUN_COMPATIBILITY.md - Integration testing guide