August 1, 2025 Architecture • Integration

API-First Salesforce Architecture for Multi-Org Environments

Direct database queries and tight coupling break at scale. Here's how to build for flexibility from day one.

By Tyler Colby

The Problem with Tight Coupling

Traditional Salesforce architecture: Apex triggers call external APIs directly, external systems query Salesforce via SOQL, and every system knows the internals of every other.

This works—until you need to:

  • Sync data across multiple orgs
  • Replace an external system without touching 47 Apex classes
  • Exit Salesforce without rewriting every integration

API-First Principles

1. Events Over Direct Calls

Don't call external systems from Apex. Publish events.

// Bad: Direct callout
trigger AccountTrigger on Account (after update) {
  Http h = new Http();
  HttpRequest req = new HttpRequest();
  req.setEndpoint('https://erp.example.com/api/customers');
  // ... tight coupling to ERP
}

// Good: Platform Event
trigger AccountTrigger on Account (after update) {
  List<Account_Changed__e> events = new List<Account_Changed__e>();
  for (Account acc : Trigger.new) {
    events.add(new Account_Changed__e(
      Global_ID__c = acc.Global_ID__c,
      Name__c = acc.Name,
      Industry__c = acc.Industry
    ));
  }
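  // publish() does not throw on failure, so inspect the returned results
  for (Database.SaveResult sr : EventBus.publish(events)) {
    if (!sr.isSuccess()) {
      System.debug(LoggingLevel.ERROR, 'Event publish failed: ' + sr.getErrors());
    }
  }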
}

2. Middleware as Integration Hub

Middleware subscribes to Salesforce events and routes to external systems.

  • Salesforce → Platform Event → Middleware → ERP/Billing/Marketing
  • External system → Middleware → Salesforce REST API

Middleware handles: authentication, rate limiting, retries, logging, transformations.
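
To make the hub idea concrete, here is a minimal Python sketch of the routing step only. The `Router` class, the `object_type` key and the in-memory "inboxes" standing in for ERP and billing clients are all hypothetical names for illustration, not part of any real middleware product.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Event = Dict[str, str]
Handler = Callable[[Event], None]


class Router:
    """Hypothetical fan-out router: maps an event's object type to downstream handlers."""

    def __init__(self) -> None:
        self._routes: Dict[str, List[Handler]] = defaultdict(list)

    def register(self, object_type: str, handler: Handler) -> None:
        self._routes[object_type].append(handler)

    def dispatch(self, event: Event) -> int:
        """Send the event to every handler registered for its type; return how many ran."""
        handlers = self._routes.get(event["object_type"], [])
        for handler in handlers:
            handler(event)
        return len(handlers)


# In-memory stand-ins for real ERP and billing clients.
erp_inbox: List[Event] = []
billing_inbox: List[Event] = []

router = Router()
router.register("Account", erp_inbox.append)
router.register("Account", billing_inbox.append)

delivered = router.dispatch({"object_type": "Account", "global_id": "CUST-00123"})
```

Salesforce publishes once; adding a new downstream system is then a new `register` call in the middleware, not a new Apex class.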

3. REST APIs for External Access

External systems never run SOQL against the org directly. They call documented, versioned REST endpoints only.

@RestResource(urlMapping='/api/v1/accounts/*')
global class AccountAPI {
  @HttpGet
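  global static AccountResponse getAccount() {
    RestRequest req = RestContext.request;
    String globalId = req.requestURI.substringAfterLast('/');

    // Query into a list: assigning an empty result to a single Account throws QueryException
    List<Account> accounts = [SELECT Id, Global_ID__c, Name, Industry
                              FROM Account
                              WHERE Global_ID__c = :globalId LIMIT 1];
    if (accounts.isEmpty()) {
      RestContext.response.statusCode = 404;
      return null;
    }

    // AccountResponse is a response wrapper class defined elsewhere
    return new AccountResponse(accounts[0]);
  }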
}

Multi-Org Sync Pattern

Architecture

  • Each org publishes Data_Change__e Platform Events
  • Middleware (Sync Engine) subscribes to events from all orgs
  • Sync Engine applies conflict resolution, then writes to target orgs via REST API

Event Schema

Data_Change__e {
  Global_ID__c (Text 255),
  Object_Type__c (Text 50),
  Operation__c (CREATE | UPDATE | DELETE),
  Field_Changes_JSON__c (Long Text),
  Source_Org__c (Text 50),
  Timestamp__c (DateTime)
}
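
On the middleware side, it helps to validate each payload against this schema before doing anything else. The sketch below is one way to do that in Python; the `DataChange` dataclass, the `parse_data_change` function and the sample payload are illustrative assumptions, and the timestamp is assumed to arrive as ISO 8601 with an explicit offset.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict

VALID_OPERATIONS = {"CREATE", "UPDATE", "DELETE"}


@dataclass(frozen=True)
class DataChange:
    global_id: str
    object_type: str
    operation: str
    field_changes: Dict[str, Any]
    source_org: str
    timestamp: datetime


def parse_data_change(payload: Dict[str, str]) -> DataChange:
    """Validate a raw Data_Change__e payload and convert it to a typed record."""
    operation = payload["Operation__c"]
    if operation not in VALID_OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    return DataChange(
        global_id=payload["Global_ID__c"],
        object_type=payload["Object_Type__c"],
        operation=operation,
        field_changes=json.loads(payload["Field_Changes_JSON__c"]),
        source_org=payload["Source_Org__c"],
        # Assumption: ISO 8601 with an explicit UTC offset.
        timestamp=datetime.fromisoformat(payload["Timestamp__c"]),
    )


# Hypothetical sample payload for illustration.
sample = {
    "Global_ID__c": "CUST-00123",
    "Object_Type__c": "Account",
    "Operation__c": "UPDATE",
    "Field_Changes_JSON__c": '{"Name": "Acme Corp"}',
    "Source_Org__c": "org_a",
    "Timestamp__c": "2025-08-01T12:00:00+00:00",
}
change = parse_data_change(sample)
```

Rejecting malformed events at the edge keeps bad payloads out of the conflict-resolution path.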

Middleware Logic

// Pseudocode
on_event_received(event):
  target_orgs = get_target_orgs(event.Source_Org__c)
  
  for target_org in target_orgs:
    conflict = check_conflict(event, target_org)
    
    if conflict and conflict.strategy == MANUAL:
      create_conflict_task(event, target_org)
    else:
      resolved = resolve_conflict(event, conflict)
      apply_change(target_org, resolved)
      log_sync(event, target_org, SUCCESS)
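
The pseudocode leaves the conflict strategy open. As one possible reading, the Python sketch below uses last-writer-wins on the event timestamp and escalates to manual review only when a stale change touches a sensitive field. The `Change` class, `sync_to_target`, `MANUAL_FIELDS` and the dictionary standing in for a target org are assumptions for illustration, not the article's actual engine.

```python
from dataclasses import dataclass
from typing import Dict, List

MANUAL_FIELDS = {"OwnerId"}  # assumption: edits to these fields need human review


@dataclass
class Change:
    global_id: str
    timestamp: float          # seconds since epoch, derived from Timestamp__c
    fields: Dict[str, str]


def sync_to_target(change: Change, target: Dict[str, Change], manual_queue: List[Change]) -> str:
    """Apply one change to one target org; returns 'applied', 'manual' or 'skipped'."""
    current = target.get(change.global_id)
    conflict = current is not None and current.timestamp > change.timestamp
    if not conflict:
        target[change.global_id] = change
        return "applied"
    if MANUAL_FIELDS.intersection(change.fields):
        manual_queue.append(change)
        return "manual"
    return "skipped"  # last-writer-wins: the target already holds a newer version


org_b: Dict[str, Change] = {}   # in-memory stand-in for a target org
review: List[Change] = []       # stand-in for the conflict task queue
results = [
    sync_to_target(Change("CUST-00123", 200.0, {"Name": "Acme Corp"}), org_b, review),
    sync_to_target(Change("CUST-00123", 100.0, {"Name": "Acme Inc"}), org_b, review),
    sync_to_target(Change("CUST-00123", 150.0, {"OwnerId": "USER-42"}), org_b, review),
]
```

The second change is older than what the target holds, so it is dropped; the third is also older but touches `OwnerId`, so it lands in the review queue instead.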

Exit-Ready Design

When you exit Salesforce:

  • Platform Events → Generic message bus (Kafka, RabbitMQ)
  • Middleware stays the same (or minimal changes)
  • External systems don't notice—still calling middleware APIs
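
What keeps the middleware stable across an exit is a thin adapter layer that normalizes events before any business logic sees them. A rough Python sketch, assuming a hypothetical canonical format with lowercase keys and a replacement bus that carries UTF-8 JSON (both are assumptions, as are all function names):

```python
import json
from typing import Dict

Canonical = Dict[str, str]


def from_platform_event(payload: Dict[str, str]) -> Canonical:
    """Strip the Salesforce '__c' suffix so downstream code never sees org-specific names."""
    return {
        (key[:-3] if key.endswith("__c") else key).lower(): value
        for key, value in payload.items()
    }


def from_bus_message(raw: bytes) -> Canonical:
    """Assumes the replacement bus already carries the canonical keys as JSON."""
    return json.loads(raw.decode("utf-8"))


def handle(event: Canonical) -> str:
    """Business logic depends only on the canonical shape, not on the transport."""
    return f"sync {event['global_id']}"


before_exit = handle(from_platform_event({"Global_ID__c": "CUST-00123"}))
after_exit = handle(from_bus_message(b'{"global_id": "CUST-00123"}'))
```

Only the adapter function changes when the event source does; `handle` is untouched.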

Rate Limiting and Bulkification

Problem

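Calling EventBus.publish() once per record exhausts per-transaction governor limits quickly, and every event also counts against the org's event publishing and delivery allocations.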

Solution: Batch Events

trigger AccountTrigger on Account (after update) {
  List<Account_Changed__e> events = new List<Account_Changed__e>();
  
  for (Account acc : Trigger.new) {
    events.add(new Account_Changed__e(
      Global_ID__c = acc.Global_ID__c,
      // ... fields
    ));
    
    // Publish in batches of 100
    if (events.size() == 100) {
      EventBus.publish(events);
      events.clear();
    }
  }
  
  if (!events.isEmpty()) {
    EventBus.publish(events);
  }
}

Middleware: Bulk API for Writes

Don't write one record at a time to the target org. Use the Composite API (sObject Collections) or Bulk API 2.0.

POST /services/data/v58.0/composite/sobjects
{
  "allOrNothing": false,
  "records": [
    {"attributes": {"type": "Account"}, "Global_ID__c": "...", "Name": "..."},
    // ... up to 200 records
  ]
}
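
On the middleware side, the only real work is chunking. A small Python sketch, assuming a hypothetical `collection_bodies` helper that turns any number of pending records into request bodies of at most 200 records each (the limit quoted above):

```python
import json
from typing import Any, Dict, Iterator, List

MAX_RECORDS_PER_REQUEST = 200  # limit quoted in the request above


def collection_bodies(records: List[Dict[str, Any]], object_type: str = "Account") -> Iterator[str]:
    """Yield one JSON request body per chunk of up to 200 records."""
    for start in range(0, len(records), MAX_RECORDS_PER_REQUEST):
        chunk = records[start:start + MAX_RECORDS_PER_REQUEST]
        yield json.dumps({
            "allOrNothing": False,
            "records": [{"attributes": {"type": object_type}, **record} for record in chunk],
        })


# 450 hypothetical pending records become 3 requests instead of 450.
pending = [{"Global_ID__c": f"CUST-{i:05d}", "Name": f"Customer {i}"} for i in range(450)]
bodies = list(collection_bodies(pending))
```

With `allOrNothing` set to false, the middleware still has to read the per-record results in each response and retry or dead-letter the failures individually.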

Error Handling and Retries

Idempotency

Use Global_ID__c as the upsert key (it must be defined as an external ID field). The same event delivered twice then updates the same record instead of creating a duplicate.

PATCH /services/data/v58.0/sobjects/Account/Global_ID__c/CUST-00123
{
  "Name": "Acme Corp",
  "Industry": "Technology"
}
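
Why keying on Global_ID__c makes redelivery harmless can be shown with a toy in-memory model. This Python sketch is not the Salesforce API; the `upsert` function and the dictionary standing in for the Account table are purely illustrative.

```python
from typing import Dict

Table = Dict[str, Dict[str, str]]


def upsert(table: Table, global_id: str, fields: Dict[str, str]) -> str:
    """Insert or update keyed on the external ID; returns 'created' or 'updated'."""
    outcome = "updated" if global_id in table else "created"
    table.setdefault(global_id, {}).update(fields)
    return outcome


accounts: Table = {}
event_fields = {"Name": "Acme Corp", "Industry": "Technology"}

first = upsert(accounts, "CUST-00123", event_fields)
second = upsert(accounts, "CUST-00123", event_fields)  # same event redelivered
```

The second delivery reports an update and leaves exactly one record with the same field values, which is the property the retry logic relies on.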

Dead Letter Queue

If a sync still fails after 3 retries, send the event to a dead letter queue (DLQ) for manual review.

  • Store failed event + error message + timestamp
  • Alert ops team via Slack/PagerDuty
  • Provide UI to replay from DLQ after fixing root cause
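
A minimal Python sketch of the retry-then-DLQ flow, assuming exponential backoff between attempts (the backoff policy, the `deliver` function and the `DeadLetter` record are assumptions; the article only specifies 3 retries). The list standing in for the DLQ would be a durable queue in practice.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List

MAX_RETRIES = 3
Event = Dict[str, str]


@dataclass
class DeadLetter:
    event: Event
    error: str
    failed_at: float


def deliver(event: Event, send: Callable[[Event], None], dlq: List[DeadLetter],
            base_delay: float = 0.5, sleep: Callable[[float], None] = time.sleep) -> bool:
    """Try once, then retry up to MAX_RETRIES times; dead-letter the event if all fail."""
    last_error = ""
    for attempt in range(MAX_RETRIES + 1):
        try:
            send(event)
            return True
        except Exception as exc:  # sketch only: real code should catch narrower errors
            last_error = str(exc)
            if attempt < MAX_RETRIES:
                sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s
    dlq.append(DeadLetter(event, last_error, time.time()))
    return False


calls = {"count": 0}


def always_down(event: Event) -> None:
    calls["count"] += 1
    raise ConnectionError("target org unreachable")


dlq: List[DeadLetter] = []
ok = deliver({"Global_ID__c": "CUST-00123"}, always_down, dlq, sleep=lambda _: None)
```

Because writes are idempotent, replaying a dead-lettered event later is safe even if one of the failed attempts actually reached the target org.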

Observability

Metrics to Track

  • Events published per org per minute
  • Sync lag (time from event publish to target org write)
  • Conflict rate (%)
  • DLQ depth (failed events)
  • API call usage (% of daily limit)
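
Two of these metrics are easy to get subtly wrong, so here is a small Python sketch of how they might be computed from raw samples. The nearest-rank percentile method, the function names and the sample numbers are assumptions for illustration; a metrics backend will usually do this for you.

```python
import math
from typing import List


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def conflict_rate(conflicts: int, total_events: int) -> float:
    """Conflicts as a percentage of all synced events."""
    return 100 * conflicts / total_events if total_events else 0.0


# Hypothetical sync lags in seconds (target write time minus event publish time).
lags = [2.0, 3.0, 4.0, 5.0, 20.0]
avg_lag = sum(lags) / len(lags)
p95_lag = percentile(lags, 95)
rate = conflict_rate(3, 2000)
```

Note how a single slow sync dominates P95 while barely moving the average, which is why both numbers are worth tracking.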

Instrumentation

trigger AccountTrigger on Account (after update) {
  Long startTime = System.currentTimeMillis();
  
  // ... publish events
  
  Long duration = System.currentTimeMillis() - startTime;
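  // MetricsLogger is a hypothetical Apex helper class; a __c custom object cannot expose static methods
  MetricsLogger.log('account_event_publish_ms', duration);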
}

Architect's Note: API-first isn't about APIs; it's about decoupling. When systems communicate via stable, documented interfaces instead of shared databases, you can swap components without breaking everything. Well-Architected "Modular" means loose coupling at the interface boundary.

Real-World Performance

Multi-org sync deployment, 3 orgs, 2.4M records synced/month:

  • Avg sync lag: 6.2 seconds (P95: 18 seconds)
  • Event publish overhead: 12ms per trigger execution
  • API usage: 18% of daily limit (plenty of headroom)
  • Conflict rate: 0.14%
  • DLQ depth: <10 events/day (transient network errors)

Security

Platform Event Access

  • Create dedicated integration user with Publish_Events permission set
  • Platform Event object-level security via permission sets
  • Field-level security on event fields (don't publish PII if not needed)

Middleware Authentication

  • OAuth 2.0 JWT bearer flow (not username/password)
  • Separate connected app per org
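  • Rotate the signing certificate and private key quarterly (the JWT bearer flow authenticates with a certificate, not a client secret)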
  • IP whitelisting for middleware endpoints
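
For orientation, this Python sketch builds the unsigned part of a JWT bearer assertion and the token request form. It deliberately stops short of signing: the RS256 signature needs the connected app's private key and a crypto library, which are out of scope here. The consumer key, username and 180-second lifetime are placeholder assumptions; check the current Salesforce documentation for the allowed expiry window.

```python
import base64
import json
import time
from typing import Dict

GRANT_TYPE = "urn:ietf:params:oauth:grant-type:jwt-bearer"


def b64url(data: bytes) -> str:
    """Base64url without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def unsigned_assertion(consumer_key: str, username: str, audience: str, now: float) -> str:
    """Return '<header>.<claims>'; an RS256 signature segment must still be appended."""
    header = {"alg": "RS256"}
    claims = {
        "iss": consumer_key,     # the connected app's consumer key
        "sub": username,         # the dedicated integration user
        "aud": audience,         # the org's login URL
        "exp": int(now) + 180,   # assumption: short-lived assertion
    }
    return ".".join(b64url(json.dumps(part).encode("utf-8")) for part in (header, claims))


def token_request_form(signed_assertion: str) -> Dict[str, str]:
    """Form fields to POST to the org's /services/oauth2/token endpoint."""
    return {"grant_type": GRANT_TYPE, "assertion": signed_assertion}


signing_input = unsigned_assertion(
    consumer_key="HYPOTHETICAL_CONSUMER_KEY",
    username="integration@example.com",
    audience="https://login.salesforce.com",
    now=time.time(),
)
```

With one connected app per org, the middleware keeps one consumer key and one key pair per org, so revoking access to a single org does not affect the others.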

When NOT to Use Platform Events

  • Real-time response required (<100ms): use synchronous REST callout
  • Guaranteed ordering critical: use Change Data Capture instead
  • Extremely high volume (>1M events/day): may hit event delivery limits

Migration Path

Already have tight coupling? Incremental migration:

  1. Add Platform Event publishing alongside existing callouts (dual-write)
  2. Build middleware to consume events and write to external systems
  3. Validate middleware output matches direct callouts (run in parallel for 30 days)
  4. Cut over: disable direct callouts, rely on middleware
  5. Remove callout code from Apex
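
Step 3 is where most migrations stall, so it is worth automating. A rough Python sketch of the parallel-run comparison, assuming both paths log the payload they sent per Global ID (the `diff_outputs` function and the sample data are hypothetical):

```python
from typing import Any, Dict, List

Payloads = Dict[str, Dict[str, Any]]  # keyed by Global ID


def diff_outputs(direct: Payloads, middleware: Payloads) -> Dict[str, List[str]]:
    """Compare what the legacy callouts sent with what the middleware sent."""
    shared = direct.keys() & middleware.keys()
    return {
        "missing_in_middleware": sorted(direct.keys() - middleware.keys()),
        "unexpected_in_middleware": sorted(middleware.keys() - direct.keys()),
        "mismatched": sorted(key for key in shared if direct[key] != middleware[key]),
    }


direct_log = {
    "CUST-001": {"Name": "Acme Corp"},
    "CUST-002": {"Name": "Globex"},
    "CUST-003": {"Name": "Initech"},
}
middleware_log = {
    "CUST-001": {"Name": "Acme Corp"},
    "CUST-002": {"Name": "Globex Inc"},
}
report = diff_outputs(direct_log, middleware_log)
```

Cut over only once this report stays empty for the whole parallel-run window.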

Checklist

  1. Define Platform Event schema with Global_ID__c, Object_Type__c, Operation__c
  2. Publish events from triggers (bulkified, batched)
  3. Build middleware to subscribe and route
  4. Expose REST APIs for inbound writes (use Global_ID__c for upserts)
  5. Implement idempotency, retries, DLQ
  6. Add observability (metrics, logs, alerts)
  7. Secure with OAuth JWT, IP whitelisting, permission sets
  8. Test failover: kill middleware, verify DLQ accumulates, restart, verify replay
