August 5, 2024 Data Backup & Recovery

The Golden Rule of Backups: You're Not Backed Up Until You've Tested Recovery

Having backups ≠ being able to recover. The uncomfortable truth about Salesforce backup vendors. How to run recovery drills that actually prepare you for disasters.

By Tyler Colby

The False Sense of Security

You pay $15,000/year for Salesforce backups.

Your backup vendor has a dashboard. Green checkmarks everywhere. "Backup successful." Daily snapshots for the past 90 days. Metadata and data. All good.

You feel safe.

You shouldn't.

Because here's the question nobody asks until disaster strikes:

Can you actually restore from those backups?

Not "the backup vendor says you can restore." Not "the marketing page promises recovery." Can you and your team, in your own org, recover your data in a timeframe that matters?

If you haven't tested it, the answer is: you don't know.

And "you don't know" is not a backup strategy. It's hope. And hope is not a disaster recovery plan.

The Uncomfortable Truth About Backup Vendors

Backup vendors are excellent at one thing: creating backups.

They're mediocre at another thing: helping you recover.

Here's what actually happens when disaster strikes:

Scenario: You Need Emergency Recovery

It's 8:47 AM on Tuesday. Someone just accidentally deleted 50,000 Account records. Sales team is panicking. You need these back NOW.

You open your backup vendor's portal. Click "Restore."

What happens next:

  1. Select backup point: Yesterday's backup (24 hours old)
  2. Select objects: Account (check). Related objects? Contact, Opportunity, Case... you select them all to be safe
  3. Click "Restore"
  4. Estimate: 4-6 hours

Okay. Not ideal, but you can work with 4-6 hours.

6 hours later: Restore completes. You check the org.

The 50,000 Accounts are back. But:

  • Opportunities are linked to the wrong Accounts (relationship IDs don't match)
  • Contacts that were created in the past 24 hours are missing (they weren't in yesterday's backup)
  • Account fields that changed in the past 24 hours reverted to old values
  • Duplicate Accounts now exist (restored from backup, plus the ones users manually recreated during the outage)

You didn't restore your data. You created a data integrity disaster.

Now you need to:

  • Deduplicate Accounts (manually or via tool)
  • Reconcile field values (which version is correct?)
  • Fix broken relationships (re-link Opportunities to correct Accounts)
  • Merge manually created records with restored ones

Timeline: 2-3 days of manual work. Sales operations halted. Customer data questionable.

This is not recovery. This is damage control.

Architect's Note: Backup vendors typically capture data snapshots but don't preserve referential integrity metadata needed for clean restores. Salesforce architects recommend implementing immutable backup strategies with External ID preservation—every record should have a consistent External ID across all backups, enabling upsert-based recovery instead of insert-only restores. The Well-Architected principle of Trusted requires that backups be tested regularly (quarterly minimum) with metrics: Recovery Time Objective (RTO), Recovery Point Objective (RPO), and Data Integrity Score post-recovery.
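The difference between an insert-only restore and an upsert keyed on a preserved External ID can be sketched in a few lines. This is an illustrative model, not any vendor's API: `Global_ID__c` stands in for the External ID field, and records are plain dicts.

```python
# Sketch: insert-only restore vs. upsert-based restore keyed on an
# External ID. Field names and record shapes are illustrative assumptions.

def insert_restore(org_records, backup_records):
    """Naive restore: re-inserts every backup record, duplicating
    anything that still exists (or was manually recreated) in the org."""
    return org_records + backup_records

def upsert_restore(org_records, backup_records, ext_id="Global_ID__c"):
    """Upsert restore: match on the External ID, so surviving records are
    updated in place and only truly missing records are recreated."""
    by_id = {rec[ext_id]: dict(rec) for rec in org_records}
    for rec in backup_records:
        if rec[ext_id] in by_id:
            by_id[rec[ext_id]].update(rec)   # update the existing record
        else:
            by_id[rec[ext_id]] = dict(rec)   # recreate the deleted record
    return list(by_id.values())

org = [{"Global_ID__c": "A-1", "Name": "Acme (renamed)"}]      # survived
backup = [{"Global_ID__c": "A-1", "Name": "Acme"},
          {"Global_ID__c": "A-2", "Name": "Globex"}]           # A-2 was deleted

print(len(insert_restore(org, backup)))   # 3 records: A-1 is duplicated
print(len(upsert_restore(org, backup)))   # 2 records: no duplicates
```

Because the External ID never changes across restores, the upsert path is also what makes relationship repair tractable later.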

Why Most Backups Fail During Recovery

Problem 1: Point-in-Time vs. Current State Mismatch

Your backup ran yesterday at 2 AM. It's now 9 AM today, 31 hours later.

In the past 31 hours, your org has changed:

  • 247 new Accounts created
  • 1,832 Opportunities updated
  • 4,194 Contacts modified
  • 89 Accounts deleted (intentionally)

You restore from backup. What happens to those 31 hours of changes?

Option A: Backup vendor overwrites everything. You lose 31 hours of work.
Option B: Backup vendor creates duplicates. You now have to reconcile.
Option C: Backup vendor skips records that exist. Missing data stays missing.

None of these are good.

The correct answer is Option D: 3-way merge with conflict resolution—but most backup vendors don't do this. It's too complex. Too expensive. Too likely to cause support tickets.
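A 3-way merge compares three versions of each record: the backup snapshot (the base), the current org value, and the value being restored. A minimal field-level sketch, assuming records are plain dicts; a real implementation would also need per-object conflict policies:

```python
def three_way_merge(base, ours, theirs):
    """Field-level 3-way merge. base = backup snapshot, ours = current org
    value, theirs = value being restored. Returns (merged, conflicts)."""
    merged, conflicts = {}, []
    for field in set(base) | set(ours) | set(theirs):
        b, o, t = base.get(field), ours.get(field), theirs.get(field)
        if o == t:              # both sides agree: nothing to decide
            merged[field] = o
        elif o == b:            # only the restore changed it: take it
            merged[field] = t
        elif t == b:            # only the live org changed it: keep the
            merged[field] = o   # 31 hours of post-backup work
        else:                   # both changed: flag for a human decision
            merged[field] = o
            conflicts.append(field)
    return merged, conflicts

base   = {"Phone": "555-0100", "Industry": "Tech",    "Rating": "Warm"}
ours   = {"Phone": "555-0199", "Industry": "Tech",    "Rating": "Hot"}
theirs = {"Phone": "555-0100", "Industry": "Biotech", "Rating": "Cold"}

merged, conflicts = three_way_merge(base, ours, theirs)
# merged keeps the org's Phone edit, takes the restored Industry,
# and flags Rating as a conflict
```

This is exactly the logic version-control merges use; the hard part for a backup vendor is doing it at scale across millions of records, which is why most don't.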

Problem 2: Relationship Integrity

Salesforce objects are connected by ID references. Account has related Opportunities. Opportunities have related OpportunityLineItems. OpportunityLineItems reference Products and PricebookEntries.

When you delete an Account and restore it, Salesforce assigns a new ID.

Old Account ID: 001abc123XYZ
Restored Account ID: 001xyz789ABC

All the Opportunities that referenced 001abc123XYZ? They now point to nothing. Relationship broken.

Most backup vendors restore data without fixing relationships. You get the records back, but they're orphaned. Disconnected. Useless.

Fixing this requires:

  1. Mapping old IDs to new IDs
  2. Updating all child records to reference new parent IDs
  3. Handling circular references (Opportunity → Account → Primary Contact → Opportunity)
  4. Preserving master-detail cascade delete rules

This is possible. But it's not automatic. And if your backup vendor doesn't provide this, you're doing it manually.
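The ID remapping in steps 1 and 2 amounts to one pass over the child records. A sketch, where the old-to-new ID map and the lookup field name are illustrative assumptions:

```python
def remap_lookups(children, id_map, lookup_field):
    """Re-point child lookup fields from old (pre-delete) parent IDs to
    the new IDs Salesforce assigned on restore. Children whose parent was
    never restored come back as orphans for manual follow-up."""
    fixed, orphans = [], []
    for child in children:
        old_id = child.get(lookup_field)
        if old_id in id_map:
            fixed.append({**child, lookup_field: id_map[old_id]})
        else:
            orphans.append(child)   # parent missing: still broken
    return fixed, orphans

id_map = {"001abc123XYZ": "001xyz789ABC"}   # old ID -> restored ID
opportunities = [
    {"Name": "Big Deal",    "AccountId": "001abc123XYZ"},
    {"Name": "Lost Parent", "AccountId": "001missing000"},
]
fixed, orphans = remap_lookups(opportunities, id_map, "AccountId")
```

Circular references (step 3) need a second pass after all parents exist, and master-detail fields (step 4) can't be orphaned at all, so the restore order matters.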

Architect's Note: Relationship-aware recovery requires External IDs on all major objects. Salesforce architects recommend implementing a Global_ID__c field (External ID, unique, case-sensitive) on Account, Contact, Opportunity, and custom objects. During recovery, use upsert operations with External ID matching—this preserves relationships because External IDs don't change across restore operations. The Composite API supports relationship-aware inserts via nested sobject structures, enabling parent + children restoration in a single atomic transaction.

Problem 3: Metadata Drift

Your backup was taken when you had 350 custom fields on Account. You've since added 12 new fields.

You restore from backup. Those 12 new fields? Null for all restored records.

Or worse: You've removed 5 fields since the backup was taken. The backup tries to restore data to fields that no longer exist. Restore fails.

Backup vendors handle this in different ways:

  • Fail the restore entirely: "Schema mismatch detected"
  • Skip unmapped fields: Data loss, no warning
  • Require manual field mapping: You specify which old fields map to which new fields

All of these are painful during an emergency.
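A schema diff run before the restore predicts both failure modes in advance. A minimal sketch, assuming you can export the field list captured with the backup and the org's current field list:

```python
def schema_diff(backup_fields, current_fields):
    """Compare the field list captured with a backup against the org's
    current schema, to predict restore problems before executing it."""
    backup_set, current_set = set(backup_fields), set(current_fields)
    return {
        # Fields added since the backup: will be null on restored records.
        "added_since_backup": sorted(current_set - backup_set),
        # Fields removed since the backup: may fail or be silently dropped.
        "removed_since_backup": sorted(backup_set - current_set),
    }

diff = schema_diff(
    backup_fields=["Name", "Phone", "Old_Score__c"],
    current_fields=["Name", "Phone", "New_Tier__c"],
)
# diff flags New_Tier__c as null-after-restore and Old_Score__c as unmappable
```

Running this check nightly, rather than mid-crisis, turns "schema mismatch detected" from a restore-blocking surprise into a known, pre-mapped condition.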

Problem 4: Validation Rule Conflicts

Your backup was taken before you implemented strict validation rules. You restore old data that doesn't meet current validation standards.

Salesforce rejects the restore: "Required field missing" or "Invalid picklist value" or "Formula field constraint violated."

You have two choices:

  1. Disable validation rules: Restore succeeds, but you've just allowed invalid data into production
  2. Clean the data before restore: Manual ETL process, 24-48 hour delay

Neither is acceptable during a crisis.
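A third path is to automate the triage: run the current validation logic over the backup data before restoring, so failures land in a cleanup queue instead of aborting the whole restore. A sketch, with rules modeled as simple predicates (an assumption; real validation rules would have to be re-expressed in code or checked via a sandbox dry run):

```python
def pre_restore_validate(records, rules):
    """Split backup records into restorable-now vs. needs-cleanup,
    using the org's *current* validation logic as predicates."""
    ok, needs_cleanup = [], []
    for rec in records:
        failed = [name for name, rule in rules.items() if not rule(rec)]
        if failed:
            needs_cleanup.append({"record": rec, "failed_rules": failed})
        else:
            ok.append(rec)
    return ok, needs_cleanup

# Hypothetical rules mirroring validation added after the backup was taken
rules = {
    "industry_required": lambda r: bool(r.get("Industry")),
    "rating_valid": lambda r: r.get("Rating") in {"Hot", "Warm", "Cold", None},
}
records = [{"Name": "Acme", "Industry": "Tech"},
           {"Name": "NoIndustry Inc"}]
ok, cleanup = pre_restore_validate(records, rules)
```

The clean majority restores immediately; only the flagged minority waits for the manual ETL pass.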

The Golden Rule

You're not backed up unless you've tested recovery.

And by "tested recovery," I don't mean:

  • ❌ Reading the backup vendor's documentation
  • ❌ Watching the vendor's demo video
  • ❌ Restoring a single test record to a sandbox

I mean:

  • Full-scale recovery drill in a sandbox org
  • Restore production volume (not 10 records—restore 100K records and see what breaks)
  • Verify relationships (are Opportunities still linked to Accounts?)
  • Measure recovery time (how long did it actually take?)
  • Document gaps (what failed? what required manual intervention?)
  • Repeat quarterly (schemas change, vendors change, processes change)

If you haven't done all of this, you have backups. But you don't have recovery capability.

How to Run a Recovery Drill

Step 1: Create a Test Scenario

Don't test generic "restore everything." Test realistic disaster scenarios:

  • Scenario A: Accidental mass delete of Accounts (5,000 records)
  • Scenario B: Corrupted Opportunity data (mass update overwrote critical fields)
  • Scenario C: Full org restore (simulate ransomware or complete data loss)

Pick one scenario per drill. Be specific.

Step 2: Refresh a Full Sandbox

Use a Full Copy sandbox (or at minimum, Partial Copy with production data volume).

Developer sandboxes with 10 records don't test anything meaningful. You need production scale to find problems.

Step 3: Simulate the Disaster

In your sandbox:

  • Delete the 5,000 Accounts (or whatever your scenario requires)
  • Document: deletion timestamp, number of records, affected objects

Step 4: Attempt Recovery

Use your backup vendor's tools to restore:

  • Select the backup point (e.g., previous day's snapshot)
  • Configure restore options (overwrite? upsert? insert?)
  • Execute restore
  • Start the timer

Step 5: Validate Recovery

After restore completes, verify:

  • Record Count Match. Pass criteria: all deleted records restored (5,000/5,000). Failure impact: data loss.
  • Relationship Integrity. Pass criteria: all Opportunities linked to the correct Accounts. Failure impact: broken references.
  • Field Value Accuracy. Pass criteria: spot-check of 50 records shows fields match expected values. Failure impact: data corruption.
  • No Duplicate Creation. Pass criteria: zero duplicate Accounts created. Failure impact: deduplication required.
  • Validation Rules Pass. Pass criteria: all restored records pass current validation. Failure impact: invalid data in the org.
  • Incremental Changes Preserved. Pass criteria: records created after the backup still exist. Failure impact: loss of recent changes.
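Several of these checks can be scripted rather than eyeballed. A sketch of the first three, assuming `Global_ID__c` as the match key and `AccountId` holding the parent reference (both illustrative names):

```python
def run_drill_checks(restored_accounts, opportunities, expected_count):
    """Automated post-restore validation: count match, duplicate
    detection, and relationship integrity. Returns pass/fail per check."""
    ids = [a["Global_ID__c"] for a in restored_accounts]
    return {
        "record_count_match": len(restored_accounts) == expected_count,
        "no_duplicates": len(ids) == len(set(ids)),
        "relationship_integrity": all(
            o["AccountId"] in ids for o in opportunities
        ),
    }

accounts = [{"Global_ID__c": "A-1"}, {"Global_ID__c": "A-2"}]
opps = [{"AccountId": "A-1"},
        {"AccountId": "A-9"}]   # orphan: its parent was never restored
checks = run_drill_checks(accounts, opps, expected_count=2)
# counts and duplicates pass; relationship_integrity fails on the orphan
```

Scripting the checks also means every drill produces comparable numbers, which is what Step 6 needs.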

Step 6: Measure Recovery Metrics

Document the results:

  • Recovery Time: How long from "start restore" to "validated data"?
  • Data Integrity Score: Percentage of validations passed
  • Manual Intervention Required: List of steps that weren't automated
  • Gaps Identified: What failed? What's missing?
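The first two metrics fall out mechanically once the drill checks are scripted. A sketch (check names are illustrative):

```python
from datetime import datetime

def drill_metrics(restore_started, data_validated, checks):
    """Recovery Time = start of restore to validated data.
    Data Integrity Score = percentage of drill checks that passed."""
    rto_hours = (data_validated - restore_started).total_seconds() / 3600
    integrity = 100 * sum(checks.values()) / len(checks)
    return {"recovery_time_hours": round(rto_hours, 1),
            "data_integrity_pct": round(integrity, 1)}

checks = {"record_count_match": True, "no_duplicates": True,
          "relationship_integrity": False, "field_accuracy": True}
metrics = drill_metrics(datetime(2024, 8, 5, 8, 47),
                        datetime(2024, 8, 5, 14, 47), checks)
# 6 hours of recovery time, 3 of 4 checks passed (75%)
```

Tracking these two numbers across quarterly drills is what turns "we think we can recover" into a trend line you can show leadership.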

Step 7: Document and Remediate

Create a post-drill report:

  • What worked
  • What failed
  • Root cause analysis for failures
  • Remediation plan (what needs to change before next drill)

Share with leadership. If recovery doesn't meet business requirements (e.g., "we need 4-hour RTO but drill showed 12 hours"), escalate.

Real Drill Results

Company A: Healthcare SaaS

Drill Scenario: Restore 50,000 deleted Patient records
Expected Recovery Time: 4 hours (per vendor documentation)
Actual Recovery Time: 14 hours
Data Integrity Score: 68%

Failures:

  • Relationships between Patient and Treatment Plan broken (32% of records orphaned)
  • Backup didn't include related Case records (assumed they'd be restored automatically—they weren't)
  • 3,400 duplicate Patient records created (no External ID matching)
  • Manual deduplication required: 18 hours of admin time

Outcome: Leadership approved $45K investment in External ID implementation + relationship-aware recovery tooling. Next drill: 4.5 hours, 97% data integrity.

Company B: Financial Services

Drill Scenario: Full org restore (simulate ransomware)
Expected Recovery Time: 24 hours
Actual Recovery Time: Restore failed after 6 hours
Data Integrity Score: N/A (restore aborted)

Failures:

  • Metadata schema mismatch: 47 fields added to production since backup
  • Backup vendor required manual field mapping (not automated)
  • Validation rules implemented after backup prevented restore
  • No documented process for handling schema drift

Outcome: Implemented nightly metadata snapshots + schema versioning. Migrated to backup vendor that supports schema evolution. Next drill: 22 hours, 94% data integrity.

Architect's Note: Recovery drills expose gaps that documentation never reveals. Salesforce architects recommend quarterly recovery drills with rotating scenarios—don't test the same disaster twice. The Well-Architected Adaptable principle means your recovery process must evolve as your org evolves. Document RTO (Recovery Time Objective) and RPO (Recovery Point Objective) requirements from business stakeholders, then validate you can meet them. If drills show you can't meet requirements, escalate immediately—waiting for a real disaster to discover this gap is career-limiting.

What "Recovery-Ready" Actually Looks Like

Organizations that can reliably recover have:

1. External IDs on All Major Objects

Every Account, Contact, Opportunity, and custom object has a Global_ID__c field (External ID, unique).

This enables upsert-based recovery: restore operations use External ID matching, preserving relationships and avoiding duplicates.

2. Relationship Metadata Preservation

Backups include not just record data, but relationship structure:

  • Parent-child mappings (which Opportunity belongs to which Account)
  • Lookup field values (stored as External IDs, not Salesforce IDs)
  • Master-detail relationships (preserved during restore)

3. Incremental Backup Strategy

Not just daily snapshots. Use Change Data Capture to stream all changes to external storage.

This enables point-in-time recovery: restore yesterday's 2 AM snapshot, then replay the CDC events to bring the org back to its current state (or to any state between the backup and now).
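The replay step is a fold over the event stream. A sketch; the event shape (`ts`, `op`, `id`, `fields`) is an assumption standing in for real CDC payloads, which carry this information in their ChangeEventHeader:

```python
def point_in_time_recover(snapshot, cdc_events, up_to):
    """Rebuild org state at any moment: start from the nightly snapshot,
    then replay change events in timestamp order up to `up_to`."""
    state = {r["Global_ID__c"]: dict(r) for r in snapshot}
    for event in sorted(cdc_events, key=lambda e: e["ts"]):
        if event["ts"] > up_to:
            break                         # stop at the target moment
        if event["op"] == "DELETE":
            state.pop(event["id"], None)
        else:                             # CREATE or UPDATE
            record = state.setdefault(event["id"],
                                      {"Global_ID__c": event["id"]})
            record.update(event["fields"])
    return list(state.values())

snapshot = [{"Global_ID__c": "A-1", "Name": "Acme"}]
events = [
    {"ts": 1, "op": "CREATE", "id": "A-2", "fields": {"Name": "Globex"}},
    {"ts": 2, "op": "UPDATE", "id": "A-1", "fields": {"Name": "Acme Corp"}},
    {"ts": 3, "op": "DELETE", "id": "A-1", "fields": {}},
]
# Recover to just before the bad delete at ts=3
state = point_in_time_recover(snapshot, events, up_to=2)
```

Choosing `up_to` just before the disaster timestamp is what closes the 31-hour gap that snapshot-only restores leave open.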

4. Metadata Versioning

Store metadata snapshots alongside data backups. When you restore data from 30 days ago, you know exactly what the schema looked like.

This enables schema-aware recovery: transformation logic to map old fields to new fields, handle removed fields, populate new required fields.

5. Documented Recovery Procedures

Step-by-step playbooks for common scenarios:

  • Accidental mass deletion recovery
  • Corrupted field recovery (restore just one field across all records)
  • Full org restore
  • Partial object restore (Accounts only, preserve relationships)

Playbooks include: commands to run, vendor-specific steps, validation checks, expected timelines.

6. Quarterly Recovery Drills

Scheduled. Non-negotiable. Documented. Measured.

If you skip drills, you're not recovery-ready. You're hoping nothing breaks.

The Bottom Line

Paying for backups ≠ having recovery capability.

Most organizations discover this during their first real disaster—when it's too late to fix.

The Golden Rule: You're not backed up unless you've tested recovery.

If you haven't run a full-scale recovery drill in the past 90 days, you don't know if you can recover.

And if you don't know, assume you can't.

Because when disaster strikes at 3:47 AM on a Friday, "I thought we could recover" is not a plan.

Test your backups. Now. Before you need them.

Need Help Testing Your Backup Recovery?

We offer recovery drill planning and execution. We'll design realistic disaster scenarios, run full-scale recovery tests in your sandbox, validate data integrity, and document gaps. Get a real answer to: "Can we actually recover?"