Core Concept
A/A Testing: Validating Your Experimentation Platform
Before you trust your experiment results, run an A/A test to confirm your platform's randomization, metrics, and instrumentation are working correctly.
An A/A test assigns users to two groups but gives both groups the exact same experience with no treatment applied. Because nothing was changed, you should see no statistically significant difference between the groups. If you do, something is wrong with your experimentation setup.
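To make the idea concrete, here is a minimal simulated A/A test in Python; the sample size, 10% conversion rate, and the chi-square test are illustrative assumptions, not prescriptions from the text. Both groups draw from the same underlying behavior, so a significant difference should be rare.

```python
# Minimal simulated A/A test (sample size, 10% conversion rate, and the
# chi-square test are illustrative assumptions).
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_users = 100_000
conversion_rate = 0.10  # both groups get the same experience, hence the same rate

# Randomly bucket users into two groups and record a binary conversion outcome
assignment = rng.integers(0, 2, size=n_users)
converted = rng.random(n_users) < conversion_rate

a1 = converted[assignment == 0]
a2 = converted[assignment == 1]

# 2x2 contingency table: conversions vs. non-conversions per group
table = [[a1.sum(), len(a1) - a1.sum()],
         [a2.sum(), len(a2) - a2.sum()]]
chi2, p_value, dof, expected = stats.chi2_contingency(table)

print(f"A1 rate: {a1.mean():.4f}  A2 rate: {a2.mean():.4f}  p = {p_value:.3f}")
# With a healthy setup, p < 0.05 here should occur only about 5% of the time.
```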
What an A/A test validates
- Randomization integrity: confirms the assignment engine splits users into balanced, unbiased groups (see the sample-ratio check sketched after this list)
- Metric pipeline accuracy: verifies that your data collection and metric computation don't produce spurious significance
- Instrumentation quality: catches logging bugs, selection bias, or data leakage that could contaminate real experiment results
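One common way to check the "balanced groups" part of randomization integrity is a sample ratio check. The sketch below assumes an intended 50/50 split and uses hypothetical group counts with a chi-square goodness-of-fit test; it is one possible check, not the only one.

```python
# Hypothetical sample-ratio check for the "balanced groups" point above,
# assuming an intended 50/50 split (the observed counts are made up).
from scipy import stats

observed = [50_421, 49_579]                 # users bucketed into A1 and A2
intended_split = [0.5, 0.5]                 # configured traffic allocation
expected = [p * sum(observed) for p in intended_split]

chi2, p_value = stats.chisquare(f_obs=observed, f_exp=expected)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")

# A very small p-value (p < 0.001 is a common alarm threshold) indicates a
# sample ratio mismatch: the assignment engine is not splitting traffic the
# way it was configured to.
```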
Real-world example
Run at least two A/A tests before trusting a new experimentation tool, each for your typical experiment duration (minimum 14 days). Across many such tests, you should see p > 0.05 (no significant difference) roughly 95% of the time; the remaining ~5% of "significant" results are exactly the false positive rate a 0.05 threshold implies. An occasional significant A/A result is therefore expected, but repeated significance signals a problem with the setup.
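To see where the 95% figure comes from, a quick simulation of many A/A tests shows the false positive rate settling near the significance threshold. The metric, sample sizes, and t-test here are illustrative choices.

```python
# Quick simulation of many A/A tests to show where the ~95% / ~5% split
# comes from (metric, sample sizes, and the t-test are illustrative choices).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_tests, n_per_group, rate, alpha = 1_000, 20_000, 0.10, 0.05

false_positives = 0
for _ in range(n_tests):
    # Both "variants" draw from the same distribution: no real effect exists
    a1 = (rng.random(n_per_group) < rate).astype(float)
    a2 = (rng.random(n_per_group) < rate).astype(float)
    _, p = stats.ttest_ind(a1, a2)
    if p < alpha:
        false_positives += 1

print(f"significant A/A results: {false_positives / n_tests:.1%} (expect ~{alpha:.0%})")
```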
When to re-run A/A tests
- After any changes to user assignment logic or traffic bucketing
- After modifying event tracking, data pipelines, or metric computation
- Quarterly, as a routine health check of your experimentation infrastructure
Beyond the theory
If you've got the theory down, see how it plays out in the simulator.