Statistical Power (1 - β): Detecting Real Effects
Power measures your experiment's ability to detect a real effect when one truly exists. Low power means you risk missing genuine improvements.
Statistical power (1 - β, where β is the Type II error rate) is the probability that your experiment will correctly detect a real effect when one exists. A Type II error (false negative) occurs when a genuine improvement exists but your experiment fails to detect it, meaning you might abandon a winning idea.
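Power can also be estimated empirically: simulate many experiments in which a real lift exists, and count how often a significance test detects it. The sketch below does this for a two-proportion z-test; the function name, the 10% → 12% conversion rates, and the sample sizes are illustrative assumptions, not figures from this article.

```python
import random
from statistics import NormalDist

def simulate_power(p_control, p_treatment, n_per_group,
                   alpha=0.05, trials=400, seed=7):
    """Monte Carlo estimate of power (1 - beta): the fraction of
    simulated experiments where a two-proportion z-test rejects H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 at alpha = 0.05
    detections = 0
    for _ in range(trials):
        conv_c = sum(rng.random() < p_control for _ in range(n_per_group))
        conv_t = sum(rng.random() < p_treatment for _ in range(n_per_group))
        p_pool = (conv_c + conv_t) / (2 * n_per_group)
        se = (2 * p_pool * (1 - p_pool) / n_per_group) ** 0.5
        z = abs(conv_t - conv_c) / n_per_group / se
        detections += z > z_crit
    return detections / trials

# a real 10% -> 12% lift, 4,000 users per group: detected roughly 80% of the time
print(simulate_power(0.10, 0.12, n_per_group=4000))
```

Each run where the test stays non-significant despite the real lift is exactly the Type II error described above: a winning idea the experiment would have missed.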
Power = 80% (industry standard)
If a real effect of your specified size exists, you have an 80% chance of detecting it
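The 80% convention feeds directly into sample-size planning. A minimal sketch under the usual normal approximation for a two-proportion test follows; the function name, the 10% baseline rate, and the 2-point lift are illustrative assumptions.

```python
import math
from statistics import NormalDist

def sample_size_per_group(p_base, mde_abs, alpha=0.05, power=0.80):
    """Approximate users per group needed to detect an absolute lift of
    `mde_abs` over baseline rate `p_base` at the given alpha and power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 for alpha = 0.05
    z_power = NormalDist().inv_cdf(power)          # 0.84 for power = 0.80
    p_mid = p_base + mde_abs / 2                   # variance taken at the midpoint rate
    return math.ceil((z_alpha + z_power) ** 2
                     * 2 * p_mid * (1 - p_mid) / mde_abs ** 2)

# baseline 10% conversion, detect a 2-point absolute lift at 80% power
print(sample_size_per_group(0.10, 0.02))
```

Note how the minimum detectable effect enters squared in the denominator: halving the effect you want to detect roughly quadruples the required sample.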
Common pitfall
Never run experiments with power below 80%. You'd have a greater than 1-in-5 chance of missing a real improvement. If your available traffic is too low, extend the experiment duration to accumulate a larger sample, or narrow the targeting to a more homogeneous segment so the metric's variance drops. Do not accept low power as a tradeoff.
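To see why extending the duration (and therefore the accumulated sample) rescues an underpowered test, power can be sketched in closed form for a two-proportion test under the normal approximation. The 10% → 12% rates and the three sample sizes are illustrative assumptions.

```python
from statistics import NormalDist

def approx_power(p_control, p_treatment, n_per_group, alpha=0.05):
    """Normal-approximation power of a two-sided two-proportion z-test."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    se = ((p_control * (1 - p_control)
           + p_treatment * (1 - p_treatment)) / n_per_group) ** 0.5
    return nd.cdf(abs(p_treatment - p_control) / se - z_crit)

# the same real 10% -> 12% lift, at three experiment lengths:
for n in (1000, 2000, 4000):
    print(n, round(approx_power(0.10, 0.12, n), 2))  # power climbs with sample size
```

At 1,000 users per group this test is badly underpowered; doubling the run time twice brings it up to roughly the 80% standard.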
Best practice
For high-impact, strategic experiments (major product redesigns, pricing changes), target power = 90% to minimize the risk of missing a meaningful effect.
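One way to cost out that stricter target: under the same normal approximation, the required sample size scales with (z_alpha + z_power)², so the price of moving from 80% to 90% power can be computed directly. A sketch; the function name is illustrative.

```python
from statistics import NormalDist

def sample_size_multiplier(power_hi=0.90, power_lo=0.80, alpha=0.05):
    """How much more sample power_hi needs than power_lo for the same
    effect and alpha: n is proportional to (z_alpha + z_power)**2."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    ratio = ((z_a + NormalDist().inv_cdf(power_hi))
             / (z_a + NormalDist().inv_cdf(power_lo)))
    return ratio ** 2

# raising power from 80% to 90% costs roughly a third more traffic
print(round(sample_size_multiplier(), 2))  # ~1.34
```

That extra third of traffic is the premium you pay to cut the miss rate from 20% to 10%, which is usually worthwhile for strategic bets.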
Beyond the theory
If you've got the theory down, see how it plays out in the simulator.