pkhuong
One easy slip-up when discussing p-values in the context of a workflow or decision-making process: accepting changes at p < 0.05 gives us no bound on the fraction of accepted changes that are actually good rather than merely lucky. That fraction depends on what we feed into the test. If we only consider good changes, the fraction of false positives among accepted changes is 0%; if we only consider bad changes, it is 100%. Hypothesis testing is no replacement for insight or taste.
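To put rough numbers on this, here is a minimal sketch of the arithmetic. The function name, the `prior_good` parameter (the share of candidate changes that are actually good), and the 80% power figure are all assumptions for illustration; only the 0.05 threshold comes from the comment above.

```python
def false_positive_fraction(prior_good, alpha=0.05, power=0.8):
    """Fraction of accepted changes that are lucky rather than good.

    prior_good: assumed share of candidate changes that are truly good.
    alpha:      significance threshold (chance a bad change passes).
    power:      assumed chance a good change passes (hypothetical value).
    """
    true_positives = prior_good * power
    false_positives = (1.0 - prior_good) * alpha
    accepted = true_positives + false_positives
    if accepted == 0:
        raise ValueError("no change is ever accepted with these parameters")
    return false_positives / accepted


if __name__ == "__main__":
    # Same threshold every time; only the mix of candidates varies.
    for prior_good in (1.0, 0.5, 0.1, 0.01, 0.0):
        fraction = false_positive_fraction(prior_good)
        print(f"prior_good={prior_good:4.2f} -> lucky fraction {fraction:.1%}")
```

With the same p < 0.05 threshold, the lucky fraction runs from 0% (all candidates good) to 100% (all candidates bad). Under these assumed numbers, it is already 36% when one candidate in ten is actually good.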