The idea is not to do science. The idea is to loosely systematize and conceptualize innovation: to generate options and create a failure-tolerant system.
I'm sure improvements could be made... but this isn't about being a valid or invalid experiment.
> The idea is not to do science. The idea is to loosely systematize and conceptualize innovation.
Why are you acting like these are completely different frameworks? You have the same goals.

When you A/B test, mistakes are generally reversible and won't bankrupt your company or cost you your job. Something being a 1-in-20 fluke is acceptable risk; you'll get most decisions right. Compare that to hairy decisions like entering a new market or creating a new product line: there are no A/B tests or scientific frameworks there. You gather all the evidence you can, estimate the risk, and make a decision.
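To put rough numbers on that (purely illustrative, and assuming the worst case where every tested change is actually neutral):

```python
# Illustrative only: at a 5% significance threshold, how many "winning"
# A/B decisions are flukes in the worst case where no change has any
# real effect? Both numbers below are made up for the example.
alpha = 0.05      # the conventional 1-in-20 false-positive rate
decisions = 200   # hypothetical number of A/B-tested ship decisions

print(f"expected flukes: ~{decisions * alpha:.0f} of {decisions}")
# ~10 of 200: tolerable when each decision is a reversible UI tweak,
# catastrophic if each were a market entry or a new product line.
```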
> The problem is there can be a large number of potential comparisons when the details of data analysis are highly contingent on data, without the researcher having to perform any conscious procedure of fishing or examining multiple p-values. We discuss in the context of several examples of published papers where data-analysis decisions were theoretically-motivated based on previous literature, but where the details of data selection and analysis were not pre-specified and, as a result, were contingent on data.
Not only are experiments commonly multi-arm, you also repeat your experiment (usually after making some changes) if the previous one failed or didn't pass the launch criteria.
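A quick Monte Carlo sketch of what that retry loop does to the nominal 5% false-positive rate (illustrative numbers; assumes a truly neutral change and a plain two-sample z-test):

```python
import random
import statistics

def one_test(n=200, z_crit=1.96):
    """One experiment under the null: both arms draw from the same distribution."""
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    se = (statistics.pvariance(a) / n + statistics.pvariance(b) / n) ** 0.5
    z = (statistics.mean(b) - statistics.mean(a)) / se
    return abs(z) > z_crit  # "significant" at ~5% despite no real effect

random.seed(0)
runs = 4000
# Rerun a failed experiment up to 3 times, launching on the first "pass".
launched = sum(any(one_test() for _ in range(3)) for _ in range(runs))
print(f"effective false-positive rate with retries: {launched / runs:.1%}")
# Close to 1 - 0.95**3 ~= 14%: nearly triple the rate anyone thinks
# they're running at, before multi-arm comparisons even enter into it.
```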
This is further complicated by the fact that launch criteria are usually not well defined ahead of time. Unless it's a complete slam dunk, you won't know until the launch meeting whether the experiment will be approved. It's mostly vibes-based, determined from tens or hundreds of "relevant" metric movements, and often decided on the whim of whichever stakeholder is sitting in the launch meeting.
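And the "tens or hundreds of relevant metrics" part guarantees there will always be movements to argue about. A toy chance calculation (the 100 and 0.05 are illustrative, and real metrics are correlated, so treat this as intuition rather than a precise figure):

```python
alpha, metrics = 0.05, 100   # illustrative: 100 independent metrics at p < 0.05
p_any = 1 - (1 - alpha) ** metrics
print(f"P(at least one metric 'moves' by pure chance) = {p_any:.1%}")  # ~99.4%
# Some dashboard metric will essentially always look like it moved,
# which is exactly what a vibes-based launch meeting can latch onto.
```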