Methods of Study Design – Experiments

We are all familiar with experiments; we read about them in books and newspapers. Researchers and scientists perform experiments to test a hypothesis or to try out a new product. Unlike observational studies, experiments are performed in a controlled environment so that the effect of external factors (variables) can be reduced or eliminated. Let us look at this briefly.

Suppose the relationship between the level of exercise and blood pressure is being studied in a town. An observational study finds that people who exercise regularly have higher blood pressure, which is contrary to our expectation. It may be that those who exercise are also smoking, drinking and eating rich food, and that these habits are what raise their blood pressure. Such external factors are called Lurking Variables, and they can badly distort a study. Experiments are the best way to reduce their effect. In an experiment, the participants (subjects) would be kept in a controlled environment where they are not allowed to drink or eat junk food, so the researchers would be in a much better position to establish that exercise has a beneficial effect on blood pressure.
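
To make the idea of a lurking variable concrete, here is a minimal sketch. The records are invented purely for illustration (they are not data from any study), and smoking is the only lurking variable considered. It compares average blood pressure for exercisers and non-exercisers, first overall and then separately for smokers and non-smokers.

```python
from statistics import mean

# Hypothetical records, invented for illustration:
# (exercises, smokes, systolic blood pressure)
records = [
    (True, True, 150), (True, True, 152), (True, True, 148),
    (True, False, 118),
    (False, True, 158),
    (False, False, 124), (False, False, 126), (False, False, 122),
]

def mean_bp(exercises, smokes=None):
    """Average blood pressure of a group, optionally restricted by smoking status."""
    return mean(
        bp for ex, sm, bp in records
        if ex == exercises and (smokes is None or sm == smokes)
    )

# Exercisers minus non-exercisers: overall, then within each smoking group.
overall_gap = mean_bp(True) - mean_bp(False)
smoker_gap = mean_bp(True, smokes=True) - mean_bp(False, smokes=True)
nonsmoker_gap = mean_bp(True, smokes=False) - mean_bp(False, smokes=False)

print("overall gap:", overall_gap)
print("gap among smokers:", smoker_gap)
print("gap among non-smokers:", nonsmoker_gap)
```

With these made-up numbers, exercisers look worse overall only because most of them happen to be smokers; within each smoking group, the exercisers have the lower average.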

Bias (systematic unfairness in data collection) is a potential problem in experiments, and we need to take it into account while designing them. Randomization is a very effective way to deal with bias: subjects for an experiment should be chosen at random. For example, suppose we want to choose subjects for the blood-pressure study above. If we only choose people from the higher-income group, the sample is biased, because higher-income individuals may tend to smoke and drink less, and they can more easily afford regular visits to doctors and dieticians.
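
The difference between a biased selection and a random one can be sketched as follows. The population, the resident IDs and the income labels are all hypothetical, and the fixed seed is there only so that the sketch is repeatable.

```python
import random

# Hypothetical sampling frame: resident IDs tagged with an income group.
population = [
    (f"resident_{i}", "higher" if i % 4 == 0 else "lower")
    for i in range(1000)
]

rng = random.Random(42)  # fixed seed only so the sketch is repeatable

# Biased selection: only higher-income residents can be picked.
biased_sample = [p for p in population if p[1] == "higher"][:100]

# Simple random sample: every resident has the same chance of being picked.
random_sample = rng.sample(population, 100)

def share_higher(sample):
    """Fraction of a sample that belongs to the higher-income group."""
    return sum(1 for _, group in sample if group == "higher") / len(sample)

print("higher-income share, biased sample:", share_higher(biased_sample))
print("higher-income share, random sample:", share_higher(random_sample))
```

The biased sample consists entirely of higher-income residents, while the random sample should land somewhere near the 25% share built into this hypothetical population.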

Suppose we want to test the effectiveness of a new drug against a particular disease. We recruit subjects at random and place them in two groups: one group receives treatment with the new drug, and the other receives a fake treatment (for example, water or sugar pills instead of the actual medicine). The fake treatment is called a Placebo, and any effect observed from it is called the Placebo effect. Comparing the two groups is necessary to judge how effective the new medicine really is. Some points to note here:

The chosen subjects should be assigned to one of the two groups in a completely randomized manner to avoid bias.

The subjects shouldn’t know which treatment they are receiving. This is called Blinding. If subjects know they are receiving the fake treatment, they may take other medicines or switch to a healthier diet than the other group, which could distort the results.

Even the researchers should not know which group is receiving which treatment. A researcher who knows may expect the Placebo group to do badly and so become unconsciously biased towards the treatment group before the results are in. The method in which neither the subjects nor the researchers know who receives which treatment is called Double-Blinding.

So in conclusion, the ideal experiment is a Randomized Controlled Double-Blind Experiment.
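
Random assignment and blinding can be sketched in a few lines. This is only an illustration under simple assumptions: the function names `assign_groups` and `blind`, the subject IDs and the neutral codes "A" and "B" are all hypothetical, and a real trial would use a more careful protocol.

```python
import random

def assign_groups(subjects, seed=None):
    """Randomly split subjects into two groups whose sizes differ by at most one."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "placebo": shuffled[half:]}

def blind(assignment, seed=None):
    """Hide the group names behind neutral codes.

    Returns the blinded groups and the key that maps each code back to
    its real group. In a double-blind setup the key would be held by
    someone who neither treats nor evaluates the subjects.
    """
    rng = random.Random(seed)
    codes = ["A", "B"]
    rng.shuffle(codes)
    key = dict(zip(codes, assignment))  # code -> real group name
    blinded = {code: assignment[group] for code, group in key.items()}
    return blinded, key

subjects = [f"subject_{i:02d}" for i in range(1, 11)]
assignment = assign_groups(subjects, seed=1)
blinded, key = blind(assignment, seed=2)
print(blinded)  # what subjects and researchers see: only codes A and B
```

Subjects and researchers work only with the coded groups; the key is opened after the results have been recorded.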

Some pitfalls of this type of experimentation include:

  1. Suppose an experiment is performed to study the relationship between watching TV and snacking. We recruit some participants and let them stay in a house where they can be observed. We divide them into two groups: one group is given snacks and a TV, and the other is given snacks and magazines but no TV. They are secretly observed. There is a high probability that, in a new environment, they will not follow the habits they usually follow at home, so the setting becomes unrealistic for the experiment. This phenomenon is called Lack of Realism (lack of ecological validity).
  2. Suppose, further, that the participants suspect they are being observed. They might then behave differently from how they normally behave at home. This is called the Hawthorne Effect.

Quality of data is another factor we should keep an eye on. Bad data can result in poor results.

  1. Reliability: Measurements should give repeatable results. For example, you measure a person's blood pressure, take the measurement again a short time later, and find a huge difference between the two readings. This suggests that the instrument or the way it is being read may be unreliable, so instruments should be checked carefully before taking measurements.
  2. Unbiasedness: This has been discussed before. Bias can cause large errors in experimental results, so we need to avoid it.
  3. Validity: Valid data measures what we actually intend to find out. This statement may sound confusing, so let us see an example. Suppose we want to compare the literacy data of a country across decades. Let the number of literate people increase by 5000 in 2010-2020, compared with 3500 in 2000-2010. On the face of it, the government seems to have done more for education in the last decade. But we also note that the population growth in 2010-2020 was 3 times that of the previous decade. So although the number of literate people increased more in 2010-2020, the literacy rate (literate people as a share of the total population) may have improved much less, or even fallen, which gives a very different picture of the government's performance. The raw count is therefore not a valid measure of educational progress here; the rate is.
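
The literacy example can be worked through with numbers. The increases in literate people (3500 and 5000) and the threefold population growth come from the example above; the starting population, the starting number of literate people and the size of the population growth are assumptions made up for this sketch.

```python
# Increases in literate people (+3,500 and +5,000) follow the example in the text.
# The populations and the year-2000 starting figures are hypothetical.
census = {
    2000: {"population": 100_000, "literate": 30_000},
    2010: {"population": 110_000, "literate": 33_500},  # +10,000 people
    2020: {"population": 140_000, "literate": 38_500},  # +30,000 people (3x)
}

def literacy_rate(year):
    """Literate people as a share of the total population in a given year."""
    row = census[year]
    return row["literate"] / row["population"]

for year in sorted(census):
    print(year, f"{literacy_rate(year):.1%}")
```

Under these assumed figures the rate rises slightly in the first decade and falls in the second, even though the second decade added more literate people in absolute terms.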

Drawing Conclusions from Analysis:

Drawing conclusions from the results of an experiment is not as easy as it seems, and if it is not done carefully, wrong conclusions can be reached. Some of the things we need to watch out for include:

  1. Overstated Results: Suppose a researcher announces the success of a new drug grandly in newspapers or at a press conference. In reality, the drug might have been tested only on lab rats or monkeys and not yet on human beings. The claim of success is then not fully justified.
  2. Failing to Check for Lurking Variables: Suppose that in the blood pressure experiment above, the subjects for the treatment group were accidentally recruited from a town where people exercise and eat healthy food regularly. The chance of getting a good result is then high regardless of the treatment. So it is always wise to check the background of the subjects before believing any scientific result.
  3. Generalizing Beyond the Scope: Suppose you study obesity rates using samples collected from the Indian population. Since your sample contains only people from India, you cannot draw conclusions about obesity rates in the whole world from that study.



About the Author

Aniket Mitra is a final-year B.E. student in Electrical Engineering at the University of Burdwan.
