Correlation and Causation When Analyzing Data

In most statistics or data analysis books (at least the ones I have), there is always an explanation that correlation is not necessarily causation. Correlation makes no prior assumption that two variables depend on each other; it only estimates the degree of association between them (Sharda et al., 2021).

Misunderstanding the correlation between variables, or misrepresenting it intentionally, can affect the bottom line of a business. For instance, if a new report shows a correlation between a serious disease and your product, it will hurt your business because people will stop buying your product. Even though your product is not causing the disease, the public may take the link as fact and move on to competitors.

Please watch the following video, which presents a good example with ice cream and drowning.

Using the ice cream example, a business could decide to stop selling its ice cream near beaches and pools in reaction to such a study. It would then lose a lot of money by withdrawing from an area with ideal conditions for selling ice cream. For instance, in the summer, pools and beaches are open, the weather is hot, and there are many people around, including kids playing, which are perfect conditions for selling ice cream. If a business is not careful about how it interprets the correlation between variables, its decisions can hurt it drastically. It is important to use common sense and assume that there is no causation between two correlated variables unless experiments demonstrate it (Gutman & Goldmeier, 2021).
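
To make the ice cream and drowning example concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the data is synthetic, and the variable names, coefficients, and noise levels are invented, not taken from any real study. Hot weather (temperature) drives both ice cream sales and drownings, and neither affects the other. The two still end up strongly correlated, and the correlation largely disappears once the effect of temperature is removed from both.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def residuals(ys, xs):
    """What is left of ys after a least-squares straight-line fit on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]


# Synthetic data: all numbers below are made up for illustration only.
rng = random.Random(42)
n_days = 2000
temperature = [rng.uniform(10, 35) for _ in range(n_days)]

# Both outcomes depend on temperature plus independent noise.
# Ice cream sales never appear in the drownings formula, and vice versa.
ice_cream = [20 * t + rng.gauss(0, 60) for t in temperature]
drownings = [0.3 * t + rng.gauss(0, 1.5) for t in temperature]

# Correlation between the two outcomes, ignoring temperature.
raw_r = pearson(ice_cream, drownings)

# Correlation after removing the part of each outcome explained by temperature.
partial_r = pearson(
    residuals(ice_cream, temperature),
    residuals(drownings, temperature),
)

print(f"ice cream vs. drownings:              r = {raw_r:.2f}")
print(f"same, after accounting for temperature: r = {partial_r:.2f}")
```

A business looking only at the first number might conclude that selling ice cream is dangerous. The second number suggests that, in this simulated data, the weather explains the association, so pulling out of beaches and pools would sacrifice sales without affecting drownings.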

We still make these mistakes with correlation and causation because we tend to associate events and jump to conclusions. This might happen because the person conducting the study stands to gain financially from the results or simply did not investigate further before reaching a conclusion. For example, a poorly designed experiment may yield some results, and because some researchers are under pressure to produce results, they publish what they have without further study (Book, 2020). This could be a topic of its own, but it is also related to how we trust data without further scrutiny.

References

  • Sharda, R., Delen, D., & Turban, E. (2021). Analytics, Data Science, & Artificial Intelligence: Systems for Decision Support (11th ed.). Pearson.
  • Gutman, A. J., & Goldmeier, J. (2021). Becoming a Data Head: How to Think, Speak, and Understand Data Science, Statistics, and Machine Learning. John Wiley & Sons.
  • Book, J. (2020, September 24). Why So Much Science is Wrong, False, Puffed, or Misleading. AIER. https://www.aier.org/article/why-so-much-science-is-wrong-false-puffed-or-misleading/
Teylor Feliz

Teylor is a seasoned generalist who enjoys learning new things. He has over 20 years of experience wearing different hats, including software engineer, UX designer, full-stack developer, web designer, data analyst, and database administrator. He is the founder of Haketi, a small firm that provides services in design, development, and consulting.

Over the last ten years, he has taught hundreds of students at the undergraduate and graduate levels. He loves teaching and mentoring new designers and developers as they navigate the rapidly changing field of UX design and engineering.
