Assignment Task:
Thoughtful synthesis, Barbara! As you move forward, think about the effect size you'd report for your area and set a practical benchmark to guide interpretation and sample-size planning; draw those thresholds from current studies and meta-analyses in your field.
The Importance of Effect Sizes in Psychology:
Over time, psychology has emphasized null hypothesis significance testing, in which the main question is whether a result clears the p < .05 threshold. Jacob Cohen attacked that ritual in his 1994 article "The Earth Is Round (p < .05)," characterizing routine significance testing as a mechanical activity that breeds false confidence and misleads psychologists in interpreting their results. The chapter on effect sizes continues this criticism by demonstrating why psychology should attend to the size of its findings rather than just their statistical status. Effect sizes make it possible to measure, compare, and accumulate findings in ways that can guide future research and inform our understanding of behavior.
Effect size represents the strength or magnitude of a finding. In psychological studies it can take several forms, including the difference between the means of two groups, the strength of a correlation between two variables, or the percentage of variance accounted for by a model. A p-value indicates only the probability of obtaining data at least as extreme as those observed if the null hypothesis were true, whereas effect size addresses how large, and how meaningful, the observed relationship is. If two treatments for depression both reach statistical significance, the one with the larger effect size communicates a stronger change in symptom reduction, which matters for both clinicians and theorists. Without effect sizes, research results remain shallow, offering no sense of impact. With them, results gain a dimension of magnitude, and findings across studies can be compared and synthesized meaningfully, as the sketch below illustrates.
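To make these measures concrete, here is a minimal sketch in Python showing two of the effect sizes named above: Cohen's d for a difference between two group means, and r-squared for the share of variance explained by a correlation. The data are invented purely for illustration.

```python
import numpy as np

def cohens_d(group1, group2):
    """Cohen's d: the mean difference in pooled-standard-deviation units."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * np.var(group1, ddof=1) +
                  (n2 - 1) * np.var(group2, ddof=1)) / (n1 + n2 - 2)
    return (np.mean(group1) - np.mean(group2)) / np.sqrt(pooled_var)

# Hypothetical symptom scores: lower is better after treatment
treatment = np.array([9.0, 7.5, 8.0, 10.0, 8.5, 7.0, 9.5, 8.0])
control = np.array([12.0, 11.5, 13.0, 10.5, 12.5, 11.0, 13.5, 12.0])
print(f"Cohen's d = {cohens_d(control, treatment):.2f}")

# For two continuous variables, r squared is the variance explained
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.8])
r = np.corrcoef(x, y)[0, 1]
print(f"r = {r:.2f}, r^2 = {r**2:.2f}")
```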
Cohen's attack on the ritual of significance testing illustrates why the turn to effect sizes became urgent. He described the widespread illusion that a small p-value proves a theory or guarantees replication. He also exposed the faulty logic behind rejecting the null hypothesis, showing that psychologists often slide into assuming that rejection confirms the research hypothesis (Cohen, 1994). For Cohen, this reliance on p-values was a distraction from real measurement. He described the "nil hypothesis" problem: the convention that the null hypothesis posits an effect size of exactly zero. Because true effects are rarely exactly zero in the real world, a large enough sample will almost always yield a significant result. The ritual of testing against a zero effect therefore does not tell us whether a finding is meaningful. When researchers focus on effect size, they avoid the trap of declaring trivial results significant and instead concentrate on the real size of differences and relationships.
Effect sizes are important because they create a bridge between statistical findings and substantive meaning. Even a finding declared significant can be meaningless in practice when the underlying effect is small. To illustrate, a correlation between study habits and examination scores in a sample of thousands of students may be statistically significant at p < .01, yet the relationship may account for less than two percent of the variance (a correlation of about r = .14, since r-squared is the proportion of variance explained). Reporting the significance level alone would mislead a reader into thinking the finding carries weight, when in practical academic terms it changes very little. Effect size makes that distinction visible: it tells researchers and theorists whether a change deserves attention and whether it holds theoretical importance. Without such information, psychology risks becoming a science of small significant differences that never build into real knowledge.
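A quick simulation makes this point concrete. The sketch below, with invented data and an arbitrary seed, generates two variables whose true correlation is only about .14 and shows that with n = 5,000 the result is nonetheless overwhelmingly significant.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)
n = 5000  # a very large sample, as in the example above

# Two variables with a weak built-in relationship (true r is about .14)
x = rng.normal(size=n)
y = 0.14 * x + rng.normal(size=n)  # noise dominates the signal

r, p = stats.pearsonr(x, y)
print(f"r = {r:.3f}, r^2 = {r**2:.3f}, p = {p:.2e}")
# Expect r near .14, r^2 near .02, and p far below .01:
# statistically significant, yet under two percent of variance explained.
```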
Another reason effect sizes matter is replication. Cohen reminded the field that p-values do not translate into probabilities of replication. Many psychologists wrongly believed that a significant result at p < .01 meant a study would replicate 99 percent of the time; in reality, low statistical power and sampling variability make replication far less certain. Effect sizes provide a more stable basis for assessing replication because they express the strength of a finding in a form that can be compared across studies. If two studies report similar effect sizes, even if one did not reach the .05 threshold, the evidence still points to a consistent phenomenon. Replication in psychology depends on this focus on magnitude rather than on the binary outcome of significance testing.
The role of effect sizes in study design is equally important. When researchers plan new studies, they need to know how large a sample to collect in order to detect meaningful effects, and this calculation depends on prior knowledge of expected effect sizes. Without this information, researchers may design studies that are underpowered and unable to detect effects even when they exist, or overpowered in ways that elevate trivial differences into statistically significant results. Cohen himself made major contributions in this area by publishing guidelines for what counts as a small, medium, or large effect (for mean differences, d values of roughly .2, .5, and .8). These benchmarks gave psychologists a way to interpret their results against shared standards, allowing more consistent communication across the field.
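As an illustration of how an expected effect size feeds into sample-size planning, here is a minimal power-analysis sketch. It uses the statsmodels library as a convenience (any power calculator would serve) to solve for the per-group sample size needed to detect a given d with 80% power at alpha = .05.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Sample size per group for a medium effect (d = .5), alpha = .05, power = .80
n_medium = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"d = .5 requires about {n_medium:.0f} participants per group")  # ~64

# A small effect (d = .2) demands a far larger sample
n_small = analysis.solve_power(effect_size=0.2, alpha=0.05, power=0.8)
print(f"d = .2 requires about {n_small:.0f} participants per group")  # ~394
```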
Meta-analysis demonstrates the value of effect sizes more clearly than any single study. Meta-analysis aggregates the results of many individual studies to draw stronger conclusions about the presence and strength of psychological effects. If studies reported only p-values, this process would be impossible, since significance levels cannot be combined in any meaningful way. The push for effect sizes thus matters because it gives researchers a common currency for comparing results across studies. Cohen attributed the persistence of significance testing largely to its convenience; it is easier to render a yes-or-no verdict than to judge the actual size and meaning of an effect. Researchers should instead ask how strong relationships really are and whether those strengths repeat across studies. As Cohen warned, a field that leans on p-values risks publishing scattered significant results that never add up to cumulative knowledge.
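To show how effect sizes combine where p-values cannot, here is a minimal fixed-effect meta-analysis sketch using inverse-variance weighting. The five study results are invented for illustration; a real meta-analysis would also test for heterogeneity and often use a random-effects model.

```python
import numpy as np

# Hypothetical effect sizes (Cohen's d) and their variances from five studies
effects = np.array([0.42, 0.35, 0.51, 0.28, 0.44])
variances = np.array([0.040, 0.025, 0.060, 0.030, 0.050])

# Fixed-effect model: weight each study by the inverse of its variance,
# so more precise studies contribute more to the pooled estimate
weights = 1.0 / variances
pooled_d = weights @ effects / weights.sum()
pooled_se = np.sqrt(1.0 / weights.sum())

# 95% confidence interval for the pooled effect
ci_low = pooled_d - 1.96 * pooled_se
ci_high = pooled_d + 1.96 * pooled_se
print(f"Pooled d = {pooled_d:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```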
Effect sizes also improve how psychology communicates with the general public and with applied domains such as education, medicine, and organizational practice. Practitioners and policymakers need to know how much a new intervention improves outcomes or how strong the relationship between two behaviors is. For instance, reporting that a cognitive training program raises test scores by half a standard deviation (d = .5) conveys clear information about educational impact in terms readers can grasp. Effect sizes therefore connect research to practice by showing how much difference an intervention makes in real terms.
For theory, effect sizes matter because they advance interpretation from whether a relationship exists to how strong it is. The weight a theory deserves should depend on whether the relationships it predicts appear, at their predicted magnitude, across repeated studies. For instance, if a theory predicts a strong relationship between adolescents' social media use and mental health, the test of that prediction should be whether the observed effect size is in fact large, not merely whether it is significant.
References
Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49(12), 997-1003.