In the past several years, advances in parapsychology have been characterized less by dramatic new experiments than by a reevaluation of previous experiments that has allowed us to draw stronger conclusions about the reality of psi. Spearheading this development has been the introduction of a statistical technique called meta-analysis that has enjoyed widespread use in the social sciences (Rosenthal, 1984) and has even invaded the physical sciences (Hedges, 1987). Especially in psychology, it is common to find that a large number of studies have accumulated addressing a particular research question or hypothesis. For example, Robert Rosenthal and Daniel Rubin (1978) were able to uncover 345 published experiments pertaining to the question of whether experimenters can subtly bias the results of their experiments to confirm their prior expectations. As is usually the case, only a minority of these 345 experiments confirmed the expectancy hypothesis to a statistically significant degree, and some even produced results in the opposite direction from that predicted. How could the authors integrate this large number of disparate findings and draw a conclusion about whether the expectancy effect is real?

Meta-analysts accomplish such objectives by taking the probability value associated with each test of the hypothesis in question -- the likelihood that the outcome can be attributed to chance rather than reflecting a real effect -- and converting these *p-values* to a standardized statistic called a *z-score*. When all these z-scores are added up and divided by the square root of the number of tests, the result is another z-score. The p-value associated with this new z-score tells us how likely it is that the results of all the experiments as a composite can be attributed to chance. In psychology, it is customary to choose a p-value of .05 -- one out of 20 -- as the critical value. If the probability is less than one out of 20 that an effect is due to chance, we conclude that it is real. In most cases, the composite p-value is much smaller than the critical .05. This was the case in the Rosenthal-Rubin meta-analysis, so the authors were able to conclude with a high degree of confidence that the expectancy hypothesis is true.
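The combination procedure described above (often called Stouffer's method) is simple enough to sketch in a few lines of code. The following is an illustrative Python sketch, not code from any of the cited papers; the function name and the use of one-tailed p-values are my own assumptions.

```python
from math import sqrt
from statistics import NormalDist

def stouffer_combined_p(p_values):
    """Combine one-tailed p-values from independent tests (Stouffer's method).

    Illustrative sketch: each p-value is converted to a z-score, the
    z-scores are summed and divided by the square root of the number
    of tests, and the result is converted back to a composite p-value.
    """
    nd = NormalDist()
    # Convert each p-value to the z-score it corresponds to under
    # the standard normal distribution.
    zs = [nd.inv_cdf(1 - p) for p in p_values]
    # The sum of n independent standard-normal z-scores, divided by
    # sqrt(n), is itself a standard-normal z-score.
    z_combined = sum(zs) / sqrt(len(zs))
    # How likely is a composite result this extreme under chance alone?
    p_combined = 1 - nd.cdf(z_combined)
    return z_combined, p_combined
```

Note how several individually non-significant results can combine into a significant composite: three studies each at p = .10 yield a combined z of about 2.2, comfortably past the .05 criterion.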

(Text about the file drawer effect lifted to separate topic)

A second problem has to do with the fact that statistical significance is not enough by itself to convince most scientists that a hypothesis is true. The quality of the experimental methodology must be good enough to rule out *artifacts*, or alternative interpretations of the significant findings. Even published studies often contain artifacts, and it is usually not immediately obvious how serious a problem they present. Meta-analysts address this problem by giving each experiment a quality score, or code, based on a standardized set of criteria. Some meta-analysts believe that studies of poor quality should be excluded from the analysis entirely, but the majority disagree. Their approach is to determine the relationship between study outcome and the quality codes. If the poor-quality studies reveal significantly more support for the experimental hypothesis than do the good-quality studies, then it is concluded that methodological artifacts indeed played a crucial role and the experimental hypothesis is rejected. If the relationship between quality and outcome is not statistically significant, the hypothesis continues to be endorsed.

Other factors besides quality can distinguish successful from unsuccessful studies. To pursue this matter further, meta-analysts sometimes use a technique called *blocking* to divide their sample of experiments into categories. For instance, a meta-analyst might discover that when separate meta-analyses are performed on experiments that tested college students and experiments that tested other kinds of people, only those that tested college students were successful as a whole. Thus blocking can tell us something about the generality or limitations of a particular effect.
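Blocking amounts to grouping the studies by some category and running the same z-score combination separately within each block. The following is a minimal sketch of that idea, assuming (as above) that each study contributes a one-tailed p-value; the function and data layout are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import NormalDist

def blocked_meta_analysis(studies):
    """Run a separate Stouffer combination within each block of studies.

    Illustrative sketch. `studies` is a list of (category, p_value)
    pairs, e.g. ("college students", 0.03). Returns, per category,
    the combined z-score and composite p-value for that block.
    """
    nd = NormalDist()
    blocks = defaultdict(list)
    for category, p in studies:
        blocks[category].append(nd.inv_cdf(1 - p))
    results = {}
    for category, zs in blocks.items():
        z = sum(zs) / sqrt(len(zs))
        results[category] = (z, 1 - nd.cdf(z))
    return results
```

A pattern like the college-student example above would show up as one block's composite p-value falling below .05 while the other block's does not.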

This brief introduction to meta-analysis is incomplete and perhaps a bit oversimplified, but I hope it gives you a general idea of how the technique works. Meta-analysis was introduced into parapsychology by Charles Honorton (1985) and critic Ray Hyman (1985), who used it to debate the reality of ESP in the context of a particular methodology called the ganzfeld. I will touch on this debate later, but here I would like to focus on the four other meta-analyses that are more recent (in keeping with the title of this paper).