Test Hypotheses

You’re working with a collection that can be treated as a random sample from some population. For example, it could be a sample of students from a school, data from an experiment about animal learning, or a sample of census data from a state. You have a hypothesis you wish to test, such as that test scores for students in the school have risen above some threshold, that animal learning works better with rewards than with punishments, or that the median income for young workers, measured in constant dollars, has increased in the last ten years. These hypotheses are about the population, but your data come from a sample.

The statistical inference process can help you decide whether the measurements in the sample could be explained by chance variation alone, or whether they support your hypothesis.
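To make the idea of "chance variation" concrete, here is a minimal sketch in Python, outside of Fathom, of testing whether a sample mean exceeds a threshold. The scores, the threshold of 70, and the function names are all hypothetical and chosen only for illustration. The first function computes the usual one-sample t statistic; the second estimates, by resampling, how often chance alone would produce a sample mean at least as large as the observed one if the population mean were exactly at the threshold. It is not a description of how Fathom computes its results.

```python
import math
import random
import statistics


def t_statistic(sample, mu0):
    """One-sample t statistic: (sample mean - mu0) / (s / sqrt(n))."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return (mean - mu0) / (sd / math.sqrt(n))


def resampling_p_value(sample, mu0, trials=10_000, seed=0):
    """Estimate a one-sided p-value by resampling under the null hypothesis.

    The sample is shifted so that its mean equals mu0 (the null hypothesis),
    then resampled with replacement. The returned value is the fraction of
    resamples whose mean is at least as large as the observed mean.
    """
    rng = random.Random(seed)
    n = len(sample)
    observed = sum(sample) / n
    shift = mu0 - observed
    null_sample = [x + shift for x in sample]
    hits = 0
    for _ in range(trials):
        resample = rng.choices(null_sample, k=n)
        if sum(resample) / n >= observed:
            hits += 1
    return hits / trials


# Hypothetical test scores for ten students; the threshold of 70 is also an assumption.
scores = [72, 75, 68, 80, 77, 74, 71, 79, 73, 76]
t = t_statistic(scores, 70)
p = resampling_p_value(scores, 70)
print(f"t = {t:.3f}, estimated one-sided p-value = {p:.4f}")
```

A small p-value means that chance variation alone would rarely produce a sample mean this far above the threshold, which counts as evidence for the hypothesis about the population.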

Attribute types	Test types
One numeric attribute (e.g., height)	Test Mean from Raw Data (t-Test)
Two numeric attributes (e.g., income90 and income00)	Compare Means from Raw Data (Two-Sample t-Test); Test Correlation
One categorical attribute (e.g., FavoredCandidate with values “cand1”, “cand2”, and “cand3”)	Goodness of Fit (Chi-Square) Test from Raw Data; Test Proportion Against a Value from Raw Data
Two categorical attributes (e.g., sex and maritalStatus)	Test for Independence from Raw Data (Chi-Square test); Compare Proportions from Raw Data (Z Test)
One numeric and one categorical attribute (e.g., income and sex)	Analysis of Variance (ANOVA); Compare Means from Raw Data (Two-Sample t-Test), if the categorical attribute has exactly two values
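
The table above can also be read as a lookup from attribute types to candidate tests. The following Python sketch encodes it that way. The function name `candidate_tests` and its parameters are hypothetical and not part of Fathom; the test names are copied from the table.

```python
def candidate_tests(numeric, categorical, two_valued=False):
    """Return the test types the table lists for a combination of attributes.

    numeric      -- number of numeric attributes (0, 1, or 2)
    categorical  -- number of categorical attributes (0, 1, or 2)
    two_valued   -- True if the categorical attribute has exactly two values
                    (only matters for one numeric plus one categorical)
    """
    table = {
        (1, 0): ["Test Mean from Raw Data (t-Test)"],
        (2, 0): [
            "Compare Means from Raw Data (Two-Sample t-Test)",
            "Test Correlation",
        ],
        (0, 1): [
            "Goodness of Fit (Chi-Square) Test from Raw Data",
            "Test Proportion Against a Value from Raw Data",
        ],
        (0, 2): [
            "Test for Independence from Raw Data (Chi-Square test)",
            "Compare Proportions from Raw Data (Z Test)",
        ],
        (1, 1): ["Analysis of Variance (ANOVA)"],
    }
    key = (numeric, categorical)
    if key not in table:
        raise ValueError(f"the table has no entry for {key}")
    tests = list(table[key])
    # The two-sample t-test applies only when the categorical attribute
    # splits the cases into exactly two groups.
    if key == (1, 1) and two_valued:
        tests.append("Compare Means from Raw Data (Two-Sample t-Test)")
    return tests


# Example: income (numeric) and sex (categorical with two values).
print(candidate_tests(1, 1, two_valued=True))
```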

Fathom ships with four sample documents that show how to perform non-parametric tests (see Non-parametric Tests).