Estimate Difference of Proportions from Raw Data
If you have raw categorical data for which you want to estimate a difference of proportions and a confidence interval for that difference, you can use Fathom’s estimate object to calculate them.
Exactly how you do this depends on which of two basic ways your data are structured. First, consider the preferred data structure. Let’s take a concrete example in which you have a list of 42 employees that includes their sex and job classification (salaried or hourly). You wish to estimate the difference in the proportion of males between the two job classifications. The portion of the data shown at right illustrates that each case represents an employee, with Classification and Sex as attributes.
Look carefully at the top pane of the estimate. Notice, as shown below, that one attribute slot is labeled “categorical” and the other is labeled “categorical or grouping.” It makes a difference which attribute you assign to each slot. The attribute whose proportions you are comparing (Sex in this example) goes in the first slot. The attribute that defines the groups (Classification in this example) goes in the second slot.
The estimate object, in its verbose and terse forms, is shown here.
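For readers who want to check the arithmetic behind such an estimate outside Fathom, the following Python sketch computes a difference of proportions and a confidence interval from data in the preferred structure. It assumes the standard large-sample (normal approximation) interval; this page does not state which method Fathom uses, so its numbers may differ slightly. The function name `diff_of_proportions` and the employee counts are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def diff_of_proportions(cases, category, group_a, group_b, confidence=0.95):
    """Estimate p(category | group_a) - p(category | group_b).

    cases is a list of (group, value) pairs, one pair per case.
    Assumption: a large-sample normal-approximation interval, which
    may not match Fathom's internal computation exactly.
    """
    def proportion(group):
        values = [v for g, v in cases if g == group]
        return sum(v == category for v in values) / len(values), len(values)

    p_a, n_a = proportion(group_a)
    p_b, n_b = proportion(group_b)
    diff = p_a - p_b
    # Standard error of the difference between two independent proportions.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    # Two-sided critical value for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff, diff - z * se, diff + z * se


# Hypothetical data: 42 employees, one (Classification, Sex) pair per case.
cases = (
    [("salaried", "male")] * 12 + [("salaried", "female")] * 8
    + [("hourly", "male")] * 11 + [("hourly", "female")] * 11
)

diff, low, high = diff_of_proportions(cases, "male", "salaried", "hourly")
print(f"difference = {diff:.3f}, 95% interval = ({low:.3f}, {high:.3f})")
```

With these made-up counts the interval spans zero, which is the situation the estimate in this example describes.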
By default, the estimate was for females rather than males. You can change that by clicking on the word “female” and choosing male from the pop-up menu. Notice that the range of the estimate includes zero, indicating that you would not be justified in concluding that something other than chance is involved in creating the difference in proportions. The default confidence level of 95% can be changed by typing a new value or by assigning a formula that uses a slider. See Use Sliders to Vary Summary Information in a Statistical Object.
Now we’ll consider the less preferred data structure, in which one attribute records the sex of the salaried employees and another attribute records the sex of the hourly employees. This structure is less preferred because a case in this collection doesn’t have any meaning; it’s just an arbitrary pairing of one salaried and one hourly employee. A portion of the data is shown at right. The blank cells for cases 23 through 26 in the salaried column make clear that there are different numbers of employees in each group.
In this situation, it doesn’t make any difference which attribute you drop in which slot of the estimate object. Once you have dropped the two attributes, the text of the estimate object will fill in with something nonsensical, similar to what is shown at right. The problem, of course, is that Fathom is assuming that the attributes conform to the preferred structure. To fix things, click the first part of the phrase and choose the single-word option, in this case Salaried.
The result of this change, shown at right, is that Fathom treats each attribute as a group and compares the proportion in one with the proportion in the other. You can change the category being used for comparison by clicking on the chosen category, in this case “female”. You can change the confidence level by typing in a new value or by giving it a formula.
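The difference between the two data structures can be made concrete with a short Python sketch. It treats each column of the less preferred layout as a group, skips blank cells, and shows how the same data could be rearranged into the preferred one-case-per-employee layout. The table contents and the helper names `proportion` and `to_cases` are hypothetical; `None` stands in for a blank cell.

```python
# Less preferred layout: one column per group, padded with blanks (None)
# because the groups have different sizes. Values are made up.
table = {
    "Salaried": ["female", "male", "male", None, None],
    "Hourly": ["female", "female", "male", "male", "female"],
}


def proportion(column, category):
    """Proportion of non-blank cells in a column equal to category."""
    present = [v for v in column if v is not None]
    return sum(v == category for v in present) / len(present)


def to_cases(columns):
    """Rearrange into the preferred layout: one (group, value) pair per case."""
    return [
        (group, value)
        for group, column in columns.items()
        for value in column
        if value is not None
    ]


# Each column is treated as a group, so the comparison is column vs. column.
diff = proportion(table["Salaried"], "female") - proportion(table["Hourly"], "female")
cases = to_cases(table)
print(f"difference in proportion female = {diff:.3f}")
print(f"{len(cases)} cases in the preferred layout")
```

Rows in the original table carry no meaning of their own, which is why the rearranged list simply drops the row pairing and keeps one entry per employee.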