Estimate Difference of Proportions from Raw Data



If you have raw categorical data and want to estimate a difference of proportions, along with a confidence interval for that difference, you can use Fathom’s estimate object to calculate both.

 

_img483

Exactly how you do this depends on which of two basic ways your data are structured. First, consider the preferred data structure. Let’s take a concrete example in which you have a list of 42 employees that includes their sex and job classification (salaried or hourly). You wish to estimate the difference in the proportion of males between the two job classifications.

The portion of data shown at right shows that each case represents an employee with Classification and Sex as attributes.

1. Create an estimate by dragging one from the shelf or by choosing Object | New | Interval Estimate.
2. From the pop-up menu in the estimate’s upper-right corner, choose Difference of Proportions.

Look carefully at the top pane of the estimate. Notice, as shown below, that one attribute slot is labeled “categorical” and the other is labeled “categorical or grouping.”

_img484

It makes a difference which attribute you assign to each slot. The attribute whose proportions you are comparing (Sex in this example) goes in the first slot. The attribute that defines the groups (Classification in this example) goes in the second slot.

3. Drag the two collection attributes to their slots in the estimate object.
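Outside Fathom, the roles of the two slots can be sketched in Python. This is only an illustration, not Fathom's own calculation: the cases are hypothetical (not the 42 employees in the example), and the function name proportion_by_group is an assumption made for this sketch. The first attribute supplies the category being counted and the second supplies the groups.

```python
from collections import Counter

# Hypothetical cases: one case per employee, as in the preferred structure.
# These values are made up for illustration only.
cases = [
    {"Classification": "salaried", "Sex": "male"},
    {"Classification": "salaried", "Sex": "male"},
    {"Classification": "salaried", "Sex": "female"},
    {"Classification": "hourly", "Sex": "male"},
    {"Classification": "hourly", "Sex": "female"},
    {"Classification": "hourly", "Sex": "female"},
    {"Classification": "hourly", "Sex": "female"},
]

def proportion_by_group(cases, category_attr, group_attr, category):
    """Return, for each group, the proportion of cases in the given category."""
    totals = Counter(case[group_attr] for case in cases)
    hits = Counter(
        case[group_attr] for case in cases if case[category_attr] == category
    )
    return {group: hits[group] / totals[group] for group in totals}

# "Sex" plays the role of the first slot, "Classification" the second.
proportions = proportion_by_group(cases, "Sex", "Classification", "male")
difference = proportions["salaried"] - proportions["hourly"]
print(proportions, difference)
```

Swapping the two attribute names would compare the proportion of a job classification between the sexes instead, which is why the slot order matters.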

The estimate object, in its verbose and terse forms, is shown here.

_img152
Verbose form of the estimate

_img485
Terse form, obtained by choosing Estimate | Verbose to remove the checkmark next to Verbose

 

_img486

By default, the estimate is for the proportion of females rather than males. You can change that by clicking the word “female” and choosing male from the pop-up menu.

_img487

Notice that the confidence interval includes zero, so you would not be justified in concluding that anything other than chance produced the difference in proportions.
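To see why an interval can include zero, here is a minimal Python sketch of one common way to compute such an interval, the unpooled (Wald) formula. This page does not state which formula Fathom uses, so the method is an assumption, as are the counts and the function name diff_of_proportions_ci.

```python
from math import sqrt
from statistics import NormalDist

def diff_of_proportions_ci(x1, n1, x2, n2, confidence=0.95):
    """Wald interval for p1 - p2 with an unpooled standard error.

    x1 of n1 cases in group 1, and x2 of n2 cases in group 2,
    fall in the category of interest.
    """
    p1 = x1 / n1
    p2 = x2 / n2
    diff = p1 - p2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff, diff - z * se, diff + z * se

# Hypothetical counts, not taken from the example collection:
# 12 of 22 salaried employees and 9 of 20 hourly employees are male.
diff, low, high = diff_of_proportions_ci(12, 22, 9, 20)
includes_zero = low <= 0 <= high
print(f"difference = {diff:.3f}, interval = ({low:.3f}, {high:.3f})")
print("interval includes zero:", includes_zero)
```

With groups this small, the standard error is large relative to the observed difference, so the interval extends on both sides of zero.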

You can change the default confidence level of 95% by typing a new value or by assigning a formula, for example one that refers to a slider. See Use Sliders to Vary Summary Information in a Statistical Object.
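The effect of the confidence level can be illustrated with a short Python sketch. It assumes an interval based on the normal distribution, which this page does not confirm for Fathom, and the function name critical_value is hypothetical. A higher confidence level gives a larger multiplier on the standard error and therefore a wider interval.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Two-sided standard normal critical value for a confidence level."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

# Multipliers applied to the standard error at three confidence levels.
multipliers = {level: critical_value(level) for level in (0.90, 0.95, 0.99)}
for level, z in multipliers.items():
    print(f"{level:.0%} confidence: margin = {z:.3f} x standard error")
```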

Now we’ll consider the less preferred data structure, in which one attribute records the sex of the salaried employees and another attribute records the sex of the hourly employees. This structure is less preferred because a case in this collection doesn’t have any meaning; it’s just an arbitrary pairing of one salaried and one hourly employee.

_img153

A portion of the data is shown at right. The blank cells for cases 23 through 26 in the salaried column make clear that there are different numbers of employees in each group.

In this situation, it doesn’t make any difference which attribute you drop in which slot of the estimate object.

Once you have dropped the two attributes, the text of the estimate object will fill in with something nonsensical, similar to what is shown at right. The problem, of course, is that Fathom is assuming that the attributes conform to the preferred structure.

_img488

To fix things, click the first part of the phrase and choose the single-word option, in this case Salaried.

The result of this change, shown at right, is that Fathom treats each attribute as a group and compares the proportion in one with the proportion in the other.
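The same comparison can be sketched in Python for data in this two-attribute structure. The values below are hypothetical, None stands in for a blank cell, and the function name proportion is an assumption made for this sketch; it is not a description of how Fathom works internally.

```python
# Hypothetical values; each list is one attribute (column).
# None marks a blank cell where one group has fewer employees.
salaried = ["male", "female", "male", None]
hourly = ["female", "male", "female", "female"]

def proportion(values, category):
    """Proportion of non-blank values that equal the given category."""
    filled = [value for value in values if value is not None]
    return sum(value == category for value in filled) / len(filled)

# Each attribute is treated as its own group.
p_salaried = proportion(salaried, "female")
p_hourly = proportion(hourly, "female")
difference = p_salaried - p_hourly
print(p_salaried, p_hourly, difference)
```

Blank cells are skipped before dividing, so each group's proportion uses only that group's own count of employees.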

You can change the category being used for comparison by clicking on the chosen category, in this case “female”.

You can change the confidence level by typing in a new value or by giving it a formula.