# Fathom Features

## Data Sources

- Use data from one of the many **sample documents** that come with Fathom.
- Enter **your own data** by typing into a case table.
- Paste **data from other applications**.
- Import **text files**.
- Import data from the **Internet**.
- **Generate data** by formula.
- Import **U.S. Census Microdata** from IPUMS (more than 50 attributes available; includes historical data as old as 1850).

## Plot Types

- **Dot plot (stacked or unstacked):** May be split by adding one or more numeric attributes to the same axis or by dropping a categorical attribute on the other axis. May display dots shaped according to a legend.
- **Line plot:** May be split by a categorical attribute on the vertical axis and may display dots shaped according to a legend. You can add attributes to the vertical axis.
- **Histogram (equal width bars):** May be split by adding numeric attributes to the same axis or by dropping a categorical attribute on the other axis. May show Frequency (the default), Relative Frequency, Relative Percentage, or Density.
- **Ntigram (equal area histogram):** May be split by adding numeric attributes to the same axis or by dropping a categorical attribute on the other axis.
- **Box plot:** May be split by adding more numeric attributes to the same axis or by dropping a categorical attribute on the other axis.
- **Percentile plot:** May be split by adding more numeric attributes to the same axis or by dropping a categorical attribute on the other axis. May display dots shaped according to a legend.
- **Normal quantile plot:** May be split by adding more numeric attributes to the same axis or by dropping a categorical attribute on the horizontal axis. May display dots shaped according to a legend.
- **Bar chart:** May be split by adding one more categorical attribute to the axis of a bar chart or by dropping another categorical attribute in the plot area. May display shaded regions according to a legend.
- **Ribbon chart:** May be split by dropping another categorical attribute in the plot area. May display shaded regions according to a legend.
- **Breakdown plot:** May be split by dropping a categorical attribute on each axis. May display dots shaped according to a legend.
- **Scatter plot:** You can add more numeric attributes to one of the axes or drop any attribute in the plot area. May display dots shaped or colored according to a legend.
- **Line scatter plot:** You can add more numeric attributes to one of the axes or drop any attribute in the plot area. May display dots shaped or colored according to a legend.
- **Function plot:** Use sliders in the formulas to manipulate the functions dynamically.
- Fathom also displays the **Test Statistic Distribution** for hypothesis tests.
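The histogram scales listed above (Frequency, Relative Frequency, Relative Percentage, Density) are related by simple arithmetic. The following Python sketch shows one conventional way to derive them from the same bin counts. It is not Fathom's implementation: the function name `histogram_scales`, its parameters, and the definitions used (relative percentage as 100 times relative frequency, density as relative frequency divided by bin width) are assumptions based on standard usage.

```python
def histogram_scales(values, bin_start, bin_width, bin_count):
    """Count values into equal-width bins and express the counts on four scales."""
    counts = [0] * bin_count
    for v in values:
        index = int((v - bin_start) // bin_width)
        if 0 <= index < bin_count:  # values outside the bins are ignored
            counts[index] += 1

    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside the bins")

    return {
        "frequency": counts,
        "relative_frequency": [c / total for c in counts],
        "relative_percentage": [100 * c / total for c in counts],
        # Density is scaled so that the bar areas (height * width) sum to 1.
        "density": [c / (total * bin_width) for c in counts],
    }


scales = histogram_scales([1, 2, 2, 3, 3, 3, 7], bin_start=0, bin_width=2, bin_count=4)
```

With these definitions, the relative frequencies sum to 1 and the density bars have a total area of 1, which is what makes density histograms comparable across different bin widths.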

## Plotting Values and Functions

- **Plotted values:** Display any number of computed values with a line perpendicular to the numeric attribute axis of a univariate plot.
- **Plotted functions:** Display any functions, f(x), on a graph that has two continuous attributes. For univariate graphs, the x-axis is the axis corresponding to the attribute.

## Statistical Tests

- **t-test to test the mean** of a sample against a hypothesized population mean when the population standard deviation is not known. Compute either from summary or raw data.
- **t-test** to test the hypothesis that **two means** from independent samples are equal.
  - Choose either an unpooled or a pooled estimate of population standard deviation.
  - Compute either from summary or raw data.
- **One-way ANOVA tests** the null hypothesis that the means of a continuous attribute, when grouped by a single categorical attribute, are all equal. Compute from raw data.
- **Test proportion** of a sample against a hypothesized population proportion. Uses exact binomial computation when np or n(1 − p) is less than five; otherwise, uses normal approximation. Compute either from summary or raw data.
- **Compare two proportions** against the null hypothesis that the two population proportions are equal. Uses the normal approximation and requires that the number of successes and failures in each of the two attributes be greater than or equal to 5. Compute either from summary or raw data.
- **Chi-square goodness of fit** tests the null hypothesis that each of n categories is equally likely or that each has a specified probability of occurrence. Compute either from summary or raw data.
- **Chi-square test for independence** tests the null hypothesis that two categorical attributes are independent of each other. Compute either from summary or raw data. Displays expected values for each cell.
- **Test the slope** of a least squares regression line against the null hypothesis that the slope is zero. Uses Student’s t as the test statistic. Can only be computed from raw data.
- **Test the correlation** coefficient between two attributes against the null hypothesis that the correlation coefficient is zero. Uses Student’s t as the test statistic. Compute either from summary or raw data.
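The rule for the proportion test (exact binomial when np or n(1 − p) is below five, normal approximation otherwise) can be sketched in Python as follows. This is an illustration of the rule as stated, not Fathom's code. The function name `proportion_test_p_value` is hypothetical, p is assumed to be the hypothesized proportion, and the two-sided convention used for the exact case (summing the probabilities of all outcomes no more likely than the observed one) is an assumption, since the text does not say which convention Fathom uses.

```python
from math import comb, sqrt
from statistics import NormalDist


def proportion_test_p_value(successes: int, n: int, p0: float) -> tuple[str, float]:
    """Two-sided p-value for H0: population proportion equals p0 (summary data)."""
    if n * p0 < 5 or n * (1 - p0) < 5:
        # Exact binomial: probability of every possible count under H0.
        pmf = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
        observed = pmf[successes]
        # Assumed convention: add up outcomes no more likely than the observed
        # one (the small factor guards against floating-point ties).
        p_value = sum(p for p in pmf if p <= observed * (1 + 1e-9))
        return "exact binomial", min(1.0, p_value)

    # Normal approximation: z statistic using the standard error under H0.
    z = (successes / n - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return "normal approximation", p_value


small_sample = proportion_test_p_value(3, 8, 0.5)     # n * p0 = 4, so exact
large_sample = proportion_test_p_value(60, 100, 0.5)  # n * p0 = 50, so normal
```

The inputs are summary values (a success count and a sample size); computing from raw data only adds the step of counting successes first.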

## Estimates of Population Parameters

- Given a confidence level, compute the **confidence interval for the mean** of a population when the standard deviation of the population is unknown. Compute either from summary or raw data.
- Given a confidence level, compute the **confidence interval for a population proportion**. Compute either from summary or raw data.
- Estimate the **difference of two means**. Given a confidence level, compute the confidence interval for the difference of means for two groups where the groups are specified either by two continuous attributes or by one continuous and one categorical attribute. Compute either from summary or raw data.
- Estimate the **difference of two proportions**. Given a confidence level, compute the confidence interval for the difference of proportions. Compute either from summary or raw data.
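As an illustration of what such an estimate involves, here is a minimal Python sketch of a confidence interval for a population proportion computed from summary data. It uses the textbook normal-approximation (Wald) interval, which is an assumption: the text does not say which interval formula Fathom uses. The function name `proportion_confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation interval for a proportion from summary counts."""
    p_hat = successes / n
    # Critical value z* with the central `confidence` share of the standard
    # normal distribution between -z* and +z*.
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


low, high = proportion_confidence_interval(40, 100, confidence=0.95)
```

The intervals for means follow the same pattern, with the critical value taken from Student's t distribution instead of the normal distribution. Python's standard library has no t quantile function, so that case is not shown here.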

## Linear Models

- **Simple linear regression:** Estimate the slope, intercept, and correlation coefficient for paired, continuous observations. Given a confidence level, compute the confidence intervals for each of these. Given a value for the independent attribute, compute the predicted value and its confidence interval at a specified confidence level.
- **Multiple linear regression:** Estimate regression coefficients and their standard errors, t-statistics, p-values, and contributions to R-squared for multiple predictor attributes in a model of a response attribute.
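The point estimates in a simple linear regression can be sketched in a few lines of Python. This is the standard least-squares calculation, shown for illustration only. The function names `simple_linear_regression` and `predict` are hypothetical, and the confidence intervals mentioned above are left out because they need Student's t quantiles, which Python's standard library does not provide.

```python
from math import sqrt


def simple_linear_regression(xs, ys):
    """Least-squares slope, intercept, and correlation for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)  # spread of x about its mean
    syy = sum((y - mean_y) ** 2 for y in ys)  # spread of y about its mean
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


def predict(slope, intercept, x):
    """Predicted response for a given value of the independent attribute."""
    return intercept + slope * x


slope, intercept, r = simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
predicted = predict(slope, intercept, 10)
```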

## Dynamic Dragging

The following can be **dragged in Fathom**. All dependent objects update during the drag.

- **Points in plots, bars in histograms and Ntigrams, all portions of a box plot.** Dragging changes values of data.
- **Numeric axes.** Dragging changes scale or translates axis.
- **Categorical axes.** Dragging changes the category order.
- Slider. Dragging changes slider value. Any formula that uses the slider updates. (See Parameterization.)
- **Edges of bars in histograms and Ntigrams.** Dragging changes widths of bins.
- **Edges of bars in bar charts.** Dragging changes the width of the bar.
- **Movable lines in univariate plots and in scatter plots.** Dragging moves lines and shows changed value or equation.

## Simulation Capabilities

Fathom supports several different kinds of simulation.

### Sampling

Given a collection, generate a new collection whose **cases are drawn at random** from the original:

- with or without **replacement**
- with or without **animation**
- may be triggered on **response to source change**
- choice of whether to **empty collection** when starting a new sampling process
- **sample** a given number of cases or sample until a condition is met
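The sampling options above map onto a few basic operations. The Python sketch below shows sampling with and without replacement, and sampling until a condition is met. It illustrates the ideas only and is not how Fathom is driven: the function names, the `max_draws` safety limit, and the choice to sample with replacement in the "until" case are assumptions.

```python
import random


def draw_sample(cases, size, replacement=True, rng=None):
    """Draw `size` cases at random from `cases`."""
    rng = rng or random.Random()
    if replacement:
        return rng.choices(cases, k=size)  # a case may be drawn more than once
    return rng.sample(cases, size)         # each case is drawn at most once


def sample_until(cases, stop_condition, rng=None, max_draws=10_000):
    """Draw one case at a time (with replacement) until stop_condition(sample) holds."""
    rng = rng or random.Random()
    sample = []
    while len(sample) < max_draws:
        sample.append(rng.choice(cases))
        if stop_condition(sample):
            break
    return sample


without_replacement = draw_sample(list(range(10)), 5, replacement=False,
                                  rng=random.Random(0))
# Flip a fair coin (1 = heads) until three heads have appeared.
flips = sample_until([0, 1], lambda s: sum(s) >= 3, rng=random.Random(2))
```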

### Collecting Measures

The **source collection** may be any of the following:

- an **ordinary collection**, in which case any values defined by random functions will be regenerated before each iteration of collecting measures
- a **sample collection**, in which case the sample will be taken again before each iteration of collecting measures
- a **summary table**
- a **statistical test or estimate**

Copy the measures defined in one collection into a single case in a **new collection**:

- with or without **animation**
- may be triggered on **response to source change**
- choice of whether to **empty collection** before collecting more measures
- **repeat the collection process** a specified number of times or until a given condition is met
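Collecting measures from a sample collection amounts to resampling, computing a measure, and recording it as one case, repeated many times. A minimal Python sketch of that loop follows. The function name `collect_measures` and its parameters are hypothetical, and sampling with replacement is an assumption made for the example.

```python
import random
from statistics import mean


def collect_measures(source, sample_size, measure, repetitions, rng=None):
    """Resample `source` repeatedly and record one measure per sample."""
    rng = rng or random.Random()
    collected = []
    for _ in range(repetitions):
        # The sample is taken again before each iteration.
        sample = rng.choices(source, k=sample_size)
        collected.append(measure(sample))  # one new case per iteration
    return collected


# Sampling distribution of the mean of 30 rolls of a fair die.
die = [1, 2, 3, 4, 5, 6]
sample_means = collect_measures(die, 30, mean, 200, rng=random.Random(42))
```

The resulting list plays the role of the new collection: each entry is one case holding the measure from one iteration, ready to be plotted or summarized.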

### Scrambling

Given a collection, form a new collection in which the values for a specified attribute have been **randomly shuffled**.
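Scrambling is the basis of permutation (randomization) tests: shuffling one attribute breaks any association with the other attributes while keeping each attribute's own values intact. A minimal Python sketch, with cases represented as dictionaries and a hypothetical function name `scramble`:

```python
import random


def scramble(cases, attribute, rng=None):
    """Return new cases in which the values of one attribute are shuffled."""
    rng = rng or random.Random()
    values = [case[attribute] for case in cases]
    rng.shuffle(values)
    # Build new cases so that the original collection is left untouched.
    return [{**case, attribute: value} for case, value in zip(cases, values)]


cases = [
    {"group": "a", "score": 1},
    {"group": "a", "score": 2},
    {"group": "b", "score": 3},
    {"group": "b", "score": 4},
]
scrambled = scramble(cases, "score", rng=random.Random(3))
```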

### Parameterization

Sliders provide **manipulatable parameters** that can be referenced in formulas for:

- **attributes and measures**
- plotted **values**
- plotted **functions**
- **formulas** in summary tables
- **parameters** for sampling and collecting measures
- **filters**
- **numeric parameters** for statistical tests and estimates

Dragging the slider causes all aspects of the model to **recompute dynamically**. Sliders can be **animated** to move continuously from the lower to the upper bounds of their axes. Slider values can be **restricted to multiples** of an input value (for example, restrict to multiples of 1 for an integer slider). Slider values can be **defined by formula**.

## Control Over Appearance of Cases

The appearance of cases in a collection may be changed either directly or through formulas.

The **“display” attributes** over which the user has control are:

- caption
- x-position
- y-position
- width
- height
- icon image (by default, a little gold ball)

## Flexible Layout

- All objects may be **laid out in a single document** and saved, restored, or printed as such.
- Individual objects may be **viewed in a separate window** if desired.
- Text objects of any length can be **placed anywhere in the document**. Text objects support full character formatting and insertion of a suite of mathematical and statistical symbols.
- Objects may be **copied as pictures and pasted into other applications**, or pasted back into the document as a static record.

## Functions

### Statistical Functions

- bin
- correlation
- count
- covariance
- first quartile
- first value
- interquartile range
- last value
- linear regression intercept
- linear regression predicted value
- linear regression residual
- linear regression slope
- linear regression standard error of the slope
- maximum
- mean
- median
- minimum
- next value
- percentile
- population covariance
- population standard deviation
- population variance
- population z-score
- previous value
- product
- proportion
- R-squared
- rank
- run length
- sample covariance
- sample standard deviation
- sample variance
- sample z-score
- standard error
- sum
- third quartile
- unique rank
- unique values
- variance

### Distributions

For each of the distributions listed below, Fathom provides functions to compute the probability density, cumulative probability and inverse cumulative probability (quantiles).

- beta
- binomial
- Cauchy
- chi-square
- exponential
- F
- gamma
- geometric
- hypergeometric
- normal
- Poisson
- Student’s t
- uniform
- uniform lattice
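The three kinds of function named above (probability density, cumulative probability, and inverse cumulative probability) can be illustrated for the normal distribution with Python's standard library. This shows the concepts only; it does not use Fathom's function names, which are not given in this document.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

density = standard_normal.pdf(0.0)         # height of the density curve at x = 0
cumulative = standard_normal.cdf(1.96)     # P(X <= 1.96), about 0.975
quantile = standard_normal.inv_cdf(0.975)  # x such that P(X <= x) = 0.975

# The quantile function undoes the cumulative probability function.
round_trip = standard_normal.inv_cdf(standard_normal.cdf(1.0))
```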

### Random Functions

- random integer between a defined minimum and maximum
- random integer from a binomial distribution
- random integer from a geometric distribution
- random number between 0 and 1
- random number between 0 and a defined maximum
- random number between a defined minimum and maximum
- random number from a beta distribution
- random number from a Cauchy distribution
- random number from a chi-square distribution
- random number from an exponential distribution
- random number from an F-distribution
- random number from a gamma distribution
- random number from a hypergeometric distribution
- random number from a normal distribution
- random number from a Poisson distribution
- random number from a t-distribution
- random number from a uniform distribution
- random number from a uniform lattice distribution
- random pick from a defined list of elements

### Trigonometric Functions

- cosecant
- cosine
- cotangent
- hyperbolic cosecant
- hyperbolic cosine
- hyperbolic cotangent
- hyperbolic secant
- hyperbolic sine
- hyperbolic tangent
- inverse cosecant
- inverse cosine
- inverse cotangent
- inverse hyperbolic cosecant
- inverse hyperbolic cosine
- inverse hyperbolic cotangent
- inverse hyperbolic secant
- inverse hyperbolic sine
- inverse hyperbolic tangent
- inverse secant
- inverse sine
- inverse tangent
- inverse tangent with two numeric arguments
- secant
- sine
- tangent

### Functions for Dealing with Text

- begins with
- character to number
- concatenate
- ends with
- find string
- includes
- left string
- mid-string
- number to character
- repeat string
- replace characters
- replace string
- right string
- string length
- string to number

### Arithmetic, Logical, and Other Functions

- absolute value
- bin width
- card icon
- case index
- ceiling
- column proportion
- column total
- combinations
- common logarithm
- concatenate
- even
- exists
- expected
- exponential
- false value
- floor
- grand total
- if
- includes
- index of category
- in range
- is number
- is prime
- look up value by index
- look up value by key
- missing
- modulo
- natural logarithm
- number of bins
- number of digits in common between two arguments
- odd
- pi
- round
- row proportion
- row total
- scalar
- signum
- square root
- switch (nested if statements)
- true value
- truncate
- unit of

## Units

Fathom recognizes and performs unit algebra on **dozens of built-in units** in the following dimensions:

- Acceleration
- Angle
- Area
- Capacitance
- Charge
- Conductance
- Current
- Data
- Density
- Electrical Potential
- Force
- Frequency
- Important Constants
- Inductance
- Length
- Magnetic Flux
- Magnetic Flux Density
- Mass
- Power/Energy
- Pressure
- Resistance
- Speed
- Time
- Volume
- Work
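To give a sense of what unit algebra means, the Python sketch below tracks the exponents of base dimensions alongside each number, so that dividing a length by a time yields a speed and adding incompatible quantities is rejected. It is a toy model under stated assumptions: the `Quantity` class, the choice of three base dimensions (length, mass, time), and the error behavior are all hypothetical and say nothing about how Fathom represents units internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A number tagged with exponents of the base dimensions (length, mass, time)."""

    value: float
    dims: tuple

    def __mul__(self, other):
        # Multiplying quantities adds their dimension exponents.
        return Quantity(self.value * other.value,
                        tuple(a + b for a, b in zip(self.dims, other.dims)))

    def __truediv__(self, other):
        # Dividing quantities subtracts their dimension exponents.
        return Quantity(self.value / other.value,
                        tuple(a - b for a, b in zip(self.dims, other.dims)))

    def __add__(self, other):
        # Addition only makes sense for matching dimensions.
        if self.dims != other.dims:
            raise TypeError("cannot add quantities with different dimensions")
        return Quantity(self.value + other.value, self.dims)


LENGTH, TIME = (1, 0, 0), (0, 0, 1)
distance = Quantity(100.0, LENGTH)  # 100 units of length
duration = Quantity(20.0, TIME)     # 20 units of time
speed = distance / duration         # length per time
```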