Sampling Simulations

A sample collection selects cases from its source collection. You can use it to sample with or without replacement, and you can control how many cases it draws each time it samples.

Regular Sampling

One reason to sample is that you have a big collection and you only want to look at a little of it. So you choose some cases at random and use them as a stand-in for the whole population.
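As an illustration outside the software, the same idea can be sketched in Python. This is a minimal sketch, not how a sample collection is implemented: the function name `simple_random_sample`, the 1,000-case population, and the sample size of 25 are all assumptions made up for the example.

```python
import random

def simple_random_sample(cases, k, seed=None):
    """Draw k distinct cases at random (sampling without replacement)."""
    rng = random.Random(seed)      # a seed makes the draw repeatable
    return rng.sample(cases, k)    # no case can be picked twice

# Hypothetical "big collection": the case numbers 1 through 1000.
population = list(range(1, 1001))

# Look at only 25 of the 1000 cases.
subset = simple_random_sample(population, 25, seed=1)
print(subset)
```

Because the draw is without replacement, every case in `subset` is distinct.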

Simulation

Another reason to sample, especially while you’re learning, is to do simulation. You can learn about the mathematical properties of sampling by repeatedly sampling from a population to see what the samples would have told you if you had seen each of them alone. You might use this, for example, to study confidence intervals.
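A rough Python sketch of repeated sampling, written outside the software, might look like this. The population (the numbers 1 to 100), the sample size of 10, the 500 repetitions, and the helper name `sample_means` are assumptions chosen only for illustration.

```python
import random
import statistics

def sample_means(population, sample_size, n_samples, seed=None):
    """Repeatedly sample without replacement and record each sample's mean."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]

# Hypothetical population whose true mean is 50.5.
population = list(range(1, 101))

# 500 samples of 10 cases each; every mean is what one sample "would have told you".
means = sample_means(population, 10, 500, seed=0)

print("mean of the sample means:", statistics.mean(means))
print("spread of the sample means:", statistics.stdev(means))
```

The spread of `means` shows how much a single sample's answer can vary, which is the quantity a confidence interval tries to capture.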

You can also use sampling to simulate common (and less-common) probability events. For example, to make a coin-flipper, make a collection with two cases, heads and tails. Then sample from it with replacement.
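The coin-flipper can be mimicked in Python as a sketch of the idea. The function name `flip` and the choice of 100 flips are assumptions for this example; the two-case collection corresponds to the list `coin`.

```python
import random
from collections import Counter

# The two-case "collection": one case for each face of the coin.
coin = ["heads", "tails"]

def flip(n, seed=None):
    """Sample n cases from the coin with replacement."""
    rng = random.Random(seed)
    return rng.choices(coin, k=n)   # choices() draws with replacement

flips = flip(100, seed=42)
counts = Counter(flips)
print(counts)
```

Sampling with replacement matters here: without it, a two-case collection could supply at most two flips.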

Bootstrap

Suppose you have N cases in your collection.

If you set up the sample collection to draw N cases from the original collection with replacement, the result is called a bootstrap sample. It has as many cases as the original collection, but because the draws are with replacement, its distribution will usually differ from the original: some cases will be chosen more than once, others not at all.
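To see the repeats and omissions concretely, here is a small Python sketch written outside the software. The ten-case collection and the names `bootstrap_resample`, `times_drawn`, and `never_drawn` are assumptions for illustration.

```python
import random
from collections import Counter

def bootstrap_resample(cases, rng):
    """Draw len(cases) cases from cases, with replacement."""
    return rng.choices(cases, k=len(cases))

rng = random.Random(7)
original = list(range(10))          # hypothetical collection with N = 10 cases

resample = bootstrap_resample(original, rng)

# How often each case was drawn, and which cases were never drawn.
times_drawn = Counter(resample)
never_drawn = [c for c in original if c not in times_drawn]

print("resample:   ", resample)
print("never drawn:", never_drawn)
```

The resample always has N cases, but the set of distinct cases in it is usually smaller than the original.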

You often use a bootstrap to establish a confidence interval for a summary statistic (the median, say).
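One common way to turn bootstrap samples into an interval is the percentile method, sketched below in Python. The text above does not say which method to use, so the percentile method is an assumption here, as are the data values, the 2,000 resamples, the 95% level, and the function name `bootstrap_median_ci`.

```python
import random
import statistics

def bootstrap_median_ci(cases, n_resamples=2000, level=0.95, seed=None):
    """Percentile-method bootstrap interval for the median (a sketch)."""
    rng = random.Random(seed)
    n = len(cases)
    # Median of each bootstrap sample (N draws with replacement), sorted.
    medians = sorted(
        statistics.median(rng.choices(cases, k=n))
        for _ in range(n_resamples)
    )
    alpha = (1 - level) / 2
    lower = medians[int(alpha * n_resamples)]
    upper = medians[int((1 - alpha) * n_resamples) - 1]
    return lower, upper

# Hypothetical data, invented for this example.
data = [12, 15, 9, 21, 17, 14, 30, 11, 18, 16, 13, 25, 19, 10, 22]

lower, upper = bootstrap_median_ci(data, seed=3)
print("sample median:", statistics.median(data))
print("approximate 95% interval for the median:", (lower, upper))
```

The interval endpoints are the values that cut off the lowest and highest 2.5% of the bootstrap medians.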