**Why trust some supposed laws of statistical sampling and convergence when you can just test them yourself?** If you have a computer with `R` installed (also recommended: `RStudio`), then you can stop dithering about whether these `n=1000` studies cited in the newspapers actually resemble the truth closely enough.

    # make some people
    # let's say 1e5 one-dimensional people characterised by one parameter
    # like "wealth" or "health" or "support of some particular policy"
    # if you want you can create subsets like "Irish" and "English"
    # ... I'll leave that kind of fun to you
    base <- rnorm(1e5, mean=45, sd=4)
    # a single exp() keeps the tail heavy but finite;
    # nesting exp() three times overflows to Inf as soon as rpois() returns 2
    inheritance <- exp( rpois(1e5, 1.1) )
    luck <- base * inheritance * rpois(1e5, 2.1)
    extreme.luck <- rcauchy(1e5, location=45, scale=4)
    # summed rather than exponentiated again, for the same overflow reason
    people <- base + inheritance + luck + extreme.luck

    # randomly sample the people
    Nielsen <- sample(people[1:1e5], 100, replace=F)

    # take some statistics of each and compare them
    mean(Nielsen)
    mean(people)
    mean(Nielsen) - mean(people)   # diff() would treat its second argument as a lag

    # and so on
    # compare histograms, compare medians, compare stdev's, compare kurtoses...

(Notice this is an economy with no geography, no choice, and no response.)
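To put a number on "resembles the truth enough", one option is to repeat the sampling many times and look at how far the sample mean typically lands from the population mean. The following is a minimal sketch, not the model above: the log-normal `population`, the helper `sampling.error`, the seed and the 500 repetitions are all assumptions picked for illustration.

```r
set.seed(1)  # hypothetical seed, only so that reruns give the same numbers

# stand-in population (an assumption): skewed, but every value is finite
population <- rlnorm(1e5, meanlog = 3, sdlog = 1)
truth <- mean(population)

# draw `reps` independent samples of size n and
# return how far each sample mean lands from the population mean
sampling.error <- function(n, reps = 500) {
  replicate(reps, mean(sample(population, n, replace = FALSE)) - truth)
}

errors.100  <- sampling.error(100)
errors.1000 <- sampling.error(1000)

# typical size of the miss at each sample size
sd(errors.100)
sd(errors.1000)
```

Swapping `mean` for `median` or `sd` inside `sampling.error` (and in `truth`) gives the same comparison for other statistics.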

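- You could also simulate “biased sampling” by grabbing, for example, `people[1:100]` rather than `sample(people[1:1e5], 100, replace=F)`. (This only biases anything if the order of `people` carries information, for instance after `sort(people)` or when subsets are stored one after another.)
- Or, to be a little biased but also a little random, you could make an `indexes.to.sample.from <- floor( runif( 100, min=1, max=316) ^2 )` and take `people[indexes.to.sample.from]`.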

(Squaring spreads the indexes over the whole range, 1 to just under 1e5, with a bias towards the earlier ones. Think about *that* meaning of the parabola picture!)
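The two biased schemes can be put side by side with a plain random sample. This is a rough sketch under assumptions of my own: the `population` here is a sorted log-normal stand-in (sorted so that position in the vector actually matters), and the names `random.sample`, `first.hundred` and `skewed.sample` are made up for the example.

```r
set.seed(2)  # hypothetical seed for repeatability

# stand-in population (an assumption), sorted so that early indexes hold small values
population <- sort(rlnorm(1e5, meanlog = 3, sdlog = 1))

# unbiased: every person equally likely
random.sample <- sample(population, 100, replace = FALSE)

# fully biased: just the first hundred people
first.hundred <- population[1:100]

# a little biased, a little random: squared uniform draws favour early indexes
# (indexes can repeat, so this behaves like sampling with replacement)
indexes.to.sample.from <- floor(runif(100, min = 1, max = 316)^2)
skewed.sample <- population[indexes.to.sample.from]

# compare each scheme's mean against the population mean
c(truth  = mean(population),
  random = mean(random.sample),
  first  = mean(first.hundred),
  skewed = mean(skewed.sample))
```

Running `hist(indexes.to.sample.from)` shows how the draws pile up at the low end.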

Nice way to play around with:

- Different functions for generating (and noising up) a bunch of sims
- Different measures of central tendency or spread (is `median` better than `mean`? You can prove it to yourself.)
- `R`. Not that we need more reasons to play around with R, but we will gladly accept them.
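One way to try the `median` versus `mean` question is to check which estimate wobbles less from sample to sample, each measured relative to its own population value. What follows is only a sketch: the heavy-tailed log-normal `population`, the helper `one.draw`, the sample size of 100 and the 500 repetitions are assumptions chosen for illustration, and "wobbles less" is just one possible meaning of "better".

```r
set.seed(3)  # hypothetical seed for repeatability

# stand-in population (an assumption) with a heavy right tail
population <- rlnorm(1e5, meanlog = 3, sdlog = 2)

# one sample of size n, with each statistic divided by its population value,
# so that 1 means "exactly right"
one.draw <- function(n) {
  s <- sample(population, n, replace = FALSE)
  c(mean   = mean(s)   / mean(population),
    median = median(s) / median(population))
}

# 2 x 500 matrix with rows "mean" and "median"
ratios <- replicate(500, one.draw(100))

# spread of each estimate around its own target
spread <- apply(ratios, 1, sd)
spread
```

Lowering `sdlog` thins out the tail, which is a quick way to see how much the answer depends on the shape of the population.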

(Source: http://docs.google.com/)
