More Central Limit Theorem Questions


ncwalker

So let's take a standard 6-sided die, start rolling it, and make a histogram. With a lot of rolls, all counts will be roughly the same: no bell shape, just a rectangle. This is what we expect, because every side is equally likely to come up.

Now let us roll 2 dice and sum the result, then plot the frequency of the sum, which can be from 2 to 12. This will now look normal simply because the frequency of occurrence of snake-eyes and boxcars (both dice = 1 or both dice = 6) will be much lower than that of 6, 7, or 8, which have many more combinations that sum to these values.

Is this because of the Central Limit Theorem?

This would also hold true if I averaged the dice instead of summed them, more combinations would result in an average of 3 or 4 than 1 or 6.
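The counting argument above can be checked by listing all 36 outcomes directly. This is only a minimal sketch, assuming two fair, independent six-sided dice; the variable names are made up for illustration.

```python
from collections import Counter
from itertools import product

# All 36 equally likely (die1, die2) outcomes of two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Number of combinations that produce each sum (2..12).
sum_counts = Counter(a + b for a, b in outcomes)

# Averaging just rescales the axis: the average is sum / 2, so the shape is identical.
avg_counts = Counter((a + b) / 2 for a, b in outcomes)

# Text histogram of the sums: one '#' per combination.
for total in range(2, 13):
    print(f"{total:2d}  {'#' * sum_counts[total]}  ({sum_counts[total]}/36)")
```

Snake-eyes and boxcars each have exactly one combination, while 7 has six, so the histogram is a symmetric triangle peaking at 7.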

Which then brings me to the question of doing "math" on data in any form.

1) A CMM that takes probe hits to measure a diameter. What it reads is n point coordinates, then it "does math" to best fit a circle resulting in a diameter.

2) A leak tester looking for a leak rate. That pressurizes a part, lets it stabilize, and takes several measurements over a stabilization phase to generate an average leak rate. It "does math" before the result is reported.
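What averaging several readings does to the spread of a reported result can be simulated in a few lines. This is a rough sketch only: the true leak rate, the noise level, the Gaussian noise model, and the number of readings per result are all assumed values chosen for illustration, not anything a real leak tester is known to use.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

TRUE_LEAK_RATE = 2.0   # hypothetical true leak rate, ccm
NOISE_SD = 0.2         # assumed std dev of one raw reading, ccm
N_READINGS = 16        # assumed number of readings averaged per reported result
N_RESULTS = 5000       # number of simulated results

def raw_reading():
    # One noisy reading of the same quantity (Gaussian noise is an assumption).
    return random.gauss(TRUE_LEAK_RATE, NOISE_SD)

# Spread of single readings versus spread of averaged results.
individuals = [raw_reading() for _ in range(N_RESULTS)]
averages = [
    statistics.fmean([raw_reading() for _ in range(N_READINGS)])
    for _ in range(N_RESULTS)
]

sd_individuals = statistics.stdev(individuals)
sd_averages = statistics.stdev(averages)
ratio = sd_individuals / sd_averages

print(f"std dev of single readings:    {sd_individuals:.4f}")
print(f"std dev of averaged results:   {sd_averages:.4f}")
print(f"ratio (expect about sqrt(16)): {ratio:.2f}")
```

With 16 readings per result, the averaged results should vary roughly a quarter as much as the single readings.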

One could go on and on, but there are a lot of automated devices out there that have sensors connected to transducers in some manner, and that then take several inputs to compose or derive an output result by "doing math."

Because of CLT do the results then look more normal than the values the sensors are actually reporting?

In other words, I have a leak tester that, through math and controls internal to the device, gives me a leak rate in ccm. But what the sensors are outputting is volts. Would my volts be, say, Weibull but my leak rate show up as normal because the device "did math" and because of the CLT?
 

reynald

Quite Involved in Discussions
Is this because of the Central Limit Theorem?

Would my volts be, say, Weibull but my leak rate show up as normal because the device "did math" and because of the CLT?

Is this because of the Central Limit Theorem? --> Yes.
Weibull but my leak rate show up as normal because the device "did math" and because of the CLT? --> It's measuring the same point (well, at least it should be) in order to reduce the variation due to the measuring device. Remember that the CLT makes you approach the true average and reduces the std dev of the average by a factor of sqrt(n). So the more measurements you take, the better the average approximates the "true" measurement. BUT
"Weibull but my leak rate show up as normal because the device "did math" and because of the CLT?" --> If you are measuring different points and then averaging them, this statement would be correct.
 

bobdoering

Stop X-bar/R Madness!!
Trusted Information Resource
1) A CMM that takes probe hits to measure a diameter. What it reads is n point coordinates, then it "does math" to best fit a circle resulting in a diameter.

2) A leak tester looking for a leak rate. That pressurizes a part, lets it stabilize, and takes several measurements over a stabilization phase to generate an average leak rate. It "does math" before the result is reported.

Measurement error is a "natural" error, whose variation is random, independent and typically about a central value (the conditions required for the CLT to apply). The CLT speaks of averages, but for measurement error this also holds for individual readings, except where there is a physical limit such as a lower limit of 0, as in roundness, flatness, etc. Then the distribution is skewed, such as a beta or Weibull distribution.
 

Miner

Forum Moderator
Leader
Admin
Remember that the central limit theorem only applies to averaging.

Let's take your CMM measurement of a diameter as an example. When you probe a feature with say 30 hits, the CMM creates a best fit circle through these points. In essence, this results in a circle with an average diameter of those taken. If you made a distribution of individual diameters taken on a single feature from another gage, not from the CMM, it would most likely be skewed, but multiple CMM circles on the same feature may be normally distributed due to the central limit theorem.
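The "best fit circle acts like an average" idea can be sketched with a simple algebraic least-squares fit. Everything here is an assumption for illustration: the nominal diameter, the Gaussian radial noise, the evenly spaced hits, and the choice of the Kasa algebraic fit (real CMM software may use a different algorithm). It only shows the effect on spread; it says nothing about the shape of real probe errors.

```python
import math
import random
import statistics

def det3(m):
    # Determinant of a 3x3 matrix given as nested lists.
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def fit_circle(points):
    """Algebraic least-squares (Kasa) fit of x^2 + y^2 = A*x + B*y + C.
    Returns (center_x, center_y, diameter)."""
    z = [x * x + y * y for x, y in points]
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    m = [
        [sum(x * x for x, _ in points), sum(x * y for x, y in points), sx],
        [sum(x * y for x, y in points), sum(y * y for _, y in points), sy],
        [sx, sy, len(points)],
    ]
    rhs = [
        sum(x * zi for (x, _), zi in zip(points, z)),
        sum(y * zi for (_, y), zi in zip(points, z)),
        sum(z),
    ]
    d = det3(m)
    solution = []
    for col in range(3):  # Cramer's rule, one unknown per column
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = rhs[r]
        solution.append(det3(mc) / d)
    a, b, c = solution
    cx, cy = a / 2, b / 2
    return cx, cy, 2 * math.sqrt(c + cx * cx + cy * cy)

random.seed(2)
TRUE_DIAMETER = 10.0  # hypothetical nominal diameter, mm
NOISE_SD = 0.01       # assumed radial probing noise, mm
N_HITS = 30           # probe hits per fitted circle

def probe_feature():
    # N_HITS evenly spaced hits on the same feature, each with radial noise.
    pts = []
    for k in range(N_HITS):
        angle = 2 * math.pi * k / N_HITS
        r = TRUE_DIAMETER / 2 + random.gauss(0, NOISE_SD)
        pts.append((r * math.cos(angle), r * math.sin(angle)))
    return pts

# Diameter implied by one hit (twice its distance from the nominal center)
# versus the diameter of a circle fitted through all 30 hits.
single_hit = [2 * math.hypot(*probe_feature()[0]) for _ in range(2000)]
fitted = [fit_circle(probe_feature())[2] for _ in range(2000)]

sd_single = statistics.stdev(single_hit)
sd_fitted = statistics.stdev(fitted)
print(f"std dev, single-hit diameters: {sd_single:.5f}")
print(f"std dev, fitted diameters:     {sd_fitted:.5f}")
```

The fitted diameters scatter much less than the single-hit values, which is the averaging effect described above.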

However, if you use the CMM to measure the same feature on multiple parts, it is not averaging across the multiple parts so the central limit theorem does not apply.

The math itself is not necessarily indicative of the CLT; it's whether the math involves averaging. Even then, as in the CMM, it's WHAT is being averaged that dictates whether the CLT is invoked.
 

Bev D

Heretical Statistician
Leader
Super Moderator
to elaborate on Miner's response: the central limit theorem applies to averages of independent samples drawn randomly from a stable population. In gaming terms, this is random sampling with replacement. In practical terms, this means samples drawn from a homogeneous process stream OR random samples drawn from a static population. In these cases the sample averages will tend to have an approximately Normal distribution, unless the sample size is 'small', in which case the averages will tend to have a t-distribution.
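The tendency of sample averages toward a symmetric shape can be seen in a small simulation. This is a sketch under stated assumptions: the population is Weibull with scale 1.0 and shape 1.5 (picked only because Weibull came up earlier in the thread), samples are independent, and skewness is used as a crude stand-in for "how non-Normal" the averages look.

```python
import random
import statistics

random.seed(3)  # fixed seed so the run is repeatable

def skewness(values):
    # Moment-based skewness: third central moment / (second central moment)^1.5.
    mean = statistics.fmean(values)
    m2 = statistics.fmean([(v - mean) ** 2 for v in values])
    m3 = statistics.fmean([(v - mean) ** 3 for v in values])
    return m3 / m2 ** 1.5

def sample_average(n):
    # Average of n independent draws from the assumed skewed population.
    return statistics.fmean([random.weibullvariate(1.0, 1.5) for _ in range(n)])

N_AVERAGES = 10000  # number of sample averages simulated per sample size

skew_by_n = {}
for n in (1, 5, 25):
    averages = [sample_average(n) for _ in range(N_AVERAGES)]
    skew_by_n[n] = skewness(averages)
    print(f"n = {n:2d}: skewness of sample averages = {skew_by_n[n]:.3f}")
```

The skewness of the averages should shrink toward zero as n grows, while individual values (n = 1) keep the skew of the underlying population.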

the central limit theorem does not apply to your sum of 2 dice example. you have described the physical probability of the sums correctly and THAT is the reason the distribution will be symmetrically bell shaped. it will NOT be Normal as there are not infinite tails. This is a PROBABILITY distribution of integer data.

Miner has addressed the CMM measurements. I will add one more example: if you take the average 'best fit' circle for multiple samples (of size greater than n=1) then the sample average circle will tend towards a Normal distribution. the larger the n (>25), the closer to Normal you will get.

and of course the leak tester is the same as the CMM.

the CLT applies to a specific kind of math (and a specific sampling scenario) not just any math done on any kind of measurement.
 