Random 'Normal Distribution' Numbers in Excel

  • Thread starter Thread starter M Greenaway
  • Start date Start date
I don't think that it need more data than usual to do a good aproximation to normal "Gaussian" distribution, 500 could get a fairly good aproximation, of course the more the best, remember the central limit theorem.:notme:

You can play with the attachement, just press F9 to recalculate, feel free to change mean and standard deviation (the theorical), a good way to create almost real info.
 

Attachments

Hi all -

I'm brand new here, but have a question that a colleague sent me. It appears similar to this, but yet I can't work it out. Here is the issue - he's trying to figure out how to distribute a single value (e.g., 300) across a number of periods e.g., 12 or 24 or 30, etc.) using a normal distribution.

I used the formula that Darius provided (thank you!) but can't figure out how to make it be a set number of periods. Does this make sense? I feel quite dense!

Thanks in advance if anyone knows the answer!
 
Hi all -

I'm brand new here, but have a question that a colleague sent me. It appears similar to this, but yet I can't work it out. Here is the issue - he's trying to figure out how to distribute a single value (e.g., 300) across a number of periods e.g., 12 or 24 or 30, etc.) using a normal distribution.

I used the formula that Darius provided (thank you!) but can't figure out how to make it be a set number of periods. Does this make sense? I feel quite dense!

Thanks in advance if anyone knows the answer!

About all I can think of is say you want to "distribute" 300 across 30 time intervals randomly. That would mean each time interval should have an average of 10 per time interval. The problem is, we must also specify a standard deviation if you want to create 30 random numbers of average 10. From what you've given, I'd probably pick 3 since that would put three standard deviations below average at 1, and you would not get any negative numbers. So, you could generate 30 numbers from the normal distribution with mean 10 and standard deviation 3.
 
NORMSINV() is quite slow, so if you are simulating many normal random variables that method may end up being painfully slow, in which case may I recommend the following:

RAND() on the worksheet or Rnd in VBA code will return a uniformly distributed random variable on (0,1]

To simulate a normal r.v. sum N of these uniform r.v. and normalise the mean by subtracting N/2 (mean was N * 1/2, now will be 0), then normalise the variance by multiplying by SQRT(12/N) (standard deviation was N * 12, now will be 1) or use Sqr in VBA.

As N tends to infinity this will be normal, any finite N will be an approximation but by the central limit theorem it converes quickly.

e.g. in a cell using N=12:
=RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() - 6

e.g. in a cell using N=24:
=SQRT(0.5)
* (RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() + RAND() - 12)

e.g. in a VBA function using N=uniformCount:

Code:
Public Function NORMR(uniformCount)
    Application.Volatile (True) 'This will allow recalculation via F9
    Randomize 'This reseeds using the timer
    Dim retVal As Double
    retval = 0
    For i = 1 To uniformCount
        retVal = retVal + Rnd
    Next i
    retVal = retVal - (uniformCount / 2)
        NORMR = Sqr(12 / uniformCount) * retVal
End Function
 
RAND() on the worksheet or Rnd in VBA code will return a uniformly distributed random variable on (0,1]

Have you found any documentation on how Rnd works? One thing I learned 20 years ago during degree work on Operations Research is that there are some good pseudo-random number generators out there, and there are some bad pseudo-random number generators out there. Has any one been able to find out:

1) The algorithm Microsoft uses for Rnd?
2) Any studies on how well Rnd behaves?
 
Have you found any documentation on how Rnd works? One thing I learned 20 years ago during degree work on Operations Research is that there are some good pseudo-random number generators out there, and there are some bad pseudo-random number generators out there. Has any one been able to find out:

1) The algorithm Microsoft uses for Rnd?
2) Any studies on how well Rnd behaves?

No I don't know what MS use.
Pretty hard to say if it's good or not, but I guess one could run some analysis. Maybe I will once I've done my exam tomorrow.
 
Here is a spreadsheet which simulates random numbers with the aim of a normal distribution.

It works in a way I prescribed in an earlier post (although I took out the randomize, which only needs calling once - feel free to add it somewhere more sensible) - summing the uniform RAND() and normalising the first two central moments [I still do not have any idea about the underlying generation process].

The number of uniform draws used is provided by the user in cell B1. Try 1 and you will see a simulated uniform with zero mean and unit variance. Try 2, 3...

I have displayed the probability density function & cumulative distribution function of both the expected and simulated results.

There is also a user override to use NORMSINV(RAND()) instead of my VBA function. If I make 100,000 cells equivalent to D10, I see that NORMSINV() takes almost twice as long as my function with 12 draws and still takes longer than when I use 100 draws.

(Fixing the number of draws means the call to sqr(12 / N) could be replaced; fixing at 12 means it could be removed from the code, since sqr(12/12)=sqr(1)=1.)

N.B The outermost buckets are implemented as catch overflow buckets, stretching out to infinity, but they don't have to be. (With 12 draws and 51 buckets the minimum realisation is just over -6 and the maximum realisation is 6, as are the insides of the outer buckets, which is nice)

You can increase the number of simulations by simply copying the last row down, all post calculations should handle this.

Further analysis could be made into the skew, kurtosis, and higher order central moments. It would be nice to know what underlying method MS use for their RAND() function.
 

Attachments

Last edited by a moderator:
Great thread.
However, there is a problem if you are working with time data series, so you have positive numbers only, and you need to create random numbers from normal distribution. Random generated numbers are positive and negative numbers. Now what? What to do to get positive numbers only (because time values have to be positive numbers) within given SD and mean? Just to mention I have to use normal distribution, not log normal.
 
Back
Top Bottom