# How can I determine the kind of distribution - What about Binomial or Poisson?

#### vasilist

Hello there to all of you.

This is my first thread and I hope I'll get some good news.

How can I determine the kind of distribution when I have a large series of numbers?

To determine whether it is a Normal distribution I think it is easy because of the bell-shaped graph. But what about Binomial or Poisson?

If your answer also covers Minitab software, I will be grateful.

Vasilis

#### Marc

##### Hunkered Down for the Duration with a Mask on...
Staff member
Hello, and welcome to the Cove!

If you take a large number of data points and plot them and look at the distribution, the type of distribution should be evident from the shape of the plot. Of course, it's not always as clear as day and night.

Maybe one of the statistical gurus can elaborate.

#### Rick Goodson

vasilist,

Welcome to the Cove!

First, as you are probably aware, there are two types of probability distributions, continuous and discrete. The Normal or Gaussian distribution is continuous while the binomial and Poisson are discrete. The distribution is related to the type of data.

While a plot of the data in a histogram will give you a general idea about the distribution, it won't help you to identify the type of distribution. For example, flipping eleven coins (each a discrete outcome, heads or tails) a very large number of times yields a distribution of head counts that looks like a Gaussian distribution although it is in fact binomial.
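To make the coin example concrete, here is a minimal Python sketch (my own illustration, not something from a handbook). It compares the exact binomial probabilities for eleven fair coins with a normal curve that has the same mean and standard deviation. The function names are hypothetical helpers written for this post.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(x, mean, sd):
    """Density of the normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

n, p = 11, 0.5                      # eleven fair coins
mean = n * p                        # binomial mean
sd = math.sqrt(n * p * (1 - p))     # binomial standard deviation

# Print the two columns side by side; they track each other closely
# even though one is discrete and the other continuous.
for k in range(n + 1):
    b = binomial_pmf(k, n, p)
    g = normal_pdf(k, mean, sd)
    print(f"{k:2d} heads  binomial={b:.4f}  normal={g:.4f}")
```

The two columns are close at every count, which is why the histogram shape alone cannot tell you which model generated the data.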

You did not state why you care about the type of distribution so I will have to assume it has to do with when to apply them. Some guidelines are:

The binomial is used for infinite populations, or when there is a steady stream of product so that we can assume an infinite supply. If the fraction defective is less than or equal to 0.10 and the average number of occurrences per time unit or amount is less than or equal to 5, you can use the Poisson as an approximation to the binomial. If the fraction defective is close to 0.5 and the sample size is equal to or greater than 10, the normal curve is a good approximation. As the sample size increases, the fraction defective requirement relaxes, so that at samples of 50, fraction defectives from 0.10 to 0.90 work.
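As a rough check of the Poisson guideline, the sketch below compares binomial and Poisson probabilities for one case that satisfies it. The sample size of 50 and fraction defective of 0.05 are values I picked for illustration (giving an average of 2.5 defectives per sample), and the helper names are hypothetical.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k defectives in a sample of n."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Probability of exactly k occurrences when the average is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

n, p = 50, 0.05     # assumed sample size and fraction defective
lam = n * p         # average number of defectives per sample

# Largest disagreement between the two models over all possible counts
max_gap = max(
    abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(n + 1)
)

for k in range(8):
    print(f"{k} defectives  binomial={binomial_pmf(k, n, p):.4f}  "
          f"poisson={poisson_pmf(k, lam):.4f}")
print(f"largest gap: {max_gap:.4f}")
```

Changing `p` to something like 0.4 makes the gap grow noticeably, which is the situation where the guideline says not to use the Poisson.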

Hope this helps. If you have further questions or need additional information feel free to ask. There is no shortage of information at the Cove.

Regards,

Rick

#### Dave Strouse

Good old Juran's handbook covers basic distributions, and it has a section in the 5th edition, page 44.27, on selecting discrete distributions. It recommends a chi-square "goodness of fit" test if the process particulars do not lead to a model. Many introductory stats books cover these topics.
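For anyone who wants to see what a chi-square goodness-of-fit test looks like outside of Minitab, here is a minimal sketch against a Poisson model using only the Python standard library. The tally of defects per unit is made up for illustration, the mean is estimated by treating the last "4 or more" cell as exactly 4 (a simplification), and the critical values are the usual 5% table values. This is my own sketch, not the procedure as written in Juran.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k occurrences when the average is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Hypothetical tally: observed[k] units had k defects; last cell is "4 or more".
observed = [60, 72, 44, 18, 6]
total = sum(observed)

# Estimate the Poisson mean from the data (last cell counted as exactly 4).
lam = sum(k * count for k, count in enumerate(observed)) / total

# Expected counts under the Poisson model, with the tail pooled in the last cell.
probs = [poisson_pmf(k, lam) for k in range(len(observed) - 1)]
probs.append(1.0 - sum(probs))
expected = [total * q for q in probs]

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom: cells - 1, minus 1 more for the estimated mean.
df = len(observed) - 1 - 1

# Upper 5% critical values of the chi-square distribution.
CRITICAL_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}
reject = chi_sq > CRITICAL_5PCT[df]

print(f"chi-square = {chi_sq:.3f}, df = {df}, "
      f"critical = {CRITICAL_5PCT[df]}, reject Poisson: {reject}")
```

A statistic below the critical value means the data give no reason to reject the Poisson model; it does not prove the model is correct. The common rule of thumb is to pool cells until every expected count is about 5 or more.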

As Rick mentioned, it would be helpful to know what kind of process you are dealing with and what you will use the distributional model for.

If you are looking at long-term data, for instance, you will likely be confused by shifting and drifting of the process, which shows up as multimodal behavior.
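A small simulation shows how a process shift can produce a two-humped histogram even when each period on its own is normal. The process means (10 and 14), the standard deviation, and the sample sizes are all values I made up for this sketch.

```python
import random
from collections import Counter

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical process: centred at 10.0 at first, then shifts to 14.0.
before = [random.gauss(10.0, 1.0) for _ in range(1000)]
after = [random.gauss(14.0, 1.0) for _ in range(1000)]
pooled = before + after

# Tally into unit-wide bins centred on whole numbers.
bins = Counter(round(x) for x in pooled)

# Crude text histogram: one '#' per 10 observations.
for value in sorted(bins):
    print(f"{value:3d} {'#' * (bins[value] // 10)}")
```

The pooled data show peaks near 10 and 14 with a dip in between, so fitting any single distribution to the whole series would be misleading. Plotting the data in time order first usually reveals this kind of shift.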

MINITAB has a distribution ID routine under the STATS>RELIABILITY/SURVIVABILITY header in V13 and above. However, it only covers continuous distributions.

Finally, ASQ has a book in their quality handbooks series that deals with (mostly normal) distributional assumptions.