
# HISTOGRAM

A histogram displays measurement data in bar graph format, with the data grouped into classes (categories).

A HISTOGRAM IS USED FOR:

1. Making decisions about a process, product, or procedure that could be improved after examining the variation (example: After examining the grade distribution, should the school invest in a computer-based tutoring program for low-achieving students in Algebra I? Are more of the out-of-specification shafts too big rather than too small?)
2. Displaying the variation in a process clearly (example: Which units are causing the most difficulty for students? Is the variation in a process due to parts that are too long or parts that are too short?)

STEPS IN CONSTRUCTING A HISTOGRAM:

1. Gather and tabulate data on a process, product, or procedure. This could be time, weight, size, frequency of occurrences, test scores, GPAs, pass/fail rates, number of days to complete a cycle, diameter of shafts built, etc.
2. Calculate the range of the data by subtracting the smallest number in the data set from the largest. Call this value R.
3. Decide approximately how many bars (or classes) you want to display in your eventual histogram. Call this number K. This number should never be less than four and should seldom exceed 12. With 100 numbers, K=7 generally works well. With 1000 pieces of data, K=11 works well.
4. Determine the fixed width of each class by dividing the range, R, by the number of classes, K. This value should be rounded to a "nice" number, generally a number ending in a zero. For example, 11.3 would not be a "nice" number; 10 would be. Call this number i, for interval width. It is important to use "nice" numbers; otherwise the histogram will have awkward scales on the X axis.
5. Create a table of upper and lower class limits. The first "nice" number less than the lowest value in the data set becomes the lower limit of the first class. Add the interval width i to this number to determine the upper limit of the first class. The upper limit of the first class becomes the lower limit of the second class. Adding the interval width i to the lower limit of the second class determines the upper limit of the second class. Repeat this process until the largest upper limit exceeds the biggest piece of data. You should have approximately K classes or categories in total.
6. Sort, organize, or categorize the data in such a way that you can count or tabulate how many pieces of data fall into each of the classes or categories in your table above. These are the frequency counts and will be plotted on the Y axis of the histogram.
7. Create the framework for the horizontal and vertical axes of the histogram. On the horizontal axis plot the lower and upper limits of each class determined above. The scale on the vertical axis should run from zero to the first "nice" number greater than the largest frequency count determined above.
8. Plot the frequency data on the histogram framework by drawing vertical bars for each class. The height of each bar represents the number or frequency of values occurring between the lower and upper limits of that class.
9. Interpret the histogram for skew and clustering problems:
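
The arithmetic in steps 2 through 6 above can be sketched in Python. This is a minimal illustration under stated assumptions, not a definitive implementation: the helper names (`nice_width`, `build_classes`, `count_frequencies`), the sample test scores, and the choice K = 5 are all hypothetical. The rule of rounding to 1, 2, 5, or 10 times a power of ten is one possible reading of a "nice" number, and counting a value in a class when lower <= value < upper is an assumed boundary convention that the steps do not specify.

```python
import math


def nice_width(raw_width):
    """Round a raw class width to 1, 2, 5, or 10 times a power of ten.

    Assumption for this sketch: the steps above only ask for a "nice"
    number (10 rather than 11.3) and do not fix a rounding rule.
    """
    magnitude = 10 ** math.floor(math.log10(raw_width))
    fraction = raw_width / magnitude
    nearest = min((1, 2, 5, 10), key=lambda c: abs(c - fraction))
    return nearest * magnitude


def build_classes(data, k):
    """Return (i, limits), where limits is a list of (lower, upper) pairs."""
    r = max(data) - min(data)               # step 2: range R
    if r == 0:
        raise ValueError("all values are identical; there is no range to divide")
    i = nice_width(r / k)                   # step 4: interval width i
    lower = math.floor(min(data) / i) * i   # step 5: nice number at or below the minimum
    limits = []
    while True:
        upper = lower + i
        limits.append((lower, upper))
        if upper > max(data):               # stop once the upper limit exceeds the largest value
            break
        lower = upper                       # upper limit becomes the next lower limit
    return i, limits


def count_frequencies(data, limits):
    """Step 6: count the values with lower <= value < upper in each class."""
    return [sum(1 for x in data if lo <= x < hi) for lo, hi in limits]


# Hypothetical test scores, used only to exercise the functions.
scores = [52, 58, 61, 64, 67, 68, 71, 72, 73, 75,
          76, 78, 79, 81, 83, 84, 88, 91, 95, 97]
K = 5                                       # step 3: desired number of classes

width, limits = build_classes(scores, K)
counts = count_frequencies(scores, limits)

# Steps 7 and 8, rendered as a rough text histogram.
for (lo, hi), n in zip(limits, counts):
    print(f"{lo:>3} to {hi:<3} | {'#' * n} {n}")
```

For these sample scores, R = 45 and K = 5 give a raw width of 9, which rounds to i = 10, so the classes run from 50 to 60 up through 90 to 100. Because the width is rounded, the number of classes will not always equal K exactly, which is why step 5 only promises approximately K classes.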

Interpreting skew problems:

Data may be skewed to the left or right. If the histogram shows a long tail of data on the left side, the data is termed left-skewed or negatively skewed. If the tail appears on the right side, the data is termed right-skewed or positively skewed. Most process data should not appear skewed. Data that is seriously skewed in either direction may indicate inconsistencies in the process or procedures. In that case, decide whether the direction of the skew is appropriate for the process being measured.

It should be noted, however, that some process data is, by its very nature, skewed. This situation occurs in arrival processes (for example, people arriving at a McDonald's within a fixed unit of time) and in service processes (for example, the time it takes to wait on a customer in a bank).
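
The direction of skew can also be checked numerically rather than by eye. The sketch below computes the moment coefficient of skewness (the third central moment divided by the 1.5 power of the second central moment): a positive value corresponds to a tail on the right, a negative value to a tail on the left. The function name `skewness` and the sample service times are hypothetical, and no cutoff for "seriously" skewed is implied here, since the text above does not give one.

```python
def skewness(data):
    """Moment coefficient of skewness: m3 / m2**1.5 (population form)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n   # second central moment (variance)
    m3 = sum((x - mean) ** 3 for x in data) / n   # third central moment
    if m2 == 0:
        raise ValueError("all values are identical; skewness is undefined")
    return m3 / m2 ** 1.5


# Hypothetical service times in minutes: most are short, a few are long.
service_times = [1, 1, 2, 2, 2, 3, 3, 4, 6, 12]
mirrored = [-t for t in service_times]            # same shape with the tail on the left
symmetric = [1, 2, 3, 4, 5]

print("service times:", skewness(service_times))  # positive: tail on the right
print("mirrored:     ", skewness(mirrored))       # negative: tail on the left
print("symmetric:    ", skewness(symmetric))      # zero: no skew
```

The sign is the main thing to read from this number. How large a value counts as a problem depends on the process, as the bank and arrival examples above suggest.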

Interpreting clustering problems:

Data may be clustered at opposite ends of the scale or may display two or more peaks. Either pattern indicates serious inconsistencies in the process or procedure, or that the measurements mix two or more distinct groups or processes that behave very differently.
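
As a rough aid, the peaks in a histogram can be counted directly from the frequency counts tabulated in step 6. The sketch below treats a class as a peak when its count is higher than the counts on both sides, with a run of equal neighboring counts treated as a single bar. The function name `find_peaks` and both sets of counts are hypothetical. With small samples, random variation can produce spurious peaks, so a result like this is a prompt to look more closely, not proof of a mixed process.

```python
def find_peaks(counts):
    """Return the indices of classes whose count exceeds both neighbors.

    Runs of equal counts are collapsed first so that a flat top is
    reported once, at the index where the run starts.
    """
    runs = []                                     # (first index of run, count)
    for idx, c in enumerate(counts):
        if not runs or runs[-1][1] != c:
            runs.append((idx, c))

    peaks = []
    for pos, (idx, c) in enumerate(runs):
        # A missing neighbor at either edge is treated as lower than any count.
        left = runs[pos - 1][1] if pos > 0 else -1
        right = runs[pos + 1][1] if pos < len(runs) - 1 else -1
        if c > 0 and c > left and c > right:
            peaks.append(idx)
    return peaks


# Hypothetical frequency counts, one entry per class.
single_peak = [2, 4, 7, 4, 3]
two_peaks = [1, 6, 9, 5, 1, 2, 7, 10, 4, 1]

print(find_peaks(single_peak))   # one peak, in the middle class
print(find_peaks(two_peaks))     # two separated peaks
```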

A discussion of histograms and exploratory data analysis at NIST

If you are interested in a solution to generate histograms, Cpk, and control charts, visit https://www.pqsystems.com/ to learn more about SPC software.