# Determining Sample Size - Filling a bag with three different types of candy

#### PerfettiFUN

i work in a candy factory that makes airheads candy and currently we use a vertical form fill bagger to fill a bag with three different types of airheads (normally cherry, blue raspberry, watermelon). our biggest complaint always seems to be that someone "only got two reds!" in their bag or something similar.

on our machine 3 hoppers are filled with different colors that are vibrated to fill the bucket conveyor up which then dumps into the actual bagger at the top.

what i am planning on doing is taking a number of bags, counting them up, and seeing what kind of ratio we are really getting. then i want to see whether the ratio improves if i take a large box with all three colors and mix them up really well before they go into the conveyor. i think the current hopper system, with 1 color in each hopper, may be putting out waves of each color, causing the bags to be skewed. what my boss said he wants to know is if mixing them more gives us a better ratio of colors in the bags.

what i am wondering is how i would calculate my sample size for this?

i know it is discrete data so i will need a TON of data to statistically prove anything. i am guessing more than is really feasible in this case.

any of you statistics gurus out there that can give me some help on this, it would be very much appreciated. it's been a while since statistics class so the methods for sample size determination aren't too fresh in my head

also any other kind of insight into this problem would be very much appreciated.

#### Tim Folkerts

Trusted Information Resource
Re: Determining sample size

I would suggest doing this theoretically rather than experimentally.

For example, if the ideal mix is 1/3 red, 1/3 blue, and 1/3 yellow, it would be relatively straightforward to calculate the expected number (and the spread) if you truly had a random sample of 2000 candies.
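As a rough sketch of that calculation, assuming each candy is independently one of three equally likely colors (so the count of any one color follows a binomial distribution), the expected count and spread for a random sample of 2000 could be worked out like this. The function name `binomial_mean_sd` is just an illustrative choice.

```python
import math

def binomial_mean_sd(n, p):
    """Mean and standard deviation of a binomial count (n trials, probability p)."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return mean, sd

# Assumption: a truly random sample of 2000 candies, each color with probability 1/3.
mean, sd = binomial_mean_sd(2000, 1 / 3)
print(f"expected count of one color: {mean:.1f}, standard deviation: {sd:.1f}")
```

Under those assumptions, the count of any one color would usually land within a couple of standard deviations of the expected count.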

I don't have time right now, but I might be able to help you with the problem later today (or others might jump in quicker).

Tim

#### howste

##### Thaumaturge
Trusted Information Resource
Maybe I'm naive, but couldn't you just use three bag fillers and put a set quantity (or weight) of one type of candy in at each station? This would eliminate the random chance of having too few red ones in any bag.

#### PerfettiFUN

i could definitely do it theoretically instead of mixing a bunch by hand then running it but i think that i would still want to do an analysis on the current state and compare it to a theoretically random sample. pretty much i am trying to prove whether we are or aren't getting a normal distribution of colors in each bag and what kind of standard deviation we are getting. i am just not sure what kind of sample size i need to find this out with the kind of data i am collecting.

also the way i would want to look at the data is by bag not by bar. for example i want to be able to make a histogram when i am done and put # of each color per bag.

also the bag size i am looking at is 30 pieces.
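For a per-bag histogram like that, it may help to have the theoretical shape to compare against. The sketch below assumes a perfectly random fill, where each of the 30 pieces is independently one of three equally likely colors, so the count of one color per bag is binomial with n = 30 and p = 1/3. `binomial_pmf` is a hypothetical helper name.

```python
import math

def binomial_pmf(k, n=30, p=1 / 3):
    """Probability of exactly k pieces of one color in a bag of n pieces."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Theoretical histogram for one color, assuming a perfectly random fill.
pmf = [binomial_pmf(k) for k in range(31)]
for k, prob in enumerate(pmf):
    if prob >= 0.005:  # skip the negligible tails to keep the printout short
        print(f"{k:2d}  {prob:.3f}  {'#' * round(prob * 200)}")
```

A histogram of real bags that is clearly wider than this one would suggest the fill is less random than an ideal mix.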

#### PerfettiFUN

> Maybe I'm naive, but couldn't you just use three bag fillers and put a set quantity (or weight) of one type of candy in at each station? This would eliminate the random chance of having too few red ones in any bag.

that is definitely possible to do but we are trying to see if we can get "close enough" without spending a lot of money on totally redoing the machine. that is why i am collecting the data to see how good we can get and whether it is worth it to spend the extra money or not.

#### howste

##### Thaumaturge
Trusted Information Resource
Typically initial process capability studies use at least 50 samples. I'd say that 30 samples would be the absolute fewest you would want to use.

#### Bev D

##### Heretical Statistician
Super Moderator
> Typically initial process capability studies use at least 50 samples. I'd say that 30 samples would be the absolute fewest you would want to use.

um. that is a good rule of thumb for continuous data capability studies that are randomly sampled. but since this is categorical data and the subgrouping is by bag, we'd have to look at how many bags we'd need.

I like the idea of knowing the theoretical distribution of randomly filling 3 different colors in a bag, then sampling actual bags to determine if you are at the theoretical distribution first. If you are, then further randomization of the candy prior to the current process won't help. If you aren't, then further randomization might help.
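One simple way to sketch that comparison is to divide the observed bag-to-bag variance of a color's count by the variance a random fill would give. This assumes 30 pieces per bag and an ideal share of 1/3 per color. The counts below are made-up placeholders, not real data, and `dispersion_ratio` is a hypothetical name.

```python
import statistics

def dispersion_ratio(counts, bag_size=30, p=1 / 3):
    """Observed variance of per-bag counts divided by the binomial variance."""
    theoretical_var = bag_size * p * (1 - p)  # about 6.67 for 30 pieces, p = 1/3
    return statistics.variance(counts) / theoretical_var

# Placeholder counts of one color in 8 bags (illustrative only, not measured).
example_counts = [10, 4, 16, 9, 13, 5, 15, 8]
ratio = dispersion_ratio(example_counts)
print(f"dispersion ratio: {ratio:.2f}")
```

A ratio near 1 would be consistent with a random fill. A ratio well above 1 would point to extra clumping, such as waves of one color. With only a handful of bags the ratio is noisy, so it is a screening number rather than a formal test.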

I'll try to run the theoretical later. Maybe Tim will get back before I do...

#### PerfettiFUN

> Typically initial process capability studies use at least 50 samples. I'd say that 30 samples would be the absolute fewest you would want to use.

oh ok, so there isn't any set formula i should be using to calculate out a statistically significant sample size for something like this?

what i am thinking i am going to do with my data right now is make a histogram for each color showing the number of occurrences of that color in the bags i sampled.

then once i do that i will find out my standard deviation and figure out what % of the time i will get a bag with less than say 6 of any color.
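For a theoretical baseline to set next to that percentage, the chance that a perfectly random 30-piece bag has fewer than 6 of at least one color can be computed exactly by enumerating every possible color split, with no normal approximation. This is a sketch assuming three equally likely colors. `prob_any_color_below` is a hypothetical name.

```python
import math

def prob_any_color_below(threshold, bag_size=30):
    """Exact probability that at least one of three equally likely colors
    appears fewer than `threshold` times in a randomly filled bag."""
    favorable = 0
    for a in range(bag_size + 1):
        for b in range(bag_size - a + 1):
            c = bag_size - a - b
            if min(a, b, c) < threshold:
                # Number of color sequences with exactly a, b, c pieces.
                favorable += math.comb(bag_size, a) * math.comb(bag_size - a, b)
    return favorable / 3**bag_size

p_short = prob_any_color_below(6)
print(f"random-fill chance of fewer than 6 of any color: {p_short:.1%}")
```

If the measured percentage from real bags is much higher than this baseline, the current fill is doing worse than random mixing would.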

#### M Komarmy - 2012

If I understand the problem correctly, you are filling a bag with 30 pieces of candy. There are three different colors and you wish to have 10 of each in the bag.

I would not make this very complicated. Run 30 to 50 bags with the current process and count how many of each color are in the bags. Record the data. Repeat the run after making your proposed change (another 30 to 50 bags). Record the data.

You can evaluate the means and variances for each color for both studies and see if there was an improvement (means closer to target and less variation).
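A minimal sketch of that before/after comparison for one color might look like the following. The two lists are made-up placeholder counts, not real measurements, the target of 10 per color comes from the 30-piece bag, and `summarize` is a hypothetical helper.

```python
import statistics

def summarize(counts, target=10):
    """Mean, distance of the mean from target, and sample variance."""
    mean = statistics.mean(counts)
    return mean, abs(mean - target), statistics.variance(counts)

# Placeholder counts of one color per bag (illustrative only, not measured).
before = [10, 4, 16, 9, 13, 5, 15, 8]
after = [10, 8, 12, 9, 11, 7, 13, 10]

mean_b, off_b, var_b = summarize(before)
mean_a, off_a, var_a = summarize(after)
variance_ratio = var_b / var_a
print(f"before: mean {mean_b:.1f}, variance {var_b:.2f}")
print(f"after:  mean {mean_a:.1f}, variance {var_a:.2f}")
print(f"variance ratio (before / after): {variance_ratio:.2f}")
```

A variance ratio well above 1 would suggest the change reduced bag-to-bag variation. Whether a given ratio is statistically convincing depends on how many bags are in each run.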

#### Bev D

##### Heretical Statistician
Super Moderator
Yes - there are set formulas for both the theoretical distribution of colors and what sample size you will need.
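As one example of such a formula (not necessarily the one meant here), the standard normal-approximation sample size for estimating a proportion is n = z^2 * p * (1 - p) / E^2. It could be used to size a study of the fraction of bags that come up short of a color. The guessed proportion, margin of error, and the name `bags_needed` are all assumptions for illustration.

```python
import math
from statistics import NormalDist

def bags_needed(p_guess, margin, confidence=0.95):
    """Bags needed to estimate a proportion to within +/- margin,
    using the normal approximation n = z^2 * p * (1 - p) / margin^2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z**2 * p_guess * (1 - p_guess) / margin**2)

# Assumed inputs: roughly 10% of bags short, estimated to within 5 points.
print(bags_needed(0.10, 0.05))
# Worst case when nothing is known about the proportion (p = 0.5).
print(bags_needed(0.50, 0.05))
```

Tighter margins drive the count up quickly, since halving the margin roughly quadruples the number of bags.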
