2nd June 2008, 03:31 AM
Inactive Registered Visitor
Registration Date: Jun 2008
Location: Salt Lake City, Utah
Failures-in-time (FITs), DPPM and Probability Calculation Questions
Hello forum members, I am glad I found this very informative website. My name is John, I am a new member, and this is my first post, so I hope I have chosen the correct forum.
I performed an MTBF prediction on my electronics during the design phase of the project, but I have come to realize that it did not accurately predict the actual reliability of the hardware, i.e., the measured failures exceeded the predicted failures in time.
In the system, I have 22 components powered and running in parallel, with a published DPPM of 200 (40 DPPM for higher-rated parts) and a published 9 FITs; any failure causes a catastrophic failure in the system. Each component has 4 die (88 per system) that could fail. I plan on having 5000 devices in the field (440000 die), and these devices will be powered and running 24/7. My dilemma is figuring out which formulas I can use to relate the DPPM and FIT data I have received, and how I can determine the actual reliability of this system.
Also, to prevent single-event catastrophic failures, we are placing a redundant component to act as a back-up in the event of a die/component failure. Can someone point me in the right direction for calculating the probability of two failures occurring in one system?
Thanks in advance, any help is much appreciated.

2nd June 2008, 05:14 AM
Super Moderator
Registration Date: Sep 2005
Location: Johore/Malaysia
Re: Failure in time (FITs), DPPM and Probability Calculation Questions
Some definitions / terminology
Quote:
DPPM: Defective Parts Per Million
MTBF: Mean Time Between Failures
FIT: Failures-in-Time, the number of failures per 1E9 (one billion) device-hours
Confidence Level %: the statistical confidence level at which a failure rate is stated
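A small sketch of how the FIT and MTBF definitions above connect, assuming a constant failure rate (the usual assumption behind a quoted FIT). The component count and FIT value are the ones given in the first post; the function names are illustrative only.

```python
HOURS_PER_BILLION = 1e9  # FIT is defined per 1e9 device-hours


def fit_to_mtbf_hours(fit: float) -> float:
    """MTBF in hours for a constant failure rate expressed in FIT."""
    return HOURS_PER_BILLION / fit


def mtbf_hours_to_fit(mtbf_hours: float) -> float:
    """Failure rate in FIT for a given MTBF in hours."""
    return HOURS_PER_BILLION / mtbf_hours


component_fit = 9.0          # published figure quoted in the first post
components_per_system = 22   # first post: any single failure is fatal (series model)

# In a series model with independent parts, failure rates simply add.
system_fit = component_fit * components_per_system
system_mtbf = fit_to_mtbf_hours(system_fit)

print(f"System failure rate: {system_fit:.0f} FIT")
print(f"System MTBF: {system_mtbf:,.0f} hours")
```

Under these assumptions the 22 parts alone give about 198 FIT, or an MTBF of roughly 5 million hours. A full system prediction would also include every other part on the board.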
Last edited by harry; 2nd June 2008 at 05:32 AM.

2nd June 2008, 01:00 PM
Inactive Registered Visitor
Re: Failures-in-time (FITs), DPPM and Probability Calculation Questions
Harry, thanks for the reminder of the definitions. My system has a predicted MTBF of 650000 hours, but clearly it is not accurate: I have experienced component failures well in excess of that prediction/model. What I am now looking to do is correlate the component-level DPPM (200) and introduce FITs (9) into that formula to get the worst-case scenario in terms of reliability. Also, we've added a redundant component to protect against chip failure, and I need to calculate the probability of a failure occurring on two chips at the same time. Can you direct me to the right place?

3rd June 2008, 02:48 AM
Super Moderator
Re: Failures-in-time (FITs), DPPM and Probability Calculation Questions
Quote:
Originally Posted by johnb1
... My system has an MTBF of 650000 hours but clearly it is not accurate. I have experienced component failures way in excess of that prediction/model ...
Did you use the Mil-Std to work out the MTBF? It is recognized that the Mil-Std is overly conservative. Commercial approaches favor the Bellcore/Telcordia (SR-332) standard because it is an improvement on the Mil-Std and closer to reality.
With regard to your other questions, I'll have to leave them to Covers who have more knowledge in that area.

3rd June 2008, 04:00 AM
Inactive Registered Visitor
Re: Failures-in-time (FITs), DPPM and Probability Calculation Questions
Harry, thanks for your continued interest. Yes, I use MIL-HDBK-217 as a reference, but I mostly use software tools that automatically derive the failure rates, FITs, DPPM, etc. The Handbook allows for conservative prediction models, but in reality it does not make good predictions for components that use, e.g., floating-gate logic or NAND. I have 50 components that have 200 DPPM and 9.93 FITs. We have added two additional components to act as a redundant back-up in the event of a failure. My question is: what is the probability that the redundant parts will also fail? Thanks, John.

3rd June 2008, 07:19 AM
Willy Grunfeld
Registration Date: Jul 2005
Location: Israel
Re: Failures-in-time (FITs), DPPM and Probability Calculation Questions
John,
It seems you are mixing up some metrics that are related but not identical.
There is no direct correlation between DPPM and reliability (expressed as FIT). DPPM is a production metric that counts every defect, some related to reliability and some not. The simplest example: a smeared marking is a defect and is counted in the DPPM, but it has no bearing on reliability.
Most manufacturing defects are, at least theoretically, screened out by the various inspections and the final test. In reliability work it is assumed that all parts are defect-free at time t=0. FIT is a reliability metric that you can get from the manufacturer. It is calculated from the number of parts that failed during the manufacturer's life test, and the industry standard is to calculate the FIT at a 60% confidence level. Even so, it is a lot more accurate than "predicting" it with any generic prediction method such as MIL-HDBK-217, Prism, 217+ or Telcordia.
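A minimal sketch of how a life test with zero failures can lead to a FIT figure at 60% confidence, assuming a constant failure rate. The sample size, test duration and acceleration factor below are hypothetical, and the vendor's actual procedure is not given in this thread. With zero failures, the usual chi-square upper bound reduces to -ln(1 - CL) divided by the equivalent device-hours.

```python
import math


def fit_upper_bound_zero_failures(device_hours, confidence=0.60,
                                  acceleration_factor=1.0):
    """Upper-bound failure rate in FIT when a life test sees zero failures.

    Assumes an exponential (constant failure rate) model. For r > 0 observed
    failures the general chi-square bound with 2r + 2 degrees of freedom is
    needed instead; this function only covers r = 0.
    """
    equivalent_hours = device_hours * acceleration_factor
    rate_per_hour = -math.log(1.0 - confidence) / equivalent_hours
    return rate_per_hour * 1e9


# Hypothetical life test: 1000 parts for 1000 hours, acceleration factor 100.
test_device_hours = 1000 * 1000
fit_60 = fit_upper_bound_zero_failures(test_device_hours, 0.60, 100.0)
fit_90 = fit_upper_bound_zero_failures(test_device_hours, 0.90, 100.0)

print(f"FIT at 60% confidence: {fit_60:.2f}")
print(f"FIT at 90% confidence: {fit_90:.2f}")
```

The same test data gives a noticeably higher FIT at 90% confidence than at 60%, which is one reason a published 60% figure can look optimistic next to field experience.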
As for the two parts added for redundancy, your question is too general for me to say which method is correct. Are these two components in active standby or static standby (kicking in upon failure)? In short, you need to draw up a reliability block diagram showing the redundant components with the series and parallel branches. Once you have that, there are numerous sources that will direct you to the correct way of calculating the system reliability. It is not the same as the "probability that both components will fail at the same time", as you imply.
See for instance: http://www.ece.cmu.edu/~koopman/des_...ity/index.html
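Once a block diagram exists, the arithmetic can be sketched as below. This is only an illustration under stated assumptions: constant failure rates, independent failures, active (hot) redundancy with perfect switching, and a hypothetical layout in which one spare can stand in for any of the 22 parts. The real architecture is not described in this thread, and a static standby arrangement needs a different formula.

```python
import math


def reliability(fit, hours):
    """Survival probability of one part over `hours` at a constant rate in FIT."""
    return math.exp(-fit * 1e-9 * hours)


def series(reliabilities):
    """All blocks must survive: multiply the individual reliabilities."""
    return math.prod(reliabilities)


def k_out_of_n(k, n, r):
    """At least k of n identical, independent blocks survive (binomial sum)."""
    return sum(math.comb(n, i) * r**i * (1.0 - r)**(n - i)
               for i in range(k, n + 1))


one_year = 24 * 365
r_part = reliability(9.0, one_year)      # 9 FIT as quoted in the first post

no_spare = series([r_part] * 22)         # all 22 parts must work
one_spare = k_out_of_n(22, 23, r_part)   # hypothetical: any 22 of 23 must work

print(f"P(system fails in 1 year), no spare:  {1 - no_spare:.3e}")
print(f"P(system fails in 1 year), one spare: {1 - one_spare:.3e}")
```

With a spare, the system fails only when at least two parts have failed within the mission time, which is a different quantity from two parts failing at the same instant.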

4th June 2008, 04:22 AM
Inactive Registered Visitor
Re: Failures-in-time (FITs), DPPM and Probability Calculation Questions
Willy,
Thanks for your detailed response. I should be more specific. The vendor I speak of publishes a DPPM of 200 for their NAND flash. I don't know whether they actually tested a million parts or it was just a prediction/model; nevertheless, I accept it. Any die failure causes a catastrophic event for me, and therefore a system failure. In my system I potentially have 880000 die (potential failures). Can I express DPPM as 200/880000 = 0.00022, or 0.02%? I also know that the vendor has published a FIT of 9.93. Therefore, in a billion hours of operation I can expect 9.93 failures; is this expression correct? Now, I may have 5000 devices in the field, and if I run them 24/7 I can expect the hours of operation to be 4.38e7. Accepting the 9.93 FIT data, I can expect 9.93 failures per billion hours of operation, which is approximately 42.6 failures over this time (1 year). Is this correct? Can I factor in the DPPM from the vendor as I have expressed it, or do I only use the FITs?
I have experienced 14 failures in a few weeks of operation in a sample set of 100 devices. Naturally there is concern, so we have moved towards our secondary supplier. Nevertheless, I am faced with predicting real-world data and not the vague predictions of MIL-HDBK-217. Looking forward to your reply.
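On the DPPM arithmetic, a brief sketch may help. By definition DPPM is already a ratio per million parts, so the defective fraction is DPPM / 1,000,000 regardless of how many parts are in the field; the population only enters when estimating a count. Whether the vendor quotes the figure per packaged component or per die is not stated in this thread, so both populations below are assumptions.

```python
def dppm_to_fraction(dppm):
    """Defective fraction implied by a DPPM figure."""
    return dppm / 1_000_000


def expected_defective(dppm, population):
    """Expected number of defective parts in a population of the given size."""
    return dppm * population / 1_000_000


vendor_dppm = 200

print(f"Defective fraction: {dppm_to_fraction(vendor_dppm):.4%}")
print(f"Per 110000 components (5000 x 22): {expected_defective(vendor_dppm, 110_000)}")
print(f"Per 440000 die (5000 x 88): {expected_defective(vendor_dppm, 440_000)}")
```

These are expected counts of parts that are defective as shipped, which is a separate question from how many good parts fail later in service.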

4th June 2008, 07:25 AM
Willy Grunfeld
Re: Failures-in-time (FITs), DPPM and Probability Calculation Questions
Leave the DPPM out! Use only the FIT. I did not repeat your calculation, but it sounds right: number of expected failures = FIT x operating hours x number of ICs (not dies!), with the FIT taken as failures per 1E9 device-hours.
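The formula can be worked through as a quick sketch. The 22 ICs per device is an assumption taken from the first post (a later post mentions 50 components), so the resulting figure moves with whichever count is correct.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours of 24/7 operation


def expected_failures(fit, hours_per_ic, ic_count):
    """Expected failure count, with FIT in failures per 1e9 device-hours."""
    return fit * 1e-9 * hours_per_ic * ic_count


devices = 5000        # planned field population, from the thread
ics_per_device = 22   # assumption: taken from the first post

fleet_device_hours = devices * HOURS_PER_YEAR   # 4.38e7, as quoted in the thread
per_year = expected_failures(9.93, HOURS_PER_YEAR, devices * ics_per_device)

print(f"Device-hours per year for the fleet: {fleet_device_hours:.3g}")
print(f"Expected IC failures per year: {per_year:.2f}")
```

The 4.38e7 hours counts each device once, so it has to be multiplied by the number of ICs per device before applying the per-IC FIT. With 22 ICs per device this comes to roughly 9.6 expected failures per year.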
The real-life failure count will likely be higher in any case, first because the FIT given by the manufacturer is at a 60% CL. Also, the manufacturer's FIT relates to components tested on a test board with the ICs plugged in and nothing else. In your system there are added failure modes, such as those caused by substandard solder joints or other manufacturing imperfections, handling (ESD damage) and others.
So I wouldn't rush to change suppliers before factoring in all of the above. Finally, a few weeks of operation is not enough time to decide either way. Determining whether the actual field results are compatible with the FIT data, and at what confidence, is a lot more complicated than just multiplying a few factors. You would need to use a Weibull analysis; consult www.weibull.com (Life Data Analysis) to understand the underlying theory.
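Before a full life-data analysis, a very crude first check is possible: a Poisson calculation of how likely the observed count would be if the published FIT were the only source of failures. It is far simpler than the Weibull analysis recommended above and is not a substitute for it. The 4-week duration and the 22 ICs per device are assumptions, since the thread only says "a few weeks" and gives two different component counts.

```python
import math


def poisson_tail(k_min, mean, terms=200):
    """P(X >= k_min) for X ~ Poisson(mean), summed directly in log space.

    Summing the tail avoids the cancellation in 1 - P(X < k_min) when the
    tail is tiny. The sum is truncated after `terms` terms, which is ample
    when `mean` is small compared with k_min + terms. Requires mean > 0.
    """
    def log_pmf(k):
        return -mean + k * math.log(mean) - math.lgamma(k + 1)

    return sum(math.exp(log_pmf(k)) for k in range(k_min, k_min + terms))


fit = 9.93
sample_devices = 100           # sample set size from the thread
ics_per_device = 22            # assumption: taken from the first post
hours_in_service = 4 * 7 * 24  # assumption: "a few weeks" taken as 4 weeks

mean_failures = fit * 1e-9 * hours_in_service * sample_devices * ics_per_device
p_at_least_14 = poisson_tail(14, mean_failures)

print(f"Expected failures from FIT alone: {mean_failures:.4f}")
print(f"P(14 or more failures): {p_at_least_14:.3e}")
```

If that probability is vanishingly small, the intrinsic FIT alone is unlikely to explain the returns, which fits with the added system-level failure modes (solder joints, ESD, handling) listed above and points towards examining them rather than only the component supplier.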
Willy
__________________
Willy Grunfeld
RAMSQ Consultant