Should Data be Normal before Computing Baselines?


patkim

When computing performance baselines, is it recommended to remove outliers from the collected data, or should the baselines be computed with the outliers included?

Could someone please elaborate on the correct or recommended approach?

For example, say a software development company collects Effort Variance from 40 milestones. The data is not normal because of a few outliers: say 4 data points are heavily skewed. Should the baseline be computed with all 40 data points, or should I remove the outliers?

Thanks in advance.
 

Bev D

Heretical Statistician
Leader
Super Moderator
Don't test for Normality and don't remove so-called 'outliers'.
The only reason to remove a data point is that it is actually a fake value.


It might be helpful if you could explain what you are trying to do with the baseline...
For example, if you are trying to reduce the deviation from milestone timelines, then removing the 'skewed values' (I assume they are 'large deviations'?) from the baseline removes the very poor performance that you seek to improve. Focus on the causes of the deviation, improve the performance, then plot the 'after' values against the 'before' values and see whether the improvements were good enough. Fancy statistical calculations do not help you do this...
 

bobdoering

Stop X-bar/R Madness!!
Trusted Information Resource
Your baseline should be taken while your process is stable. The stable output should be normal only if the output variation is SUPPOSED to be normal, that is, your variation is expected to be random and independent, with its mean centered within its variation. If not, then the baseline needs to follow the distribution that you would expect in its stable condition (e.g. non-normal, such as skewed or uniform).
 

Bev D

Heretical Statistician
Leader
Super Moderator
A baseline 'should' be stable for the calculation of capability indices (ugh!)

A baseline only needs to be relatively stable in order to calculate control limits for a control chart. We can still calculate 'good limits from bad data' if we understand the process and how control charts work. :) In this case we would exclude a few extreme data points from the control limit calculations but would keep the data in the chart.
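As a rough illustration of 'exclude from the limits, keep in the chart', here is a minimal Python sketch for an individuals (XmR) chart. The effort variance numbers are made up, the function names are hypothetical, and dropping the excluded values before taking moving ranges is just one simple convention among several; treat it as a sketch under those assumptions, not a prescribed method.

```python
from statistics import mean


def xmr_limits(values, exclude=()):
    """Individuals (XmR) chart limits computed without the indices in `exclude`.

    Assumption: excluded values are dropped first, and moving ranges are taken
    between consecutive retained values.
    """
    skip = set(exclude)
    kept = [v for i, v in enumerate(values) if i not in skip]
    if len(kept) < 2:
        raise ValueError("need at least two retained points")
    moving_ranges = [abs(b - a) for a, b in zip(kept, kept[1:])]
    center = mean(kept)
    # 2.66 is the usual XmR constant, 3 / d2 with d2 = 1.128 for ranges of 2.
    half_width = 2.66 * mean(moving_ranges)
    return center - half_width, center, center + half_width


def signals(values, limits):
    """Indices of ALL plotted points (excluded ones too) outside the limits."""
    lcl, _, ucl = limits
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


# Hypothetical effort variance values (%), for illustration only.
effort_variance = [4, 6, 5, 7, 5, 48, 6, 4, 5, 7, 6, 52, 5, 6]

# Limits come from the routine points; every point still goes on the chart.
limits = xmr_limits(effort_variance, exclude=[5, 11])
flagged = signals(effort_variance, limits)

# For comparison: limits computed from everything, extremes included.
limits_all = xmr_limits(effort_variance)
```

With these made-up numbers, the limits computed from all points are far wider than the limits computed without the two extreme points, yet both extreme values stay on the chart and show up as signals to investigate.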

If the OP is determining a baseline for other reasons (like understanding the actual process performance for budgeting or problem solving, etc.), the process doesn't have to be stable at all. And if this is the intent, censoring data is horribly misguided and counterproductive.

Perhaps the OP could provide more information on what they are trying to do?
 

bobdoering

Stop X-bar/R Madness!!
Trusted Information Resource
The key point I was trying to make, going back to the OP's question, is that 'normal' may have absolutely nothing to do with it. Even if you see a normal distribution, it may have nothing to do with the process. It may be all gage or measurement error, which tends to generate normal distributions that mask the true underlying process variation.

The distribution that models your current state may also be totally different from the distribution in its optimum state. But neither may be normal. Curve fit and find a meaningful model.
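One possible way to sketch the 'curve fit' step in Python is to fit two candidate models to the same data and compare their log-likelihoods. The data below is synthetic (deliberately right-skewed), the helper names are hypothetical, and a lognormal candidate only applies to strictly positive values, so this is an illustration under those assumptions rather than a recommendation of any particular model.

```python
import math
from statistics import NormalDist, mean, pstdev


def normal_loglik(data):
    """Log-likelihood of a normal model fitted by maximum likelihood."""
    model = NormalDist(mean(data), pstdev(data))
    return sum(math.log(model.pdf(x)) for x in data)


def lognormal_loglik(data):
    """Log-likelihood of a lognormal model (requires strictly positive data)."""
    if any(x <= 0 for x in data):
        raise ValueError("lognormal model needs strictly positive values")
    logs = [math.log(x) for x in data]
    model = NormalDist(mean(logs), pstdev(logs))
    # Change of variables: density of x is normal_pdf(ln x) / x.
    return sum(math.log(model.pdf(lx)) - lx for lx in logs)


# Synthetic, right-skewed sample of 40 points (not real project data):
# exponentiated standard-normal quantiles at evenly spaced probabilities.
n = 40
sample = [math.exp(NormalDist().inv_cdf((i + 0.5) / n)) for i in range(n)]

ll_normal = normal_loglik(sample)
ll_lognormal = lognormal_loglik(sample)
better = "lognormal" if ll_lognormal > ll_normal else "normal"
```

A higher log-likelihood only says one candidate describes the data better than the other. Whether the model is meaningful still depends on understanding the process that generated the data.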
 