Should Data be Normal before Computing Baselines?

patkim

Registered Visitor
#1
While computing performance baselines, after you collect data, is it recommended to remove the outliers or should one compute the baselines including the outliers?

What is the correct or recommended approach? Could someone please elaborate on the correct approach?

For example, say a software development company collects Effort Variance from 40 milestones. The data is not Normal because there are a few outliers, say 4 data points that are too skewed. Should the baseline be computed with all 40 data points, or should I remove the outliers?

Thanks in advance.
 

Bev D

Heretical Statistician
Staff member
Super Moderator
#2
Don't test for Normality and don't remove so-called 'outliers'.
The only reason to remove a data point is that it is actually a fake value.


It might be helpful if you could explain what you are trying to do with the baseline...
For example, if you are trying to improve the deviation from milestone timelines, then removing the 'skewed values' (I assume they are 'large deviations'?) from the baseline removes the very poor performance that you seek to improve. Focus on the causes of the deviation, improve the performance, then plot the after performance values against the before values and see whether the improvements were good enough. Fancy statistical calculations do not help you do this...
 

bobdoering

Stop X-bar/R Madness!!
Trusted Information Resource
#3
Your baseline should be taken while your process is stable. It should only be normal when stable if the output variation is SUPPOSED to be normal, that is, your variation is expected to be random and independent, with its mean centered within its variation. If not, then it needs to follow the distribution that you would expect in its stable condition (e.g. non-normal, such as skewed or uniform).
 

Bev D

Heretical Statistician
Staff member
Super Moderator
#4
A baseline 'should' be stable for the calculation of capability indices (ugh!)

A baseline only needs to be relatively stable in order to calculate control limits for a control chart. We can still calculate 'good limits from bad data' if we understand the process and how control charts work. :) In this case we would exclude a few extreme data points from the control limit calculations but would keep the data in the chart.
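
As a rough illustration of that last point, here is a minimal sketch for an individuals (XmR) chart. The chart type, the `xmr_limits` helper, and the data values are all assumptions made up for this example, not something from the OP's data. The extreme point is left out of the limit calculation only and is still checked against the limits afterwards.

```python
from statistics import mean

def xmr_limits(values, exclude=frozenset()):
    """Individuals (XmR) chart limits, skipping the indices in `exclude`
    when computing the limits only. Hypothetical helper for illustration."""
    kept = [v for i, v in enumerate(values) if i not in exclude]
    # Moving ranges between consecutive points, used only when both points are kept.
    moving_ranges = [
        abs(values[i] - values[i - 1])
        for i in range(1, len(values))
        if i not in exclude and (i - 1) not in exclude
    ]
    center = mean(kept)
    mr_bar = mean(moving_ranges)
    # 2.66 = 3 / d2, with d2 = 1.128 for moving ranges of two points.
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar

# Hypothetical effort-variance values (%), invented for illustration.
effort_variance = [4, 6, 5, 7, 3, 5, 48, 6, 4, 5, 7, 6, 5, 4, 6]
extreme = {6}  # index of the point judged extreme from process knowledge

lcl, center, ucl = xmr_limits(effort_variance, exclude=extreme)
# Every point stays on the chart; the extreme one shows up as a signal.
signals = [i for i, v in enumerate(effort_variance) if v < lcl or v > ucl]
print(f"LCL={lcl:.2f} center={center:.2f} UCL={ucl:.2f} signals={signals}")
```

Which points count as 'extreme' is a judgment based on knowing the process; the code does not decide that for you.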

If the OP is determining a baseline for other reasons (like understanding the actual process performance for budgeting or problem solving, etc.) the process doesn't have to be stable at all. And if this is the intent, censoring data is horribly misguided and counterproductive.
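
To make the cost of censoring concrete, here is a small sketch with made-up numbers (40 hypothetical milestones, 4 of them large overruns, loosely mirroring the OP's description). It only compares simple summaries with and without the 4 largest values.

```python
from statistics import mean, median

# Hypothetical effort variance (%) for 40 milestones, invented for illustration:
# 36 modest overruns plus 4 large ones.
all_milestones = [2, 3, 4, 5, 6, 7] * 6 + [35, 40, 50, 60]
censored = sorted(all_milestones)[:-4]  # the 4 largest values dropped

for label, data in (("all 40", all_milestones), ("4 dropped", censored)):
    print(f"{label:>10}: mean={mean(data):.2f} median={median(data):.2f} max={max(data)}")
```

With these invented numbers the mean drops from about 8.7 to 4.5 once the 4 large values are removed, so a budget built on the censored baseline would understate what the process actually delivers.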

Perhaps the OP could provide more information on what they are trying to do?
 

bobdoering

Stop X-bar/R Madness!!
Trusted Information Resource
#5
The key point I was trying to make - going to the OP's question - is that 'normal' may have absolutely nothing to do with it. Just because you see a normal distribution, it may have nothing to do with the process. It may be all gage or measurement error - which tends to generate normal distributions that mask the true underlying process variation.

The distribution that models your current state may also be totally different from the distribution in its optimum state, too. But, neither may be normal. Curve fit and find a meaningful model.
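
One possible way to compare candidate models, sketched with made-up data: fit each candidate and compare log-likelihoods. The choice of candidates (normal vs. lognormal), the comparison method, and the data are all assumptions for illustration. A lognormal only applies to strictly positive values, and with real data you would also want to look at a probability plot and think about what the process physics suggests.

```python
from math import log
from statistics import NormalDist

def normal_loglik(data):
    """Log-likelihood of the data under a normal model fitted to it."""
    model = NormalDist.from_samples(data)
    return sum(log(model.pdf(x)) for x in data)

def lognormal_loglik(data):
    """Log-likelihood under a lognormal model: fit a normal to ln(x),
    then subtract ln(x) for the change of variable."""
    logs = [log(x) for x in data]
    model = NormalDist.from_samples(logs)
    return sum(log(model.pdf(lx)) - lx for lx in logs)

# Hypothetical positive, right-skewed values, invented for illustration.
data = [2, 3, 3, 4, 4, 5, 5, 6, 7, 8, 10, 13, 18, 27, 45]

fits = {"normal": normal_loglik(data), "lognormal": lognormal_loglik(data)}
best = max(fits, key=fits.get)  # higher log-likelihood = better fit of the two
print(fits, "->", best)
```

For this invented data set the lognormal comes out ahead, which is what you would expect for values bunched near a lower bound with a long right tail.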
 