The correct way to analyze Skewed Data using Minitab

doublea007

I am trying to find the relationship between insurance policies/premiums and claim counts/costs. In my data, about 60% of policies at any given time have no claims or cost, so the distribution is heavily skewed, with most values piled up at zero. What would be the correct way to analyze such data using Minitab? There are other variables for each policy too (credit score, financial score), but what I need to find out is whether there is any relationship between low credit score and high claim counts.
 

Bill McNeese

Involved In Discussions
Re: Skewed Data

I assume you mean that 60% of the policies have no claims and 40% do have claims, and that you want to know whether a low credit score affects which group a policy falls in. You could test whether the percentage of policies with low credit scores among those with no claims is significantly different from the percentage with low credit scores among those with claims. This is the two-proportions test.
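
For anyone who wants to check the arithmetic behind that test by hand, here is a minimal sketch of a pooled two-proportions z-test in Python. In Minitab the equivalent is typically found under Stat > Basic Statistics > 2 Proportions. The function name and all of the counts below are hypothetical, made up purely for illustration; they are not from the poster's data.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test of H0: p1 == p2.

    x1, x2 are the counts of "events" (e.g. low-credit-score policies)
    and n1, n2 are the group sizes (e.g. policies with / without claims).
    Uses the normal approximation, so it assumes reasonably large groups.
    """
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, for illustration only
low_credit_with_claim, policies_with_claim = 130, 400
low_credit_no_claim, policies_no_claim = 180, 600

z, p = two_proportion_z_test(
    low_credit_with_claim, policies_with_claim,
    low_credit_no_claim, policies_no_claim,
)
print(f"z = {z:.3f}, p-value = {p:.4f}")
```

A small p-value would suggest the share of low-credit-score policies differs between the two groups; a large one is consistent with no difference.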

Then you can look at only the policies with claims and do a scatter diagram/regression of credit score versus claim amount. I would be interested to hear if there is a correlation. I would not expect one between credit score and the amount of the claim.
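
As a rough sketch of what that regression step computes, the Python below calculates the Pearson correlation and the least-squares line for credit score versus claim amount, using only policies that have a claim. The helper name and the data values are hypothetical and chosen only to show the mechanics.

```python
import math


def pearson_and_line(x, y):
    """Return (r, slope, intercept) for the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Hypothetical data: one entry per policy that had a claim
credit_score = [580, 610, 640, 675, 700, 720, 755, 790]
claim_amount = [4200, 1500, 3900, 2100, 5200, 1800, 3300, 2600]

r, slope, intercept = pearson_and_line(credit_score, claim_amount)
print(f"r = {r:.3f}, slope = {slope:.2f}, intercept = {intercept:.1f}")
```

An r close to zero on the real data would back up the expectation that credit score does not predict the size of a claim.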

I am sure others here will have additional insights.
 
doublea007

Re: Skewed Data

Here is my problem. When I look at policies with low credit scores and their claim counts, I find that low credit scores do not necessarily mean more claims or cost. It could be because of my sample size. What do you think about segmenting the data by the average policy size and then doing regression? I haven't tried it yet, but I believe the segmented data will be closer to normal than it is now.
 

Bill McNeese

Involved In Discussions
Re: Skewed Data

You can certainly try that approach to see if you learn anything. It appears that you don't like the answer you got: the data is telling you that there is no correlation between low credit score and claim counts or costs. I don't think you need to worry about the normality of the data. Why don't you think your answer is correct/valid?
 