Classification Performance Metrics and Techniques
* Sensitivity and specificity
* PPV and NPV
* Accuracy
* Validation
* ROC curves
* LDA
* Logistic regression
* KNN
World Health Report 2013

Sensitivity

The two main terms used to describe how well a test performs are sensitivity and specificity.

Sensitivity - how often a test turns positive for people who have the disease:

HIV+ and the ELISA test shows positive -> true positive
HIV+ and the ELISA test shows negative -> false negative

The ELISA test for HIV antibodies has a sensitivity of 99.7%. Hence only 3 out of 1000 HIV-positive persons will not show a positive result.
[Figure: test results for diseased and healthy individuals with a decision threshold, annotated with the true positive (TP), false positive (FP), false negative (FN), and true negative (TN) regions. In this example TP = 5, FN = 2, FP = 1, TN = 6.]
Exercise
Let's say you have got the following data. Suggest an appropriate threshold
level for PSA that separates healthy individuals from prostate cancer patients.
Sensitivity is more important than specificity.

[Figure: PSA values (µg/L) for healthy individuals and prostate cancer patients.]
Confusion matrix

A confusion matrix is a contingency table that shows the performance of a classification model (TP = true positive, TN = true negative, FP = false positive, FN = false negative):

                     Predicted: Cancer    Predicted: Healthy
Actual: Cancer       TP                   FN
Actual: Healthy      FP                   TN

With the example data:

                     Predicted: Cancer    Predicted: Healthy
Actual: Cancer       TP = 5               FN = 2
Actual: Healthy      FP = 1               TN = 6

Specificity (true negative rate)

Specificity - how often the test shows a negative result for those who are healthy:

Specificity = TN / (TN + FP) = 6 / (6 + 1) = 86%

Positive predictive value (PPV)

PPV - the proportion of individuals with a positive test who actually have the disease:

PPV = TP / (TP + FP) = 5 / (5 + 1) = 83%

This is the probability that you have cancer given a positive test (note the difference from sensitivity, which is the probability that the test shows positive if you have cancer). We assume that the sample reflects the true prevalence.

Negative predictive value (NPV)

NPV - the proportion of individuals with a negative test who actually are healthy:

NPV = TN / (TN + FN) = 6 / (6 + 2) = 75%

This is the probability that you are healthy given a negative test (note the difference from specificity, which is the probability that the test shows negative if you are healthy). Again we assume that the sample reflects the true prevalence.

Example of PPV

Let's assume that we have 1000 individuals who take the test. Of these, 135 + 71 = 206 will show a positive test. The fraction of these who actually have cancer is 71/206 = 0.34 -> PPV = 34%.

Validation

Using a cutoff level of 2.3 µg/L results in a sensitivity of 71% and a specificity of 86%. This cutoff was set to maximize the accuracy for this particular data set. If we collect 14 new subjects, would a cutoff of 2.3 be optimal for that data set?
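The metrics above can be computed directly from the confusion-matrix counts. A minimal sketch in plain Python using the example values (TP = 5, FN = 2, FP = 1, TN = 6):

```python
# Performance metrics from the example confusion matrix.
TP, FN, FP, TN = 5, 2, 1, 6

sensitivity = TP / (TP + FN)  # positive rate among the diseased
specificity = TN / (TN + FP)  # negative rate among the healthy
ppv = TP / (TP + FP)          # probability of disease given a positive test
npv = TN / (TN + FN)          # probability of health given a negative test
accuracy = (TP + TN) / (TP + TN + FP + FN)

print(f"Sensitivity: {sensitivity:.0%}")  # 71%
print(f"Specificity: {specificity:.0%}")  # 86%
print(f"PPV: {ppv:.0%}")                  # 83%
print(f"NPV: {npv:.0%}")                  # 75%
```

Note that PPV and NPV, unlike sensitivity and specificity, depend on the prevalence in the sample.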
Validation

Validation techniques are used:
• to estimate the performance of the classifier on "new" data,
• for model selection - to decide which model performs best on new data.

Performance (accuracy, sensitivity, etc.) is usually measured by the "true" error rate based on new data, which can be seen as the error rate for the population of interest.

The holdout method

If we do not have a big data set, we will have too few data points to check sensitivity and specificity. For example, if we have 20 data points and use 20% as a test set, only 4 data points are used to determine sensitivity and specificity. Using only 4 data points could, just by chance, result in 100% or 0% sensitivity even though the "true" sensitivity is 50%.

Cross-validation

When we do not have a large data set, we can use cross-validation to predict the accuracy of our model. Cross-validation gives us an insight into how the model would perform on an unknown data set. A common technique is K-fold cross-validation, where the data is split into K equal samples.

K-fold cross-validation

The data is divided into K equal samples (for example, 4-fold cross-validation uses four train/test splits):

1. Fit the model and find an appropriate cutoff based on the training data.
2. Use the test data to estimate accuracy, sensitivity, etc.
3. Repeat for all K folds.
4. Average (combine) the performance results.
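The K-fold procedure above can be sketched in plain Python. This is a minimal sketch: the `data`/`labels` arrays and the cutoff-based classifier are hypothetical stand-ins for the PSA example, not part of the original material.

```python
# A minimal sketch of K-fold cross-validation for a cutoff classifier.
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k roughly equal folds."""
    folds, start = [], 0
    fold_size, remainder = divmod(n, k)
    for i in range(k):
        size = fold_size + (1 if i < remainder else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(data, labels, k=4):
    """Estimate accuracy of a cutoff classifier with K-fold CV."""
    accuracies = []
    for test_idx in k_fold_indices(len(data), k):
        train_idx = [i for i in range(len(data)) if i not in test_idx]
        # 1. Fit: choose the cutoff that maximizes accuracy on training data
        cutoff = max((data[i] for i in train_idx),
                     key=lambda c: sum((data[j] >= c) == labels[j]
                                       for j in train_idx))
        # 2. Test: estimate accuracy on the held-out fold
        correct = sum((data[j] >= cutoff) == labels[j] for j in test_idx)
        accuracies.append(correct / len(test_idx))
    # 3.-4. Repeat for all K folds and average the results
    return sum(accuracies) / k

# Hypothetical PSA-like data: 4 healthy (False) and 4 cancer (True) subjects
data = [1.0, 1.2, 1.4, 1.6, 2.5, 2.7, 2.9, 3.1]
labels = [False, False, False, False, True, True, True, True]
print(cross_validate(data, labels, k=4))
```

Because each fold's cutoff is fitted without the held-out points, the averaged accuracy is an estimate of performance on new data rather than on the training data.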
Generating a ROC curve

Consider the following example. A ROC curve is generated by varying the cutoff and plotting sensitivity against specificity. For the PSA data, the figures step through the following cutoffs:

Cutoff    Sensitivity    Specificity
0.80      100 %          0 %
0.85      100 %          14 %
1.05      100 %          28 %
2.05      86 %           86 %
2.30      71 %           86 %
2.60      71 %           100 %

[Figures: for each cutoff, the PSA values of cancer patients and healthy individuals with the cutoff line, and the corresponding point on the ROC curve (sensitivity vs. specificity, x-axis running from 1.0 to 0.0).]
Exercise

Based on the data below, draw a ROC curve and calculate the area under the curve. Do you think that the area is significantly different from 0.5?

Sensitivity = 4/5 = 80%, Specificity = 5/5 = 100%
Sensitivity = 4/5 = 80%, Specificity = 4/5 = 80%
Sensitivity = 5/5 = 100%, Specificity = 4/5 = 80%

Area = 1 - 0.2 · 0.2 = 0.96

ROC curves

What we want is both a high specificity and a high sensitivity, for example 90% sensitivity and 90% specificity. A classifier with 50% sensitivity and 50% specificity is no better than chance.
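The area in the exercise can be checked with the trapezoid rule. A minimal sketch; the `(1 - specificity, sensitivity)` points below are taken from the sensitivities and specificities listed above, with the (0, 0) and (1, 1) endpoints added:

```python
# Trapezoidal area under a ROC curve.
def auc(points):
    """Area under the curve given (FPR, TPR) points sorted by FPR."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2  # trapezoid for each segment
    return area

# (1 - specificity, sensitivity) points from the exercise
roc_points = [(0.0, 0.0), (0.0, 0.8), (0.2, 0.8), (0.2, 1.0), (1.0, 1.0)]
print(auc(roc_points))  # ≈ 0.96
```

The diagonal reference line, (0, 0) to (1, 1), gives an area of exactly 0.5, the "no better than chance" baseline mentioned above.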
Is the ROC curve significantly different from the reference line?

[Figures: two ROC curves (sensitivity vs. specificity), one close to the diagonal reference line and one clearly above it.]

H0: the area under the ROC curve is equal to 0.5.

LDA
LDA combines variables by maximizing the separation
between the groups
[Figures: CRP and body temperature for the virus and bacteria groups, and the distributions of the combined discriminant score D for the two groups.]

D = 0.1 · CRP + 0.7 · Temp
LDA - cutoff

A reasonable cutoff can be set to the mean of the means of the discriminant scores of the two groups:

D = 0.1 · CRP + 0.7 · Temp

T_cutoff = (mean(D_Virus) + mean(D_Bacteria)) / 2 = (28.87 + 32.29) / 2 = 30.58

LDA - predict

For example, let's say that a patient has a body temperature of 40 °C and a CRP concentration of 70 mg/L. Does the patient have a bacterial or viral infection?

D = 0.10 · CRP + 0.70 · Temp = 0.10 · 70 + 0.70 · 40 = 35

Since 35 > 30.58 (and the bacteria group has the higher mean score), the patient is classified as having a bacterial infection.

Logistic regression

Logistic regression is a mathematical modeling approach that can be used to describe the relationship between a dichotomous (YES/NO) dependent variable and independent variables.

Linear regression (e.g. cholesterol vs. BMI): y(x) = a + bx
Logistic regression (e.g. disease YES/NO vs. BMI): logit(x) = a + bx

Logistic regression is commonly used to predict the risk (or odds ratio) of getting a disease based on some explanatory variables:

Cancer [YES/NO] = SMOKING + BMI + AGE

However, in this course we will only use logistic regression as a classifier.

Logistic regression vs. LDA

Linear discriminant analysis relies on two major assumptions:
• the independent variables must have a multivariate normal distribution,
• the variance-covariance matrix of the independent variables must be homogeneous across the population groups.

Logistic regression does not require any specific assumptions. Binary logistic regression can only have 2 groups, whereas LDA can have several groups. Logistic regression involves model building and can generate p-values that tell whether an explanatory variable contributes significantly to the prediction.

Example

Calculate the probability of disease in each BMI interval and fit a curve: 1/5 = 20%, 2/5 = 40%, 3/5 = 60%, 4/5 = 80%.

For example, we see that there is an 80% risk (probability) of getting the disease if an individual has a BMI between 45 and 55.
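The LDA prediction step above can be sketched in plain Python. The coefficients and cutoff are the values quoted in the text; `classify` is a hypothetical helper name:

```python
# LDA classification sketch: D = 0.10*CRP + 0.70*Temp, cutoff 30.58.
CUTOFF = 30.58  # mean of the two group means (28.87 and 32.29)

def discriminant(crp, temp):
    """Discriminant score D for a patient."""
    return 0.10 * crp + 0.70 * temp

def classify(crp, temp):
    # The bacteria group has the higher mean discriminant score (32.29)
    return "Bacteria" if discriminant(crp, temp) > CUTOFF else "Virus"

print(classify(70, 40))  # Bacteria (D = 35 > 30.58)
```

The direction of the rule matters: because the bacteria group's mean score is the higher one, scores above the cutoff are assigned to bacteria.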
Logistic regression

The logistic function is

p(x) = 1 / (1 + e^(-(a + b·x)))

Fitting the function to the PSA data gives a = -5.754 and b = 2.747:

p = 1 / (1 + e^(-(-5.75 + 2.75 · PSA)))

Logistic regression - predict

Example: PSA = 2 µg/L

p = 1 / (1 + e^(-(a + b·x))) = 1 / (1 + e^(-(-5.75 + 2.75 · 2))) ≈ 0.43

Cutoff = 50%. The probability that the patient has prostate cancer is 43% -> classify as healthy.
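The prediction step above can be sketched as follows. This is a minimal sketch using the fitted coefficients quoted in the text; `classify` is a hypothetical helper applying the 50% cutoff (with the unrounded coefficients the probability comes out at about 0.44 rather than the rounded 0.43):

```python
import math

# Logistic-regression prediction with the coefficients from the text.
A, B = -5.754, 2.747

def probability(psa):
    """p(x) = 1 / (1 + exp(-(a + b*x)))"""
    return 1.0 / (1.0 + math.exp(-(A + B * psa)))

def classify(psa, cutoff=0.5):
    return "Cancer" if probability(psa) >= cutoff else "Healthy"

print(classify(2.0))  # Healthy (p ≈ 0.44 < 0.5)
```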
Logistic regression - ROC curve

Coordinates of the curve (test result variable: predicted probability):

Positive if ≥    Sensitivity    1 - Specificity
0.0000000        1.000          1.000
0.0319626        1.000          0.857
0.0575184        1.000          0.714
0.1039871        1.000          0.571
0.1667372        1.000          0.429
0.2284887        0.857          0.429
0.3439260        0.857          0.286
0.4693766        0.857          0.143
0.6280978        0.714          0.143
0.7966036        0.714          0.000
0.8572791        0.571          0.000
0.8876535        0.429          0.000
0.9371522        0.286          0.000
0.9819278        0.143          0.000
1.0000000        0.000          0.000

[Figures: predicted probability vs. PSA for the healthy and cancer groups, and the resulting ROC curve (sensitivity vs. 1 - specificity).]
Example from the Iris data set: classify only Versicolor and Virginica.

[Figure: petal width vs. petal length for Versicolor and Virginica.]

Model if Term Removed:

                              Model Log     Change in -2      df    Sig. of the
                              Likelihood    Log Likelihood          Change
Step 1   Sepal length (cm)    -6.633        1.367             1     .242
         Sepal width (cm)     -7.746        9.594             1     .058
         Petal length (cm)    -12.951       14.003            1     .000
         Petal width (cm)     -11.886       11.873            1     .001
Step 2   Sepal width (cm)     -10.282       7.298             1     .007
         Petal length (cm)    -13.700       14.133            1     .000
         Petal width (cm)     -15.756       18.246            1     .000

Removing sepal length results in a lower AIC value compared to the full model.
K-nearest neighbor (KNN)

• KNN is a non-parametric method that can be used for classification. Since it is non-parametric, it does not require normally distributed variables and is robust against outliers.
• The method does not fit any model to the data.
• It involves calculating the distances from the data points with known group membership to the new observations with unknown membership.
• The majority of the k nearest neighbors decides the class of the new observation.

KNN algorithm

1. Determine the Euclidean distance between the new observation and all data points in the training set.
2. Sort the distances:

     CRP     Temp    Group       D
7    42.0    37.6    Bacteria    3.124100
1    40.0    36.0    Virus       4.000000
11   45.7    38.6    Bacteria    5.869412
8    31.1    42.2    Bacteria    9.167879
9    50.0    38.5    Bacteria    10.111874
3    30.0    36.5    Virus       10.594810
4    21.4    39.4    Virus       18.609675
…

3. Select the k closest neighbors, for example k = 5.
4. Determine the class of the new observation by group majority among the k nearest neighbors -> 4 Bacteria, 1 Virus -> Bacteria.

Note the different scales of the axes.

KNN performance

In contrast to LDA and logistic regression, KNN does not provide a threshold that separates the groups. Instead we can calculate performance similarly to what is done in LOOCV:

1. Select one data point and find its k closest neighbors.
2. Predict its class based on a majority vote.
3. Check whether the prediction is correct.
4. Repeat steps 1-3 for all data points.
5. Calculate the performance.
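The algorithm above can be sketched in plain Python. The training points are those listed in the table; the new observation's coordinates are not stated explicitly in the text, but (CRP = 40.0, Temp = 40.0) reproduces the listed distances, so it is used here. `knn_predict` is a hypothetical helper name:

```python
import math
from collections import Counter

# Training points (CRP, Temp, Group) from the worked example.
train = [
    (42.0, 37.6, "Bacteria"),
    (40.0, 36.0, "Virus"),
    (45.7, 38.6, "Bacteria"),
    (31.1, 42.2, "Bacteria"),
    (50.0, 38.5, "Bacteria"),
    (30.0, 36.5, "Virus"),
    (21.4, 39.4, "Virus"),
]

def knn_predict(crp, temp, k=5):
    # 1.-2. Compute and sort the Euclidean distances to all training points
    dists = sorted((math.hypot(crp - c, temp - t), group)
                   for c, t, group in train)
    # 3.-4. Majority vote among the k nearest neighbors
    votes = Counter(group for _, group in dists[:k])
    return votes.most_common(1)[0][0]

print(knn_predict(40.0, 40.0))  # Bacteria (4 of the 5 nearest neighbors)
```

Note that because KNN uses raw Euclidean distances, variables on very different scales (as noted above for the axes) can dominate the distance; in practice the variables are often standardized first.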