ezoic

Sunday, August 1, 2021

choose the cut off for binary classification

 

http://ethen8181.github.io/machine-learning/unbalanced/unbalanced.html#choosing-the-suitable-cutoff-value



why most of the time , the cut off or threshold is 0.5?


https://www.graphpad.com/guides/prism/latest/curve-fitting/reg_logistic_roc_curves.htm




in the built-in function, the cut off is 0.5


when reporting the confusion matrix, can change the cut off to an arbituray number, and the confusion matrix will be changed accordingly, when changing the cut off, the false positive rate and false negative rate will change accordingly. 

dat <- iris dat$positive <- as.factor(ifelse(dat$Species == "setosa", "s", "ns")) library(caret) mod <- train(positive~Sepal.Length, data=dat, method="glm")




confusionMatrix(table(predict(mod, type="prob")[,"s"] >= 0.25,
                      dat$positive == "s"))
# Confusion Matrix and Statistics
# 
#        
#         FALSE TRUE
#   FALSE    88    3
#   TRUE     12   47
#                                           
#                Accuracy : 0.9             
#                  95% CI : (0.8404, 0.9429)
#     No Information Rate : 0.6667          
#     P-Value [Acc > NIR] : 2.439e-11       
#                                           
#                   Kappa : 0.7847          
#  Mcnemar's Test P-Value : 0.03887         
#                                           
#             Sensitivity : 0.8800          
#             Specificity : 0.9400          
#          Pos Pred Value : 0.9670          
#          Neg Pred Value : 0.7966          
#              Prevalence : 0.6667          
#          Detection Rate : 0.5867          
#    Detection Prevalence : 0.6067          
#       Balanced Accuracy : 0.9100        

No comments:

Post a Comment

looking for a man

 I am a mid aged woman. I live in southern california.  I was born in 1980. I do not have any kid. no compliacted dating.  I am looking for ...