Why is the cutoff (threshold) 0.5 most of the time?
https://www.graphpad.com/guides/prism/latest/curve-fitting/reg_logistic_roc_curves.htm
In the built-in function, the cutoff is 0.5.
When reporting the confusion matrix, you can change the cutoff to an arbitrary number between 0 and 1, and the confusion matrix changes accordingly. Raising the cutoff means fewer observations are called positive, so the false positive rate goes down (or stays the same) and the false negative rate goes up (or stays the same); lowering the cutoff does the opposite.
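One common way to answer the question in the title, sketched here under assumptions the post itself does not state: suppose p is the predicted probability of the positive class, and a false positive costs c_FP while a false negative costs c_FN (both symbols are introduced only for this illustration). Predicting positive is then the cheaper choice whenever the expected cost of predicting negative is at least as large:

```latex
\[
\underbrace{c_{FN}\,p}_{\text{expected cost of predicting negative}}
\;\ge\;
\underbrace{c_{FP}\,(1-p)}_{\text{expected cost of predicting positive}}
\quad\Longleftrightarrow\quad
p \;\ge\; \frac{c_{FP}}{c_{FP}+c_{FN}},
\qquad
c_{FP}=c_{FN} \;\Rightarrow\; p \ge \tfrac{1}{2}.
\]
```

So 0.5 is the natural default only when both kinds of error are treated as equally costly. If a false negative is, say, three times as bad as a false positive, the same rule gives a cutoff of 0.25.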
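dat <- iris
# Binary outcome: "s" for setosa, "ns" for everything else
dat$positive <- as.factor(ifelse(dat$Species == "setosa", "s", "ns"))

library(caret)
# Logistic regression of the outcome on sepal length
mod <- train(positive ~ Sepal.Length, data = dat, method = "glm")

# Confusion matrix at a cutoff of 0.25 instead of 0.5.
# Rows are the predictions (P(s) >= 0.25), columns are the truth.
# Note: confusionMatrix() treats the first level (FALSE) as the positive
# class, so "Sensitivity" below refers to the non-setosa flowers.
confusionMatrix(table(predict(mod, type = "prob")[, "s"] >= 0.25,
                      dat$positive == "s"))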
# Confusion Matrix and Statistics
#
#
#         FALSE TRUE
#   FALSE    88    3
#   TRUE     12   47
#
# Accuracy : 0.9
# 95% CI : (0.8404, 0.9429)
# No Information Rate : 0.6667
# P-Value [Acc > NIR] : 2.439e-11
#
# Kappa : 0.7847
# Mcnemar's Test P-Value : 0.03887
#
# Sensitivity : 0.8800
# Specificity : 0.9400
# Pos Pred Value : 0.9670
# Neg Pred Value : 0.7966
# Prevalence : 0.6667
# Detection Rate : 0.5867
# Detection Prevalence : 0.6067
# Balanced Accuracy : 0.9100
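To see how both error rates move as the cutoff changes, here is a minimal sketch that sweeps several cutoffs over the same iris example. It is not the code from above: it uses base R `glm()` instead of `caret::train()` so that it runs without extra packages, and `rates_at()` and the column `is_setosa` are hypothetical names made up for this illustration. Here "positive" means setosa.

```r
# Binary outcome: TRUE for setosa, FALSE for everything else
dat <- iris
dat$is_setosa <- dat$Species == "setosa"

# Plain logistic regression (base R stand-in for the caret model)
fit <- glm(is_setosa ~ Sepal.Length, data = dat, family = binomial)
p_hat <- predict(fit, type = "response")  # predicted P(setosa)

# Hypothetical helper: false positive / false negative rate at one cutoff
rates_at <- function(cutoff, prob, truth) {
  pred <- prob >= cutoff
  c(cutoff = cutoff,
    fpr = sum(pred & !truth) / sum(!truth),  # non-setosa called setosa
    fnr = sum(!pred & truth) / sum(truth))   # setosa called non-setosa
}

cutoffs <- c(0.1, 0.25, 0.5, 0.75, 0.9)
rates <- as.data.frame(
  t(sapply(cutoffs, rates_at, prob = p_hat, truth = dat$is_setosa))
)
print(rates)
```

Reading the printed table from top to bottom, the `fpr` column can only fall or stay flat and the `fnr` column can only rise or stay flat, which is the trade-off that picking a cutoff has to balance.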