Pixel 3 vs iPhone
1. Some apps are available on the Pixel but not on iOS.
2. The Pixel 3 has a better Wi-Fi connection than the iPhone.
I write about solutions to problems I have run into in programming and data analytics. They may help you in your work. Thank you.
If you frequently use USB devices on your computer, the computer tends to freeze up more easily.
Facebook: bid on click-through, cost per impression, or conversion?
How to choose among the prices when bidding?
Why is the cut-off (threshold) 0.5 most of the time?
https://www.graphpad.com/guides/prism/latest/curve-fitting/reg_logistic_roc_curves.htm
In the built-in function, the cut-off is 0.5.
When reporting the confusion matrix, the cut-off can be changed to an arbitrary number, and the confusion matrix changes accordingly. Changing the cut-off also changes the false positive rate and the false negative rate.
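library(caret)

# Binary target: "s" for setosa, "ns" for everything else
dat <- iris
dat$positive <- as.factor(ifelse(dat$Species == "setosa", "s", "ns"))

# Logistic regression on a single predictor
mod <- train(positive ~ Sepal.Length, data = dat, method = "glm")

# Confusion matrix with a cut-off of 0.25 instead of the default 0.5
confusionMatrix(table(predict(mod, type = "prob")[, "s"] >= 0.25,
                      dat$positive == "s"))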
# Confusion Matrix and Statistics
#
#
# FALSE TRUE
# FALSE 88 3
# TRUE 12 47
#
# Accuracy : 0.9
# 95% CI : (0.8404, 0.9429)
# No Information Rate : 0.6667
# P-Value [Acc > NIR] : 2.439e-11
#
# Kappa : 0.7847
# Mcnemar's Test P-Value : 0.03887
#
# Sensitivity : 0.8800
# Specificity : 0.9400
# Pos Pred Value : 0.9670
# Neg Pred Value : 0.7966
# Prevalence : 0.6667
# Detection Rate : 0.5867
# Detection Prevalence : 0.6067
# Balanced Accuracy : 0.9100
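The same idea can be sketched in Python without any library. This is only an illustration: the probabilities and labels below are made-up values, and `confusion_counts` and `error_rates` are hypothetical helper names, not part of caret or any other package.

```python
def confusion_counts(probs, labels, cutoff=0.5):
    """Count TP, FP, TN, FN when predicting positive for prob >= cutoff."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for p, y in zip(probs, labels):
        predicted_positive = p >= cutoff
        if predicted_positive and y == 1:
            counts["tp"] += 1
        elif predicted_positive and y == 0:
            counts["fp"] += 1
        elif not predicted_positive and y == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def error_rates(counts):
    """Return (false positive rate, false negative rate)."""
    fpr = counts["fp"] / (counts["fp"] + counts["tn"])
    fnr = counts["fn"] / (counts["fn"] + counts["tp"])
    return fpr, fnr


# Made-up predicted probabilities and true labels (1 = positive)
probs = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]

for cutoff in (0.5, 0.25):
    counts = confusion_counts(probs, labels, cutoff)
    print(cutoff, counts, error_rates(counts))
```

Lowering the cut-off from 0.5 to 0.25 turns one false negative into a true positive but also creates one false positive, which is the trade-off described above.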
advertising related
1. ad network
2. ad exchange
3. RTB real-time bidding
4. DSP demand side platform
5. DMP data-management platform
6. programmatic buying
7. Private market place
8. Programmatic Direct buy
9. Premium Inventory
10. Remnant Inventory
11. CPM (Cost Per Mille; Cost Per Thousand Impressions)
12. CPC (Cost Per Click)
13. CPC, alternate expansion: Cost Per Thousand Click-Through
14. CPA(Cost-per-Action)
15. CPS(Cost-Per-Sale)
16. CPT cost per time
17. CPV cost per visit
18. CPI cost per install
19. CPD cost per download
20. banner
21. Interstitial
22. Native Advertising (Native Ads)
Operation related
23. AARRR: Acquisition, Activation, Retention, Revenue, Referral
24. DNU(Daily New Users)
25. CAC(Customer Acquisition Cost)
26. CPC (Cost Per Customer )
27. CR (Conversion Rate)
28. DAU(Daily Active Users)
29. WAU(Weekly Active Users)
30. MAU(Monthly Active Users)
31. DEC(Daily Engagement Count)
32. DAOT/AT(Daily Avg.Online Time)
33. DAU (Daily Active User)
34. MAU (Monthly active users)
35. User Retention
36. Day 1/3/7/30 Retention Ratio
37. Users Churn
38. Day 1 Churn Ratio
39. Day 7 Churn Ratio
40. Day 30 Churn Ratio
41. MPR(Monthly Payment Ratio)
42. MAU, APA
43. APA(Active Payment Account)
44. ARPU (Average Revenue per User)
45. ARPU
46. monthly ARPU = monthly revenue / MAU
47. ARPPU (Average Revenue per Paying User)
48. ARPPU
monthly ARPPU = monthly revenue / APA
49. Lifetime
50. Lifetime value (LTV)
51. PCU (Peak Concurrent Users)
52. ACU(Average Concurrent Users)
53. New User Conversion Rate
54. SEO (Search Engine Optimization)
55. SEM (Search Engine Marketing)
56. ASO (App Store Optimization)
57. KPI(Key performance indicators)
58. GMV (Gross Merchandise Volume)
59. SKU (Stock Keeping Unit)
60. Long Tail Keyword
61. MVP(Minimum Viable Product )
62. SP (Service Provider)
63. CP (Content Provider)
64. BD (Business Development)
65. SDK (Software Development Kit)
66. UE/UX (User Experience)
67. EDM (Email Direct Marketing)
68. SNS (Social Networking Services)
69. UGC (User Generated Content)
70. PGC(Professional Generated Content)
71. OGC(Occupationally-generated Content)
72. KOL(Key Opinion Leader)
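Several of the operations metrics above are simple ratios. The sketch below assumes the usual definitions (ARPU = revenue / active users, ARPPU = revenue / paying users, payment ratio = paying users / active users, day-N retention = share of a new-user cohort that is active again on day N). The function names and the sample numbers are made up for illustration.

```python
def arpu(revenue, active_users):
    """Average revenue per (active) user."""
    return revenue / active_users


def arppu(revenue, paying_users):
    """Average revenue per paying user."""
    return revenue / paying_users


def payment_ratio(paying_users, active_users):
    """Share of active users who paid in the period."""
    return paying_users / active_users


def day_n_retention(cohort, active_on_day_n):
    """Fraction of the new-user cohort that is active again on day N."""
    return len(cohort & active_on_day_n) / len(cohort)


# Hypothetical month: 10,000 MAU, 500 paying accounts, 50,000 in revenue
print(arpu(50_000, 10_000))          # 5.0
print(arppu(50_000, 500))            # 100.0
print(payment_ratio(500, 10_000))    # 0.05

# Hypothetical cohort of four new users, two of whom return on day 7
print(day_n_retention({"a", "b", "c", "d"}, {"a", "c", "x"}))  # 0.5
```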
1. how to normalize data
2. how to detect outlier, what is IQR?
3. how to reverse a list in python
4. how to insert a number in a list in python
5. how does spark's rdd work? how is it different from pyspark's dataframe?
6. how to calculate cumulative sums in a table in sql
7. what is the difference between mapreduce and in-memory?
8. what is mapreduce?
9. what is lag?
10. what are PRECEDING and FOLLOWING in sql?
11. how to count number of data points in a numpy array?
12. how to do a hypothesis test?
13. what is false positive rate? what is false negative rate?
14. how to delete duplicates in a dataframe in python?
15. what is false discovery rate? and bonferroni correction?
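A few of the Python questions above (1 to 4) can be answered with the standard library alone. This is a minimal sketch: min-max scaling is only one of several ways to normalize data, the 1.5 x IQR rule is a common convention rather than the only outlier definition, and the helper names and sample data are made up.

```python
from statistics import quantiles


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1                       # interquartile range
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


print(min_max_normalize([2, 4, 6]))
print(iqr_outliers([10, 12, 11, 13, 12, 11, 95]))

# Reverse a list: slicing returns a new list, .reverse() works in place
numbers = [1, 2, 3]
reversed_copy = numbers[::-1]

# Insert a number at a given position (index 1 here)
numbers.insert(1, 99)
print(reversed_copy, numbers)
```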
what is false discovery rate?
https://www.youtube.com/watch?v=3PVkfQRUGI4
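An interesting video about it.
The false discovery rate is a level we set in advance in hypothesis testing. It is the type I error measure we try to control when running multiple tests: the expected proportion of false positives among all rejected null hypotheses.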
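One common way to control the false discovery rate is the Benjamini-Hochberg procedure. It is not named in the notes above, so treat the sketch below as an assumption about which method is meant. The function name and the p-values are illustrative.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected.

    Sort the p-values, find the largest rank k with p_(k) <= (k / m) * q,
    and reject the hypotheses with the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Made-up p-values from four tests, controlled at q = 0.05
print(benjamini_hochberg([0.039, 0.001, 0.216, 0.008]))
```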
https://www.youtube.com/watch?v=HLzS5wPqWR0
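To understand the Bonferroni correction, we first need to understand the family-wise error rate (FWER), the probability of making at least one type I error across all tests.
a1 = type I error rate of a single test
FWER = 1-(1-a1)^m
m is the number of (independent) tests
Bonferroni correction
corrected a1
= a1/m
m is the number of tests performed.
FWER = 1-(1-a1/m)^m, which is at most a1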
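As a quick numeric check of the formulas above, the sketch below evaluates the family-wise error rate with and without the Bonferroni correction. It assumes independent tests. The function name and the choice of 10 tests at a 0.05 level are illustrative.

```python
def family_wise_error_rate(alpha, n_tests):
    """Probability of at least one type I error over n independent tests."""
    return 1 - (1 - alpha) ** n_tests


alpha = 0.05
n_tests = 10

uncorrected = family_wise_error_rate(alpha, n_tests)
corrected = family_wise_error_rate(alpha / n_tests, n_tests)  # Bonferroni

print(f"uncorrected FWER: {uncorrected:.4f}")
print(f"Bonferroni FWER:  {corrected:.4f}")
```

With 10 tests the uncorrected rate is roughly 0.40, while the corrected rate stays just under 0.05.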
59 Effective Ways to Write High-Quality Python Code
https://l1nwatch.gitbook.io/writing_solid_python_code_gitbook/
https://medium.com/@kyawsawhtoon/log-transformation-purpose-and-interpretation-9444b4b049c9
Before we get into log transformation, let’s quickly talk about normal distribution. Normal distribution is a probability and statistical concept widely used in scientific studies for its many benefits. Just to name a few of these benefits— normal distribution is simple. Its mean, median and mode have the same value and it can be defined with just two parameters: mean and variance. It also has important mathematical implications such as the Central Limit Theorem.
Unfortunately, our real-life datasets do not always follow the normal distribution. They are often heavily skewed, which can make the results of our statistical analyses invalid. That’s where Log Transformation comes in.
When our original continuous data do not follow the bell curve, we can log transform this data to make it as “normal” as possible so that the statistical analysis results from this data become more valid.
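To illustrate the effect, the sketch below compares the skewness of a small right-skewed sample before and after taking the natural log. The data are made up, and `skewness` is a hypothetical helper using the simple moment-based formula. The log transform only works for strictly positive values.

```python
import math


def skewness(values):
    """Moment-based skewness: third central moment / variance^1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Made-up right-skewed data (all values must be > 0 for the log)
raw = [1, 2, 3, 5, 8, 13, 100, 1000]
logged = [math.log(v) for v in raw]

print(f"skewness before: {skewness(raw):.2f}")
print(f"skewness after:  {skewness(logged):.2f}")
```

The transformed sample is still not perfectly symmetric, but its skewness is much closer to zero.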
poisson distribution
statistical consulting skills
Statistical consulting for dissertations is our sole focus. It is required in many fields and offers extensive potential to people in need of it. Conducting constructive analysis and research on certain topics is highly relevant because of the competition that exists in the contemporary world. Statistical consulting is therefore a necessary tool for obtaining the required and significant data in many fields and domains.
Statistical consulting is necessary in the following areas:
· Science and Medicine
· Business and Commerce
· Social Sciences like Psychology and Sociology
· Government Bodies and Law
· Universities and Colleges for dissertations and theses
Statistical consulting is very popular and is applied in almost every aspect of society because it ensures adequate and successful functioning of organizations. The activities associated with statistical consulting vary widely and can concern any topic. The task of consulting varies from project to project and involves the statistician acting as the problem solver by selecting the appropriate analysis, conducting analyses on the data, and interpreting the findings. In statistical consulting, the consultant also acts as a guide and advisor to the client.
Consulting is very effective and accurate and is therefore a necessary entity in today’s day and age. A statistician should possess certain qualities that ensure his success. For a statistician, statistical consulting requires the following characteristics:
· Good Communication Skills: The statistician must possess good communication skills so that the consultant can interact with the client fluently and comfortably. Once the idea is made clear to the consultant (through healthy, professional conversations with the client), the statistical consultant is able to carry on with the work professionally as per the client's needs.
· Scientific Interest: It requires a keen and eager interest in the pursuits of science. Science forms the core root of statistics and is a fundamental feature in statistical consulting.
· Statistical Knowledge: Without proper training and education in statistics, one cannot engage in statistical consulting. One has to be able to understand the subject and to apply the required technical and specialized techniques and procedures of statistics.
· Computer Proficiency: Basic computer skills are essential. The statistician must be able to utilize the computer while making use of the new and latest statistical software available in the market today.
Statistical consulting necessitates that the statistician perform research studies and experiments. It also includes designing the experiments needed for observations and interpretations. With statistical consulting at hand, organizations need not worry themselves with the problem of obtaining the needed information.
Statistical consulting is instrumental to small-scale industries in particular. Small-scale industries can gain profits through statistical consulting, as the statistician gives the industry the opportunity to conduct proper research and provides a full-length statistical analysis. Without this, the company would not have the resources or knowledge to carry on with the project.
Contemporary times offer a number of possibilities to people. The advent of statistics and statistical consulting has in many ways made things a lot easier for everyone. Statistical consulting has brought with it an endless number of solutions for research findings and data analysis. Information is an important need and statistics have various ways of finding that information so that it may be utilized to bring about advancement and evolution. Clearly, this consulting is of crucial importance today.
statistics subject
nonparametric statistics
ranking data
normality testing methods
sample size calculation and power analysis, from Towards Data Science
https://towardsdatascience.com/experiment-sample-size-calculation-using-power-analysis-81cb1bc5f74b
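As a rough sketch of what a power analysis computes, the snippet below uses the normal-approximation formula for a two-sided, two-sample comparison of means, n per group = 2 * ((z_(1-alpha/2) + z_power) / d)^2, where d is the standardized effect size. This is an assumption about the setup and not necessarily the method used in the linked article. The function name is made up.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate n per group for a two-sided two-sample test of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Medium effect (d = 0.5), 5% significance level, 80% power
print(sample_size_per_group(0.5))
# A smaller effect needs a larger sample
print(sample_size_per_group(0.2))
```

An exact t-test based calculation gives slightly larger numbers than this approximation.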
youtube channel for survival analysis
https://www.youtube.com/c/marinstatlectures/videos
a webpage for survival analysis
https://www.emilyzabor.com/tutorials/survival_analysis_in_r_tutorial.html
In survival analysis, S(t) = P(T > t) denotes the probability that a subject survives beyond time t, called the survival probability.
Kaplan-Meier survival estimate: a non-parametric method to estimate the survival probability from the observed survival times.
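The Kaplan-Meier estimate multiplies, at each observed event time, the fraction of at-risk subjects that survive it: S(t) is the product of (1 - d_i / n_i) over event times t_i <= t, where d_i is the number of events and n_i the number at risk. The sketch below is a plain-Python illustration with made-up data and a hypothetical function name; real analyses would normally use a survival library.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with an event.

    times:  observed durations
    events: 1 if the event occurred at that time, 0 if censored
    """
    pairs = sorted(zip(times, events))
    n_at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        n_events = 0
        n_removed = 0
        # Group all subjects sharing the same observed time
        while i < len(pairs) and pairs[i][0] == t:
            n_events += pairs[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set
        n_at_risk -= n_removed
    return curve


# Made-up data: five subjects, two of them censored
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```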
Log-rank test: the Kaplan-Meier curves alone do not give a formal test, so we use the log-rank test to compare two or more survival curves.
hazard probability.
            beta      HR (95% CI for HR)    wald.test    p.value
age         0.019     1 (1-1)               4.1          0.042
sex         -0.53     0.59 (0.42-0.82)      10           0.0015
ph.karno    -0.016    0.98 (0.97-1)         7.9          0.005
ph.ecog     0.48      1.6 (1.3-2)           18           2.7e-05
wt.loss     0.0013    1 (0.99-1)            0.05         0.83
The output above shows the regression beta coefficients, the effect sizes (given as hazard ratios) and statistical significance for each of the variables in relation to overall survival. Each factor is assessed through separate univariate Cox regressions.
From the output above,
The variables sex, ph.karno and ph.ecog have highly statistically significant coefficients (p < 0.01) and age is significant at the 0.05 level (p = 0.042), while the coefficient for wt.loss is not significant (p = 0.83).
age and ph.ecog have positive beta coefficients, while sex has a negative coefficient. Thus, older age and higher ph.ecog are associated with poorer survival, whereas being female (sex=2) is associated with better survival.
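In a Cox model the hazard ratio is the exponential of the beta coefficient, HR = exp(beta), so a positive beta gives HR > 1 (higher hazard) and a negative beta gives HR < 1. The short check below recomputes the hazard ratios from the rounded beta values in the table above. Small differences from the printed HR column come from rounding.

```python
import math

# Beta coefficients copied from the table above (already rounded)
betas = {
    "age": 0.019,
    "sex": -0.53,
    "ph.karno": -0.016,
    "ph.ecog": 0.48,
    "wt.loss": 0.0013,
}

hazard_ratios = {name: math.exp(beta) for name, beta in betas.items()}

for name, hr in hazard_ratios.items():
    direction = "higher" if hr > 1 else "lower"
    print(f"{name:9s} HR = {hr:.2f} ({direction} hazard per unit increase)")
```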
Cox model: is it used to estimate whether a treatment is useful?