type
Post
Created date
Jun 16, 2022 01:43 PM
category
Data Science
tags
Machine Learning
status
Published

What

What does a 95% confidence interval mean?

  • A 95% confidence interval means that if you repeated the sampling many times, about 19 out of 20 (95%) of the resulting intervals would contain the real measure (i.e. the true value). About 1 time out of 20 you would get unlucky with your sample, and the true value would fall outside the interval. (Reddit)
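Example: (How to understand a 95% confidence interval? - Zhihu (zhihu.com))
  • Fishing in the sea: 95% of the casts catch the fish
  • Lottery tickets: a whole box vs. a single ticket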

A confidence interval describes the quality of our estimator with respect to the number of samples. (openst)
To understand what a CI is, first understand what the estimator Θ̂ is. (ITP)
 
Two common settings where CIs are used: (MIT-Class 23.1)
  • Standardized statistics
  • Hypothesis testing

What is the true value (the truth)?

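  • It cannot be found exactly; it can only be estimated.
    • (We have no way of knowing the true average height of all humans, because it is almost impossible to measure every person. But this value certainly exists; we could say that God knows it.)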
  • When the interval includes the population percentage, we say the interval covers the truth. (berkeley.edu)

What is coverage? (berkeley.edu)

  • The chance that the random interval will contain the true population percentage is called the coverage probability of the interval. The interval is random because it is centered at the sample percentage, which is random. (berkeley.edu)
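
The idea of coverage can be checked with a small simulation. The sketch below is illustrative only: the true percentage (30%), the sample size (500), the number of repetitions (2000), and the normal-approximation interval are all assumptions chosen for the example, not values taken from the cited source. The true percentage stays fixed while each simulated sample produces a different interval.

```python
import math
import random


def wald_interval(successes, n, z=1.96):
    """Normal-approximation interval centered at the sample proportion."""
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def estimate_coverage(true_p, n, trials, seed=0):
    """Fraction of simulated intervals that contain the fixed true proportion."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(trials):
        # Draw one random sample of size n and count the "successes".
        successes = sum(rng.random() < true_p for _ in range(n))
        low, high = wald_interval(successes, n)
        if low <= true_p <= high:
            covered += 1
    return covered / trials


# Hypothetical settings: the truth is known here only because we simulate it.
coverage = estimate_coverage(true_p=0.3, n=500, trials=2000)
print(f"Estimated coverage: {coverage:.3f}")
```

The printed fraction should land close to 0.95, since it estimates the coverage probability of the procedure rather than a property of any single interval.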

 

What is Bias ?

  • A measurement procedure or estimator is said to be biased if, on the average, it gives an answer that differs from the truth.
  • The average (expected) difference between the measurement and the truth (i.e. bias = E[Θ̂] − θ, where Θ̂ is the estimator and θ is the true value).
    • For example, if you get on the scale with clothes on, that biases the measurement to be larger than your true weight (this would be a positive bias). The design of an experiment or of a survey can also lead to bias. Bias can be deliberate, but it is not necessarily so. See also nonresponse bias.
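
The scale example can be turned into a short simulation. This is a minimal sketch under made-up numbers: the true weight, the weight of the clothes, and the size of the random noise are hypothetical. It estimates the bias as the average reading minus the truth.

```python
import random
import statistics

rng = random.Random(0)

TRUE_WEIGHT_KG = 70.0  # hypothetical true weight
CLOTHES_KG = 1.5       # hypothetical systematic offset from clothes
NOISE_SD_KG = 0.3      # hypothetical random scatter of the scale


def weigh(with_clothes):
    """One noisy reading of the scale."""
    offset = CLOTHES_KG if with_clothes else 0.0
    return TRUE_WEIGHT_KG + offset + rng.gauss(0.0, NOISE_SD_KG)


def estimated_bias(with_clothes, repeats=10_000):
    """Average reading minus the truth, over many repeated measurements."""
    readings = [weigh(with_clothes) for _ in range(repeats)]
    return statistics.fmean(readings) - TRUE_WEIGHT_KG


bias_clothed = estimated_bias(with_clothes=True)
bias_unclothed = estimated_bias(with_clothes=False)
print(f"Bias with clothes:    {bias_clothed:+.3f} kg")
print(f"Bias without clothes: {bias_unclothed:+.3f} kg")
```

Random noise averages out over many readings, while the offset from the clothes does not. That persistent offset is the (positive) bias.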

What is

Why

  • Unlike a point estimate from MLE, an interval estimate is less arbitrary and conveys how accurate the estimate is. (MIT-Class22)
  • Used to report how confident we are in the point estimate. (ITP p.559)

How

Interpretation:
  • This means that if you construct 100 such intervals from repeated samples, about 95 of them are expected to contain the true proportion, and about 5 will not.
    • The wrong interpretation: there is a 95% chance that the true value of p will fall between 0.65 and 0.73.
      • This interpretation is wrong because the true value is fixed out there somewhere, and you are trying to capture it with the interval. So the 95% is the chance that your interval captures the true value, not the chance that the true value falls in the interval. (libretext)
 
Example : life expectancy (ITP)

Example 2 : (libretext)

Example 3 : political poll (MIT-Class 23.2)
Scenario :
Suppose we want to use a political poll to estimate the proportion of the population that supports candidate A, or equivalently, the probability θ that a random person supports candidate A.
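
For a concrete feel, here is a minimal sketch of computing a 95% interval for θ from poll data. The poll numbers (520 supporters out of 1000 respondents) are hypothetical, and the function name `poll_intervals` is made up for this example. It shows two common approximations: the normal-approximation interval using the estimated standard error, and the conservative rule of thumb θ̂ ± 1/√n, which follows from σ ≤ 1/2 for a Bernoulli variable and z ≈ 2.

```python
import math


def poll_intervals(supporters, n, z=1.96):
    """Approximate 95% intervals for the support probability theta."""
    theta_hat = supporters / n  # sample proportion, the point estimate
    # Normal approximation with the estimated standard error.
    normal_half = z * math.sqrt(theta_hat * (1 - theta_hat) / n)
    # Conservative rule of thumb: sigma <= 1/2 and z ~ 2 give 1/sqrt(n).
    conservative_half = 1 / math.sqrt(n)
    return {
        "estimate": theta_hat,
        "normal": (theta_hat - normal_half, theta_hat + normal_half),
        "conservative": (theta_hat - conservative_half, theta_hat + conservative_half),
    }


# Hypothetical poll: 520 of 1000 respondents support candidate A.
result = poll_intervals(supporters=520, n=1000)
for name in ("normal", "conservative"):
    low, high = result[name]
    print(f"{name:>12}: [{low:.3f}, {high:.3f}]")
```

The conservative interval is never narrower than the normal-approximation one, because it replaces the estimated standard deviation with its largest possible value.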
Interpretation:

 

Reference

openst: (Here)
ITP: Chan, S. (2021). Introduction to Probability for Data Science. Michigan Publishing Services. p.548. The best source to get the gist. (Here)
MIT-Class22: (Here)
MIT-Class 23.1: Confidence Intervals: Three Views (Here)
MIT-Class 23.2: Confidence Intervals for the Mean of Non-normal Data (Here)
 