Bayesian


Least squares method: deciding model parameters by minimizing the RSS (residual sum of squares)
ML (maximum likelihood) estimation: estimating model parameters by finding the parameter values that maximize the likelihood of the observations given the parameters
MAP (maximum a posteriori) estimation: estimating model parameters by finding the parameter values that maximize the posterior, i.e. the likelihood weighted by a prior (a small sketch contrasting ML and MAP follows below)
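
As a rough illustration of the difference between ML and MAP, here is a minimal sketch (not from the original notes) that estimates the bias of a coin from a few flips: the ML estimate is just the observed frequency, while the MAP estimate with an assumed Beta(a, b) prior is pulled toward the prior mean. The prior parameters a and b are arbitrary choices for the illustration.

    import numpy as np

    # Toy data: 10 coin flips, 7 heads (1 = heads, 0 = tails).
    flips = np.array([1, 1, 1, 0, 1, 0, 1, 1, 0, 1])
    heads, n = flips.sum(), flips.size

    # ML estimate: the theta maximizing the Bernoulli likelihood
    # is simply the observed frequency of heads.
    theta_ml = heads / n

    # MAP estimate with an assumed Beta(a, b) prior on theta.
    # The posterior is Beta(heads + a, n - heads + b), whose mode is:
    a, b = 2.0, 2.0  # prior chosen for illustration; pulls the estimate toward 0.5
    theta_map = (heads + a - 1) / (n + a + b - 2)

    print(f"ML  estimate: {theta_ml:.3f}")   # 0.700
    print(f"MAP estimate: {theta_map:.3f}")  # (7+1)/(10+2) = 0.667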

 

Bayes’ Theorem

    \[ P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)} \]

where A and B are events and P(B) ≠ 0.

P(A \mid B): conditional probability of event A given that B is true.
P(A), P(B): probabilities of events A and B, each without regard to the other (marginal probabilities).
P(B \mid A): conditional probability of event B given that A is true.

    \[ P(A \cap B) = P(B \cap A) \]

Writing each side with the definition of conditional probability:

    \[ P(A \mid B) \, P(B) = P(B \mid A) \, P(A) \]

Dividing both sides by P(B) gives

    \[ P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)} \]

 

Example

Suppose that the probability of having disease A is 0.5%,
and that a test is 99% sensitive (true positive rate) and 95% specific (true negative rate).
If the test comes back positive, what is the probability that you actually have disease A?

Let A be the event of having the disease and B the event of testing positive. Then

    \[ P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)} = \frac{0.99 \times 0.005}{0.99 \times 0.005 + 0.05 \times 0.995} \approx 0.090 \]

Even if you are diagnosed as positive, the probability of actually having the disease is only around 9%.
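
The same arithmetic as a quick check in Python (a minimal sketch using the numbers above):

    # Bayes' theorem for the disease-test example above.
    prevalence = 0.005          # P(A): probability of having the disease
    sensitivity = 0.99          # P(B | A): true positive rate
    specificity = 0.95          # P(not B | not A): true negative rate

    false_positive_rate = 1 - specificity   # P(B | not A)

    # P(B): total probability of testing positive
    p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)

    # P(A | B): probability of having the disease given a positive test
    p_disease_given_positive = sensitivity * prevalence / p_positive

    print(f"P(A | B) = {p_disease_given_positive:.3f}")  # ~0.090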

 

Bayesian inference

\theta = (\mu, \sigma): parameters of the probability distribution
x = (x_1, x_2, \dots, x_N): observed data (fixed)
f(\theta \mid x): posterior probability
f(x \mid \theta): likelihood
f(\theta): prior probability
f(x): marginal likelihood or normalization constant

    \[ f(\theta \mid x) = \frac{f(x \mid \theta) \, f(\theta)}{f(x)} \propto f(x \mid \theta)f(\theta) \]

MAP estimation finds the \theta that maximizes the posterior f(\theta \mid x); since f(x) does not depend on \theta, it is enough to maximize f(x \mid \theta) f(\theta).
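
As a small numerical illustration (a sketch under assumed toy values, not taken from the slides linked below): estimate the mean \mu of Gaussian data with known \sigma by maximizing the log-posterior over a grid, using an assumed Gaussian prior on \mu.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy observed data: N samples from a Gaussian with unknown mean mu (sigma known).
    true_mu, sigma, N = 2.0, 1.0, 20
    x = rng.normal(true_mu, sigma, size=N)

    # Assumed Gaussian prior on mu (mean 0, standard deviation 1).
    prior_mean, prior_sd = 0.0, 1.0

    mu_grid = np.linspace(-5, 5, 2001)

    # Log-likelihood log f(x | mu) summed over observations (additive constants dropped).
    log_likelihood = (-0.5 * ((x[None, :] - mu_grid[:, None]) / sigma) ** 2).sum(axis=1)

    # Log-prior log f(mu) (additive constants dropped).
    log_prior = -0.5 * ((mu_grid - prior_mean) / prior_sd) ** 2

    # Log-posterior up to a constant: log f(mu | x) = log f(x | mu) + log f(mu) + const.
    log_posterior = log_likelihood + log_prior

    mu_ml = mu_grid[np.argmax(log_likelihood)]   # maximum likelihood estimate
    mu_map = mu_grid[np.argmax(log_posterior)]   # maximum a posteriori estimate

    print(f"ML  estimate of mu: {mu_ml:.3f}")    # close to the sample mean
    print(f"MAP estimate of mu: {mu_map:.3f}")   # pulled toward the prior mean 0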

http://mlss.tuebingen.mpg.de/2015/slides/ghahramani/lect1bayes.pdf

http://hosho.ees.hokudai.ac.jp/~kubo/stat/2010/Qdai/b/kuboQ2010b.pdf