-McElreath 2017
\[H(p) = -\sum_i p_i \log p_i\]
Remember, the distribution must average 1 head across the four sequences {TT, HT, TH, HH}, so

`sum(distribution * c(0, 1, 1, 2)) == 1`
\[H = -\sum_i p_i \log p_i\]
Distribution | TT, HT, TH, HH | Entropy |
---|---|---|
Binomial | 1/4, 1/4, 1/4, 1/4 | 1.386 |
Candidate 1 | 2/6, 1/6, 1/6, 2/6 | 1.33 |
Candidate 2 | 1/6, 2/6, 2/6, 1/6 | 1.33 |
Candidate 3 | 1/8, 4/8, 2/8, 1/8 | 1.213 |
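The entropies in the table can be reproduced directly from \(H = -\sum p_i \log p_i\). A quick Python sketch, using the distribution values from the table:

```python
import math

def entropy(p):
    # Shannon entropy in nats: H = -sum p_i * log p_i
    return -sum(pi * math.log(pi) for pi in p)

dists = {
    "Binomial":    [1/4, 1/4, 1/4, 1/4],
    "Candidate 1": [2/6, 1/6, 1/6, 2/6],
    "Candidate 2": [1/6, 2/6, 2/6, 1/6],
    "Candidate 3": [1/8, 4/8, 2/8, 1/8],
}
for name, p in dists.items():
    # Each candidate satisfies the constraint E[heads] = 1
    assert abs(sum(pi * h for pi, h in zip(p, [0, 1, 1, 2])) - 1) < 1e-9
    print(name, round(entropy(p), 3))
```

The binomial (uniform, here) distribution has the largest entropy of any distribution meeting the constraint.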
Assume 2 draws with p = 0.7 (so the expected number of heads is 1.4), and make 1000 simulated distributions that obey that constraint
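One way to run this simulation, sketched in Python: draw three uniform values and solve for the fourth so the expected number of heads is exactly 1.4, then compare each distribution's entropy to that of the binomial.

```python
import math, random

rng = random.Random(1)

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p)

def sim_p(G=1.4):
    # Random distribution over {TT, HT, TH, HH}, constrained so the
    # expected number of heads (0, 1, 1, 2 per sequence) equals G
    x1, x2, x3 = rng.random(), rng.random(), rng.random()
    x4 = (G * (x1 + x2 + x3) - x2 - x3) / (2 - G)
    z = x1 + x2 + x3 + x4
    return [x1 / z, x2 / z, x3 / z, x4 / z]

H = [entropy(sim_p()) for _ in range(1000)]

# The binomial(2, 0.7) probabilities for TT, HT, TH, HH
binom = [0.3 * 0.3, 0.7 * 0.3, 0.3 * 0.7, 0.7 * 0.7]
print(max(H), entropy(binom))
```

The simulated entropies crowd up against the binomial's entropy but never exceed it: the binomial is the maxent distribution under this constraint.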
Constraints | Maxent distribution |
---|---|
Real value in interval | Uniform |
Real value, finite variance | Gaussian |
Binary events, fixed probability | Binomial |
Non-negative real, has mean | Exponential |
Why?
\[\Large \boldsymbol{Y_i} = \boldsymbol{\beta X_i} + \epsilon_i \]
\[\Large \epsilon_i \sim \mathcal{N}(0,\sigma^{2})\]
Likelihood:
\[\Large Y_i \sim \mathcal{N}(\hat{Y_i},\sigma^{2})\]
Data Generating Process:
\[\Large \boldsymbol{\hat{Y}_{i}} = \boldsymbol{\beta X_i} \]
Likelihood:
\[\Large Y_i \sim \mathcal{N}(\hat{Y_i},\sigma^{2})\]
Data Generating Process:
- Transformation (Identity Link):
\[\Large \hat{Y}_{i} = \eta_{i} \]
- Linear Equation:
\[\Large \boldsymbol{\eta_{i}} = \boldsymbol{\beta X_i} \]
Likelihood:
\[\Large Y_i \sim \mathcal{N}(\hat{Y_i},\sigma^{2})\]
Data Generating Process:
- Transformation (Log Link):
\[\Large \log(\hat{Y}_{i}) = \eta_{i} \]
- Linear Equation:
\[\Large \boldsymbol{\eta_{i}} = \boldsymbol{\beta X_i} \]
Likelihood:
\[\Large Y_i \sim \mathcal{N}(\hat{Y_i},\sigma^{2})\] Error is Normal
Data Generating Process:
- Transformation (Log Link):
\[\Large \log(\hat{Y}_{i}) = \eta_{i} \]
- Linear Equation:
\[\Large \boldsymbol{\eta_{i}} = \boldsymbol{\beta X_i} \]
\[ Y_i \sim \mathcal{B}(\mathrm{prob}, \mathrm{size}) \]
Likelihood: \[\Large Y_i \sim \mathcal{B}(\hat{Y_i}, \mathrm{size})\]
Data Generating Process:
Logit Link Function \[\Large \text{logit}(\hat{Y_i}) = \eta_{i}\]
Linear Function
\[\Large \boldsymbol{\eta_{i}} = \boldsymbol{\beta X_i} \]
Or, with successes and failures
 | LR Chisq | Df | Pr(>Chisq) |
---|---|---|---|
Dose | 233.8357 | 1 | 0 |
And logit coefficients
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | -1.4077690 | 0.1484785 | -9.481298 | 0 |
Dose | 0.0134684 | 0.0010464 | 12.870912 | 0 |
\[\text{Log-Odds} = \log\frac{p}{1-p} = \text{logit}(p)\]
\[\beta = \text{logit}(p_2) - \text{logit}(p_1)\]
\[\beta = \log\frac{p_2}{1-p_2} - \log\frac{p_1}{1-p_1}\]
We need to know both \(p_1\) and \(\beta\) to interpret this.
If \(p_1 = 0.7\) and \(\beta = 0.01347\), then \(p_2 \approx 0.703\)
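A worked version of this calculation, sketched in Python using the Dose coefficient from the table above:

```python
import math

def logit(p):
    # log-odds: log(p / (1 - p))
    return math.log(p / (1 - p))

def inv_logit(x):
    # inverse of the logit: back to the probability scale
    return 1 / (1 + math.exp(-x))

p1 = 0.7
beta = 0.0134684   # the Dose coefficient from the table above
p2 = inv_logit(logit(p1) + beta)
print(round(p2, 4))
```

A one-unit increase in Dose shifts the log-odds by \(\beta\), which at \(p_1 = 0.7\) moves the probability only to about 0.703; the same \(\beta\) produces different probability changes at different starting values of \(p_1\).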
Likelihood:
\[\Large Y_i \sim \mathcal{N}(\hat{Y_i},\sigma^{2})\]
Data Generating Process:
- Transformation (Identity Link):
\[\Large \hat{Y}_{i} = \eta_{i} \]
- Linear Equation:
\[\Large \boldsymbol{\eta_{i}} = \boldsymbol{\beta X_i} \] But what if we don't want a Normal distribution?
Likelihood:
\[\boldsymbol{Y_i} \sim E(\boldsymbol{\hat{Y_i}}, \theta)\]
E is any distribution from the Exponential Family
\(\theta\) is an error parameter, and can be a function of Y
Data Generating Process:
- Link Function \[\boldsymbol{f(\hat{Y_i})} = \boldsymbol{\eta_i}\]
- Linear Predictor \[\boldsymbol{\eta_i} = \boldsymbol{\beta X}\]
Basic Premise:
We have a linear predictor, \(\eta_i = a + bx_i\)
That predictor is linked to the fitted value of \(Y_i\), \(\hat{Y_i}\)
We call this a link function, such that \(g(\hat{Y_i}) = \eta_i\)
For example, for a linear function, \(\mu_i = \eta_i\)
For an exponential function, \(\log(\mu_i) = \eta_i\)
Identity: \(\hat{Y_i} = \eta_i\) - e.g. \(\mu = a + bx\)
Log: \(\log(\hat{Y_i}) = \eta_i\) - e.g. \(\mu = e^{a + bx}\)
Logit: \(\text{logit}(\hat{Y_i}) = \eta_i\) - e.g. \(\hat{Y_i} = \frac{e^{a + bx}}{1+e^{a + bx}}\)
Inverse: \(\frac{1}{\hat{Y_i}} = \eta_i\) - e.g. \(\hat{Y_i} = (a + bx)^{-1}\)
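Each link maps a fitted value to the linear predictor, and its inverse maps the predictor back. These pairs can be checked mechanically; a short Python sketch (the dictionary keys are just labels for this illustration):

```python
import math

# (link g, inverse link g_inv) for the four links above
links = {
    "identity": (lambda mu: mu,      lambda eta: eta),
    "log":      (math.log,           math.exp),
    "logit":    (lambda mu: math.log(mu / (1 - mu)),
                 lambda eta: 1 / (1 + math.exp(-eta))),
    "inverse":  (lambda mu: 1 / mu,  lambda eta: 1 / eta),
}

mu = 0.37  # a fitted value in (0, 1) is valid for all four links
for name, (g, g_inv) in links.items():
    # round trip: g_inv(g(mu)) recovers the fitted value
    assert abs(g_inv(g(mu)) - mu) < 1e-9
    print(name, "round trip OK")
```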
Basic Premise:
The error distribution is from the exponential family
For these distributions, the variance is a function of the fitted value on the curve: \(var(Y_i) = \theta V(\hat{Y_i})\)
For a Normal distribution, \(var(Y_i) = \theta \cdot 1\) as \(V(\hat{Y_i}) = 1\)
For a Poisson distribution, \(var(Y_i) = 1 \cdot \hat{Y_i}\) as \(V(\hat{Y_i}) = \hat{Y_i}\) and \(\theta = 1\)
Distribution | Canonical Link | Variance Function |
---|---|---|
Normal | identity | \(1\) |
Poisson | log | \(\hat{Y_i}\) |
Binomial | logit | \(\hat{Y_i}(1-\hat{Y_i})\) |
Negative Binomial | log | \(\hat{Y_i} + \kappa\hat{Y_i}^2\) |
Gamma | inverse | \(\hat{Y_i}^2\) |
Inverse Normal | \(1/\hat{Y_i}^2\) | \(\hat{Y_i}^3\) |
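The Poisson row is easy to verify by simulation: the variance of Poisson draws tracks the mean, with no free scale parameter. A Python sketch using Knuth's classic Poisson sampler (the choice of \(\lambda = 5\) is arbitrary):

```python
import math, random

rng = random.Random(7)

def rpois(lam):
    # Knuth's algorithm: multiply uniforms until the product drops below e^-lambda
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

lam = 5.0
y = [rpois(lam) for _ in range(20000)]
mean = sum(y) / len(y)
var = sum((yi - mean) ** 2 for yi in y) / len(y)
print(mean, var)  # both near lambda: V(mu) = mu for the Poisson
```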
Likelihood:
\[\boldsymbol{Y_i} \sim E(\boldsymbol{\hat{Y_i}}, \theta)\]
E is any distribution from the Exponential Family
\(\theta\) is an error parameter, and can be a function of Y
Data Generating Process:
- Link Function \[\boldsymbol{f(\hat{Y_i})} = \boldsymbol{\eta_i}\]
- Linear Predictor \[\boldsymbol{\eta_i} = \boldsymbol{\beta X}\]
Likelihood:
\[\boldsymbol{Y_i} \sim \mathcal{P}(\lambda = \boldsymbol{\hat{Y_i}})\]
Data Generating Process: \[\log(\boldsymbol{\hat{Y_i}}) = \boldsymbol{\eta_i}\]
\[\boldsymbol{\eta_i} = \boldsymbol{\beta X_i}\]
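These two pieces (log link plus linear predictor) are all the fitting machinery needs. A minimal iteratively reweighted least squares (IRLS) sketch in Python for this Poisson model, fit to data simulated under assumed coefficients \(b_0 = 0.5\), \(b_1 = 0.1\) (illustrative values, not from any data in these slides):

```python
import math, random

rng = random.Random(42)

def rpois(lam):
    # Knuth's Poisson sampler
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

# Simulate from the DGP: log(mu_i) = b0 + b1 * x_i, Y_i ~ Poisson(mu_i)
true_b0, true_b1 = 0.5, 0.1
x = [i * 0.05 for i in range(400)]
y = [rpois(math.exp(true_b0 + true_b1 * xi)) for xi in x]

# IRLS for the log link: weights are mu (since V(mu) = mu for the Poisson)
b0, b1 = math.log(sum(y) / len(y)), 0.0   # crude starting values
for _ in range(50):
    eta = [b0 + b1 * xi for xi in x]
    mu = [math.exp(e) for e in eta]
    w = mu
    z = [e + (yi - mi) / mi for e, yi, mi in zip(eta, y, mu)]  # working response
    # Solve the 2x2 weighted least-squares normal equations by hand
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swz = sum(wi * zi for wi, zi in zip(w, z))
    swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
    det = sw * swxx - swx * swx
    b0 = (swz * swxx - swxz * swx) / det
    b1 = (sw * swxz - swx * swz) / det

print(b0, b1)  # estimates should land near the true 0.5 and 0.1
```

This is the same algorithm R's `glm()` runs under the hood; in practice you would call `glm(y ~ x, family = poisson(link = "log"))` rather than write the loop yourself.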
LR Test
 | LR Chisq | Df | Pr(>Chisq) |
---|---|---|---|
HLD_DIAM | 456.6136 | 1 | 0 |
Coefficients:
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 1.778059 | 0.0572585 | 31.05319 | 0 |
HLD_DIAM | 0.023624 | 0.0010502 | 22.49521 | 0 |
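Because of the log link, these coefficients live on the log scale; to interpret them as counts, invert the link. A Python sketch using the fitted coefficients from the table above (the diameter values plugged in are hypothetical):

```python
import math

# Fitted coefficients from the table above (log link)
b0, b1 = 1.778059, 0.023624

def predicted_fronds(hld_diam):
    # Invert the log link: mu = exp(b0 + b1 * HLD_DIAM)
    return math.exp(b0 + b1 * hld_diam)

print(round(predicted_fronds(0), 1))   # exp(intercept): expected count at diameter 0
print(round(predicted_fronds(50), 1))  # expected count at a hypothetical diameter of 50
```

Note the multiplicative interpretation: each one-unit increase in HLD_DIAM multiplies the expected frond count by \(e^{0.023624} \approx 1.024\), about a 2.4% increase.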
Likelihood:
\[\boldsymbol{Y_i} \sim NB(\boldsymbol{\hat{Y_i}}, \boldsymbol{\theta})\]
Data Generating Process: \[\log(\boldsymbol{\hat{Y_i}}) = \boldsymbol{\eta_i}\]
\[\boldsymbol{\eta_i} = \boldsymbol{\beta X_i}\]
Analysis of Deviance Table (Type II tests)
Response: FRONDS
LR Chisq Df Pr(>Chisq)
HLD_DIAM 51.145 1 8.578e-13 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1