Definitions
Acronym | Meaning | Use |
---|---|---|
PDF | probability density function (continuous) | describes relative likelihood |
PMF | probability mass function (discrete) | probability at specific values |
CDF | cumulative distribution function | probability ≤ x |
CI | confidence interval | estimates parameter range |
ANOVA | analysis of variance | tests group mean differences |
Key Symbols
Symbol | Meaning | Use |
---|---|---|
n | sample size | number of observations |
$x_i$ | i-th data value | individual measurement |
$\bar{x}$ | sample mean | average of sample |
μ | population mean | true average |
$s$, $s^2$ | sample SD, variance | sample spread |
$\sigma$, $\sigma^2$ | population SD, variance | true spread |
p | success probability | Bernoulli success rate |
k | count of successes | number of positive outcomes |
λ | rate parameter | events per unit time |
α | significance level | probability of Type I error |
$z$, $t$ | standard normal and Student-t quantiles | critical values |
$\chi^2$ | chi-square quantile | variance test value |
$\beta_0$, $\beta_1$ | intercept, slope | regression coefficients |
$R^2$ | coefficient of determination | explained variance fraction |
Basics
Concept | Formula | Use |
---|---|---|
Mean | $\bar{x}=\frac{1}{n}\sum_i x_i$ | central location |
Sample Variance | $s^2=\frac{1}{n-1}\sum_i (x_i-\bar{x})^2$ | measure of dispersion |
Population Variance | $\sigma^2=\frac{1}{n}\sum_i (x_i-\mu)^2$ | true dispersion (here $n$ counts the whole population) |
PDF | $f(x)$ | density shape |
CDF (cont.) | $F(x)=\int_{-\infty}^{x} f(t)\,dt$ | cumulative probability |
PMF | $P(X=k)$ | discrete probability at k |
CDF (disc.) | $F(k)=\sum_{i=0}^{k} P(X=i)$ | cumulative discrete prob. |
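
The Basics formulas can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample values are made up for the example, and the population variance assumes the true mean μ is known.

```python
from statistics import fmean


def sample_mean(xs):
    """x-bar = (1/n) * sum of x_i."""
    return fmean(xs)


def sample_variance(xs):
    """s^2 = 1/(n-1) * sum of (x_i - x-bar)^2; requires n >= 2."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def population_variance(xs, mu):
    """sigma^2 = (1/n) * sum of (x_i - mu)^2, with the true mean mu supplied."""
    return sum((x - mu) ** 2 for x in xs) / len(xs)


def discrete_cdf(pmf, k):
    """F(k) = sum_{i=0}^{k} P(X = i), where pmf(i) returns P(X = i)."""
    return sum(pmf(i) for i in range(k + 1))


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up measurements
print(sample_mean(data))                   # 5.0
print(sample_variance(data))               # 32/7, about 4.571
print(population_variance(data, mu=5.0))   # 4.0
```

The only difference between the two variance functions is the divisor (n−1 versus n) and which mean is subtracted.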
Discrete Distributions
Distribution | PMF | CDF | Mean / Var | Use |
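---|---|---|---|---|
Binomial ($n,p$) | $P(X=k)=\binom{n}{k}p^k(1-p)^{n-k}$ | $P(X\le k)=\sum_{i=0}^{k}\binom{n}{i}p^i(1-p)^{n-i}$ | $np$, $np(1-p)$ | number of successes in $n$ trials |
Poisson ($\lambda$) | $P(X=k)=\frac{\lambda^k e^{-\lambda}}{k!}$ | $F(k)=\sum_{i=0}^{k}\frac{\lambda^i e^{-\lambda}}{i!}$ | $\lambda$, $\lambda$ | counts of rare events |
Binomial coefficient | $\binom{n}{k}=\frac{n!}{(n-k)!\,k!}$ | | | |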
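
As a rough illustration of the binomial and Poisson rows, the following sketch evaluates the PMFs and CDFs directly from the formulas using only the standard library. The function names are hypothetical, and the direct summation is meant for small `k` rather than for numerically demanding cases.

```python
from math import comb, exp, factorial


def binom_pmf(k, n, p):
    """P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k) as a running sum of the PMF."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


def poisson_pmf(k, lam):
    """P(X = k) = lam^k * e^(-lam) / k!."""
    return lam**k * exp(-lam) / factorial(k)


def poisson_cdf(k, lam):
    """F(k) as a running sum of the PMF."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))


print(binom_pmf(3, 10, 0.5))   # 120/1024 = 0.1171875
print(binom_cdf(3, 10, 0.5))   # 176/1024 = 0.171875
print(poisson_pmf(0, 2.0))     # e^-2, about 0.1353
```

`math.comb(n, k)` computes the binomial coefficient n! / ((n−k)! k!) exactly with integers, which avoids overflow in the factorials.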
Continuous Distributions
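Distribution | PDF | CDF | Mean / Var | Use |
---|---|---|---|---|
Normal ($\mu,\sigma^2$) | $\frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ | $\Phi\left(\frac{x-\mu}{\sigma}\right)$ | $\mu$, $\sigma^2$ | symmetric spread around mean |
Exponential ($\lambda$) | $\lambda e^{-\lambda x},\ x\ge 0$ | $1-e^{-\lambda x}$ | $1/\lambda$, $1/\lambda^2$ | time between Poisson events |
Lognormal ($\mu,\sigma^2$) | $\frac{1}{x\sigma\sqrt{2\pi}}e^{-\frac{(\ln x-\mu)^2}{2\sigma^2}}$ | $\Phi\left(\frac{\ln x-\mu}{\sigma}\right)$ | $e^{\mu+\frac{1}{2}\sigma^2}$, $(e^{\sigma^2}-1)e^{2\mu+\sigma^2}$ | modeling skewed positive data |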
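Gamma ($\alpha,\beta$) | $\frac{x^{\alpha-1}e^{-x/\beta}}{\Gamma(\alpha)\,\beta^{\alpha}}$ | $\frac{1}{\Gamma(\alpha)}\gamma(\alpha, x/\beta)$ | $\alpha\beta$, $\alpha\beta^2$ | sum of $\alpha$ i.i.d. exponentials with mean $\beta$ (integer $\alpha$) |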
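Weibull ($k,\lambda$) | $\frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1}e^{-(x/\lambda)^k}$ | $1-e^{-(x/\lambda)^k}$ | $\lambda\,\Gamma\left(1+\frac{1}{k}\right)$, $\lambda^2\left[\Gamma\left(1+\frac{2}{k}\right)-\Gamma\left(1+\frac{1}{k}\right)^2\right]$ | life/failure time distribution |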
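
A few of the continuous-distribution entries can be checked numerically with the standard library. The sketch below is illustrative only: the function names are made up, and the normal CDF relies on the identity Φ(z) = (1 + erf(z/√2))/2, which is not stated in the table.

```python
from math import erf, exp, gamma, sqrt


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Phi((x - mu) / sigma), via Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    z = (x - mu) / sigma
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def exponential_cdf(x, lam):
    """1 - e^(-lam * x) for x >= 0, and 0 below the support."""
    return 1.0 - exp(-lam * x) if x >= 0 else 0.0


def weibull_cdf(x, k, lam):
    """1 - e^(-(x / lam)^k) for x >= 0, and 0 below the support."""
    return 1.0 - exp(-((x / lam) ** k)) if x >= 0 else 0.0


def weibull_mean(k, lam):
    """lam * Gamma(1 + 1/k)."""
    return lam * gamma(1.0 + 1.0 / k)


def weibull_var(k, lam):
    """lam^2 * [Gamma(1 + 2/k) - Gamma(1 + 1/k)^2]."""
    return lam**2 * (gamma(1.0 + 2.0 / k) - gamma(1.0 + 1.0 / k) ** 2)


print(normal_cdf(0.0))          # 0.5
print(weibull_mean(1.0, 2.0))   # about 2.0
print(weibull_var(1.0, 2.0))    # about 4.0
```

With shape k = 1 the Weibull reduces to an exponential with rate 1/λ, so its mean and variance come out as λ and λ², matching the Exponential row.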
Parameter Estimation
Estimate | Formula | Use |
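---|---|---|
CI for mean | $\bar{x}\pm t_{n-1,\alpha/2}\,\frac{s}{\sqrt{n}}$ | interval for true mean |
CI for variance | $\left(\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}},\ \frac{(n-1)s^2}{\chi^2_{\alpha/2}}\right)$ | interval for true variance ($\chi^2_q$ is the $q$-quantile with $n-1$ degrees of freedom) |
CI for proportion | $\hat{p}\pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ | interval for true proportion |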
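
The interval formulas for the mean and the proportion might be coded as follows. This is a sketch under assumptions: the function names are hypothetical, the normal quantile comes from `statistics.NormalDist`, and because the standard library has no Student-t quantile, the critical value `t_crit` must be supplied by the caller (for example from a t table).

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev


def z_crit(alpha):
    """Upper alpha/2 quantile of the standard normal, z_{alpha/2}."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def proportion_ci(successes, n, alpha=0.05):
    """p-hat +/- z_{alpha/2} * sqrt(p-hat * (1 - p-hat) / n)."""
    p_hat = successes / n
    half = z_crit(alpha) * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half, p_hat + half


def mean_ci(xs, t_crit):
    """x-bar +/- t_crit * s / sqrt(n); t_crit = t_{n-1, alpha/2} is an input."""
    half = t_crit * stdev(xs) / sqrt(len(xs))  # stdev divides by n - 1
    m = fmean(xs)
    return m - half, m + half


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up measurements
print(proportion_ci(40, 100))      # roughly (0.304, 0.496)
print(mean_ci(data, t_crit=2.365)) # 2.365 is the tabulated t value for 7 df, alpha = 0.05
```

The proportion interval is the normal-approximation form from the table, so it is only reasonable when n is large and the estimated proportion is not close to 0 or 1.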
Hypothesis Testing
Test | Statistic | Decision Rule | Use |
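---|---|---|---|
Z-test | $Z=\frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}}$ | reject if $\lvert Z\rvert>z_{\alpha/2}$ | test mean with known $\sigma$ |
T-test | $T=\frac{\bar{x}-\mu_0}{s/\sqrt{n}}$ | reject if $\lvert T\rvert>t_{n-1,\alpha/2}$ | test mean with unknown $\sigma$ |
p-value | $p=P(\lvert Z\rvert>\lvert z\rvert)$ | reject if $p<\alpha$ | quantify evidence against $H_0$ |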
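
The Z-test and p-value rows combine into a short routine. The sketch below assumes a two-sided test with known σ, treats the lowercase z in the p-value formula as the observed statistic, and uses a made-up function name and data.

```python
from math import sqrt
from statistics import NormalDist, fmean


def z_test(xs, mu0, sigma, alpha=0.05):
    """Two-sided one-sample z-test; returns (z, p_value, reject_h0)."""
    n = len(xs)
    z = (fmean(xs) - mu0) / (sigma / sqrt(n))
    # p = P(|Z| > |z|) under H0, where Z is standard normal
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up measurements
z, p, reject = z_test(data, mu0=3.0, sigma=2.0)
print(z)       # sqrt(8), about 2.83
print(p)       # below 0.01
print(reject)  # True
```

Rejecting when p < α is equivalent to the table's rule of rejecting when the absolute statistic exceeds the critical value, since both compare the same tail area.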
Error Propagation
Rule | Formula | Use |
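---|---|---|
Addition / Subtraction | $Z=X\pm Y$, $\Delta Z=\Delta X+\Delta Y$ | combine absolute errors |
Multiplication / Division | $Z=X\cdot Y$ or $Z=X/Y$, $\frac{\Delta Z}{\lvert Z\rvert}=\frac{\Delta X}{\lvert X\rvert}+\frac{\Delta Y}{\lvert Y\rvert}$ | combine relative errors |
Power | $Z=X^n$, $\frac{\Delta Z}{\lvert Z\rvert}=\lvert n\rvert\,\frac{\Delta X}{\lvert X\rvert}$ | relative error scaled by the exponent |
General Function | $Z=f(X_1,\dots,X_k)$, $\Delta Z=\sqrt{\sum_{i=1}^{k}\left(\frac{\partial f}{\partial X_i}\Delta X_i\right)^2}$ | derivative propagation |
Outlier Criterion (Chauvenet) | $\lvert x-\bar{x}\rvert>k\sigma$ | detect outliers; threshold $k$ chosen so that $n\,P(\lvert Z\rvert>k)<\frac{1}{2}$ |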
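
The General Function rule lends itself to a numerical sketch. The version below assumes the quadrature form (a square root over the sum of squared terms) and estimates each partial derivative by a central finite difference; the function name `propagate` and the step size are illustrative choices, not part of the original table.

```python
from math import sqrt


def propagate(f, values, errors, rel_step=1e-6):
    """Delta Z = sqrt(sum((df/dX_i * Delta X_i)^2)) with numerical partials."""
    total = 0.0
    for i, (x, dx) in enumerate(zip(values, errors)):
        h = rel_step * (abs(x) if x != 0 else 1.0)  # step scaled to the input
        up = list(values)
        dn = list(values)
        up[i] = x + h
        dn[i] = x - h
        dfdx = (f(*up) - f(*dn)) / (2.0 * h)  # central difference
        total += (dfdx * dx) ** 2
    return sqrt(total)


# Z = X * Y with X = 2.0 +/- 0.1 and Y = 3.0 +/- 0.2 (made-up values)
dz = propagate(lambda x, y: x * y, [2.0, 3.0], [0.1, 0.2])
print(dz)  # about 0.5, i.e. sqrt((3 * 0.1)^2 + (2 * 0.2)^2)
```

For the same inputs, the linear Multiplication rule gives a relative error of 0.05 + 0.0667, so ΔZ is about 0.7. The linear rules are worst-case bounds, while the quadrature form assumes independent errors and is therefore smaller.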
Regression Analysis
Concept | Formula | Use |
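---|---|---|
Slope | $\hat{\beta}_1=\frac{\sum(x_i-\bar{x})(y_i-\bar{y})}{\sum(x_i-\bar{x})^2}$ | effect per unit change |
Intercept | $\hat{\beta}_0=\bar{y}-\hat{\beta}_1\bar{x}$ | predicted value when $x=0$ |
$R^2$ | $1-\frac{\sum e_i^2}{\sum(y_i-\bar{y})^2}$, with residuals $e_i=y_i-\hat{y}_i$ | proportion of variance explained |
ANOVA F | $F=\frac{SSR/1}{SSE/(n-2)}$ | test overall regression fit |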
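
The four regression quantities can be computed together in one pass. This is a minimal sketch for simple (one-predictor) least squares with made-up data and a hypothetical function name; it takes SSR as the total sum of squares minus SSE, which holds for a least-squares fit with an intercept.

```python
from statistics import fmean


def simple_linear_regression(xs, ys):
    """Return (b0, b1, r2, f_stat) for the least-squares line y = b0 + b1 * x."""
    n = len(xs)
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx                 # slope
    b0 = y_bar - b1 * x_bar        # intercept
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))  # sum of e_i^2
    sst = sum((y - y_bar) ** 2 for y in ys)
    ssr = sst - sse                # regression sum of squares
    r2 = 1.0 - sse / sst
    f_stat = (ssr / 1.0) / (sse / (n - 2))  # undefined for a perfect fit (sse = 0)
    return b0, b1, r2, f_stat


xs = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up predictor values
ys = [2.0, 4.0, 5.0, 4.0, 5.0]  # made-up responses
b0, b1, r2, f_stat = simple_linear_regression(xs, ys)
print(b0, b1)   # about 2.2 and 0.6
print(r2)       # about 0.6
print(f_stat)   # about 4.5
```

The F statistic would be compared against an F quantile with 1 and n−2 degrees of freedom, which the standard library does not provide.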