Probability mass/density functions have associated generating functions, which facilitate the derivation and understanding of different probability distributions.
Probability generating functions for discrete variables
The probability generating function is defined as
\[ \Phi_X(t) = E[t^X] = \sum_{k=0}^{\infty} P(X=k) t^k \]
which is a power series representation of the probability mass function of the random variable. The probability that \(X=k\) is generated by
\[ P(X=k) = \frac{\Phi_X^{(k)}(0)}{k!} \]
If two random variables have the same probability generating function, then they have identical distributions.
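To make this concrete, here is a minimal sketch (the Binomial(5, 0.3) example and the sympy-based check are assumptions, not from the text) that recovers \(P(X=k)\) from the derivatives of \(\Phi_X(t)\) at \(t=0\):

```python
# A minimal sketch: recover P(X=k) from Phi_X^{(k)}(0)/k! for a Binomial PGF.
# The choice of Binomial(n=5, p=3/10) is only an illustrative assumption.
import sympy as sp

t = sp.symbols('t')
n, p = 5, sp.Rational(3, 10)
phi = (1 - p + p * t) ** n                      # PGF of Binomial(n, p)

for k in range(n + 1):
    # P(X=k) = Phi_X^{(k)}(0) / k!
    pk = sp.diff(phi, t, k).subs(t, 0) / sp.factorial(k)
    exact = sp.binomial(n, k) * p**k * (1 - p)**(n - k)
    assert sp.simplify(pk - exact) == 0
    print(f"P(X={k}) = {pk}")
```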
moments from derivatives
Two important results (mean and variance) can be obtained by differentiating \(\Phi_X(t)\) with respect to \(t\). The first derivative is
\[ \frac{d\Phi_X(t)}{dt} = \sum_{k=0}^{\infty} kP(X=k)t^{k-1} \]
and the second derivative is
\[ \frac{d^2\Phi_X(t)}{dt^2} = \sum_{k=0}^{\infty} k(k-1)P(X=k)t^{k-2} \]
Setting \(t=1\) gives
\[ \Phi'_X(1) = E(X) \]
\[ \Phi''_X(1) = E(X^2)-E(X) \]
So from the derivatives of \(\Phi_X(t)\) we get
\[ V(X) = E(X^2)-[E(X)]^2 = \Phi''_X(1) + \Phi'_X(1) - [\Phi'_X(1)]^2 \]
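As a quick check, the following sketch (the Poisson example is an assumption, not from the text) obtains the mean and variance from \(\Phi'_X(1)\) and \(\Phi''_X(1)\):

```python
# A minimal sketch: mean and variance of Poisson(lambda) from PGF derivatives at t=1.
import sympy as sp

t, lam = sp.symbols('t lambda', positive=True)
phi = sp.exp(lam * (t - 1))         # PGF of Poisson(lambda)

d1 = sp.diff(phi, t).subs(t, 1)     # Phi'_X(1)  = E(X)
d2 = sp.diff(phi, t, 2).subs(t, 1)  # Phi''_X(1) = E(X^2) - E(X)

mean = sp.simplify(d1)
var = sp.simplify(d2 + d1 - d1**2)  # V(X) = Phi'' + Phi' - (Phi')^2
print(mean, var)                    # both equal lambda
```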
sum of independent discrete random variables
For the sum of two independent discrete random variables \(X\) and \(Y\), the probability generating function of \(X+Y\) is
\[ \begin{aligned} \Phi_{X+Y}(t) &= \sum_{k=0}^{\infty} P(X+Y=k)t^k \\ &= \sum_{k=0}^{\infty} t^k \sum_{r=0}^{k} P(X=r)P(Y=k-r) \\ &= \sum_{k=0}^{\infty}\sum_{r=0}^{k} P(X=r)t^rP(Y=k-r)t^{k-r} \\ &= \sum_{r=0}^{\infty}\sum_{k=r}^{\infty} P(X=r)t^rP(Y=k-r)t^{k-r} \\ &= \sum_{r=0}^{\infty} P(X=r)t^r \sum_{m=0}^{\infty} P(Y=m)t^{m} \end{aligned} \]
The change of summation order is analogous to the change of integration order
\[ \int_0^{\infty}dy \int_0^y dx = \int_0^{\infty} dx \int_x^{\infty} dy \]
Thus, the probability generating function of \(X+Y\) is equal to the product of the respective probability generating functions of \(X\) and \(Y\)
\[ \Phi_{X+Y}(t) = \Phi_X(t) \Phi_Y(t) \]
which also holds for moment generating functions.
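The product rule can be verified symbolically; the sketch below (the Poisson choice is an assumption, not from the text) shows that the PGF of the sum of two independent Poisson variables is again a Poisson PGF:

```python
# A minimal sketch: for independent X ~ Poisson(a) and Y ~ Poisson(b),
# Phi_{X+Y}(t) = Phi_X(t) * Phi_Y(t) is the PGF of Poisson(a + b).
import sympy as sp

t, a, b = sp.symbols('t a b', positive=True)
phi_X = sp.exp(a * (t - 1))
phi_Y = sp.exp(b * (t - 1))

phi_sum = sp.simplify(phi_X * phi_Y)
print(phi_sum)                                   # exp((a + b)*(t - 1))
assert sp.simplify(phi_sum - sp.exp((a + b) * (t - 1))) == 0
```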
Moment generating functions
The moment generating function is defined as
\[ M_X(t) = E[e^{Xt}] = \begin{cases} \sum_i e^{tx_i} P(X=x_i) & \text{for a discrete distribution} \\ \int e^{tx} f(x) dx & \text{for a continuous distribution} \end{cases} \]
The exponential can be expanded as a power series:
\[ M_X(t) = E[e^{tX}] = E[\sum_{k=0}^{\infty} \frac{X^k}{k!} t^k] = \sum_{k=0}^{\infty} \frac{t^k}{k!} E[X^k] \]
So moments of different orders can be generated from the corresponding derivatives of \(M_X(t)\)
\[ E[X^k] = M_X^{(k)}(0) \]
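For example, the following sketch (the exponential distribution is an assumed illustration, not from the text) reads off the raw moments from the derivatives of the MGF at \(t=0\):

```python
# A minimal sketch: raw moments E[X^k] of Exponential(lambda) from M_X^{(k)}(0).
import sympy as sp

t, lam = sp.symbols('t lambda', positive=True)
M = lam / (lam - t)                 # MGF of Exponential(lambda), valid for t < lambda

for k in range(1, 4):
    moment = sp.simplify(sp.diff(M, t, k).subs(t, 0))   # E[X^k] = M_X^{(k)}(0)
    print(f"E[X^{k}] = {moment}")   # k! / lambda^k
```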
sum of independent random variables
Similarly, for the sum \(N = X+Y\) of two independent continuous random variables (taken here to be non-negative), the moment generating function is
\[ \begin{aligned} M_{X+Y}(t) &= E[e^{(X+Y)t}] \\ &= \int_{0}^{\infty} e^{nt} f_{X+Y}(n) dn \\ &= \int_{0}^{\infty} e^{nt} dn \int_{0}^{n} f_X(x)f_Y(n-x) dx \\ &= \int_{0}^{\infty} dn \int_{0}^{n} e^{(n-x)t} e^{xt} f_X(x)f_Y(n-x) dx \\ &= \int_{0}^{\infty} dx \int_{x}^{\infty} e^{(n-x)t} e^{xt} f_X(x)f_Y(n-x) dn \\ &= \int_{0}^{\infty} e^{xt} f_X(x) dx \int_{0}^{\infty} e^{mt} f_Y(m) dm \qquad (m = n - x) \\ &= M_{X}(t)M_{Y}(t) \end{aligned} \]
which is the product of the respective moment generating functions of the independent variables \(X\) and \(Y\). This property is useful in deriving the Poisson/Binomial distributions from the Binomial/Bernoulli distributions, or the Gamma distribution from the exponential distribution.
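A quick numerical check of the product property (the concrete rate and the Monte Carlo approach are assumptions, not from the text) for a sum of two independent exponentials, whose distribution is Gamma with shape 2:

```python
# A minimal sketch: estimate E[e^{t(X+Y)}] by Monte Carlo for independent
# X, Y ~ Exponential(lam) and compare with M_X(t) * M_Y(t).
import numpy as np
from scipy import stats

t = 0.3
lam = 1.0
samples = stats.expon(scale=1/lam).rvs(size=(2, 1_000_000), random_state=0)
mgf_sum_mc = np.mean(np.exp(t * samples.sum(axis=0)))   # Monte Carlo E[e^{t(X+Y)}]
mgf_product = (lam / (lam - t)) ** 2                    # M_X(t) * M_Y(t), Gamma(2, lam) MGF
print(mgf_sum_mc, mgf_product)                          # the two values agree closely
```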
Characteristic functions
Characteristic functions are defined as
\[ C_X(t) = E[e^{iXt}] = \int_{-\infty}^{\infty} e^{ixt} f(x) dx \]
It is analogous to the moment generating function: \(C_X(t) = M_X(it)\). So they share similar properties, including
\[ E[X^k] = (-i)^k C_X^{(k)}(0) \]
\[ C_{X+Y}(t) = C_X(t)C_Y(t) \]
Besides, \(C_X(t)\) is exactly the Fourier transform of \(f(x)\), and \(f(x)\) can be recovered by the inverse transform
\[ f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} C_X(t) e^{-ixt} dt \]
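The inversion formula can be checked numerically; the sketch below (the standard normal example and the simple Riemann-sum integration are assumptions, not from the text) recovers the density at a single point from \(C_X(t) = e^{-t^2/2}\):

```python
# A minimal sketch: evaluate f(x) = (1/2pi) * integral of C_X(t) e^{-ixt} dt
# for the standard normal, whose characteristic function is exp(-t^2/2).
import numpy as np

x = 1.0
t = np.linspace(-50, 50, 200_001)
C = np.exp(-t**2 / 2)                                  # characteristic function of N(0, 1)
dt = t[1] - t[0]
f_x = (C * np.exp(-1j * x * t)).sum().real * dt / (2 * np.pi)

print(f_x)                                             # ~ 0.24197
print(np.exp(-x**2 / 2) / np.sqrt(2 * np.pi))          # standard normal pdf at x = 1
```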