11.2.8 Weighted Samples

The functions described in this section allow the computation of statistics for weighted samples. The functions accept an array of samples, $x_i$, with associated weights, $w_i$. Each sample $x_i$ is considered as having been drawn from a Gaussian distribution with variance $\sigma_i^2$. The sample weight $w_i$ is defined as the reciprocal of this variance, $w_i = 1/\sigma_i^2$. Setting a weight to zero corresponds to removing a sample from a dataset.

wmean( w, data)
This function returns the weighted mean of the dataset data using the set of weights w. The weighted mean is defined as

$\displaystyle \hat\mu = (\sum w_i x_i) / (\sum w_i)$ (11.16)
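
As an illustration of Eq. (11.16), the following is a minimal Python sketch, not this library's implementation. The name `weighted_mean` and the sample values are hypothetical; the weights are built from assumed standard deviations using $w_i = 1/\sigma_i^2$.

```python
def weighted_mean(w, data):
    """Weighted mean per Eq. (11.16): sum(w_i * x_i) / sum(w_i)."""
    total_w = sum(w)
    if total_w == 0:
        raise ValueError("at least one weight must be non-zero")
    return sum(wi * xi for wi, xi in zip(w, data)) / total_w


# Hypothetical measurements with assumed standard deviations sigma_i.
sigma = [0.5, 1.0, 2.0]
w = [1.0 / s**2 for s in sigma]  # w_i = 1 / sigma_i^2
data = [1.0, 2.0, 4.0]

print(weighted_mean(w, data))

# A zero weight removes the sample: the value 100.0 has no effect here.
print(weighted_mean([1.0, 0.0, 1.0], [1.0, 100.0, 3.0]))  # 2.0
```

The most precise measurement (smallest $\sigma_i$) carries the largest weight, so the result is pulled toward it.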

wvariance( w, data)
This function returns the estimated variance of the dataset data, using the set of weights w. The estimated variance of a weighted dataset is defined as

$\displaystyle \hat\sigma^2 = ((\sum w_i)/((\sum w_i)^2 - \sum (w_i^2))) \sum w_i (x_i - \hat\mu)^2$ (11.17)

Note that this expression reduces to an unweighted variance with the familiar $1/(N-1)$ factor when there are $N$ equal non-zero weights.
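
The reduction to the $1/(N-1)$ form can be checked numerically. The sketch below is an illustrative Python transcription of Eq. (11.17), with the hypothetical name `weighted_variance`; it compares the result for equal weights against `statistics.variance` from the Python standard library, which uses the $1/(N-1)$ factor.

```python
import statistics


def weighted_variance(w, data):
    """Estimated variance of a weighted dataset per Eq. (11.17)."""
    sw = sum(w)                       # sum of weights
    sw2 = sum(wi * wi for wi in w)    # sum of squared weights
    mu = sum(wi * xi for wi, xi in zip(w, data)) / sw
    factor = sw / (sw * sw - sw2)
    return factor * sum(wi * (xi - mu) ** 2 for wi, xi in zip(w, data))


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
equal_w = [2.5] * len(data)  # any equal non-zero weight gives the same result

print(weighted_variance(equal_w, data))
print(statistics.variance(data))  # unweighted estimate with 1/(N-1)
```

With $N$ equal weights $c$, the prefactor becomes $Nc/(N^2c^2 - Nc^2) = 1/(c(N-1))$, and the remaining factor of $c$ in the sum cancels it.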

wvariance_m( w, data, wmean)
This function returns the estimated variance of the weighted dataset data using the given weighted mean wmean.

wsd( w, data)
The standard deviation is defined as the square root of the variance. This function returns the square root of the corresponding variance function wvariance above.

wsd_m( w, data, wmean)
This function returns the square root of the corresponding variance function wvariance_m above.

wvariance_with_fixed_mean( w, data, mean)
This function computes an unbiased estimate of the variance of the weighted dataset data when the population mean of the underlying distribution, mean, is known _a priori_. In this case the estimator for the variance replaces the sample mean $\hat\mu$ by the known population mean $\mu$,

$\displaystyle \hat\sigma^2 = (\sum w_i (x_i - \mu)^2) / (\sum w_i)$ (11.18)
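
A minimal Python sketch of Eq. (11.18) follows. The function name `weighted_variance_fixed_mean` and the sample values are hypothetical and serve only to show the arithmetic.

```python
def weighted_variance_fixed_mean(w, data, mean):
    """Variance about a known population mean, per Eq. (11.18)."""
    sw = sum(w)
    return sum(wi * (xi - mean) ** 2 for wi, xi in zip(w, data)) / sw


w = [1.0, 1.0, 2.0]
data = [1.0, 3.0, 2.0]

# Squared deviations about mean = 2 are 1, 1, 0, so the result is
# (1*1 + 1*1 + 2*0) / (1 + 1 + 2) = 0.5.
print(weighted_variance_fixed_mean(w, data, 2.0))  # 0.5
```

Because the mean is supplied rather than estimated from the data, the normalization is simply the sum of the weights and the correction factor of Eq. (11.17) does not appear.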

wsd_with_fixed_mean( w, data, mean)
The standard deviation is defined as the square root of the variance. This function returns the square root of the corresponding variance function above.

wabsdev( w, data)
This function computes the weighted absolute deviation from the weighted mean of data. The absolute deviation from the mean is defined as

$\displaystyle absdev = (\sum w_i \vert x_i - \hat\mu\vert) / (\sum w_i)$ (11.19)
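
Eq. (11.19) can be sketched in Python as follows. The names `weighted_absdev` and `weighted_absdev_m` are hypothetical stand-ins that mirror the two calling conventions described in this section; they are not this library's code.

```python
def weighted_absdev_m(w, data, wmean):
    """Weighted absolute deviation about a given mean, per Eq. (11.19)."""
    sw = sum(w)
    return sum(wi * abs(xi - wmean) for wi, xi in zip(w, data)) / sw


def weighted_absdev(w, data):
    """Same quantity, computing the weighted mean (Eq. 11.16) first."""
    wmean = sum(wi * xi for wi, xi in zip(w, data)) / sum(w)
    return weighted_absdev_m(w, data, wmean)


w = [1.0, 1.0, 2.0]
data = [0.0, 4.0, 2.0]

# Weighted mean is (0 + 4 + 4) / 4 = 2; absolute deviations are 2, 2, 0.
print(weighted_absdev(w, data))  # 1.0
```

Passing a precomputed mean avoids a second pass over the data when the weighted mean is already available.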

wabsdev_m( w, data, wmean)
This function computes the absolute deviation of the weighted dataset data about the given weighted mean wmean.

wskew( w, data)
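This function computes the weighted skewness of the dataset data. The skewness is defined in terms of the weighted mean $\hat\mu$ and the weighted standard deviation $\hat\sigma$ as

$\displaystyle skew = (\sum w_i ((x_i - \hat\mu)/\hat\sigma)^3) / (\sum w_i)$ (11.20)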
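
The following Python sketch illustrates Eq. (11.20). It assumes that the standard deviation in the formula is the square root of the variance estimate of Eq. (11.17); all function names are hypothetical and the code is not this library's implementation.

```python
import math


def weighted_mean(w, data):
    """Weighted mean, Eq. (11.16)."""
    return sum(wi * xi for wi, xi in zip(w, data)) / sum(w)


def weighted_sd_m(w, data, wmean):
    """Square root of the variance estimate of Eq. (11.17)."""
    sw = sum(w)
    sw2 = sum(wi * wi for wi in w)
    ss = sum(wi * (xi - wmean) ** 2 for wi, xi in zip(w, data))
    return math.sqrt(sw / (sw * sw - sw2) * ss)


def weighted_skew_m_sd(w, data, wmean, wsd):
    """Weighted skewness, Eq. (11.20), from a given mean and sd."""
    sw = sum(w)
    return sum(wi * ((xi - wmean) / wsd) ** 3 for wi, xi in zip(w, data)) / sw


def weighted_skew(w, data):
    wmean = weighted_mean(w, data)
    wsd = weighted_sd_m(w, data, wmean)
    return weighted_skew_m_sd(w, data, wmean, wsd)


# A symmetric dataset has zero skewness; a long right tail gives a positive one.
print(weighted_skew([1.0, 1.0, 1.0], [1.0, 2.0, 3.0]))
print(weighted_skew([1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 10.0]))
```

The skewness is dimensionless, since each deviation is divided by the standard deviation before being cubed.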

wskew_m_sd( w, data, wmean, wsd)
This function computes the weighted skewness of the dataset data using the given values of the weighted mean and weighted standard deviation, wmean and wsd.

wkurtosis( w, data)
This function computes the weighted kurtosis of the dataset data. The kurtosis is defined as

$\displaystyle kurtosis = ((\sum w_i ((x_i - \hat\mu)/\hat\sigma)^4) / (\sum w_i)) - 3$ (11.21)
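
Eq. (11.21) can be sketched in Python in the same way. As before, the function names are hypothetical, and the standard deviation is assumed to be the square root of the variance estimate of Eq. (11.17).

```python
import math


def weighted_mean(w, data):
    """Weighted mean, Eq. (11.16)."""
    return sum(wi * xi for wi, xi in zip(w, data)) / sum(w)


def weighted_sd_m(w, data, wmean):
    """Square root of the variance estimate of Eq. (11.17)."""
    sw = sum(w)
    sw2 = sum(wi * wi for wi in w)
    ss = sum(wi * (xi - wmean) ** 2 for wi, xi in zip(w, data))
    return math.sqrt(sw / (sw * sw - sw2) * ss)


def weighted_kurtosis_m_sd(w, data, wmean, wsd):
    """Weighted kurtosis, Eq. (11.21), from a given mean and sd."""
    sw = sum(w)
    m4 = sum(wi * ((xi - wmean) / wsd) ** 4 for wi, xi in zip(w, data)) / sw
    return m4 - 3.0  # subtracting 3 makes a Gaussian come out near zero


def weighted_kurtosis(w, data):
    wmean = weighted_mean(w, data)
    wsd = weighted_sd_m(w, data, wmean)
    return weighted_kurtosis_m_sd(w, data, wmean, wsd)


# Equal weights on 1, 2, 3: mean 2, sd 1, fourth powers 1, 0, 1,
# so the kurtosis is 2/3 - 3.
print(weighted_kurtosis([1.0, 1.0, 1.0], [1.0, 2.0, 3.0]))
```

Negative values indicate a distribution with lighter tails than a Gaussian, positive values heavier tails.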

wkurtosis_m_sd( w, data, wmean, wsd)
This function computes the weighted kurtosis of the dataset data using the given values of the weighted mean and weighted standard deviation, wmean and wsd.