pypfilt.stats

There are nine common definitions of sample quantiles, and weighted versions of all nine are presented in this article. The qtl_wt() function currently implements type 2 weighted quantiles. However, the differences between these definitions are typically small when there are many values (i.e., particles).

pypfilt.stats.cov_wt(x, wt, cor=False)

Estimate the weighted covariance or correlation matrix.

Equivalent to cov.wt(x, wt, cor, center=TRUE, method="unbiased") as provided by the stats package for R.

Parameters:

x – A 2-D array; columns represent variables and rows represent observations.
wt – A 1-D array of observation weights.
cor – Whether to return a correlation matrix instead of a covariance matrix.

Returns:

The covariance matrix (if cor=False) or the correlation matrix (if cor=True).
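The calculation that R's cov.wt performs with center=TRUE and method="unbiased" can be sketched in NumPy as follows. This is an illustration under stated assumptions, not pypfilt's own code: the name cov_wt_sketch is hypothetical, and the steps (normalise the weights, subtract the weighted column means, divide by one minus the sum of squared weights) follow the R function rather than anything stated on this page.

```python
import numpy as np


def cov_wt_sketch(x, wt, cor=False):
    """Weighted covariance (or correlation) matrix; hypothetical helper."""
    x = np.asarray(x, dtype=float)
    wt = np.asarray(wt, dtype=float)
    # Normalise the weights so that they sum to one.
    wt = wt / wt.sum()
    # Weighted mean of each column (variable).
    center = wt @ x
    # Scale each centred observation (row) by the square root of its weight.
    dev = np.sqrt(wt)[:, np.newaxis] * (x - center)
    # The "unbiased" method divides by one minus the sum of squared weights.
    cov = (dev.T @ dev) / (1.0 - np.sum(wt ** 2))
    if cor:
        sd = np.sqrt(np.diag(cov))
        return cov / np.outer(sd, sd)
    return cov


# Rows are observations, columns are variables.
x = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]])
equal = np.ones(4)
# With equal weights this reduces to the usual sample covariance.
print(np.allclose(cov_wt_sketch(x, equal), np.cov(x, rowvar=False)))
print(cov_wt_sketch(x, [0.1, 0.2, 0.3, 0.4], cor=True))
```

Checking the equal-weight case against np.cov is a quick way to confirm that the normalisation and the divisor agree with the unweighted estimator.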

pypfilt.stats.avg_var_wt(x, weights, biased=True)

Return the weighted average and variance (based on a Stack Overflow answer).

Parameters:

x – A 1-D array of values.
weights – A 1-D array of normalised weights.
biased – Use a biased variance estimator.

Returns:

A tuple that contains the weighted average and weighted variance.

Raises:

ValueError – if x or weights are not one-dimensional, or if x and weights have different dimensions.
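A minimal NumPy sketch of a weighted average and variance is shown below. The name avg_var_wt_sketch is hypothetical, and the correction applied when biased=False (dividing by one minus the sum of squared normalised weights) is an assumption made for illustration; this page does not say which correction pypfilt uses.

```python
import numpy as np


def avg_var_wt_sketch(x, weights, biased=True):
    """Weighted average and variance of a 1-D array; hypothetical helper."""
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if x.ndim != 1 or weights.ndim != 1 or x.shape != weights.shape:
        raise ValueError("x and weights must be 1-D arrays of equal length")
    average = np.average(x, weights=weights)
    # Weighted mean of the squared deviations (the biased estimator).
    variance = np.average((x - average) ** 2, weights=weights)
    if not biased:
        # ASSUMPTION: correct the bias using normalised weights.
        w = weights / weights.sum()
        variance = variance / (1.0 - np.sum(w ** 2))
    return average, variance


values = np.array([1.0, 2.0, 3.0, 4.0])
w = np.full(4, 0.25)
print(avg_var_wt_sketch(values, w))
print(avg_var_wt_sketch(values, w, biased=False))
```

With equal weights the two results should match np.var(values) and np.var(values, ddof=1) respectively, which is a convenient sanity check.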

pypfilt.stats.qtl_wt(x, weights, probs)

Calculate weighted quantiles of an array of values, where each value has a fractional weighting.

Weights are summed over exact ties, yielding distinct values x_1 < x_2 < … < x_N, with corresponding weights w_1, w_2, …, w_N. Let s_j denote the sum of the first j weights, and let W denote the sum of all the weights. For a probability p:

If p * W < s_1 the estimated quantile is x_1.
If s_j < p * W < s_{j + 1} the estimated quantile is x_{j + 1}.
If p * W == s_N the estimated quantile is x_N.
If p * W == s_j for some j < N the estimated quantile is (x_j + x_{j + 1}) / 2.
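The four rules above can be sketched for a single probability p as follows. This is an illustrative reading of the rules, not pypfilt's implementation: the name weighted_quantile_sketch is hypothetical, and the exact floating-point comparison with the cumulative weights is a simplification that is only reliable when the weights are exactly representable.

```python
import numpy as np


def weighted_quantile_sketch(x, weights, p):
    """Type 2 weighted quantile for one probability p; hypothetical helper."""
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Sum the weights over exact ties; `values` is sorted and distinct.
    values, inverse = np.unique(x, return_inverse=True)
    w = np.bincount(inverse, weights=weights)
    s = np.cumsum(w)          # s[j] is the sum of the first j + 1 weights
    target = p * s[-1]        # p * W
    if target == s[-1]:
        return values[-1]
    # First index whose cumulative weight is at least p * W.
    j = np.searchsorted(s, target, side="left")
    if s[j] == target:
        # p * W lands exactly on a cumulative weight: average the neighbours.
        return (values[j] + values[j + 1]) / 2
    # Otherwise p * W lies strictly below s[j], including the first case.
    return values[j]


x = [1.0, 2.0, 3.0, 4.0]
w = [1.0, 1.0, 1.0, 1.0]
for p in (0.0, 0.25, 0.3, 0.5, 1.0):
    print(p, weighted_quantile_sketch(x, w, p))
```

The case p * W < s_1 needs no separate branch here, because the search then returns the first index and the strict inequality selects x_1.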

Parameters:

x – A 1-D array of values.
weights – A 1-D array of weights.
probs – The quantile(s) to compute.

Returns:

The array of weighted quantiles.

Raises:

ValueError – if x or weights are not one-dimensional, or if x and weights have different dimensions.

pypfilt.stats.cred_wt(x, weights, creds)

Calculate weighted credible intervals.

Parameters:

x – A 1-D array of values.
weights – A 1-D array of weights.
creds (List(int)) – The credible interval(s) to compute (0..100, where 0 represents the median and 100 the entire range).

Returns:

A dictionary that maps credible intervals to the lower and upper interval bounds.

Raises:

ValueError – if x or weights are not one-dimensional, or if x and weights have different dimensions.
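One way to read the creds argument is as a set of central intervals: if 0 is the median and 100 is the entire range, then a credible interval c plausibly spans the quantiles (100 - c) / 200 and (100 + c) / 200. The sketch below shows only that mapping. The helper name cred_to_probs is hypothetical, and the central-interval interpretation is an assumption inferred from the parameter description rather than something this page states.

```python
def cred_to_probs(creds):
    """Map credible intervals (0..100) to (lower, upper) quantile probabilities.

    ASSUMPTION: each interval is centred on the median.
    """
    probs = {}
    for cred in creds:
        if not 0 <= cred <= 100:
            raise ValueError(f"credible interval out of range: {cred}")
        probs[cred] = ((100 - cred) / 200, (100 + cred) / 200)
    return probs


for cred, (lower, upper) in cred_to_probs([0, 50, 95, 100]).items():
    print(cred, lower, upper)
```

Under this reading, the lower and upper bounds for each interval would be the weighted quantiles at these two probabilities.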