pypfilt.stats

Note that there are 9 common definitions of sample quantiles. Weighted versions of these 9 definitions are presented in this article. The qtl_wt() function currently implements type 2 weighted quantiles. However, the differences between these definitions are typically small when there are many values (i.e., particles).

pypfilt.stats.cov_wt(x, wt, cor=False)

Estimate the weighted covariance or correlation matrix.

Equivalent to cov.wt(x, wt, cor, center=TRUE, method="unbiased") as provided by the stats package for R.

Parameters:
  • x – A 2-D array; columns represent variables and rows represent observations.

  • wt – A 1-D array of observation weights.

  • cor – Whether to return a correlation matrix instead of a covariance matrix.

Returns:

The covariance matrix (if cor=False) or the correlation matrix (if cor=True).
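
As an illustration of the calculation that R's cov.wt performs with center=TRUE and method="unbiased", the following is a minimal NumPy sketch. The name cov_wt_sketch is hypothetical and this is not the pypfilt implementation; it assumes the weights are non-negative and normalises them to sum to one, as cov.wt does.

```python
import numpy as np


def cov_wt_sketch(x, wt, cor=False):
    """Hypothetical sketch of an unbiased weighted covariance estimate."""
    x = np.asarray(x, dtype=float)    # rows: observations, columns: variables
    wt = np.asarray(wt, dtype=float)
    wt = wt / wt.sum()                # normalise the weights to sum to one
    center = wt @ x                   # weighted mean of each variable
    # Scale each centred observation by the square root of its weight.
    resid = (x - center) * np.sqrt(wt)[:, np.newaxis]
    # Divide by 1 - sum(wt^2); this reduces to n - 1 scaling for equal weights.
    cov = resid.T @ resid / (1.0 - np.sum(wt ** 2))
    if cor:
        sd = np.sqrt(np.diag(cov))
        return cov / np.outer(sd, sd)
    return cov


x = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]])
equal_cov = cov_wt_sketch(x, np.ones(4))
equal_cor = cov_wt_sketch(x, np.ones(4), cor=True)
```

With equal weights the sketch agrees with np.cov(x, rowvar=False) and np.corrcoef(x, rowvar=False), which is a convenient sanity check.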

pypfilt.stats.avg_var_wt(x, weights, biased=True)

Return the weighted average and variance (based on a Stack Overflow answer).

Parameters:
  • x – A 1-D array of values.

  • weights – A 1-D array of normalised weights.

  • biased – Use a biased variance estimator.

Returns:

A tuple that contains the weighted average and weighted variance.

Raises:

ValueError – if x or weights are not one-dimensional, or if x and weights have different dimensions.
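
The weighted average and variance can be sketched with np.average, as shown below. The name avg_var_wt_sketch is hypothetical, and the unbiased correction (dividing by one minus the sum of squared weights) is an assumption that is only appropriate for normalised weights; the pypfilt implementation may differ in detail.

```python
import numpy as np


def avg_var_wt_sketch(x, weights, biased=True):
    """Hypothetical sketch of a weighted average and variance."""
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if x.ndim != 1 or weights.ndim != 1:
        raise ValueError("x and weights must be one-dimensional")
    if x.shape != weights.shape:
        raise ValueError("x and weights must have the same shape")
    average = np.average(x, weights=weights)
    # Weighted mean of the squared deviations (the biased estimator).
    variance = np.average((x - average) ** 2, weights=weights)
    if not biased:
        # Assumed correction; requires weights that sum to one.
        variance = variance / (1.0 - np.sum(weights ** 2))
    return average, variance


values = np.array([1.0, 2.0, 3.0, 4.0])
w = np.full(4, 0.25)
avg, var_biased = avg_var_wt_sketch(values, w)
_, var_unbiased = avg_var_wt_sketch(values, w, biased=False)
```

For equal normalised weights the two estimates match np.var(values) and np.var(values, ddof=1) respectively.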

pypfilt.stats.qtl_wt(x, weights, probs)

Calculate weighted quantiles of an array of values, where each value has a fractional weighting.

Weights are summed over exact ties, yielding distinct values x_1 < x_2 < … < x_N, with corresponding weights w_1, w_2, …, w_N. Let s_j denote the sum of the first j weights, and let W denote the sum of all the weights. For a probability p:

  • If p * W < s_1 the estimated quantile is x_1.

  • If s_j < p * W < s_{j + 1} the estimated quantile is x_{j + 1}.

  • If p * W == s_N the estimated quantile is x_N.

  • If p * W == s_j for some j < N the estimated quantile is (x_j + x_{j + 1}) / 2.

Parameters:
  • x – A 1-D array of values.

  • weights – A 1-D array of weights.

  • probs – The quantile(s) to compute.

Returns:

The array of weighted quantiles.

Raises:

ValueError – if x or weights are not one-dimensional, or if x and weights have different dimensions.
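
The quantile rules described above for qtl_wt can be sketched as follows. The name qtl_wt_sketch is hypothetical and this is not the pypfilt implementation; in particular, the floating-point tolerance used to decide whether p * W equals a cumulative weight is an assumption of this sketch.

```python
import numpy as np


def qtl_wt_sketch(x, weights, probs):
    """Hypothetical sketch of the weighted quantile rules described above."""
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if x.ndim != 1 or weights.ndim != 1:
        raise ValueError("x and weights must be one-dimensional")
    if x.shape != weights.shape:
        raise ValueError("x and weights must have the same shape")
    # Sum the weights over exact ties; `values` is sorted and distinct.
    values, inverse = np.unique(x, return_inverse=True)
    w = np.bincount(inverse, weights=weights)
    s = np.cumsum(w)                  # s[j] is the sum of the first j + 1 weights
    last = len(values) - 1
    result = []
    for p in np.atleast_1d(probs):
        target = p * s[-1]
        # Assumed tolerance for deciding that p * W equals a cumulative weight.
        hits = np.flatnonzero(np.isclose(s, target, rtol=1e-12, atol=0.0))
        if hits.size > 0:
            j = hits[0]
            if j == last:
                result.append(values[last])
            else:
                result.append(0.5 * (values[j] + values[j + 1]))
        else:
            # First distinct value whose cumulative weight exceeds p * W.
            j = min(np.searchsorted(s, target, side="left"), last)
            result.append(values[j])
    return np.array(result)


q = qtl_wt_sketch([4.0, 2.0, 1.0, 3.0], [1.0, 1.0, 1.0, 1.0], [0.1, 0.3, 0.5, 1.0])
q_ties = qtl_wt_sketch([1.0, 1.0, 2.0], [0.25, 0.25, 0.5], [0.5])
```

In the first call the cumulative weights are 1, 2, 3, 4, so p = 0.5 falls exactly on s_2 and the sketch averages x_2 and x_3. In the second call the two tied values are merged into a single value with weight 0.5 before the rules are applied.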

pypfilt.stats.cred_wt(x, weights, creds)

Calculate weighted credible intervals.

Parameters:
  • x – A 1-D array of values.

  • weights – A 1-D array of weights.

  • creds (List(int)) – The credible interval(s) to compute (0..100, where 0 represents the median and 100 the entire range).

Returns:

A dictionary that maps credible intervals to the lower and upper interval bounds.

Raises:

ValueError – if x or weights are not one-dimensional, or if x and weights have different dimensions.
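
One way to relate credible intervals to quantiles is to treat an interval of c percent as the central interval between the probabilities 0.5 - c / 200 and 0.5 + c / 200, so that 0 gives the median and 100 gives the entire range. The sketch below makes that assumption. The names cred_probs, simple_weighted_quantile and cred_wt_sketch are hypothetical, and the quantile helper is deliberately simplified (it does not merge ties or average at exact cumulative weights), so results may differ slightly from those of cred_wt.

```python
import numpy as np


def cred_probs(creds):
    """Map each credible interval (in percent) to lower and upper probabilities."""
    return {c: (0.5 - c / 200.0, 0.5 + c / 200.0) for c in creds}


def simple_weighted_quantile(x, weights, p):
    """Simplified helper: first sorted value whose cumulative weight reaches p * W."""
    order = np.argsort(x)
    xs = np.asarray(x, dtype=float)[order]
    s = np.cumsum(np.asarray(weights, dtype=float)[order])
    j = min(np.searchsorted(s, p * s[-1], side="left"), len(xs) - 1)
    return xs[j]


def cred_wt_sketch(x, weights, creds):
    """Hypothetical sketch: map each credible interval to (lower, upper) bounds."""
    return {
        c: (simple_weighted_quantile(x, weights, lo),
            simple_weighted_quantile(x, weights, hi))
        for c, (lo, hi) in cred_probs(creds).items()
    }


samples = np.arange(1.0, 101.0)       # the values 1, 2, ..., 100
intervals = cred_wt_sketch(samples, np.ones(100), [0, 50, 100])
```

The returned dictionary is keyed by the requested credible intervals, with each value a tuple of lower and upper bounds, mirroring the return value described above.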