pypfilt.sampler.Independent has been removed. Use pypfilt.sampler.LatinHypercube instead. Note that this may require changing distribution names and parameters.
pypfilt no longer includes the particle history matrix when returning the results of fitting and forecasting passes. Returning these matrices substantially increased memory usage when running many estimation and/or forecasting passes, which was unintentional. To include particle history matrices in the results, change the value of the new setting
Breaking change: structured types are now used for particle filter events, and event handlers must be updated accordingly. See the pypfilt.event module for details.
pypfilt.model.OdeModel now reads custom solver options from the new setting
Enhancement: the post-regularisation kernel bandwidth can now be adjusted by changing the value of
Enhancement: images produced by pypfilt.plot.Plot should now be reproducible. Previously, these images included the matplotlib version in their metadata.
pypfilt.io.write_table no longer requires a time scale, and can instead convert time values into strings using
pypfilt.io.load_dataset now supports loading subsets of datasets.
Enhancement: add a new convenience function pypfilt.io.load_summary_table for loading summary tables.
pypfilt.io.load_summary_table now accept a time scale, a scenario instance, or a simulation context as their first argument, rather than requiring a time scale.
pypfilt.adaptive_fit() now return the simulation results as structured types, rather than plain dictionaries. See the online documentation for details.
'time' is now used instead of
'fs_date') to identify observation times, time values in summary tables, etc. Summary tables, summary monitors, and observation models must be updated.
Model.update() now receives a TimeStep argument instead of the time-step end and step size arguments. All simulation models that implement this method must be updated.
Breaking change: changed how pypfilt.stats.qtl_wt calculates quantiles, to avoid under-estimating credible interval bounds. Previously, this function was a faithful re-implementation of the wtd.quantile() function in the Hmisc R package. However, this included a long-standing bug that produced the following behaviour:
import numpy as np
import pypfilt.stats
x = np.array([1, 2])
weights = np.array([0.1, 0.9])
# Returns [2.0, 2.0].
pypfilt.stats.qtl_wt(x, weights, probs=[0, 0.05])
This affects only the following summary tables in
While some of the credible intervals produced by these tables may differ from those produced by earlier versions of pypfilt, the absolute differences should be very small. Accordingly, while this may affect the reproducibility of regression tests, it should not affect the meaning or interpretation of the outputs.
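To make the quantile behaviour above concrete, here is a minimal sketch of a weighted quantile based on a simple inverse-CDF rule. It is not pypfilt's implementation: the helper name `weighted_quantiles` and the choice of rule are assumptions made for illustration only. Under this rule, the lower quantiles of the example data return the smallest value rather than 2.0.

```python
import numpy as np


def weighted_quantiles(x, weights, probs):
    """Return weighted quantiles using a simple inverse-CDF rule.

    Hypothetical helper for illustration; not part of pypfilt.
    """
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Sort the values and accumulate their normalised weights.
    order = np.argsort(x)
    x_sorted = x[order]
    cum_w = np.cumsum(weights[order]) / np.sum(weights)
    # For each probability, find the first value whose cumulative weight
    # is at least that probability.
    ixs = np.searchsorted(cum_w, probs, side='left')
    # Guard against round-off pushing an index past the final value.
    ixs = np.minimum(ixs, len(x_sorted) - 1)
    return x_sorted[ixs]


x = np.array([1, 2])
weights = np.array([0.1, 0.9])
quantiles = weighted_quantiles(x, weights, probs=[0, 0.05, 0.5])
print(quantiles)
```

With this rule, the 0% and 5% quantiles both fall on the value that carries the first 10% of the weight.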
Breaking change: summary table and monitor methods no longer receive the
end_date arguments. The simulation start and end times can be retrieved with
Context.end_time(), respectively, and Context.summary_times() now defines the times at which summary statistics will be calculated. All summary tables and summary monitors must be updated.
Bug fix: allow forecasting from the start of the simulation period, rather than raising an exception. This addresses an edge case that may not be particularly useful, but which might conceivably occur.
Enhancement: allow the particle ensemble to be divided into separate partitions. This can be used to, e.g., maintain an invariant distribution for a subset of model parameters.
Enhancement: the new summary tables
pypfilt.summary.PartitionPredictiveCIs record separate credible intervals for each partition in the ensemble.
Enhancement: add support for reservoir partitions, which are never reweighted or resampled. This can be used to preserve a representative sample of the model prior distribution, or to provide candidate particles when resampling non-reservoir partitions.
Enhancement: ignore observations that result in zero net weight. When conditioning on observation(s) at a given time would result in all particles having zero weight, the new default behaviour is to ignore these problematic observations and retain the previous weights. The original behaviour (raising an exception) can be recovered by setting
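The zero-net-weight behaviour described above can be sketched with plain NumPy. This is an illustration under stated assumptions, not pypfilt's code: the function `reweight` and its `ignore_zero_net_weight` flag are hypothetical names, and the sketch assumes the weights are updated by multiplying by the likelihoods and renormalising.

```python
import numpy as np


def reweight(weights, log_llhds, ignore_zero_net_weight=True):
    """Update particle weights, optionally ignoring fatal observations.

    Hypothetical helper for illustration; not part of pypfilt.
    """
    weights = np.asarray(weights, dtype=float)
    log_llhds = np.asarray(log_llhds, dtype=float)
    # Multiply each weight by the likelihood of the observation(s).
    new_weights = weights * np.exp(log_llhds)
    net_weight = np.sum(new_weights)
    if net_weight > 0 and np.isfinite(net_weight):
        return new_weights / net_weight
    if ignore_zero_net_weight:
        # Every particle has zero weight: retain the previous weights.
        return weights.copy()
    raise ValueError('Observation(s) resulted in zero net weight')


previous = np.array([0.5, 0.5])
kept = reweight(previous, [-np.inf, -np.inf])
updated = reweight(previous, [0.0, -np.inf])
print(kept, updated)
```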
Enhancement: add adaptive fitting methods that run a series of estimation passes where the observation models are tuned or scaled in each pass. See the pypfilt.adaptive_fit documentation for details.
Enhancement: snapshots can now be sliced to capture a subset of the particles. As a consequence, the ixs argument has been removed from observation model methods, because the caller can slice snapshots instead.
Enhancement: add support for resampling the particles before each forecasting pass. Set "filter.resample.before_forecasting" to true to enable this behaviour.
Enhancement: summary statistics can now be recorded more than once per unit time, by changing the value of the new setting
pypfilt.summary.SimulatedObs summary table now supports controlling the number of particles for which observations are simulated, and the number of observations per particle.
pypfilt.simulate_from_model() now allows the user to define how many observations to simulate for each particle.
pypfilt.model.Model now provides default implementations of the
pypfilt.obs.Univariate now stores the observation model settings dictionary in self.settings, so that sub-classes can access these settings without defining their own
Enhancement: parameter-free observation models no longer require an empty parameters table in the scenario definition.
Enhancement: state vector fields can be regularised without lower or upper bounds.
scatter(), which allows the marker size to be controlled by the
Time.with_observation_tables() generator now supports multiple observations from each stream at a single time-step, as appropriate. If there are multiple observations from a stream that pertain to the same time-step, but which have different observation dates, a warning will be raised that smaller time-steps should be considered.
pypfilt.simulate_from_model() now returns observation tables that do not include an
pypfilt.io.write_table() function to save data tables to plain-text files with column headers. This can be used for, e.g., saving simulated observations so that these observations can then be used for fitting and forecasting.
pypfilt.time.Datetime supports custom date-time format strings, which can be defined in the scenario settings, and can be temporarily overridden with the
pypfilt.model.OdeModel class for ODE models, which provides a convenient wrapper around scipy.integrate.solve_ivp(), allowing right-hand side functions to work with structured state vector arrays.
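The general idea of letting a right-hand side function work with structured arrays while scipy.integrate.solve_ivp() sees a flat vector can be sketched as follows. This does not use or reproduce the OdeModel interface: the field names, the helper functions, and the example system (a simple rotation) are all assumptions chosen for illustration.

```python
import numpy as np
from numpy.lib import recfunctions as rfn
from scipy.integrate import solve_ivp

# Hypothetical state vector with two named fields.
STATE_DTYPE = np.dtype([('x', float), ('y', float)])


def to_flat(state):
    """Convert a structured array of shape (1,) into a flat float vector."""
    return rfn.structured_to_unstructured(state).reshape(-1)


def to_structured(flat):
    """Convert a flat float vector into a structured array of shape (1,)."""
    values = np.array(flat, dtype=float).reshape(1, -1)
    return rfn.unstructured_to_structured(values, dtype=STATE_DTYPE)


def rhs_structured(time, state):
    """Right-hand side written in terms of field names: x' = -y, y' = x."""
    rates = np.zeros(state.shape, dtype=state.dtype)
    rates['x'] = -state['y']
    rates['y'] = state['x']
    return rates


def rhs_flat(time, flat):
    """Adapter that solve_ivp() can call with a flat vector."""
    return to_flat(rhs_structured(time, to_structured(flat)))


initial = np.array([(1.0, 0.0)], dtype=STATE_DTYPE)
solution = solve_ivp(rhs_flat, (0.0, np.pi), to_flat(initial),
                     rtol=1e-8, atol=1e-10)
final = to_structured(solution.y[:, -1])
print(final['x'][0], final['y'][0])
```

The exact solution is x = cos(t) and y = sin(t), so the final state should be close to x = -1 and y = 0.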
pypfilt.examples.lorenz.Lorenz63 model of the Lorenz-63 system.
Enhancement: suppress spurious runtime warnings when calculating quantiles. Evaluating the percent point function for an observation model can cause run-time warnings when there are outlier particles, even though the returned values are valid. We now suppress these warnings and explicitly check that the returned values are finite.
Bug fix: if the output directory does not exist and multiple simulations are running in parallel, there was a potential race condition where multiple simulations would attempt to create the output directory, causing one or more simulations to raise a FileExistsError. We now prevent os.makedirs() from raising this error by passing
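One standard way to create a directory safely when several processes may race to create it is the `exist_ok` argument of os.makedirs(). The sketch below shows that approach. Whether this is the exact argument pypfilt passes is an assumption, and the helper name `ensure_dir` is hypothetical.

```python
import os
import tempfile


def ensure_dir(path):
    """Create a directory, tolerating the case where it already exists.

    Hypothetical helper for illustration; not part of pypfilt.
    """
    # With exist_ok=True, no FileExistsError is raised if another process
    # created the directory first.
    os.makedirs(path, exist_ok=True)


with tempfile.TemporaryDirectory() as tmp_dir:
    out_dir = os.path.join(tmp_dir, 'output', 'forecasts')
    ensure_dir(out_dir)
    # A second call mimics a competing simulation and must not fail.
    ensure_dir(out_dir)
    created = os.path.isdir(out_dir)
print(created)
```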
Enhancement: suppress spurious runtime warnings when calculating quantiles. Evaluating the weighted CDF for an observation model can cause run-time warnings when there are outlier particles, even though the returned values are valid. We now suppress these warnings and explicitly check that the returned values are finite.
Documentation: explain how to read prior samples from external data files. Samples can be read from space-delimited text files, and from HDF5 datasets.
This release introduces major improvements and simplifications. Similar to the 0.6.0 release, this involves some structural changes and breaks backwards compatibility with earlier releases. Key changes include:
Breaking change: require Python 3.7 or newer.
Breaking change: the particle history matrix and particle states are now represented using structured NumPy arrays. All simulation models must be updated.
Breaking change: the observation model interface has been simplified, and observation models are no longer passed a separate parameters dictionary. All observation models must be updated.
Breaking change: observations and simulated observations are now stored in structured arrays, rather than as lists of dictionaries. All observation models must be updated.
Breaking change: summary tables no longer need to (de)serialise time and string values, but should instead identify these columns with
pypfilt.io.string_field(). All summary tables must be updated.
Breaking change: particle states are now provided as snapshots to summary tables and monitors, and to observation models. All summary tables, summary monitors, and observation models must be updated.
Breaking change: major changes to forecast scenario definitions in TOML files; many tables and settings have been moved and/or renamed. All TOML files must be updated.
Breaking change: in output HDF5 files, summary tables are now saved in the “tables” group, rather than in the “data” group.
pypfilt.sweep have been removed. Use pypfilt.load_instances() to iterate over scenarios.
pypfilt.obs.Univariate class greatly simplifies implementing new observation models.
Enhancement: add support for reading prior samples from plain-text data files and from HDF5 datasets.
Enhancement: PRNG states are now cached, so that outputs are identical whether or not a forecast begins from a cached state.
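The principle behind caching PRNG states can be sketched with NumPy's Generator interface: saving the bit generator state and restoring it later reproduces the same random stream. How pypfilt stores and restores these states is not described here, so the variable names and the workflow below are assumptions.

```python
import numpy as np

# Pretend the estimation pass consumes some random numbers.
rng = np.random.default_rng(seed=12345)
rng.random(5)

# Cache the PRNG state at the point where a forecast would begin.
cached_state = rng.bit_generator.state

# Forecast that continues directly from the estimation pass.
forecast_direct = rng.random(3)

# Forecast that begins from the cached state, using a new generator.
rng_restored = np.random.default_rng()
rng_restored.bit_generator.state = cached_state
forecast_from_cache = rng_restored.random(3)

print(np.array_equal(forecast_direct, forecast_from_cache))
```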
Enhancement: add support for parameter-free and state-free models.
Enhancement: add support for “mini-steps” with the pypfilt.model.ministeps decorator, which can greatly reduce the size of the history matrix.
Enhancement: record a greater number of credible intervals by default.
Enhancement: add support for measuring forecast accuracy with CRPS.
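For readers unfamiliar with CRPS (the continuous ranked probability score), the sketch below estimates it for a weighted ensemble using the identity CRPS = E|X - y| - 0.5 E|X - X'|. This is a generic estimator and is not claimed to be the one pypfilt uses. The function name is hypothetical.

```python
import numpy as np


def crps_weighted_ensemble(values, weights, obs):
    """Estimate the CRPS of a weighted ensemble for one observation.

    Hypothetical helper for illustration; not part of pypfilt.
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / np.sum(weights)
    # Expected absolute error between the ensemble and the observation.
    term_obs = np.sum(weights * np.abs(values - obs))
    # Expected absolute difference between two independent ensemble draws.
    diffs = np.abs(values[:, None] - values[None, :])
    term_spread = np.sum(weights[:, None] * weights[None, :] * diffs)
    return term_obs - 0.5 * term_spread


point_forecast = crps_weighted_ensemble([2.0], [1.0], obs=5.0)
two_particles = crps_weighted_ensemble([0.0, 2.0], [0.5, 0.5], obs=1.0)
print(point_forecast, two_particles)
```

For a single-particle ensemble the score reduces to the absolute error, and lower scores indicate better forecasts.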
Enhancement: add support for saving the particle history matrix and the back-cast matrix.
Enhancement: add summary tables for calculating back-cast statistics.
pyproject.toml (PEPs 517 and 518).
h5py requirement to ensure that version 2.x is installed.
This release introduces major structural changes to the entire package, and
incorporates a number of features that were originally implemented in the
Please see the online documentation for further details.
The major user-facing changes are:
Breaking change: drop support for Python 2, require Python 3.6 or newer.
Breaking change: forecast scenarios are now defined in TOML files.
Bug fix: ensure that pypfilt.step records the true start of the simulation period, if it has not already been defined.
pypfilt.run now returns the current index into the history matrix, which allows repeat calls to pypfilt.run to be chained together. This may be of use when, e.g., generating a sequence of forecasts where each forecast is sufficiently short that it will not cause the simulation window to move past the end of the previous estimation run.
Ensure the documentation builds correctly on Read The Docs.
Bug fix: ensure the true start of the simulation period is always recorded.
Enhancement: record the true start of the simulation period, so that even if the estimation run or forecasting run begins at a later date, the true start is available (
Enhancement: axis and series labels can now be defined by arbitrary functions.
pypfilt.plot.series now support string scales.
pypfilt.check module provides convenience functions for checking invariants. Currently, it is able to check the history matrix dimensions. See the API documentation for further details.
Enhancement: add instructions for install
Enhancement: provide example commands for the release process.
Bug fix: make pypfilt.examples a valid Python module.
Bug fix: fix the Lotka-Volterra model in pypfilt.examples.predation to work correctly with scalar and non-scalar time scales.
Bug fix: correctly generate summaries for the case where no table rows will be generated. This bug was introduced in pypfilt 0.5.0 (commit
Breaking change: the base model class has been renamed to
Breaking change: the base model class has been simplified; the
param_bounds methods have been replaced by a single method, describe. This method also defines, for each element of the state vector, whether that element can be sampled continuously (e.g., by the post-regularised filter).
pypfilt.summary.HDF5 no longer creates a table of observations if no such table has been defined, since it may be desirable to store observations in multiple tables (e.g., grouped by source or observation unit). To retain the previous behaviour, add the new observations table pypfilt.summary.Obs to the summary object.
Breaking change: particle weights are now passed as an additional argument to the log-likelihood function. Previously, the log-likelihood function was inspected to determine whether it accepted an extra argument (a nasty hack).
Bug fix: avoid raising an exception when
False (this was the intended behaviour in previous versions).
Bug fix: ensure that pypfilt.summary.obs_table correctly encodes the observation source and units.
Bug fix: correct an off-by-one error in pypfilt.stats.qtl_wt that caused the weighted quantiles to be calculated incorrectly. The calculation error was inversely proportional to the number of particles and should be negligible for any reasonable number of particles (e.g., one thousand or more).
Enhancement: custom simulation time scales are supported. Two time scales are provided (
pypfilt.Scalar) and additional time scales can be implemented by inheriting from
Enhancement: allow likelihoods to depend on past states by setting params['last_n_periods'] to N > 1, so that the current observation period can be compared to previous observation periods.
Enhancement: monitor states are now cached and restored, allowing them to calculate statistics over the combined estimation and forecasting runs. This means that, e.g., peak times and sizes are correctly reported even if they occurred prior to the forecasting date.
Enhancement: add conversion functions for manipulating individual columns in structured arrays.
Enhancement: plotting functions are provided by a new module, pypfilt.plot (adding an optional dependency on matplotlib).
Enhancement: provide a base class for simulation metadata (
Enhancement: the (continuous) Lotka-Volterra equations are provided as an example in pypfilt.examples.predation and act as the example system in the documentation.
pypfilt.summary.dtype_names_to_str now also accepts fields as a list of field names (i.e., strings).
Enhancement: test cases for several modules are now provided in ./tests and can be run with tox.
Enhancement: document how to install required packages as wheels, avoiding lengthy compilation times.
Enhancement: document the release process and provide instructions for uploading packages to PyPI.
Bug fix: correct the basic resampling method. Previously, random samples were drawn from the unit interval and were erroneously assumed to be in sorted order (as is the case for the stratified and deterministic methods).
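The role of sorted samples in this fix can be sketched as follows. The code is an illustration, not pypfilt's resampling routine: the function name and its arguments are hypothetical. It selects particles with a single pass over the cumulative weights, which is only valid when the samples are in ascending order. The stratified and deterministic samples are ordered by construction, whereas the basic method must sort its samples explicitly.

```python
import numpy as np


def resample_indices(weights, rng, method='basic'):
    """Return the indices of the particles selected by resampling.

    Hypothetical helper for illustration; not part of pypfilt.
    """
    weights = np.asarray(weights, dtype=float)
    n = len(weights)
    if method == 'basic':
        # Independent uniform samples are NOT ordered, so sort them.
        samples = np.sort(rng.random(n))
    elif method == 'stratified':
        # One sample in each of n equal strata; ordered by construction.
        samples = (np.arange(n) + rng.random(n)) / n
    elif method == 'deterministic':
        # A single random offset shared by all strata; also ordered.
        samples = (np.arange(n) + rng.random()) / n
    else:
        raise ValueError('Unknown method: {}'.format(method))
    cum_w = np.cumsum(weights / np.sum(weights))
    indices = np.zeros(n, dtype=int)
    i = 0
    for j in range(n):
        # This pointer only moves forward, which relies on sorted samples.
        while i < n - 1 and samples[j] >= cum_w[i]:
            i += 1
        indices[j] = i
    return indices


rng = np.random.default_rng(seed=42)
selected = resample_indices([0.0, 0.0, 1.0, 0.0], rng, method='basic')
print(selected)
```

If the basic samples were left unsorted, a small sample that followed a large one could never select an earlier particle, which would bias the result.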
Enhancement: automatically convert Unicode field names to native strings when using Python 2, to prevent NumPy from throwing a TypeError, as may occur when using from __future__ import unicode_literals.
This functionality is provided by
Enhancement: ensure that temporary files are deleted when the simulation process is terminated by the SIGTERM signal.
Previously, they were only deleted upon normal termination (as noted in the atexit documentation).
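A common way to make atexit handlers run on SIGTERM is to install a signal handler that raises SystemExit, so that the interpreter shuts down normally. The sketch below shows this pattern using only the standard library. It is not pypfilt's code, and the function names are hypothetical.

```python
import atexit
import os
import signal
import sys
import tempfile


def remove_file(path):
    """Delete a temporary file if it still exists."""
    if os.path.exists(path):
        os.remove(path)


def exit_on_sigterm(signum, frame):
    """Convert SIGTERM into a normal exit, so atexit handlers are run."""
    sys.exit(128 + signum)


# Create a temporary file and register it for deletion at exit.
handle, tmp_path = tempfile.mkstemp()
os.close(handle)
atexit.register(remove_file, tmp_path)

# Without this handler, SIGTERM terminates the process immediately and
# the function registered with atexit is never called.
signal.signal(signal.SIGTERM, exit_on_sigterm)
```

Note that signal.signal() must be called from the main thread of the main interpreter.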
Enhancement: consistently separate Unicode strings from bytes, and provide utility functions in the
Enhancement: forecast from the most recent known-good cached state, avoiding the estimation pass whenever possible.
Enhancement: allow the observation table to be generated externally. This means that users can include additional columns as needed.
Enhancement: separate the calculation of log-likelihoods from the adjustment of particle weights, resulting in the new function
Enhancement: provide particle weights to the log-likelihood function, if the log-likelihood function accepts an extra argument. This has no impact on existing log-likelihood functions.
Enhancement: by default, allow simulations to continue if regularisation fails. This behaviour can be changed:
params['resample']['regularise_or_fail'] = True
pypfilt.forecast will raise an exception if no forecasting dates are provided.
Add installation instructions for Red Hat Enterprise Linux, Fedora, and Mac OS X (using Homebrew).
Enhancement: allow forecasts to resume from cached states, greatly improving the speed with which forecasts can be generated when new or updated observations become available. This is enabled by defining a cache file:
params['hist']['cache_file'] = 'cache.hdf5'
Enhancement: add option to restrict summary statistics to forecasting simulations, ignoring the initial estimation run. This is enabled by passing only_fs=True as an argument to the
Breaking change: require models to define default parameter bounds by implementing the
Enhancement: offer the post-regularised particle filter (post-RPF) as an alternative means of avoiding particle impoverishment (as opposed to incorporating stochastic noise into the model equations). This is enabled by setting:
params['resample']['regularisation'] = True
See the example script (./doc/example/run.py) for a demonstration.
Improved documentation for pypfilt.model.Base and summary statistics.
Add documentation for installing in a virtual environment.
This release includes a complete overhaul of simulation metadata and summary statistics. See ./doc/example/run.py for an overview of these changes.
Breaking change: decrease the default resampling threshold from 75% to 25%.
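To illustrate what a resampling threshold of this kind typically controls, the sketch below triggers resampling when the effective number of particles falls below a fraction of the ensemble size. That interpretation of the threshold is an assumption, and the function name is hypothetical.

```python
import numpy as np


def needs_resampling(weights, threshold=0.25):
    """Return True if the effective sample size is below the threshold.

    Hypothetical helper for illustration; not part of pypfilt.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / np.sum(weights)
    # Effective number of particles: N for uniform weights, and 1 when a
    # single particle carries all of the weight.
    n_effective = 1.0 / np.sum(weights ** 2)
    return bool(n_effective < threshold * len(weights))


uniform = needs_resampling(np.full(8, 1 / 8))
degenerate = needs_resampling([1.0, 0, 0, 0, 0, 0, 0, 0])
print(uniform, degenerate)
```

A lower threshold means that resampling is triggered less often, because the weights must become more uneven before the check succeeds.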
Breaking change: define base classes for summary statistics and output.
Breaking change: define a base class for simulation models.
Breaking change: collate the resampling and history matrix parameters to reduce clutter.
Breaking change: move
Bug fix: prevent stats.cov_wt from mutating the history matrix.
Bug fix: ensure that the time-step mapping behaves as documented.
Bug fix: ensure that state vector slices have correct dimensions.
Enhancement: ensure that forecasting dates lie within the simulation period.
Performance improvement: vectorise the history matrix initialisation.
Host the documentation at Read The Docs.
Notify models whether the current simulation is a forecast (i.e., if there are no observations). This allows deterministic models to add noise when estimating, to allow identical particles to differ in their behaviour, and to avoid doing so when forecasting.
Note that this is a breaking change, as it alters the parameters passed to the model update function.
Simplify the API for running a single simulation; pypfilt.set_limits has been removed and pypfilt.Time is not included in the API documentation, on the grounds that users should not need to make use of this class.
Greater use of NumPy array functions, removing the dependency on six >= 1.7.
Minor corrections to the example script (
Avoid error messages if no logging handler is configured by the application.
Use a relative path for the output directory. This makes simulation metadata easier to reproduce, since the absolute path of the output directory is no longer included in the output file.
Build a universal wheel via python setup.py bdist_wheel, which supports both Python 2 and Python 3.
Make the output directory a simulation parameter (out_dir) so that it can be changed without affecting the working directory, and vice versa.