Change Log

0.8.2 (2024-04-03)

Enhancement: pypfilt.cache.load_state() now issues a UserWarning when non-finite values are found in a cached state, or in the simulation context. These warnings are intended to flag invalid data. Observations and lookup table values should be finite, and non-finite values are not equal to themselves, so they also interfere with the caching system. Note that this does not change the value returned by pypfilt.cache.load_state().
Enhancement: add a new convenience function pypfilt.examples.lorenz.save_lorenz63_scenario_files() to save all of the example Lorenz-63 scenario files in the working directory.
Documentation: demonstrate how to evaluate forecast performance with pypfilt.crps.simulated_obs_crps().
Documentation: demonstrate how to use cache files to reuse particle state vectors.
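
The reason non-finite values interfere with caching can be shown in a small standalone sketch. It uses only the standard library and is not pypfilt's implementation; the helper names all_finite, same_values, and check_table are hypothetical. NaN never compares equal to itself, so an equality check between cached and current data fails even when the data are unchanged. As with the warning described in this release, check_table warns without altering the value it returns.

```python
import math
import warnings


def all_finite(values):
    """Return True if every value in an iterable of floats is finite."""
    return all(math.isfinite(v) for v in values)


def same_values(cached, current):
    """Naive element-wise equality check between two sequences of floats."""
    return len(cached) == len(current) and all(
        a == b for a, b in zip(cached, current)
    )


def check_table(name, values):
    """Warn (without raising) if the values include NaN or infinity."""
    if not all_finite(values):
        warnings.warn(f"{name} contains non-finite values", UserWarning)
    # The data are returned unchanged; the warning is purely informative.
    return values


lookup = [1.0, float("nan"), 3.0]
# An unchanged copy of the data no longer matches once it contains NaN.
matches_itself = same_values(lookup, list(lookup))
```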

0.8.1 (2023-10-13)

Breaking change: pypfilt.sampler.Independent has been removed. Use pypfilt.sampler.LatinHypercube instead. Note that this may require changing distribution names and parameters.
Breaking change: pypfilt no longer includes the particle history matrix when returning the results of fitting and forecasting passes. Returning these matrices substantially increased memory usage when running many estimation and/or forecasting passes, which was unintentional. To include particle history matrices in the results, change the value of the new setting filter.results.save_history to true.
Breaking change: structured types are now used for particle filter events, and event handlers must be updated accordingly. See the pypfilt.event module for details.
Enhancement: pypfilt.model.OdeModel now reads custom solver options from the new setting model.ode_solver_options.
Enhancement: the post-regularisation kernel bandwidth can now be adjusted by changing the value of "filter.regularisation.bandwidth_scale".
Enhancement: images produced by pypfilt.plot.Plot should now be reproducible. Previously, these images included the matplotlib version in their metadata.
Enhancement: pypfilt.io.write_table no longer requires a time scale, and can instead convert time values into strings using str().
Enhancement: pypfilt.io.load_dataset now supports loading subsets of datasets.
Enhancement: add a new convenience function pypfilt.io.load_summary_table for loading summary tables.
Enhancement: pypfilt.io.load_dataset and pypfilt.io.load_summary_table now accept a time scale, a scenario instance, or a simulation context as their first argument, rather than requiring a time scale.

0.8.0 (2022-11-03)

Breaking change: pypfilt.forecast(), pypfilt.fit(), and pypfilt.adaptive_fit() now return the simulation results as structured types, rather than plain dictionaries. See the online documentation for details.
Breaking change: 'time' is now used instead of 'date' (and 'fs_time' instead of 'fs_date') to identify observation times, time values in summary tables, etc. Summary tables, summary monitors, and observation models must be updated.
Breaking change: Model.update() now receives a TimeStep argument instead of the time-step end and step size arguments. All simulation models that implement this method must be updated.
Breaking change: changed how pypfilt.stats.qtl_wt calculates quantiles, to avoid under-estimating credible interval bounds. Previously, this function was a faithful re-implementation of the wtd.quantile() function in the Hmisc R package. However, this included a long-standing bug that produced the following behaviour:
```
import numpy as np
import pypfilt.stats
x = np.array([1, 2])
weights = np.array([0.1, 0.9])
# Returns [2.0, 2.0].
pypfilt.stats.qtl_wt(x, weights, probs=[0, 0.05])
```
This affects only the following summary tables in pypfilt and epifx:
- pypfilt.summary.ModelCIs;
- epifx.summary.ExpectedObs; and
- epifx.summary.PeakForecastCIs.
While some of the credible intervals produced by these tables may differ from those produced by earlier versions of pypfilt, the absolute differences should be very small. Accordingly, while this may affect the reproducibility of regression tests, it should not affect the meaning or interpretation of the outputs.
Breaking change: summary table and monitor methods no longer receive the n_days, start_date, and end_date arguments. The simulation start and end times can be retrieved with Context.start_time() and Context.end_time(), respectively, and Context.summary_times() now defines the times at which summary statistics will be calculated. All summary tables and summary monitors must be updated.
Bug fix: allow forecasting from the start of the simulation period, rather than raising an exception. This addresses an edge case that may not be particularly useful, but which might conceivably occur.
Enhancement: allow the particle ensemble to be divided into separate partitions. This can be used to, e.g., maintain an invariant distribution for a subset of model parameters.
Enhancement: the new summary tables pypfilt.summary.PartitionModelCIs and pypfilt.summary.PartitionPredictiveCIs record separate credible intervals for each partition in the ensemble.
Enhancement: add support for reservoir partitions, which are never reweighted or resampled. This can be used to preserve a representative sample of the model prior distribution, or to provide candidate particles when resampling non-reservoir partitions.
Enhancement: ignore observations that result in zero net weight. When conditioning on observation(s) at a given time would result in all particles having zero weight, the new default behaviour is to ignore these problematic observations and retain the previous weights. The original behaviour (raising an exception) can be recovered by setting "filter.reweight_or_fail" to true.
Enhancement: add adaptive fitting methods that run a series of estimation passes where the observation models are tuned or scaled in each pass. See the pypfilt.adaptive_fit documentation for details.
Enhancement: snapshots can now be sliced to capture a subset of the particles. As a consequence, the ixs argument has been removed from observation model methods, because the caller can slice snapshots instead.
Enhancement: add support for resampling the particles before each forecasting pass. Set "filter.resample.before_forecasting" to true to enable this behaviour.
Enhancement: summary statistics can now be recorded more than once per unit time, by changing the value of the new setting "time.summaries_per_unit".
Enhancement: the pypfilt.summary.SimulatedObs summary table now supports controlling the number of particles for which observations are simulated, and the number of observations per particle.
Enhancement: pypfilt.simulate_from_model() now allows the user to define how many observations to simulate for each particle.
Enhancement: pypfilt.model.Model now provides default implementations of the init() and can_smooth() methods.
Enhancement: pypfilt.obs.Univariate now stores the observation model settings dictionary in self.settings, so that sub-classes can access these settings without defining their own __init__() method.
Enhancement: parameter-free observation models no longer require an empty parameters table in the scenario definition.
Enhancement: state vector fields can be regularised without lower or upper bounds.
Enhancement: pypfilt.plot.observations() now uses scatter(), which allows the marker size to be controlled by the "s" keyword argument.
Enhancement: the Time.with_observation_tables() generator now supports multiple observations from each stream at a single time-step, as appropriate. If there are multiple observations from a stream that pertain to the same time-step, but which have different observation dates, a warning will be raised that smaller time-steps should be considered.
Enhancement: pypfilt.simulate_from_model() now returns observation tables that do not include an 'fs_date' (now 'fs_time') column.
Enhancement: new pypfilt.io.write_table() function to save data tables to plain-text files with column headers. This can be used for, e.g., saving simulated observations so that these observations can then be used for fitting and forecasting.
Enhancement: pypfilt.time.Datetime supports custom date-time format strings, which can be defined in the scenario settings, and can be temporarily overridden with the custom_format_strings() method.
Enhancement: new pypfilt.model.OdeModel class for ODE models, which provides a convenient wrapper around scipy.integrate.solve_ivp(), allowing right-hand side functions to work with structured state vector arrays.
Enhancement: new pypfilt.examples.lorenz.Lorenz63 model of the Lorenz-63 system.
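
For readers unfamiliar with the Lorenz-63 system, the following standalone sketch integrates its equations with a fixed-step fourth-order Runge-Kutta scheme. It does not use pypfilt.model.OdeModel or scipy.integrate.solve_ivp(). The function names, the step size, and the classic parameter values (sigma = 10, rho = 28, beta = 8/3) are assumptions chosen for illustration and may differ from those used by pypfilt.examples.lorenz.Lorenz63.

```python
def lorenz63_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Return (dx/dt, dy/dt, dz/dt) for the Lorenz-63 system."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(rhs, state, dt):
    """Advance the state by one fixed-size fourth-order Runge-Kutta step."""

    def shift(base, slope, scale):
        # Move each component of `base` along `slope` by `scale`.
        return tuple(b + scale * s for b, s in zip(base, slope))

    k1 = rhs(state)
    k2 = rhs(shift(state, k1, dt / 2))
    k3 = rhs(shift(state, k2, dt / 2))
    k4 = rhs(shift(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state, dt=0.01, n_steps=1000):
    """Return the trajectory as a list of (x, y, z) tuples."""
    trajectory = [state]
    for _ in range(n_steps):
        state = rk4_step(lorenz63_rhs, state, dt)
        trajectory.append(state)
    return trajectory


trajectory = simulate((1.0, 1.0, 1.0))
```

A fixed-step integrator keeps the sketch free of dependencies; an adaptive solver such as scipy.integrate.solve_ivp() is generally the better choice in practice.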

0.7.2 (2022-06-03)

Enhancement: suppress spurious runtime warnings when calculating quantiles. Evaluating the percent point function for an observation model can cause run-time warnings when there are outlier particles, even though the returned values are valid. We now suppress these warnings and explicitly check that the returned values are finite.
Bug fix: if the output directory does not exist and multiple simulations are running in parallel, there was a potential race condition where multiple simulations would attempt to create the output directory, causing one or more simulations to raise a FileExistsError. We now prevent os.makedirs() from raising this error by passing exist_ok=True.
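
The race condition fixed in this release can be illustrated with a small standalone sketch. The helper ensure_output_dir is hypothetical, and threads stand in for the parallel simulations; only the os.makedirs() call with exist_ok=True reflects the change described above.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def ensure_output_dir(path):
    """Create the output directory; do nothing if it already exists."""
    # Without exist_ok=True, a worker that loses the race to create the
    # directory would raise FileExistsError.
    os.makedirs(path, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as base:
    out_dir = os.path.join(base, "output", "run")
    # Several workers attempt to create the same directory at once.
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(ensure_output_dir, [out_dir] * 8))
    created = os.path.isdir(out_dir)
```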

0.7.1 (2022-05-26)

Enhancement: suppress spurious runtime warnings when calculating quantiles. Evaluating the weighted CDF for an observation model can cause run-time warnings when there are outlier particles, even though the returned values are valid. We now suppress these warnings and explicitly check that the returned values are finite.
Documentation: explain how to read prior samples from external data files. Samples can be read from space-delimited text files, and from HDF5 datasets.

0.7.0 (2022-04-20)

This release introduces major improvements and simplifications. Similar to the 0.6.0 release, this involves some structural changes and breaks backwards compatibility with earlier releases. Key changes include:

Breaking change: require Python 3.7 or newer.
Breaking change: the particle history matrix and particle states are now represented using structured NumPy arrays. All simulation models must be updated.
Breaking change: the observation model interface has been simplified, and observation models are no longer passed a separate parameters dictionary. All observation models must be updated.
Breaking change: observations and simulated observations are now stored in structured arrays, rather than as lists of dictionaries. All observation models must be updated.
Breaking change: summary tables no longer need to (de)serialise time and string values, but should instead identify these columns with pypfilt.io.time_field() and pypfilt.io.string_field(). All summary tables must be updated.
Breaking change: particle states are now provided as snapshots to summary tables and monitors, and to observation models. All summary tables, summary monitors, and observation models must be updated.
Breaking change: major changes to forecast scenario definitions in TOML files; many tables and settings have been moved and/or renamed. All TOML files must be updated.
Breaking change: in output HDF5 files, summary tables are now saved in the “tables” group, rather than in the “data” group.
Breaking change: pypfilt.config, pypfilt.context, pypfilt.params, and pypfilt.sweep have been removed. Use pypfilt.load_instances() to iterate over scenarios.
Enhancement: the pypfilt.obs.Univariate class greatly simplifies implementing new observation models.
Enhancement: add support for reading prior samples from plain-text data files and from HDF5 datasets.
Enhancement: PRNG states are now cached, so that outputs are identical whether or not a forecast begins from a cached state.
Enhancement: add support for parameter-free and state-free models.
Enhancement: add support for “mini-steps” with the pypfilt.model.ministeps decorator, which can greatly reduce the size of the history matrix.
Enhancement: record a greater number of credible intervals by default.
Enhancement: add support for measuring forecast accuracy with CRPS.
Enhancement: add support for saving the particle history matrix and the back-cast matrix.
Enhancement: add summary tables for calculating back-cast statistics.
Migration from setup.py to pyproject.toml (PEPs 517 and 518).
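
As background for the CRPS support added in this release, the continuous ranked probability score of a forecast can be estimated from samples as E|X - y| - 0.5 E|X - X'|, where X and X' are independent draws from the forecast and y is the observed value. The sketch below assumes equally weighted samples and uses a hypothetical function name; it is not pypfilt's implementation, which may differ (for example, in how particles are weighted).

```python
def crps_from_samples(samples, observed):
    """Estimate the CRPS from equally weighted forecast samples."""
    n = len(samples)
    if n == 0:
        raise ValueError("at least one sample is required")
    # E|X - y|: mean absolute error of the samples against the observation.
    term_obs = sum(abs(x - observed) for x in samples) / n
    # E|X - X'|: mean absolute difference over all ordered pairs of samples.
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return term_obs - 0.5 * term_spread


# A sharp forecast centred on the observation scores zero; lower is better.
perfect = crps_from_samples([1.0, 1.0], 1.0)
# A single-sample forecast reduces to the absolute error.
point = crps_from_samples([2.0], 5.0)
```

The pairwise term costs O(n^2) operations, which is acceptable for a sketch but would usually be replaced by a sorted-sample formula for large ensembles.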

0.6.1 (2022-01-05)

Update the h5py requirement to ensure that version 2.x is installed.

0.6.0 (2020-08-12)

This release introduces major structural changes to the entire package, and incorporates a number of features that were originally implemented in the epifx package. Please see the online documentation for further details. The major user-facing changes are:

Breaking change: drop support for Python 2, require Python 3.6 or newer.
Breaking change: forecast scenarios are now defined in TOML files.

0.5.5 (2019-11-25)

Bug fix: ensure that pypfilt.step records the true start of the simulation period, if it has not already been defined.
Enhancement: pypfilt.run now returns the current index into the history matrix, which allows repeat calls to pypfilt.run to be chained together. This may be of use when, e.g., generating a sequence of forecasts where each forecast is sufficiently short that it will not cause the simulation window to move past the end of the previous estimation run.
Ensure the documentation builds correctly on Read The Docs.

0.5.4 (2017-10-26)

Bug fix: ensure the true start of the simulation period is always recorded.

0.5.3 (2017-10-26)

Enhancement: record the true start of the simulation period, so that even if the estimation run or forecasting run begins at a later date, the true start is available (params['epoch']).
Enhancement: axis and series labels can now be defined by arbitrary functions.
Enhancement: pypfilt.plot.series now supports string scales.
Enhancement: the pypfilt.check module provides convenience functions for checking invariants. Currently, it is able to check the history matrix dimensions. See the API documentation for further details.
Enhancement: add instructions for installing pypfilt with pip.
Enhancement: provide example commands for the release process.

0.5.2 (2017-05-05)

Bug fix: make pypfilt.examples a valid Python module.
Bug fix: fix the Lotka-Volterra model in pypfilt.examples.predation to work correctly with scalar and non-scalar time scales.

0.5.1 (2017-04-28)

Bug fix: correctly generate summaries for the case where no table rows will be generated. This bug was introduced in pypfilt 0.5.0 (commit 8a0a614).

0.5.0 (2017-04-26)

Breaking change: the base model class has been renamed to pypfilt.Model.
Breaking change: the base model class has been simplified; the state_info, param_info, and param_bounds methods have been replaced by a single method, describe. This method also defines, for each element of the state vector, whether that element can be sampled continuously (e.g., by the post-regularised filter).
Breaking change: pypfilt.summary.HDF5 no longer creates a table of observations if no such table has been defined, since it may be desirable to store observations in multiple tables (e.g., grouped by source or observation unit). To retain the previous behaviour, add the new observations table pypfilt.summary.Obs to the summary object.
Breaking change: particle weights are now passed as an additional argument to the log-likelihood function. Previously, the log-likelihood function was inspected to determine whether it accepted an extra argument (a nasty hack).
Bug fix: avoid raising an exception when regularise_or_fail is False (this was the intended behaviour in previous versions).
Bug fix: ensure that pypfilt.summary.obs_table correctly encodes the observation source and units.
Bug fix: correct an off-by-one error in pypfilt.stats.qtl_wt that caused the weighted quantiles to be calculated incorrectly. The calculation error was inversely proportional to the number of particles and should be negligible for any reasonable number of particles (e.g., one thousand or more).
Enhancement: custom simulation time scales are supported. Two time scales are provided (pypfilt.Datetime and pypfilt.Scalar) and additional time scales can be implemented by inheriting from pypfilt.time.Time.
Enhancement: allow likelihoods to depend on past states by setting params['last_n_periods'] to N > 1, so that the current observation period can be compared to previous observation periods.
Enhancement: monitor states are now cached and restored, allowing them to calculate statistics over the combined estimation and forecasting runs. This means that, e.g., peak times and sizes are correctly reported even if they occurred prior to the forecasting date.
Enhancement: add conversion functions for manipulating individual columns in structured arrays.
Enhancement: plotting functions are provided by a new module, pypfilt.plot (adding an optional dependency on matplotlib).
Enhancement: provide a base class for simulation metadata (pypfilt.summary.Metadata).
Enhancement: the (continuous) Lotka-Volterra equations are provided as an example in pypfilt.examples.predation and act as the example system in the documentation.
Enhancement: pypfilt.summary.dtype_names_to_str now also accepts fields as a list of field names (i.e., strings).
Enhancement: test cases for several modules are now provided in ./tests and can be run with tox.
Enhancement: document how to install required packages as wheels, avoiding lengthy compilation times.
Enhancement: document the release process and provide instructions for uploading packages to PyPI.

0.4.3 (2016-09-16)

Bug fix: correct the basic resampling method. Previously, random samples were drawn from the unit interval and were erroneously assumed to be in sorted order (as is the case for the stratified and deterministic methods).
Enhancement: automatically convert Unicode field names to native strings when using Python 2, to prevent NumPy from throwing a TypeError, as may occur when using from __future__ import unicode_literals. This functionality is provided by pypfilt.summary.dtype_names_to_str.
Enhancement: ensure that temporary files are deleted when the simulation process is terminated by the SIGTERM signal. Previously, they were only deleted upon normal termination (as noted in the atexit documentation).
Enhancement: consistently separate Unicode strings from bytes, and provide utility functions in the pypfilt.text module.
Enhancement: forecast from the most recent known-good cached state, avoiding the estimation pass whenever possible.
Enhancement: allow the observation table to be generated externally. This means that users can include additional columns as needed.
Enhancement: separate the calculation of log-likelihoods from the adjustment of particle weights, resulting in the new function pypfilt.log_llhd_of.
Enhancement: provide particle weights to the log-likelihood function, if the log-likelihood function accepts an extra argument. This has no impact on existing log-likelihood functions.
Enhancement: by default, allow simulations to continue if regularisation fails. This behaviour can be changed:
```
params['resample']['regularise_or_fail'] = True
```
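
The basic resampling fix in this release can be illustrated with a standalone sketch. It assumes that particles are selected by a single pass over the cumulative weights, which is only valid when the random draws are in ascending order; the function names are hypothetical and the actual pypfilt code may be organised differently. Stratified draws are ordered by construction, whereas independent draws must be sorted first.

```python
import random
from itertools import accumulate


def resample_indices(weights, uniforms):
    """Select particle indices in one pass over the cumulative weights.

    This single pass is only valid when `uniforms` is in ascending order.
    """
    cumulative = list(accumulate(weights))
    total = cumulative[-1]
    indices = []
    i = 0
    for u in uniforms:
        # Advance to the first particle whose cumulative weight exceeds u.
        while i < len(cumulative) - 1 and cumulative[i] <= u * total:
            i += 1
        indices.append(i)
    return indices


def basic_resample(weights, rng):
    """Basic resampling: independent draws, sorted before the pass."""
    uniforms = sorted(rng.random() for _ in weights)
    return resample_indices(weights, uniforms)


def stratified_resample(weights, rng):
    """Stratified resampling: one draw per stratum, already in order."""
    n = len(weights)
    uniforms = [(k + rng.random()) / n for k in range(n)]
    return resample_indices(weights, uniforms)


weights = [0.25, 0.25, 0.5]
# Sorted draws select the intended particles.
correct = resample_indices(weights, [0.1, 0.3, 0.6, 0.9])
# Unsorted draws do not: the draw of 0.1 should select particle 0, but the
# pass has already moved on to particle 2.
wrong = resample_indices(weights, [0.9, 0.1])
```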

0.4.2 (2016-06-16)

Breaking change: pypfilt.forecast will raise an exception if no forecasting dates are provided.
Add installation instructions for Red Hat Enterprise Linux, Fedora, and Mac OS X (using Homebrew).

0.4.1 (2016-04-26)

Enhancement: allow forecasts to resume from cached states, greatly improving the speed with which forecasts can be generated when new or updated observations become available. This is enabled by defining a cache file:
```
params['hist']['cache_file'] = 'cache.hdf5'
```
Enhancement: add option to restrict summary statistics to forecasting simulations, ignoring the initial estimation run. This is enabled by passing only_fs=True as an argument to the pypfilt.summary.HDF5 constructor.

0.4.0 (2016-04-22)

Breaking change: require models to define default parameter bounds by implementing the param_bounds method.
Enhancement: offer the post-regularised particle filter (post-RPF) as an alternative means of avoiding particle impoverishment (as opposed to incorporating stochastic noise into the model equations). This is enabled by setting:
```
params['resample']['regularisation'] = True
```
See the example script (./doc/example/run.py) for a demonstration.
Improved documentation for pypfilt.model.Base and summary statistics.
Add documentation for installing in a virtual environment.

0.3.0 (2016-02-23)

This release includes a complete overhaul of simulation metadata and summary statistics. See ./doc/example/run.py for an overview of these changes.
Breaking change: decrease the default resampling threshold from 75% to 25%.
Breaking change: define base classes for summary statistics and output.
Breaking change: define a base class for simulation models.
Breaking change: collate the resampling and history matrix parameters to reduce clutter.
Breaking change: move pypfilt.metadata_priors to pypfilt.summary.
Bug fix: prevent stats.cov_wt from mutating the history matrix.
Bug fix: ensure that the time-step mapping behaves as documented.
Bug fix: ensure that state vector slices have correct dimensions.
Enhancement: ensure that forecasting dates lie within the simulation period.
Performance improvement: vectorise the history matrix initialisation.
Host the documentation at Read The Docs.
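
To give a sense of what the change in the default resampling threshold means, the sketch below assumes that the threshold is compared against the effective sample size, 1 / sum(w_i^2) for normalised weights, expressed as a fraction of the number of particles. This is a common convention for particle filters, but the entry above does not spell it out, and the function names are hypothetical.

```python
def effective_fraction(weights):
    """Return the effective sample size as a fraction of the particle count."""
    total = sum(weights)
    normalised = [w / total for w in weights]
    ess = 1.0 / sum(w * w for w in normalised)
    return ess / len(weights)


def needs_resampling(weights, threshold=0.25):
    """Return True when the effective fraction falls below the threshold."""
    return effective_fraction(weights) < threshold


# One particle holds most of the weight: the effective fraction is about 0.48.
skewed = [0.7, 0.1, 0.1, 0.1]
at_old_default = needs_resampling(skewed, threshold=0.75)
at_new_default = needs_resampling(skewed, threshold=0.25)
```

Under these assumptions, the skewed ensemble would trigger resampling at a 75% threshold but not at a 25% threshold, so the lower default resamples less often.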

0.2.0 (2015-11-16)

Notify models whether the current simulation is a forecast (i.e., if there are no observations). This allows deterministic models to add noise when estimating, to allow identical particles to differ in their behaviour, and to avoid doing so when forecasting. Note that this is a breaking change, as it alters the parameters passed to the model update function.
Simplify the API for running a single simulation; pypfilt.set_limits has been removed and pypfilt.Time is not included in the API documentation, on the grounds that users should not need to make use of this class.
Greater use of NumPy array functions, removing the dependency on six >= 1.7.
Minor corrections to the example script (./doc/example/run.py).

0.1.2 (2015-06-08)

Avoid error messages if no logging handler is configured by the application.
Use a relative path for the output directory. This makes simulation metadata easier to reproduce, since the absolute path of the output directory is no longer included in the output file.
Build a universal wheel via python setup.py bdist_wheel, which supports both Python 2 and Python 3.

0.1.1 (2015-06-01)

Make the output directory a simulation parameter (out_dir) so that it can be changed without affecting the working directory, and vice versa.

0.1.0 (2015-05-29)

Initial release.