CosmoBolognaLib
Free Software C++/Python libraries for cosmological calculations
cbl::modelling::CombinedModelling Class Reference

The class CombinedModelling. More...

#include "Headers/CombinedModelling.h"

Inheritance diagram for cbl::modelling::CombinedModelling:
Collaboration diagram for cbl::modelling::CombinedModelling:

Public Member Functions

Constructors/destructors
 CombinedModelling ()=default
 default constuctor
 
 CombinedModelling (std::vector< std::shared_ptr< modelling::Modelling >> modelling, std::vector< std::string > repeated_par={}, const std::vector< std::vector< std::vector< int >>> common_repeated_par={})
 Constuctor used to combine statistically independent probes. More...
 
 CombinedModelling (std::vector< std::vector< std::shared_ptr< modelling::Modelling >>> modelling, const std::vector< std::shared_ptr< data::CovarianceMatrix >> covariance, const std::vector< cbl::statistics::LikelihoodType > likelihood_types, const std::vector< std::string > repeated_par={}, const std::vector< std::vector< std::vector< int >>> common_repeated_par={}, const std::vector< std::shared_ptr< cosmology::SuperSampleCovariance >> SSC={})
 Constructor used to set the modelling of statistically dependent probes. For each vector of cbl::statistics::Posterior objects in the posteriors argument, a cbl::data::CovarianceMatrix object must be defined. The probes in each vector within posteriors are described by the same likelihood function. The final log-likelihood is given by the sum of the logarithms of each likelihood describing a set of dependent probes. More...
 
virtual ~CombinedModelling ()=default
 default destructor
 
Member functions used to manage likelihood/posterior

distributions

void maximize_combined_posterior (const std::vector< double > start, const unsigned int max_iter=10000, const double tol=1.e-6, const double epsilon=1.e-3)
 function that maximizes the combined posterior, finds the best-fit parameters and stores them in the model More...
 
void sample_combined_posterior (const int chain_size, const int nwalkers, const double aa=2, const bool parallel=true)
 sample the posterior, initializing the chains by drawing from the prior distributions More...
 
void sample_combined_posterior (const int chain_size, const int nwalkers, const double radius, const std::vector< double > start, const unsigned int max_iter=10000, const double tol=1.e-6, const double epsilon=1.e-3, const double aa=2, const bool parallel=true)
 sample the posterior, initializing the chains in a ball around the posterior best-fit parameters values More...
 
void sample_combined_posterior (const int chain_size, const int nwalkers, const std::string input_dir, const std::string input_file, const int seed, const double aa=2, const bool parallel=true)
 sample the posterior, initializing the chains reading the input values from an input file More...
 
Member functions used for Input/Output
void write_combined_results (const std::string output_dir, const std::string root_file, const int start=0, const int thin=1, const int nbins=50, const bool fits=false, const bool compute_mode=false, const int ns=-1)
 write the results of the MCMC sampling to file More...
 
void write_model_from_combined_chain (const std::string output_dir, const std::string output_file, const int start, const int thin, const std::vector< double > xx={}, const std::vector< double > yy={})
 write the model computing 16th, 50th and 84th percentiles from the MCMC More...
 
- Public Member Functions inherited from cbl::modelling::Modelling
void m_set_posterior (const int seed)
 set the interal variable m_posterior More...
 
 Modelling ()=default
 default constuctor
 
virtual ~Modelling ()=default
 default destructor
 
std::shared_ptr< data::Datadata ()
 return the dataset More...
 
std::shared_ptr< data::Datadata_fit ()
 return the dataset More...
 
std::shared_ptr< statistics::Likelihoodlikelihood ()
 return the likelihood parameters More...
 
std::shared_ptr< statistics::Posteriorposterior ()
 return the posterior parameters More...
 
std::shared_ptr< statistics::ModelParameterslikelihood_parameters ()
 return the likelihood parameters More...
 
std::shared_ptr< statistics::ModelParametersposterior_parameters ()
 return the posterior parameters More...
 
virtual void set_parameter_from_string (const std::string parameter, const double value)
 set the value of a parameter providing its name string More...
 
virtual double get_parameter_from_string (const std::string parameter) const
 get the value of a parameter providing its name string More...
 
std::shared_ptr< statistics::PriorDistributionget_prior (const int i)
 get the internal variable m_parameter_priors More...
 
std::shared_ptr< statistics::Modelget_response_function ()
 return the response function used to compute the super-sample covariance More...
 
void reset_fit_range ()
 reset the fit range More...
 
void set_fit_range (const double xmin, const double xmax)
 set the fit range More...
 
void set_fit_range (const double xmin, const double xmax, const double ymin, const double ymax)
 set the fit range More...
 
void set_data (const std::shared_ptr< data::Data > dataset)
 set the dataset More...
 
void set_likelihood (const statistics::LikelihoodType likelihood_type, const std::vector< size_t > x_index={0, 2}, const int w_index=-1, const double prec=1.e-10, const int Nres=-1)
 set the likelihood function More...
 
void set_likelihood (const cbl::statistics::Likelihood_function log_likelihood_function)
 set the likelihood function, given a user-defined log-likelihood function More...
 
void maximize_likelihood (const std::vector< double > start, const std::vector< std::vector< double >> parameter_limits, const unsigned int max_iter=10000, const double tol=1.e-6, const double epsilon=1.e-3)
 function that maximizes the posterior, finds the best-fit parameters and stores them in the model More...
 
void maximize_posterior (const std::vector< double > start, const unsigned int max_iter=10000, const double tol=1.e-6, const double epsilon=1.e-3, const int seed=666)
 function that maximizes the posterior, finds the best-fit parameters and stores them in the model More...
 
void sample_posterior (const int chain_size, const int nwalkers, const int seed=666, const double aa=2, const bool parallel=true)
 sample the posterior, initializing the chains by drawing from the prior distributions More...
 
void sample_posterior (const int chain_size, const int nwalkers, const double radius, const std::vector< double > start, const unsigned int max_iter=10000, const double tol=1.e-6, const double epsilon=1.e-3, const int seed=666, const double aa=2, const bool parallel=true)
 sample the posterior, initializing the chains in a ball around the posterior best-fit parameters values More...
 
void sample_posterior (const int chain_size, const int nwalkers, std::vector< double > &value, const double radius, const int seed=666, const double aa=2, const bool parallel=true)
 sample the posterior, initializing the chains by drawing from the prior distributions More...
 
void sample_posterior (const int chain_size, const std::vector< std::vector< double >> chain_value, const int seed=666, const double aa=2, const bool parallel=true)
 sample the posterior, initializing the chains with input values More...
 
void sample_posterior (const int chain_size, const int nwalkers, const std::string input_dir, const std::string input_file, const int seed=666, const double aa=2, const bool parallel=true)
 sample the posterior, initializing the chains reading the input values from an input file More...
 
void importance_sampling (const std::string input_dir, const std::string input_file, const int seed=666, const std::vector< size_t > column={}, const int header_lines_to_skip=1, const bool is_FITS_format=false, const bool apply_to_likelihood=false)
 perform importance sampling More...
 
void write_chain (const std::string output_dir, const std::string output_file, const int start=0, const int thin=1, const bool is_FITS_format=false, const int prec=5, const int ww=14)
 write the chains obtained after the MCMC sampling More...
 
void read_chain (const std::string input_dir, const std::string input_file, const int nwalkers, const std::vector< size_t > columns={}, const int skip_header=1, const bool fits=false)
 read the chains More...
 
void show_results (const int start=0, const int thin=1, const int nbins=50, const bool show_mode=false, const int ns=-1)
 show the results of the MCMC sampling on screen More...
 
void write_results (const std::string output_dir, const std::string root_file, const int start=0, const int thin=1, const int nbins=50, const bool fits=false, const bool compute_mode=false, const int ns=-1)
 write the results of the MCMC sampling to file More...
 
virtual void write_model (const std::string output_dir, const std::string output_file, const std::vector< double > xx, const std::vector< double > parameters)
 write the model at xx for given parameters More...
 
virtual void write_model (const std::string output_dir, const std::string output_file, const std::vector< double > xx, const std::vector< double > yy, const std::vector< double > parameters)
 write the model at xx, yy for given parameters More...
 
virtual void write_model_at_bestfit (const std::string output_dir, const std::string output_file, const std::vector< double > xx)
 write the model at xx with best-fit parameters obtained from posterior maximization More...
 
virtual void write_model_at_bestfit (const std::string output_dir, const std::string output_file, const std::vector< double > xx, const std::vector< double > yy)
 write the model at xx, yy with best-fit parameters obtained from likelihood maximization More...
 
virtual void write_model_from_chains (const std::string output_dir, const std::string output_file, const std::vector< double > xx, const int start=0, const int thin=1)
 write the model at xx computing 16th, 50th and 84th percentiles from the chains More...
 
virtual void write_model_from_chains (const std::string output_dir, const std::string output_file, const std::vector< double > xx, const std::vector< double > yy, const int start=0, const int thin=1)
 write the model at xx, yy computing 16th, 50th and 84th percentiles from the chains More...
 
double reduced_chi2 (const std::vector< double > parameter={})
 the reduced \(\chi^2\) More...
 

Protected Attributes

std::shared_ptr< statistics::CombinedPosteriorm_combined_posterior
 combined posterior
 
- Protected Attributes inherited from cbl::modelling::Modelling
std::shared_ptr< data::Datam_data = NULL
 input data to be modelled
 
bool m_fit_range = false
 check if fit range has been set
 
std::shared_ptr< data::Datam_data_fit
 input data restricted to the range used for the fit
 
std::shared_ptr< statistics::Modelm_model = NULL
 input model
 
std::shared_ptr< statistics::Modelm_response_func = NULL
 response function for the computation of the super-sample covariance
 
std::shared_ptr< statistics::Likelihoodm_likelihood = NULL
 likelihood
 
std::vector< std::shared_ptr< statistics::PriorDistribution > > m_parameter_priors
 prior
 
std::shared_ptr< statistics::Posteriorm_posterior = NULL
 posterior
 

Additional Inherited Members

- Protected Member Functions inherited from cbl::modelling::Modelling
void m_set_prior (std::vector< statistics::PriorDistribution > prior_distribution)
 set the internal variable m_parameter_priors More...
 
void m_isSet_response ()
 check if the response function used to compute the super-sample covariance is set
 

Detailed Description

The class CombinedModelling.

This file defines the interface of the base class CombinedModelling, used for combining any kind of modelling

Definition at line 65 of file CombinedModelling.h.

Constructor & Destructor Documentation

◆ CombinedModelling() [1/2]

cbl::modelling::CombinedModelling::CombinedModelling ( std::vector< std::shared_ptr< modelling::Modelling >>  modelling,
std::vector< std::string >  repeated_par = {},
const std::vector< std::vector< std::vector< int >>>  common_repeated_par = {} 
)

Constuctor used to combine statistically independent probes.

Parameters
modellingvector of pointers to the single Modelling objects
repeated_parparameters shared by different probes, for which the user wants different posteriors for each probe. For example, if the probes A and B depend on the same parameter \(p\), whose identification string is "par", then "par" must be given in input if the user desires to have different posteriors of \(p\) for the two probes A and B. Every probe depending on \(p\) will provide a different posterior on \(p\). This is useful in the case of astrophysical parameters, e.g. the parameters describing galaxy cluster profiles.
common_repeated_parfor each argument in repeated_par, a vector of vectors of integers can be defined here. Such vectors define the sets of probes for which the same priors and posteriors are provided for the same parameters. For example, let us consider the probes {A1, A2, A3, A4}, provided in the modelling parameter. All the probes (A1, A2, A3, A4) depend on the parameter \(p\), whose identification string is "par", and we set repeated_par = {"par"}. If we want the pair of probes {A1, A2}, {A3, A4}, to provide a different posterior on \(p\), we must set common_repeated_par = { { {0,1}, {2,3} } }. This is useful when different probes in the same bin provide constraints on the same parameters. If common_repeated_par is not provided, every probe depending on \(p\) will provide a different posterior on \(p\). If two parameters are provided in repeated_par, e.g. repeated_par = {"par1", "par2"}, and only "par2" must be shared by more than one probe, then leave blank the vector of vectors corresponding to "par1" in common_repeated_par.

Definition at line 46 of file CombinedModelling.cpp.

◆ CombinedModelling() [2/2]

cbl::modelling::CombinedModelling::CombinedModelling ( std::vector< std::vector< std::shared_ptr< modelling::Modelling >>>  modelling,
const std::vector< std::shared_ptr< data::CovarianceMatrix >>  covariance,
const std::vector< cbl::statistics::LikelihoodType likelihood_types,
const std::vector< std::string >  repeated_par = {},
const std::vector< std::vector< std::vector< int >>>  common_repeated_par = {},
const std::vector< std::shared_ptr< cosmology::SuperSampleCovariance >>  SSC = {} 
)

Constructor used to set the modelling of statistically dependent probes. For each vector of cbl::statistics::Posterior objects in the posteriors argument, a cbl::data::CovarianceMatrix object must be defined. The probes in each vector within posteriors are described by the same likelihood function. The final log-likelihood is given by the sum of the logarithms of each likelihood describing a set of dependent probes.

If sets of probes are described by user-defined likelihoods, then the cbl::data::CovarianceMatrix objects must not be provided for such sets.

Parameters
modellingvector of vectors of pointers to Modelling objects. In each vector, a set of probes is contained, with a covariance matrix defined by the corresponding cbl::data::CovarianceMatrix object given as input in the second argument of this constructor
covarianceobjects defining the covariance matrices for the Posterior objects
likelihood_typeslikelihood types for each set of probes
repeated_parparameters shared by different probes, for which the user wants different posteriors for each probe. For example, if the probes A and B depend on the same parameter \(p\), whose identification string is "par", then "par" must be given in input if the user desires to have different posteriors of \(p\) for the two probes A and B. This is useful in the case of astrophysical parameters, e.g. the parameters describing galaxy cluster profiles.
common_repeated_parfor each argument in repeated_par, a vector of vectors of integers can be defined here. Such vectors define the sets of probes for which the same priors and posteriors are provided for the same parameters. For example, let us consider the two sets of probes {A1, A2, A3} and {B1, B2, B3}, provided in the modelling parameter. All the probes (A1, A2, A3, B1, B2, B3) depend on the parameter \(p\), whose identification string is "par", and we set repeated_par = {"par"}. If we want each pair of probes {A1, B1}, {A2, B2}, {A3, B3}, to provide a different posterior on \(p\), we must set common_repeated_par = { { {0,3}, {1,4}, {2,5} } }. This is useful when different probes in the same bin provide constraints on the same parameters. If common_repeated_par is not provided, every probe depending on \(p\) will provide a different posterior on \(p\). If two parameters are provided in repeated_par, e.g. repeated_par = {"par1", "par2"}, and only "par2" must be shared by more than one probe, then leave blank the vector of vectors corresponding to "par1" in common_repeated_par.
SSCvector of pointers to cbl::cosmology::SuperSampleCovariance objects, for the computation of the \(S_{ij}\) matrices. If, for example, the sets of probes A, B, C, are considered, and only for A and C the super-sample covariance must be computed, then set the second element of SSC equal to NULL.

Definition at line 61 of file CombinedModelling.cpp.

Member Function Documentation

◆ maximize_combined_posterior()

void cbl::modelling::CombinedModelling::maximize_combined_posterior ( const std::vector< double >  start,
const unsigned int  max_iter = 10000,
const double  tol = 1.e-6,
const double  epsilon = 1.e-3 
)

function that maximizes the combined posterior, finds the best-fit parameters and stores them in the model

this function exploits the Nelder-Mead method https://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method

the algorithm defines a simplex (i.e a k-dimensional polytope which is the convex hull of its k+1 vertices) in the parameter space. At each step, it identifies the simplex vertex at which the function to be minimised (i.e. the negative posterior in this case) has the greatest value, and moves it, via reflections and scaling, to a new position in which the function has a lower value. This iteration stops when the simplex area becomes lower than the tolerance. For instance, in 2D, the starting vertices of the simplex (a triangle in 2D) are the following: (start[0], start[1]) ; (start[0]+epsilon, start[1]) ; (start[0], start[1]+epsilon)

Parameters
startvector containing initial values for the posterior maximization
max_iterthe maximum number of iterations
tolthe tolerance to find convergence
epsilonthe relative fraction of the initial simplex size

Definition at line 79 of file CombinedModelling.cpp.

◆ sample_combined_posterior() [1/3]

void cbl::modelling::CombinedModelling::sample_combined_posterior ( const int  chain_size,
const int  nwalkers,
const double  aa = 2,
const bool  parallel = true 
)

sample the posterior, initializing the chains by drawing from the prior distributions

the starting values of the chain are extracted from the (possibly different) distributions of the priors

Parameters
chain_sizethe chain lenght
nwalkersthe number of parallel chains
aathe parameter of the \(g(z)\) distribution
parallelfalse \(\rightarrow\) non-parallel sampler; true \(\rightarrow\) parallel sampler

Definition at line 88 of file CombinedModelling.cpp.

◆ sample_combined_posterior() [2/3]

void cbl::modelling::CombinedModelling::sample_combined_posterior ( const int  chain_size,
const int  nwalkers,
const double  radius,
const std::vector< double >  start,
const unsigned int  max_iter = 10000,
const double  tol = 1.e-6,
const double  epsilon = 1.e-3,
const double  aa = 2,
const bool  parallel = true 
)

sample the posterior, initializing the chains in a ball around the posterior best-fit parameters values

the starting values of the chain are extracted from uniform distributions in the range [parameter-radius, parameter+radius] (for each likelihood parameter)

this function first maximizes the posterior, starting the computation at the values of the input vector 'start', then it inizializes the chain

Parameters
chain_sizethe chain lenght
nwalkersthe number of parallel chains
radiusradius of the ball in parameter space
startstd::vector containing initial values for the posterior maximization
max_iterthe maximum number of iterations
tolthe tolerance in finding convergence
epsilonthe simplex side
aathe parameter of the \(g(z)\) distribution
parallelfalse \(\rightarrow\) non-parallel sampler; true \(\rightarrow\) parallel sampler

Definition at line 98 of file CombinedModelling.cpp.

◆ sample_combined_posterior() [3/3]

void cbl::modelling::CombinedModelling::sample_combined_posterior ( const int  chain_size,
const int  nwalkers,
const std::string  input_dir,
const std::string  input_file,
const int  seed,
const double  aa = 2,
const bool  parallel = true 
)

sample the posterior, initializing the chains reading the input values from an input file

Parameters
chain_sizethe chain lenght
nwalkersthe number of parallel chains
input_dirinput directory
input_fileinput file
seedthe seed
aathe parameter of the \(g(z)\) distribution
parallelfalse \(\rightarrow\) non-parallel sampler; true \(\rightarrow\) parallel sampler

Definition at line 108 of file CombinedModelling.cpp.

◆ write_combined_results()

void cbl::modelling::CombinedModelling::write_combined_results ( const std::string  output_dir,
const std::string  root_file,
const int  start = 0,
const int  thin = 1,
const int  nbins = 50,
const bool  fits = false,
const bool  compute_mode = false,
const int  ns = -1 
)

write the results of the MCMC sampling to file

this function stores to file the posterior mean, the posterior standard deviation, the posterior median, 18th and 82th posterior percentiles, and, optionally, the posterior mode.

If the covariance matrix has been estimated from a set of mock catalogues, and the input parameters ns ( \(n_s\), the number of samples used to estimate the covariance matrix) is provided (>0), then the parameter errors ( \(\sigma_p\)) will be corrected to take into account the uncertainities in the covariance estimate (Percival et al. 2014):

\[ \sigma_p = \sqrt{\frac{1+B(n_b-n_p)}{1+A+B(n_p+1)}} \]

where

\[ A = \frac{2}{(n_s-n_b-1)(n_s-n_b-4)} \,, \]

\[ B = \frac{(n_s-n_b-2)}{(n_s-n_b-1)(n_s-n_b-4)} \,, \]

where \(n_b\) is number of data measurements (e.g. the bins of the dataset).

This correction can be applied only if the likelihood is Gaussian. Morever, the inverce covariance matrix estimator has to be corrected to take into account the inverse Wishart distribution (Hartlap, Simon and Schneider 2006).

Parameters
output_dirthe output director
root_filethe root of the output files: - file_root_parameters.dat file containing the output of the MCMC sampling for each parameter - file_root_covariance.dat file containing the covariance of the parameters - file_root_chain file containing the chains: the extention can be .dat or .fits
startthe minimum chain position to be written
thinthe step used for dilution on screen
nbinsthe number of bins
fitsfalse \(\rightarrow\) ascii file; true \(\rightarrow\) fits file
compute_modetrue \(\rightarrow\) compute the posterior mode; false \(\rightarrow\) do not compute the posterior mode
nsnumber of samples used to estimate the covariance matrix

Definition at line 118 of file CombinedModelling.cpp.

◆ write_model_from_combined_chain()

void cbl::modelling::CombinedModelling::write_model_from_combined_chain ( const std::string  output_dir,
const std::string  output_file,
const int  start,
const int  thin,
const std::vector< double >  xx = {},
const std::vector< double >  yy = {} 
)

write the model computing 16th, 50th and 84th percentiles from the MCMC

Parameters
output_dirthe output directory
output_filethe output file
startthe minimum chain position to be written
thinthe step used for dilution on screen
xxx points where the model is computed. If not provided, the x points set for the MCMC are considered.
yyy points where the model is computed. If not provided, the y points set for the MCMC are considered.

Definition at line 127 of file CombinedModelling.cpp.


The documentation for this class was generated from the following files: