#include "Headers/CombinedModelling.h"

Inheritance diagram for cbl::modelling::CombinedModelling:

Collaboration diagram for cbl::modelling::CombinedModelling:

Public Member Functions
Constructors/destructors
	CombinedModelling ()=default
	default constuctor

	CombinedModelling (std::vector< std::shared_ptr< modelling::Modelling >> modelling, std::vector< std::string > repeated_par={}, const std::vector< std::vector< std::vector< int >>> common_repeated_par={})
	Constuctor used to combine statistically independent probes. More...

	CombinedModelling (std::vector< std::vector< std::shared_ptr< modelling::Modelling >>> modelling, const std::vector< std::shared_ptr< data::CovarianceMatrix >> covariance, const std::vector< cbl::statistics::LikelihoodType > likelihood_types, const std::vector< std::string > repeated_par={}, const std::vector< std::vector< std::vector< int >>> common_repeated_par={}, const std::vector< std::shared_ptr< cosmology::SuperSampleCovariance >> SSC={})
	Constructor used to set the modelling of statistically dependent probes. For each vector of cbl::statistics::Posterior objects in the posteriors argument, a cbl::data::CovarianceMatrix object must be defined. The probes in each vector within posteriors are described by the same likelihood function. The final log-likelihood is given by the sum of the logarithms of each likelihood describing a set of dependent probes. More...

virtual	~CombinedModelling ()=default
	default destructor

Member functions used to manage likelihood/posterior
distributions
void	maximize_combined_posterior (const std::vector< double > start, const unsigned int max_iter=10000, const double tol=1.e-6, const double epsilon=1.e-3)
	function that maximizes the combined posterior, finds the best-fit parameters and stores them in the model More...

void	sample_combined_posterior (const int chain_size, const int nwalkers, const double aa=2, const bool parallel=true)
	sample the posterior, initializing the chains by drawing from the prior distributions More...

void	sample_combined_posterior (const int chain_size, const int nwalkers, const double radius, const std::vector< double > start, const unsigned int max_iter=10000, const double tol=1.e-6, const double epsilon=1.e-3, const double aa=2, const bool parallel=true)
	sample the posterior, initializing the chains in a ball around the posterior best-fit parameters values More...

void	sample_combined_posterior (const int chain_size, const int nwalkers, const std::string input_dir, const std::string input_file, const int seed, const double aa=2, const bool parallel=true)
	sample the posterior, initializing the chains reading the input values from an input file More...

Member functions used for Input/Output
void	write_combined_results (const std::string output_dir, const std::string root_file, const int start=0, const int thin=1, const int nbins=50, const bool fits=false, const bool compute_mode=false, const int ns=-1)
	write the results of the MCMC sampling to file More...

void	write_model_from_combined_chain (const std::string output_dir, const std::string output_file, const int start, const int thin, const std::vector< double > xx={}, const std::vector< double > yy={})
	write the model computing 16th, 50th and 84th percentiles from the MCMC More...

Public Member Functions inherited from cbl::modelling::Modelling
void	m_set_posterior (const int seed)
	set the interal variable m_posterior More...

	Modelling ()=default
	default constuctor

virtual	~Modelling ()=default
	default destructor

std::shared_ptr< data::Data >	data ()
	return the dataset More...

std::shared_ptr< data::Data >	data_fit ()
	return the dataset More...

std::shared_ptr< statistics::Likelihood >	likelihood ()
	return the likelihood parameters More...

std::shared_ptr< statistics::Posterior >	posterior ()
	return the posterior parameters More...

std::shared_ptr< statistics::ModelParameters >	likelihood_parameters ()
	return the likelihood parameters More...

std::shared_ptr< statistics::ModelParameters >	posterior_parameters ()
	return the posterior parameters More...

virtual void	set_parameter_from_string (const std::string parameter, const double value)
	set the value of a parameter providing its name string More...

virtual double	get_parameter_from_string (const std::string parameter) const
	get the value of a parameter providing its name string More...

std::shared_ptr< statistics::PriorDistribution >	get_prior (const int i)
	get the internal variable m_parameter_priors More...

std::shared_ptr< statistics::Model >	get_response_function ()
	return the response function used to compute the super-sample covariance More...

void	reset_fit_range ()
	reset the fit range More...

void	set_fit_range (const double xmin, const double xmax)
	set the fit range More...

void	set_fit_range (const double xmin, const double xmax, const double ymin, const double ymax)
	set the fit range More...

void	set_data (const std::shared_ptr< data::Data > dataset)
	set the dataset More...

void	set_likelihood (const statistics::LikelihoodType likelihood_type, const std::vector< size_t > x_index={0, 2}, const int w_index=-1, const double prec=1.e-10, const int Nres=-1)
	set the likelihood function More...

void	set_likelihood (const cbl::statistics::Likelihood_function log_likelihood_function)
	set the likelihood function, given a user-defined log-likelihood function More...

void	maximize_likelihood (const std::vector< double > start, const std::vector< std::vector< double >> parameter_limits, const unsigned int max_iter=10000, const double tol=1.e-6, const double epsilon=1.e-3)
	function that maximizes the posterior, finds the best-fit parameters and stores them in the model More...

void	maximize_posterior (const std::vector< double > start, const unsigned int max_iter=10000, const double tol=1.e-6, const double epsilon=1.e-3, const int seed=666)
	function that maximizes the posterior, finds the best-fit parameters and stores them in the model More...

void	sample_posterior (const int chain_size, const int nwalkers, const int seed=666, const double aa=2, const bool parallel=true)
	sample the posterior, initializing the chains by drawing from the prior distributions More...

void	sample_posterior (const int chain_size, const int nwalkers, const double radius, const std::vector< double > start, const unsigned int max_iter=10000, const double tol=1.e-6, const double epsilon=1.e-3, const int seed=666, const double aa=2, const bool parallel=true)
	sample the posterior, initializing the chains in a ball around the posterior best-fit parameters values More...

void	sample_posterior (const int chain_size, const int nwalkers, std::vector< double > &value, const double radius, const int seed=666, const double aa=2, const bool parallel=true)
	sample the posterior, initializing the chains by drawing from the prior distributions More...

void	sample_posterior (const int chain_size, const std::vector< std::vector< double >> chain_value, const int seed=666, const double aa=2, const bool parallel=true)
	sample the posterior, initializing the chains with input values More...

void	sample_posterior (const int chain_size, const int nwalkers, const std::string input_dir, const std::string input_file, const int seed=666, const double aa=2, const bool parallel=true)
	sample the posterior, initializing the chains reading the input values from an input file More...

void	importance_sampling (const std::string input_dir, const std::string input_file, const int seed=666, const std::vector< size_t > column={}, const int header_lines_to_skip=1, const bool is_FITS_format=false, const bool apply_to_likelihood=false)
	perform importance sampling More...

void	write_chain (const std::string output_dir, const std::string output_file, const int start=0, const int thin=1, const bool is_FITS_format=false, const int prec=5, const int ww=14)
	write the chains obtained after the MCMC sampling More...

void	read_chain (const std::string input_dir, const std::string input_file, const int nwalkers, const std::vector< size_t > columns={}, const int skip_header=1, const bool fits=false)
	read the chains More...

void	show_results (const int start=0, const int thin=1, const int nbins=50, const bool show_mode=false, const int ns=-1)
	show the results of the MCMC sampling on screen More...

void	write_results (const std::string output_dir, const std::string root_file, const int start=0, const int thin=1, const int nbins=50, const bool fits=false, const bool compute_mode=false, const int ns=-1)
	write the results of the MCMC sampling to file More...

virtual void	write_model (const std::string output_dir, const std::string output_file, const std::vector< double > xx, const std::vector< double > parameters)
	write the model at xx for given parameters More...

virtual void	write_model (const std::string output_dir, const std::string output_file, const std::vector< double > xx, const std::vector< double > yy, const std::vector< double > parameters)
	write the model at xx, yy for given parameters More...

virtual void	write_model_at_bestfit (const std::string output_dir, const std::string output_file, const std::vector< double > xx)
	write the model at xx with best-fit parameters obtained from posterior maximization More...

virtual void	write_model_at_bestfit (const std::string output_dir, const std::string output_file, const std::vector< double > xx, const std::vector< double > yy)
	write the model at xx, yy with best-fit parameters obtained from likelihood maximization More...

virtual void	write_model_from_chains (const std::string output_dir, const std::string output_file, const std::vector< double > xx, const int start=0, const int thin=1)
	write the model at xx computing 16th, 50th and 84th percentiles from the chains More...

virtual void	write_model_from_chains (const std::string output_dir, const std::string output_file, const std::vector< double > xx, const std::vector< double > yy, const int start=0, const int thin=1)
	write the model at xx, yy computing 16th, 50th and 84th percentiles from the chains More...

double	reduced_chi2 (const std::vector< double > parameter={})
	the reduced \(\chi^2\) More...

Protected Attributes
std::shared_ptr< statistics::CombinedPosterior >	m_combined_posterior
	combined posterior

Protected Attributes inherited from cbl::modelling::Modelling
std::shared_ptr< data::Data >	m_data = NULL
	input data to be modelled

bool	m_fit_range = false
	check if fit range has been set

std::shared_ptr< data::Data >	m_data_fit
	input data restricted to the range used for the fit

std::shared_ptr< statistics::Model >	m_model = NULL
	input model

std::shared_ptr< statistics::Model >	m_response_func = NULL
	response function for the computation of the super-sample covariance

std::shared_ptr< statistics::Likelihood >	m_likelihood = NULL
	likelihood

std::vector< std::shared_ptr< statistics::PriorDistribution > >	m_parameter_priors
	prior

std::shared_ptr< statistics::Posterior >	m_posterior = NULL
	posterior

Additional Inherited Members
Protected Member Functions inherited from cbl::modelling::Modelling
void	m_set_prior (std::vector< statistics::PriorDistribution > prior_distribution)
	set the internal variable m_parameter_priors More...

void	m_isSet_response ()
	check if the response function used to compute the super-sample covariance is set

Detailed Description

The class CombinedModelling.

This file defines the interface of the base class CombinedModelling, used for combining any kind of modelling

Definition at line 65 of file CombinedModelling.h.

Constructor & Destructor Documentation

◆ CombinedModelling() [1/2]

cbl::modelling::CombinedModelling::CombinedModelling	(	std::vector< std::shared_ptr< modelling::Modelling >>	modelling,
		std::vector< std::string >	repeated_par = `{}`,
		const std::vector< std::vector< std::vector< int >>>	common_repeated_par = `{}`
	)

Constuctor used to combine statistically independent probes.

Parameters

modelling	vector of pointers to the single Modelling objects
repeated_par	parameters shared by different probes, for which the user wants different posteriors for each probe. For example, if the probes A and B depend on the same parameter \(p\), whose identification string is "par", then "par" must be given in input if the user desires to have different posteriors of \(p\) for the two probes A and B. Every probe depending on \(p\) will provide a different posterior on \(p\). This is useful in the case of astrophysical parameters, e.g. the parameters describing galaxy cluster profiles.
common_repeated_par	for each argument in repeated_par, a vector of vectors of integers can be defined here. Such vectors define the sets of probes for which the same priors and posteriors are provided for the same parameters. For example, let us consider the probes {A1, A2, A3, A4}, provided in the modelling parameter. All the probes (A1, A2, A3, A4) depend on the parameter \(p\), whose identification string is "par", and we set repeated_par = {"par"}. If we want the pair of probes {A1, A2}, {A3, A4}, to provide a different posterior on \(p\), we must set common_repeated_par = { { {0,1}, {2,3} } }. This is useful when different probes in the same bin provide constraints on the same parameters. If common_repeated_par is not provided, every probe depending on \(p\) will provide a different posterior on \(p\). If two parameters are provided in repeated_par, e.g. repeated_par = {"par1", "par2"}, and only "par2" must be shared by more than one probe, then leave blank the vector of vectors corresponding to "par1" in common_repeated_par.

Definition at line 46 of file CombinedModelling.cpp.

◆ CombinedModelling() [2/2]

cbl::modelling::CombinedModelling::CombinedModelling	(	std::vector< std::vector< std::shared_ptr< modelling::Modelling >>>	modelling,
		const std::vector< std::shared_ptr< data::CovarianceMatrix >>	covariance,
		const std::vector< cbl::statistics::LikelihoodType >	likelihood_types,
		const std::vector< std::string >	repeated_par = `{}`,
		const std::vector< std::vector< std::vector< int >>>	common_repeated_par = `{}`,
		const std::vector< std::shared_ptr< cosmology::SuperSampleCovariance >>	SSC = `{}`
	)

Constructor used to set the modelling of statistically dependent probes. For each vector of cbl::statistics::Posterior objects in the posteriors argument, a cbl::data::CovarianceMatrix object must be defined. The probes in each vector within posteriors are described by the same likelihood function. The final log-likelihood is given by the sum of the logarithms of each likelihood describing a set of dependent probes.

If sets of probes are described by user-defined likelihoods, then the cbl::data::CovarianceMatrix objects must not be provided for such sets.

Parameters

modelling	vector of vectors of pointers to Modelling objects. In each vector, a set of probes is contained, with a covariance matrix defined by the corresponding cbl::data::CovarianceMatrix object given as input in the second argument of this constructor
covariance	objects defining the covariance matrices for the Posterior objects
likelihood_types	likelihood types for each set of probes
repeated_par	parameters shared by different probes, for which the user wants different posteriors for each probe. For example, if the probes A and B depend on the same parameter \(p\), whose identification string is "par", then "par" must be given in input if the user desires to have different posteriors of \(p\) for the two probes A and B. This is useful in the case of astrophysical parameters, e.g. the parameters describing galaxy cluster profiles.
common_repeated_par	for each argument in repeated_par, a vector of vectors of integers can be defined here. Such vectors define the sets of probes for which the same priors and posteriors are provided for the same parameters. For example, let us consider the two sets of probes {A1, A2, A3} and {B1, B2, B3}, provided in the modelling parameter. All the probes (A1, A2, A3, B1, B2, B3) depend on the parameter \(p\), whose identification string is "par", and we set repeated_par = {"par"}. If we want each pair of probes {A1, B1}, {A2, B2}, {A3, B3}, to provide a different posterior on \(p\), we must set common_repeated_par = { { {0,3}, {1,4}, {2,5} } }. This is useful when different probes in the same bin provide constraints on the same parameters. If common_repeated_par is not provided, every probe depending on \(p\) will provide a different posterior on \(p\). If two parameters are provided in repeated_par, e.g. repeated_par = {"par1", "par2"}, and only "par2" must be shared by more than one probe, then leave blank the vector of vectors corresponding to "par1" in common_repeated_par.
SSC	vector of pointers to cbl::cosmology::SuperSampleCovariance objects, for the computation of the \(S_{ij}\) matrices. If, for example, the sets of probes A, B, C, are considered, and only for A and C the super-sample covariance must be computed, then set the second element of SSC equal to NULL.

Definition at line 61 of file CombinedModelling.cpp.

Member Function Documentation

◆ maximize_combined_posterior()

void cbl::modelling::CombinedModelling::maximize_combined_posterior	(	const std::vector< double >	start,
		const unsigned int	max_iter = `10000`,
		const double	tol = `1.e-6`,
		const double	epsilon = `1.e-3`
	)

function that maximizes the combined posterior, finds the best-fit parameters and stores them in the model

this function exploits the Nelder-Mead method https://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method

the algorithm defines a simplex (i.e a k-dimensional polytope which is the convex hull of its k+1 vertices) in the parameter space. At each step, it identifies the simplex vertex at which the function to be minimised (i.e. the negative posterior in this case) has the greatest value, and moves it, via reflections and scaling, to a new position in which the function has a lower value. This iteration stops when the simplex area becomes lower than the tolerance. For instance, in 2D, the starting vertices of the simplex (a triangle in 2D) are the following: (start[0], start[1]) ; (start[0]+epsilon, start[1]) ; (start[0], start[1]+epsilon)

Parameters

start	vector containing initial values for the posterior maximization
max_iter	the maximum number of iterations
tol	the tolerance to find convergence
epsilon	the relative fraction of the initial simplex size

Definition at line 79 of file CombinedModelling.cpp.

◆ sample_combined_posterior() [1/3]

void cbl::modelling::CombinedModelling::sample_combined_posterior	(	const int	chain_size,
		const int	nwalkers,
		const double	aa = `2`,
		const bool	parallel = `true`
	)

sample the posterior, initializing the chains by drawing from the prior distributions

the starting values of the chain are extracted from the (possibly different) distributions of the priors

Parameters

chain_size	the chain lenght
nwalkers	the number of parallel chains
aa	the parameter of the \(g(z)\) distribution
parallel	false \(\rightarrow\) non-parallel sampler; true \(\rightarrow\) parallel sampler

Definition at line 88 of file CombinedModelling.cpp.

◆ sample_combined_posterior() [2/3]

void cbl::modelling::CombinedModelling::sample_combined_posterior	(	const int	chain_size,
		const int	nwalkers,
		const double	radius,
		const std::vector< double >	start,
		const unsigned int	max_iter = `10000`,
		const double	tol = `1.e-6`,
		const double	epsilon = `1.e-3`,
		const double	aa = `2`,
		const bool	parallel = `true`
	)

sample the posterior, initializing the chains in a ball around the posterior best-fit parameters values

the starting values of the chain are extracted from uniform distributions in the range [parameter-radius, parameter+radius] (for each likelihood parameter)

this function first maximizes the posterior, starting the computation at the values of the input vector 'start', then it inizializes the chain

Parameters

chain_size	the chain lenght
nwalkers	the number of parallel chains
radius	radius of the ball in parameter space
start	std::vector containing initial values for the posterior maximization
max_iter	the maximum number of iterations
tol	the tolerance in finding convergence
epsilon	the simplex side
aa	the parameter of the \(g(z)\) distribution
parallel	false \(\rightarrow\) non-parallel sampler; true \(\rightarrow\) parallel sampler

Definition at line 98 of file CombinedModelling.cpp.

◆ sample_combined_posterior() [3/3]

void cbl::modelling::CombinedModelling::sample_combined_posterior	(	const int	chain_size,
		const int	nwalkers,
		const std::string	input_dir,
		const std::string	input_file,
		const int	seed,
		const double	aa = `2`,
		const bool	parallel = `true`
	)

sample the posterior, initializing the chains reading the input values from an input file

Parameters

chain_size	the chain lenght
nwalkers	the number of parallel chains
input_dir	input directory
input_file	input file
seed	the seed
aa	the parameter of the \(g(z)\) distribution
parallel	false \(\rightarrow\) non-parallel sampler; true \(\rightarrow\) parallel sampler

Definition at line 108 of file CombinedModelling.cpp.

◆ write_combined_results()

void cbl::modelling::CombinedModelling::write_combined_results	(	const std::string	output_dir,
		const std::string	root_file,
		const int	start = `0`,
		const int	thin = `1`,
		const int	nbins = `50`,
		const bool	fits = `false`,
		const bool	compute_mode = `false`,
		const int	ns = `-1`
	)

write the results of the MCMC sampling to file

this function stores to file the posterior mean, the posterior standard deviation, the posterior median, 18th and 82th posterior percentiles, and, optionally, the posterior mode.

If the covariance matrix has been estimated from a set of mock catalogues, and the input parameters ns ( \(n_s\), the number of samples used to estimate the covariance matrix) is provided (>0), then the parameter errors ( \(\sigma_p\)) will be corrected to take into account the uncertainities in the covariance estimate (Percival et al. 2014):

\[ \sigma_p = \sqrt{\frac{1+B(n_b-n_p)}{1+A+B(n_p+1)}} \]

where

\[ A = \frac{2}{(n_s-n_b-1)(n_s-n_b-4)} \,, \]

\[ B = \frac{(n_s-n_b-2)}{(n_s-n_b-1)(n_s-n_b-4)} \,, \]

where \(n_b\) is number of data measurements (e.g. the bins of the dataset).

This correction can be applied only if the likelihood is Gaussian. Morever, the inverce covariance matrix estimator has to be corrected to take into account the inverse Wishart distribution (Hartlap, Simon and Schneider 2006).

Parameters

output_dir	the output director
root_file	the root of the output files: - file_root_parameters.dat file containing the output of the MCMC sampling for each parameter - file_root_covariance.dat file containing the covariance of the parameters - file_root_chain file containing the chains: the extention can be .dat or .fits
start	the minimum chain position to be written
thin	the step used for dilution on screen
nbins	the number of bins
fits	false \(\rightarrow\) ascii file; true \(\rightarrow\) fits file
compute_mode	true \(\rightarrow\) compute the posterior mode; false \(\rightarrow\) do not compute the posterior mode
ns	number of samples used to estimate the covariance matrix

Definition at line 118 of file CombinedModelling.cpp.

◆ write_model_from_combined_chain()

void cbl::modelling::CombinedModelling::write_model_from_combined_chain	(	const std::string	output_dir,
		const std::string	output_file,
		const int	start,
		const int	thin,
		const std::vector< double >	xx = `{}`,
		const std::vector< double >	yy = `{}`
	)

write the model computing 16th, 50th and 84th percentiles from the MCMC

Parameters

output_dir	the output directory
output_file	the output file
start	the minimum chain position to be written
thin	the step used for dilution on screen
xx	x points where the model is computed. If not provided, the x points set for the MCMC are considered.
yy	y points where the model is computed. If not provided, the y points set for the MCMC are considered.

Definition at line 127 of file CombinedModelling.cpp.

The documentation for this class was generated from the following files:

Headers/CombinedModelling.h
Modelling/Global/CombinedModelling.cpp

Public Member Functions

Protected Attributes

Additional Inherited Members

Detailed Description

Constructor & Destructor Documentation

◆ CombinedModelling() [1/2]

◆ CombinedModelling() [2/2]

Member Function Documentation

◆ maximize_combined_posterior()

◆ sample_combined_posterior() [1/3]

◆ sample_combined_posterior() [2/3]

◆ sample_combined_posterior() [3/3]

◆ write_combined_results()

◆ write_model_from_combined_chain()