API Reference

API documentation for exhaustive search with Bayesian model averaging.

Estimators

class exhbma.exhaustive_search.ExhaustiveLinearRegression(sigma_noise_points: List[RandomVariable], sigma_coef_points: List[RandomVariable], alpha: float = 0.5, exclude_null: bool = False)

ExhaustiveSearchModel with linear_model.LinearRegression

The intercept of the linear model is assumed to be zero.
- The target variable y is assumed to be centered.
- All features x are assumed to be centered and normalized.
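A minimal sketch of preprocessing that satisfies the assumptions above. It takes "normalized" to mean unit standard deviation per feature, which is an assumption, and `center_and_scale` is a hypothetical helper, not part of exhbma.

```python
import numpy as np


def center_and_scale(X, y):
    """Center y, and center each column of X and scale it to unit standard deviation.

    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    X_mean = X.mean(axis=0)
    X_std = X.std(axis=0)  # assumes no constant column (std > 0)
    y_mean = y.mean()
    # Return the statistics too, so new data can be transformed the same way.
    return (X - X_mean) / X_std, y - y_mean, (X_mean, X_std, y_mean)


rng = np.random.default_rng(0)
X_raw = rng.normal(loc=5.0, scale=3.0, size=(50, 4))
y_raw = 2.0 * X_raw[:, 0] + rng.normal(size=50) + 10.0
X_proc, y_proc, stats = center_and_scale(X_raw, y_raw)
```

The same statistics should be reused when transforming data passed to predict, so that training and prediction share one coordinate system.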

Parameters:

sigma_noise_points (List[RandomVariable]) – Data points to explore sigma_noise parameter in exhaustive search.
sigma_coef_points (List[RandomVariable]) – Data points to explore sigma_coef parameter in exhaustive search.
alpha (float (default: 0.5)) – Alpha parameter is fixed to this value.
exclude_null (bool (default: False)) – Whether to exclude the null model.

n_features_in_

Number of features seen during fit.

Type: int

coef_

Coefficients of the regression model (mean of distribution).

Type: List[float]

log_likelihood_

Log-likelihood of the model, marginalized over sigma_noise, sigma_coef, and the indicators.

Type: float

log_likelihood_over_sigma_

Log-likelihood over \(\sigma_{noise}\) and \(\sigma_{coef}\), \(p(y| \sigma_{noise}, \sigma_{coef}, X)\), marginalized over the indicators. The prior distributions of both sigmas are not included.

Type: List[List[float]]

feature_posteriors_

Posterior probabilities for each feature.

Type: List[float]

indicators_

List of indicator vectors. The null model [0, 0, …, 0] is excluded, so the length is \(2^{n\_features\_in\_} - 1\).

Type: List[List[int]]
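As an illustration of the count above, the following sketch enumerates every 0/1 indicator vector with the standard library. The ordering produced here is an assumption and may differ from the order the library uses; `enumerate_indicators` is a hypothetical name.

```python
from itertools import product


def enumerate_indicators(n_features, exclude_null=True):
    """Return all 0/1 indicator vectors of length n_features.

    Hypothetical helper; the ordering is that of itertools.product.
    """
    indicators = [list(bits) for bits in product([0, 1], repeat=n_features)]
    if exclude_null:
        # Drop the all-zero vector, i.e. the model with no features.
        indicators = [ind for ind in indicators if any(ind)]
    return indicators


indicators = enumerate_indicators(3)
```

Because the number of models doubles with every added feature, exhaustive enumeration is only practical for a modest number of features.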

log_priors_

List of log-prior probabilities, one for each model specified by an indicator vector.

Type: List[float]

log_likelihoods_

List of log-likelihoods, one for each model specified by an indicator vector.

Type: List[float]
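One standard way to combine per-model log-priors and log-likelihoods into posterior model weights, and from those into per-feature posterior probabilities, is sketched below. This is the textbook Bayesian model averaging formula and is not confirmed to be the exact computation behind feature_posteriors_; the function name and the toy numbers are illustrative only.

```python
import numpy as np


def posterior_weights_and_features(indicators, log_priors, log_likelihoods):
    """Return (model posterior weights, per-feature posterior probabilities)."""
    ind = np.asarray(indicators, dtype=float)
    log_post = np.asarray(log_priors, dtype=float) + np.asarray(log_likelihoods, dtype=float)
    log_post = log_post - log_post.max()  # subtract the max for numerical stability
    weights = np.exp(log_post)
    weights /= weights.sum()  # normalize so the weights sum to 1
    # A feature's posterior is the total weight of the models that include it.
    return weights, weights @ ind


toy_indicators = [[0, 1], [1, 0], [1, 1]]
toy_log_priors = np.log([1 / 3, 1 / 3, 1 / 3])
toy_log_likelihoods = [-12.0, -10.0, -10.5]
weights, feature_post = posterior_weights_and_features(
    toy_indicators, toy_log_priors, toy_log_likelihoods
)
```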

models_

Information for all models specified by an indicator vector. This list has the same length as indicators_, and entries at the same index correspond to each other.

Type: List[ModelInfo]

fit(X: ndarray, y: ndarray, verbose: bool = True)

Train a model.

1. Create indicator vectors.
2. Fit a sub-model for each indicator vector, marginalizing over sigma_noise and sigma_coef.
3. Calculate the final model, averaged over the sub-models.
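The final averaging step can be sketched as follows, assuming each sub-model's coefficients are embedded into a full-length vector (zeros for excluded features) and weighted by that sub-model's posterior probability. The function name and the toy numbers are hypothetical, and the library's own implementation may differ.

```python
import numpy as np


def average_coefficients(indicators, sub_coefs, weights):
    """Weighted average of sub-model coefficients over all sub-models."""
    n_features = len(indicators[0])
    coef = np.zeros(n_features)
    for ind, sub_coef, w in zip(indicators, sub_coefs, weights):
        full = np.zeros(n_features)
        # Place the sub-model's coefficients at the positions it selected.
        full[np.flatnonzero(ind)] = sub_coef
        coef += w * full
    return coef


toy_indicators = [[0, 1], [1, 0], [1, 1]]
toy_sub_coefs = [[0.4], [2.0], [1.8, 0.2]]  # one vector per sub-model
toy_weights = [0.1, 0.6, 0.3]               # posterior model weights, sum to 1
coef = average_coefficients(toy_indicators, toy_sub_coefs, toy_weights)
```

Here the first averaged coefficient is 0.6 * 2.0 + 0.3 * 1.8 = 1.74 and the second is 0.1 * 0.4 + 0.3 * 0.2 = 0.10.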

Parameters:

X (np.ndarray with shape (n_data, n_features)) – Feature matrix. Each row corresponds to a single data point.
y (np.ndarray with shape (n_data,)) – Target value vector.
verbose (bool (default: True)) – If True, a progress bar is displayed.

predict(X: ndarray, mode: str, threshold: float = 0.5) → ndarray

Predict target values for new data using the fitted model.

Parameters:

X (np.ndarray with shape (n_data, n_features)) – Feature matrix for prediction.
mode (str) – Mode of prediction.
threshold (float (default: 0.5)) – Feature selection threshold used in mode ‘select’.

Returns:

y – Prediction values.

Return type:

np.ndarray

select_variables(threshold: float = 0.5) → List[int]

Return an indicator of the features whose posterior probability is greater than or equal to threshold.
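A minimal sketch of the thresholding rule, assuming the returned List[int] is a 0/1 indicator vector with one entry per feature (it could instead be a list of indices; the source does not say). `select_by_threshold` is a hypothetical stand-alone function, not the method itself.

```python
def select_by_threshold(feature_posteriors, threshold=0.5):
    """Return 1 for features whose posterior is >= threshold, else 0."""
    return [1 if p >= threshold else 0 for p in feature_posteriors]


selected = select_by_threshold([0.9, 0.5, 0.2])
```

Note that the comparison is inclusive, so a feature whose posterior equals the threshold is selected.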

class exhbma.linear_regression.LinearRegression(sigma_noise: float, sigma_coef: float)

Model description:

Base model: Linear regression model
Observation noise: Gaussian distribution
Prior distribution for coefficient: Gaussian distribution
The intercept of the linear model is assumed to be zero.
- The target variable y is assumed to be centered.
- All features x are assumed to be centered and normalized.
- Marginalization over the intercept is performed.

Parameters:

sigma_noise (float) – Standard deviation of the Gaussian observation noise. The model assumes that the observation noise follows a Gaussian distribution with mean = 0, variance = sigma_noise^2.
sigma_coef (float) – Standard deviation of the Gaussian prior on the coefficients. The model assumes that each coefficient independently follows a Gaussian distribution with mean = 0, variance = sigma_coef^2.
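For a zero-intercept linear model with Gaussian noise and an independent Gaussian prior on each coefficient, the posterior mean and the log marginal likelihood have standard closed forms. The sketch below implements those textbook formulas; it ignores the intercept marginalization mentioned above, and `fit_bayesian_linear` is a hypothetical name, so it should not be read as the library's implementation.

```python
import numpy as np


def fit_bayesian_linear(X, y, sigma_noise, sigma_coef):
    """Return (posterior mean of coefficients, log marginal likelihood)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_data, n_features = X.shape

    # Posterior mean: ridge solution with penalty (sigma_noise / sigma_coef)^2.
    A = X.T @ X + (sigma_noise / sigma_coef) ** 2 * np.eye(n_features)
    coef = np.linalg.solve(A, X.T @ y)

    # Marginally, y ~ N(0, sigma_noise^2 I + sigma_coef^2 X X^T).
    cov = sigma_noise ** 2 * np.eye(n_data) + sigma_coef ** 2 * (X @ X.T)
    _, logdet = np.linalg.slogdet(cov)
    quad = y @ np.linalg.solve(cov, y)
    log_likelihood = -0.5 * (n_data * np.log(2.0 * np.pi) + logdet + quad)
    return coef, log_likelihood


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 2))
y_demo = X_demo @ np.array([1.5, -0.5]) + 0.1 * rng.normal(size=30)
coef_demo, ll_demo = fit_bayesian_linear(X_demo, y_demo, sigma_noise=0.1, sigma_coef=1e3)
```

With a very wide prior (large sigma_coef) the penalty vanishes and the posterior mean approaches the ordinary least-squares solution.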

n_features_in_

Number of features seen during fit.

Type: int

coef_

Coefficients of the regression model (mean of distribution).

Type: List[float]

log_likelihood_

Log-likelihood of the model. Marginalization is performed over sigma_noise, sigma_coef, indicators.

Type: float

fit(X: ndarray, y: ndarray, skip_preprocessing_validation: bool = False)

Calculate the coefficients used in prediction and the log-likelihood of this data.

Parameters:

X (np.ndarray with shape (n_data, n_features)) – Feature matrix. Each row corresponds to a single data point.
y (np.ndarray with shape (n_data,)) – Target value vector.

predict(X)

Predict using the trained model.

Parameters:

X (np.ndarray with shape (n_data, n_features)) – Feature matrix for prediction.

Utilities

Utility classes and functions for constructing estimators.

class exhbma.scaler.StandardScaler(n_dim, scaling: bool = True)

exhbma.probabilities.gamma(x: ndarray, low: float | None = None, high: float | None = None, shape: float = 0.001, scale: float = 1000.0) → List[RandomVariable]

Gamma distribution: p(x) = (const.) * x^(shape - 1) * exp(-x / scale). The distribution is restricted to the range [low, high]. If low (high) is None, it is set to the minimum (maximum) value of x.
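To illustrate the truncated gamma density, the sketch below evaluates it on a grid and normalizes it with the trapezoid rule so that it integrates to 1 over the retained points. How the library normalizes, and what a RandomVariable stores, are not stated here, so the (position, density) tuples and the name `gamma_on_grid` are assumptions made for illustration.

```python
import math


def gamma_on_grid(x, low=None, high=None, shape=0.001, scale=1000.0):
    """Return (position, density) pairs for a gamma density truncated to [low, high].

    Assumes x is sorted in ascending order and strictly positive.
    """
    low = min(x) if low is None else low
    high = max(x) if high is None else high
    pts = [v for v in x if low <= v <= high]
    unnorm = [v ** (shape - 1.0) * math.exp(-v / scale) for v in pts]
    # Trapezoid-rule estimate of the normalizing constant on the grid.
    z = sum(
        0.5 * (unnorm[i] + unnorm[i + 1]) * (pts[i + 1] - pts[i])
        for i in range(len(pts) - 1)
    )
    return [(v, u / z) for v, u in zip(pts, unnorm)]


grid = [0.1 * k for k in range(1, 101)]
density = gamma_on_grid(grid)
```

With the default shape = 0.001 and scale = 1000.0 the density is close to 1 / x over moderate ranges, which makes it a weakly informative choice for a scale parameter.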

exhbma.probabilities.inverse(x: ndarray, low: float | None = None, high: float | None = None) → List[RandomVariable]

x-inverse distribution: p(x) = 1 / x. This distribution becomes proper when a finite interval is considered.
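The reason a finite interval makes 1 / x proper is that its integral over [low, high] is ln(high / low), which is finite for 0 < low < high. A short sketch of the normalized density follows; `inverse_density` is a hypothetical function and the library's return format may differ.

```python
import math


def inverse_density(x, low, high):
    """Normalized 1/x density on [low, high]; zero outside the interval."""
    if not 0.0 < low < high:
        raise ValueError("requires 0 < low < high")
    norm = math.log(high / low)  # integral of 1/x over [low, high]
    return [1.0 / (v * norm) if low <= v <= high else 0.0 for v in x]


xs = [1.0 + 99.0 * i / 20000 for i in range(20001)]
dens = inverse_density(xs, low=1.0, high=100.0)
```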

exhbma.probabilities.uniform(x: ndarray, low: float = 0.0, high: float = 1.0) → List[RandomVariable]

Uniform distribution: p(x) = 1 / (high - low).

exhbma.integrate.integrate_log_values_in_line(log_values: List[float], x1: List[float], weights: List[float] | None = None, expect_positive: bool = True)

Integrate along a line.

Parameters:

log_values (1-dimensional array with length len(x1)) – Log values to be integrated along x1.
x1 (List[float]) – Points on the 1st axis.
weights (Optional[List[float]] (default: None)) – Weights for log_values.
expect_positive (bool (default: True)) – If True, ValueError is raised when the result is not positive.
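Integrating values that are only available as logarithms is usually done by factoring out the largest log value before exponentiating, so that very small likelihoods do not underflow. The sketch below applies this with the trapezoid rule and returns the log of the integral. The return convention, the quadrature rule, and the way weights multiply exp(log_value) are all assumptions; `log_trapezoid` is a hypothetical name.

```python
import math


def log_trapezoid(log_values, x1, weights=None):
    """Return log( integral of weight * exp(log_value) dx1 ) by the trapezoid rule."""
    if weights is None:
        weights = [1.0] * len(x1)
    m = max(log_values)  # factor out the largest term to avoid underflow
    scaled = [w * math.exp(lv - m) for lv, w in zip(log_values, weights)]
    area = sum(
        0.5 * (scaled[i] + scaled[i + 1]) * (x1[i + 1] - x1[i])
        for i in range(len(x1) - 1)
    )
    if area <= 0.0:
        raise ValueError("integral is not positive")
    return m + math.log(area)


line = [i / 1000 for i in range(1001)]
line_log_values = [-1000.0 - v for v in line]  # exp() of these underflows to 0.0
log_integral = log_trapezoid(line_log_values, line)
```

Exponentiating these log values directly would give 0.0 everywhere, whereas the shifted computation recovers the log of the integral.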

exhbma.integrate.integrate_log_values_in_square(log_values: List[List[float]], x1: List[float], x2: List[float], weights: List[List[float]] | None = None, expect_positive: bool = True)

Integrate over a box region: \(\int \mathrm{weight} \cdot \exp(\mathrm{log\_value}) \, dx_1 dx_2\).

Parameters:

log_values (2-dimensional array with shape (len(x1), len(x2))) – Log values to be integrated over the x1 and x2 axes.
x1 (List[float]) – Points on the 1st axis.
x2 (List[float]) – Points on the 2nd axis.
weights (Optional[List[List[float]]] (default: None)) – Weights for log_values.
expect_positive (bool (default: True)) – If True, ValueError is raised when the result is not positive.
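A two-dimensional version of the integral above can be sketched by applying the trapezoid rule along each axis in turn after shifting by the largest log value. This is the kind of computation needed to marginalize a grid of log-likelihoods over two parameters with prior weights, but the quadrature rule and the log-valued return are assumptions, and `log_trapezoid_2d` is a hypothetical name.

```python
import numpy as np


def log_trapezoid_2d(log_values, x1, x2, weights=None):
    """Return log( double integral of weight * exp(log_value) dx1 dx2 )."""
    log_values = np.asarray(log_values, dtype=float)
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    w = np.ones_like(log_values) if weights is None else np.asarray(weights, dtype=float)
    m = log_values.max()  # shift for numerical stability
    scaled = w * np.exp(log_values - m)
    # Trapezoid rule along x2 (columns), then along x1 (rows).
    inner = 0.5 * ((scaled[:, 1:] + scaled[:, :-1]) * np.diff(x2)).sum(axis=1)
    total = 0.5 * ((inner[1:] + inner[:-1]) * np.diff(x1)).sum()
    if total <= 0.0:
        raise ValueError("integral is not positive")
    return m + np.log(total)


g1 = np.linspace(0.0, 1.0, 201)
g2 = np.linspace(0.0, 1.0, 201)
grid_log_values = -500.0 - g1[:, None] - g2[None, :]
log_integral_2d = log_trapezoid_2d(grid_log_values, g1, g2)
```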