API Reference
API documentation for exhaustive search with Bayesian model averaging.
Estimators
- class exhbma.exhaustive_search.ExhaustiveLinearRegression(sigma_noise_points: List[RandomVariable], sigma_coef_points: List[RandomVariable], alpha: float = 0.5, exclude_null: bool = False)
ExhaustiveSearchModel with linear_model.LinearRegression
The intercept of the linear model is assumed to be zero.
The target variable y is assumed to be centered (zero mean).
All features x are assumed to be centered and normalized.
- Parameters:
sigma_noise_points (List[RandomVariable]) – Points at which the sigma_noise parameter is explored in the exhaustive search.
sigma_coef_points (List[RandomVariable]) – Points at which the sigma_coef parameter is explored in the exhaustive search.
alpha (float (default: 0.5)) – Alpha parameter is fixed to this value.
exclude_null (bool (default: False)) – Whether to exclude the null model.
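The preprocessing assumptions above can be met with a few lines of NumPy. This is a minimal sketch, assuming that "normalized" means unit standard deviation per feature; the helper name center_and_normalize is hypothetical and is not part of exhbma, whose own StandardScaler may behave differently.

```python
import numpy as np

def center_and_normalize(X, y):
    """Center y; center each column of X and scale it to unit standard deviation.

    Hypothetical helper for illustration only (not an exhbma function).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = X.mean(axis=0)
    x_std = X.std(axis=0)  # population standard deviation (assumption)
    if np.any(x_std == 0):
        raise ValueError("constant feature columns cannot be normalized")
    return (X - x_mean) / x_std, y - y.mean()

# Synthetic data with non-zero means and non-unit scales.
rng = np.random.default_rng(0)
X_raw = rng.normal(loc=3.0, scale=2.0, size=(50, 4))
y_raw = rng.normal(loc=10.0, size=50)

X_std, y_centered = center_and_normalize(X_raw, y_raw)
```

The same column means and standard deviations would have to be reused when transforming new data for prediction.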
- n_features_in_
Number of features seen during fit.
- Type:
int
- coef_
Coefficients of the regression model (mean of distribution).
- Type:
List[float]
- log_likelihood_
Log-likelihood of the model. Marginalization is performed over sigma_noise, sigma_coef, and the indicators.
- Type:
float
- log_likelihood_over_sigma_
Log-likelihood over \(\sigma_{noise}\) and \(\sigma_{coef}\), \(p(y| \sigma_{noise}, \sigma_{coef}, X)\), which is marginalized over indicators. The prior distributions of the two sigma parameters are not included.
- Type:
List[List[float]]
- feature_posteriors_
Posterior probabilities for each feature.
- Type:
List[float]
- indicators_
List of indicator vectors. The null model [0, 0, …, 0] is excluded, so the length is \(2^{n\_features\_in\_} - 1\).
- Type:
List[List[int]]
- log_priors_
List of log-prior probabilities, one for each model specified by an indicator vector.
- Type:
List[float]
- log_likelihoods_
List of log-likelihoods, one for each model specified by an indicator vector.
- Type:
List[float]
- models_
Information for all models specified by an indicator vector. The length of this attribute equals that of indicators_, and entries at the same index correspond to each other.
- Type:
List[ModelInfo]
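To illustrate how the number of indicator vectors grows with the number of features, the following sketch enumerates every 0/1 vector and drops the null model. The function name enumerate_indicators and the ordering of the vectors are assumptions for illustration; the order used by exhbma may differ.

```python
from itertools import product

def enumerate_indicators(n_features, exclude_null=True):
    """Return all 0/1 indicator vectors of length n_features.

    Hypothetical helper: with exclude_null=True the all-zero vector is
    dropped, leaving 2**n_features - 1 vectors.
    """
    indicators = [list(bits) for bits in product([0, 1], repeat=n_features)]
    if exclude_null:
        indicators = [ind for ind in indicators if any(ind)]
    return indicators

indicators = enumerate_indicators(3)
print(len(indicators))  # 7
```

Because the count doubles with every added feature, an exhaustive search of this kind is practical only for a modest number of features.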
- fit(X: ndarray, y: ndarray, verbose: bool = True)
Train a model in three steps:
1. Create indicator vectors.
2. Fit a sub-model for each indicator vector, marginalizing over sigma_noise and sigma_coef.
3. Calculate the final model averaged over the sub-models.
- Parameters:
X (np.ndarray with shape (n_data, n_features)) – Feature matrix. Each row corresponds to single data.
y (np.ndarray with shape (n_data,)) – Target value vector.
verbose (bool (default: True)) – If this is set to True, progress bar is displayed.
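The averaging step can be pictured as weighting each sub-model by its posterior probability. The sketch below is a simplified illustration that assumes a single log-likelihood per sub-model (that is, sigma_noise and sigma_coef have already been marginalized out); the function names are hypothetical and this is not exhbma's internal code.

```python
import math

def model_posteriors(log_priors, log_likelihoods):
    """Posterior probability of each sub-model from log-prior + log-likelihood."""
    log_joint = [lp + ll for lp, ll in zip(log_priors, log_likelihoods)]
    m = max(log_joint)  # subtract the maximum for numerical stability
    weights = [math.exp(v - m) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]

def feature_posteriors(indicators, posteriors):
    """Posterior inclusion probability of each feature.

    A feature's probability is the summed posterior of all sub-models
    whose indicator vector contains that feature.
    """
    n_features = len(indicators[0])
    return [
        sum(p for ind, p in zip(indicators, posteriors) if ind[j])
        for j in range(n_features)
    ]

# Toy example with two features and made-up log-likelihoods.
indicators = [[1, 0], [0, 1], [1, 1]]
log_priors = [math.log(1.0 / 3.0)] * 3
log_likelihoods = [-10.0, -12.0, -9.5]

post = model_posteriors(log_priors, log_likelihoods)
feat = feature_posteriors(indicators, post)
```

Working with differences from the maximum log value avoids underflow when the log-likelihoods are large and negative.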
- predict(X: ndarray, mode: str, threshold: float = 0.5) ndarray
Predict target values for new data using the fitted model.
- Parameters:
X (np.ndarray with shape (n_data, n_features)) – Feature matrix for prediction.
mode (str) – Mode of prediction.
threshold (float (default: 0.5)) – Feature selection threshold used in mode ‘select’.
- Returns:
y – Prediction values.
- Return type:
np.ndarray
- select_variables(threshold: float = 0.5) List[int]
Return the indicator of features whose posterior probability is greater than or equal to threshold.
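As a rough picture of the thresholding rule, the sketch below turns a list of feature posterior probabilities into a 0/1 indicator. It assumes the returned List[int] is an indicator vector with one entry per feature; the function select_by_threshold is hypothetical and only mirrors the rule stated above.

```python
def select_by_threshold(feature_posteriors, threshold=0.5):
    """Mark features whose posterior probability is >= threshold with 1."""
    return [1 if p >= threshold else 0 for p in feature_posteriors]

selected = select_by_threshold([0.92, 0.5, 0.13])
print(selected)  # [1, 1, 0]
```

Note that a feature whose posterior probability equals the threshold exactly is selected, since the comparison is inclusive.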
- class exhbma.linear_regression.LinearRegression(sigma_noise: float, sigma_coef: float)
Model description:
Base model: Linear regression model
Observation noise: Gaussian distribution
Prior distribution for coefficient: Gaussian distribution
The intercept of the linear model is assumed to be zero.
The target variable y is assumed to be centered (zero mean).
All features x are assumed to be centered and normalized.
Marginalization over intercept is performed.
- Parameters:
sigma_noise (float) – Standard deviation of the Gaussian noise. The model assumes that observation noise obeys a Gaussian distribution with mean = 0, variance = sigma_noise^2.
sigma_coef (float) – Standard deviation of the Gaussian prior on the coefficients. The model assumes that each coefficient value independently obeys a Gaussian distribution with mean = 0, variance = sigma_coef^2.
- n_features_in_
Number of features seen during fit.
- Type:
int
- coef_
Coefficients of the regression model (mean of distribution).
- Type:
List[float]
- log_likelihood_
Log-likelihood of the model for the fixed sigma_noise and sigma_coef given at construction.
- Type:
float
- fit(X: ndarray, y: ndarray, skip_preprocessing_validation: bool = False)
Calculate the coefficients used in prediction and the log-likelihood for the given data.
- Parameters:
X (np.ndarray with shape (n_data, n_features)) – Feature matrix. Each row corresponds to single data.
y (np.ndarray with shape (n_data,)) – Target value vector.
- predict(X)
Predict using the trained model.
- Parameters:
X (np.ndarray with shape (n_data, n_features)) – Feature matrix for prediction.
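For a zero-intercept linear model with Gaussian noise and an independent zero-mean Gaussian prior on each coefficient, the posterior mean of the coefficients has the standard closed form \((X^T X + (\sigma_{noise}^2 / \sigma_{coef}^2) I)^{-1} X^T y\). The sketch below computes that textbook expression with NumPy. The function posterior_mean_coef is hypothetical, and nothing here is taken from exhbma's implementation, which may organize the calculation differently.

```python
import numpy as np

def posterior_mean_coef(X, y, sigma_noise, sigma_coef):
    """Posterior mean of the coefficients under a Gaussian prior.

    Textbook formula for Bayesian linear regression without intercept;
    hypothetical helper, not an exhbma function.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    lam = (sigma_noise / sigma_coef) ** 2  # ratio of noise to prior variance
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)

# Synthetic, centered data with a known coefficient vector.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X = X - X.mean(axis=0)
true_coef = np.array([1.5, 0.0, -2.0])
y = X @ true_coef + rng.normal(scale=0.1, size=200)
y = y - y.mean()

coef = posterior_mean_coef(X, y, sigma_noise=0.1, sigma_coef=1.0)
y_pred = X @ coef  # prediction is a plain matrix-vector product
```

A small sigma_coef relative to sigma_noise shrinks the estimate toward zero, while a large sigma_coef approaches ordinary least squares.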
Utilities
Utility classes and functions for constructing estimators.
- class exhbma.scaler.StandardScaler(n_dim, scaling: bool = True)
- exhbma.probabilities.gamma(x: ndarray, low: float | None = None, high: float | None = None, shape: float = 0.001, scale: float = 1000.0) List[RandomVariable]
Gamma distribution: p(x) = (const.) * x^(shape - 1) * exp(-x / scale). The distribution is restricted to the range [low, high]. If low (high) is None, it is set to the minimum (maximum) value of x.
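The truncated density described above can be evaluated on a grid in plain Python. This is only a sketch: it normalizes numerically with the trapezoidal rule, which is an assumption, and it returns plain floats rather than the RandomVariable objects that exhbma.probabilities.gamma returns. The names trapezoid and truncated_gamma_density are hypothetical.

```python
import math

def trapezoid(ys, xs):
    """Trapezoidal rule for samples ys taken at increasing points xs."""
    return sum(
        0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    )

def truncated_gamma_density(xs, low=None, high=None, shape=0.001, scale=1000.0):
    """Gamma density restricted to [low, high], normalized on the grid xs.

    xs must be positive and increasing. If low (high) is None, the minimum
    (maximum) of xs is used, mirroring the description above.
    """
    low = min(xs) if low is None else low
    high = max(xs) if high is None else high
    raw = [
        x ** (shape - 1) * math.exp(-x / scale) if low <= x <= high else 0.0
        for x in xs
    ]
    norm = trapezoid(raw, xs)
    return [v / norm for v in raw]

# Log-spaced grid of positive points.
xs = [0.1 * 1.05 ** i for i in range(100)]
density = truncated_gamma_density(xs)
```

With the default shape = 0.001 and scale = 1000.0, the factor x^(shape - 1) is close to 1 / x and the exponential factor is nearly constant for x much smaller than scale, so the density behaves approximately like 1 / x over such a range.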
- exhbma.probabilities.inverse(x: ndarray, low: float | None = None, high: float | None = None) List[RandomVariable]
x-inverse distribution: p(x) ∝ 1 / x. This distribution becomes proper when a finite interval is considered.
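On a finite interval [low, high] with 0 < low < high, the integral of 1 / x is ln(high / low), so the normalized density is 1 / (x ln(high / low)). The sketch below checks this numerically; inverse_density is a hypothetical helper, not the exhbma function, and it returns a float rather than a RandomVariable.

```python
import math

def inverse_density(x, low, high):
    """Normalized 1/x density on the finite interval [low, high]."""
    if not (0.0 < low < high):
        raise ValueError("require 0 < low < high for a proper distribution")
    if x < low or x > high:
        return 0.0
    return 1.0 / (x * math.log(high / low))

# Numerical check on a log-spaced grid: the density should integrate to 1.
low, high, n = 0.01, 100.0, 2000
xs = [low * (high / low) ** (i / n) for i in range(n + 1)]
ys = [inverse_density(x, low, high) for x in xs]
total = sum(
    0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i]) for i in range(n)
)
```

The requirement low > 0 matters: as low approaches zero the normalizing constant diverges, which is why the distribution is improper without a finite interval.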
- exhbma.probabilities.uniform(x: ndarray, low: float = 0.0, high: float = 1.0) List[RandomVariable]
Uniform distribution: p(x) = 1 / (high - low)
- exhbma.integrate.integrate_log_values_in_line(log_values: List[float], x1: List[float], weights: List[float] | None = None, expect_positive: bool = True)
Integrate along a line.
- Parameters:
log_values (1-dimensional array with length len(x1)) – Log values to be integrated along x1.
x1 (List[float]) – Points on the 1st axis.
weights (Optional[List[float]] (default: None)) – Weights for log_values.
expect_positive (bool (default: True)) – If set to True, ValueError is raised when the result is not positive.
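Integrating values that are only available as logarithms is usually done by factoring out the largest log value before exponentiating. The sketch below applies that idea with the trapezoidal rule. It is an assumption-laden illustration: the quadrature rule, the choice to return the logarithm of the integral, and the name integrate_log_line are all hypothetical, since the return value of the library function is not documented here.

```python
import math

def integrate_log_line(log_values, x1, weights=None):
    """Return log( integral of weight * exp(log_value) dx1 ) via trapezoids.

    Hypothetical sketch: the maximum log value is factored out so that
    exp() never underflows to zero for every point at once.
    """
    if weights is None:
        weights = [1.0] * len(x1)
    m = max(log_values)
    scaled = [w * math.exp(v - m) for v, w in zip(log_values, weights)]
    area = sum(
        0.5 * (scaled[i] + scaled[i + 1]) * (x1[i + 1] - x1[i])
        for i in range(len(x1) - 1)
    )
    if area <= 0.0:
        raise ValueError("integral is not positive")
    return m + math.log(area)

# Gaussian shape with a huge negative offset: exp(-1000) underflows to 0.0
# in double precision, yet the log of the integral is still recoverable.
x1 = [-8.0 + 0.01 * i for i in range(1601)]
log_values = [-1000.0 - 0.5 * x * x for x in x1]
log_integral = integrate_log_line(log_values, x1)
```

Here the exact answer is -1000 + 0.5 ln(2π), because the integral of exp(-x²/2) over the real line is √(2π).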
- exhbma.integrate.integrate_log_values_in_square(log_values: List[List[float]], x1: List[float], x2: List[float], weights: List[List[float]] | None = None, expect_positive: bool = True)
Integrate over a rectangular region: \(\int \mathrm{weight} \cdot \exp(\mathrm{log\_value}) \, dx_1 \, dx_2\).
- Parameters:
log_values (2-dimensional array with shape (len(x1), len(x2))) – Log values to be integrated over the x1 and x2 axes.
x1 (List[float]) – Points on the 1st axis.
x2 (List[float]) – Points on the 2nd axis.
weights (Optional[List[List[float]]] (default: None)) – Weights for log_values.
expect_positive (bool (default: True)) – If set to True, ValueError is raised when the result is not positive.
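The two-dimensional integral can be sketched as a trapezoidal rule applied along x2 for every row and then along x1. As before, this is a hypothetical illustration under stated assumptions (trapezoidal quadrature, log of the integral as the return value, the name integrate_log_square), not the library's implementation.

```python
import math

def trapezoid(ys, xs):
    """Trapezoidal rule for samples ys taken at increasing points xs."""
    return sum(
        0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    )

def integrate_log_square(log_values, x1, x2, weights=None):
    """Return log( double integral of weight * exp(log_value) dx1 dx2 ).

    log_values has shape (len(x1), len(x2)). The global maximum is factored
    out before exponentiating to avoid underflow.
    """
    m = max(max(row) for row in log_values)
    inner = []
    for i, row in enumerate(log_values):
        scaled = [
            (1.0 if weights is None else weights[i][j]) * math.exp(v - m)
            for j, v in enumerate(row)
        ]
        inner.append(trapezoid(scaled, x2))  # integrate along x2
    area = trapezoid(inner, x1)  # then integrate along x1
    if area <= 0.0:
        raise ValueError("integral is not positive")
    return m + math.log(area)

# Two-dimensional Gaussian shape with a large negative offset.
grid = [-8.0 + 0.1 * i for i in range(161)]
log_values = [[-500.0 - 0.5 * (a * a + b * b) for b in grid] for a in grid]
log_integral = integrate_log_square(log_values, grid, grid)
```

The exact value for this example is -500 + ln(2π), which gives a convenient check on the grid resolution.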