balance.cli
- class balance.cli.BalanceCLI(args: Namespace)
Helper class that encapsulates CLI argument handling and execution.
Examples
- adapt_output(output_df: DataFrame) → DataFrame
Filter raw output dataframe to user’s requested rows/columns.
First we filter to the rows we are supposed to keep.
Next we select the columns that need to be returned.
- Parameters:
output_df – DataFrame produced by the adjustment step.
- Returns:
Filtered DataFrame containing requested rows and columns.
Examples
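A minimal sketch of the two-step filtering described above, written with plain pandas rather than BalanceCLI itself. The column names (id, weight, keep_row, extra) and the convention that a keep-row value of 1 marks rows to keep are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical adjusted output; column names are illustrative only.
output_df = pd.DataFrame(
    {
        "id": ["a", "b", "c"],
        "weight": [1.2, 0.8, 1.0],
        "keep_row": [1, 0, 1],
        "extra": ["x", "y", "z"],
    }
)

keep_row_column = "keep_row"     # assumed keep-row indicator column
keep_columns = ["id", "weight"]  # assumed columns requested by the user

# Step 1: keep only the rows flagged by the keep-row column.
filtered = output_df[output_df[keep_row_column] == 1]

# Step 2: select the columns that should be returned.
adapted = filtered[keep_columns]
print(adapted)
```

Filtering rows before selecting columns matters here because the keep-row column itself may not be among the columns the user wants returned.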
- batch_columns() → List[str]
Return the list of batch column names.
- Returns:
Batch column names parsed from the CLI argument.
Examples
- check_input_columns(columns: List[str] | pd.Index) → None
Validate the input frame includes required columns.
- Parameters:
columns – Available column names in the input data.
- Returns:
None.
Examples
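The kind of check this method performs can be sketched as a membership test over column names. The helper missing_columns and the column names below are hypothetical, and how BalanceCLI reports a failed check is not stated here, so the sketch only prints a message.

```python
def missing_columns(available, required):
    """Return the required column names absent from `available`, in order."""
    available_set = set(available)
    return [name for name in required if name not in available_set]


# Hypothetical column names, for illustration only.
available = ["id", "is_respondent", "weight", "age"]
required = ["id", "is_respondent", "weight", "age", "gender"]

missing = missing_columns(available, required)
if missing:
    print(f"Input is missing required columns: {missing}")
```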
- covariate_columns() → List[str]
Return the list of covariate column names.
- Returns:
Covariate column names parsed from the CLI argument.
Examples
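A sketch of turning a CLI argument into a list of column names, assuming the argument arrives as a comma-separated string. Both that format and the helper parse_column_list are assumptions for illustration, not the documented behavior of BalanceCLI.

```python
from argparse import Namespace


def parse_column_list(raw):
    """Split a comma-separated string into a list of non-empty column names."""
    return [name.strip() for name in raw.split(",") if name.strip()]


# Hypothetical parsed arguments; the attribute value format is assumed.
args = Namespace(covariate_columns="age, gender,region")
covariates = parse_column_list(args.covariate_columns)
print(covariates)
```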
- covariate_columns_for_diagnostics() → List[str] | None
Return covariate columns used for diagnostics reporting.
- Returns:
List of columns to keep in diagnostics, or None.
Examples
- formula() → str | None
Return the formula string used for model matrices.
- Returns:
Formula string, or None if unset.
Examples
- has_batch_columns() → bool
Return True when batch columns are supplied.
- Returns:
True if batch columns are set, otherwise False.
Examples
- has_keep_columns() → bool
Return True when output keep columns are supplied.
- Returns:
True if keep columns are set, otherwise False.
Examples
- has_keep_row_column() → bool
Return True when a keep-row column is supplied.
- Returns:
True if a keep-row column is set, otherwise False.
Examples
- has_outcome_columns() → bool
Return True when outcome columns are explicitly supplied.
- Returns:
True if outcome columns are set, otherwise False.
Examples
- id_column() → str
Return the identifier column name.
- Returns:
Name of the ID column.
Examples
- keep_columns() → List[str] | None
Return the subset of columns to keep in outputs.
- Returns:
List of columns to keep, or None if unspecified.
Examples
- keep_row_column() → str | None
Return the keep-row indicator column name.
- Returns:
Name of the keep-row indicator column, or None if unset.
Examples
- lambda_max() → float | None
Return the maximum L1 penalty setting.
- Returns:
Maximum L1 penalty value, or None.
Examples
- lambda_min() → float | None
Return the minimum L1 penalty setting.
- Returns:
Minimum L1 penalty value, or None.
Examples
- load_and_check_input() → DataFrame
Read the input file and log basic information.
- Returns:
DataFrame loaded from the input file.
Examples
- logistic_regression_kwargs() → Dict[str, Any] | None
Parse JSON keyword arguments for the IPW logistic regression model.
- Returns:
Parsed keyword arguments dictionary, or None.
Examples
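A sketch of parsing keyword arguments from a JSON string using only the standard library. The helper parse_json_kwargs is hypothetical, and rejecting non-object JSON is an assumption; the keys shown (max_iter, solver) are ordinary scikit-learn LogisticRegression parameters chosen for illustration.

```python
import json


def parse_json_kwargs(raw):
    """Parse a JSON object string into a dict; return None when raw is unset."""
    if raw is None:
        return None
    parsed = json.loads(raw)
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object of keyword arguments")
    return parsed


kwargs = parse_json_kwargs('{"max_iter": 500, "solver": "lbfgs"}')
print(kwargs)
```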
- logistic_regression_model() → ClassifierMixin | None
Build a LogisticRegression model when IPW kwargs are supplied.
- Returns:
Configured LogisticRegression instance, or None.
Examples
- main() → None
Run the CLI workflow from input loading to output writing.
- Returns:
None.
Examples
- max_de() → float | None
Return the max design effect setting.
- Returns:
Maximum design effect, or None if unset.
Examples
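For intuition about what a design effect limit constrains, the sketch below computes Kish's approximate design effect for a set of weights. Treating the design effect here as Kish's formula is an assumption, and kish_design_effect is a hypothetical helper, not part of balance.cli.

```python
def kish_design_effect(weights):
    """Kish's approximate design effect: n * sum(w^2) / (sum(w))^2."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


equal = [1.0, 1.0, 1.0, 1.0]   # identical weights give the minimum value
uneven = [1.0, 1.0, 1.0, 5.0]  # one dominant weight inflates the value

print(kish_design_effect(equal))
print(kish_design_effect(uneven))
```

Under this formula, more variable weights yield a larger design effect, so a cap on it bounds how uneven the fitted weights may become.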
- method() → str
Return the adjustment method name.
- Returns:
The adjustment method string (for example, "ipw").
Examples
- num_lambdas() → int | None
Return the number of lambda values to search over.
- Returns:
Number of lambdas as an integer, or None.
Examples
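To illustrate how a minimum penalty, a maximum penalty, and a count can define a search grid, the sketch below builds a log-spaced sequence of candidate values. How balance actually spaces its candidates is not stated here; log spacing is an assumption borrowed from common regularization-path practice, and log_spaced_grid is a hypothetical helper. The inputs mirror the process_batch defaults (1e-05, 10, 250).

```python
import math


def log_spaced_grid(low, high, count):
    """Return `count` values from `low` to `high`, evenly spaced on a log10 scale."""
    if count < 2:
        raise ValueError("count must be at least 2")
    log_low, log_high = math.log10(low), math.log10(high)
    step = (log_high - log_low) / (count - 1)
    return [10 ** (log_low + i * step) for i in range(count)]


grid = log_spaced_grid(1e-05, 10, 250)
print(len(grid), grid[0], grid[-1])
```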
- one_hot_encoding() → bool | None
Return the parsed one-hot encoding flag.
- Returns:
True/False for one-hot encoding, or None if unset.
Examples
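A sketch of a three-state flag parser that distinguishes True, False, and unset. The accepted spellings and the helper parse_optional_bool are assumptions for illustration; the values BalanceCLI accepts are not stated here.

```python
def parse_optional_bool(raw):
    """Map 'true'/'false' (any case) to a bool; pass None through as unset."""
    if raw is None:
        return None
    text = str(raw).strip().lower()
    if text == "true":
        return True
    if text == "false":
        return False
    raise ValueError(f"cannot interpret {raw!r} as a boolean flag")


print(parse_optional_bool("True"))
print(parse_optional_bool("false"))
print(parse_optional_bool(None))
```

Keeping None distinct from False lets downstream code fall back to a default only when the user supplied nothing.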
- outcome_columns() → List[str] | None
Return the list of outcome columns if provided.
- Returns:
List of outcome columns, or None if unset.
Examples
- process_batch(batch_df: pd.DataFrame, transformations: Dict[str, Any] | str | None = 'default', formula: str | None = None, penalty_factor: None = None, one_hot_encoding: bool = False, max_de: float | None = 1.5, lambda_min: float | None = 1e-05, lambda_max: float | None = 10, num_lambdas: int | None = 250, weight_trimming_mean_ratio: float | None = 20, sample_cls: Type[balance_sample_cls] = <class 'balance.sample_class.Sample'>, sample_package_name: str = 'balance') → Dict[str, pd.DataFrame]
Run adjustment for a batch of data and return outputs.
- Parameters:
batch_df – Input data for the current batch.
transformations – Transformations argument for Sample.adjust.
formula – Optional formula for model matrices.
penalty_factor – Optional penalty factor passed to adjust.
one_hot_encoding – Whether to one-hot encode categorical features.
max_de – Maximum design effect constraint.
lambda_min – Minimum penalty value for IPW.
lambda_max – Maximum penalty value for IPW.
num_lambdas – Number of penalty values to search.
weight_trimming_mean_ratio – Mean ratio for trimming weights.
sample_cls – Sample implementation used to build sample/target.
sample_package_name – Name used in logging.
- Returns:
Dict with adjusted data and diagnostics frames.
Examples
- rows_to_keep_for_diagnostics() → str | None
Return the diagnostics row-filter expression.
- Returns:
The pandas expression string used to filter rows, or None if unset.
Examples
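A sketch of how a pandas expression string can filter rows, using DataFrame.eval to produce a boolean mask. Whether BalanceCLI evaluates the expression this way is an assumption; the frame, its column names, and the expression are hypothetical.

```python
import pandas as pd

# Hypothetical input frame; column names are illustrative only.
df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "is_respondent": [1, 1, 0],
        "age": [25, 40, 33],
    }
)

rows_to_keep = "age > 30"  # hypothetical row-filter expression

# Evaluate the expression against the frame's columns to get a boolean mask.
mask = df.eval(rows_to_keep)
diagnostics_df = df[mask]
print(diagnostics_df)
```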
- sample_column() → str
Return the column indicating sample membership.
- Returns:
Name of the sample indicator column.
Examples
- split_sample(df: DataFrame) → Tuple[DataFrame, DataFrame]
Split the input frame into sample and target partitions.
- Parameters:
df – Input DataFrame containing sample and target rows.
- Returns:
A tuple of (sample_df, target_df).
Examples
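A sketch of partitioning a frame on a sample indicator column with plain pandas. The column name is_respondent and the coding (1 for sample rows, 0 for target rows) are assumptions for illustration, not documented behavior.

```python
import pandas as pd

# Hypothetical input frame mixing sample and target rows.
df = pd.DataFrame(
    {
        "id": ["a", "b", "c", "d"],
        "is_respondent": [1, 0, 1, 0],
        "age": [30, 45, 27, 52],
    }
)

sample_column = "is_respondent"  # assumed indicator column name

# Assumed coding: 1 marks sample rows, 0 marks target rows.
sample_df = df[df[sample_column] == 1]
target_df = df[df[sample_column] == 0]

print(len(sample_df), len(target_df))
```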
- standardize_types() → bool
Return whether to standardize input types in Sample.from_frame.
- Returns:
True if standardization is enabled, otherwise False.
Examples
- transformations() → str | None
Return the transformations config for adjustment.
- Returns:
Transformations setting, or None if disabled.
Examples
- update_attributes_for_main_used_by_adjust() → None
Prepare cached attributes for main to use in adjustment.
- Returns:
None.
Examples
- weight_column() → str
Return the weight column name.
- Returns:
Name of the weight column.
Examples
- weight_trimming_mean_ratio() → float | None
Return the mean ratio used for trimming weights.
- Returns:
Weight trimming ratio, or None if unset.
Examples
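A sketch of what trimming by a mean ratio can look like: each weight is capped at the ratio times the mean weight. The exact rule balance applies, including any renormalization after trimming, is not stated here, so this is an assumption; trim_weights_by_mean_ratio is a hypothetical helper.

```python
def trim_weights_by_mean_ratio(weights, ratio):
    """Cap each weight at ratio * mean(weights); a ratio of None disables trimming."""
    if ratio is None:
        return list(weights)
    cap = ratio * sum(weights) / len(weights)
    return [min(w, cap) for w in weights]


weights = [1.0, 1.0, 1.0, 97.0]  # mean is 25.0
trimmed = trim_weights_by_mean_ratio(weights, 2)
print(trimmed)
```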