balance.cli

class balance.cli.BalanceCLI(args: Namespace)

Helper class that encapsulates CLI argument handling and execution.

Examples

Tutorial:: https://import-balance.org/docs/tutorials/balance_cli_tutorial/

adapt_output(output_df: DataFrame) → DataFrame

Filter the raw output DataFrame to the rows and columns the user requested.

Rows are filtered first to those that should be kept; the columns to return are then selected.

Parameters:: output_df – DataFrame produced by the adjustment step.
Returns:: Filtered DataFrame containing requested rows and columns.

Examples
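
A minimal sketch of the two-step filtering described for this method, written as a standalone function rather than the method itself. The helper name `adapt_output_sketch`, the column names, and the convention that rows are kept where the indicator equals 1 are assumptions made for illustration.

```python
from typing import List, Optional

import pandas as pd


def adapt_output_sketch(
    output_df: pd.DataFrame,
    keep_row_column: Optional[str] = None,
    keep_columns: Optional[List[str]] = None,
) -> pd.DataFrame:
    """Filter rows first, then select columns (hypothetical helper)."""
    if keep_row_column is not None:
        # Assumption: the indicator column marks rows to keep with 1.
        output_df = output_df[output_df[keep_row_column] == 1]
    if keep_columns is not None:
        output_df = output_df[keep_columns]
    return output_df


raw = pd.DataFrame(
    {
        "id": ["a", "b", "c"],
        "weight": [1.2, 0.8, 1.0],
        "keep_row": [1, 0, 1],
    }
)
filtered = adapt_output_sketch(
    raw, keep_row_column="keep_row", keep_columns=["id", "weight"]
)
print(filtered)
```

Filtering rows before selecting columns matters when the indicator column itself is not among the columns to return, as in this example.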

batch_columns() → List[str]

Return the list of batch column names.

Returns:: Batch column names parsed from the CLI argument.

Examples

check_input_columns(columns: List[str] | pd.Index) → None

Validate the input frame includes required columns.

Parameters:: columns – Available column names in the input data.
Returns:: None.

Examples
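
One way such a validation could look, sketched without pandas. The helper name `check_required_columns`, the example column names, and the choice of `ValueError` are assumptions; the actual method may signal a failure differently.

```python
from typing import Iterable, List


def check_required_columns(columns: Iterable[str], required: List[str]) -> None:
    """Raise if any required column is missing (hypothetical helper)."""
    available = set(columns)
    missing = [name for name in required if name not in available]
    if missing:
        raise ValueError(f"Input is missing required columns: {missing}")


required = ["id", "is_respondent", "weight"]

# Passes silently: every required column is present.
check_required_columns(["id", "is_respondent", "weight", "age"], required)

error_message = ""
try:
    check_required_columns(["id", "age"], required)
except ValueError as err:
    error_message = str(err)

print(error_message)
```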

covariate_columns() → List[str]

Return the list of covariate column names.

Returns:: Covariate column names parsed from the CLI argument.

Examples
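
A sketch of how a list of column names might be parsed from a single CLI argument. The flag name, the comma-separated format, and the helper `split_columns` are assumptions for illustration, not a description of the parser itself.

```python
import argparse
from typing import List

# Hypothetical parser: the flag name and comma-separated format are assumptions.
parser = argparse.ArgumentParser()
parser.add_argument("--covariate_columns", required=True)

args = parser.parse_args(["--covariate_columns", "age, gender,income"])


def split_columns(raw: str) -> List[str]:
    """Split a comma-separated string into trimmed, non-empty names."""
    return [name.strip() for name in raw.split(",") if name.strip()]


covariates = split_columns(args.covariate_columns)
print(covariates)
```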

covariate_columns_for_diagnostics() → List[str] | None

Return covariate columns used for diagnostics reporting.

Returns:: List of columns to keep in diagnostics or None.

Examples

formula() → str | None

Return the formula string used for model matrices.

Returns:: Formula string or None if unset.

Examples

has_batch_columns() → bool

Return True when batch columns are supplied.

Returns:: True if batch columns are set, otherwise False.

Examples

has_keep_columns() → bool

Return True when output keep columns are supplied.

Returns:: True if keep columns are set, otherwise False.

Examples

has_keep_row_column() → bool

Return True when a keep-row column is supplied.

Returns:: True if a keep-row column is set, otherwise False.

Examples

has_outcome_columns() → bool

Return True when outcome columns are explicitly supplied.

Returns:: True if outcome columns are set, otherwise False.

Examples

id_column() → str

Return the identifier column name.

Returns:: Name of the ID column.

Examples

keep_columns() → List[str] | None

Return the subset of columns to keep in outputs.

Returns:: List of columns to keep or None if unspecified.

Examples

keep_row_column() → str | None

Return the keep-row indicator column name.

Returns:: Name of the keep-row indicator column, or None if unset.

Examples

lambda_max() → float | None

Return the maximum L1 penalty setting.

Returns:: Maximum L1 penalty value or None.

Examples

lambda_min() → float | None

Return the minimum L1 penalty setting.

Returns:: Minimum L1 penalty value or None.

Examples

load_and_check_input() → DataFrame

Read the input file and log basic information.

Returns:: DataFrame loaded from the input file.

Examples

logistic_regression_kwargs() → Dict[str, Any] | None

Parse JSON keyword arguments for the IPW logistic regression model.

Returns:: Parsed keyword arguments dictionary or None.

Examples
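
A sketch of parsing a JSON string into a keyword-argument dictionary. The helper name `parse_json_kwargs` and the error raised for non-object JSON are assumptions; `max_iter` and `solver` are ordinary scikit-learn `LogisticRegression` parameters, used here only as sample values.

```python
import json
from typing import Any, Dict, Optional


def parse_json_kwargs(raw: Optional[str]) -> Optional[Dict[str, Any]]:
    """Return a dict parsed from a JSON object string, or None if unset."""
    if raw is None:
        return None
    parsed = json.loads(raw)
    if not isinstance(parsed, dict):
        raise ValueError("Expected a JSON object of keyword arguments")
    return parsed


kwargs = parse_json_kwargs('{"max_iter": 500, "solver": "lbfgs"}')
print(kwargs)
```

Rejecting JSON that is valid but not an object (a list or a number, for example) keeps the failure close to the argument that caused it.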

logistic_regression_model() → ClassifierMixin | None

Build a LogisticRegression model when IPW kwargs are supplied.

Returns:: Configured LogisticRegression instance or None.

Examples

main() → None

Run the CLI workflow from input loading to output writing.

Returns:: None.

Examples

max_de() → float | None

Return the max design effect setting.

Returns:: Maximum design effect or None if unset.

Examples
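
For intuition about what a design effect cap constrains, the sketch below computes a design effect from a set of weights, assuming Kish's approximation. This page does not state which definition is used, so treat the formula choice and the helper name `kish_design_effect` as assumptions.

```python
from typing import Sequence


def kish_design_effect(weights: Sequence[float]) -> float:
    """Kish's approximate design effect: n * sum(w^2) / (sum(w))^2."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


# Equal weights give a design effect of 1; unequal weights push it higher.
equal = kish_design_effect([1.0, 1.0, 1.0, 1.0])
skewed = kish_design_effect([1.0, 1.0, 1.0, 5.0])
print(equal, skewed)
```

Under this definition, the second set of weights would exceed a cap of 1.5, the default `max_de` shown in the `process_batch` signature.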

method() → str

Return the adjustment method name.

Returns:: The adjustment method string (for example, "ipw").

Examples

num_lambdas() → int | None

Return the number of lambda values to search over.

Returns:: Number of lambdas as an integer or None.

Examples

one_hot_encoding() → bool | None

Return the parsed one-hot encoding flag.

Returns:: True/False for one-hot encoding, or None if unset.

Examples
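
A sketch of turning a textual flag into the three-state result described above (True, False, or None). It assumes the flag arrives as a string; the accepted spellings and the helper name `parse_optional_bool` are hypothetical.

```python
from typing import Optional


def parse_optional_bool(raw: Optional[str]) -> Optional[bool]:
    """Map 'true'/'false' (case-insensitive) to bool; None stays None."""
    if raw is None:
        return None
    value = raw.strip().lower()
    if value == "true":
        return True
    if value == "false":
        return False
    raise ValueError(f"Expected 'true' or 'false', got {raw!r}")


print(parse_optional_bool("True"), parse_optional_bool("false"), parse_optional_bool(None))
```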

outcome_columns() → List[str] | None

Return the list of outcome columns if provided.

Returns:: List of outcome columns or None if unset.

Examples

process_batch(batch_df: pd.DataFrame, transformations: Dict[str, Any] | str | None = 'default', formula: str | None = None, penalty_factor: None = None, one_hot_encoding: bool = False, max_de: float | None = 1.5, lambda_min: float | None = 1e-05, lambda_max: float | None = 10, num_lambdas: int | None = 250, weight_trimming_mean_ratio: float | None = 20, sample_cls: Type[balance_sample_cls] = <class 'balance.sample_class.Sample'>, sample_package_name: str = 'balance') → Dict[str, pd.DataFrame]

Run adjustment for a batch of data and return outputs.

Parameters:

batch_df – Input data for the current batch.
transformations – Transformations argument for Sample.adjust.
formula – Optional formula for model matrices.
penalty_factor – Optional penalty factor passed to adjust.
one_hot_encoding – Whether to one-hot encode categorical features.
max_de – Maximum design effect constraint.
lambda_min – Minimum penalty value for IPW.
lambda_max – Maximum penalty value for IPW.
num_lambdas – Number of penalty values to search.
weight_trimming_mean_ratio – Mean ratio for trimming weights.
sample_cls – Sample implementation used to build sample/target.
sample_package_name – Name used in logging.

Returns:

Dict with adjusted data and diagnostics frames.

Examples

rows_to_keep_for_diagnostics() → str | None

Return the diagnostics row-filter expression.

Returns:: The pandas expression string used to filter rows, or None if unset.

Examples
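
To illustrate what a pandas expression string can look like, the sketch below applies one with `DataFrame.query`. The data, the column names, and the expression are made up, and the CLI may apply its expression through a different pandas call.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "age": [17, 25, 40, 16],
        "is_respondent": [1, 1, 0, 1],
    }
)

# Hypothetical expression; any string accepted by DataFrame.query would work here.
rows_to_keep = "age >= 18"

kept = df.query(rows_to_keep)
print(kept)
```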

sample_column() → str

Return the column indicating sample membership.

Returns:: Name of the sample indicator column.

Examples

split_sample(df: DataFrame) → Tuple[DataFrame, DataFrame]

Split the input frame into sample and target partitions.

Parameters:: df – Input DataFrame containing sample and target rows.
Returns:: A tuple of (sample_df, target_df).

Examples
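
A minimal sketch of partitioning a frame by an indicator column. It assumes that 1 marks sample rows and that every other row belongs to the target; the helper name `split_by_indicator` and the column name `is_respondent` are hypothetical.

```python
from typing import Tuple

import pandas as pd


def split_by_indicator(
    df: pd.DataFrame, sample_column: str
) -> Tuple[pd.DataFrame, pd.DataFrame]:
    """Return (sample_df, target_df); rows equal to 1 are sample, the rest target."""
    in_sample = df[sample_column] == 1
    return df[in_sample], df[~in_sample]


df = pd.DataFrame({"id": [1, 2, 3, 4], "is_respondent": [1, 0, 1, 0]})
sample_df, target_df = split_by_indicator(df, "is_respondent")
print(sample_df)
print(target_df)
```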

standardize_types() → bool

Return whether to standardize input types in Sample.from_frame.

Returns:: True if standardization is enabled, otherwise False.

Examples

transformations() → str | None

Return the transformations config for adjustment.

Returns:: Transformations setting or None if disabled.

Examples

update_attributes_for_main_used_by_adjust() → None

Prepare cached attributes for main to use in adjustment.

Returns:: None.

Examples

weight_column() → str

Return the weight column name.

Returns:: Name of the weight column.

Examples

weight_trimming_mean_ratio() → float | None

Return the mean ratio used for trimming weights.

Returns:: Weight trimming ratio or None if unset.

Examples
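
For intuition, the sketch below caps weights at a multiple of their mean. This is a simplified reading of "mean ratio" trimming: whether the library renormalizes weights afterwards is not stated on this page, and the helper name `trim_by_mean_ratio` is hypothetical.

```python
from typing import List, Sequence


def trim_by_mean_ratio(weights: Sequence[float], ratio: float) -> List[float]:
    """Cap each weight at ratio * mean(weights); no renormalization is applied."""
    cap = ratio * (sum(weights) / len(weights))
    return [min(w, cap) for w in weights]


# The mean is 3.0, so with ratio=2.0 the cap is 6.0 and only the last weight changes.
trimmed = trim_by_mean_ratio([1.0, 1.0, 1.0, 9.0], ratio=2.0)
print(trimmed)
```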

weights_impact_on_outcome_method() → str | None

Return the outcome weight impact method for diagnostics.

Returns:: The method name or None if disabled.

Examples

write_outputs(output_df: DataFrame, diagnostics_df: DataFrame) → None

Write adjusted output and diagnostics CSV files.

Parameters:

output_df – Adjusted output DataFrame to write.
diagnostics_df – Diagnostics DataFrame to write.

Returns:

None.

Examples
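
A sketch of writing two frames to CSV with plain pandas calls. The file names, the diagnostics columns, and the use of `index=False` are assumptions; the CLI takes its output paths from its own arguments.

```python
import tempfile
from pathlib import Path

import pandas as pd

output_df = pd.DataFrame({"id": ["a", "b"], "weight": [1.2, 0.8]})
diagnostics_df = pd.DataFrame({"metric": ["size"], "val": [2]})

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names, written to a temporary directory.
    output_path = Path(tmp) / "weights_out.csv"
    diagnostics_path = Path(tmp) / "diagnostics_out.csv"

    output_df.to_csv(output_path, index=False)
    diagnostics_df.to_csv(diagnostics_path, index=False)

    # Read one file back to confirm the round trip.
    round_trip = pd.read_csv(output_path)

print(round_trip)
```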

balance.cli.add_arguments_to_parser(parser: ArgumentParser) → ArgumentParser

Register CLI arguments on an argparse parser.

Parameters:: parser – Parser to add arguments to.
Returns:: The parser instance with CLI arguments registered.

Examples
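
The pattern of registering arguments on an existing parser and returning it can be sketched with `argparse` alone. The function name `add_example_arguments`, the flags, and their defaults are illustrative assumptions and do not list the real CLI's arguments.

```python
import argparse


def add_example_arguments(parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
    """Register a few illustrative flags and return the same parser."""
    # Hypothetical flags and defaults; the real CLI defines its own set.
    parser.add_argument("--input_file", required=True)
    parser.add_argument("--method", default="ipw")
    parser.add_argument("--max_de", type=float, default=None)
    return parser


parser = add_example_arguments(argparse.ArgumentParser(prog="example-cli"))
args = parser.parse_args(["--input_file", "data.csv", "--max_de", "1.5"])
print(args)
```

Returning the parser lets a caller add these arguments to a parser it already owns, for example one that also carries its own flags.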

balance.cli.main() → None

Entry point for the balance CLI.

Returns:: None.

Examples
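
A sketch of the usual entry-point shape: build a parser, parse arguments, and hand them to a helper class. `ExampleCLI`, `make_example_parser`, and the `--method` flag are stand-ins, not the module's own objects, and this `main` returns a string only so the result can be inspected (the documented entry point returns None).

```python
import argparse
from typing import List, Optional


class ExampleCLI:
    """Stand-in for a CLI helper class that wraps parsed arguments."""

    def __init__(self, args: argparse.Namespace) -> None:
        self.args = args

    def main(self) -> str:
        # A real workflow would load input, adjust, and write outputs here.
        return f"would run method={self.args.method}"


def make_example_parser() -> argparse.ArgumentParser:
    """Create a parser with a single illustrative flag."""
    parser = argparse.ArgumentParser(prog="example-cli")
    parser.add_argument("--method", default="ipw")
    return parser


def main(argv: Optional[List[str]] = None) -> str:
    """Build the parser, parse arguments, and delegate to the helper class."""
    args = make_example_parser().parse_args(argv)
    return ExampleCLI(args).main()


result = main(["--method", "ipw"])
print(result)
```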

balance.cli.make_parser() → ArgumentParser

Create and return the CLI argument parser.

Returns:: A configured ArgumentParser for the balance CLI.

Examples
