CLI tutorial
This tutorial walks through using the balance command-line interface (CLI) to adjust a sample dataset to a target. We will build a small demo dataset, run the CLI, and inspect the outputs.
A CLI is useful mainly because it slots into automation and data workflows. A command can be invoked from shell scripts, scheduled with cron, embedded in CI/CD pipelines, or orchestrated through tools like Airflow, all with minimal overhead. You can chain balance runs with other command-line tools, process batches of files in a loop, or trigger analyses from events, and the command line itself records exactly what was run. On failure the CLI returns a non-zero exit code, which automated systems can use to halt a pipeline or send an alert. In short, the CLI turns balance from an interactive tool into a building block for reproducible, production workflows.
Prerequisites
Make sure balance is installed and the balance CLI is on your PATH. You can also run the CLI via python -m balance.cli from a checkout of the repository.
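One way to check this prerequisite from Python is sketched below. It is a minimal sketch, not part of balance: `resolve_cli_command` is a hypothetical helper, and the console-script name `balance` is taken from the sentence above rather than verified here. The helper prefers an executable found on PATH and otherwise falls back to the module form.

```python
import shutil
import sys


def resolve_cli_command(executable="balance", module="balance.cli"):
    """Return the argv prefix to use for the CLI.

    Hypothetical helper: prefers an executable found on PATH and falls back
    to running the module with the current interpreter.
    """
    found = shutil.which(executable)
    if found is not None:
        return [found]
    # sys.executable keeps the subprocess in the same environment as this script
    return [sys.executable, "-m", module]


print(resolve_cli_command())
```

Using `sys.executable` instead of a bare `python` avoids picking up a different interpreter when several environments are installed.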
import os
import subprocess
import tempfile
import pandas as pd
from balance import load_data
INFO (2026-01-20 20:37:23,324) [__init__/<module> (line 72)]: Using balance version 0.15.0
balance (Version 0.15.0) loaded:
📖 Documentation: https://import-balance.org/
🛠️ Help / Issues: https://github.com/facebookresearch/balance/issues/
📄 Citation:
Sarig, T., Galili, T., & Eilat, R. (2023).
balance - a Python package for balancing biased data samples.
https://arxiv.org/abs/2307.06024
Tip: You can view this message anytime with balance.help()
Use the bundled demo data
balance ships with a small demo dataset, available via load_data(). To build the CLI input,
add a sample-indicator column and a weight column to each frame, then concatenate the sample and target frames.
target_df, sample_df = load_data()
sample_df = sample_df.copy()
target_df = target_df.copy()
sample_df["is_respondent"] = 1
target_df["is_respondent"] = 0
sample_df["weight"] = 1.0
target_df["weight"] = 1.0
load_data_input_df = pd.concat([sample_df, target_df], ignore_index=True)
load_data_input_df.head()
| | id | gender | age_group | income | happiness | is_respondent | weight |
|---|---|---|---|---|---|---|---|
| 0 | 0 | Male | 25-34 | 6.428659 | 26.043029 | 1 | 1.0 |
| 1 | 1 | Female | 18-24 | 9.940280 | 66.885485 | 1 | 1.0 |
| 2 | 2 | Male | 18-24 | 2.673623 | 37.091922 | 1 | 1.0 |
| 3 | 3 | NaN | 18-24 | 10.550308 | 49.394050 | 1 | 1.0 |
| 4 | 4 | NaN | 18-24 | 2.689994 | 72.304208 | 1 | 1.0 |
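Before handing a file to the CLI, it can help to confirm that it has the expected layout. The sketch below uses only the standard library and assumes the column names used in this tutorial (`id`, `is_respondent`, `weight`) and an indicator written as `1`/`0`. `check_input_csv` is a hypothetical helper, not a balance function, and the third row of the demo text is made up to stand in for a target row.

```python
import csv
import io

REQUIRED_COLUMNS = ("id", "is_respondent", "weight")  # names used in this tutorial


def check_input_csv(handle, covariates):
    """Return (n_sample, n_target) after checking the header and indicator values."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [c for c in (*REQUIRED_COLUMNS, *covariates) if c not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    counts = {"1": 0, "0": 0}
    for row in reader:
        flag = row["is_respondent"]
        if flag not in counts:
            raise ValueError(f"unexpected is_respondent value: {flag!r}")
        counts[flag] += 1
    return counts["1"], counts["0"]


demo = io.StringIO(
    "id,gender,age_group,income,happiness,is_respondent,weight\n"
    "0,Male,25-34,6.428659,26.043029,1,1.0\n"
    "1,Female,18-24,9.940280,66.885485,1,1.0\n"
    "2,Male,45+,3.1,50.0,0,1.0\n"  # made-up target row
)
print(check_input_csv(demo, ["gender", "age_group", "income"]))  # (2, 1)
```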
Run the CLI
We'll write the input dataset to disk, then call the CLI to compute weights and diagnostics.
with tempfile.TemporaryDirectory() as tmpdir:
    input_path = os.path.join(tmpdir, "input.csv")
    output_path = os.path.join(tmpdir, "weights_out.csv")
    diagnostics_path = os.path.join(tmpdir, "diagnostics_out.csv")
    load_data_input_df.to_csv(input_path, index=False)
    cmd = [
        "python",
        "-m",
        "balance.cli",
        "--input_file",
        input_path,
        "--output_file",
        output_path,
        "--diagnostics_output_file",
        diagnostics_path,
        "--covariate_columns",
        "gender,age_group,income",
        "--method",
        "ipw",
    ]
    print("CLI command:", " ".join(cmd))
    # check_call raises CalledProcessError if the CLI exits with a non-zero status
    subprocess.check_call(cmd)
    # Read the outputs before the temporary directory is removed
    load_data_adjusted_df = pd.read_csv(output_path)
    load_data_diagnostics_df = pd.read_csv(diagnostics_path)
load_data_adjusted_df.head()
CLI command: python -m balance.cli --input_file /tmp/tmp6mblcj1g/input.csv --output_file /tmp/tmp6mblcj1g/weights_out.csv --diagnostics_output_file /tmp/tmp6mblcj1g/diagnostics_out.csv --covariate_columns gender,age_group,income --method ipw
INFO (2026-01-20 20:37:24,837) [__init__/<module> (line 72)]: Using balance version 0.15.0
INFO (2026-01-20 20:37:24,839) [cli/main (line 1039)]: Running cli.main() using balance version 0.15.0
INFO (2026-01-20 20:37:24,839) [cli/main (line 1074)]: Attributes used by main() for running adjust: {'transformations': 'default', 'formula': None, 'penalty_factor': None, 'one_hot_encoding': True, 'max_de': 1.5, 'lambda_min': 1e-05, 'lambda_max': 10, 'num_lambdas': 250, 'weight_trimming_mean_ratio': 20.0, 'sample_cls': <class 'balance.sample_class.Sample'>, 'sample_package_name': 'balance', 'sample_package_version': '0.15.0'}
INFO (2026-01-20 20:37:24,849) [cli/load_and_check_input (line 869)]: Number of rows in input file: 11000
INFO (2026-01-20 20:37:24,849) [cli/load_and_check_input (line 875)]: Number of columns in input file: 7
WARNING (2026-01-20 20:37:24,998) [sample_class/from_frame (line 457)]: Casting id column to string
WARNING (2026-01-20 20:37:25,007) [pandas_utils/_warn_of_df_dtypes_change (line 492)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-01-20 20:37:25,007) [pandas_utils/_warn_of_df_dtypes_change (line 503)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-01-20 20:37:25,008) [pandas_utils/_warn_of_df_dtypes_change (line 506)]:
is_respondent int64
id int64
dtype: object
WARNING (2026-01-20 20:37:25,008) [pandas_utils/_warn_of_df_dtypes_change (line 507)]: The (new) dtypes saved in df (after the change):
WARNING (2026-01-20 20:37:25,009) [pandas_utils/_warn_of_df_dtypes_change (line 508)]:
is_respondent float64
id object
dtype: object
INFO (2026-01-20 20:37:25,010) [cli/process_batch (line 691)]: balance sample object:
balance Sample object
1000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness,is_respondent
WARNING (2026-01-20 20:37:25,012) [sample_class/from_frame (line 457)]: Casting id column to string
WARNING (2026-01-20 20:37:25,026) [pandas_utils/_warn_of_df_dtypes_change (line 492)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-01-20 20:37:25,026) [pandas_utils/_warn_of_df_dtypes_change (line 503)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-01-20 20:37:25,027) [pandas_utils/_warn_of_df_dtypes_change (line 506)]:
is_respondent int64
id int64
dtype: object
WARNING (2026-01-20 20:37:25,027) [pandas_utils/_warn_of_df_dtypes_change (line 507)]: The (new) dtypes saved in df (after the change):
WARNING (2026-01-20 20:37:25,027) [pandas_utils/_warn_of_df_dtypes_change (line 508)]:
is_respondent float64
id object
dtype: object
INFO (2026-01-20 20:37:25,029) [cli/process_batch (line 702)]: balance target object:
balance Sample object
10000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness,is_respondent
INFO (2026-01-20 20:37:25,034) [ipw/ipw (line 694)]: Starting ipw function
INFO (2026-01-20 20:37:25,035) [adjustment/apply_transformations (line 433)]: Adding the variables: []
INFO (2026-01-20 20:37:25,035) [adjustment/apply_transformations (line 434)]: Transforming the variables: ['gender', 'age_group', 'income']
INFO (2026-01-20 20:37:25,044) [adjustment/apply_transformations (line 469)]: Final variables in output: ['gender', 'age_group', 'income']
INFO (2026-01-20 20:37:25,051) [ipw/ipw (line 728)]: Building model matrix
INFO (2026-01-20 20:37:25,148) [ipw/ipw (line 750)]: The formula used to build the model matrix: ['income + gender + age_group + _is_na_gender']
INFO (2026-01-20 20:37:25,148) [ipw/ipw (line 753)]: The number of columns in the model matrix: 18
INFO (2026-01-20 20:37:25,148) [ipw/ipw (line 754)]: The number of rows in the model matrix: 11000
INFO (2026-01-20 20:37:26,643) [ipw/ipw (line 915)]: Done with sklearn
INFO (2026-01-20 20:37:26,643) [ipw/ipw (line 917)]: max_de: 1.5
INFO (2026-01-20 20:37:26,643) [ipw/choose_regularization (line 371)]: Starting choosing regularisation parameters
INFO (2026-01-20 20:37:34,450) [ipw/choose_regularization (line 457)]: Best regularisation:
s s_index trim design_effect asmd_improvement asmd
9 0.064155 91 2.5 1.495568 0.535793 0.090706
INFO (2026-01-20 20:37:34,451) [ipw/ipw (line 972)]: Chosen lambda: 0.06415476458273757
INFO (2026-01-20 20:37:34,452) [ipw/ipw (line 990)]: Proportion null deviance explained 0.17451470039667905
INFO (2026-01-20 20:37:34,454) [cli/process_batch (line 725)]: Succeeded with adjusting sample to target
INFO (2026-01-20 20:37:34,456) [cli/process_batch (line 726)]: balance adjusted object:
Adjusted balance Sample object with target set using ipw
1000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness,is_respondent
adjustment details:
method: ipw
weight trimming mean ratio: 2.5
design effect (Deff): 1.496
effective sample size proportion (ESSP): 0.669
effective sample size (ESS): 668.6
target:
balance Sample object
10000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness,is_respondent
3 common variables: gender,age_group,income
INFO (2026-01-20 20:37:34,456) [cli/process_batch (line 728)]: Condition on which rows to keep for diagnostics: None
INFO (2026-01-20 20:37:34,456) [cli/process_batch (line 732)]: Names of columns to keep for diagnostics: None
INFO (2026-01-20 20:37:34,456) [sample_class/diagnostics (line 1792)]: Starting computation of diagnostics of the fitting
INFO (2026-01-20 20:37:34,700) [sample_class/diagnostics (line 2013)]: Done computing diagnostics
INFO (2026-01-20 20:37:34,704) [cli/process_batch (line 741)]: balance diagnostics object:
                         metric           val            var
0                          size   1000.000000     sample_obs
1                          size      3.000000  sample_covars
2                          size  10000.000000     target_obs
3                          size      3.000000  target_covars
4           weights_diagnostics      1.495568  design_effect
..                          ...           ...            ...
91  covar_main_asmd_improvement      0.182901         income
92     covar_main_asmd_adjusted      0.173284     mean(asmd)
93   covar_main_asmd_unadjusted      0.326799     mean(asmd)
94  covar_main_asmd_improvement      0.153514     mean(asmd)
95           adjustment_failure      0.000000            NaN
[96 rows x 3 columns]
INFO (2026-01-20 20:37:34,706) [cli/main (line 1128)]: Done fitting the model, writing output
| | id | gender | age_group | income | happiness | is_respondent | weight |
|---|---|---|---|---|---|---|---|
| 0 | 0 | Male | 25-34 | 6.428659 | 26.043029 | 1.0 | 7.602327 |
| 1 | 1 | Female | 18-24 | 9.940280 | 66.885485 | 1.0 | 9.398526 |
| 2 | 2 | Male | 18-24 | 2.673623 | 37.091922 | 1.0 | 3.433026 |
| 3 | 3 | NaN | 18-24 | 10.550308 | 49.394050 | 1.0 | 6.491044 |
| 4 | 4 | NaN | 18-24 | 2.689994 | 72.304208 | 1.0 | 4.886635 |
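`subprocess.check_call` raises `CalledProcessError` when the CLI exits with a non-zero status, which is what lets a pipeline stop on failure. The following is a minimal sketch of a wrapper that also keeps the tail of stderr for logging. `run_cli` is a hypothetical helper, and the two stand-in commands exist only so the example runs without balance installed.

```python
import subprocess
import sys


def run_cli(cmd, tail_lines=5):
    """Run a command, returning its stdout or raising with the end of stderr."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        tail = "\n".join(result.stderr.splitlines()[-tail_lines:])
        raise RuntimeError(
            f"command failed with exit code {result.returncode}:\n{tail}"
        )
    return result.stdout


# Stand-ins for a real `python -m balance.cli ...` invocation
ok_cmd = [sys.executable, "-c", "print('done')"]
bad_cmd = [sys.executable, "-c", "import sys; sys.stderr.write('boom\\n'); sys.exit(3)"]

print(run_cli(ok_cmd).strip())
try:
    run_cli(bad_cmd)
except RuntimeError as err:
    print(err)
```

Keeping only the last few stderr lines is a design choice for alerting: the balance logs are long, and the final lines usually carry the error.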
Inspect diagnostics
The diagnostics output is a flat table that includes adjustment metadata and balance
metrics. The metric column identifies the type of diagnostic, while var indicates the
variable (or NaN for overall summaries). It is most useful to inspect var in the
context of the metric it belongs to. The cells below use the diagnostics from the
previous CLI run (load_data_diagnostics_df).
(
    load_data_diagnostics_df.groupby("metric")["var"]
    .apply(lambda col: sorted(col.dropna().unique()))
    .sort_index()
)
metric
adjustment_failure             []
adjustment_method              [ipw]
covar_asmd_adjusted            [age_group[T.25-34], age_group[T.35-44], age_g...
covar_asmd_improvement         [age_group[T.25-34], age_group[T.35-44], age_g...
covar_asmd_unadjusted          [age_group[T.25-34], age_group[T.35-44], age_g...
covar_main_asmd_adjusted       [age_group, gender, income, mean(asmd)]
covar_main_asmd_improvement    [age_group, gender, income, mean(asmd)]
covar_main_asmd_unadjusted     [age_group, gender, income, mean(asmd)]
ipw_model_glance               [intercept_, n_iter_]
ipw_multi_class                [auto]
ipw_penalty                    [l2]
ipw_solver                     [lbfgs]
model_coef                     [C(_is_na_gender, one_hot_encoding_greater_2)[...
model_glance                   [deviance, l1_ratio, lambda, null_deviance, pr...
size                           [sample_covars, sample_obs, target_covars, tar...
weights_diagnostics            [describe_25%, describe_50%, describe_75%, des...
Name: var, dtype: object
load_data_diagnostics_df.query("metric == 'adjustment_method'")
| | metric | val | var |
|---|---|---|---|
| 28 | adjustment_method | 0.0 | ipw |
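Because the diagnostics file is in long format, a single number is found by filtering on both metric and var. The sketch below uses only the standard library and a three-row excerpt copied from the log output of the run above; `lookup` is a hypothetical helper. It also checks that the reported improvement equals the unadjusted mean ASMD minus the adjusted one, which is a reading of these three values rather than a documented definition.

```python
import csv
import io

# Three rows copied from the diagnostics printed by the first CLI run
excerpt = io.StringIO(
    "metric,val,var\n"
    "covar_main_asmd_adjusted,0.173284,mean(asmd)\n"
    "covar_main_asmd_unadjusted,0.326799,mean(asmd)\n"
    "covar_main_asmd_improvement,0.153514,mean(asmd)\n"
)
rows = list(csv.DictReader(excerpt))


def lookup(rows, metric, var):
    """Return the single value stored for a (metric, var) pair as a float."""
    matches = [r["val"] for r in rows if r["metric"] == metric and r["var"] == var]
    if len(matches) != 1:
        raise KeyError(f"expected one row for {(metric, var)}, found {len(matches)}")
    return float(matches[0])


adjusted = lookup(rows, "covar_main_asmd_adjusted", "mean(asmd)")
unadjusted = lookup(rows, "covar_main_asmd_unadjusted", "mean(asmd)")
improvement = lookup(rows, "covar_main_asmd_improvement", "mean(asmd)")

print(f"mean ASMD: {unadjusted:.3f} -> {adjusted:.3f}")
# The difference matches the reported improvement up to rounding of the printed values
print(abs((unadjusted - adjusted) - improvement) < 1e-5)
```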
CLI Help and Arguments
You can view all available CLI arguments using --help. Because the full output is long,
the snippet below prints only the first 40 lines.
# Print a shorter CLI help snippet
help_output = subprocess.run(
    ["python", "-m", "balance.cli", "--help"],
    check=False,
    capture_output=True,
    text=True,
).stdout
print("\n".join(help_output.splitlines()[:40]))
usage: cli.py [-h] --input_file INPUT_FILE --output_file OUTPUT_FILE
[--diagnostics_output_file DIAGNOSTICS_OUTPUT_FILE]
[--method METHOD] [--sample_column SAMPLE_COLUMN]
[--id_column ID_COLUMN] [--weight_column WEIGHT_COLUMN]
--covariate_columns COVARIATE_COLUMNS
[--outcome_columns OUTCOME_COLUMNS]
[--covariate_columns_for_diagnostics COVARIATE_COLUMNS_FOR_DIAGNOSTICS]
[--rows_to_keep_for_diagnostics ROWS_TO_KEEP_FOR_DIAGNOSTICS]
[--batch_columns BATCH_COLUMNS] [--keep_columns KEEP_COLUMNS]
[--keep_row_column KEEP_ROW_COLUMN]
[--sep_input_file SEP_INPUT_FILE]
[--sep_output_file SEP_OUTPUT_FILE]
[--sep_diagnostics_output_file SEP_DIAGNOSTICS_OUTPUT_FILE]
[--no_output_header] [--succeed_on_weighting_failure]
[--max_de MAX_DE] [--lambda_min LAMBDA_MIN]
[--lambda_max LAMBDA_MAX] [--num_lambdas NUM_LAMBDAS]
[--ipw_logistic_regression_kwargs IPW_LOGISTIC_REGRESSION_KWARGS]
[--weight_trimming_mean_ratio WEIGHT_TRIMMING_MEAN_RATIO]
[--one_hot_encoding ONE_HOT_ENCODING]
[--transformations TRANSFORMATIONS] [--formula FORMULA]
[--return_df_with_original_dtypes]
[--standardize_types STANDARDIZE_TYPES]
optional arguments:
-h, --help show this help message and exit
--input_file INPUT_FILE
Path to input sample/target
--output_file OUTPUT_FILE
Path to write output weights
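To search the full help text without scrolling, you can extract the flag names with a regular expression. This is a minimal sketch, run here on a few usage lines copied from the output above; in practice `usage_text` would be the `help_output` string captured by the cell above. `list_flags` is a hypothetical helper.

```python
import re

# A few usage lines copied from the help output above
usage_text = """\
usage: cli.py [-h] --input_file INPUT_FILE --output_file OUTPUT_FILE
              [--diagnostics_output_file DIAGNOSTICS_OUTPUT_FILE]
              [--method METHOD] [--sample_column SAMPLE_COLUMN]
              [--max_de MAX_DE] [--lambda_min LAMBDA_MIN]
"""


def list_flags(text):
    """Return the long option names in order of first appearance."""
    seen = []
    for flag in re.findall(r"--[a-z][a-z0-9_]*", text):
        if flag not in seen:
            seen.append(flag)
    return seen


flags = list_flags(usage_text)
print(flags)
print([f for f in flags if "lambda" in f])
```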
Key CLI Arguments Summary
Here are the most commonly used arguments:
| Argument | Default | Description |
|---|---|---|
| --method | ipw | Adjustment method: ipw, cbps, or rake |
| --max_de | 1.5 | Maximum design effect. Set to None to use lambda_1se instead |
| --lambda_min | 1e-05 | Lower bound of the regularization (lambda) search range (IPW only) |
| --lambda_max | 10 | Upper bound of the regularization (lambda) search range (IPW only) |
| --num_lambdas | 250 | Number of lambda values to search (IPW only) |
| --weight_trimming_mean_ratio | 20.0 | Trim weights above mean(weights) * ratio |
| --transformations | default | Covariate transformations. Use None to disable |
| --formula | None | Custom model formula (e.g., "gender + income") |
| --one_hot_encoding | True | One-hot encode categorical features |
| --batch_columns | None | Columns to group by for batch processing |
| --keep_columns | None | Subset of columns to include in output |
| --outcome_columns | None | Columns treated as outcomes (not covariates) |
| --ipw_logistic_regression_kwargs | None | JSON string of kwargs for sklearn LogisticRegression |
| --succeed_on_weighting_failure | False | Return null weights instead of failing on errors |
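To make the --max_de row concrete, here is a minimal sketch of Kish's design effect and the effective sample size derived from it. That balance reports Kish's formula is an assumption here; it is at least consistent with the logs in this tutorial, where 1000 observations with a design effect of 1.495568 give an ESS of 668.6 (1000 / 1.495568 ≈ 668.6). The weights in the example are made up.

```python
def design_effect(weights):
    """Kish's approximate design effect: n * sum(w^2) / (sum(w))^2."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


def effective_sample_size(weights):
    """Number of equally weighted observations with the same precision (n / Deff)."""
    return len(weights) / design_effect(weights)


equal = [1.0] * 4
uneven = [1.0, 1.0, 1.0, 5.0]  # made-up weights for illustration

print(design_effect(equal))    # 1.0: equal weights cost nothing
print(design_effect(uneven))   # 1.75
print(round(effective_sample_size(uneven), 3))  # 2.286
```

Under this definition a cap of 1.5 means the weighted sample may behave like no fewer than two thirds of its nominal size.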
Example: Tuning IPW parameters
Below we run the CLI with custom regularization settings and a custom logistic regression solver:
with tempfile.TemporaryDirectory() as tmpdir:
    input_path = os.path.join(tmpdir, "input.csv")
    output_path = os.path.join(tmpdir, "weights_tuned.csv")
    diagnostics_path = os.path.join(tmpdir, "diagnostics_tuned.csv")
    load_data_input_df.to_csv(input_path, index=False)
    cmd = [
        "python",
        "-m",
        "balance.cli",
        "--input_file", input_path,
        "--output_file", output_path,
        "--diagnostics_output_file", diagnostics_path,
        "--covariate_columns", "gender,age_group,income",
        "--method", "ipw",
        # Tuning parameters
        "--max_de", "2.0",
        "--lambda_min", "1e-06",
        "--lambda_max", "100",
        "--num_lambdas", "500",
        "--weight_trimming_mean_ratio", "10.0",
        # Custom logistic regression settings
        "--ipw_logistic_regression_kwargs", '{"solver": "liblinear", "max_iter": 500}',
    ]
    print("CLI command:")
    print(" ".join(cmd))
    subprocess.check_call(cmd)
    tuned_adjusted_df = pd.read_csv(output_path)
tuned_adjusted_df.head()
CLI command:
python -m balance.cli --input_file /tmp/tmpe0eif4oj/input.csv --output_file /tmp/tmpe0eif4oj/weights_tuned.csv --diagnostics_output_file /tmp/tmpe0eif4oj/diagnostics_tuned.csv --covariate_columns gender,age_group,income --method ipw --max_de 2.0 --lambda_min 1e-06 --lambda_max 100 --num_lambdas 500 --weight_trimming_mean_ratio 10.0 --ipw_logistic_regression_kwargs {"solver": "liblinear", "max_iter": 500}
INFO (2026-01-20 20:37:38,128) [__init__/<module> (line 72)]: Using balance version 0.15.0
INFO (2026-01-20 20:37:38,129) [cli/main (line 1039)]: Running cli.main() using balance version 0.15.0
INFO (2026-01-20 20:37:38,130) [cli/main (line 1074)]: Attributes used by main() for running adjust: {'transformations': 'default', 'formula': None, 'penalty_factor': None, 'one_hot_encoding': True, 'max_de': 2.0, 'lambda_min': 1e-06, 'lambda_max': 100.0, 'num_lambdas': 500, 'weight_trimming_mean_ratio': 10.0, 'sample_cls': <class 'balance.sample_class.Sample'>, 'sample_package_name': 'balance', 'sample_package_version': '0.15.0'}
INFO (2026-01-20 20:37:38,140) [cli/load_and_check_input (line 869)]: Number of rows in input file: 11000
INFO (2026-01-20 20:37:38,140) [cli/load_and_check_input (line 875)]: Number of columns in input file: 7
WARNING (2026-01-20 20:37:38,288) [sample_class/from_frame (line 457)]: Casting id column to string
WARNING (2026-01-20 20:37:38,296) [pandas_utils/_warn_of_df_dtypes_change (line 492)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-01-20 20:37:38,297) [pandas_utils/_warn_of_df_dtypes_change (line 503)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-01-20 20:37:38,297) [pandas_utils/_warn_of_df_dtypes_change (line 506)]:
is_respondent int64
id int64
dtype: object
WARNING (2026-01-20 20:37:38,298) [pandas_utils/_warn_of_df_dtypes_change (line 507)]: The (new) dtypes saved in df (after the change):
WARNING (2026-01-20 20:37:38,298) [pandas_utils/_warn_of_df_dtypes_change (line 508)]:
is_respondent float64
id object
dtype: object
INFO (2026-01-20 20:37:38,300) [cli/process_batch (line 691)]: balance sample object:
balance Sample object
1000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness,is_respondent
WARNING (2026-01-20 20:37:38,302) [sample_class/from_frame (line 457)]: Casting id column to string
WARNING (2026-01-20 20:37:38,316) [pandas_utils/_warn_of_df_dtypes_change (line 492)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-01-20 20:37:38,316) [pandas_utils/_warn_of_df_dtypes_change (line 503)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-01-20 20:37:38,317) [pandas_utils/_warn_of_df_dtypes_change (line 506)]:
is_respondent int64
id int64
dtype: object
WARNING (2026-01-20 20:37:38,317) [pandas_utils/_warn_of_df_dtypes_change (line 507)]: The (new) dtypes saved in df (after the change):
WARNING (2026-01-20 20:37:38,317) [pandas_utils/_warn_of_df_dtypes_change (line 508)]:
is_respondent float64
id object
dtype: object
INFO (2026-01-20 20:37:38,319) [cli/process_batch (line 702)]: balance target object:
balance Sample object
10000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness,is_respondent
INFO (2026-01-20 20:37:38,324) [ipw/ipw (line 694)]: Starting ipw function
INFO (2026-01-20 20:37:38,325) [adjustment/apply_transformations (line 433)]: Adding the variables: []
INFO (2026-01-20 20:37:38,325) [adjustment/apply_transformations (line 434)]: Transforming the variables: ['gender', 'age_group', 'income']
INFO (2026-01-20 20:37:38,335) [adjustment/apply_transformations (line 469)]: Final variables in output: ['gender', 'age_group', 'income']
INFO (2026-01-20 20:37:38,342) [ipw/ipw (line 728)]: Building model matrix
INFO (2026-01-20 20:37:38,438) [ipw/ipw (line 750)]: The formula used to build the model matrix: ['income + gender + age_group + _is_na_gender']
INFO (2026-01-20 20:37:38,438) [ipw/ipw (line 753)]: The number of columns in the model matrix: 18
INFO (2026-01-20 20:37:38,438) [ipw/ipw (line 754)]: The number of rows in the model matrix: 11000
INFO (2026-01-20 20:37:38,465) [ipw/ipw (line 915)]: Done with sklearn
INFO (2026-01-20 20:37:38,465) [ipw/ipw (line 917)]: max_de: 2.0
INFO (2026-01-20 20:37:38,465) [ipw/choose_regularization (line 371)]: Starting choosing regularisation parameters
INFO (2026-01-20 20:37:42,541) [ipw/choose_regularization (line 457)]: Best regularisation:
s s_index trim design_effect asmd_improvement asmd
6 NaN 0 2.5 1.714071 0.634917 0.071337
INFO (2026-01-20 20:37:42,542) [ipw/ipw (line 972)]: Chosen lambda: nan
INFO (2026-01-20 20:37:42,543) [ipw/ipw (line 990)]: Proportion null deviance explained 0.18280833369391158
INFO (2026-01-20 20:37:42,545) [cli/process_batch (line 725)]: Succeeded with adjusting sample to target
INFO (2026-01-20 20:37:42,547) [cli/process_batch (line 726)]: balance adjusted object:
Adjusted balance Sample object with target set using ipw
1000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness,is_respondent
adjustment details:
method: ipw
weight trimming mean ratio: 2.5
design effect (Deff): 1.714
effective sample size proportion (ESSP): 0.583
effective sample size (ESS): 583.4
target:
balance Sample object
10000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness,is_respondent
3 common variables: gender,age_group,income
INFO (2026-01-20 20:37:42,547) [cli/process_batch (line 728)]: Condition on which rows to keep for diagnostics: None
INFO (2026-01-20 20:37:42,547) [cli/process_batch (line 732)]: Names of columns to keep for diagnostics: None
INFO (2026-01-20 20:37:42,547) [sample_class/diagnostics (line 1792)]: Starting computation of diagnostics of the fitting
INFO (2026-01-20 20:37:42,791) [sample_class/diagnostics (line 2013)]: Done computing diagnostics
INFO (2026-01-20 20:37:42,795) [cli/process_batch (line 741)]: balance diagnostics object:
                         metric           val            var
0                          size   1000.000000     sample_obs
1                          size      3.000000  sample_covars
2                          size  10000.000000     target_obs
3                          size      3.000000  target_covars
4           weights_diagnostics      1.714071  design_effect
..                          ...           ...            ...
91  covar_main_asmd_improvement      0.225463         income
92     covar_main_asmd_adjusted      0.143344     mean(asmd)
93   covar_main_asmd_unadjusted      0.326799     mean(asmd)
94  covar_main_asmd_improvement      0.183455     mean(asmd)
95           adjustment_failure      0.000000            NaN
[96 rows x 3 columns]
INFO (2026-01-20 20:37:42,797) [cli/main (line 1128)]: Done fitting the model, writing output
| | id | gender | age_group | income | happiness | is_respondent | weight |
|---|---|---|---|---|---|---|---|
| 0 | 0 | Male | 25-34 | 6.428659 | 26.043029 | 1.0 | 6.714531 |
| 1 | 1 | Female | 18-24 | 9.940280 | 66.885485 | 1.0 | 8.721215 |
| 2 | 2 | Male | 18-24 | 2.673623 | 37.091922 | 1.0 | 2.537674 |
| 3 | 3 | NaN | 18-24 | 10.550308 | 49.394050 | 1.0 | 5.587013 |
| 4 | 4 | NaN | 18-24 | 2.689994 | 72.304208 | 1.0 | 3.883128 |
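The command printed by this example shows the JSON argument without shell quoting, so it cannot be pasted into a shell as is. Here is a small sketch of two standard-library helpers that avoid this: `json.dumps` builds the --ipw_logistic_regression_kwargs value from a dict, and `shlex.join` prints a shell-quoted command. The file names are placeholders, and the command is only printed, not run.

```python
import json
import shlex

lr_kwargs = {"solver": "liblinear", "max_iter": 500}

cmd = [
    "python", "-m", "balance.cli",
    "--input_file", "input.csv",            # placeholder path
    "--output_file", "weights_tuned.csv",   # placeholder path
    "--covariate_columns", "gender,age_group,income",
    "--method", "ipw",
    "--ipw_logistic_regression_kwargs", json.dumps(lr_kwargs),
]

printable = shlex.join(cmd)  # quotes arguments that contain spaces or braces
print(printable)

# Splitting the printed string with shell rules recovers the original list
print(shlex.split(printable) == cmd)
```

Passing the list form to `subprocess` needs no quoting at all; the quoting matters only for the human-readable copy kept in logs.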
Example: Using a Custom Formula
The --formula argument allows you to specify a custom model formula, including interaction
terms. When using --formula, you should typically also set --transformations=None to
prevent automatic transformations from interfering with your custom formula.
The formula uses patsy/R-style syntax:
- gender + income: additive terms (no interaction)
- gender * income: equivalent to gender + income + gender:income (main effects + interaction)
- gender:income: only the interaction term
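The expansion of `*` can be sketched in a few lines. This only illustrates the rule stated above for plain variable names; it is not patsy's parser, and `expand_product` is a hypothetical helper.

```python
from itertools import combinations


def expand_product(formula):
    """Expand 'a * b * ...' into main effects plus all interactions, by order."""
    names = [name.strip() for name in formula.split("*")]
    terms = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            terms.append(":".join(combo))
    return terms


print(" + ".join(expand_product("gender * income")))
print(" + ".join(expand_product("gender*income*age_group")))
```

With three variables the product expands to seven terms: three main effects, three two-way interactions, and one three-way interaction.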
with tempfile.TemporaryDirectory() as tmpdir:
    input_path = os.path.join(tmpdir, "input.csv")
    output_path = os.path.join(tmpdir, "weights_formula.csv")
    diagnostics_path = os.path.join(tmpdir, "diagnostics_formula.csv")
    # Use the demo data for the formula example
    load_data_input_df.to_csv(input_path, index=False)
    cmd = [
        "python",
        "-m",
        "balance.cli",
        "--input_file", input_path,
        "--output_file", output_path,
        "--diagnostics_output_file", diagnostics_path,
        "--covariate_columns", "gender,age_group,income",
        "--method", "ipw",
        # Disable transformations to use raw covariates in formula
        "--transformations", "None",
        # Use a formula with interaction term
        "--formula", "gender*income",
    ]
    print("CLI command with custom formula:")
    print(" ".join(cmd))
    subprocess.check_call(cmd)
    formula_diagnostics_df = pd.read_csv(diagnostics_path)
# Check model coefficients to verify formula was applied
print("\nModel coefficients (showing interaction term):")
print(formula_diagnostics_df.query("metric == 'model_coef'")[["var", "val"]])
CLI command with custom formula:
python -m balance.cli --input_file /tmp/tmpw3y1r_ra/input.csv --output_file /tmp/tmpw3y1r_ra/weights_formula.csv --diagnostics_output_file /tmp/tmpw3y1r_ra/diagnostics_formula.csv --covariate_columns gender,age_group,income --method ipw --transformations None --formula gender*income
INFO (2026-01-20 20:37:44,511) [__init__/<module> (line 72)]: Using balance version 0.15.0
INFO (2026-01-20 20:37:44,512) [cli/main (line 1039)]: Running cli.main() using balance version 0.15.0
INFO (2026-01-20 20:37:44,513) [cli/main (line 1074)]: Attributes used by main() for running adjust: {'transformations': None, 'formula': 'gender*income', 'penalty_factor': None, 'one_hot_encoding': True, 'max_de': 1.5, 'lambda_min': 1e-05, 'lambda_max': 10, 'num_lambdas': 250, 'weight_trimming_mean_ratio': 20.0, 'sample_cls': <class 'balance.sample_class.Sample'>, 'sample_package_name': 'balance', 'sample_package_version': '0.15.0'}
INFO (2026-01-20 20:37:44,523) [cli/load_and_check_input (line 869)]: Number of rows in input file: 11000
INFO (2026-01-20 20:37:44,523) [cli/load_and_check_input (line 875)]: Number of columns in input file: 7
WARNING (2026-01-20 20:37:44,672) [sample_class/from_frame (line 457)]: Casting id column to string
WARNING (2026-01-20 20:37:44,681) [pandas_utils/_warn_of_df_dtypes_change (line 492)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-01-20 20:37:44,681) [pandas_utils/_warn_of_df_dtypes_change (line 503)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-01-20 20:37:44,682) [pandas_utils/_warn_of_df_dtypes_change (line 506)]:
is_respondent int64
id int64
dtype: object
WARNING (2026-01-20 20:37:44,682) [pandas_utils/_warn_of_df_dtypes_change (line 507)]: The (new) dtypes saved in df (after the change):
WARNING (2026-01-20 20:37:44,683) [pandas_utils/_warn_of_df_dtypes_change (line 508)]:
is_respondent float64
id object
dtype: object
INFO (2026-01-20 20:37:44,684) [cli/process_batch (line 691)]: balance sample object:
balance Sample object
1000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness,is_respondent
WARNING (2026-01-20 20:37:44,686) [sample_class/from_frame (line 457)]: Casting id column to string
WARNING (2026-01-20 20:37:44,700) [pandas_utils/_warn_of_df_dtypes_change (line 492)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-01-20 20:37:44,700) [pandas_utils/_warn_of_df_dtypes_change (line 503)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-01-20 20:37:44,701) [pandas_utils/_warn_of_df_dtypes_change (line 506)]:
is_respondent int64
id int64
dtype: object
WARNING (2026-01-20 20:37:44,701) [pandas_utils/_warn_of_df_dtypes_change (line 507)]: The (new) dtypes saved in df (after the change):
WARNING (2026-01-20 20:37:44,702) [pandas_utils/_warn_of_df_dtypes_change (line 508)]:
is_respondent float64
id object
dtype: object
INFO (2026-01-20 20:37:44,704) [cli/process_batch (line 702)]: balance target object:
balance Sample object
10000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness,is_respondent
INFO (2026-01-20 20:37:44,708) [ipw/ipw (line 694)]: Starting ipw function
INFO (2026-01-20 20:37:44,709) [ipw/ipw (line 728)]: Building model matrix
INFO (2026-01-20 20:37:44,751) [ipw/ipw (line 750)]: The formula used to build the model matrix: ['gender*income']
INFO (2026-01-20 20:37:44,752) [ipw/ipw (line 753)]: The number of columns in the model matrix: 7
INFO (2026-01-20 20:37:44,752) [ipw/ipw (line 754)]: The number of rows in the model matrix: 11000
INFO (2026-01-20 20:37:46,079) [ipw/ipw (line 915)]: Done with sklearn
INFO (2026-01-20 20:37:46,079) [ipw/ipw (line 917)]: max_de: 1.5
INFO (2026-01-20 20:37:46,079) [ipw/choose_regularization (line 371)]: Starting choosing regularisation parameters
INFO (2026-01-20 20:37:52,076) [ipw/choose_regularization (line 457)]: Best regularisation:
s s_index trim design_effect asmd_improvement asmd
9 0.043507 98 5.0 1.496216 0.517269 0.157756
INFO (2026-01-20 20:37:52,078) [ipw/ipw (line 972)]: Chosen lambda: 0.043506507030756265
INFO (2026-01-20 20:37:52,078) [ipw/ipw (line 990)]: Proportion null deviance explained 0.09595811553953071
WARNING (2026-01-20 20:37:52,078) [ipw/ipw (line 998)]: The propensity model has low fraction null deviance explained (0.09595811553953071). Results may not be accurate
INFO (2026-01-20 20:37:52,081) [cli/process_batch (line 725)]: Succeeded with adjusting sample to target
INFO (2026-01-20 20:37:52,083) [cli/process_batch (line 726)]: balance adjusted object:
Adjusted balance Sample object with target set using ipw
1000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness,is_respondent
adjustment details:
method: ipw
weight trimming mean ratio: 5.0
design effect (Deff): 1.496
effective sample size proportion (ESSP): 0.668
effective sample size (ESS): 668.4
target:
balance Sample object
10000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness,is_respondent
3 common variables: gender,age_group,income
INFO (2026-01-20 20:37:52,083) [cli/process_batch (line 728)]: Condition on which rows to keep for diagnostics: None
INFO (2026-01-20 20:37:52,083) [cli/process_batch (line 732)]: Names of columns to keep for diagnostics: None
INFO (2026-01-20 20:37:52,083) [sample_class/diagnostics (line 1792)]: Starting computation of diagnostics of the fitting
INFO (2026-01-20 20:37:52,329) [sample_class/diagnostics (line 2013)]: Done computing diagnostics
INFO (2026-01-20 20:37:52,333) [cli/process_batch (line 741)]: balance diagnostics object:
                         metric           val            var
0                          size   1000.000000     sample_obs
1                          size      3.000000  sample_covars
2                          size  10000.000000     target_obs
3                          size      3.000000  target_covars
4           weights_diagnostics      1.496216  design_effect
..                          ...           ...            ...
80  covar_main_asmd_improvement      0.301914         income
81     covar_main_asmd_adjusted      0.157756     mean(asmd)
82   covar_main_asmd_unadjusted      0.326799     mean(asmd)
83  covar_main_asmd_improvement      0.169043     mean(asmd)
84           adjustment_failure      0.000000            NaN
[85 rows x 3 columns]
INFO (2026-01-20 20:37:52,335) [cli/main (line 1128)]: Done fitting the model, writing output
Model coefficients (showing interaction term):
var val
40 intercept 0.458758
41 C(gender, one_hot_encoding_greater_2)[Female] -0.189079
42 C(gender, one_hot_encoding_greater_2)[Female]:... -0.225013
43 C(gender, one_hot_encoding_greater_2)[Male] 0.178582
44 C(gender, one_hot_encoding_greater_2)[Male]:in... -0.198394
45 C(gender, one_hot_encoding_greater_2)[_NA] 0.007039
46 C(gender, one_hot_encoding_greater_2)[_NA]:income -0.091899
47 income -0.372909
Batch Processing Example¶
The --batch_columns argument allows you to run separate adjustments for each unique
combination of values in the specified columns. This is useful when you want to compute
weights independently for different subgroups (e.g., by gender or region).
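Conceptually, batching partitions the input rows by the tuple of values found in the batch columns, and the logs of a batched run label each batch with such a tuple, for example ('Female',). The following is a minimal sketch of that partitioning in plain Python. The helper split_into_batches and the toy rows are hypothetical illustrations, not the CLI's internal implementation.

```python
from collections import defaultdict


def split_into_batches(rows, batch_columns):
    """Group row dicts by the tuple of their values in batch_columns."""
    batches = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in batch_columns)
        batches[key].append(row)
    return dict(batches)


# Hypothetical toy rows; the real CLI input carries more columns.
rows = [
    {"id": 1, "gender": "Female", "is_respondent": 1},
    {"id": 2, "gender": "Male", "is_respondent": 1},
    {"id": 3, "gender": "Female", "is_respondent": 0},
    {"id": 4, "gender": "Male", "is_respondent": 0},
]

batches = split_into_batches(rows, ["gender"])
for key, members in sorted(batches.items()):
    # Each batch holds its own sample rows and target rows.
    print(key, [r["id"] for r in members])
```

Each batch then contains both respondents and target rows for that subgroup, which is why every batch can be adjusted on its own.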
# Create a dataset with a batch column for gender
batch_input_df = load_data_input_df.copy()
# The 'gender' column has values like 'Female', 'Male', and possibly NA
# Filter to only rows with non-null gender for this example
batch_input_df = batch_input_df[batch_input_df["gender"].notna()].copy()
print(f"Rows after filtering: {len(batch_input_df)}")
print(f"Gender distribution:\n{batch_input_df['gender'].value_counts()}")
Rows after filtering: 10014
Gender distribution:
gender
Male 5195
Female 4819
Name: count, dtype: int64
with tempfile.TemporaryDirectory() as tmpdir:
    input_path = os.path.join(tmpdir, "input_batch.csv")
    output_path = os.path.join(tmpdir, "weights_batch.csv")
    diagnostics_path = os.path.join(tmpdir, "diagnostics_batch.csv")
    batch_input_df.to_csv(input_path, index=False)
    cmd = [
        "python",
        "-m",
        "balance.cli",
        "--input_file", input_path,
        "--output_file", output_path,
        "--diagnostics_output_file", diagnostics_path,
        "--covariate_columns", "age_group,income",  # Note: gender is now used as batch column
        "--outcome_columns", "happiness",
        "--batch_columns", "gender",  # Process each gender separately
        "--method", "ipw",
    ]
    print("CLI command with batch processing:")
    print(" ".join(cmd))
    subprocess.check_call(cmd)
    # Read the outputs before the temporary directory is removed
    batch_adjusted_df = pd.read_csv(output_path)
    batch_diagnostics_df = pd.read_csv(diagnostics_path)
print(f"\nOutput rows: {len(batch_adjusted_df)}")
batch_adjusted_df.head()
CLI command with batch processing:
python -m balance.cli --input_file /tmp/tmpzxn5xte8/input_batch.csv --output_file /tmp/tmpzxn5xte8/weights_batch.csv --diagnostics_output_file /tmp/tmpzxn5xte8/diagnostics_batch.csv --covariate_columns age_group,income --outcome_columns happiness --batch_columns gender --method ipw
INFO (2026-01-20 20:37:54,059) [__init__/<module> (line 72)]: Using balance version 0.15.0
INFO (2026-01-20 20:37:54,061) [cli/main (line 1039)]: Running cli.main() using balance version 0.15.0
INFO (2026-01-20 20:37:54,061) [cli/main (line 1074)]: Attributes used by main() for running adjust: {'transformations': 'default', 'formula': None, 'penalty_factor': None, 'one_hot_encoding': True, 'max_de': 1.5, 'lambda_min': 1e-05, 'lambda_max': 10, 'num_lambdas': 250, 'weight_trimming_mean_ratio': 20.0, 'sample_cls': <class 'balance.sample_class.Sample'>, 'sample_package_name': 'balance', 'sample_package_version': '0.15.0'}
INFO (2026-01-20 20:37:54,071) [cli/load_and_check_input (line 869)]: Number of rows in input file: 10014
INFO (2026-01-20 20:37:54,071) [cli/load_and_check_input (line 875)]: Number of columns in input file: 7
INFO (2026-01-20 20:37:54,072) [cli/main (line 1085)]: Running weighting for batch = ('Female',)
WARNING (2026-01-20 20:37:54,223) [sample_class/from_frame (line 457)]: Casting id column to string
WARNING (2026-01-20 20:37:54,230) [pandas_utils/_warn_of_df_dtypes_change (line 492)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-01-20 20:37:54,230) [pandas_utils/_warn_of_df_dtypes_change (line 503)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-01-20 20:37:54,231) [pandas_utils/_warn_of_df_dtypes_change (line 506)]:
id int64
is_respondent int64
dtype: object
WARNING (2026-01-20 20:37:54,231) [pandas_utils/_warn_of_df_dtypes_change (line 507)]: The (new) dtypes saved in df (after the change):
WARNING (2026-01-20 20:37:54,232) [pandas_utils/_warn_of_df_dtypes_change (line 508)]:
id object
is_respondent float64
dtype: object
INFO (2026-01-20 20:37:54,233) [cli/process_batch (line 691)]: balance sample object:
balance Sample object
268 observations x 2 variables: age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness
WARNING (2026-01-20 20:37:54,234) [sample_class/from_frame (line 457)]: Casting id column to string
WARNING (2026-01-20 20:37:54,244) [pandas_utils/_warn_of_df_dtypes_change (line 492)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-01-20 20:37:54,244) [pandas_utils/_warn_of_df_dtypes_change (line 503)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-01-20 20:37:54,245) [pandas_utils/_warn_of_df_dtypes_change (line 506)]:
id int64
is_respondent int64
dtype: object
WARNING (2026-01-20 20:37:54,245) [pandas_utils/_warn_of_df_dtypes_change (line 507)]: The (new) dtypes saved in df (after the change):
WARNING (2026-01-20 20:37:54,246) [pandas_utils/_warn_of_df_dtypes_change (line 508)]:
id object
is_respondent float64
dtype: object
INFO (2026-01-20 20:37:54,247) [cli/process_batch (line 702)]: balance target object:
balance Sample object
4551 observations x 2 variables: age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness
INFO (2026-01-20 20:37:54,250) [ipw/ipw (line 694)]: Starting ipw function
INFO (2026-01-20 20:37:54,251) [adjustment/apply_transformations (line 433)]: Adding the variables: []
INFO (2026-01-20 20:37:54,251) [adjustment/apply_transformations (line 434)]: Transforming the variables: ['age_group', 'income']
INFO (2026-01-20 20:37:54,256) [adjustment/apply_transformations (line 469)]: Final variables in output: ['age_group', 'income']
INFO (2026-01-20 20:37:54,260) [ipw/ipw (line 728)]: Building model matrix
INFO (2026-01-20 20:37:54,293) [ipw/ipw (line 750)]: The formula used to build the model matrix: ['income + age_group']
INFO (2026-01-20 20:37:54,293) [ipw/ipw (line 753)]: The number of columns in the model matrix: 14
INFO (2026-01-20 20:37:54,293) [ipw/ipw (line 754)]: The number of rows in the model matrix: 4819
INFO (2026-01-20 20:37:55,233) [ipw/ipw (line 915)]: Done with sklearn
INFO (2026-01-20 20:37:55,233) [ipw/ipw (line 917)]: max_de: 1.5
INFO (2026-01-20 20:37:55,233) [ipw/choose_regularization (line 371)]: Starting choosing regularisation parameters
INFO (2026-01-20 20:37:59,335) [ipw/choose_regularization (line 457)]: Best regularisation:
s s_index trim design_effect asmd_improvement asmd
6 0.105705 82 5.0 1.4897 0.494125 0.098702
INFO (2026-01-20 20:37:59,336) [ipw/ipw (line 972)]: Chosen lambda: 0.10570520810009826
INFO (2026-01-20 20:37:59,337) [ipw/ipw (line 990)]: Proportion null deviance explained 0.14889612147544162
INFO (2026-01-20 20:37:59,339) [cli/process_batch (line 725)]: Succeeded with adjusting sample to target
INFO (2026-01-20 20:37:59,341) [cli/process_batch (line 726)]: balance adjusted object:
Adjusted balance Sample object with target set using ipw
268 observations x 2 variables: age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness
adjustment details:
method: ipw
weight trimming mean ratio: 5.0
design effect (Deff): 1.490
effective sample size proportion (ESSP): 0.671
effective sample size (ESS): 179.9
target:
balance Sample object
4551 observations x 2 variables: age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness
2 common variables: age_group,income
INFO (2026-01-20 20:37:59,341) [cli/process_batch (line 728)]: Condition on which rows to keep for diagnostics: None
INFO (2026-01-20 20:37:59,341) [cli/process_batch (line 732)]: Names of columns to keep for diagnostics: None
INFO (2026-01-20 20:37:59,341) [sample_class/diagnostics (line 1792)]: Starting computation of diagnostics of the fitting
INFO (2026-01-20 20:37:59,470) [sample_class/diagnostics (line 2013)]: Done computing diagnostics
INFO (2026-01-20 20:37:59,474) [cli/process_batch (line 741)]: balance diagnostics object: metric val var
0 size 268.000000 sample_obs
1 size 2.000000 sample_covars
2 size 4551.000000 target_obs
3 size 2.000000 target_covars
4 weights_diagnostics 1.489700 design_effect
.. ... ... ...
78 covar_main_asmd_improvement 0.185597 income
79 covar_main_asmd_adjusted 0.220390 mean(asmd)
80 covar_main_asmd_unadjusted 0.422500 mean(asmd)
81 covar_main_asmd_improvement 0.202110 mean(asmd)
82 adjustment_failure 0.000000 NaN
[83 rows x 3 columns]
INFO (2026-01-20 20:37:59,476) [cli/main (line 1102)]: Done processing batch ('Female',)
INFO (2026-01-20 20:37:59,476) [cli/main (line 1085)]: Running weighting for batch = ('Male',)
WARNING (2026-01-20 20:37:59,477) [sample_class/from_frame (line 457)]: Casting id column to string
WARNING (2026-01-20 20:37:59,484) [pandas_utils/_warn_of_df_dtypes_change (line 492)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-01-20 20:37:59,484) [pandas_utils/_warn_of_df_dtypes_change (line 503)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-01-20 20:37:59,485) [pandas_utils/_warn_of_df_dtypes_change (line 506)]:
id int64
is_respondent int64
dtype: object
WARNING (2026-01-20 20:37:59,485) [pandas_utils/_warn_of_df_dtypes_change (line 507)]: The (new) dtypes saved in df (after the change):
WARNING (2026-01-20 20:37:59,486) [pandas_utils/_warn_of_df_dtypes_change (line 508)]:
id object
is_respondent float64
dtype: object
INFO (2026-01-20 20:37:59,487) [cli/process_batch (line 691)]: balance sample object:
balance Sample object
644 observations x 2 variables: age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness
WARNING (2026-01-20 20:37:59,488) [sample_class/from_frame (line 457)]: Casting id column to string
WARNING (2026-01-20 20:37:59,497) [pandas_utils/_warn_of_df_dtypes_change (line 492)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-01-20 20:37:59,497) [pandas_utils/_warn_of_df_dtypes_change (line 503)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-01-20 20:37:59,498) [pandas_utils/_warn_of_df_dtypes_change (line 506)]:
id int64
is_respondent int64
dtype: object
WARNING (2026-01-20 20:37:59,498) [pandas_utils/_warn_of_df_dtypes_change (line 507)]: The (new) dtypes saved in df (after the change):
WARNING (2026-01-20 20:37:59,498) [pandas_utils/_warn_of_df_dtypes_change (line 508)]:
id object
is_respondent float64
dtype: object
INFO (2026-01-20 20:37:59,500) [cli/process_batch (line 702)]: balance target object:
balance Sample object
4551 observations x 2 variables: age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness
INFO (2026-01-20 20:37:59,502) [ipw/ipw (line 694)]: Starting ipw function
INFO (2026-01-20 20:37:59,503) [adjustment/apply_transformations (line 433)]: Adding the variables: []
INFO (2026-01-20 20:37:59,503) [adjustment/apply_transformations (line 434)]: Transforming the variables: ['age_group', 'income']
INFO (2026-01-20 20:37:59,508) [adjustment/apply_transformations (line 469)]: Final variables in output: ['age_group', 'income']
INFO (2026-01-20 20:37:59,511) [ipw/ipw (line 728)]: Building model matrix
INFO (2026-01-20 20:37:59,543) [ipw/ipw (line 750)]: The formula used to build the model matrix: ['income + age_group']
INFO (2026-01-20 20:37:59,543) [ipw/ipw (line 753)]: The number of columns in the model matrix: 14
INFO (2026-01-20 20:37:59,543) [ipw/ipw (line 754)]: The number of rows in the model matrix: 5195
INFO (2026-01-20 20:38:00,393) [ipw/ipw (line 915)]: Done with sklearn
INFO (2026-01-20 20:38:00,393) [ipw/ipw (line 917)]: max_de: 1.5
INFO (2026-01-20 20:38:00,393) [ipw/choose_regularization (line 371)]: Starting choosing regularisation parameters
INFO (2026-01-20 20:38:04,420) [ipw/choose_regularization (line 457)]: Best regularisation:
s s_index trim design_effect asmd_improvement asmd
9 0.111736 81 5.0 1.495973 0.566289 0.087357
INFO (2026-01-20 20:38:04,422) [ipw/ipw (line 972)]: Chosen lambda: 0.11173591019485084
INFO (2026-01-20 20:38:04,422) [ipw/ipw (line 990)]: Proportion null deviance explained 0.14267734119400044
INFO (2026-01-20 20:38:04,425) [cli/process_batch (line 725)]: Succeeded with adjusting sample to target
INFO (2026-01-20 20:38:04,427) [cli/process_batch (line 726)]: balance adjusted object:
Adjusted balance Sample object with target set using ipw
644 observations x 2 variables: age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness
adjustment details:
method: ipw
weight trimming mean ratio: 5.0
design effect (Deff): 1.496
effective sample size proportion (ESSP): 0.668
effective sample size (ESS): 430.5
target:
balance Sample object
4551 observations x 2 variables: age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness
2 common variables: age_group,income
INFO (2026-01-20 20:38:04,427) [cli/process_batch (line 728)]: Condition on which rows to keep for diagnostics: None
INFO (2026-01-20 20:38:04,427) [cli/process_batch (line 732)]: Names of columns to keep for diagnostics: None
INFO (2026-01-20 20:38:04,427) [sample_class/diagnostics (line 1792)]: Starting computation of diagnostics of the fitting
INFO (2026-01-20 20:38:04,559) [sample_class/diagnostics (line 2013)]: Done computing diagnostics
INFO (2026-01-20 20:38:04,563) [cli/process_batch (line 741)]: balance diagnostics object: metric val var
0 size 644.000000 sample_obs
1 size 2.000000 sample_covars
2 size 4551.000000 target_obs
3 size 2.000000 target_covars
4 weights_diagnostics 1.495973 design_effect
.. ... ... ...
78 covar_main_asmd_improvement 0.235896 income
79 covar_main_asmd_adjusted 0.192202 mean(asmd)
80 covar_main_asmd_unadjusted 0.430017 mean(asmd)
81 covar_main_asmd_improvement 0.237816 mean(asmd)
82 adjustment_failure 0.000000 NaN
[83 rows x 3 columns]
INFO (2026-01-20 20:38:04,565) [cli/main (line 1102)]: Done processing batch ('Male',)
INFO (2026-01-20 20:38:04,565) [cli/main (line 1128)]: Done fitting the model, writing output
Output rows: 912
| | id | age_group | income | happiness | weight | gender | is_respondent |
|---|---|---|---|---|---|---|---|
| 0 | 1 | 18-24 | 9.940280 | 66.885485 | 10.380592 | Female | 1.0 |
| 1 | 92 | 35-44 | 0.185097 | 84.464522 | 18.177351 | Female | 1.0 |
| 2 | 94 | 35-44 | 1.183696 | 65.742184 | 20.852401 | Female | 1.0 |
| 3 | 95 | 18-24 | 3.716007 | 67.624539 | 10.522912 | Female | 1.0 |
| 4 | 98 | 35-44 | 16.751931 | 44.868651 | 40.368284 | Female | 1.0 |
# Inspect weights by gender - each group was adjusted independently
print("Weight statistics by gender (sample only):")
sample_only = batch_adjusted_df[batch_adjusted_df["is_respondent"] == 1]
print(sample_only.groupby("gender")["weight"].describe().round(3))
Weight statistics by gender (sample only):
count mean std min 25% 50% 75% max
gender
Female 268.0 16.981 11.906 6.787 9.567 13.703 19.155 85.647
Male 644.0 7.067 4.981 2.913 3.260 5.775 9.234 35.371
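The adjusted-object summaries in the logs report a design effect (Deff), an effective sample size (ESS), and an effective sample size proportion (ESSP) for each batch. Assuming these follow Kish's definitions, which is consistent with the logged values (for the Female batch, 268 / 1.4897 is about 179.9), the relationship between weights and these quantities can be sketched as follows. The function names here are illustrative and are not taken from balance.

```python
def design_effect(weights):
    """Kish's design effect: n * sum(w^2) / (sum(w))^2."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


def effective_sample_size(weights):
    """Effective sample size: n / Deff."""
    return len(weights) / design_effect(weights)


equal = [1.0, 1.0, 1.0, 1.0]
unequal = [1.0, 2.0, 3.0, 4.0]

print(design_effect(equal))                      # 1.0 (equal weights cost nothing)
print(design_effect(unequal))                    # 4 * 30 / 100 = 1.2
print(round(effective_sample_size(unequal), 3))  # 3.333

# Cross-check against the Female batch reported in the logs:
# 268 observations with Deff 1.4897.
print(round(268 / 1.4897, 1))  # 179.9
```

Under the same assumption, ESSP is ESS divided by the number of observations, that is 1 / Deff, so more variable weights always mean a smaller effective sample.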
Alternative Weighting Methods¶
The CLI supports three adjustment methods:
- IPW (Inverse Probability Weighting): The default method; it uses logistic regression to estimate propensity scores
- CBPS (Covariate Balancing Propensity Score): Balances covariates while estimating propensity scores
- Rake (Raking/Iterative Proportional Fitting): Adjusts weights iteratively to match marginal distributions
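To make the idea behind IPW concrete, here is a minimal sketch of one common formulation: if p is the estimated probability that a unit with given covariates belongs to the sample rather than the target, the unit receives the inverse-odds weight (1 - p) / p. This is an assumption made for illustration only. The function name is hypothetical, and balance's IPW additionally applies regularization and weight trimming (visible in the logs as "Chosen lambda" and "weight trimming mean ratio"), which are omitted here.

```python
def inverse_odds_weight(p_sample):
    """Weight for a sample unit given its estimated probability of being in the sample."""
    if not 0.0 < p_sample < 1.0:
        raise ValueError("propensity must be strictly between 0 and 1")
    return (1.0 - p_sample) / p_sample


# Hypothetical propensity scores for three respondents.
propensities = [0.5, 0.2, 0.8]
weights = [inverse_odds_weight(p) for p in propensities]

for p, w in zip(propensities, weights):
    # Units that are rare in the sample (low p) are weighted up.
    print(f"p={p:.1f} -> weight={w:.2f}")
```

Respondents who look over-represented relative to the target (high p) are weighted down, and under-represented ones are weighted up.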
Example: CBPS Method¶
CBPS simultaneously optimizes covariate balance and propensity score estimation:
with tempfile.TemporaryDirectory() as tmpdir:
    input_path = os.path.join(tmpdir, "input.csv")
    output_path = os.path.join(tmpdir, "weights_cbps.csv")
    diagnostics_path = os.path.join(tmpdir, "diagnostics_cbps.csv")
    load_data_input_df.to_csv(input_path, index=False)
    cmd = [
        "python",
        "-m",
        "balance.cli",
        "--input_file", input_path,
        "--output_file", output_path,
        "--diagnostics_output_file", diagnostics_path,
        "--covariate_columns", "gender,age_group,income",
        "--method", "cbps",
    ]
    print("CLI command with CBPS method:")
    print(" ".join(cmd))
    subprocess.check_call(cmd)
    # Read the diagnostics before the temporary directory is removed
    cbps_diagnostics_df = pd.read_csv(diagnostics_path)
# Verify the method used
print("\nAdjustment method used:")
print(cbps_diagnostics_df.query("metric == 'adjustment_method'")[["var", "val"]])
CLI command with CBPS method:
python -m balance.cli --input_file /tmp/tmpvme2x18i/input.csv --output_file /tmp/tmpvme2x18i/weights_cbps.csv --diagnostics_output_file /tmp/tmpvme2x18i/diagnostics_cbps.csv --covariate_columns gender,age_group,income --method cbps
INFO (2026-01-20 20:38:06,308) [__init__/<module> (line 72)]: Using balance version 0.15.0
INFO (2026-01-20 20:38:06,310) [cli/main (line 1039)]: Running cli.main() using balance version 0.15.0
INFO (2026-01-20 20:38:06,310) [cli/main (line 1074)]: Attributes used by main() for running adjust: {'transformations': 'default', 'formula': None, 'penalty_factor': None, 'one_hot_encoding': True, 'max_de': 1.5, 'lambda_min': 1e-05, 'lambda_max': 10, 'num_lambdas': 250, 'weight_trimming_mean_ratio': 20.0, 'sample_cls': <class 'balance.sample_class.Sample'>, 'sample_package_name': 'balance', 'sample_package_version': '0.15.0'}
INFO (2026-01-20 20:38:06,320) [cli/load_and_check_input (line 869)]: Number of rows in input file: 11000
INFO (2026-01-20 20:38:06,320) [cli/load_and_check_input (line 875)]: Number of columns in input file: 7
WARNING (2026-01-20 20:38:06,469) [sample_class/from_frame (line 457)]: Casting id column to string
WARNING (2026-01-20 20:38:06,477) [pandas_utils/_warn_of_df_dtypes_change (line 492)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-01-20 20:38:06,477) [pandas_utils/_warn_of_df_dtypes_change (line 503)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-01-20 20:38:06,478) [pandas_utils/_warn_of_df_dtypes_change (line 506)]:
is_respondent int64
id int64
dtype: object
WARNING (2026-01-20 20:38:06,478) [pandas_utils/_warn_of_df_dtypes_change (line 507)]: The (new) dtypes saved in df (after the change):
WARNING (2026-01-20 20:38:06,479) [pandas_utils/_warn_of_df_dtypes_change (line 508)]:
is_respondent float64
id object
dtype: object
INFO (2026-01-20 20:38:06,480) [cli/process_batch (line 691)]: balance sample object:
balance Sample object
1000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness,is_respondent
WARNING (2026-01-20 20:38:06,482) [sample_class/from_frame (line 457)]: Casting id column to string
WARNING (2026-01-20 20:38:06,496) [pandas_utils/_warn_of_df_dtypes_change (line 492)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-01-20 20:38:06,497) [pandas_utils/_warn_of_df_dtypes_change (line 503)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-01-20 20:38:06,497) [pandas_utils/_warn_of_df_dtypes_change (line 506)]:
is_respondent int64
id int64
dtype: object
WARNING (2026-01-20 20:38:06,497) [pandas_utils/_warn_of_df_dtypes_change (line 507)]: The (new) dtypes saved in df (after the change):
WARNING (2026-01-20 20:38:06,498) [pandas_utils/_warn_of_df_dtypes_change (line 508)]:
is_respondent float64
id object
dtype: object
INFO (2026-01-20 20:38:06,500) [cli/process_batch (line 702)]: balance target object:
balance Sample object
10000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness,is_respondent
INFO (2026-01-20 20:38:06,504) [cbps/cbps (line 537)]: Starting cbps function
INFO (2026-01-20 20:38:06,506) [adjustment/apply_transformations (line 433)]: Adding the variables: []
INFO (2026-01-20 20:38:06,506) [adjustment/apply_transformations (line 434)]: Transforming the variables: ['gender', 'age_group', 'income']
INFO (2026-01-20 20:38:06,515) [adjustment/apply_transformations (line 469)]: Final variables in output: ['gender', 'age_group', 'income']
INFO (2026-01-20 20:38:06,616) [cbps/cbps (line 588)]: The formula used to build the model matrix: ['income + gender + age_group + _is_na_gender']
INFO (2026-01-20 20:38:06,617) [cbps/cbps (line 599)]: The number of columns in the model matrix: 16
INFO (2026-01-20 20:38:06,617) [cbps/cbps (line 600)]: The number of rows in the model matrix: 11000
INFO (2026-01-20 20:38:06,624) [cbps/cbps (line 669)]: Finding initial estimator for GMM optimization
INFO (2026-01-20 20:38:06,757) [cbps/cbps (line 696)]: Finding initial estimator for GMM optimization that minimizes the balance loss
INFO (2026-01-20 20:38:07,356) [cbps/cbps (line 732)]: Running GMM optimization
INFO (2026-01-20 20:38:08,711) [cbps/cbps (line 859)]: Done cbps function
INFO (2026-01-20 20:38:08,714) [cli/process_batch (line 725)]: Succeeded with adjusting sample to target
INFO (2026-01-20 20:38:08,717) [cli/process_batch (line 726)]: balance adjusted object:
Adjusted balance Sample object with target set using cbps
1000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness,is_respondent
adjustment details:
method: cbps
design effect (Deff): 1.500
effective sample size proportion (ESSP): 0.667
effective sample size (ESS): 666.7
target:
balance Sample object
10000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness,is_respondent
3 common variables: gender,age_group,income
INFO (2026-01-20 20:38:08,717) [cli/process_batch (line 728)]: Condition on which rows to keep for diagnostics: None
INFO (2026-01-20 20:38:08,717) [cli/process_batch (line 732)]: Names of columns to keep for diagnostics: None
INFO (2026-01-20 20:38:08,717) [sample_class/diagnostics (line 1792)]: Starting computation of diagnostics of the fitting
INFO (2026-01-20 20:38:08,995) [sample_class/diagnostics (line 2013)]: Done computing diagnostics
INFO (2026-01-20 20:38:08,999) [cli/process_batch (line 741)]: balance diagnostics object: metric val var
0 size 1000.0 sample_obs
1 size 3.0 sample_covars
2 size 10000.0 target_obs
3 size 3.0 target_covars
4 weights_diagnostics 1.5 design_effect
.. ... ... ...
86 covar_main_asmd_improvement 0.205326 income
87 covar_main_asmd_adjusted 0.175442 mean(asmd)
88 covar_main_asmd_unadjusted 0.326799 mean(asmd)
89 covar_main_asmd_improvement 0.151357 mean(asmd)
90 adjustment_failure 0 NaN
[91 rows x 3 columns]
INFO (2026-01-20 20:38:09,001) [cli/main (line 1128)]: Done fitting the model, writing output
Adjustment method used:
var val
28 cbps 0.0
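The diagnostics file is a long-format table with the columns metric, val, and var, so individual values are looked up by filtering on metric and var, just as the query above does for adjustment_method. The sketch below shows the same lookup with only the standard library. The inline CSV holds a handful of rows copied from the CBPS log output (the real file has 91 rows), and the lookup helper is a hypothetical name used for illustration.

```python
import csv
import io

# A few rows copied from the CBPS diagnostics shown in the log output.
diagnostics_csv = """metric,val,var
size,1000.0,sample_obs
size,10000.0,target_obs
weights_diagnostics,1.5,design_effect
covar_main_asmd_adjusted,0.175442,mean(asmd)
covar_main_asmd_unadjusted,0.326799,mean(asmd)
covar_main_asmd_improvement,0.151357,mean(asmd)
"""


def lookup(rows, metric, var):
    """Return val as a float for the first row matching metric and var."""
    for row in rows:
        if row["metric"] == metric and row["var"] == var:
            return float(row["val"])
    raise KeyError((metric, var))


rows = list(csv.DictReader(io.StringIO(diagnostics_csv)))
adjusted = lookup(rows, "covar_main_asmd_adjusted", "mean(asmd)")
unadjusted = lookup(rows, "covar_main_asmd_unadjusted", "mean(asmd)")

# The improvement row is the unadjusted mean ASMD minus the adjusted one.
print(round(unadjusted - adjusted, 6))  # 0.151357
```

The same pattern applies to the pandas DataFrame read in the cell above: filter on metric, then pick the row whose var names the quantity of interest.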
Example: Rake Method¶
Raking iteratively adjusts weights to match target marginal distributions:
with tempfile.TemporaryDirectory() as tmpdir:
    input_path = os.path.join(tmpdir, "input.csv")
    output_path = os.path.join(tmpdir, "weights_rake.csv")
    diagnostics_path = os.path.join(tmpdir, "diagnostics_rake.csv")
    load_data_input_df.to_csv(input_path, index=False)
    cmd = [
        "python",
        "-m",
        "balance.cli",
        "--input_file", input_path,
        "--output_file", output_path,
        "--diagnostics_output_file", diagnostics_path,
        "--covariate_columns", "gender,age_group,income",
        "--method", "rake",
    ]
    print("CLI command with rake method:")
    print(" ".join(cmd))
    subprocess.check_call(cmd)
    # Read the diagnostics before the temporary directory is removed
    rake_diagnostics_df = pd.read_csv(diagnostics_path)
# Verify the method used
print("\nAdjustment method used:")
print(rake_diagnostics_df.query("metric == 'adjustment_method'")[["var", "val"]])
CLI command with rake method:
python -m balance.cli --input_file /tmp/tmptuaq2qin/input.csv --output_file /tmp/tmptuaq2qin/weights_rake.csv --diagnostics_output_file /tmp/tmptuaq2qin/diagnostics_rake.csv --covariate_columns gender,age_group,income --method rake
INFO (2026-01-20 20:38:10,720) [__init__/<module> (line 72)]: Using balance version 0.15.0
INFO (2026-01-20 20:38:10,722) [cli/main (line 1039)]: Running cli.main() using balance version 0.15.0
INFO (2026-01-20 20:38:10,722) [cli/main (line 1074)]: Attributes used by main() for running adjust: {'transformations': 'default', 'formula': None, 'penalty_factor': None, 'one_hot_encoding': True, 'max_de': 1.5, 'lambda_min': 1e-05, 'lambda_max': 10, 'num_lambdas': 250, 'weight_trimming_mean_ratio': 20.0, 'sample_cls': <class 'balance.sample_class.Sample'>, 'sample_package_name': 'balance', 'sample_package_version': '0.15.0'}
INFO (2026-01-20 20:38:10,732) [cli/load_and_check_input (line 869)]: Number of rows in input file: 11000
INFO (2026-01-20 20:38:10,732) [cli/load_and_check_input (line 875)]: Number of columns in input file: 7
WARNING (2026-01-20 20:38:10,881) [sample_class/from_frame (line 457)]: Casting id column to string
WARNING (2026-01-20 20:38:10,890) [pandas_utils/_warn_of_df_dtypes_change (line 492)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-01-20 20:38:10,890) [pandas_utils/_warn_of_df_dtypes_change (line 503)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-01-20 20:38:10,891) [pandas_utils/_warn_of_df_dtypes_change (line 506)]:
is_respondent int64
id int64
dtype: object
WARNING (2026-01-20 20:38:10,891) [pandas_utils/_warn_of_df_dtypes_change (line 507)]: The (new) dtypes saved in df (after the change):
WARNING (2026-01-20 20:38:10,892) [pandas_utils/_warn_of_df_dtypes_change (line 508)]:
is_respondent float64
id object
dtype: object
INFO (2026-01-20 20:38:10,893) [cli/process_batch (line 691)]: balance sample object:
balance Sample object
1000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness,is_respondent
WARNING (2026-01-20 20:38:10,894) [sample_class/from_frame (line 457)]: Casting id column to string
WARNING (2026-01-20 20:38:10,909) [pandas_utils/_warn_of_df_dtypes_change (line 492)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-01-20 20:38:10,909) [pandas_utils/_warn_of_df_dtypes_change (line 503)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-01-20 20:38:10,909) [pandas_utils/_warn_of_df_dtypes_change (line 506)]:
is_respondent int64
id int64
dtype: object
WARNING (2026-01-20 20:38:10,909) [pandas_utils/_warn_of_df_dtypes_change (line 507)]: The (new) dtypes saved in df (after the change):
WARNING (2026-01-20 20:38:10,910) [pandas_utils/_warn_of_df_dtypes_change (line 508)]:
is_respondent float64
id object
dtype: object
INFO (2026-01-20 20:38:10,912) [cli/process_batch (line 702)]: balance target object:
balance Sample object
10000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness,is_respondent
INFO (2026-01-20 20:38:10,918) [adjustment/apply_transformations (line 433)]: Adding the variables: []
INFO (2026-01-20 20:38:10,918) [adjustment/apply_transformations (line 434)]: Transforming the variables: ['gender', 'age_group', 'income']
INFO (2026-01-20 20:38:10,928) [adjustment/apply_transformations (line 469)]: Final variables in output: ['gender', 'age_group', 'income']
INFO (2026-01-20 20:38:10,963) [rake/rake (line 274)]: Final covariates and levels that will be used in raking: {'age_group': ['18-24', '25-34', '35-44', '45+'], 'gender': ['Female', 'Male', '__NaN__'], 'income': ['(-0.0009997440000000001, 0.44]', '(0.44, 1.664]', '(1.664, 3.472]', '(11.312, 15.139]', '(15.139, 20.567]', '(20.567, 29.504]', '(29.504, 128.536]', '(3.472, 5.663]', '(5.663, 8.211]', '(8.211, 11.312]']}.
INFO (2026-01-20 20:38:10,983) [cli/process_batch (line 725)]: Succeeded with adjusting sample to target
INFO (2026-01-20 20:38:10,985) [cli/process_batch (line 726)]: balance adjusted object:
Adjusted balance Sample object with target set using rake
1000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness,is_respondent
adjustment details:
method: rake
design effect (Deff): 3.774
effective sample size proportion (ESSP): 0.265
effective sample size (ESS): 265.0
target:
balance Sample object
10000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness,is_respondent
3 common variables: gender,age_group,income
INFO (2026-01-20 20:38:10,985) [cli/process_batch (line 728)]: Condition on which rows to keep for diagnostics: None
INFO (2026-01-20 20:38:10,985) [cli/process_batch (line 732)]: Names of columns to keep for diagnostics: None
INFO (2026-01-20 20:38:10,985) [sample_class/diagnostics (line 1792)]: Starting computation of diagnostics of the fitting
INFO (2026-01-20 20:38:11,229) [sample_class/diagnostics (line 2013)]: Done computing diagnostics
INFO (2026-01-20 20:38:11,233) [cli/process_batch (line 741)]: balance diagnostics object: metric val var
0 size 1000.000000 sample_obs
1 size 3.000000 sample_covars
2 size 10000.000000 target_obs
3 size 3.000000 target_covars
4 weights_diagnostics 3.773786 design_effect
.. ... ... ...
61 covar_main_asmd_improvement 0.462436 income
62 covar_main_asmd_adjusted 0.014651 mean(asmd)
63 covar_main_asmd_unadjusted 0.326799 mean(asmd)
64 covar_main_asmd_improvement 0.312147 mean(asmd)
65 adjustment_failure 0.000000 NaN
[66 rows x 3 columns]
INFO (2026-01-20 20:38:11,235) [cli/main (line 1128)]: Done fitting the model, writing output
Adjustment method used:
var val
28 rake 0.0
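Raking works on categorical margins, which is why the log line "Final covariates and levels that will be used in raking" shows income binned into intervals. The loop below is a textbook iterative proportional fitting sketch on toy data, intended only to illustrate the idea. It is not balance's rake implementation, and the function names, rows, and target shares are hypothetical.

```python
def weighted_shares(rows, weights, var):
    """Weighted proportion of each level of `var`."""
    total = sum(weights)
    shares = {}
    for row, w in zip(rows, weights):
        shares[row[var]] = shares.get(row[var], 0.0) + w / total
    return shares


def rake(rows, weights, targets, n_iter=50):
    """Rescale weights, one variable at a time, until margins match targets."""
    w = list(weights)
    for _ in range(n_iter):
        for var, target in targets.items():
            current = weighted_shares(rows, w, var)
            # Multiply each row by target share / current share of its level.
            w = [wi * target[row[var]] / current[row[var]] for row, wi in zip(rows, w)]
    return w


# Hypothetical respondents and target margins.
rows = [
    {"gender": "Female", "age_group": "18-24"},
    {"gender": "Female", "age_group": "18-24"},
    {"gender": "Female", "age_group": "45+"},
    {"gender": "Male", "age_group": "18-24"},
    {"gender": "Male", "age_group": "45+"},
]
targets = {
    "gender": {"Female": 0.5, "Male": 0.5},
    "age_group": {"18-24": 0.4, "45+": 0.6},
}

raked = rake(rows, [1.0] * len(rows), targets)
print(weighted_shares(rows, raked, "gender"))
print(weighted_shares(rows, raked, "age_group"))
```

Matching margins closely can come at the cost of more variable weights: in the run above, rake reaches a mean adjusted ASMD of 0.014651 with a design effect of 3.774, whereas CBPS reports 0.175442 with a design effect of 1.500.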
Next steps¶
- Try --method cbps or --method rake for alternative weighting approaches.
- Use --outcome_columns to control which columns are treated as outcomes.
- Supply --ipw_logistic_regression_kwargs to tune the IPW model.
- Use --succeed_on_weighting_failure for pipelines where you want null weights instead of errors.
- Explore --covariate_columns_for_diagnostics and --rows_to_keep_for_diagnostics to customize diagnostic output.
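When the CLI runs inside a pipeline, the calling code decides what a failure means. The cells in this tutorial use subprocess.check_call, which raises CalledProcessError on a non-zero exit code. A hedged sketch of the alternative, inspecting the return code yourself, is shown below. The run_step helper is hypothetical, and the two commands are stand-ins that simply exit with a chosen code so the sketch runs without balance installed.

```python
import subprocess
import sys


def run_step(cmd):
    """Run a command and return its exit code, reporting failures on stderr."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        print(f"step failed with exit code {result.returncode}", file=sys.stderr)
    return result.returncode


# Stand-in commands (not the balance CLI): one succeeds, one exits with code 3.
# sys.executable points at the interpreter that is running this script.
ok_cmd = [sys.executable, "-c", "print('weights written')"]
bad_cmd = [sys.executable, "-c", "import sys; sys.exit(3)"]

print(run_step(ok_cmd))   # 0
print(run_step(bad_cmd))  # 3
```

Using sys.executable instead of the bare string "python" is a small design choice that keeps the subprocess on the same interpreter and environment as the calling notebook or script.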
Session info¶
For reproducibility, here is the session information:
import session_info
session_info.show(html=False, dependencies=True)
-----
balance 0.15.0
pandas 2.3.3
session_info v1.0.1
-----
PIL 11.3.0
anyio NA
arrow 1.4.0
asttokens NA
attr 25.4.0
attrs 25.4.0
babel 2.17.0
certifi 2026.01.04
charset_normalizer 3.4.4
comm 0.2.3
cycler 0.12.1
cython_runtime NA
dateutil 2.9.0.post0
debugpy 1.8.19
decorator 5.2.1
defusedxml 0.7.1
exceptiongroup 1.3.1
executing 2.2.1
fastjsonschema NA
fqdn NA
idna 3.11
importlib_metadata NA
importlib_resources NA
ipykernel 6.31.0
isoduration NA
jedi 0.19.2
jinja2 3.1.6
joblib 1.5.3
json5 0.13.0
jsonpointer 3.0.0
jsonschema 4.25.1
jsonschema_specifications NA
jupyter_events 0.12.0
jupyter_server 2.17.0
jupyterlab_server 2.28.0
kiwisolver 1.4.7
lark 1.3.1
markupsafe 3.0.3
matplotlib 3.9.4
mpl_toolkits NA
narwhals 2.15.0
nbformat 5.10.4
numpy 1.26.4
overrides NA
packaging 25.0
parso 0.8.5
patsy 1.0.2
pexpect 4.9.0
platformdirs 4.4.0
plotly 6.5.2
prometheus_client NA
prompt_toolkit 3.0.52
psutil 7.2.1
ptyprocess 0.7.0
pure_eval 0.2.3
pydev_ipython NA
pydevconsole NA
pydevd 3.2.3
pydevd_file_utils NA
pydevd_plugins NA
pydevd_tracing NA
pygments 2.19.2
pyparsing 3.3.1
pythonjsonlogger NA
pytz 2025.2
referencing NA
requests 2.32.5
rfc3339_validator 0.1.4
rfc3986_validator 0.1.1
rfc3987_syntax NA
rpds NA
scipy 1.13.1
seaborn 0.13.2
send2trash NA
six 1.17.0
sklearn 1.3.2
sphinxcontrib NA
stack_data 0.6.3
statsmodels 0.14.6
threadpoolctl 3.6.0
tornado 6.5.4
traitlets 5.14.3
typing_extensions NA
uri_template NA
urllib3 2.6.3
wcwidth 0.2.14
webcolors NA
websocket 1.9.0
yaml 6.0.3
zipp NA
zmq 27.1.0
zoneinfo NA
-----
IPython 8.18.1
jupyter_client 8.6.3
jupyter_core 5.8.1
jupyterlab 4.5.2
notebook 7.5.2
-----
Python 3.9.25 (main, Nov 3 2025, 15:16:36) [GCC 13.3.0]
Linux-6.11.0-1018-azure-x86_64-with-glibc2.39
-----
Session information updated at 2026-01-20 20:38