CLI tutorial¶
This tutorial walks through using the balance command-line interface (CLI) to adjust a sample dataset to a target. We will build a small demo dataset, run the CLI, and inspect the outputs.
The real power of a CLI lies in how seamlessly it integrates into the broader ecosystem of automation and data workflows. A CLI command can be invoked directly from shell scripts, scheduled via cron jobs, embedded in CI/CD pipelines, or orchestrated through tools like Airflow - all with minimal overhead. This composability means you can chain balance operations with other command-line tools using pipes, process batches of files in a loop, or trigger analyses based on events, all while maintaining a clear audit trail since the command itself documents exactly what was run. The non-zero exit codes that CLIs return on failure integrate naturally with automated systems that need to halt pipelines or send alerts when something goes wrong. In short, a CLI transforms balance from something you use interactively into a building block for production-grade, reproducible workflows.
Prerequisites¶
Make sure balance is installed and the balance CLI is on your PATH. You can also run the CLI via python -m balance.cli from a checkout of the repository.
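Before running anything, it can help to confirm which of the two invocation forms is available. The sketch below is a hypothetical helper, not part of balance: `find_cli` is an illustrative name, and the executable name `balance` is an assumption based on the sentence above.

```python
import importlib.util
import shutil
import sys


def find_cli(executable="balance", module="balance.cli"):
    """Return a command prefix for the CLI, or None if neither form is usable."""
    # Prefer an executable on PATH (the name "balance" is an assumption).
    path = shutil.which(executable)
    if path is not None:
        return [path]
    # Fall back to `python -m <module>` when the module can be located.
    try:
        spec = importlib.util.find_spec(module)
    except ModuleNotFoundError:
        spec = None
    if spec is not None:
        return [sys.executable, "-m", module]
    return None


print(find_cli())
```

The returned list can be used as the first elements of a `subprocess` command; a result of `None` suggests the package is not installed in the current environment.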
import os
import subprocess
import tempfile
import pandas as pd
from balance import load_data
INFO (2026-03-12 17:27:14,077) [__init__/<module> (line 72)]: Using balance version 0.16.1
balance (Version 0.16.1) loaded:
📖 Documentation: https://import-balance.org/
🛠️ Help / Issues: https://github.com/facebookresearch/balance/issues/
📄 Citation:
Sarig, T., Galili, T., & Eilat, R. (2023).
balance - a Python package for balancing biased data samples.
https://arxiv.org/abs/2307.06024
Tip: You can view this message anytime with balance.help()
Use the bundled demo data¶
The balance package ships with a small demo dataset via load_data(). You can build the CLI input by
adding a sample indicator column and a weight column to each frame, then concatenating the sample and target frames.
target_df, sample_df = load_data()
sample_df = sample_df.copy()
target_df = target_df.copy()
sample_df["is_respondent"] = 1
target_df["is_respondent"] = 0
sample_df["weight"] = 1.0
target_df["weight"] = 1.0
load_data_input_df = pd.concat([sample_df, target_df], ignore_index=True)
load_data_input_df.head()
| | id | gender | age_group | income | happiness | is_respondent | weight |
|---|---|---|---|---|---|---|---|
| 0 | 0 | Male | 25-34 | 6.428659 | 26.043029 | 1 | 1.0 |
| 1 | 1 | Female | 18-24 | 9.940280 | 66.885485 | 1 | 1.0 |
| 2 | 2 | Male | 18-24 | 2.673623 | 37.091922 | 1 | 1.0 |
| 3 | 3 | NaN | 18-24 | 10.550308 | 49.394050 | 1 | 1.0 |
| 4 | 4 | NaN | 18-24 | 2.689994 | 72.304208 | 1 | 1.0 |
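A quick sanity check on the combined file can catch a missing column or a missing group before the CLI is invoked. The sketch below uses only the standard library; `check_cli_input` is a hypothetical helper, and the required column names are simply the ones used in this tutorial (`id`, `is_respondent`, `weight`), not a statement about what the CLI accepts in general.

```python
import csv
import io

REQUIRED_COLUMNS = {"id", "is_respondent", "weight"}  # names used in this tutorial


def check_cli_input(csv_text, covariates):
    """Return a list of problems found in a combined sample/target CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])
    missing = sorted((REQUIRED_COLUMNS | set(covariates)) - header)
    problems = [f"missing column: {name}" for name in missing]
    if problems:
        return problems
    indicators = {float(row["is_respondent"]) for row in reader}
    # Both sample rows (1) and target rows (0) should be present.
    for value in (0, 1):
        if float(value) not in indicators:
            problems.append(f"no rows with is_respondent == {value}")
    return problems


demo = (
    "id,gender,income,is_respondent,weight\n"
    "0,Male,6.4,1,1.0\n"
    "1,Female,9.9,0,1.0\n"
)
print(check_cli_input(demo, ["gender", "income"]))  # []
```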
Run the CLI¶
We'll write the input dataset to disk, then call the CLI to compute weights and diagnostics.
with tempfile.TemporaryDirectory() as tmpdir:
    input_path = os.path.join(tmpdir, "input.csv")
    output_path = os.path.join(tmpdir, "weights_out.csv")
    diagnostics_path = os.path.join(tmpdir, "diagnostics_out.csv")
    load_data_input_df.to_csv(input_path, index=False)
    cmd = [
        "python",
        "-m",
        "balance.cli",
        "--input_file",
        input_path,
        "--output_file",
        output_path,
        "--diagnostics_output_file",
        diagnostics_path,
        "--covariate_columns",
        "gender,age_group,income",
        "--method",
        "ipw",
        "--weights_impact_on_outcome_method",
        "t_test",
    ]
    print("CLI command:", " ".join(cmd))
    subprocess.check_call(cmd)
    # Read the outputs before the temporary directory is deleted
    load_data_adjusted_df = pd.read_csv(output_path)
    load_data_diagnostics_df = pd.read_csv(diagnostics_path)
load_data_adjusted_df.head()
CLI command: python -m balance.cli --input_file /tmp/tmpzj_gpg8u/input.csv --output_file /tmp/tmpzj_gpg8u/weights_out.csv --diagnostics_output_file /tmp/tmpzj_gpg8u/diagnostics_out.csv --covariate_columns gender,age_group,income --method ipw --weights_impact_on_outcome_method t_test
INFO (2026-03-12 17:27:16,236) [__init__/<module> (line 72)]: Using balance version 0.16.1
INFO (2026-03-12 17:27:16,238) [cli/main (line 1095)]: Running cli.main() using balance version 0.16.1
INFO (2026-03-12 17:27:16,238) [cli/main (line 1130)]: Attributes used by main() for running adjust: {'transformations': 'default', 'formula': None, 'penalty_factor': None, 'one_hot_encoding': True, 'max_de': 1.5, 'lambda_min': 1e-05, 'lambda_max': 10, 'num_lambdas': 250, 'weight_trimming_mean_ratio': 20.0, 'sample_cls': <class 'balance.sample_class.Sample'>, 'sample_package_name': 'balance', 'sample_package_version': '0.16.1'}
INFO (2026-03-12 17:27:16,249) [cli/load_and_check_input (line 926)]: Number of rows in input file: 11000
INFO (2026-03-12 17:27:16,249) [cli/load_and_check_input (line 932)]: Number of columns in input file: 7
WARNING (2026-03-12 17:27:16,401) [sample_class/from_frame (line 469)]: Casting id column to string
WARNING (2026-03-12 17:27:16,412) [pandas_utils/_warn_of_df_dtypes_change (line 514)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-03-12 17:27:16,412) [pandas_utils/_warn_of_df_dtypes_change (line 525)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-03-12 17:27:16,413) [pandas_utils/_warn_of_df_dtypes_change (line 528)]:
id int64
is_respondent int64
dtype: object
WARNING (2026-03-12 17:27:16,413) [pandas_utils/_warn_of_df_dtypes_change (line 529)]: The (new) dtypes saved in df (after the change):
WARNING (2026-03-12 17:27:16,414) [pandas_utils/_warn_of_df_dtypes_change (line 530)]:
id str
is_respondent float64
dtype: object
INFO (2026-03-12 17:27:16,415) [cli/process_batch (line 747)]: balance sample object:
balance Sample object
1000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: None
WARNING (2026-03-12 17:27:16,423) [sample_class/from_frame (line 469)]: Casting id column to string
WARNING (2026-03-12 17:27:16,436) [pandas_utils/_warn_of_df_dtypes_change (line 514)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-03-12 17:27:16,437) [pandas_utils/_warn_of_df_dtypes_change (line 525)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-03-12 17:27:16,437) [pandas_utils/_warn_of_df_dtypes_change (line 528)]:
id int64
is_respondent int64
dtype: object
WARNING (2026-03-12 17:27:16,437) [pandas_utils/_warn_of_df_dtypes_change (line 529)]: The (new) dtypes saved in df (after the change):
WARNING (2026-03-12 17:27:16,438) [pandas_utils/_warn_of_df_dtypes_change (line 530)]:
id str
is_respondent float64
dtype: object
INFO (2026-03-12 17:27:16,441) [cli/process_batch (line 758)]: balance target object:
balance Sample object
10000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: None
INFO (2026-03-12 17:27:16,445) [ipw/ipw (line 703)]: Starting ipw function
INFO (2026-03-12 17:27:16,447) [adjustment/apply_transformations (line 433)]: Adding the variables: []
INFO (2026-03-12 17:27:16,447) [adjustment/apply_transformations (line 434)]: Transforming the variables: ['gender', 'age_group', 'income']
INFO (2026-03-12 17:27:16,454) [adjustment/apply_transformations (line 469)]: Final variables in output: ['gender', 'age_group', 'income']
INFO (2026-03-12 17:27:16,461) [ipw/ipw (line 738)]: Building model matrix
INFO (2026-03-12 17:27:16,570) [ipw/ipw (line 764)]: The formula used to build the model matrix: ['income + gender + age_group + _is_na_gender']
INFO (2026-03-12 17:27:16,570) [ipw/ipw (line 767)]: The number of columns in the model matrix: 18
INFO (2026-03-12 17:27:16,570) [ipw/ipw (line 768)]: The number of rows in the model matrix: 11000
INFO (2026-03-12 17:27:18,039) [ipw/ipw (line 990)]: Done with sklearn
INFO (2026-03-12 17:27:18,039) [ipw/ipw (line 992)]: max_de: 1.5
INFO (2026-03-12 17:27:18,039) [ipw/choose_regularization (line 368)]: Starting choosing regularisation parameters
INFO (2026-03-12 17:27:26,575) [ipw/choose_regularization (line 454)]: Best regularisation:
s s_index trim design_effect asmd_improvement asmd
9 0.064155 91 2.5 1.49551 0.535725 0.090719
INFO (2026-03-12 17:27:26,577) [ipw/ipw (line 1047)]: Chosen lambda: 0.06415476458273757
INFO (2026-03-12 17:27:26,577) [ipw/ipw (line 1065)]: Proportion null deviance explained 0.17450914016991492
INFO (2026-03-12 17:27:26,581) [cli/process_batch (line 781)]: Succeeded with adjusting sample to target
INFO (2026-03-12 17:27:26,583) [cli/process_batch (line 782)]: balance adjusted object:
Adjusted balance Sample object with target set using ipw
1000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: None
adjustment details:
method: ipw
weight trimming mean ratio: 2.5
design effect (Deff): 1.496
effective sample size proportion (ESSP): 0.669
effective sample size (ESS): 668.7
target:
balance Sample object
10000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: None
3 common variables: gender,age_group,income
INFO (2026-03-12 17:27:26,583) [cli/process_batch (line 784)]: Condition on which rows to keep for diagnostics: None
INFO (2026-03-12 17:27:26,583) [cli/process_batch (line 788)]: Names of columns to keep for diagnostics: None
INFO (2026-03-12 17:27:26,583) [sample_class/diagnostics (line 1826)]: Starting computation of diagnostics of the fitting
INFO (2026-03-12 17:27:26,859) [sample_class/diagnostics (line 2069)]: Done computing diagnostics
INFO (2026-03-12 17:27:26,864) [cli/process_batch (line 799)]: balance diagnostics object:
                         metric           val            var
0                          size   1000.000000     sample_obs
1                          size      3.000000  sample_covars
2                          size  10000.000000     target_obs
3                          size      3.000000  target_covars
4           weights_diagnostics      1.495510  design_effect
..                          ...           ...            ...
91  covar_main_asmd_improvement      0.182907         income
92     covar_main_asmd_adjusted      0.173301     mean(asmd)
93   covar_main_asmd_unadjusted      0.326799     mean(asmd)
94  covar_main_asmd_improvement      0.153497     mean(asmd)
95           adjustment_failure      0.000000            NaN
[96 rows x 3 columns]
INFO (2026-03-12 17:27:26,866) [cli/main (line 1184)]: Done fitting the model, writing output
| | id | gender | age_group | income | weight | happiness | is_respondent |
|---|---|---|---|---|---|---|---|
| 0 | 0 | Male | 25-34 | 6.428659 | 7.602714 | 26.043029 | 1.0 |
| 1 | 1 | Female | 18-24 | 9.940280 | 9.397964 | 66.885485 | 1.0 |
| 2 | 2 | Male | 18-24 | 2.673623 | 3.433402 | 37.091922 | 1.0 |
| 3 | 3 | NaN | 18-24 | 10.550308 | 6.491919 | 49.394050 | 1.0 |
| 4 | 4 | NaN | 18-24 | 2.689994 | 4.887119 | 72.304208 | 1.0 |
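The log above reports a design effect (Deff) of 1.496, an effective sample size (ESS) of 668.7, and an effective sample size proportion (ESSP) of 0.669 for the 1000 sample rows. These figures are mutually consistent with ESS = n / Deff and ESSP = 1 / Deff. Assuming the design effect is Kish's approximation (an assumption here, not something the log states), a minimal sketch of the computation from a list of weights looks like this:

```python
def kish_design_effect(weights):
    """Kish's design effect: n * sum(w^2) / (sum(w))^2."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


def effective_sample_size(weights):
    """Effective sample size: n / Deff."""
    return len(weights) / kish_design_effect(weights)


# Equal weights carry no penalty; unequal weights shrink the effective sample size.
print(kish_design_effect([1.0, 1.0, 1.0, 1.0]))  # 1.0
print(kish_design_effect([1.0, 3.0]))            # 1.25
print(effective_sample_size([1.0, 3.0]))         # 1.6
```

Applied to the weight column of the adjusted output, the same functions should give values close to those in the log, up to rounding.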
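Filter the diagnostics to review the outcome-weight impact metrics. The result is empty for this run, most likely because no --outcome_columns were passed (the log shows outcome_columns: None):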
load_data_diagnostics_df[
load_data_diagnostics_df["metric"].str.startswith("weights_impact_on_outcome_")
]
| metric | val | var |
|---|---|---|
Inspect diagnostics¶
The diagnostics output is a flat table that includes adjustment metadata and balance
metrics. The metric column identifies the type of diagnostic, while var indicates the
variable (or NaN for overall summaries). It is most useful to inspect var in the
context of the metric it belongs to. The cells below use the diagnostics from the
previous CLI run (load_data_diagnostics_df).
(
load_data_diagnostics_df.groupby("metric")["var"]
.apply(lambda col: sorted(col.dropna().unique()))
.sort_index()
)
metric
adjustment_failure                                                            []
adjustment_method                                                          [ipw]
covar_asmd_adjusted            [age_group[T.25-34], age_group[T.35-44], age_g...
covar_asmd_improvement         [age_group[T.25-34], age_group[T.35-44], age_g...
covar_asmd_unadjusted          [age_group[T.25-34], age_group[T.35-44], age_g...
covar_main_asmd_adjusted                 [age_group, gender, income, mean(asmd)]
covar_main_asmd_improvement              [age_group, gender, income, mean(asmd)]
covar_main_asmd_unadjusted               [age_group, gender, income, mean(asmd)]
ipw_model_glance                                           [intercept_, n_iter_]
ipw_multi_class                                                           [auto]
ipw_penalty                                                         [deprecated]
ipw_solver                                                               [lbfgs]
model_coef                     [C(_is_na_gender, one_hot_encoding_greater_2)[...
model_glance                   [deviance, l1_ratio, lambda, null_deviance, pr...
size                           [sample_covars, sample_obs, target_covars, tar...
weights_diagnostics            [describe_25%, describe_50%, describe_75%, des...
Name: var, dtype: object
load_data_diagnostics_df.query("metric == 'adjustment_method'")
| | metric | val | var |
|---|---|---|---|
| 28 | adjustment_method | 0.0 | ipw |
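Because every diagnostic is a (metric, val, var) row, a nested lookup keyed by metric and then var makes individual values easy to retrieve. The sketch below uses only the standard library and a handful of rows whose values are copied from the log of the first run; `index_diagnostics` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

# A few rows in the metric/val/var layout, with values copied from the log above.
DIAGNOSTICS_CSV = """metric,val,var
size,1000.0,sample_obs
size,10000.0,target_obs
covar_main_asmd_adjusted,0.173301,mean(asmd)
covar_main_asmd_unadjusted,0.326799,mean(asmd)
covar_main_asmd_improvement,0.153497,mean(asmd)
"""


def index_diagnostics(csv_text):
    """Map metric -> {var: val} so a value can be looked up by (metric, var)."""
    table = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        table[row["metric"]][row["var"]] = float(row["val"])
    return dict(table)


diag = index_diagnostics(DIAGNOSTICS_CSV)
unadjusted = diag["covar_main_asmd_unadjusted"]["mean(asmd)"]
adjusted = diag["covar_main_asmd_adjusted"]["mean(asmd)"]
print(round(unadjusted - adjusted, 4))  # 0.1535
```

In these rows the reported improvement (0.153497) matches the difference between the unadjusted and adjusted mean ASMD up to rounding, which suggests that is how the improvement metric is defined.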
CLI Help and Arguments¶
You can view all available CLI arguments using --help. Because the full output is long,
the snippet below prints only its first 40 lines.
# Print a shorter CLI help snippet
help_output = subprocess.run(
    ["python", "-m", "balance.cli", "--help"],
    check=False,
    capture_output=True,
    text=True,
).stdout
print("\n".join(help_output.splitlines()[:40]))
usage: cli.py [-h] --input_file INPUT_FILE --output_file OUTPUT_FILE
[--diagnostics_output_file DIAGNOSTICS_OUTPUT_FILE]
[--method METHOD] [--sample_column SAMPLE_COLUMN]
[--id_column ID_COLUMN] [--weight_column WEIGHT_COLUMN]
--covariate_columns COVARIATE_COLUMNS
[--outcome_columns OUTCOME_COLUMNS]
[--covariate_columns_for_diagnostics COVARIATE_COLUMNS_FOR_DIAGNOSTICS]
[--rows_to_keep_for_diagnostics ROWS_TO_KEEP_FOR_DIAGNOSTICS]
[--weights_impact_on_outcome_method WEIGHTS_IMPACT_ON_OUTCOME_METHOD]
[--batch_columns BATCH_COLUMNS] [--keep_columns KEEP_COLUMNS]
[--keep_row_column KEEP_ROW_COLUMN]
[--sep_input_file SEP_INPUT_FILE]
[--sep_output_file SEP_OUTPUT_FILE]
[--sep_diagnostics_output_file SEP_DIAGNOSTICS_OUTPUT_FILE]
[--no_output_header] [--succeed_on_weighting_failure]
[--max_de MAX_DE] [--lambda_min LAMBDA_MIN]
[--lambda_max LAMBDA_MAX] [--num_lambdas NUM_LAMBDAS]
[--ipw_logistic_regression_kwargs IPW_LOGISTIC_REGRESSION_KWARGS]
[--weight_trimming_mean_ratio WEIGHT_TRIMMING_MEAN_RATIO]
[--one_hot_encoding ONE_HOT_ENCODING]
[--transformations TRANSFORMATIONS] [--formula FORMULA]
[--return_df_with_original_dtypes]
[--standardize_types STANDARDIZE_TYPES]
options:
-h, --help show this help message and exit
--input_file INPUT_FILE
Path to input sample/target
--output_file OUTPUT_FILE
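The help snippet passes check=False, so a non-zero exit code would be ignored, whereas the earlier runs use subprocess.check_call, which raises on failure. When the CLI is one step in a larger script, it can be useful to capture the exit code explicitly and decide what to do with it. The sketch below shows one way to do that; `run_step` is a hypothetical helper, and the commands are stand-ins so the example runs without balance installed.

```python
import subprocess
import sys


def run_step(cmd):
    """Run a command, returning (ok, exit_code, stderr) instead of raising."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.returncode == 0, result.returncode, result.stderr


# Stand-in commands that exit with a chosen code.
ok, code, _ = run_step([sys.executable, "-c", "import sys; sys.exit(0)"])
print(ok, code)  # True 0
ok, code, _ = run_step([sys.executable, "-c", "import sys; sys.exit(3)"])
print(ok, code)  # False 3
```

A pipeline can then halt, retry, or send an alert based on the returned flag, and log stderr for later inspection.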
Key CLI Arguments Summary¶
Here are the most commonly used arguments:
| Argument | Default | Description |
|---|---|---|
| --method | ipw | Adjustment method: ipw, cbps, or rake |
| --max_de | 1.5 | Maximum design effect. Set to None to use lambda_1se instead |
| --lambda_min | 1e-05 | Lower bound for L1 penalty (IPW only) |
| --lambda_max | 10 | Upper bound for L1 penalty (IPW only) |
| --num_lambdas | 250 | Number of lambda values to search (IPW only) |
| --weight_trimming_mean_ratio | 20.0 | Trim weights above mean(weights) * ratio |
| --transformations | default | Covariate transformations. Use None to disable |
| --formula | None | Custom model formula (e.g., "gender + income") |
| --one_hot_encoding | True | One-hot encode categorical features |
| --batch_columns | None | Columns to group by for batch processing |
| --keep_columns | None | Subset of columns to include in output |
| --outcome_columns | None | Columns treated as outcomes (not covariates) |
| --ipw_logistic_regression_kwargs | None | JSON string of kwargs for sklearn LogisticRegression |
| --succeed_on_weighting_failure | False | Return null weights instead of failing on errors |
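When many of these arguments are set at once, building the argument list from a dictionary keeps the call readable and avoids quoting mistakes, particularly for the JSON-valued --ipw_logistic_regression_kwargs. The sketch below is a hypothetical convenience wrapper, not part of balance; it only assembles the list and does not run anything. The file names are placeholders.

```python
import json
import shlex
import sys


def build_cli_command(options, flags=(), module="balance.cli"):
    """Turn option names and values into an argv list for `python -m <module>`."""
    cmd = [sys.executable, "-m", module]
    for name, value in options.items():
        if isinstance(value, dict):
            value = json.dumps(value)  # JSON-valued argument
        elif isinstance(value, (list, tuple)):
            value = ",".join(map(str, value))  # comma-separated column names
        cmd += [f"--{name}", str(value)]
    # Switches that take no value, such as --succeed_on_weighting_failure.
    cmd += [f"--{name}" for name in flags]
    return cmd


cmd = build_cli_command(
    {
        "input_file": "input.csv",
        "output_file": "weights.csv",
        "covariate_columns": ["gender", "age_group", "income"],
        "method": "ipw",
        "ipw_logistic_regression_kwargs": {"solver": "liblinear", "max_iter": 500},
    },
    flags=["succeed_on_weighting_failure"],
)
# shlex.join quotes each argument, so the printed line can be pasted into a shell.
print(shlex.join(cmd))
```

Printing with shlex.join rather than " ".join keeps the JSON argument quoted, so the displayed command remains valid if copied into a terminal.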
Example: Tuning IPW parameters¶
Below we run the CLI with custom regularization settings and a custom logistic regression solver:
with tempfile.TemporaryDirectory() as tmpdir:
    input_path = os.path.join(tmpdir, "input.csv")
    output_path = os.path.join(tmpdir, "weights_tuned.csv")
    diagnostics_path = os.path.join(tmpdir, "diagnostics_tuned.csv")
    load_data_input_df.to_csv(input_path, index=False)
    cmd = [
        "python",
        "-m",
        "balance.cli",
        "--input_file", input_path,
        "--output_file", output_path,
        "--diagnostics_output_file", diagnostics_path,
        "--covariate_columns", "gender,age_group,income",
        "--method", "ipw",
        # Tuning parameters
        "--max_de", "2.0",
        "--lambda_min", "1e-06",
        "--lambda_max", "100",
        "--num_lambdas", "500",
        "--weight_trimming_mean_ratio", "10.0",
        # Custom logistic regression settings
        "--ipw_logistic_regression_kwargs", '{"solver": "liblinear", "max_iter": 500}',
    ]
    print("CLI command:")
    print(" ".join(cmd))
    subprocess.check_call(cmd)
    # Read the output before the temporary directory is deleted
    tuned_adjusted_df = pd.read_csv(output_path)
tuned_adjusted_df.head()
CLI command:
python -m balance.cli --input_file /tmp/tmp__4to2tm/input.csv --output_file /tmp/tmp__4to2tm/weights_tuned.csv --diagnostics_output_file /tmp/tmp__4to2tm/diagnostics_tuned.csv --covariate_columns gender,age_group,income --method ipw --max_de 2.0 --lambda_min 1e-06 --lambda_max 100 --num_lambdas 500 --weight_trimming_mean_ratio 10.0 --ipw_logistic_regression_kwargs {"solver": "liblinear", "max_iter": 500}
INFO (2026-03-12 17:27:31,704) [__init__/<module> (line 72)]: Using balance version 0.16.1
INFO (2026-03-12 17:27:31,706) [cli/main (line 1095)]: Running cli.main() using balance version 0.16.1
INFO (2026-03-12 17:27:31,706) [cli/main (line 1130)]: Attributes used by main() for running adjust: {'transformations': 'default', 'formula': None, 'penalty_factor': None, 'one_hot_encoding': True, 'max_de': 2.0, 'lambda_min': 1e-06, 'lambda_max': 100.0, 'num_lambdas': 500, 'weight_trimming_mean_ratio': 10.0, 'sample_cls': <class 'balance.sample_class.Sample'>, 'sample_package_name': 'balance', 'sample_package_version': '0.16.1'}
INFO (2026-03-12 17:27:31,717) [cli/load_and_check_input (line 926)]: Number of rows in input file: 11000
INFO (2026-03-12 17:27:31,717) [cli/load_and_check_input (line 932)]: Number of columns in input file: 7
WARNING (2026-03-12 17:27:31,870) [sample_class/from_frame (line 469)]: Casting id column to string
WARNING (2026-03-12 17:27:31,881) [pandas_utils/_warn_of_df_dtypes_change (line 514)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-03-12 17:27:31,881) [pandas_utils/_warn_of_df_dtypes_change (line 525)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-03-12 17:27:31,882) [pandas_utils/_warn_of_df_dtypes_change (line 528)]:
is_respondent int64
id int64
dtype: object
WARNING (2026-03-12 17:27:31,882) [pandas_utils/_warn_of_df_dtypes_change (line 529)]: The (new) dtypes saved in df (after the change):
WARNING (2026-03-12 17:27:31,883) [pandas_utils/_warn_of_df_dtypes_change (line 530)]:
is_respondent float64
id str
dtype: object
INFO (2026-03-12 17:27:31,884) [cli/process_batch (line 747)]: balance sample object:
balance Sample object
1000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: None
WARNING (2026-03-12 17:27:31,892) [sample_class/from_frame (line 469)]: Casting id column to string
WARNING (2026-03-12 17:27:31,905) [pandas_utils/_warn_of_df_dtypes_change (line 514)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-03-12 17:27:31,906) [pandas_utils/_warn_of_df_dtypes_change (line 525)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-03-12 17:27:31,906) [pandas_utils/_warn_of_df_dtypes_change (line 528)]:
is_respondent int64
id int64
dtype: object
WARNING (2026-03-12 17:27:31,907) [pandas_utils/_warn_of_df_dtypes_change (line 529)]: The (new) dtypes saved in df (after the change):
WARNING (2026-03-12 17:27:31,907) [pandas_utils/_warn_of_df_dtypes_change (line 530)]:
is_respondent float64
id str
dtype: object
INFO (2026-03-12 17:27:31,909) [cli/process_batch (line 758)]: balance target object:
balance Sample object
10000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: None
INFO (2026-03-12 17:27:31,914) [ipw/ipw (line 703)]: Starting ipw function
INFO (2026-03-12 17:27:31,916) [adjustment/apply_transformations (line 433)]: Adding the variables: []
INFO (2026-03-12 17:27:31,916) [adjustment/apply_transformations (line 434)]: Transforming the variables: ['gender', 'age_group', 'income']
INFO (2026-03-12 17:27:31,923) [adjustment/apply_transformations (line 469)]: Final variables in output: ['gender', 'age_group', 'income']
INFO (2026-03-12 17:27:31,930) [ipw/ipw (line 738)]: Building model matrix
INFO (2026-03-12 17:27:32,039) [ipw/ipw (line 764)]: The formula used to build the model matrix: ['income + gender + age_group + _is_na_gender']
INFO (2026-03-12 17:27:32,039) [ipw/ipw (line 767)]: The number of columns in the model matrix: 18
INFO (2026-03-12 17:27:32,039) [ipw/ipw (line 768)]: The number of rows in the model matrix: 11000
INFO (2026-03-12 17:27:32,066) [ipw/ipw (line 990)]: Done with sklearn
INFO (2026-03-12 17:27:32,066) [ipw/ipw (line 992)]: max_de: 2.0
INFO (2026-03-12 17:27:32,066) [ipw/choose_regularization (line 368)]: Starting choosing regularisation parameters
INFO (2026-03-12 17:27:36,743) [ipw/choose_regularization (line 454)]: Best regularisation:
s s_index trim design_effect asmd_improvement asmd
6 NaN 0 2.5 1.714071 0.634917 0.071337
INFO (2026-03-12 17:27:36,745) [ipw/ipw (line 1047)]: Chosen lambda: nan
INFO (2026-03-12 17:27:36,745) [ipw/ipw (line 1065)]: Proportion null deviance explained 0.18280833369391136
INFO (2026-03-12 17:27:36,748) [cli/process_batch (line 781)]: Succeeded with adjusting sample to target
INFO (2026-03-12 17:27:36,750) [cli/process_batch (line 782)]: balance adjusted object:
Adjusted balance Sample object with target set using ipw
1000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: None
adjustment details:
method: ipw
weight trimming mean ratio: 2.5
design effect (Deff): 1.714
effective sample size proportion (ESSP): 0.583
effective sample size (ESS): 583.4
target:
balance Sample object
10000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: None
3 common variables: gender,age_group,income
INFO (2026-03-12 17:27:36,750) [cli/process_batch (line 784)]: Condition on which rows to keep for diagnostics: None
INFO (2026-03-12 17:27:36,750) [cli/process_batch (line 788)]: Names of columns to keep for diagnostics: None
INFO (2026-03-12 17:27:36,750) [sample_class/diagnostics (line 1826)]: Starting computation of diagnostics of the fitting
INFO (2026-03-12 17:27:37,027) [sample_class/diagnostics (line 2069)]: Done computing diagnostics
INFO (2026-03-12 17:27:37,031) [cli/process_batch (line 799)]: balance diagnostics object:
                         metric           val            var
0                          size   1000.000000     sample_obs
1                          size      3.000000  sample_covars
2                          size  10000.000000     target_obs
3                          size      3.000000  target_covars
4           weights_diagnostics      1.714071  design_effect
..                          ...           ...            ...
91  covar_main_asmd_improvement      0.225463         income
92     covar_main_asmd_adjusted      0.143344     mean(asmd)
93   covar_main_asmd_unadjusted      0.326799     mean(asmd)
94  covar_main_asmd_improvement      0.183455     mean(asmd)
95           adjustment_failure      0.000000            NaN
[96 rows x 3 columns]
INFO (2026-03-12 17:27:37,034) [cli/main (line 1184)]: Done fitting the model, writing output
| | id | gender | age_group | income | weight | happiness | is_respondent |
|---|---|---|---|---|---|---|---|
| 0 | 0 | Male | 25-34 | 6.428659 | 6.714531 | 26.043029 | 1.0 |
| 1 | 1 | Female | 18-24 | 9.940280 | 8.721215 | 66.885485 | 1.0 |
| 2 | 2 | Male | 18-24 | 2.673623 | 2.537674 | 37.091922 | 1.0 |
| 3 | 3 | NaN | 18-24 | 10.550308 | 5.587013 | 49.394050 | 1.0 |
| 4 | 4 | NaN | 18-24 | 2.689994 | 3.883128 | 72.304208 | 1.0 |
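The arguments table describes --weight_trimming_mean_ratio as trimming weights above mean(weights) * ratio. The sketch below illustrates only that capping rule on a toy list; it is not balance's implementation, which may also rescale the weights after trimming. Note too that the log of this run reports a weight trimming mean ratio of 2.5 even though 10.0 was passed, which suggests the regularization search can select its own trim value when max_de is set.

```python
def trim_weights_by_mean_ratio(weights, ratio):
    """Cap each weight at mean(weights) * ratio."""
    cap = sum(weights) / len(weights) * ratio
    return [min(w, cap) for w in weights]


weights = [1.0, 1.0, 1.0, 1.0, 16.0]  # mean is 4.0
print(trim_weights_by_mean_ratio(weights, 2.0))  # [1.0, 1.0, 1.0, 1.0, 8.0]
```

A smaller ratio caps extreme weights more aggressively, which tends to lower the design effect at the cost of some remaining bias.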
Example: Using a Custom Formula¶
The --formula argument allows you to specify a custom model formula, including interaction
terms. When using --formula, you should typically also set --transformations=None to
prevent automatic transformations from interfering with your custom formula.
The formula uses patsy/R-style syntax:
- gender + income: additive terms (no interaction)
- gender * income: equivalent to gender + income + gender:income (main effects + interaction)
- gender:income: only the interaction term
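To see what gender * income expands to when gender is one-hot encoded, the toy enumeration below lists the resulting terms. It is an illustration only, not patsy's parser, and the labels are simplified; the three levels (Female, Male, _NA) are the ones that appear in the coefficient output of the run that follows.

```python
def expand_star(categorical, levels, numeric):
    """List the terms implied by `categorical * numeric` with one-hot levels."""
    main = [f"{categorical}[{level}]" for level in levels] + [numeric]
    interaction = [f"{categorical}[{level}]:{numeric}" for level in levels]
    return main + interaction


terms = expand_star("gender", ["Female", "Male", "_NA"], "income")
for term in terms:
    print(term)
print(len(terms))  # 7
```

The count of seven terms (three gender indicators, income, and three gender-by-income interactions) agrees with the number of model matrix columns reported in the log of the formula example.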
with tempfile.TemporaryDirectory() as tmpdir:
    input_path = os.path.join(tmpdir, "input.csv")
    output_path = os.path.join(tmpdir, "weights_formula.csv")
    diagnostics_path = os.path.join(tmpdir, "diagnostics_formula.csv")
    # Use the demo data for the formula example
    load_data_input_df.to_csv(input_path, index=False)
    cmd = [
        "python",
        "-m",
        "balance.cli",
        "--input_file", input_path,
        "--output_file", output_path,
        "--diagnostics_output_file", diagnostics_path,
        "--covariate_columns", "gender,age_group,income",
        "--method", "ipw",
        # Disable transformations to use raw covariates in formula
        "--transformations", "None",
        # Use a formula with interaction term
        "--formula", "gender*income",
    ]
    print("CLI command with custom formula:")
    print(" ".join(cmd))
    subprocess.check_call(cmd)
    # Read the diagnostics before the temporary directory is deleted
    formula_diagnostics_df = pd.read_csv(diagnostics_path)
# Check model coefficients to verify formula was applied
print("\nModel coefficients (showing interaction term):")
print(formula_diagnostics_df.query("metric == 'model_coef'")[["var", "val"]])
CLI command with custom formula:
python -m balance.cli --input_file /tmp/tmpzk20yf2u/input.csv --output_file /tmp/tmpzk20yf2u/weights_formula.csv --diagnostics_output_file /tmp/tmpzk20yf2u/diagnostics_formula.csv --covariate_columns gender,age_group,income --method ipw --transformations None --formula gender*income
INFO (2026-03-12 17:27:39,494) [__init__/<module> (line 72)]: Using balance version 0.16.1
INFO (2026-03-12 17:27:39,496) [cli/main (line 1095)]: Running cli.main() using balance version 0.16.1
INFO (2026-03-12 17:27:39,496) [cli/main (line 1130)]: Attributes used by main() for running adjust: {'transformations': None, 'formula': 'gender*income', 'penalty_factor': None, 'one_hot_encoding': True, 'max_de': 1.5, 'lambda_min': 1e-05, 'lambda_max': 10, 'num_lambdas': 250, 'weight_trimming_mean_ratio': 20.0, 'sample_cls': <class 'balance.sample_class.Sample'>, 'sample_package_name': 'balance', 'sample_package_version': '0.16.1'}
INFO (2026-03-12 17:27:39,507) [cli/load_and_check_input (line 926)]: Number of rows in input file: 11000
INFO (2026-03-12 17:27:39,507) [cli/load_and_check_input (line 932)]: Number of columns in input file: 7
WARNING (2026-03-12 17:27:39,660) [sample_class/from_frame (line 469)]: Casting id column to string
WARNING (2026-03-12 17:27:39,671) [pandas_utils/_warn_of_df_dtypes_change (line 514)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-03-12 17:27:39,671) [pandas_utils/_warn_of_df_dtypes_change (line 525)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-03-12 17:27:39,672) [pandas_utils/_warn_of_df_dtypes_change (line 528)]:
is_respondent int64
id int64
dtype: object
WARNING (2026-03-12 17:27:39,672) [pandas_utils/_warn_of_df_dtypes_change (line 529)]: The (new) dtypes saved in df (after the change):
WARNING (2026-03-12 17:27:39,673) [pandas_utils/_warn_of_df_dtypes_change (line 530)]:
is_respondent float64
id str
dtype: object
INFO (2026-03-12 17:27:39,674) [cli/process_batch (line 747)]: balance sample object:
balance Sample object
1000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: None
WARNING (2026-03-12 17:27:39,681) [sample_class/from_frame (line 469)]: Casting id column to string
WARNING (2026-03-12 17:27:39,694) [pandas_utils/_warn_of_df_dtypes_change (line 514)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-03-12 17:27:39,695) [pandas_utils/_warn_of_df_dtypes_change (line 525)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-03-12 17:27:39,695) [pandas_utils/_warn_of_df_dtypes_change (line 528)]:
is_respondent int64
id int64
dtype: object
WARNING (2026-03-12 17:27:39,695) [pandas_utils/_warn_of_df_dtypes_change (line 529)]: The (new) dtypes saved in df (after the change):
WARNING (2026-03-12 17:27:39,696) [pandas_utils/_warn_of_df_dtypes_change (line 530)]:
is_respondent float64
id str
dtype: object
INFO (2026-03-12 17:27:39,699) [cli/process_batch (line 758)]: balance target object:
balance Sample object
10000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: None
INFO (2026-03-12 17:27:39,704) [ipw/ipw (line 703)]: Starting ipw function
INFO (2026-03-12 17:27:39,705) [ipw/ipw (line 738)]: Building model matrix
INFO (2026-03-12 17:27:39,753) [ipw/ipw (line 764)]: The formula used to build the model matrix: ['gender*income']
INFO (2026-03-12 17:27:39,753) [ipw/ipw (line 767)]: The number of columns in the model matrix: 7
INFO (2026-03-12 17:27:39,753) [ipw/ipw (line 768)]: The number of rows in the model matrix: 11000
INFO (2026-03-12 17:27:41,213) [ipw/ipw (line 990)]: Done with sklearn
INFO (2026-03-12 17:27:41,213) [ipw/ipw (line 992)]: max_de: 1.5
INFO (2026-03-12 17:27:41,213) [ipw/choose_regularization (line 368)]: Starting choosing regularisation parameters
INFO (2026-03-12 17:27:47,871) [ipw/choose_regularization (line 454)]: Best regularisation:
s s_index trim design_effect asmd_improvement asmd
9 0.043507 98 5.0 1.495849 0.517118 0.157805
INFO (2026-03-12 17:27:47,873) [ipw/ipw (line 1047)]: Chosen lambda: 0.043506507030756265
INFO (2026-03-12 17:27:47,873) [ipw/ipw (line 1065)]: Proportion null deviance explained 0.09595118841373662
WARNING (2026-03-12 17:27:47,873) [ipw/ipw (line 1073)]: The propensity model has low fraction null deviance explained (0.09595118841373662). Results may not be accurate
INFO (2026-03-12 17:27:47,876) [cli/process_batch (line 781)]: Succeeded with adjusting sample to target
INFO (2026-03-12 17:27:47,878) [cli/process_batch (line 782)]: balance adjusted object:
Adjusted balance Sample object with target set using ipw
1000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: None
adjustment details:
method: ipw
weight trimming mean ratio: 5.0
design effect (Deff): 1.496
effective sample size proportion (ESSP): 0.669
effective sample size (ESS): 668.5
target:
balance Sample object
10000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: None
3 common variables: gender,age_group,income
INFO (2026-03-12 17:27:47,878) [cli/process_batch (line 784)]: Condition on which rows to keep for diagnostics: None
INFO (2026-03-12 17:27:47,878) [cli/process_batch (line 788)]: Names of columns to keep for diagnostics: None
INFO (2026-03-12 17:27:47,878) [sample_class/diagnostics (line 1826)]: Starting computation of diagnostics of the fitting
INFO (2026-03-12 17:27:48,157) [sample_class/diagnostics (line 2069)]: Done computing diagnostics
INFO (2026-03-12 17:27:48,162) [cli/process_batch (line 799)]: balance diagnostics object:
                         metric           val            var
0                          size   1000.000000     sample_obs
1                          size      3.000000  sample_covars
2                          size  10000.000000     target_obs
3                          size      3.000000  target_covars
4           weights_diagnostics      1.495849  design_effect
..                          ...           ...            ...
80  covar_main_asmd_improvement      0.301760         income
81     covar_main_asmd_adjusted      0.157805     mean(asmd)
82   covar_main_asmd_unadjusted      0.326799     mean(asmd)
83  covar_main_asmd_improvement      0.168993     mean(asmd)
84           adjustment_failure      0.000000            NaN
[85 rows x 3 columns]
INFO (2026-03-12 17:27:48,164) [cli/main (line 1184)]: Done fitting the model, writing output
Model coefficients (showing interaction term):
var val
40 intercept 0.452742
41 C(gender, one_hot_encoding_greater_2)[Female] -0.186488
42 C(gender, one_hot_encoding_greater_2)[Female]:... -0.224843
43 C(gender, one_hot_encoding_greater_2)[Male] 0.181382
44 C(gender, one_hot_encoding_greater_2)[Male]:in... -0.198422
45 C(gender, one_hot_encoding_greater_2)[_NA] 0.008414
46 C(gender, one_hot_encoding_greater_2)[_NA]:income -0.091654
47 income -0.372709
Batch Processing Example¶
The --batch_columns argument allows you to run separate adjustments for each unique
combination of values in the specified columns. This is useful when you want to compute
weights independently for different subgroups (e.g., by gender or region).
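To make the batching idea concrete, here is a minimal pandas sketch of splitting an input frame on the batch columns and separating sample rows from target rows within each batch. It is an illustration only and not the CLI's actual implementation; the toy data frame and the batch_sizes name are made up for this example.

```python
import pandas as pd

# Toy input: one row per unit, with a sample indicator (hypothetical data).
toy_df = pd.DataFrame(
    {
        "id": range(8),
        "gender": ["Female"] * 3 + ["Male"] * 5,
        "is_respondent": [1, 0, 0, 1, 1, 0, 0, 0],
    }
)

batch_columns = ["gender"]  # mirrors "--batch_columns gender"
batch_sizes = {}

for batch_key, batch_df in toy_df.groupby(batch_columns):
    # Normalize the key to a tuple so it reads like the log lines,
    # e.g. batch = ('Female',), regardless of pandas version.
    key = batch_key if isinstance(batch_key, tuple) else (batch_key,)
    sample_rows = batch_df[batch_df["is_respondent"] == 1]
    target_rows = batch_df[batch_df["is_respondent"] == 0]
    # Each batch would be adjusted on its own sample/target pair.
    batch_sizes[key] = (len(sample_rows), len(target_rows))

for key, (n_sample, n_target) in batch_sizes.items():
    print(f"batch = {key}: {n_sample} sample rows, {n_target} target rows")
```

Because every batch gets its own sample and target, each batch needs enough rows of both kinds for the adjustment to be meaningful.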
# Create a dataset with a batch column for gender
batch_input_df = load_data_input_df.copy()
# The 'gender' column has values like 'Female', 'Male', and possibly NA
# Filter to only rows with non-null gender for this example
batch_input_df = batch_input_df[batch_input_df["gender"].notna()].copy()
print(f"Rows after filtering: {len(batch_input_df)}")
print(f"Gender distribution:\n{batch_input_df['gender'].value_counts()}")
Rows after filtering: 10014
Gender distribution:
gender
Male 5195
Female 4819
Name: count, dtype: int64
with tempfile.TemporaryDirectory() as tmpdir:
    input_path = os.path.join(tmpdir, "input_batch.csv")
    output_path = os.path.join(tmpdir, "weights_batch.csv")
    diagnostics_path = os.path.join(tmpdir, "diagnostics_batch.csv")
    batch_input_df.to_csv(input_path, index=False)
    cmd = [
        "python",
        "-m",
        "balance.cli",
        "--input_file", input_path,
        "--output_file", output_path,
        "--diagnostics_output_file", diagnostics_path,
        "--covariate_columns", "age_group,income",  # Note: gender is now used as batch column
        "--outcome_columns", "happiness",
        "--batch_columns", "gender",  # Process each gender separately
        "--method", "ipw",
    ]
    print("CLI command with batch processing:")
    print(" ".join(cmd))
    subprocess.check_call(cmd)
    # Read the outputs before the temporary directory is removed
    batch_adjusted_df = pd.read_csv(output_path)
    batch_diagnostics_df = pd.read_csv(diagnostics_path)
print(f"\nOutput rows: {len(batch_adjusted_df)}")
batch_adjusted_df.head()
CLI command with batch processing:
python -m balance.cli --input_file /tmp/tmppjmjw6pb/input_batch.csv --output_file /tmp/tmppjmjw6pb/weights_batch.csv --diagnostics_output_file /tmp/tmppjmjw6pb/diagnostics_batch.csv --covariate_columns age_group,income --outcome_columns happiness --batch_columns gender --method ipw
INFO (2026-03-12 17:27:50,670) [__init__/<module> (line 72)]: Using balance version 0.16.1
INFO (2026-03-12 17:27:50,672) [cli/main (line 1095)]: Running cli.main() using balance version 0.16.1
INFO (2026-03-12 17:27:50,672) [cli/main (line 1130)]: Attributes used by main() for running adjust: {'transformations': 'default', 'formula': None, 'penalty_factor': None, 'one_hot_encoding': True, 'max_de': 1.5, 'lambda_min': 1e-05, 'lambda_max': 10, 'num_lambdas': 250, 'weight_trimming_mean_ratio': 20.0, 'sample_cls': <class 'balance.sample_class.Sample'>, 'sample_package_name': 'balance', 'sample_package_version': '0.16.1'}
INFO (2026-03-12 17:27:50,682) [cli/load_and_check_input (line 926)]: Number of rows in input file: 10014
INFO (2026-03-12 17:27:50,682) [cli/load_and_check_input (line 932)]: Number of columns in input file: 7
INFO (2026-03-12 17:27:50,685) [cli/main (line 1141)]: Running weighting for batch = ('Female',)
WARNING (2026-03-12 17:27:50,838) [sample_class/from_frame (line 469)]: Casting id column to string
WARNING (2026-03-12 17:27:50,849) [pandas_utils/_warn_of_df_dtypes_change (line 514)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-03-12 17:27:50,849) [pandas_utils/_warn_of_df_dtypes_change (line 525)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-03-12 17:27:50,850) [pandas_utils/_warn_of_df_dtypes_change (line 528)]:
is_respondent int64
id int64
dtype: object
WARNING (2026-03-12 17:27:50,850) [pandas_utils/_warn_of_df_dtypes_change (line 529)]: The (new) dtypes saved in df (after the change):
WARNING (2026-03-12 17:27:50,851) [pandas_utils/_warn_of_df_dtypes_change (line 530)]:
is_respondent float64
id str
dtype: object
INFO (2026-03-12 17:27:50,852) [cli/process_batch (line 747)]: balance sample object:
balance Sample object
268 observations x 2 variables: age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness
WARNING (2026-03-12 17:27:50,859) [sample_class/from_frame (line 469)]: Casting id column to string
WARNING (2026-03-12 17:27:50,870) [pandas_utils/_warn_of_df_dtypes_change (line 514)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-03-12 17:27:50,870) [pandas_utils/_warn_of_df_dtypes_change (line 525)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-03-12 17:27:50,871) [pandas_utils/_warn_of_df_dtypes_change (line 528)]:
is_respondent int64
id int64
dtype: object
WARNING (2026-03-12 17:27:50,871) [pandas_utils/_warn_of_df_dtypes_change (line 529)]: The (new) dtypes saved in df (after the change):
WARNING (2026-03-12 17:27:50,872) [pandas_utils/_warn_of_df_dtypes_change (line 530)]:
is_respondent float64
id str
dtype: object
INFO (2026-03-12 17:27:50,874) [cli/process_batch (line 758)]: balance target object:
balance Sample object
4551 observations x 2 variables: age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness
INFO (2026-03-12 17:27:50,877) [ipw/ipw (line 703)]: Starting ipw function
INFO (2026-03-12 17:27:50,878) [adjustment/apply_transformations (line 433)]: Adding the variables: []
INFO (2026-03-12 17:27:50,878) [adjustment/apply_transformations (line 434)]: Transforming the variables: ['age_group', 'income']
INFO (2026-03-12 17:27:50,883) [adjustment/apply_transformations (line 469)]: Final variables in output: ['age_group', 'income']
INFO (2026-03-12 17:27:50,887) [ipw/ipw (line 738)]: Building model matrix
INFO (2026-03-12 17:27:50,921) [ipw/ipw (line 764)]: The formula used to build the model matrix: ['income + age_group']
INFO (2026-03-12 17:27:50,921) [ipw/ipw (line 767)]: The number of columns in the model matrix: 14
INFO (2026-03-12 17:27:50,921) [ipw/ipw (line 768)]: The number of rows in the model matrix: 4819
INFO (2026-03-12 17:27:51,811) [ipw/ipw (line 990)]: Done with sklearn
INFO (2026-03-12 17:27:51,812) [ipw/ipw (line 992)]: max_de: 1.5
INFO (2026-03-12 17:27:51,812) [ipw/choose_regularization (line 368)]: Starting choosing regularisation parameters
INFO (2026-03-12 17:27:55,882) [ipw/choose_regularization (line 454)]: Best regularisation:
s s_index trim design_effect asmd_improvement asmd
6 0.105705 82 5.0 1.489687 0.49424 0.09868
INFO (2026-03-12 17:27:55,884) [ipw/ipw (line 1047)]: Chosen lambda: 0.10570520810009826
INFO (2026-03-12 17:27:55,884) [ipw/ipw (line 1065)]: Proportion null deviance explained 0.14888521519197495
INFO (2026-03-12 17:27:55,888) [cli/process_batch (line 781)]: Succeeded with adjusting sample to target
INFO (2026-03-12 17:27:55,890) [cli/process_batch (line 782)]: balance adjusted object:
Adjusted balance Sample object with target set using ipw
268 observations x 2 variables: age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness
adjustment details:
method: ipw
weight trimming mean ratio: 5.0
design effect (Deff): 1.490
effective sample size proportion (ESSP): 0.671
effective sample size (ESS): 179.9
target:
balance Sample object
4551 observations x 2 variables: age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness
2 common variables: age_group,income
INFO (2026-03-12 17:27:55,890) [cli/process_batch (line 784)]: Condition on which rows to keep for diagnostics: None
INFO (2026-03-12 17:27:55,890) [cli/process_batch (line 788)]: Names of columns to keep for diagnostics: None
INFO (2026-03-12 17:27:55,890) [sample_class/diagnostics (line 1826)]: Starting computation of diagnostics of the fitting
INFO (2026-03-12 17:27:56,041) [sample_class/diagnostics (line 2069)]: Done computing diagnostics
INFO (2026-03-12 17:27:56,046) [cli/process_batch (line 799)]: balance diagnostics object: metric val var
0 size 268.000000 sample_obs
1 size 2.000000 sample_covars
2 size 4551.000000 target_obs
3 size 2.000000 target_covars
4 weights_diagnostics 1.489687 design_effect
.. ... ... ...
86 covar_main_asmd_improvement 0.185596 income
87 covar_main_asmd_adjusted 0.220366 mean(asmd)
88 covar_main_asmd_unadjusted 0.422500 mean(asmd)
89 covar_main_asmd_improvement 0.202135 mean(asmd)
90 adjustment_failure 0.000000 NaN
[91 rows x 3 columns]
INFO (2026-03-12 17:27:56,048) [cli/main (line 1158)]: Done processing batch ('Female',)
INFO (2026-03-12 17:27:56,049) [cli/main (line 1141)]: Running weighting for batch = ('Male',)
WARNING (2026-03-12 17:27:56,058) [sample_class/from_frame (line 469)]: Casting id column to string
WARNING (2026-03-12 17:27:56,068) [pandas_utils/_warn_of_df_dtypes_change (line 514)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-03-12 17:27:56,068) [pandas_utils/_warn_of_df_dtypes_change (line 525)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-03-12 17:27:56,069) [pandas_utils/_warn_of_df_dtypes_change (line 528)]:
is_respondent int64
id int64
dtype: object
WARNING (2026-03-12 17:27:56,069) [pandas_utils/_warn_of_df_dtypes_change (line 529)]: The (new) dtypes saved in df (after the change):
WARNING (2026-03-12 17:27:56,070) [pandas_utils/_warn_of_df_dtypes_change (line 530)]:
is_respondent float64
id str
dtype: object
INFO (2026-03-12 17:27:56,071) [cli/process_batch (line 747)]: balance sample object:
balance Sample object
644 observations x 2 variables: age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness
WARNING (2026-03-12 17:27:56,078) [sample_class/from_frame (line 469)]: Casting id column to string
WARNING (2026-03-12 17:27:56,089) [pandas_utils/_warn_of_df_dtypes_change (line 514)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-03-12 17:27:56,089) [pandas_utils/_warn_of_df_dtypes_change (line 525)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-03-12 17:27:56,090) [pandas_utils/_warn_of_df_dtypes_change (line 528)]:
is_respondent int64
id int64
dtype: object
WARNING (2026-03-12 17:27:56,090) [pandas_utils/_warn_of_df_dtypes_change (line 529)]: The (new) dtypes saved in df (after the change):
WARNING (2026-03-12 17:27:56,090) [pandas_utils/_warn_of_df_dtypes_change (line 530)]:
is_respondent float64
id str
dtype: object
INFO (2026-03-12 17:27:56,092) [cli/process_batch (line 758)]: balance target object:
balance Sample object
4551 observations x 2 variables: age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness
INFO (2026-03-12 17:27:56,095) [ipw/ipw (line 703)]: Starting ipw function
INFO (2026-03-12 17:27:56,097) [adjustment/apply_transformations (line 433)]: Adding the variables: []
INFO (2026-03-12 17:27:56,097) [adjustment/apply_transformations (line 434)]: Transforming the variables: ['age_group', 'income']
INFO (2026-03-12 17:27:56,101) [adjustment/apply_transformations (line 469)]: Final variables in output: ['age_group', 'income']
INFO (2026-03-12 17:27:56,105) [ipw/ipw (line 738)]: Building model matrix
INFO (2026-03-12 17:27:56,139) [ipw/ipw (line 764)]: The formula used to build the model matrix: ['income + age_group']
INFO (2026-03-12 17:27:56,139) [ipw/ipw (line 767)]: The number of columns in the model matrix: 14
INFO (2026-03-12 17:27:56,139) [ipw/ipw (line 768)]: The number of rows in the model matrix: 5195
INFO (2026-03-12 17:27:56,947) [ipw/ipw (line 990)]: Done with sklearn
INFO (2026-03-12 17:27:56,948) [ipw/ipw (line 992)]: max_de: 1.5
INFO (2026-03-12 17:27:56,948) [ipw/choose_regularization (line 368)]: Starting choosing regularisation parameters
INFO (2026-03-12 17:28:01,056) [ipw/choose_regularization (line 454)]: Best regularisation:
s s_index trim design_effect asmd_improvement asmd
9 0.111736 81 5.0 1.495967 0.566287 0.087357
INFO (2026-03-12 17:28:01,057) [ipw/ipw (line 1047)]: Chosen lambda: 0.11173591019485084
INFO (2026-03-12 17:28:01,058) [ipw/ipw (line 1065)]: Proportion null deviance explained 0.1426717197478461
INFO (2026-03-12 17:28:01,061) [cli/process_batch (line 781)]: Succeeded with adjusting sample to target
INFO (2026-03-12 17:28:01,063) [cli/process_batch (line 782)]: balance adjusted object:
Adjusted balance Sample object with target set using ipw
644 observations x 2 variables: age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness
adjustment details:
method: ipw
weight trimming mean ratio: 5.0
design effect (Deff): 1.496
effective sample size proportion (ESSP): 0.668
effective sample size (ESS): 430.5
target:
balance Sample object
4551 observations x 2 variables: age_group,income
id_column: id, weight_column: weight,
outcome_columns: happiness
2 common variables: age_group,income
INFO (2026-03-12 17:28:01,063) [cli/process_batch (line 784)]: Condition on which rows to keep for diagnostics: None
INFO (2026-03-12 17:28:01,063) [cli/process_batch (line 788)]: Names of columns to keep for diagnostics: None
INFO (2026-03-12 17:28:01,063) [sample_class/diagnostics (line 1826)]: Starting computation of diagnostics of the fitting
INFO (2026-03-12 17:28:01,219) [sample_class/diagnostics (line 2069)]: Done computing diagnostics
INFO (2026-03-12 17:28:01,224) [cli/process_batch (line 799)]: balance diagnostics object: metric val var
0 size 644.000000 sample_obs
1 size 2.000000 sample_covars
2 size 4551.000000 target_obs
3 size 2.000000 target_covars
4 weights_diagnostics 1.495967 design_effect
.. ... ... ...
86 covar_main_asmd_improvement 0.235830 income
87 covar_main_asmd_adjusted 0.192214 mean(asmd)
88 covar_main_asmd_unadjusted 0.430017 mean(asmd)
89 covar_main_asmd_improvement 0.237804 mean(asmd)
90 adjustment_failure 0.000000 NaN
[91 rows x 3 columns]
INFO (2026-03-12 17:28:01,226) [cli/main (line 1158)]: Done processing batch ('Male',)
INFO (2026-03-12 17:28:01,227) [cli/main (line 1184)]: Done fitting the model, writing output
Output rows: 912
| | id | age_group | income | happiness | weight | gender | is_respondent |
|---|---|---|---|---|---|---|---|
| 0 | 1 | 18-24 | 9.940280 | 66.885485 | 10.379924 | Female | 1.0 |
| 1 | 92 | 35-44 | 0.185097 | 84.464522 | 18.176360 | Female | 1.0 |
| 2 | 94 | 35-44 | 1.183696 | 65.742184 | 20.852704 | Female | 1.0 |
| 3 | 95 | 18-24 | 3.716007 | 67.624539 | 10.522336 | Female | 1.0 |
| 4 | 98 | 35-44 | 16.751931 | 44.868651 | 40.377325 | Female | 1.0 |
# Inspect weights by gender - each group was adjusted independently
print("Weight statistics by gender (sample only):")
sample_only = batch_adjusted_df[batch_adjusted_df["is_respondent"] == 1]
print(sample_only.groupby("gender")["weight"].describe().round(3))
Weight statistics by gender (sample only):
count mean std min 25% 50% 75% max
gender
Female 268.0 16.981 11.905 6.785 9.566 13.702 19.158 85.648
Male 644.0 7.067 4.981 2.913 3.260 5.776 9.235 35.370
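The logs above report a design effect (Deff), an effective sample size proportion (ESSP), and an effective sample size (ESS) for each batch. As a rough sketch, assuming these follow Kish's formula (Deff = n * sum(w^2) / sum(w)^2, ESS = n / Deff, ESSP = 1 / Deff), the relationship can be checked by hand. The helper names below are hypothetical and are not part of balance.

```python
def kish_design_effect(weights):
    """Kish's design effect: n * sum(w^2) / (sum(w))^2."""
    n = len(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return n * total_sq / (total ** 2)


def effective_sample_size(weights):
    """Effective sample size under Kish's approximation: n / Deff."""
    return len(weights) / kish_design_effect(weights)


# Equal weights carry no penalty: Deff is exactly 1.
equal_weights = [2.0, 2.0, 2.0, 2.0]
print(kish_design_effect(equal_weights))  # 1.0

# One large weight inflates the design effect and shrinks the ESS.
unequal_weights = [1.0, 1.0, 1.0, 5.0]
print(kish_design_effect(unequal_weights))  # 1.75
print(round(effective_sample_size(unequal_weights), 3))  # 2.286

# Cross-check against the Female batch log: 268 observations, Deff 1.489687.
print(round(268 / 1.489687, 1))  # 179.9, matching the reported ESS
```

Under this reading, the Female batch's larger mean weight does not by itself imply a worse design effect; what matters is how unequal the weights are within the batch.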
Alternative Weighting Methods¶
The CLI supports three adjustment methods:
- IPW (Inverse Probability Weighting): The default method, which uses logistic regression to estimate propensity scores
- CBPS (Covariate Balancing Propensity Score): Balances covariates while estimating propensity scores
- Rake (Raking/Iterative Proportional Fitting): Adjusts weights iteratively to match marginal distributions
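To give a feel for what "adjusts weights iteratively to match marginal distributions" means, here is a minimal NumPy sketch of iterative proportional fitting on a two-way table. It is a textbook illustration with made-up counts and a hypothetical rake_2d helper, not the implementation balance uses.

```python
import numpy as np


def rake_2d(table, row_targets, col_targets, max_iter=100, tol=1e-10):
    """Scale a 2-D table of counts until its margins match the targets."""
    fitted = np.asarray(table, dtype=float).copy()
    row_targets = np.asarray(row_targets, dtype=float)
    col_targets = np.asarray(col_targets, dtype=float)
    for _ in range(max_iter):
        # Match the row margins, then the column margins.
        fitted *= (row_targets / fitted.sum(axis=1))[:, None]
        fitted *= (col_targets / fitted.sum(axis=0))[None, :]
        # Columns match exactly after the second step; stop once rows do too.
        if np.allclose(fitted.sum(axis=1), row_targets, rtol=0.0, atol=tol):
            break
    return fitted


# Made-up sample counts (rows: two gender levels, columns: two age groups).
sample_counts = np.array([[30.0, 20.0], [10.0, 40.0]])

# Made-up target margins; both sets must sum to the same total.
fitted = rake_2d(sample_counts, row_targets=[40, 60], col_targets=[50, 50])

# Per-cell weight multiplier implied by the raked table.
cell_weights = fitted / sample_counts
print(fitted.round(3))
print(cell_weights.round(3))
```

Raking only constrains the margins, so it can produce widely varying weights when the sample is far from the target on several variables at once.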
Example: CBPS Method¶
CBPS simultaneously optimizes covariate balance and propensity score estimation:
with tempfile.TemporaryDirectory() as tmpdir:
    input_path = os.path.join(tmpdir, "input.csv")
    output_path = os.path.join(tmpdir, "weights_cbps.csv")
    diagnostics_path = os.path.join(tmpdir, "diagnostics_cbps.csv")
    load_data_input_df.to_csv(input_path, index=False)
    cmd = [
        "python",
        "-m",
        "balance.cli",
        "--input_file", input_path,
        "--output_file", output_path,
        "--diagnostics_output_file", diagnostics_path,
        "--covariate_columns", "gender,age_group,income",
        "--method", "cbps",
    ]
    print("CLI command with CBPS method:")
    print(" ".join(cmd))
    subprocess.check_call(cmd)
    # Read the diagnostics before the temporary directory is removed
    cbps_diagnostics_df = pd.read_csv(diagnostics_path)
# Verify the method used
print("\nAdjustment method used:")
print(cbps_diagnostics_df.query("metric == 'adjustment_method'")[["var", "val"]])
CLI command with CBPS method:
python -m balance.cli --input_file /tmp/tmp4k781k_q/input.csv --output_file /tmp/tmp4k781k_q/weights_cbps.csv --diagnostics_output_file /tmp/tmp4k781k_q/diagnostics_cbps.csv --covariate_columns gender,age_group,income --method cbps
INFO (2026-03-12 17:28:03,733) [__init__/<module> (line 72)]: Using balance version 0.16.1
INFO (2026-03-12 17:28:03,735) [cli/main (line 1095)]: Running cli.main() using balance version 0.16.1
INFO (2026-03-12 17:28:03,735) [cli/main (line 1130)]: Attributes used by main() for running adjust: {'transformations': 'default', 'formula': None, 'penalty_factor': None, 'one_hot_encoding': True, 'max_de': 1.5, 'lambda_min': 1e-05, 'lambda_max': 10, 'num_lambdas': 250, 'weight_trimming_mean_ratio': 20.0, 'sample_cls': <class 'balance.sample_class.Sample'>, 'sample_package_name': 'balance', 'sample_package_version': '0.16.1'}
INFO (2026-03-12 17:28:03,745) [cli/load_and_check_input (line 926)]: Number of rows in input file: 11000
INFO (2026-03-12 17:28:03,745) [cli/load_and_check_input (line 932)]: Number of columns in input file: 7
WARNING (2026-03-12 17:28:03,897) [sample_class/from_frame (line 469)]: Casting id column to string
WARNING (2026-03-12 17:28:03,908) [pandas_utils/_warn_of_df_dtypes_change (line 514)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-03-12 17:28:03,908) [pandas_utils/_warn_of_df_dtypes_change (line 525)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-03-12 17:28:03,909) [pandas_utils/_warn_of_df_dtypes_change (line 528)]:
id int64
is_respondent int64
dtype: object
WARNING (2026-03-12 17:28:03,909) [pandas_utils/_warn_of_df_dtypes_change (line 529)]: The (new) dtypes saved in df (after the change):
WARNING (2026-03-12 17:28:03,910) [pandas_utils/_warn_of_df_dtypes_change (line 530)]:
id str
is_respondent float64
dtype: object
INFO (2026-03-12 17:28:03,911) [cli/process_batch (line 747)]: balance sample object:
balance Sample object
1000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: None
WARNING (2026-03-12 17:28:03,918) [sample_class/from_frame (line 469)]: Casting id column to string
WARNING (2026-03-12 17:28:03,932) [pandas_utils/_warn_of_df_dtypes_change (line 514)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-03-12 17:28:03,932) [pandas_utils/_warn_of_df_dtypes_change (line 525)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-03-12 17:28:03,932) [pandas_utils/_warn_of_df_dtypes_change (line 528)]:
id int64
is_respondent int64
dtype: object
WARNING (2026-03-12 17:28:03,932) [pandas_utils/_warn_of_df_dtypes_change (line 529)]: The (new) dtypes saved in df (after the change):
WARNING (2026-03-12 17:28:03,933) [pandas_utils/_warn_of_df_dtypes_change (line 530)]:
id str
is_respondent float64
dtype: object
INFO (2026-03-12 17:28:03,936) [cli/process_batch (line 758)]: balance target object:
balance Sample object
10000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: None
INFO (2026-03-12 17:28:03,941) [cbps/cbps (line 537)]: Starting cbps function
INFO (2026-03-12 17:28:03,942) [adjustment/apply_transformations (line 433)]: Adding the variables: []
INFO (2026-03-12 17:28:03,942) [adjustment/apply_transformations (line 434)]: Transforming the variables: ['gender', 'age_group', 'income']
INFO (2026-03-12 17:28:03,949) [adjustment/apply_transformations (line 469)]: Final variables in output: ['gender', 'age_group', 'income']
INFO (2026-03-12 17:28:04,064) [cbps/cbps (line 588)]: The formula used to build the model matrix: ['income + gender + age_group + _is_na_gender']
INFO (2026-03-12 17:28:04,065) [cbps/cbps (line 599)]: The number of columns in the model matrix: 16
INFO (2026-03-12 17:28:04,065) [cbps/cbps (line 600)]: The number of rows in the model matrix: 11000
INFO (2026-03-12 17:28:04,074) [cbps/cbps (line 669)]: Finding initial estimator for GMM optimization
INFO (2026-03-12 17:28:04,254) [cbps/cbps (line 696)]: Finding initial estimator for GMM optimization that minimizes the balance loss
INFO (2026-03-12 17:28:05,696) [cbps/cbps (line 732)]: Running GMM optimization
INFO (2026-03-12 17:28:08,514) [cbps/cbps (line 859)]: Done cbps function
INFO (2026-03-12 17:28:08,517) [cli/process_batch (line 781)]: Succeeded with adjusting sample to target
INFO (2026-03-12 17:28:08,519) [cli/process_batch (line 782)]: balance adjusted object:
Adjusted balance Sample object with target set using cbps
1000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: None
adjustment details:
method: cbps
design effect (Deff): 1.500
effective sample size proportion (ESSP): 0.667
effective sample size (ESS): 666.7
target:
balance Sample object
10000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: None
3 common variables: gender,age_group,income
INFO (2026-03-12 17:28:08,519) [cli/process_batch (line 784)]: Condition on which rows to keep for diagnostics: None
INFO (2026-03-12 17:28:08,519) [cli/process_batch (line 788)]: Names of columns to keep for diagnostics: None
INFO (2026-03-12 17:28:08,519) [sample_class/diagnostics (line 1826)]: Starting computation of diagnostics of the fitting
INFO (2026-03-12 17:28:08,793) [sample_class/diagnostics (line 2069)]: Done computing diagnostics
INFO (2026-03-12 17:28:08,797) [cli/process_batch (line 799)]: balance diagnostics object: metric val var
0 size 1000.0 sample_obs
1 size 3.0 sample_covars
2 size 10000.0 target_obs
3 size 3.0 target_covars
4 weights_diagnostics 1.5 design_effect
.. ... ... ...
86 covar_main_asmd_improvement 0.205323 income
87 covar_main_asmd_adjusted 0.175443 mean(asmd)
88 covar_main_asmd_unadjusted 0.326799 mean(asmd)
89 covar_main_asmd_improvement 0.151355 mean(asmd)
90 adjustment_failure 0 NaN
[91 rows x 3 columns]
INFO (2026-03-12 17:28:08,800) [cli/main (line 1184)]: Done fitting the model, writing output
Adjustment method used:
var val
28 cbps 0.0
Example: Rake Method¶
Raking iteratively adjusts weights to match target marginal distributions:
with tempfile.TemporaryDirectory() as tmpdir:
    input_path = os.path.join(tmpdir, "input.csv")
    output_path = os.path.join(tmpdir, "weights_rake.csv")
    diagnostics_path = os.path.join(tmpdir, "diagnostics_rake.csv")
    load_data_input_df.to_csv(input_path, index=False)
    cmd = [
        "python",
        "-m",
        "balance.cli",
        "--input_file", input_path,
        "--output_file", output_path,
        "--diagnostics_output_file", diagnostics_path,
        "--covariate_columns", "gender,age_group,income",
        "--method", "rake",
    ]
    print("CLI command with rake method:")
    print(" ".join(cmd))
    subprocess.check_call(cmd)
    # Read the diagnostics before the temporary directory is removed
    rake_diagnostics_df = pd.read_csv(diagnostics_path)
# Verify the method used
print("\nAdjustment method used:")
print(rake_diagnostics_df.query("metric == 'adjustment_method'")[["var", "val"]])
CLI command with rake method:
python -m balance.cli --input_file /tmp/tmp1ykvcp1o/input.csv --output_file /tmp/tmp1ykvcp1o/weights_rake.csv --diagnostics_output_file /tmp/tmp1ykvcp1o/diagnostics_rake.csv --covariate_columns gender,age_group,income --method rake
INFO (2026-03-12 17:28:11,239) [__init__/<module> (line 72)]: Using balance version 0.16.1
INFO (2026-03-12 17:28:11,241) [cli/main (line 1095)]: Running cli.main() using balance version 0.16.1
INFO (2026-03-12 17:28:11,241) [cli/main (line 1130)]: Attributes used by main() for running adjust: {'transformations': 'default', 'formula': None, 'penalty_factor': None, 'one_hot_encoding': True, 'max_de': 1.5, 'lambda_min': 1e-05, 'lambda_max': 10, 'num_lambdas': 250, 'weight_trimming_mean_ratio': 20.0, 'sample_cls': <class 'balance.sample_class.Sample'>, 'sample_package_name': 'balance', 'sample_package_version': '0.16.1'}
INFO (2026-03-12 17:28:11,251) [cli/load_and_check_input (line 926)]: Number of rows in input file: 11000
INFO (2026-03-12 17:28:11,251) [cli/load_and_check_input (line 932)]: Number of columns in input file: 7
WARNING (2026-03-12 17:28:11,406) [sample_class/from_frame (line 469)]: Casting id column to string
WARNING (2026-03-12 17:28:11,417) [pandas_utils/_warn_of_df_dtypes_change (line 514)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-03-12 17:28:11,417) [pandas_utils/_warn_of_df_dtypes_change (line 525)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-03-12 17:28:11,418) [pandas_utils/_warn_of_df_dtypes_change (line 528)]:
is_respondent int64
id int64
dtype: object
WARNING (2026-03-12 17:28:11,418) [pandas_utils/_warn_of_df_dtypes_change (line 529)]: The (new) dtypes saved in df (after the change):
WARNING (2026-03-12 17:28:11,419) [pandas_utils/_warn_of_df_dtypes_change (line 530)]:
is_respondent float64
id str
dtype: object
INFO (2026-03-12 17:28:11,420) [cli/process_batch (line 747)]: balance sample object:
balance Sample object
1000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: None
WARNING (2026-03-12 17:28:11,428) [sample_class/from_frame (line 469)]: Casting id column to string
WARNING (2026-03-12 17:28:11,441) [pandas_utils/_warn_of_df_dtypes_change (line 514)]: The dtypes of sample._df were changed from the original dtypes of the input df, here are the differences -
WARNING (2026-03-12 17:28:11,441) [pandas_utils/_warn_of_df_dtypes_change (line 525)]: The (old) dtypes that changed for df (before the change):
WARNING (2026-03-12 17:28:11,442) [pandas_utils/_warn_of_df_dtypes_change (line 528)]:
is_respondent int64
id int64
dtype: object
WARNING (2026-03-12 17:28:11,442) [pandas_utils/_warn_of_df_dtypes_change (line 529)]: The (new) dtypes saved in df (after the change):
WARNING (2026-03-12 17:28:11,443) [pandas_utils/_warn_of_df_dtypes_change (line 530)]:
is_respondent float64
id str
dtype: object
INFO (2026-03-12 17:28:11,445) [cli/process_batch (line 758)]: balance target object:
balance Sample object
10000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: None
INFO (2026-03-12 17:28:11,452) [adjustment/apply_transformations (line 433)]: Adding the variables: []
INFO (2026-03-12 17:28:11,452) [adjustment/apply_transformations (line 434)]: Transforming the variables: ['gender', 'age_group', 'income']
INFO (2026-03-12 17:28:11,459) [adjustment/apply_transformations (line 469)]: Final variables in output: ['gender', 'age_group', 'income']
INFO (2026-03-12 17:28:11,501) [rake/rake (line 274)]: Final covariates and levels that will be used in raking: {'age_group': ['18-24', '25-34', '35-44', '45+'], 'gender': ['Female', 'Male', '__NaN__'], 'income': ['(-0.0009997440000000001, 0.44]', '(0.44, 1.664]', '(1.664, 3.472]', '(11.312, 15.139]', '(15.139, 20.567]', '(20.567, 29.504]', '(29.504, 128.536]', '(3.472, 5.663]', '(5.663, 8.211]', '(8.211, 11.312]']}.
INFO (2026-03-12 17:28:11,522) [cli/process_batch (line 781)]: Succeeded with adjusting sample to target
INFO (2026-03-12 17:28:11,524) [cli/process_batch (line 782)]: balance adjusted object:
Adjusted balance Sample object with target set using rake
1000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: None
adjustment details:
method: rake
design effect (Deff): 3.774
effective sample size proportion (ESSP): 0.265
effective sample size (ESS): 265.0
target:
balance Sample object
10000 observations x 3 variables: gender,age_group,income
id_column: id, weight_column: weight,
outcome_columns: None
3 common variables: gender,age_group,income
INFO (2026-03-12 17:28:11,524) [cli/process_batch (line 784)]: Condition on which rows to keep for diagnostics: None
INFO (2026-03-12 17:28:11,524) [cli/process_batch (line 788)]: Names of columns to keep for diagnostics: None
INFO (2026-03-12 17:28:11,524) [sample_class/diagnostics (line 1826)]: Starting computation of diagnostics of the fitting
INFO (2026-03-12 17:28:11,794) [sample_class/diagnostics (line 2069)]: Done computing diagnostics
INFO (2026-03-12 17:28:11,798) [cli/process_batch (line 799)]: balance diagnostics object: metric val var
0 size 1000.000000 sample_obs
1 size 3.000000 sample_covars
2 size 10000.000000 target_obs
3 size 3.000000 target_covars
4 weights_diagnostics 3.773786 design_effect
.. ... ... ...
61 covar_main_asmd_improvement 0.462436 income
62 covar_main_asmd_adjusted 0.014651 mean(asmd)
63 covar_main_asmd_unadjusted 0.326799 mean(asmd)
64 covar_main_asmd_improvement 0.312147 mean(asmd)
65 adjustment_failure 0.000000 NaN
[66 rows x 3 columns]
INFO (2026-03-12 17:28:11,800) [cli/main (line 1184)]: Done fitting the model, writing output
Adjustment method used:
var val
28 rake 0.0
Next steps¶
- Try --method cbps or --method rake for alternative weighting approaches.
- Use --outcome_columns to control which columns are treated as outcomes.
- Supply --ipw_logistic_regression_kwargs to tune the IPW model.
- Use --succeed_on_weighting_failure for pipelines where you want null weights instead of errors.
- Explore --covariate_columns_for_diagnostics and --rows_to_keep_for_diagnostics to customize diagnostic output.
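When wiring the CLI into a pipeline, it can help to decide explicitly what happens on a non-zero exit code instead of letting subprocess.check_call raise. The sketch below shows one possible wrapper; run_step is a hypothetical helper, and the two stand-in commands simply simulate a successful and a failing step so the example runs without balance installed.

```python
import subprocess
import sys


def run_step(cmd):
    """Run a command and return (ok, returncode) instead of raising."""
    completed = subprocess.run(cmd, capture_output=True, text=True)
    return completed.returncode == 0, completed.returncode


# Stand-ins for a real "python -m balance.cli ..." invocation (assumption:
# the real command would be built as a list, as in the cells above).
ok_cmd = [sys.executable, "-c", "print('weights written')"]
failing_cmd = [sys.executable, "-c", "import sys; sys.exit(3)"]

for name, cmd in [("ok", ok_cmd), ("failing", failing_cmd)]:
    ok, code = run_step(cmd)
    if ok:
        print(f"{name}: step succeeded")
    else:
        # A pipeline might alert, retry, or halt downstream steps here.
        print(f"{name}: step failed with exit code {code}")
```

Whether to halt on failure or continue with null weights (via --succeed_on_weighting_failure) is a pipeline design choice; the wrapper only makes that branch explicit.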
Session info¶
For reproducibility, here is the session information:
import session_info
session_info.show(html=False, dependencies=True)
-----
balance 0.16.1
pandas 3.0.1
session_info v1.0.1
-----
81d243bd2c585b0f4821__mypyc NA
PIL 12.1.1
anyio NA
arrow 1.4.0
asttokens NA
attr 25.4.0
attrs 25.4.0
babel 2.18.0
certifi 2026.02.25
charset_normalizer 3.4.5
comm 0.2.3
cycler 0.12.1
cython_runtime NA
dateutil 2.9.0.post0
debugpy 1.8.20
decorator 5.2.1
defusedxml 0.7.1
executing 2.2.1
fastjsonschema NA
fqdn NA
idna 3.11
ipykernel 7.2.0
isoduration NA
jedi 0.19.2
jinja2 3.1.6
joblib 1.5.3
json5 0.13.0
jsonpointer 3.0.0
jsonschema 4.26.0
jsonschema_specifications NA
jupyter_events 0.12.0
jupyter_server 2.17.0
jupyterlab_server 2.28.0
kiwisolver 1.5.0
lark 1.3.1
markupsafe 3.0.3
matplotlib 3.10.8
mpl_toolkits NA
narwhals 2.18.0
nbformat 5.10.4
numpy 2.4.3
packaging 26.0
parso 0.8.6
patsy 1.0.2
platformdirs 4.9.4
plotly 6.6.0
prometheus_client NA
prompt_toolkit 3.0.52
psutil 7.2.2
pure_eval 0.2.3
pydev_ipython NA
pydevconsole NA
pydevd 3.2.3
pydevd_file_utils NA
pydevd_plugins NA
pydevd_tracing NA
pygments 2.19.2
pyparsing 3.3.2
pythonjsonlogger NA
referencing NA
requests 2.32.5
rfc3339_validator 0.1.4
rfc3986_validator 0.1.1
rfc3987_syntax NA
rpds NA
scipy 1.17.1
seaborn 0.13.2
send2trash NA
six 1.17.0
sklearn 1.8.0
sphinxcontrib NA
stack_data 0.6.3
statsmodels 0.14.6
threadpoolctl 3.6.0
tornado 6.5.5
traitlets 5.14.3
typing_extensions NA
uri_template NA
urllib3 2.6.3
wcwidth 0.6.0
webcolors NA
websocket 1.9.0
yaml 6.0.3
zmq 27.1.0
zoneinfo NA
-----
IPython 9.11.0
jupyter_client 8.8.0
jupyter_core 5.9.1
jupyterlab 4.5.6
notebook 7.5.5
-----
Python 3.12.12 (main, Oct 10 2025, 01:01:16) [GCC 13.3.0]
Linux-6.14.0-1017-azure-x86_64-with-glibc2.39
-----
Session information updated at 2026-03-12 17:28