balance Quickstart: Analyzing and adjusting the bias on a simulated toy dataset¶

'balance' is a Python package that is maintained and released by the Core Data Science Tel-Aviv team at Meta. 'balance' performs and evaluates bias reduction by weighting, for a broad set of experimental and observational use cases.

Although balance is written in Python, you don't need a deep understanding of Python to use it. In fact, you can just use this notebook: load your data, change a few variables, re-run the notebook, and produce your own weights!

This quickstart demonstrates re-weighting a specific simulated dataset. If you have a different use case or want more detail, check out the comprehensive balance tutorial.

Analysis¶

There are four main steps to analysis with balance:

  • load data
  • check diagnostics before adjustment
  • perform adjustment + check diagnostics
  • output results

Let's dive right in!

Example dataset¶

The following is a toy simulated dataset.

In [1]:
%matplotlib inline

import plotly.offline as offline
offline.init_notebook_mode()

import warnings
warnings.filterwarnings("ignore")

from balance import load_data
INFO (2026-02-09 22:35:32,742) [__init__/<module> (line 72)]: Using balance version 0.16.1
balance (Version 0.16.1) loaded:
    📖 Documentation: https://import-balance.org/
    🛠️ Help / Issues: https://github.com/facebookresearch/balance/issues/
    📄 Citation:
        Sarig, T., Galili, T., & Eilat, R. (2023).
        balance - a Python package for balancing biased data samples.
        https://arxiv.org/abs/2307.06024

    Tip: You can view this message anytime with balance.help()

In [2]:
target_df, sample_df = load_data()

print("target_df: \n", target_df.head())
print("sample_df: \n", sample_df.head())
target_df: 
        id gender age_group     income  happiness
0  100000   Male       45+  10.183951  61.706333
1  100001   Male       45+   6.036858  79.123670
2  100002   Male     35-44   5.226629  44.206949
3  100003    NaN       45+   5.752147  83.985716
4  100004    NaN     25-34   4.837484  49.339713
sample_df: 
   id  gender age_group     income  happiness
0  0    Male     25-34   6.428659  26.043029
1  1  Female     18-24   9.940280  66.885485
2  2    Male     18-24   2.673623  37.091922
3  3     NaN     18-24  10.550308  49.394050
4  4     NaN     18-24   2.689994  72.304208
In [3]:
target_df.head().round(2).to_dict()
# sample_df.shape
Out[3]:
{'id': {0: '100000', 1: '100001', 2: '100002', 3: '100003', 4: '100004'},
 'gender': {0: 'Male', 1: 'Male', 2: 'Male', 3: nan, 4: nan},
 'age_group': {0: '45+', 1: '45+', 2: '35-44', 3: '45+', 4: '25-34'},
 'income': {0: 10.18, 1: 6.04, 2: 5.23, 3: 5.75, 4: 4.84},
 'happiness': {0: 61.71, 1: 79.12, 2: 44.21, 3: 83.99, 4: 49.34}}

In practice, one can use a pandas loading function (such as read_csv()) to import data into the DataFrame objects sample_df and target_df.
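For example (a minimal sketch; the file paths are placeholders for your own data sources):

import pandas as pd

# Hypothetical file paths -- replace with your own data.
sample_df = pd.read_csv("my_sample.csv")
target_df = pd.read_csv("my_target.csv")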

Load data into a Sample object¶

The first thing to do is to import the Sample class from balance. All of the data we're going to be working with, sample or population, will be stored in objects of the Sample class.

In [4]:
from balance import Sample

Using the Sample class, we can store both a "sample" we want to adjust and a "target" we want to adjust it towards.

We turn the two input pandas DataFrame objects we created (or loaded) into balance.Sample objects by using the .from_frame() method.

In [5]:
sample = Sample.from_frame(sample_df, outcome_columns=["happiness"])
# Often we don't have the outcome for the target. Here we've added it just so we can later validate that the weights indeed help us reduce the bias
target = Sample.from_frame(target_df, outcome_columns=["happiness"])
WARNING (2026-02-09 22:35:32,907) [input_validation/guess_id_column (line 337)]: Guessed id column name id for the data
WARNING (2026-02-09 22:35:32,918) [sample_class/from_frame (line 549)]: No weights passed. Adding a 'weight' column and setting all values to 1
WARNING (2026-02-09 22:35:32,932) [input_validation/guess_id_column (line 337)]: Guessed id column name id for the data
WARNING (2026-02-09 22:35:32,946) [sample_class/from_frame (line 549)]: No weights passed. Adding a 'weight' column and setting all values to 1

If we use the .df property, we can see the DataFrame stored in sample. Note the new weight column that was added (all values set to 1) when the DataFrame was imported into a balance.Sample object.

In [6]:
sample.df.info()
<class 'pandas.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 6 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   id         1000 non-null   str    
 1   gender     912 non-null    str    
 2   age_group  1000 non-null   str    
 3   income     1000 non-null   float64
 4   happiness  1000 non-null   float64
 5   weight     1000 non-null   float64
dtypes: float64(3), str(3)
memory usage: 47.0 KB
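
A quick, optional sanity check that the auto-added weight column is indeed all 1s:

# All weights are 1 until an adjustment is performed.
assert (sample.df["weight"] == 1).all()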

We can get a quick text overview of each Sample object by simply calling it.

Let's take a look at what this produces:

In [7]:
sample
Out[7]:
(balance.sample_class.Sample)

        balance Sample object
        1000 observations x 3 variables: gender,age_group,income
        id_column: id, weight_column: weight,
        outcome_columns: happiness
        
In [8]:
target
Out[8]:
(balance.sample_class.Sample)

        balance Sample object
        10000 observations x 3 variables: gender,age_group,income
        id_column: id, weight_column: weight,
        outcome_columns: happiness
        

Next, we combine the sample object with the target object. This is what will allow us to adjust the sample to the target.

In [9]:
sample_with_target = sample.set_target(target)

Looking at sample_with_target now, we can see it has the target attached:

In [10]:
sample_with_target
Out[10]:
(balance.sample_class.Sample)

        balance Sample object with target set
        1000 observations x 3 variables: gender,age_group,income
        id_column: id, weight_column: weight,
        outcome_columns: happiness
        
            target:
                 
	        balance Sample object
	        10000 observations x 3 variables: gender,age_group,income
	        id_column: id, weight_column: weight,
	        outcome_columns: happiness
	        
            3 common variables: gender,age_group,income
            

Pre-Adjustment Diagnostics¶

We can use .covars() and then follow up with .mean() and .plot() (bar plots and kde density plots) to get some basic diagnostics on the sample vs. target comparison.

We can see how (each point can also be cross-checked on the raw data, as sketched below):

  • The proportion of missing values in gender is similar in the sample and in the target.
  • We have younger people in the sample as compared to the target.
  • We have more males than females in the sample, as compared to a roughly 50-50 split in the (non-NA) target.
  • Income is more right skewed in the target as compared to the sample.
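
A quick pandas cross-check of the gender split and income skewness (an illustration only; the balance diagnostics in the next cells do this more systematically):

# Gender proportions (including missing values) and income skewness in both DataFrames.
print(sample_df["gender"].value_counts(normalize=True, dropna=False).round(3))
print(target_df["gender"].value_counts(normalize=True, dropna=False).round(3))
print("income skew (sample, target):",
      round(sample_df["income"].skew(), 2), round(target_df["income"].skew(), 2))
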
In [11]:
print(sample_with_target.covars().mean().T)
source                     self     target
_is_na_gender[T.True]  0.088000   0.089800
age_group[T.25-34]     0.300000   0.297400
age_group[T.35-44]     0.156000   0.299200
age_group[T.45+]       0.053000   0.206300
gender[Female]         0.268000   0.455100
gender[Male]           0.644000   0.455100
gender[_NA]            0.088000   0.089800
income                 6.297302  12.737608
In [12]:
print(sample_with_target.covars().asmd().T)
source                  self
age_group[T.25-34]  0.005688
age_group[T.35-44]  0.312711
age_group[T.45+]    0.378828
gender[Female]      0.375699
gender[Male]        0.379314
gender[_NA]         0.006296
income              0.494217
mean(asmd)          0.326799
In [13]:
print(sample_with_target.covars().asmd(aggregate_by_main_covar = True).T)
source          self
age_group   0.232409
gender      0.253769
income      0.494217
mean(asmd)  0.326799
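
For reference, the ASMD of a covariate is its absolute standardized mean difference. A rough recomputation for income (under the assumption that balance standardizes by the target's standard deviation; the exact value may differ slightly):

# |mean(sample) - mean(target)| / std(target), roughly matching the income row above.
diff = abs(sample_df["income"].mean() - target_df["income"].mean())
print(round(diff / target_df["income"].std(), 3))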

Distribution diagnostics (KLD/EMD/CVMD/KS)¶

Balance also exposes distribution diagnostics for covariates. These look beyond mean differences and compare the full distributions of covariates in the weighted sample vs. the target.

  • KLD (Kullback-Leibler divergence) measures the relative entropy between two probability distributions (note: this is a divergence measure, not a symmetric distance). (See: https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence)
  • EMD (Earth Mover's Distance) measures the minimum "cost" to transform one distribution into another. (See: https://en.wikipedia.org/wiki/Earth_mover%27s_distance)
  • CVMD (Cramér–von Mises distance) measures the integrated squared difference between the empirical CDFs. (See: https://en.wikipedia.org/wiki/Cram%C3%A9r%E2%80%93von_Mises_criterion)
  • KS (Kolmogorov–Smirnov distance) measures the maximum absolute difference between the empirical CDFs. (See: https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test)

Note: Distribution diagnostics operate on the raw covariates (with NA indicators), rather than the model matrix, so categorical variables stay intact.

These diagnostics complement ASMD, which only compares means. Use these metrics when you want to check whether weighting aligns the shape of covariate distributions (not just their means).
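
For intuition, here is what two of these distances look like when computed directly with scipy on the raw income values (an illustration only; balance's own computation may differ in details such as weighting and the handling of categorical variables):

from scipy import stats

# Two-sample KS distance and 1-D earth mover's distance between sample and target income.
ks_stat = stats.ks_2samp(sample_df["income"], target_df["income"]).statistic
emd = stats.wasserstein_distance(sample_df["income"], target_df["income"])
print(f"KS: {ks_stat:.3f}, EMD: {emd:.3f}")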

In [14]:
print(sample_with_target.covars().kld().T)
print(sample_with_target.covars().emd().T)
print(sample_with_target.covars().cvmd().T)
print(sample_with_target.covars().ks().T)
source         self
gender     0.079889
age_group  0.277138
income     0.114895
mean(kld)  0.157307
source         self
gender     0.188900
age_group  0.743700
income     6.440306
mean(emd)  2.457635
source          self
gender      0.012658
age_group   0.061326
income      0.029834
mean(cvmd)  0.034606
source         self
gender     0.187100
age_group  0.296500
income     0.246400
mean(ks)   0.243333

Visualizing the unadjusted comparison¶

Before we adjust the sample, let's visualize how the sample compares to the target:

In [15]:
sample_with_target.covars().plot()

Adjusting Sample to Population¶

Next, we adjust the sample to the target. The default method is 'ipw', which computes inverse probability (propensity) weights from a logistic regression with LASSO regularization.
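
Before running it, here is the core idea in simplified form (a conceptual sketch only, not balance's implementation, which adds regularization, weight trimming, and model selection on top of this):

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Stack sample and target, label which rows came from the sample,
# fit a propensity model, and weight sample units by (1 - p) / p.
covars = ["gender", "age_group", "income"]
X = pd.get_dummies(pd.concat([sample_df[covars], target_df[covars]]), dummy_na=True)
y = np.r_[np.ones(len(sample_df)), np.zeros(len(target_df))]
p = LogisticRegression(max_iter=1000).fit(X, y).predict_proba(X)[: len(sample_df), 1]
toy_weights = (1 - p) / p  # larger weight for sample units that look more like the target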

In [16]:
# Using ipw to fit survey weights
adjusted = sample_with_target.adjust()
INFO (2026-02-09 22:35:34,109) [ipw/ipw (line 706)]: Starting ipw function
INFO (2026-02-09 22:35:34,111) [adjustment/apply_transformations (line 433)]: Adding the variables: []
INFO (2026-02-09 22:35:34,112) [adjustment/apply_transformations (line 434)]: Transforming the variables: ['gender', 'age_group', 'income']
INFO (2026-02-09 22:35:34,120) [adjustment/apply_transformations (line 469)]: Final variables in output: ['gender', 'age_group', 'income']
INFO (2026-02-09 22:35:34,128) [ipw/ipw (line 741)]: Building model matrix
INFO (2026-02-09 22:35:34,232) [ipw/ipw (line 767)]: The formula used to build the model matrix: ['income + gender + age_group + _is_na_gender']
INFO (2026-02-09 22:35:34,233) [ipw/ipw (line 770)]: The number of columns in the model matrix: 16
INFO (2026-02-09 22:35:34,233) [ipw/ipw (line 771)]: The number of rows in the model matrix: 11000
INFO (2026-02-09 22:35:50,991) [ipw/ipw (line 993)]: Done with sklearn
INFO (2026-02-09 22:35:50,992) [ipw/ipw (line 995)]: max_de: None
INFO (2026-02-09 22:35:50,993) [ipw/ipw (line 1017)]: Starting model selection
INFO (2026-02-09 22:35:50,996) [ipw/ipw (line 1050)]: Chosen lambda: 0.041158338186664825
INFO (2026-02-09 22:35:50,996) [ipw/ipw (line 1068)]: Proportion null deviance explained 0.172637976731584
In [17]:
print(adjusted)
        Adjusted balance Sample object with target set using ipw
        1000 observations x 3 variables: gender,age_group,income
        id_column: id, weight_column: weight,
        outcome_columns: happiness
        
        adjustment details:
            method: ipw
            weight trimming mean ratio: 20
            design effect (Deff): 1.880
            effective sample size proportion (ESSP): 0.532
            effective sample size (ESS): 531.9
                
            target:
                 
	        balance Sample object
	        10000 observations x 3 variables: gender,age_group,income
	        id_column: id, weight_column: weight,
	        outcome_columns: happiness
	        
            3 common variables: gender,age_group,income
            

Evaluation of the Results¶

We can get a basic summary of the results:

In [18]:
print(adjusted.summary())
Adjustment details:
    method: ipw
    weight trimming mean ratio: 20
Covariate diagnostics:
    Covar ASMD reduction: 63.4%
    Covar ASMD (7 variables): 0.327 -> 0.120
    Covar mean KLD reduction: 92.3%
    Covar mean KLD (3 variables): 0.157 -> 0.012
Weight diagnostics:
    design effect (Deff): 1.880
    effective sample size proportion (ESSP): 0.532
    effective sample size (ESS): 531.9
Outcome weighted means:
            happiness
source               
self           53.295
target         56.278
unadjusted     48.559
Model performance: Model proportion deviance explained: 0.173
In [19]:
print(adjusted.covars().mean().T)
source                      self     target  unadjusted
_is_na_gender[T.True]   0.086776   0.089800    0.088000
age_group[T.25-34]      0.307355   0.297400    0.300000
age_group[T.35-44]      0.273609   0.299200    0.156000
age_group[T.45+]        0.137581   0.206300    0.053000
gender[Female]          0.406337   0.455100    0.268000
gender[Male]            0.506887   0.455100    0.644000
gender[_NA]             0.086776   0.089800    0.088000
income                 10.060068  12.737608    6.297302

We see an improvement in the average ASMD. We can look at a detailed list of ASMD values per variable using the following call.

In [20]:
print(adjusted.covars().asmd().T)
source                  self  unadjusted  unadjusted - self
age_group[T.25-34]  0.021777    0.005688          -0.016090
age_group[T.35-44]  0.055884    0.312711           0.256827
age_group[T.45+]    0.169816    0.378828           0.209013
gender[Female]      0.097916    0.375699           0.277783
gender[Male]        0.103989    0.379314           0.275324
gender[_NA]         0.010578    0.006296          -0.004282
income              0.205469    0.494217           0.288748
mean(asmd)          0.119597    0.326799           0.207202
In [21]:
print(adjusted.covars().kld().T)
source         self  unadjusted  unadjusted - self
gender     0.005603    0.079889           0.074286
age_group  0.030191    0.277138           0.246947
income     0.000768    0.114895           0.114127
mean(kld)  0.012187    0.157307           0.145120

It's often easier to see the remaining biases by simply running .covars().plot() on our adjusted object.

In [22]:
adjusted.covars().plot()  # you could change sizes using something like .plot(width = 1500, height = 700)

We can also produce different plots using the seaborn library, for example kernel density plots via the "kde" dist_type.

In [23]:
# This shows how we could use seaborn to plot a kernel density estimation
adjusted.covars().plot(library = "seaborn", dist_type = "kde")

Understanding the weights¶

We can look at the distribution of weights using the following call.

In [24]:
adjusted.weights().plot()

And get many summary statistics, including the design effect, effective sample size (ESS), various quantiles, and more, using:

In [25]:
# adjusted.weights().design_effect()
print(adjusted.weights().summary().round(2))
                                var       val
0                     design_effect      1.88
1       effective_sample_proportion      0.53
2             effective_sample_size    531.92
3                               sum  10000.00
4                    describe_count   1000.00
5                     describe_mean      1.00
6                      describe_std      0.94
7                      describe_min      0.30
8                      describe_25%      0.45
9                      describe_50%      0.65
10                     describe_75%      1.17
11                     describe_max     11.36
12                    prop(w < 0.1)      0.00
13                    prop(w < 0.2)      0.00
14                  prop(w < 0.333)      0.11
15                    prop(w < 0.5)      0.32
16                      prop(w < 1)      0.67
17                     prop(w >= 1)      0.33
18                     prop(w >= 2)      0.10
19                     prop(w >= 3)      0.03
20                     prop(w >= 5)      0.01
21                    prop(w >= 10)      0.00
22               nonparametric_skew      0.37
23  weighted_median_breakdown_point      0.21
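
For reference, the design effect and effective sample size above follow Kish's standard formula (a quick recomputation; balance's reported values should match up to rounding):

import numpy as np

w = adjusted.df["weight"].to_numpy()
deff = len(w) * np.sum(w**2) / np.sum(w) ** 2  # Kish's design effect
ess = len(w) / deff                            # effective sample size
print(round(deff, 3), round(ess, 1))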

Outcome analysis¶

In [26]:
# As we can see, neither confidence interval covers the true (target) value, but the adjustment brings the estimate much closer to it:
# the gap to the true value shrinks from roughly 8 points without adjustment to roughly 3 points after adjustment.
print(adjusted.outcomes().summary())
1 outcomes: ['happiness']
Mean outcomes (with 95% confidence intervals):
source       self  target  unadjusted           self_ci         target_ci     unadjusted_ci
happiness  53.295  56.278      48.559  (52.096, 54.495)  (55.961, 56.595)  (47.669, 49.449)

Weights impact on outcomes (t_test):
           mean_yw0  mean_yw1  mean_diff  diff_ci_lower  diff_ci_upper  t_stat  p_value       n
outcome                                                                                        
happiness    48.559    53.295      4.736          1.312          8.161   2.714    0.007  1000.0

Response rates (relative to number of respondents in sample):
   happiness
n     1000.0
%      100.0
Response rates (relative to notnull rows in the target):
    happiness
n     1000.0
%       10.0
Response rates (in the target):
    happiness
n    10000.0
%      100.0

The paired t-test below evaluates whether the weights materially change the outcome by comparing y*w0 versus y*w1:

In [27]:
adjusted.outcomes().weights_impact_on_outcome_ss(method="t_test")
Out[27]:
           mean_yw0  mean_yw1  mean_diff  diff_ci_lower  diff_ci_upper  t_stat  p_value       n
outcome                                                                                        
happiness    48.559    53.295      4.736          1.312          8.161   2.714    0.007  1000.0
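
For intuition, the comparison can be approximated by hand (a rough sketch under the assumption that both sets of weights are rescaled to mean 1; balance's implementation may differ in details):

import numpy as np
from scipy import stats

y = adjusted.df["happiness"].to_numpy()
w1 = adjusted.df["weight"].to_numpy()
w1 = w1 * len(w1) / w1.sum()   # adjusted weights, rescaled to mean 1
w0 = np.ones_like(w1)          # unadjusted weights
t_stat, p_value = stats.ttest_rel(y * w1, y * w0)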

The estimated mean happiness according to our sample is roughly 48.6 without any adjustment and 53.3 with adjustment (versus 56.3 in the target). The following shows the distribution of happiness:

In [28]:
adjusted.outcomes().plot()

Comparing Adjustment Methods¶

This section demonstrates how to compare different adjustment methods using balance. We'll compare the default logistic regression method with a HistGradientBoostingClassifier (which natively supports categorical features) to see how they affect covariate balance.

Both methods aim to reduce bias by creating weights, but they may perform differently depending on your data and use case.

In [29]:
import sklearn
from sklearn.ensemble import HistGradientBoostingClassifier

_sklearn_version = tuple(int(x) for x in sklearn.__version__.split(".")[:2])

Adjust with Default Method (Logistic Regression)¶

First, let's adjust using the default IPW method with logistic regression:

In [30]:
# Adjust using default method (IPW with logistic regression)
adjusted_default = sample_with_target.adjust()
print(adjusted_default.summary())
INFO (2026-02-09 22:35:53,867) [ipw/ipw (line 706)]: Starting ipw function
INFO (2026-02-09 22:35:53,870) [adjustment/apply_transformations (line 433)]: Adding the variables: []
INFO (2026-02-09 22:35:53,870) [adjustment/apply_transformations (line 434)]: Transforming the variables: ['gender', 'age_group', 'income']
INFO (2026-02-09 22:35:53,878) [adjustment/apply_transformations (line 469)]: Final variables in output: ['gender', 'age_group', 'income']
INFO (2026-02-09 22:35:53,886) [ipw/ipw (line 741)]: Building model matrix
INFO (2026-02-09 22:35:53,989) [ipw/ipw (line 767)]: The formula used to build the model matrix: ['income + gender + age_group + _is_na_gender']
INFO (2026-02-09 22:35:53,990) [ipw/ipw (line 770)]: The number of columns in the model matrix: 16
INFO (2026-02-09 22:35:53,991) [ipw/ipw (line 771)]: The number of rows in the model matrix: 11000
INFO (2026-02-09 22:36:10,812) [ipw/ipw (line 993)]: Done with sklearn
INFO (2026-02-09 22:36:10,813) [ipw/ipw (line 995)]: max_de: None
INFO (2026-02-09 22:36:10,813) [ipw/ipw (line 1017)]: Starting model selection
INFO (2026-02-09 22:36:10,816) [ipw/ipw (line 1050)]: Chosen lambda: 0.041158338186664825
INFO (2026-02-09 22:36:10,816) [ipw/ipw (line 1068)]: Proportion null deviance explained 0.172637976731584
Adjustment details:
    method: ipw
    weight trimming mean ratio: 20
Covariate diagnostics:
    Covar ASMD reduction: 63.4%
    Covar ASMD (7 variables): 0.327 -> 0.120
    Covar mean KLD reduction: 92.3%
    Covar mean KLD (3 variables): 0.157 -> 0.012
Weight diagnostics:
    design effect (Deff): 1.880
    effective sample size proportion (ESSP): 0.532
    effective sample size (ESS): 531.9
Outcome weighted means:
            happiness
source               
self           53.295
target         56.278
unadjusted     48.559
Model performance: Model proportion deviance explained: 0.173

Adjust with HistGradientBoostingClassifier¶

Now let's adjust using a HistGradientBoostingClassifier, which can capture non-linear relationships. We set use_model_matrix=False so the model is fit on raw covariates. String, object, and boolean columns are converted to pandas Categorical dtype, which sklearn 1.4+ can handle natively when categorical_features="from_dtype" is set.
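
Roughly, the dtype conversion looks like this (an illustration only; balance performs the conversion internally):

# Convert non-numeric covariate columns to pandas Categorical dtype.
covars = sample_df[["gender", "age_group", "income"]].copy()
non_numeric = covars.select_dtypes(exclude="number").columns
covars[non_numeric] = covars[non_numeric].astype("category")
print(covars.dtypes)  # gender and age_group become 'category'; income stays float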

Note on categorical handling:

  • balance converts non-numeric columns to Categorical dtype and passes a DataFrame to the estimator. Models that support categorical_features="from_dtype" (e.g., HistGradientBoostingClassifier) will treat them as unordered categoricals.
  • Models that do not support native categorical features (e.g., RandomForestClassifier) will raise an error when non-numeric columns are present. For those models, use use_model_matrix=True (the default) instead.
  • Requires scikit-learn >= 1.4 when categorical columns are present.
In [31]:
if _sklearn_version >= (1, 4):
    # Adjust using HistGradientBoostingClassifier with native categorical support
    hgb = HistGradientBoostingClassifier(random_state=0, categorical_features="from_dtype")
    adjusted_hgb = sample_with_target.adjust(model=hgb, use_model_matrix=False)
    print(adjusted_hgb.summary())
else:
    print(f"Skipping HistGradientBoosting example: requires scikit-learn >= 1.4, found {sklearn.__version__}")
INFO (2026-02-09 22:36:11,636) [ipw/ipw (line 706)]: Starting ipw function
INFO (2026-02-09 22:36:11,639) [adjustment/apply_transformations (line 433)]: Adding the variables: []
INFO (2026-02-09 22:36:11,639) [adjustment/apply_transformations (line 434)]: Transforming the variables: ['gender', 'age_group', 'income']
INFO (2026-02-09 22:36:11,646) [adjustment/apply_transformations (line 469)]: Final variables in output: ['gender', 'age_group', 'income']
INFO (2026-02-09 22:36:11,666) [ipw/ipw (line 821)]: Fitting model on raw covariates without model matrix encoding. Categorical columns are preserved as pandas Categorical dtype.
INFO (2026-02-09 22:36:11,667) [ipw/ipw (line 825)]: The number of columns: 4
INFO (2026-02-09 22:36:11,667) [ipw/ipw (line 826)]: The number of rows: 11000
INFO (2026-02-09 22:36:11,832) [ipw/ipw (line 993)]: Done with sklearn
INFO (2026-02-09 22:36:11,833) [ipw/ipw (line 995)]: max_de: None
INFO (2026-02-09 22:36:11,835) [ipw/ipw (line 1050)]: Chosen lambda: nan
INFO (2026-02-09 22:36:11,836) [ipw/ipw (line 1068)]: Proportion null deviance explained 0.2035518229273121
Adjustment details:
    method: ipw
    weight trimming mean ratio: 20
Covariate diagnostics:
    Covar ASMD reduction: 67.7%
    Covar ASMD (7 variables): 0.327 -> 0.106
    Covar mean KLD reduction: 95.7%
    Covar mean KLD (3 variables): 0.157 -> 0.007
Weight diagnostics:
    design effect (Deff): 2.539
    effective sample size proportion (ESSP): 0.394
    effective sample size (ESS): 393.8
Outcome weighted means:
            happiness
source               
self           54.326
target         56.278
unadjusted     48.559
Model performance: Model proportion deviance explained: 0.204

Adjust with HistGradientBoostingClassifier + Model Matrix¶

For comparison, we can also use HistGradientBoostingClassifier with use_model_matrix=True (the default). In this mode, balance one-hot encodes categorical columns using a model matrix before passing them to the estimator. This works with all scikit-learn versions and all estimators (no sklearn >= 1.4 requirement), but the estimator sees only numeric columns and cannot leverage native categorical support.

Comparing the two approaches lets you see whether the native categorical handling (use_model_matrix=False, sklearn >= 1.4 only) provides better covariate balance than the model-matrix route (use_model_matrix=True, any sklearn version).

In [32]:
# Adjust using HistGradientBoostingClassifier with model matrix (one-hot encoding)
hgb_mm = HistGradientBoostingClassifier(random_state=0)
adjusted_hgb_mm = sample_with_target.adjust(model=hgb_mm, use_model_matrix=True)
print(adjusted_hgb_mm.summary())
INFO (2026-02-09 22:36:12,619) [ipw/ipw (line 706)]: Starting ipw function
INFO (2026-02-09 22:36:12,621) [adjustment/apply_transformations (line 433)]: Adding the variables: []
INFO (2026-02-09 22:36:12,621) [adjustment/apply_transformations (line 434)]: Transforming the variables: ['gender', 'age_group', 'income']
INFO (2026-02-09 22:36:12,629) [adjustment/apply_transformations (line 469)]: Final variables in output: ['gender', 'age_group', 'income']
INFO (2026-02-09 22:36:12,637) [ipw/ipw (line 741)]: Building model matrix
INFO (2026-02-09 22:36:12,740) [ipw/ipw (line 767)]: The formula used to build the model matrix: ['income + gender + age_group + _is_na_gender']
INFO (2026-02-09 22:36:12,740) [ipw/ipw (line 770)]: The number of columns in the model matrix: 16
INFO (2026-02-09 22:36:12,741) [ipw/ipw (line 771)]: The number of rows in the model matrix: 11000
INFO (2026-02-09 22:36:12,976) [ipw/ipw (line 993)]: Done with sklearn
INFO (2026-02-09 22:36:12,976) [ipw/ipw (line 995)]: max_de: None
INFO (2026-02-09 22:36:12,979) [ipw/ipw (line 1050)]: Chosen lambda: nan
INFO (2026-02-09 22:36:12,979) [ipw/ipw (line 1068)]: Proportion null deviance explained 0.20594370965626285
Adjustment details:
    method: ipw
    weight trimming mean ratio: 20
Covariate diagnostics:
    Covar ASMD reduction: 68.8%
    Covar ASMD (7 variables): 0.327 -> 0.102
    Covar mean KLD reduction: 95.7%
    Covar mean KLD (3 variables): 0.157 -> 0.007
Weight diagnostics:
    design effect (Deff): 2.699
    effective sample size proportion (ESSP): 0.370
    effective sample size (ESS): 370.5
Outcome weighted means:
            happiness
source               
self           54.460
target         56.278
unadjusted     48.559
Model performance: Model proportion deviance explained: 0.206

Interpreting the Results¶

Both methods produce adjusted weights that reduce bias. You can compare:

  • mean(asmd): Lower values indicate better overall covariate balance
  • Individual covariates: Check which method better balances specific variables
  • Design effect: reported in the summary output; a larger design effect implies a smaller effective sample size

The choice between methods depends on your data characteristics and whether non-linear relationships are important.
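
To compare at a glance, one option is to collect the key diagnostics into a small table (a sketch that assumes all three adjusted objects above exist, i.e., scikit-learn >= 1.4, and that the asmd() output has the structure printed earlier):

import pandas as pd

methods = {
    "ipw (logistic)": adjusted_default,
    "hgb (native categorical)": adjusted_hgb,
    "hgb (model matrix)": adjusted_hgb_mm,
}
comparison = pd.DataFrame({
    name: {
        "mean_asmd": obj.covars().asmd().T.loc["mean(asmd)", "self"],
        "design_effect": obj.weights().design_effect(),
    }
    for name, obj in methods.items()
})
print(comparison.round(3))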

Downloading data¶

Finally, we can prepare the data to be downloaded for future analyses.

In [33]:
adjusted.to_download()
Out[33]:
Click here to download: /tmp/tmp_balance_out_3f8bd4d7-ce48-47b1-a8f5-9e266e6c1842.csv
In [34]:
# We can prepare the data to be exported as csv - showing the first 500 characters for simplicity:
adjusted.to_csv()[0:500]
Out[34]:
'id,gender,age_group,income,happiness,weight\n0,Male,25-34,6.428659499046228,26.043028759747298,6.531727983159833\n1,Female,18-24,9.940280228116047,66.88548460632677,9.617159404460358\n2,Male,18-24,2.6736231547518043,37.091921916683006,3.562973405562796\n3,,18-24,10.550307519418066,49.39405003271002,6.9521166766082425\n4,,18-24,2.689993854299385,72.30420755038209,5.129230211466446\n5,,35-44,5.995497722733131,57.28281646341816,16.42476175494607\n6,,18-24,12.63469573898972,31.663293445944596,8.19113332598'
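
If you prefer saving locally rather than using the download link, one option (a minimal sketch; the file name is just an example) is to write the returned CSV string to a file:

with open("balance_adjusted_sample.csv", "w") as f:
    f.write(adjusted.to_csv())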
In [35]:
# Sessions info
import session_info
session_info.show(html=False, dependencies=True)
-----
balance             0.16.1
pandas              3.0.0
plotly              6.5.2
session_info        v1.0.1
sklearn             1.8.0
-----
PIL                         12.1.0
anyio                       NA
arrow                       1.4.0
asttokens                   NA
attr                        25.4.0
attrs                       25.4.0
babel                       2.18.0
certifi                     2026.01.04
charset_normalizer          3.4.4
comm                        0.2.3
cycler                      0.12.1
cython_runtime              NA
dateutil                    2.9.0.post0
debugpy                     1.8.20
decorator                   5.2.1
defusedxml                  0.7.1
executing                   2.2.1
fastjsonschema              NA
fqdn                        NA
idna                        3.11
ipykernel                   7.2.0
isoduration                 NA
jedi                        0.19.2
jinja2                      3.1.6
joblib                      1.5.3
json5                       0.13.0
jsonpointer                 3.0.0
jsonschema                  4.26.0
jsonschema_specifications   NA
jupyter_events              0.12.0
jupyter_server              2.17.0
jupyterlab_server           2.28.0
kiwisolver                  1.4.9
lark                        1.3.1
markupsafe                  3.0.3
matplotlib                  3.10.8
matplotlib_inline           0.2.1
mpl_toolkits                NA
narwhals                    2.16.0
nbformat                    5.10.4
numpy                       2.4.2
packaging                   26.0
parso                       0.8.6
patsy                       1.0.2
platformdirs                4.5.1
prometheus_client           NA
prompt_toolkit              3.0.52
psutil                      7.2.2
pure_eval                   0.2.3
pydev_ipython               NA
pydevconsole                NA
pydevd                      3.2.3
pydevd_file_utils           NA
pydevd_plugins              NA
pydevd_tracing              NA
pygments                    2.19.2
pyparsing                   3.3.2
pythonjsonlogger            NA
referencing                 NA
requests                    2.32.5
rfc3339_validator           0.1.4
rfc3986_validator           0.1.1
rfc3987_syntax              NA
rpds                        NA
scipy                       1.17.0
seaborn                     0.13.2
send2trash                  NA
six                         1.17.0
sphinxcontrib               NA
stack_data                  0.6.3
statsmodels                 0.14.6
threadpoolctl               3.6.0
tornado                     6.5.4
traitlets                   5.14.3
typing_extensions           NA
uri_template                NA
urllib3                     2.6.3
wcwidth                     0.6.0
webcolors                   NA
websocket                   1.9.0
yaml                        6.0.3
zmq                         27.1.0
zoneinfo                    NA
-----
IPython             9.10.0
jupyter_client      8.8.0
jupyter_core        5.9.1
jupyterlab          4.5.3
notebook            7.5.3
-----
Python 3.12.12 (main, Oct 10 2025, 01:01:16) [GCC 13.3.0]
Linux-6.11.0-1018-azure-x86_64-with-glibc2.39
-----
Session information updated at 2026-02-09 22:36