balance Quickstart: Analyzing and adjusting the bias on a simulated toy dataset¶

'balance' is a Python package maintained and released by the Core Data Science Tel-Aviv team at Meta. 'balance' performs and evaluates bias reduction by weighting for a broad set of experimental and observational use cases.

Although balance is written in Python, you don't need a deep understanding of Python to use it. In fact, you can take this notebook, load your own data, change a few variables, re-run it, and produce your own weights!

This quickstart demonstrates re-weighting a specific simulated dataset. If you have a different use case or want more detailed documentation, check out the comprehensive balance tutorial.

Analysis¶

There are four main steps to analysis with balance:

  • load data
  • check diagnostics before adjustment
  • perform adjustment + check diagnostics
  • output results

Let's dive right in!

Example dataset¶

The following is a toy simulated dataset.

In [1]:
%matplotlib inline

import plotly.offline as offline
offline.init_notebook_mode()

import warnings
warnings.filterwarnings("ignore")

from balance import load_data
INFO (2025-12-16 18:47:55,935) [__init__/<module> (line 72)]: Using balance version 0.14.0
balance (Version 0.14.0) loaded:
    📖 Documentation: https://import-balance.org/
    🛠️ Help / Issues: https://github.com/facebookresearch/balance/issues/
    📄 Citation:
        Sarig, T., Galili, T., & Eilat, R. (2023).
        balance - a Python package for balancing biased data samples.
        https://arxiv.org/abs/2307.06024

    Tip: You can view this message anytime with balance.help()

In [2]:
target_df, sample_df = load_data()

print("target_df: \n", target_df.head())
print("sample_df: \n", sample_df.head())
target_df: 
        id gender age_group     income  happiness
0  100000   Male       45+  10.183951  61.706333
1  100001   Male       45+   6.036858  79.123670
2  100002   Male     35-44   5.226629  44.206949
3  100003    NaN       45+   5.752147  83.985716
4  100004    NaN     25-34   4.837484  49.339713
sample_df: 
   id  gender age_group     income  happiness
0  0    Male     25-34   6.428659  26.043029
1  1  Female     18-24   9.940280  66.885485
2  2    Male     18-24   2.673623  37.091922
3  3     NaN     18-24  10.550308  49.394050
4  4     NaN     18-24   2.689994  72.304208
In [3]:
target_df.head().round(2).to_dict()
# sample_df.shape
Out[3]:
{'id': {0: '100000', 1: '100001', 2: '100002', 3: '100003', 4: '100004'},
 'gender': {0: 'Male', 1: 'Male', 2: 'Male', 3: nan, 4: nan},
 'age_group': {0: '45+', 1: '45+', 2: '35-44', 3: '45+', 4: '25-34'},
 'income': {0: 10.18, 1: 6.04, 2: 5.23, 3: 5.75, 4: 4.84},
 'happiness': {0: 61.71, 1: 79.12, 2: 44.21, 3: 83.99, 4: 49.34}}

In practice, one can use a pandas loading function (such as read_csv()) to import data into the DataFrame objects sample_df and target_df.
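
As a minimal sketch of that loading step, the snippet below reads a small CSV with pandas. The CSV text is hypothetical and stands in for a file on disk; only the column names are borrowed from the toy dataset above. The `dtype` argument keeps `id` as a string, as in the toy data.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file; the columns mirror
# the toy dataset shown above.
csv_text = """id,gender,age_group,income,happiness
0,Male,25-34,6.43,26.04
1,Female,18-24,9.94,66.89
2,,18-24,2.67,37.09
"""

# read_csv accepts a path or any file-like object.
# dtype={"id": str} keeps the id column as strings rather than integers.
sample_df = pd.read_csv(io.StringIO(csv_text), dtype={"id": str})

print(sample_df.head())
# Empty fields (such as the missing gender in the last row) are read as NaN.
print(sample_df["gender"].isna().sum())
```

With a real file, the `io.StringIO(...)` argument would simply be replaced by the file path.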

Load data into a Sample object¶

The first thing to do is to import the Sample class from balance. All of the data we're going to be working with, sample or population, will be stored in objects of the Sample class.

In [4]:
from balance import Sample

We use the Sample class to hold both the "sample" we want to adjust and the "target" we want to adjust towards.

We turn the two input pandas DataFrame objects we created (or loaded) into balance.Sample objects by using the .from_frame() method.

In [5]:
sample = Sample.from_frame(sample_df, outcome_columns=["happiness"])
# Often we don't have the outcome for the target. Here we've added it only so we can later validate that the weights do help reduce the bias
target = Sample.from_frame(target_df, outcome_columns=["happiness"])
WARNING (2025-12-16 18:47:56,224) [util/guess_id_column (line 346)]: Guessed id column name id for the data
WARNING (2025-12-16 18:47:56,233) [sample_class/from_frame (line 504)]: No weights passed. Adding a 'weight' column and setting all values to 1
WARNING (2025-12-16 18:47:56,240) [util/guess_id_column (line 346)]: Guessed id column name id for the data
WARNING (2025-12-16 18:47:56,254) [sample_class/from_frame (line 504)]: No weights passed. Adding a 'weight' column and setting all values to 1

If we use the .df property, we can see the DataFrame stored in sample. Note the new weight column (all values are 1), which was added when the DataFrame was imported into a balance.Sample object.

In [6]:
sample.df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 6 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   id         1000 non-null   object 
 1   gender     912 non-null    object 
 2   age_group  1000 non-null   object 
 3   income     1000 non-null   float64
 4   happiness  1000 non-null   float64
 5   weight     1000 non-null   float64
dtypes: float64(3), object(3)
memory usage: 47.0+ KB

We can get a quick text overview of each Sample object by just calling it.

Let's take a look at what this produces:

In [7]:
sample
Out[7]:
(balance.sample_class.Sample)

        balance Sample object
        1000 observations x 3 variables: gender,age_group,income
        id_column: id, weight_column: weight,
        outcome_columns: happiness
        
In [8]:
target
Out[8]:
(balance.sample_class.Sample)

        balance Sample object
        10000 observations x 3 variables: gender,age_group,income
        id_column: id, weight_column: weight,
        outcome_columns: happiness
        

Next, we combine the sample object with the target object. This is what will allow us to adjust the sample to the target.

In [9]:
sample_with_target = sample.set_target(target)

Looking at sample_with_target now, we can see it has the target attached:

In [10]:
sample_with_target
Out[10]:
(balance.sample_class.Sample)

        balance Sample object with target set
        1000 observations x 3 variables: gender,age_group,income
        id_column: id, weight_column: weight,
        outcome_columns: happiness
        
            target:
                 
	        balance Sample object
	        10000 observations x 3 variables: gender,age_group,income
	        id_column: id, weight_column: weight,
	        outcome_columns: happiness
	        
            3 common variables: gender,age_group,income
            

Pre-Adjustment Diagnostics¶

We can use .covars() and then follow up with .mean(), .asmd() and .plot() (bar plots and kde density plots) to get some basic diagnostics on the data.

We can see how:

  • The proportion of missing values in gender is similar in sample and target.
  • We have younger people in the sample as compared to the target.
  • We have more males than females in the sample, as compared to around 50-50 split for the (non NA) target.
  • Income is more right skewed in the target as compared to the sample.
In [11]:
print(sample_with_target.covars().mean().T)
source                     self     target
_is_na_gender[T.True]  0.088000   0.089800
age_group[T.25-34]     0.300000   0.297400
age_group[T.35-44]     0.156000   0.299200
age_group[T.45+]       0.053000   0.206300
gender[Female]         0.268000   0.455100
gender[Male]           0.644000   0.455100
gender[_NA]            0.088000   0.089800
income                 6.297302  12.737608
In [12]:
print(sample_with_target.covars().asmd().T)
source                  self
age_group[T.25-34]  0.005688
age_group[T.35-44]  0.312711
age_group[T.45+]    0.378828
gender[Female]      0.375699
gender[Male]        0.379314
gender[_NA]         0.006296
income              0.494217
mean(asmd)          0.326799
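
To make the ASMD (absolute standardized mean difference) values concrete, here is a small worked example for the gender[Female] row. It is a sketch under one assumption: that the absolute difference in means is scaled by the target's standard deviation. The two proportions are copied from the .mean() output above, and the result lands close to the 0.3757 reported in the table.

```python
import math

# Proportions of gender[Female], copied from the .mean() output above.
p_sample = 0.268
p_target = 0.4551

# A 0/1 indicator with mean p has standard deviation sqrt(p * (1 - p)).
# Assumption: the difference is standardized by the target's std.
std_target = math.sqrt(p_target * (1 - p_target))

asmd_female = abs(p_sample - p_target) / std_target
print(round(asmd_female, 4))
```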

In [13]:
print(sample_with_target.covars().asmd(aggregate_by_main_covar = True).T)
source          self
age_group   0.232409
gender      0.253769
income      0.494217
mean(asmd)  0.326799
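
The aggregated values can be reproduced from the per-level table by simple averaging. The sketch below copies the per-level ASMD values from the earlier output, averages them within each covariate, and then averages across covariates. That this is how the aggregation works is inferred from the numbers, not taken from the package source.

```python
from statistics import mean

# Per-level ASMD values copied from the .asmd() output above.
per_level = {
    "age_group": [0.005688, 0.312711, 0.378828],
    "gender": [0.375699, 0.379314, 0.006296],
    "income": [0.494217],
}

# Average the levels within each main covariate...
per_covar = {name: mean(values) for name, values in per_level.items()}

# ...then average across covariates, so each covariate counts once
# regardless of how many levels it has.
overall = mean(list(per_covar.values()))

for name, value in per_covar.items():
    print(f"{name}: {value:.6f}")
print(f"mean(asmd): {overall:.6f}")
```

Averaging per covariate first keeps a categorical variable with many levels from dominating the overall mean.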
In [14]:
sample_with_target.covars().plot()

Adjusting Sample to Population¶

Next, we adjust the sample to the target. The default method to be used is 'ipw' (which uses inverse probability/propensity weights, after running logistic regression with lasso regularization).

In [15]:
# Using ipw to fit survey weights
adjusted = sample_with_target.adjust()
INFO (2025-12-16 18:47:56,933) [ipw/ipw (line 622)]: Starting ipw function
INFO (2025-12-16 18:47:56,936) [adjustment/apply_transformations (line 469)]: Adding the variables: []
INFO (2025-12-16 18:47:56,937) [adjustment/apply_transformations (line 470)]: Transforming the variables: ['gender', 'age_group', 'income']
INFO (2025-12-16 18:47:56,953) [adjustment/apply_transformations (line 507)]: Final variables in output: ['gender', 'age_group', 'income']
INFO (2025-12-16 18:47:56,962) [ipw/ipw (line 656)]: Building model matrix
INFO (2025-12-16 18:47:57,059) [ipw/ipw (line 678)]: The formula used to build the model matrix: ['income + gender + age_group + _is_na_gender']
INFO (2025-12-16 18:47:57,059) [ipw/ipw (line 681)]: The number of columns in the model matrix: 16
INFO (2025-12-16 18:47:57,060) [ipw/ipw (line 682)]: The number of rows in the model matrix: 11000
INFO (2025-12-16 18:48:13,869) [ipw/ipw (line 843)]: Done with sklearn
INFO (2025-12-16 18:48:13,869) [ipw/ipw (line 845)]: max_de: None
INFO (2025-12-16 18:48:13,870) [ipw/ipw (line 867)]: Starting model selection
INFO (2025-12-16 18:48:13,873) [ipw/ipw (line 900)]: Chosen lambda: 0.041158338186664825
INFO (2025-12-16 18:48:13,874) [ipw/ipw (line 918)]: Proportion null deviance explained 0.17265121909892267
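
The idea behind inverse propensity weighting can be illustrated without any model fitting. The sketch below is not the package's implementation: balance fits a regularized logistic regression on all covariates, whereas this toy version uses a single hypothetical categorical covariate, so the propensity of being in the sample can be obtained by counting. All data and names here are made up for illustration.

```python
from collections import Counter

# Hypothetical toy data: one categorical covariate per unit.
sample_ages = ["18-24"] * 6 + ["25-34"] * 3 + ["35+"] * 1
target_ages = ["18-24"] * 20 + ["25-34"] * 40 + ["35+"] * 40

n_s, n_t = Counter(sample_ages), Counter(target_ages)

# Estimated probability that a unit in a given cell belongs to the sample
# rather than the target.
propensity = {c: n_s[c] / (n_s[c] + n_t[c]) for c in n_s}

# Inverse propensity (odds) weight: (1 - p) / p, which equals n_t / n_s here.
weights = [(1 - propensity[a]) / propensity[a] for a in sample_ages]

# Weighted share of each age group in the sample after weighting.
total = sum(weights)
weighted_share = {
    c: sum(w for a, w in zip(sample_ages, weights) if a == c) / total
    for c in n_s
}
print(weighted_share)
```

Units from cells that are under-represented in the sample receive larger weights, so the weighted sample shares line up with the target shares.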
In [16]:
print(adjusted)
        Adjusted balance Sample object with target set using ipw
        1000 observations x 3 variables: gender,age_group,income
        id_column: id, weight_column: weight,
        outcome_columns: happiness
        
        adjustment details:
            method: ipw
            weight trimming mean ratio: 20
            design effect (Deff): 1.880, eff. sample size proportion: 0.532, eff. sample size: 531.8
                
            target:
                 
	        balance Sample object
	        10000 observations x 3 variables: gender,age_group,income
	        id_column: id, weight_column: weight,
	        outcome_columns: happiness
	        
            3 common variables: gender,age_group,income
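
The design effect and effective sample size in the printout are linked by simple arithmetic. The sketch below uses Kish's formula, Deff = n * sum(w^2) / (sum(w))^2, on a hypothetical weight vector; the helper name `kish_design_effect` is illustrative and not part of the balance API.

```python
def kish_design_effect(weights):
    """Kish's design effect: n * sum(w^2) / (sum(w))^2."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


# Hypothetical weights; equal weights would give a design effect of exactly 1.
weights = [1.0, 1.0, 2.0, 4.0]

deff = kish_design_effect(weights)
ess = len(weights) / deff        # effective sample size
essp = 1 / deff                  # effective sample size proportion

print(deff, ess, essp)
```

Applied to the printout above, 1 / 1.880 is about 0.532 and 1000 / 1.880 is about 532, which agrees with the reported values up to the rounding of Deff.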
            

Evaluation of the Results¶

We can get a basic summary of the results:

In [17]:
print(adjusted.summary())
Adjustment details:
    method: ipw
    weight trimming mean ratio: 20
Covariate diagnostics:
    Covar ASMD reduction: 63.4%
    Covar ASMD (7 variables): 0.327 -> 0.119
    Covar mean KLD reduction: 95.3%
    Covar mean KLD (3 variables): 0.071 -> 0.003
Weight diagnostics:
    design effect (Deff): 1.880
    effective sample size proportion (ESSP): 0.532
    effective sample size (ESS): 531.8
Outcome weighted means:
            happiness
source               
self           53.297
target         56.278
unadjusted     48.559
Model performance: Model proportion deviance explained: 0.173
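
The reduction percentages in the summary follow from the before and after means. As a quick check, the sketch below recomputes them from the unrounded mean(asmd) and mean(kld) values that .asmd() and .kld() report for the unadjusted and adjusted sample; `relative_reduction` is a helper defined here for illustration.

```python
def relative_reduction(before, after):
    """Share of the original imbalance removed by the adjustment."""
    return (before - after) / before


# mean(asmd) and mean(kld): unadjusted value first, adjusted value second.
asmd_before, asmd_after = 0.326799, 0.119494
kld_before, kld_after = 0.071273, 0.003360

asmd_reduction = relative_reduction(asmd_before, asmd_after)
kld_reduction = relative_reduction(kld_before, kld_after)

print(f"Covar ASMD reduction: {asmd_reduction:.1%}")
print(f"Covar mean KLD reduction: {kld_reduction:.1%}")
```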
In [18]:
print(adjusted.covars().mean().T)
source                      self     target  unadjusted
_is_na_gender[T.True]   0.086866   0.089800    0.088000
age_group[T.25-34]      0.307309   0.297400    0.300000
age_group[T.35-44]      0.273676   0.299200    0.156000
age_group[T.45+]        0.137604   0.206300    0.053000
gender[Female]          0.406342   0.455100    0.268000
gender[Male]            0.506792   0.455100    0.644000
gender[_NA]             0.086866   0.089800    0.088000
income                 10.060502  12.737608    6.297302

We see an improvement in the average ASMD. We can look at a detailed list of ASMD values per variable using the following call.

In [19]:
print(adjusted.covars().asmd().T)
source                  self  unadjusted  unadjusted - self
age_group[T.25-34]  0.021676    0.005688          -0.015988
age_group[T.35-44]  0.055738    0.312711           0.256973
age_group[T.45+]    0.169759    0.378828           0.209069
gender[Female]      0.097907    0.375699           0.277792
gender[Male]        0.103798    0.379314           0.275516
gender[_NA]         0.010260    0.006296          -0.003965
income              0.205436    0.494217           0.288781
mean(asmd)          0.119494    0.326799           0.207304
In [20]:
print(adjusted.covars().kld().T)
source                  self  unadjusted  unadjusted - self
age_group[T.25-34]  0.000233    0.000016          -0.000217
age_group[T.35-44]  0.001580    0.055329           0.053749
age_group[T.45+]    0.015864    0.095205           0.079341
gender[Female]      0.004830    0.074156           0.069327
gender[Male]        0.005364    0.072046           0.066682
gender[_NA]         0.000053    0.000020          -0.000033
income              0.000773    0.114895           0.114122
mean(kld)           0.003360    0.071273           0.067913

We can also use KL divergence to summarize how far the sample covariates are from the target distribution across both numeric and categorical variables. The helper below aggregates over one-hot encoded categories and compares the adjusted sample to the original unadjusted sample.

In [21]:
print(adjusted.covars().kld(aggregate_by_main_covar=True).T)
source         self  unadjusted  unadjusted - self
age_group  0.005893    0.050183           0.044291
gender     0.003416    0.048741           0.045325
income     0.000773    0.114895           0.114122
mean(kld)  0.003360    0.071273           0.067913
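
For the categorical levels, the reported KLD values are consistent with a KL divergence between two Bernoulli distributions (sample versus target) for each one-hot column, averaged over the levels of a covariate. The sketch below checks this for gender using the unadjusted and target proportions from the .mean() output. This is an inference from the numbers, not a statement about the package internals, and it does not cover numeric covariates such as income.

```python
import math
from statistics import mean


def binary_kld(p, q):
    """KL divergence of Bernoulli(p) from Bernoulli(q), in nats."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


# (unadjusted sample proportion, target proportion) for each gender level.
gender_levels = {
    "Female": (0.268, 0.4551),
    "Male": (0.644, 0.4551),
    "_NA": (0.088, 0.0898),
}

per_level_kld = {k: binary_kld(p, q) for k, (p, q) in gender_levels.items()}

# Aggregate by averaging over the levels of the covariate.
gender_kld = mean(list(per_level_kld.values()))

for level, value in per_level_kld.items():
    print(f"gender[{level}]: {value:.6f}")
print(f"gender (aggregated): {gender_kld:.6f}")
```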

It's easier to learn about the biases by just running .covars().plot() on our adjusted object.

In [22]:
adjusted.covars().plot()  # you could change sizes using something like .plot(width = 1500, height = 700)

We can also use different plots, using the seaborn library, for example with the "kde" dist_type.

In [23]:
# This shows how we could use seaborn to plot a kernel density estimation
adjusted.covars().plot(library = "seaborn", dist_type = "kde")