balance.sample_class

class balance.sample_class.Sample(responders: SampleFrame | None = None, target: SampleFrame | None = None)

A class used to represent a sample.

Sample is the main object of balance. It contains a DataFrame of per-unit observations, each associated with an id and a weight.

Sample inherits from both BalanceFrame and SampleFrame. Without a target it behaves like a SampleFrame; with a target (after set_target()) it behaves like a BalanceFrame.

MRO: Sample → BalanceFrame → SampleFrame → object
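The method resolution order above can be illustrated with stand-in classes. This is a minimal sketch, not the balance source: the three classes below are hypothetical placeholders that only share names with the real ones, and the base-class arrangement is an assumption (it is one layout that yields the documented MRO).

```python
# Hypothetical stand-ins; these are NOT the real balance classes.
class SampleFrame:
    def describe(self):
        return "SampleFrame behaviour"


class BalanceFrame:
    def describe(self):
        return "BalanceFrame behaviour"


# Assumed arrangement: listing BalanceFrame first puts it ahead of
# SampleFrame in the method resolution order.
class Sample(BalanceFrame, SampleFrame):
    pass


mro_names = [cls.__name__ for cls in Sample.__mro__]
print(" → ".join(mro_names))

# Attribute lookup follows the MRO, so the BalanceFrame method is found first.
resolved = Sample().describe()
print(resolved)
```

Because lookup walks the MRO left to right, a method defined on both parents resolves to the BalanceFrame version unless Sample overrides it.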

id_column

A column holding the ids of the units in the sample.

Type:

pd.Series

weight_series

A column holding the weights of the units in the sample.

Type:

pd.Series

classmethod from_frame(df: Any, id_column: str | None = None, covar_columns: list[str] | None = None, weight_column: str | None = None, outcome_columns: list[str] | tuple[str, ...] | str | None = None, predicted_outcome_columns: list[str] | tuple[str, ...] | str | None = None, ignored_columns: list[str] | tuple[str, ...] | str | None = None, check_id_uniqueness: bool = True, standardize_types: bool = True, use_deepcopy: bool = True, id_column_candidates: list[str] | tuple[str, ...] | str | None = None) → Self

Create a Sample from a pandas DataFrame.

Thin wrapper around SampleFrame.from_frame() that builds a SampleFrame and then wraps it in a Sample via _create().

Parameters:
  • df – DataFrame containing the sample data.

  • id_column – Column name for respondent ids (must be unique).

  • covar_columns – Explicit covariate column names. If None, covariates are inferred by exclusion.

  • weight_column – Column to treat as weight.

  • outcome_columns – Columns to treat as outcomes.

  • predicted_outcome_columns – Columns to treat as predicted outcomes.

  • ignored_columns – Columns to ignore (excluded from covariates).

  • check_id_uniqueness – Whether to verify id uniqueness.

  • standardize_types – Whether to convert int types to float.

  • use_deepcopy – Whether to deepcopy the input DataFrame.

  • id_column_candidates – Candidate id column names when id_column is not provided.

Returns:

A new Sample.
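Several of the parameters above describe simple DataFrame operations. The following pure-pandas sketch shows what "inferred by exclusion", the id uniqueness check, and the int-to-float conversion could look like. It is an illustration under stated assumptions, not the library's implementation: infer_covar_columns is a hypothetical helper, and the column names are invented for the example.

```python
import pandas as pd


def infer_covar_columns(df, id_column, weight_column=None,
                        outcome_columns=(), ignored_columns=()):
    """Hypothetical helper: covariates are the columns left over once
    every column with an explicit role has been excluded."""
    reserved = {id_column, *outcome_columns, *ignored_columns}
    if weight_column is not None:
        reserved.add(weight_column)
    return [c for c in df.columns if c not in reserved]


# Invented example data.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "age": [30, 41, 25],             # integer covariate
    "income": [10.0, 20.0, 15.0],    # float covariate
    "weight": [1.0, 1.0, 2.0],
    "happiness": [0.2, 0.9, 0.5],    # outcome
})

# covar_columns=None: infer covariates by exclusion.
covars = infer_covar_columns(
    df, id_column="id", weight_column="weight", outcome_columns=["happiness"]
)

# check_id_uniqueness=True: ids must not repeat.
ids_are_unique = df["id"].is_unique

# standardize_types=True: convert integer covariates to float.
# astype returns a new DataFrame, so the input is left unchanged.
int_cols = [c for c in covars if pd.api.types.is_integer_dtype(df[c])]
standardized = df.astype({c: "float64" for c in int_cols})

print(covars)
print(ids_are_unique)
print(standardized.dtypes)
```

Passing covar_columns explicitly skips the inference step, which is useful when the frame carries extra columns that should be neither covariates nor listed in ignored_columns.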

to_balance_frame() → Any

Convert this Sample (with target) to a BalanceFrame.

The Sample must have a target set. If the Sample is adjusted, the adjustment state is preserved in the BalanceFrame.

Returns:

A new BalanceFrame mirroring this Sample’s data, target, and adjustment state.

Return type:

BalanceFrame

Raises:

ValueError – If this Sample does not have a target set.

Examples

>>> import pandas as pd
>>> from balance.sample_class import Sample
>>> s = Sample.from_frame(
...     pd.DataFrame({"id": [1, 2], "x": [10.0, 20.0], "weight": [1.0, 1.0]}))
>>> t = Sample.from_frame(
...     pd.DataFrame({"id": [3, 4], "x": [15.0, 25.0], "weight": [1.0, 1.0]}))
>>> bf = s.set_target(t).to_balance_frame()
>>> bf.is_adjusted
False

to_sample_frame() → Any

Convert this Sample to a SampleFrame.

Preserves all data and column roles (id, weight, outcomes, ignored columns). The returned SampleFrame is independent of the original Sample.

Returns:

A new SampleFrame mirroring this Sample.

Return type:

SampleFrame

Examples

>>> import pandas as pd
>>> from balance.sample_class import Sample
>>> s = Sample.from_frame(
...     pd.DataFrame({"id": [1, 2], "x": [10.0, 20.0], "weight": [1.0, 2.0]}))
>>> sf = s.to_sample_frame()
>>> list(sf.df_covars.columns)
['x']