API
class cem.CEM(data: pandas.core.frame.DataFrame, treatment: str, outcome: str, H: Optional[int] = None, measure: str = 'l1', lower_H: int = 1, upper_H: int = 10)
The CEM class allows users to experiment with different coarsening schemas on a single DataFrame. The “imbalance” and “match” methods return the multivariate imbalance (pre- or post-matching) and the post-matching weight of each observation, respectively.
Parameters:
- data (pandas.DataFrame) – A dataframe containing the observations
- treatment (str) – Name of the column in the dataframe containing the treatment variable
- outcome (str) – Name of the column in the dataframe containing the outcome variable
- H (int, optional) – The number of bins to use for the continuous variables when calculating imbalance. If None, H is chosen by a heuristic: the integer value between lower_H and upper_H that produces the median L1 imbalance.
- measure (str, optional) – Multivariate imbalance measure to use (only L1 and L2 imbalance are supported)
- lower_H (int, optional) – If H is not provided, the lower end of the range for the automatic H search
- upper_H (int, optional) – If H is not provided, the upper end of the range for the automatic H search
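
The automatic choice of H described above can be sketched in plain Python. This is an illustration of the idea, not the library's code: the helper names (`bin_index`, `l1_imbalance`, `pick_h`) are hypothetical, equal-width bins are assumed, the search range is assumed to include both ends, and when the number of candidates is even the lower of the two middle values is taken as the median.

```python
from collections import Counter
from statistics import median_low


def bin_index(value, lo, hi, n_bins):
    """Equal-width bin (0 .. n_bins - 1) that `value` falls into."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)  # the maximum belongs to the last bin


def l1_imbalance(covariates, treated, n_bins):
    """Half the summed absolute difference between the treated and
    control relative frequencies over all multivariate cells."""
    n_cols = len(covariates[0])
    bounds = [
        (min(row[j] for row in covariates), max(row[j] for row in covariates))
        for j in range(n_cols)
    ]
    counts = {0: Counter(), 1: Counter()}
    for row, t in zip(covariates, treated):
        cell = tuple(
            bin_index(x, lo, hi, n_bins) for x, (lo, hi) in zip(row, bounds)
        )
        counts[t][cell] += 1
    n_control = sum(counts[0].values())
    n_treated = sum(counts[1].values())
    cells = set(counts[0]) | set(counts[1])
    return 0.5 * sum(
        abs(counts[1][c] / n_treated - counts[0][c] / n_control) for c in cells
    )


def pick_h(covariates, treated, lower_h=1, upper_h=10):
    """Return the H whose L1 imbalance is the (lower) median of all candidates."""
    scores = {
        h: l1_imbalance(covariates, treated, h)
        for h in range(lower_h, upper_h + 1)  # assumption: both ends included
    }
    target = median_low(scores.values())
    return min(h for h, score in scores.items() if score == target)


# Hypothetical data: one continuous covariate, treated flag 1 or 0.
covariates = [(0.2,), (0.4,), (0.5,), (0.9,), (0.1,), (0.6,), (0.7,), (0.8,)]
treated = [1, 1, 1, 1, 0, 0, 0, 0]
print(pick_h(covariates, treated))
```

With a single bin every observation shares one cell, so the imbalance is 0; finer bins can only keep or increase it, which is why a middle-of-the-range H is a reasonable default reference point.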

data
Type: pandas.DataFrame

treatment
Type: str

outcome
Type: str

H
Type: int

imbalance_schema
Independent coarsening schema used to calculate multivariate imbalance (pre- or post-matching)
Type: dict

measure
Multivariate imbalance measure
Type: str

imbalance(coarsening: Optional[dict] = None) → float
Calculate the multivariate imbalance remaining after matching the data using some coarsening schema.
Parameters: coarsening (dict, optional) – Defines the strata. If None, the returned value is the imbalance prior to performing CEM. Keys are the covariate/column names and values are tuples of (func, kwargs), where “func” is the name of the Pandas function to use for grouping the covariate (only “cut” and “qcut” are supported) and “kwargs” is a dict of arguments to be passed to the chosen Pandas function (along with the covariate data).
Returns: The residual imbalance
Return type: float
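
As a rough illustration of the coarsening format described above, the sketch below builds a schema and applies it with pandas. The column names (`age`, `income`) and the `coarsen` helper are hypothetical, and looking up the function with `getattr(pd, func)` is an assumption about how such a schema could be interpreted, not the library's confirmed implementation.

```python
import pandas as pd

# Hypothetical covariates.
df = pd.DataFrame(
    {
        "age": [23, 35, 47, 59, 61, 72],
        "income": [18.0, 32.5, 41.0, 55.0, 78.5, 120.0],
    }
)

# Keys are column names; values are (func, kwargs) tuples.
coarsening = {
    "age": ("cut", {"bins": [0, 30, 60, 120]}),  # explicit bin edges
    "income": ("qcut", {"q": 3}),                # three quantile-based bins
}


def coarsen(data, schema):
    """Replace each listed column by its binned (categorical) version."""
    out = data.copy()
    for column, (func, kwargs) in schema.items():
        if func not in ("cut", "qcut"):
            raise ValueError(f"unsupported coarsening function: {func}")
        # pd.cut / pd.qcut receive the covariate data plus the kwargs.
        out[column] = getattr(pd, func)(data[column], **kwargs)
    return out


strata = coarsen(df, coarsening)
print(strata)
```

Observations that share the same bin in every coarsened column fall into the same stratum, which is the unit over which matching and imbalance are evaluated.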

match(coarsening: Optional[dict] = None) → pandas.core.series.Series
Perform coarsened exact matching using some coarsening schema and return the weights for each observation.
Parameters: coarsening (dict, optional) – Defines the strata. If None, the returned value is the imbalance prior to performing CEM. Keys are the covariate/column names and values are tuples of (func, kwargs), where “func” is the name of the Pandas function to use for grouping the covariate (only “cut” and “qcut” are supported) and “kwargs” is a dict of arguments to be passed to the chosen Pandas function (along with the covariate data).
Returns: The weight to use for each observation of the provided data given the coarsening schema provided
Return type: pandas.Series
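
To make the idea of per-observation weights concrete, here is a minimal sketch of the weighting commonly used in the CEM literature: unmatched observations get weight 0, matched treated observations get weight 1, and matched controls in a stratum are scaled by the ratio of treated to control counts in that stratum, normalised by the overall matched counts. Whether `match` computes exactly these values is an assumption; the `cem_weights` helper and the example data are hypothetical.

```python
import pandas as pd


def cem_weights(strata: pd.Series, treated: pd.Series) -> pd.Series:
    """Weights from stratum labels and a 0/1 treatment indicator."""
    t = treated.astype(int)
    c = 1 - t
    # Treated and control counts of the stratum each observation belongs to.
    n_treated = t.groupby(strata).transform("sum")
    n_control = c.groupby(strata).transform("sum")
    # A stratum is matched when it holds at least one unit of each group.
    matched = (n_treated > 0) & (n_control > 0)
    m_treated = int(t[matched].sum())
    m_control = int(c[matched].sum())

    weights = pd.Series(0.0, index=strata.index)
    if m_treated == 0:
        return weights  # nothing matched
    is_treated = (t == 1) & matched
    is_control = (c == 1) & matched
    weights.loc[is_treated] = 1.0
    weights.loc[is_control] = (m_control / m_treated) * (
        n_treated[is_control] / n_control[is_control]
    )
    return weights


# Hypothetical strata: "c" has no control unit, so it stays unmatched.
strata = pd.Series(["a", "a", "a", "b", "b", "c"])
treated = pd.Series([1, 0, 0, 1, 0, 1])
weights = cem_weights(strata, treated)
print(weights)
```

Under this scheme the control weights sum to the number of matched controls, so a weighted analysis of the outcome keeps the matched sample size while balancing the strata.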