Experiment Initialization
Before running any analysis, you have to define your experiment. Here is how to do it:
import numpy as np
import pandas as pd

from abacus.auto_ab.abtest import ABTest
from abacus.auto_ab.params import ABTestParams, DataParams, HypothesisParams

# Load the experiment data: one row per observation.
df = pd.read_csv('./data/ab_data.csv')

# Describe which columns hold ids, group labels, and the target metric.
data_params = DataParams(
    id_col='user_id',
    group_col='groups',
    control_name='control',
    treatment_name='treatment',
    target='check_rub_campaign',
)

# Describe the hypothesis: error rates, alternative, and the metric to compare.
hypothesis_params = HypothesisParams(
    alpha=0.01,
    beta=0.2,
    alternative='greater',
    metric_type='continuous',
    metric_name='95th quantile',
    metric=lambda x: np.quantile(x, 0.95),
)

ab_params = ABTestParams(data_params, hypothesis_params)
ab_test = ABTest(df, ab_params)
As you can see, you only need to describe your data and your hypothesis.
For the data, you have to specify the columns and their purposes. The required attributes are:
id_col is the observation id. It can be user_id or any other id for your rows. Note that if your observations are dependent in some way (e.g. several checks per user), they must share the same id_col value.
group_col contains the group names. If your data have two groups, then there must be exactly two unique values in this column.
control_name and treatment_name are the group names, e.g. 'control', 'treatment', 'A', 'B', 'control group', 'send sms', 'do not send sms', etc.
target is the column containing the metric of interest.
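The column requirements above can be checked before the parameters are built. The following is a minimal sketch using only pandas; the helper name check_ab_data is hypothetical and is not part of abacus, and the check that each id belongs to a single group is an extra sanity check assumed here, not a rule stated by the library.

```python
import pandas as pd


def check_ab_data(df, id_col, group_col, control_name, treatment_name, target):
    """Hypothetical helper (not part of abacus): return a list of problems found."""
    problems = []

    # All referenced columns must exist in the dataframe.
    missing = [c for c in (id_col, group_col, target) if c not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
        return problems

    # The group column must hold exactly the two declared group names.
    found = set(df[group_col].dropna().unique())
    expected = {control_name, treatment_name}
    if found != expected:
        problems.append(
            f"group column holds {sorted(map(str, found))}, "
            f"expected {sorted(map(str, expected))}"
        )

    # Assumption: dependent rows share an id, so an id should sit in one group only.
    groups_per_id = df.groupby(id_col)[group_col].nunique()
    if (groups_per_id > 1).any():
        problems.append("some ids appear in more than one group")

    return problems


# Toy data: user 1 has two checks, so both rows carry the same id.
toy = pd.DataFrame({
    "user_id": [1, 1, 2, 3],
    "groups": ["control", "control", "treatment", "treatment"],
    "check_rub_campaign": [100.0, 250.0, 80.0, 120.0],
})
print(check_ab_data(toy, "user_id", "groups", "control", "treatment",
                    "check_rub_campaign"))
```

An empty list means none of the checked problems were found; each string in a non-empty list names one violated requirement.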
The hypothesis is described with:
alpha: type I error rate.
beta: type II error rate.
alternative: alternative of the hypothesis (two-sided, less, or greater).
metric_type: metric type. There are three of them: continuous, binary, and ratio.
metric_name: metric name, either default ('mean' or 'median') or custom (e.g. '95th percentile').
metric: function for metric calculation if metric_name is not default.
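To make the metric argument more concrete, here is a small sketch of a custom metric written as a named function instead of a lambda. It uses NumPy only. It assumes the callable receives a one-dimensional array of observations and returns a single number, as the lambda in the example above suggests. The name quantile_95 and the synthetic samples are purely illustrative.

```python
import numpy as np


def quantile_95(x):
    """Illustrative custom metric: the 95th quantile of the observations."""
    return float(np.quantile(x, 0.95))


# The error rates from the example; the test power follows from beta.
alpha, beta = 0.01, 0.2
power = 1 - beta  # probability of detecting an effect that really exists

# Synthetic samples standing in for the control and treatment targets.
rng = np.random.default_rng(0)
control = rng.exponential(scale=100.0, size=1_000)
treatment = rng.exponential(scale=110.0, size=1_000)

print("control q95:", quantile_95(control))
print("treatment q95:", quantile_95(treatment))
print("alpha:", alpha, "power:", power)
```

A named function is easier to test and reuse than a lambda, and it can be passed wherever the lambda was used, for example metric=quantile_95.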