Splitter is a core instrument that allows you to get ‘equal’ groups for your experiment. Groups of an experiment are equal in the sense of users’ desired characteristic of experiment are equal.

It is a crucial part of any experiment design - to get approximately equal groups. Splitter in ABacus not only allows you to split your observations into groups, but also assesses the quality of this split.

df = pd.read_csv('./data/ab_data.csv')

split_builder_params = SplitBuilderParams(
        'control': 20_000,
        'target': 30_000
    main_strata_col = "city",
    split_metric_col = "check_rub_campaign",
    id_col = "user_id",
    cols = ["check_rub_pre_campaign"],
    n_bins = 6,
    min_cluster_size = 500

split_builder = SplitBuilder(df, split_builder_params)
split = split_builder.collect()


After the application of splitter to your data, you will see two additional columns to your data — strata and group_name:

Splitter example
  • strata: strata of observation created by clustering algorithm (HDBSCAN).

  • group_name: groups of experiment. Control group has the same group name - control, and the treatment is called treatment.