DiabetesDataset2019

Dataset class for the Diabetes 2019 dataset, whose feature columns include sensitive attributes. Source and general description: https://www.kaggle.com/datasets/tigganeha4/diabetes-dataset-2019/data

Parameters

  • subsample_size (int) – defaults to None

    Size of the subsample to draw from the input dataset

  • subsample_seed (int) – defaults to None

    Random seed used when drawing the subsample with the pandas sample() method

  • with_nulls (bool) – defaults to True

    Whether to keep nulls in the dataset (True) or drop every row that contains at least one null (False)
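
To make the three parameters concrete, here is a minimal pandas sketch of the behavior they describe. It is not the library's implementation: the helper name `prepare_frame`, the toy column names, and the order of the two steps (dropping nulls before sampling) are all assumptions made for illustration.

```python
import pandas as pd


def prepare_frame(df, subsample_size=None, subsample_seed=None, with_nulls=True):
    """Hypothetical helper mirroring the documented parameters (illustration only)."""
    if not with_nulls:
        # Drop every row that has at least one null value.
        df = df.dropna(axis=0, how="any")
    if subsample_size is not None:
        # pandas sample() draws rows without replacement by default;
        # a fixed random_state makes the draw repeatable.
        df = df.sample(n=subsample_size, random_state=subsample_seed)
    return df.reset_index(drop=True)


# Toy data with made-up columns and values, not the real dataset.
toy = pd.DataFrame({
    "Age": ["less than 40", "40-49", None, "60 or older", "50-59", "40-49"],
    "BMI": [22.0, 31.0, 27.0, None, 24.0, 29.0],
})

no_nulls = prepare_frame(toy, with_nulls=False)
sample_a = prepare_frame(toy, subsample_size=3, subsample_seed=42, with_nulls=False)
sample_b = prepare_frame(toy, subsample_size=3, subsample_seed=42, with_nulls=False)
```

In this sketch, calling the helper twice with the same `subsample_seed` yields the same rows, which is the usual reason to set a seed when comparing experiments on a subsample.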