chainer_chemistry.iterators.BalancedSerialIterator

class chainer_chemistry.iterators.BalancedSerialIterator(dataset, batch_size, labels, repeat=True, shuffle=True, batch_balancing=False, ignore_labels=None, logger=<Logger chainer_chemistry.iterators.balanced_serial_iterator (WARNING)>)

Dataset iterator that reads examples serially while balancing them across labels.

Parameters:
  • dataset – Dataset to iterate.
  • batch_size (int) – Number of examples within each minibatch.
  • labels (list or numpy.ndarray) – 1-D array giving the label of each example in dataset. Its length must equal the length of dataset.
  • repeat (bool) – If True, the iterator loops over the dataset indefinitely. Otherwise, it stops at the end of the first epoch.
  • shuffle (bool) – If True, the order of examples is shuffled at the beginning of each epoch. Otherwise, the order always matches that of dataset.
  • batch_balancing (bool) – If True, examples are sampled so that each label is represented roughly evenly within every minibatch. Otherwise, the iterator only guarantees that the total number of sampled examples is the same for every label.
  • ignore_labels (int or list or None) – Labels to be ignored. If not None, examples whose label is in ignore_labels are not sampled by this iterator.
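
The effect of labels and ignore_labels can be illustrated with a small NumPy-only sketch. This is not the library's implementation: the helper name balanced_epoch_indices, its seed argument, and the strategy of cycling through each label's examples until every label reaches the count of the most frequent one are assumptions made for illustration only.

```python
import numpy as np


def balanced_epoch_indices(labels, ignore_labels=None, shuffle=True, seed=0):
    """Return dataset indices for one label-balanced pass (illustrative only).

    Hypothetical helper, not part of chainer_chemistry.
    """
    labels = np.asarray(labels)
    # Accept the same shapes of `ignore_labels` as documented: int, list or None.
    if ignore_labels is None:
        ignored = []
    elif isinstance(ignore_labels, int):
        ignored = [ignore_labels]
    else:
        ignored = list(ignore_labels)

    rng = np.random.RandomState(seed)
    # Group example indices by label, skipping ignored labels.
    groups = [np.flatnonzero(labels == lab)
              for lab in np.unique(labels) if lab not in ignored]
    # Assumption: every label is drawn as often as the most frequent label occurs.
    target = max(len(g) for g in groups)
    picked = []
    for g in groups:
        if shuffle:
            g = rng.permutation(g)
        # np.resize repeats the group's indices cyclically up to `target` entries.
        picked.append(np.resize(g, target))
    order = np.concatenate(picked)
    if shuffle:
        order = rng.permutation(order)
    return order


labels = [0, 0, 0, 0, 1, 1, -1]  # label -1 marks examples to skip
order = balanced_epoch_indices(labels, ignore_labels=-1)
counts = {lab: int(np.sum(np.asarray(labels)[order] == lab)) for lab in (0, 1, -1)}
```

In this sketch the two examples with label 1 are each drawn twice, so labels 0 and 1 contribute four indices each, and the example labelled -1 is never drawn.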
__init__(dataset, batch_size, labels, repeat=True, shuffle=True, batch_balancing=False, ignore_labels=None, logger=<Logger chainer_chemistry.iterators.balanced_serial_iterator (WARNING)>)

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(dataset, batch_size, labels[, …]) Initialize self.
finalize() Finalizes the iterator and possibly releases the resources.
next() Returns the next batch.
reset()
serialize(serializer) Serializes the internal state of the iterator.
show_label_stats()

Attributes

epoch_detail
previous_epoch_detail