Classifiers

Lazyflow includes some predefined operators for training a classifier and predicting results with it. These operators are defined in the module lazyflow.operators.classifierOperators (classifierOperators.py). Most applications should use the high-level operators OpTrainClassifierBlocked and OpClassifierPredict, which instantiate internal pipelines for retrieving blocks of image data, caching them, and passing them to the classifier.

The lazyflow classifier operators support two families of classifiers:

1. Most classifiers are so-called “vectorwise” classifiers, which can be trained with a simple feature matrix. When training these classifiers, the features for each labeled pixel can be extracted from the input images and cached in a small feature matrix.

2. Some classifiers require the training data in its full 2D or 3D context, in which case they are trained with full 2D/3D feature images. In lazyflow, these are called “pixelwise” classifiers.
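
To make the difference between the two families concrete, here is a minimal sketch of how training data for a vectorwise classifier could be gathered: only the labeled pixels are kept, and each contributes one row of the feature matrix. The function name extract_training_samples, the nested-list image layout, and the use of 0 for "unlabeled" are assumptions made for this illustration; they are not part of lazyflow.

```python
def extract_training_samples(feature_image, label_image):
    """Collect one feature row per labeled pixel.

    feature_image: nested lists indexed [y][x][c] (channel last).
    label_image:   nested lists indexed [y][x]; 0 means "unlabeled"
                   (an assumption made for this sketch).
    Returns (X, y): a 2D feature matrix and a 1D label vector.
    """
    X, y = [], []
    for row_features, row_labels in zip(feature_image, label_image):
        for pixel_features, label in zip(row_features, row_labels):
            if label != 0:
                X.append(list(pixel_features))
                y.append(label)
    return X, y


# A 2x3 image with 2 feature channels; only three pixels carry a label.
features = [
    [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7]],
    [[0.4, 0.6], [0.5, 0.5], [0.6, 0.4]],
]
labels = [
    [1, 0, 0],
    [0, 2, 2],
]

X, y = extract_training_samples(features, labels)
print(X)  # [[0.1, 0.9], [0.5, 0.5], [0.6, 0.4]]
print(y)  # [1, 2, 2]
```

A pixelwise classifier would instead receive the whole feature image and the whole label image, so that it can see the spatial context around each labeled pixel.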

The classifier operators are agnostic with respect to the particular type of classifier you want to use (e.g. RandomForest, SVM, etc.). Any type of classifier can be used as long as it adheres to two abstract interfaces: a ‘classifier factory’ interface and a ‘classifier’ interface.

To add support for a new classifier in lazyflow, you must define two classes:

  1. A “classifier factory”, which inherits from either LazyflowVectorwiseClassifierFactoryABC or LazyflowPixelwiseClassifierFactoryABC
  2. A “classifier”, which inherits from either LazyflowVectorwiseClassifierABC or LazyflowPixelwiseClassifierABC

The factory is used to create and train a classifier. The factory must be pickle-able and the classifier must be serializable in hdf5.
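
The division of labor between the two classes can be sketched as follows. This is an illustration only: CentroidClassifier and CentroidClassifierFactory are hypothetical names, the classes are duck-typed stand-ins rather than subclasses of the lazyflow ABCs (so the snippet runs without lazyflow installed), and the hdf5 serialization methods are left out. A real implementation would inherit from the ABCs listed above and implement serialize_hdf5 and deserialize_hdf5 as well.

```python
import math


class CentroidClassifier:
    """Hypothetical classifier: scores samples by distance to class centroids."""

    def __init__(self, centroids, feature_names):
        self._centroids = centroids          # {label: [mean of each feature]}
        self._feature_names = feature_names

    @property
    def known_classes(self):
        return sorted(self._centroids)

    @property
    def feature_count(self):
        return len(self._feature_names)

    @property
    def feature_names(self):
        return self._feature_names

    def predict_probabilities(self, X):
        # One row per sample, one column per known class; each row sums to 1.
        result = []
        for sample in X:
            weights = [
                math.exp(-math.dist(sample, self._centroids[label]))
                for label in self.known_classes
            ]
            total = sum(weights)
            result.append([w / total for w in weights])
        return result


class CentroidClassifierFactory:
    """Hypothetical factory: it stores no trained state, only plain settings."""

    description = "Nearest-centroid classifier (illustrative)"

    def create_and_train(self, X, y, feature_names=None):
        if feature_names is None:
            feature_names = [f"feature_{i}" for i in range(len(X[0]))]
        sums, counts = {}, {}
        for sample, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(sample))
            for i, value in enumerate(sample):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1
        centroids = {
            label: [s / counts[label] for s in sums[label]] for label in sums
        }
        return CentroidClassifier(centroids, feature_names)


factory = CentroidClassifierFactory()
classifier = factory.create_and_train([[0.0, 0.0], [1.0, 1.0]], [1, 2])
probabilities = classifier.predict_probabilities([[0.1, 0.0]])
print(classifier.known_classes)  # [1, 2]
```

Keeping the trained state in the classifier and only plain configuration in the factory is one way to satisfy the two different persistence requirements: the factory stays trivially pickle-able, and only the classifier needs hdf5 support.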

For clear example implementations of vectorwise and pixelwise classifiers, see vigraRfLazyflowClassifier.py and vigraRfPixelwiseClassifier.py. A slightly more complicated “production” classifier can be found in parallelVigraRfLazyflowClassifier.py.

Finally, most classifiers used in sklearn can be used in lazyflow via the classes implemented in sklearnLazyflowClassifier.py.

Note

In the ilastik pixel classification workflow, the classifier type can be changed via the “Advanced” menu, which is only visible in “debug” mode.

Reference: Classifier ABCs

class lazyflow.classifiers.lazyflowClassifier.LazyflowVectorwiseClassifierFactoryABC

Defines an interface for vector-wise classifier ‘factory’ objects, which lazyflow classifier operators use to construct new vector-wise classifiers. A “vector-wise” classifier is trained with a 2D feature matrix and a 1D label vector.

create_and_train(X, y, feature_names=None)

Create a new classifier and train it with the feature matrix X and label vector y.

description

Return a human-readable description of this classifier.

estimated_ram_usage_per_requested_predictionchannel()

Return the estimated RAM (in bytes) that the classifier needs to run classification, expressed per requested output channel (label class).

class lazyflow.classifiers.lazyflowClassifier.LazyflowVectorwiseClassifierABC

Defines an interface for “vector-wise” classifier objects that can be used by the lazyflow classifier operators. A “vector-wise” classifier is trained with a 2D feature matrix and a 1D label vector.

All scikit-learn classifiers already satisfy this interface.

classmethod deserialize_hdf5(h5py_group)

Class method. Deserialize the classifier stored in the given h5py.Group object, and return it.

feature_count

Return the number of features used to train this classifier.

feature_names

Return a list of the names of the features used to train this classifier. The list has one entry per feature, so len(self.feature_names) == self.feature_count.

known_classes

Return the list of label classes known to this classifier (i.e. the classes it was trained with).

predict_probabilities(X)

For each sample in the feature matrix X, predict the probabilities that the sample belongs to each label class the classifier was trained with.

Returns: A multi-channel vector (each channel corresponds to a different label class).
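
As an illustration of how the multi-channel result might be consumed, the sketch below picks the most likely label for each sample. It assumes that channel i of the result corresponds to known_classes[i]; that ordering is an assumption of this example, not something the interface text above states. The helper name most_likely_labels is hypothetical.

```python
def most_likely_labels(probabilities, known_classes):
    """Return the label with the highest probability for each sample.

    Assumes (not confirmed by the interface description) that column i of
    `probabilities` corresponds to known_classes[i].
    """
    labels = []
    for row in probabilities:
        best_channel = max(range(len(row)), key=row.__getitem__)
        labels.append(known_classes[best_channel])
    return labels


# Three samples, two label classes (labels 1 and 3 were used in training).
probabilities = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
print(most_likely_labels(probabilities, known_classes=[1, 3]))  # [1, 3, 1]
```

Ties resolve to the first channel here, which is simply how max() behaves; a real application may want a different tie-breaking rule.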

serialize_hdf5(h5py_group)

Serialize the classifier as an hdf5 group.

class lazyflow.classifiers.lazyflowClassifier.LazyflowPixelwiseClassifierFactoryABC

Defines an interface for pixel-wise classifier ‘factory’ objects, which lazyflow classifier operators use to construct new pixel-wise classifiers. A “pixel-wise” classifier is trained with a list of ND feature images (with M feature channels), and a list of corresponding ND label images, with 1 channel each.

Note: It is assumed here that ‘channel’ is always the last axis of the image.

create_and_train_pixelwise(feature_images, label_images, axistags=None, feature_names=None)

Create a new classifier and train it with the given list of feature images and the given list of label images. Generally, it is assumed that the channel dimension is the LAST axis for each image. (The label image must include a singleton channel dimension.) Each pair of corresponding feature and label images must have matching shapes (except for the channel dimension).
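
The shape constraints described above can be expressed as a small check. This is a sketch working on shape tuples only, so it does not depend on any array library; check_training_pair is a hypothetical helper, not a lazyflow function.

```python
def check_training_pair(feature_shape, label_shape):
    """Validate one (feature image, label image) pair by shape alone.

    Both shapes are tuples with the channel axis last.
    Returns M, the number of feature channels.
    """
    if len(feature_shape) != len(label_shape):
        raise ValueError("feature and label images must have the same number of axes")
    if label_shape[-1] != 1:
        raise ValueError("label image needs a singleton channel axis")
    if feature_shape[:-1] != label_shape[:-1]:
        raise ValueError("non-channel axes must match")
    return feature_shape[-1]


# A zyxc feature image with 4 channels and its matching label image.
print(check_training_pair((10, 20, 30, 4), (10, 20, 30, 1)))  # 4
```

A label image of shape (10, 20, 30), without the singleton channel axis, would be rejected by the first check.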

description

Return a human-readable description of this classifier.

estimated_ram_usage_per_requested_predictionchannel()

Return the estimated RAM (in bytes) that the classifier needs to run classification, expressed per requested output channel (label class).

get_halo_shape(data_axes='zyxc')

Return the halo dimensions required for optimal classifier performance. For example, for a classifier that performs an internal 3D convolution with sigma=1.5 and window_size = 2.0, halo_shape = (3, 3, 3, 0).

Clients are not required to provide the halo during training. (For example, it may not be possible for labels near the image border.)

data_axes: A string representing the axis order of the data that will be used for training/prediction.
Examples: ‘yxc’, ‘zyxc’, or ‘tzyxc’.
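
The worked example above (sigma=1.5, window_size=2.0, halo (3, 3, 3, 0)) is consistent with a halo of sigma * window_size pixels on each spatial axis and none on the channel axis. The sketch below generalizes that to other axis orders. Rounding up with ceil and giving the 't' axis a zero halo are assumptions of this illustration, and halo_shape_for_gaussian is a hypothetical helper, not part of lazyflow.

```python
import math


def halo_shape_for_gaussian(sigma, window_size, data_axes="zyxc"):
    """Halo per axis for a Gaussian-style filter.

    Spatial axes (z, y, x) get ceil(sigma * window_size) pixels; every other
    axis gets 0. Treating 't' like 'c' here is an assumption of this sketch.
    """
    radius = math.ceil(sigma * window_size)
    return tuple(radius if axis in "zyx" else 0 for axis in data_axes)


print(halo_shape_for_gaussian(1.5, 2.0, "zyxc"))   # (3, 3, 3, 0)
print(halo_shape_for_gaussian(1.5, 2.0, "yxc"))    # (3, 3, 0)
print(halo_shape_for_gaussian(1.5, 2.0, "tzyxc"))  # (0, 3, 3, 3, 0)
```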

class lazyflow.classifiers.lazyflowClassifier.LazyflowPixelwiseClassifierABC

Defines an interface for “pixel-wise” classifier objects that can be used by the lazyflow classifier operators. A “pixel-wise” classifier expects its input to be given as a list of ND feature images (with M feature channels). (It was trained with a list of ND label images, with 1 channel each.)

Note: It is assumed here that ‘channel’ is always the last axis of the image.

(This interface is typically used with classifiers that must generate their own features internally, and thus require the knowledge of the image structure and context around each training/prediction point.)

classmethod deserialize_hdf5(h5py_group)

Class method. Deserialize the classifier stored in the given h5py.Group object, and return it.

feature_count

Return the number of features used to train this classifier.

get_halo_shape(data_axes='zyxc')

Same as LazyflowPixelwiseClassifierFactoryABC.get_halo_shape(). See that function for details.

known_classes

Return the list of label classes known to this classifier (i.e. the classes it was trained with).

predict_probabilities_pixelwise(feature_image, roi, axistags=None)

For each pixel in the given feature_image, predict the probabilities that the pixel belongs to each label class the classifier was trained with.

feature_image: An ND image. Last axis must be channel.

roi: The region of interest (start, stop) within feature_image to predict (e.g. without the halo region).

Note: The roi parameter should not include the channel axis.
For example, a valid roi for a zyxc image could be ((0,0,0), (10,20,30)).

axistags: Optional. A vigra.AxisTags object describing the feature_image.

Returns: A multi-channel image (each channel corresponds to a different label class).
The result image size is determined by the roi parameter.
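
To show how the roi and the halo fit together, the sketch below computes the roi that excludes the halo from a feature block, and the shape of the prediction that would result. Both helper names are hypothetical, and the assumption that the output has exactly one channel per label class is taken from the "Returns" description above rather than from the lazyflow source.

```python
def roi_without_halo(image_shape, halo_shape):
    """Return (start, stop) for the part of the image not covered by the halo.

    image_shape and halo_shape include the channel axis (last); the returned
    roi omits it, matching the note that roi should not include channel.
    """
    start = tuple(halo_shape[:-1])
    stop = tuple(s - h for s, h in zip(image_shape[:-1], halo_shape[:-1]))
    return start, stop


def result_shape(roi, num_classes):
    """Shape of the prediction image: roi extent plus one channel per class."""
    start, stop = roi
    return tuple(b - a for a, b in zip(start, stop)) + (num_classes,)


# A zyxc feature block that was read with a 3-pixel halo on each spatial axis.
roi = roi_without_halo(image_shape=(16, 26, 36, 4), halo_shape=(3, 3, 3, 0))
print(roi)                   # ((3, 3, 3), (13, 23, 33))
print(result_shape(roi, 2))  # (10, 20, 30, 2)
```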

serialize_hdf5(h5py_group)

Serialize the classifier as an hdf5 group.