RadiusNeighborsClassifier

simbsig.neighbors.RadiusNeighborsClassifier.RadiusNeighborsClassifier(radius=1.0, *, weights='uniform', outlier_label=None, metric='euclidean', p=2, metric_params=None, feature_weights=None, device='cpu', mode='arrays', n_jobs=0, batch_size=None, verbose=True, **kwargs)

Classifier that votes among the training neighbors located within a user-defined radius.

Parameters

radius – float, default=1.0 Radius of the neighborhood searched by radius_neighbors() queries.
weights – str or callable, default='uniform' Weighting applied to the neighbors identified in the training set. Supported options are ['uniform', 'distance'] or a callable. If weights='uniform', all points in each neighborhood are weighted equally. If weights='distance', each neighbor is weighted by the inverse of its distance to the query point, so that closer neighbors have a greater influence on the prediction. If weights is a callable, the user-defined function must take an array of distances as input and return an equal-size array of weights.
p – int, default=2 Parameter to be used when metric=’minkowski’. Note that if p=1 or p=2, it is equivalent to using metric=‘manhattan’ (L1) or metric=‘euclidean’ (L2), respectively. For any other arbitrary p, minkowski distance (L_p) is used.
metric – str or callable, default='euclidean' The distance metric used to quantify similarity between objects. Available metrics include ['euclidean', 'manhattan', 'minkowski', 'fractional', 'cosine', 'mahalanobis']. When metric='precomputed', provide X as a distance matrix, which must be square during fit.
metric_params – dict, default=None Additional metric-specific keyword arguments.
feature_weights – np.array of floats, default=None Vector of user-defined weights, one per feature. Its length must equal the number of features, n_features_in. If feature_weights=None, uniform weights are applied.
device – str, default=’cpu’ Which device to use for distance computations. Options supported are: [‘cpu’,’gpu’]
mode – str, default=’arrays’ Whether the input data is in memory (as lists, arrays or tensors) or on disk as hdf5 files. The latter should be favored for big datasets. Options supported are: [‘arrays’,’hdf5’]
n_jobs – int, default=0 Number of jobs active in the torch DataLoader.
batch_size – int, default=None Number of samples per data chunk processed at once for distance computations. Should be tuned to the dataset when using device='gpu'. If batch_size=None, the entire dataset is loaded and processed at once, which may raise an error when using device='gpu'.
verbose – bool, default=True Controls logging. If True, progress updates are printed.
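
The p and weights options above can be illustrated with a small standalone sketch. It uses only the Python standard library and is not simbsig's implementation: the helper names minkowski and neighbor_weights are hypothetical, and the 'distance' option is assumed here to mean inverse-distance weighting with strictly positive distances.

```python
def minkowski(a, b, p=2):
    """Minkowski (L_p) distance between two equal-length sequences."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def neighbor_weights(distances, weights="uniform"):
    """Map a list of neighbor distances to an equal-size list of weights."""
    if callable(weights):
        # User-defined function: distances in, equal-size weights out.
        return list(weights(distances))
    if weights == "uniform":
        return [1.0] * len(distances)
    if weights == "distance":
        # Assumption: inverse-distance weighting, all distances > 0.
        return [1.0 / d for d in distances]
    raise ValueError(f"unsupported weights option: {weights!r}")


a, b = (0.0, 0.0), (3.0, 4.0)
l1 = minkowski(a, b, p=1)  # matches the 'manhattan' (L1) distance
l2 = minkowski(a, b, p=2)  # matches the 'euclidean' (L2) distance
w = neighbor_weights([1.0, 2.0], weights="distance")
```

With p=1 the sum of absolute differences is returned, and with p=2 the usual Euclidean norm; a callable passed as weights replaces both built-in schemes.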

simbsig.neighbors.RadiusNeighborsClassifier.RadiusNeighborsClassifier.fit(self, X, y=None)

Fit classifier based on the radius neighbors from the training dataset.

Parameters

X – Training data passed in an array-like or h5py file format. Should be of shape (n_samples, n_features) or (n_samples, n_samples) if metric=’precomputed’.
y – Target values from the training data passed in an array-like or sparse matrix format. Should be of shape (n_samples,) or (n_samples, n_outputs)

Returns

self – RadiusNeighborsClassifier The fitted radius neighbors classifier.

simbsig.neighbors.RadiusNeighborsClassifier.RadiusNeighborsClassifier.predict(self, X=None)

Predict the class labels for the testing dataset.

Parameters

X – Test samples passed in an array-like or h5py file format. Should be of shape (n_queries, n_features) or (n_queries, n_indexed) if metric='precomputed'.

Returns

y – Predicted class labels for each sample, returned as an ndarray of shape (n_queries,) or (n_queries, n_outputs).
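
As a rough illustration of what a radius-based vote involves, the sketch below predicts one label for one query using only the standard library. It is not simbsig's code: radius_vote is a hypothetical helper, it assumes Euclidean distance and uniform weights, and returning None when no training point lies within the radius is an assumption made for this example only.

```python
import math
from collections import defaultdict


def radius_vote(query, X_train, y_train, radius=1.0):
    """Return the majority label among training points within `radius`."""
    votes = defaultdict(float)
    for x, label in zip(X_train, y_train):
        if math.dist(query, x) <= radius:
            votes[label] += 1.0  # uniform weights: one vote per neighbor
    if not votes:
        # Assumption for this sketch: no neighbors means no prediction.
        return None
    # Iterate labels in sorted order so ties resolve deterministically.
    return max(sorted(votes), key=votes.get)


X_train = [(0.0, 0.0), (0.5, 0.0), (3.0, 3.0), (3.2, 3.0)]
y_train = ["a", "a", "b", "b"]

near_a = radius_vote((0.2, 0.0), X_train, y_train, radius=1.0)
near_b = radius_vote((3.1, 3.0), X_train, y_train, radius=1.0)
isolated = radius_vote((10.0, 10.0), X_train, y_train, radius=1.0)
```

Unlike a k-nearest-neighbors vote, the number of voters varies per query, so a query far from all training data can end up with an empty neighborhood.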

simbsig.neighbors.RadiusNeighborsClassifier.RadiusNeighborsClassifier.predict_proba(self, X=None)

Return probability estimates for each class for each sample in the testing dataset.

Parameters

X – Test samples passed in an array-like or h5py file format. Should be of shape (n_queries, n_features) or (n_queries, n_indexed) if metric='precomputed'.

Returns

p – Predicted class probabilities for each sample, returned as an ndarray of shape (n_queries, n_classes) or a list of n_outputs such arrays if n_outputs > 1. Note that the classes are ordered lexicographically.
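
The lexicographic ordering of the class columns can be sketched as follows. This is a standalone illustration using the standard library, not the library's implementation: vote_probabilities is a hypothetical helper, and it assumes uniform weights and at least one neighbor within the radius.

```python
from collections import Counter


def vote_probabilities(neighbor_labels, classes):
    """Return (ordered_classes, probabilities) for one query's neighbors."""
    if not neighbor_labels:
        raise ValueError("at least one neighbor is required")
    ordered = sorted(classes)  # lexicographic column order
    counts = Counter(neighbor_labels)
    total = len(neighbor_labels)
    # Classes with no neighbor in the radius get probability 0.
    return ordered, [counts[c] / total for c in ordered]


ordered, proba = vote_probabilities(["b", "a", "b", "b"], {"b", "a", "c"})
```

Here column i of the probability row corresponds to ordered[i], so a caller needs the sorted class list to interpret the output.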