RadiusNeighborsRegressor

simbsig.neighbors.RadiusNeighborsRegressor.RadiusNeighborsRegressor(radius=1.0, *, weights='uniform', p=2, metric='euclidean', metric_params=None, feature_weights=None, device='cpu', mode='arrays', n_jobs=0, batch_size=None, verbose=True, **kwargs)

Regression based on neighbors located within a fixed, user-defined radius. The target of a query point is predicted from the targets of the training samples that fall within that radius.

Parameters

radius – float, default=1.0 Radius of the neighborhood searched in radius_neighbors() queries.
weights – str or callable, default='uniform' Weights applied to the neighbors identified in the training set. Supported options are ['uniform', 'distance'] or a callable. If weights='uniform', all points in each neighborhood are weighted equally. If weights='distance', each neighbor is weighted by the inverse of its distance to the query point, so that closer neighbors have a greater influence on the prediction. If weights is a callable, it must take an array of distances as input and return an equal-size array of weights.
p – int, default=2 Power parameter used when metric='minkowski'. p=1 and p=2 are equivalent to metric='manhattan' (L1) and metric='euclidean' (L2), respectively. For any other p, the Minkowski distance (L_p) is used.
metric – str or callable, default='euclidean' The distance metric used to quantify similarity between objects. Available metrics include ['euclidean', 'manhattan', 'minkowski', 'fractional', 'cosine', 'mahalanobis']. When metric='precomputed', X must be provided as a distance matrix, which must be square during fit.
metric_params – dict, default=None Additional metric-specific keyword arguments.
feature_weights – np.array of floats, default=None Vector of user-defined weights, one per feature. Its length must equal the number of features, n_features_in. If feature_weights=None, uniform weights are applied.
device – str, default='cpu' Device used for distance computations. Supported options are ['cpu', 'gpu'].
mode – str, default='arrays' Whether the input data is in memory (as lists, arrays or tensors) or on disk as hdf5 files. The latter should be preferred for large datasets. Supported options are ['arrays', 'hdf5'].
n_jobs – int, default=0 Number of jobs active in the PyTorch DataLoader (torch.utils.data.DataLoader).
batch_size – int, default=None Size of the data chunks processed at once during distance computations. Should be tuned to the dataset when using device='gpu'. If batch_size=None, the entire dataset is loaded and processed at once, which may raise an error when using device='gpu'.
verbose – bool, default=True Whether to log progress information. If True, progress updates are produced.
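The interplay of radius, weights and p can be illustrated with a minimal pure-Python sketch. This is not the library's implementation: the helper names (minkowski, neighbor_weights, radius_predict), the inclusive radius boundary, the handling of zero distances and the None result for an empty neighborhood are all assumptions made for illustration.

```python
def minkowski(a, b, p=2):
    """Minkowski (L_p) distance; p=1 gives Manhattan, p=2 gives Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def neighbor_weights(distances, weights="uniform"):
    """Turn neighbor distances into weights for the three documented options."""
    if callable(weights):
        # User-defined function: distances in, equal-size weights out.
        return list(weights(distances))
    if weights == "uniform":
        return [1.0] * len(distances)
    if weights == "distance":
        # Assumption: a neighbor at distance 0 takes all the weight,
        # which avoids dividing by zero.
        if any(d == 0 for d in distances):
            return [1.0 if d == 0 else 0.0 for d in distances]
        return [1.0 / d for d in distances]
    raise ValueError(f"unsupported weights: {weights!r}")


def radius_predict(X_train, y_train, query, radius=1.0, weights="uniform", p=2):
    """Weighted mean of the targets of all training points within `radius`."""
    dists, targets = [], []
    for x, y in zip(X_train, y_train):
        d = minkowski(x, query, p)
        if d <= radius:  # assumption: the boundary is inclusive
            dists.append(d)
            targets.append(y)
    if not dists:
        return None  # assumption: an empty neighborhood yields no prediction
    w = neighbor_weights(dists, weights)
    return sum(wi * ti for wi, ti in zip(w, targets)) / sum(w)


# Toy data: one feature, three training points.
X_train = [[0.0], [1.0], [3.0]]
y_train = [0.0, 10.0, 30.0]
uniform_pred = radius_predict(X_train, y_train, [0.25], radius=1.0)
distance_pred = radius_predict(X_train, y_train, [0.25], radius=1.0, weights="distance")
```

With these toy values only the first two training points lie within the radius. The uniform prediction is their plain mean (5.0), while the distance-weighted prediction is pulled toward the closer neighbor (about 2.5).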

simbsig.neighbors.RadiusNeighborsRegressor.RadiusNeighborsRegressor.fit(self, X, y=None)

Fit regressor based on the radius neighbors from the training dataset.

Parameters

X – Training data, passed as an array-like or an h5py file. Should be of shape (n_samples, n_features), or (n_samples, n_samples) if metric='precomputed'.
y – Target values of the training data, passed as an array-like or a sparse matrix. Should be of shape (n_samples,) or (n_samples, n_regressions).

Returns

Return self: RadiusNeighborsRegressor The fitted radius neighbors regressor.
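To make the metric='precomputed' shape requirement concrete, the following sketch builds a square training distance matrix in plain Python. The helper pairwise_euclidean is hypothetical and not part of the library; it only shows what an (n_samples, n_samples) input looks like.

```python
import math


def pairwise_euclidean(A, B):
    """Return a len(A) x len(B) matrix of Euclidean distances (list of rows)."""
    return [[math.dist(a, b) for b in B] for a in A]


# Three training samples with two features each.
X_train = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]

# For fitting with a precomputed metric, distances are taken between the
# training samples themselves, so the matrix is square: (n_samples, n_samples).
D_train = pairwise_euclidean(X_train, X_train)
n_rows, n_cols = len(D_train), len(D_train[0])
```

The diagonal of such a matrix is zero and the matrix is symmetric, since each entry is the distance between two training samples.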

simbsig.neighbors.RadiusNeighborsRegressor.RadiusNeighborsRegressor.predict(self, X=None)

Predict the target for the testing dataset.

Parameters

X – Test samples, passed as an array-like or an h5py file. Should be of shape (n_queries, n_features), or (n_queries, n_indexed) if metric='precomputed'.

Returns

Return y: Predicted target values of dtype=double, returned as an ndarray of shape (n_queries,) or (n_queries, n_regressions).
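The input and output shapes for the precomputed, multi-target case can be sketched as follows. This is an illustration under stated assumptions, not the library's code: predict_from_precomputed is a hypothetical helper, uniform weights are used, the radius boundary is taken as inclusive, and a query with no neighbors yields None.

```python
def predict_from_precomputed(D_query, y_train, radius=1.0):
    """Uniform-weight radius regression from a query-to-training distance matrix.

    D_query has shape (n_queries, n_indexed); y_train has shape
    (n_indexed, n_regressions). The result has one row per query.
    """
    n_outputs = len(y_train[0])
    predictions = []
    for row in D_query:
        # Column indices of the training samples within the radius.
        idx = [j for j, d in enumerate(row) if d <= radius]
        if not idx:
            predictions.append(None)  # assumption: no neighbors, no prediction
            continue
        predictions.append(
            [sum(y_train[j][k] for j in idx) / len(idx) for k in range(n_outputs)]
        )
    return predictions


# Two queries against three indexed training samples, two regression targets.
D_query = [[0.2, 0.9, 3.0], [2.0, 0.5, 0.4]]
y_train = [[0.0, 1.0], [10.0, 3.0], [20.0, 5.0]]
y_pred = predict_from_precomputed(D_query, y_train, radius=1.0)
```

Each row of D_query corresponds to one query and each column to one indexed training sample, so the prediction has shape (n_queries, n_regressions), here 2 by 2.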