jsonvectorizer.vectorizers.TimestampVectorizer

class jsonvectorizer.vectorizers.TimestampVectorizer(bins, min_f=1)

Vectorizer for timestamps

Parses strings and converts them to unix timestamps, bins the results into the specified number of equiprobable bins (or using the provided bin edges), and uses one-hot encoding to create a binary feature matrix. After binning, the resulting bins are processed from left to right, and each bin is merged into its right neighbor until all bins contain at least the specified number of items. If necessary, the right-most bin is then merged into its left neighbor. In addition, if at least min_f items are not valid timestamps, an extra bin (feature) is created for invalid timestamps.
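The merging rule described above can be illustrated with a minimal, standard-library sketch. It is not the library's implementation: the function name `merge_sparse_bins`, the left-inclusive bin boundaries, and the use of plain lists are all assumptions made for illustration.

```python
from bisect import bisect_right


def merge_sparse_bins(values, edges, min_count):
    """Return the interior bin edges that survive merging (hypothetical helper).

    Bin i is assumed to cover [edges[i-1], edges[i]), with -inf and inf
    as the implicit outer edges.
    """
    edges = sorted(edges)
    # Count how many values fall into each of the len(edges) + 1 bins.
    counts = [0] * (len(edges) + 1)
    for value in values:
        counts[bisect_right(edges, value)] += 1

    kept_edges = []
    running = 0
    # Left-to-right pass: edges[i] is the right boundary of bin i.
    for i, edge in enumerate(edges):
        running += counts[i]
        if running >= min_count:
            # The (possibly merged) bin is large enough; close it here.
            kept_edges.append(edge)
            running = 0
        # Otherwise the edge is dropped, merging the bin into its right neighbor.

    # The right-most bin absorbs whatever is left over.
    running += counts[-1]
    if running < min_count and kept_edges:
        # Still too small: merge it into its left neighbor.
        kept_edges.pop()
    return kept_edges
```

For example, with values `[1, 2, 3, 10, 11, 12, 13, 20]`, edges `[5, 15, 18]`, and a minimum count of 2, the empty bin between 15 and 18 is merged to the right, the resulting right-most bin still holds only one item, and so it is merged to the left, leaving `[5]` as the only edge.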

Parameters:	bins : int or list
	Number of bins to generate, or a list of timestamps to use as bin edges (excluding -inf and inf).
	min_f : int or float, optional (default=1)
	Minimum number of samples in each generated bin. An integer is taken as an absolute count, and a float indicates the proportion of n_total passed to the `fit()` method.
Raises:	ValueError If min_f is not a positive number.
Attributes:	feature_names_ : list of str
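The two readings of min_f can be sketched as follows. This is an assumption-laden illustration rather than the library's code: `resolve_min_count` is a hypothetical helper, and rounding the proportional case up with `math.ceil` is a guess, since the documentation does not state the rounding rule.

```python
import math


def resolve_min_count(min_f, n_total):
    """Turn min_f into an absolute sample count (hypothetical helper)."""
    is_number = isinstance(min_f, (int, float)) and not isinstance(min_f, bool)
    if not is_number or min_f <= 0:
        # Mirrors the documented ValueError for non-positive min_f.
        raise ValueError("min_f must be a positive number")
    if isinstance(min_f, int):
        # Integers are taken as an absolute count.
        return min_f
    # Floats are taken as a proportion of n_total (rounding up is an assumption).
    return math.ceil(min_f * n_total)
```

Under these assumptions, `resolve_min_count(5, 200)` yields 5 and `resolve_min_count(0.25, 200)` yields 50.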

Methods

`fit`(self, values[, n_total])	Fit vectorizer to the provided data
`fit_transform`(self, values, **fit_params)	Fit vectorizer to the provided data, then transform it
`get_params`(self[, deep])	Get parameters for this estimator.
`set_params`(self, **params)	Set the parameters of this estimator.
`transform`(self, values)	Transform values and return the resulting feature matrix

fit(self, values, n_total=None, **kwargs)

Fit vectorizer to the provided data

Parameters:	values : array-like, [n_samples]
	n_total : int or None, optional (default=None)
	Total number of documents that values are extracted from. If None, defaults to `len(values)`.
	**kwargs
	Ignored keyword arguments.
Returns:	self or None Returns self if at least two bins are generated, otherwise returns None.

fit_transform(self, values, **fit_params)

Fit vectorizer to the provided data, then transform it

Parameters:	values : array-like, [n_samples]
	**fit_params
	Keyword arguments, passed to the `fit()` method.
Returns:	X : ndarray, [n_samples, n_features]

get_params(self, deep=True)

Get parameters for this estimator.

Parameters:	deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:	params : mapping of string to any Parameter names mapped to their values.

set_params(self, **params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:	**params : dict Estimator parameters.
Returns:	self : object Estimator instance.

transform(self, values)

Transform values and return the resulting feature matrix

Parameters:	values : array-like, [n_samples]
Returns:	X : sparse matrix, [n_samples, n_features]
Raises:	NotFittedError If the vectorizer has not yet been fitted.
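To make the shape of the output concrete, the following sketch one-hot encodes a few timestamp strings against fixed bin edges, with a trailing column for invalid values. It rests on several assumptions: `to_unix` and `one_hot_rows` are hypothetical helpers, `datetime.fromisoformat` stands in for whatever parser the library actually uses, naive datetimes are treated as UTC, and the result is a dense list of rows rather than the sparse matrix that `transform()` returns.

```python
from bisect import bisect_right
from datetime import datetime, timezone


def to_unix(value):
    """Return a unix timestamp, or None if parsing fails (hypothetical helper)."""
    try:
        # Assumption: ISO 8601 strings only; the real parser may accept more.
        parsed = datetime.fromisoformat(value)
    except (TypeError, ValueError):
        return None
    if parsed.tzinfo is None:
        # Assumption: treat naive datetimes as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.timestamp()


def one_hot_rows(values, edges, invalid_column=True):
    """Encode each value as a one-hot row over the bins (hypothetical helper)."""
    n_bins = len(edges) + 1
    width = n_bins + (1 if invalid_column else 0)
    rows = []
    for value in values:
        row = [0] * width
        timestamp = to_unix(value)
        if timestamp is not None:
            # Valid timestamp: mark the bin it falls into.
            row[bisect_right(edges, timestamp)] = 1
        elif invalid_column:
            # Invalid timestamp: mark the extra trailing column.
            row[-1] = 1
        rows.append(row)
    return rows


# One edge at 2020-01-01T00:00:00Z gives two bins plus the invalid column.
rows = one_hot_rows(
    ["2019-06-01T00:00:00+00:00", "2021-06-01T00:00:00+00:00", "not a date"],
    edges=[1577836800],
)
```

Each row has exactly one nonzero entry when the invalid column is present, which is what makes the matrix a one-hot encoding.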