pyconstruct.learners.SSG
class pyconstruct.learners.SSG(domain=None, model=None, *, inference='loss_augmented_map', eta0=1.0, power_t=0.5, learning_rate='optimal', radius=1000.0, projection='l2', init_w='normal', **kwargs)

Learner implementing the standard subgradient algorithm.
This learner performs standard stochastic subgradient descent as described in [1]. It also includes options for the following (a usage sketch follows this list):
- Training with the Pegasos update scheme [2]; simply set alpha greater than zero to regularize the model.
- Projecting the weights onto an L2 or an L1 ball of a given radius, the latter using the projection algorithm from [3].
- Boosting the model with the method from [4]; simply use a training loss other than hinge.
- Adaptive step size techniques and such (coming soon).
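The sketch below shows one way to put these options together. It is illustrative only: the domain construction and the data X and Y are placeholders, not part of this reference.

    from pyconstruct.learners import SSG

    # `domain` must be a BaseDomain instance; how it is built depends on
    # your problem definition and is not covered here.
    domain = ...

    learner = SSG(
        domain=domain,
        inference='loss_augmented_map',  # loss-augmented MAP while learning
        alpha=0.01,                      # alpha > 0 enables the Pegasos update [2]
        learning_rate='optimal',
        projection='l2',                 # project the weights back onto an L2 ball
        radius=1000.0,
    )

    learner.fit(X, Y)            # X, Y: structured training inputs and outputs
    Y_pred = learner.predict(X)  # MAP prediction with the learned weights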
Parameters:
- domain (BaseDomain) – The domain of the data.
- inference (str in ['map', 'loss_augmented_map']) – Which type of inference to perform when learning.
- alpha (float) – The regularization coefficient.
- train_loss (str in ['hinge', 'logistic', 'exponential']) – The training loss. The derivative of this loss is used to rescale the margin of the examples when making an update.
- structured_loss (function (y, y) -> float) – The structured loss to compute on the objects.
- eta0 (float) – The initial value of the learning rate.
- power_t (float) – The power of the iteration index when using an invscaling learning_rate.
- learning_rate (str in ['constant', 'optimal', 'invscaling']) – The learning-rate strategy. The constant strategy multiplies the updates by eta0; invscaling divides the updates by the iteration number raised to the power_t; the optimal strategy finds the best rate depending on alpha and train_loss (similar to Scikit-learn’s SGDRegressor optimal learning rate). A sketch of these schedules follows the parameter list.
- radius (float) – The radius of the ball enclosing the parameter space.
- projection (None or str in ['l1', 'l2']) – If None, no projection is applied; if ‘l1’ or ‘l2’ is given, the weights are projected back onto an L1 or an L2 ball respectively.
- init_w (str in ['zeros', 'uniform', 'normal', 'laplace']) – Initialization strategy for the parameter vector. ‘zeros’ initializes the vector to all zero values; ‘uniform’, ‘normal’ and ‘laplace’ initialize the vector with random weights from, respectively, a uniform, normal, or Laplace distribution.
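To make the learning_rate, radius and projection options concrete, the following NumPy sketch spells out the formulas these descriptions imply. It is not pyconstruct’s internal code, and the exact ‘optimal’ schedule used by the library may differ.

    import numpy as np

    def step_size(t, learning_rate='optimal', eta0=1.0, power_t=0.5, alpha=0.01):
        """Step size at iteration t >= 1 for the three strategies above."""
        if learning_rate == 'constant':
            return eta0
        if learning_rate == 'invscaling':
            return eta0 / (t ** power_t)
        # 'optimal': a 1 / (alpha * t) schedule is the usual choice for a
        # regularized hinge loss (as in Pegasos [2]).
        return 1.0 / (alpha * t)

    def project(w, projection='l2', radius=1000.0):
        """Project the weights back onto an L2 or L1 ball of the given radius."""
        if projection == 'l2':
            norm = np.linalg.norm(w)
            return w if norm <= radius else w * (radius / norm)
        if projection == 'l1':
            # Euclidean projection onto the L1 ball, following Duchi et al. [3].
            if np.abs(w).sum() <= radius:
                return w
            u = np.sort(np.abs(w))[::-1]            # sorted magnitudes, descending
            css = np.cumsum(u)
            rho = np.nonzero(u * np.arange(1, len(u) + 1) > css - radius)[0][-1]
            theta = (css[rho] - radius) / (rho + 1.0)
            return np.sign(w) * np.maximum(np.abs(w) - theta, 0.0)
        return w  # projection=None: leave the weights untouched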
References
[1] Ratliff, Nathan D., J. Andrew Bagnell, and Martin A. Zinkevich. “(Online) Subgradient Methods for Structured Prediction.” Artificial Intelligence and Statistics, 2007.
[2] Shalev-Shwartz, Shai, et al. “Pegasos: Primal estimated sub-gradient solver for SVM.” Mathematical Programming 127.1 (2011): 3-30.
[3] Duchi, John, Shai Shalev-Shwartz, Yoram Singer, and Tushar Chandra. “Efficient Projections onto the L1-Ball for Learning in High Dimensions.” International Conference on Machine Learning (ICML 2008). http://www.cs.berkeley.edu/~jduchi/projects/DuchiSiShCh08.pdf
[4] Parker, Charles, Alan Fern, and Prasad Tadepalli. “Gradient boosting for sequence alignment.” Proceedings of the 21st National Conference on Artificial Intelligence, Volume 1. AAAI Press, 2006.

Methods
- decision_function(X, Y, **kwargs)
- fit(X, Y, **kwargs) – Fit a model with data (X, Y).
- get_params([deep]) – Get parameters for this estimator.
- loss(X, Y, Y_pred, **kwargs)
- partial_fit(X, Y[, Y_pred, Y_phi, Y_pred_phi]) – Updates the current model with a mini-batch (X, Y).
- phi(X, Y, **kwargs) – Computes the feature vector for the given input and output objects.
- predict(X, *args, **kwargs) – Computes the prediction of the current model for the given input.
- score(X, Y[, Y_pred]) – Compute the score as the average loss over the examples.
- set_params(**params) – Set the parameters of this estimator.
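For online training, partial_fit can be called repeatedly on mini-batches. The sketch below is illustrative; minibatches is a placeholder iterator, not part of pyconstruct.

    # Continue training the learner built above, one mini-batch at a time.
    for X_batch, Y_batch in minibatches(X, Y, batch_size=32):
        learner.partial_fit(X_batch, Y_batch)

    # Average structured loss of the current model over held-out examples.
    print(learner.score(X_test, Y_test))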