pyconstruct.learners.SSG

class pyconstruct.learners.SSG(domain=None, model=None, *, inference='loss_augmented_map', eta0=1.0, power_t=0.5, learning_rate='optimal', radius=1000.0, projection='l2', init_w='normal', **kwargs)

Learner implementing the standard subgradient algorithm.

This learner performs the standard Stochastic Subgradient descent from [1]. It also includes options for:

  • Training with the Pegasos update scheme [2]: simply set alpha to a value greater than zero to regularize the model (see the update sketch after this list).
  • Projecting the weights onto an L2 or an L1 ball of a given radius, the latter using the projection algorithm from [3].
  • Boosting the model with the method from [4]: simply use a training loss other than hinge.
  • Adaptive step size techniques and such (coming soon).
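The core of the algorithm is a stochastic subgradient step on the structured hinge loss, optionally combined with the Pegasos shrinkage and a projection of the weights back onto a ball. The following is a minimal, illustrative sketch of one such update, not the library's actual implementation; phi_true and phi_pred stand for the joint feature vectors of the true output and of the loss-augmented MAP prediction:

    import numpy as np

    def ssg_update(w, phi_true, phi_pred, eta, alpha=0.0, radius=1000.0):
        """Illustrative single SSG update; not pyconstruct's internal code."""
        # Subgradient of the structured hinge loss at the current weights:
        # phi(x, y_pred) - phi(x, y_true), with y_pred the loss-augmented MAP.
        grad = phi_pred - phi_true
        # Pegasos-style step: shrink w by the regularizer, then move against grad.
        w = (1.0 - eta * alpha) * w - eta * grad
        # Optional projection back onto an L2 ball of the given radius.
        norm = np.linalg.norm(w)
        if norm > radius:
            w *= radius / norm
        return w
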
Parameters:
  • domain (BaseDomain) – The domain of the data.
  • inference (str in ['map', 'loss_augmented_map']) – Which type of inference to perform when learning.
  • alpha (float) – The regularization coefficient.
  • train_loss (str in ['hinge', 'logistic', 'exponential']) – The training loss. The derivative of this loss is used to rescale the margin of the examples when making an update.
  • structured_loss (function (y, y) -> float) – The structured loss to compute on the objects.
  • eta0 (float) – The initial value of the learning rate.
  • power_t (float) – The power of the iteration index when using an invscaling learning_rate.
  • learning_rate (str in ['constant', 'optimal', 'invscaling']) – The learning rate strategy. The 'constant' strategy multiplies the updates by eta0; 'invscaling' divides the updates by the iteration number raised to power_t; the 'optimal' strategy finds the best rate depending on alpha and train_loss (similar to Scikit-learn’s SGDRegressor optimal learning rate). See the schedule sketch after this parameter list.
  • radius (float) – The radius of the ball enclosing the parameter space.
  • projection (None or str in ['l1', 'l2']) – If None, no projection is applied; if ‘l1’ or ‘l2’ is given, the weights are projected back onto an L1 or an L2 ball, respectively.
  • init_w (str in ['zeros', 'uniform', 'normal', 'laplace']) – Initialization strategy for the parameter vector. ‘zeros’ initializes the vector to all zero values; ‘uniform’, ‘normal’ and ‘laplace’ initialize the vector with random weights from, respectively, a uniform, normal, or Laplace distribution.
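
As a rough guide, the learning rate strategies can be thought of as the following step-size schedules (an illustrative sketch; the exact internal formulas, in particular the 'optimal' heuristic, may differ):

    def step_size(t, strategy='invscaling', eta0=1.0, power_t=0.5, alpha=0.0001):
        # Illustrative schedules named after the constructor arguments.
        if strategy == 'constant':
            return eta0
        if strategy == 'invscaling':
            return eta0 / (t ** power_t)
        if strategy == 'optimal':
            # Scikit-learn-style 'optimal' schedule: 1 / (alpha * (t0 + t)),
            # where t0 is a heuristic that also depends on the training loss.
            t0 = 1.0 / alpha
            return 1.0 / (alpha * (t0 + t))
        raise ValueError('unknown strategy: %s' % strategy)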

References

[1] Ratliff, Nathan D., J. Andrew Bagnell, and Martin A. Zinkevich. “(Online) Subgradient Methods for Structured Prediction.” Artificial Intelligence and Statistics (AISTATS), 2007.
[2] Shalev-Shwartz, Shai, et al. “Pegasos: Primal Estimated sub-GrAdient SOlver for SVM.” Mathematical Programming 127.1 (2011): 3-30.
[3] Duchi, John, Shai Shalev-Shwartz, Yoram Singer, and Tushar Chandra. “Efficient Projections onto the L1-Ball for Learning in High Dimensions.” International Conference on Machine Learning (ICML), 2008. http://www.cs.berkeley.edu/~jduchi/projects/DuchiSiShCh08.pdf
[4] Parker, Charles, Alan Fern, and Prasad Tadepalli. “Gradient Boosting for Sequence Alignment.” Proceedings of the 21st National Conference on Artificial Intelligence (AAAI), 2006.

Methods

decision_function(X, Y, **kwargs)
fit(X, Y, **kwargs) Fit a model with data (X, Y).
get_params([deep]) Get parameters for this estimator.
loss(X, Y, Y_pred, **kwargs)
partial_fit(X, Y[, Y_pred, Y_phi, Y_pred_phi]) Updates the current model with a mini-batch (X, Y).
phi(X, Y, **kwargs) Computes the feature vector for the given input and output objects.
predict(X, *args, **kwargs) Computes the prediction of the current model for the given input.
score(X, Y[, Y_pred]) Compute the score as the average loss over the examples.
set_params(**params) Set the parameters of this estimator.
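
A typical training loop is sketched below, under the assumption that a pyconstruct domain and the structured datasets X_train, Y_train, X_test, Y_test have already been built elsewhere (these names are placeholders, not part of the API):

    from pyconstruct.learners import SSG

    learner = SSG(
        domain=domain,                    # a previously constructed BaseDomain
        inference='loss_augmented_map',
        alpha=0.0001,                     # alpha > 0 enables the Pegasos update
        learning_rate='invscaling',
        eta0=1.0,
        power_t=0.5,
        projection='l2',
        radius=1000.0,
    )

    learner.fit(X_train, Y_train)         # batch training
    Y_pred = learner.predict(X_test)      # inference with the learned weights
    print(learner.score(X_test, Y_test, Y_pred=Y_pred))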