pyconstruct.learners.SSG
class pyconstruct.learners.SSG(domain=None, model=None, *, inference='loss_augmented_map', eta0=1.0, power_t=0.5, learning_rate='optimal', radius=1000.0, projection='l2', init_w='normal', **kwargs)

Learner implementing the standard subgradient algorithm.
This learner performs standard stochastic subgradient descent as described in [1]. It also includes options for the following (a usage sketch follows this list):
- Training with the Pegasos update scheme [2]; simply set alpha greater than zero to regularize the model.
- Projecting the weights onto an L2 or an L1 ball of a given radius, the latter using the projection algorithm from [3].
- Boosting the model with the method from [4]; simply use a training loss other than hinge.
- Adaptive step size techniques and such (coming soon).
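The sketch below shows one way to put these options together. It is illustrative only: the domain construction and the data X and Y are placeholders, not part of this reference.

    from pyconstruct.learners import SSG

    # `domain` must be a BaseDomain instance; how it is built depends on
    # your problem definition and is not covered here.
    domain = ...

    learner = SSG(
        domain=domain,
        inference='loss_augmented_map',  # loss-augmented MAP while learning
        alpha=0.01,                      # alpha > 0 enables the Pegasos update [2]
        learning_rate='optimal',
        projection='l2',                 # project the weights back onto an L2 ball
        radius=1000.0,
    )

    learner.fit(X, Y)            # X, Y: structured training inputs and outputs
    Y_pred = learner.predict(X)  # MAP prediction with the learned weights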
Parameters:
- domain (BaseDomain) – The domain of the data.
- inference (str in ['map', 'loss_augmented_map']) – Which type of inference to perform when learning.
- alpha (float) – The regularization coefficient.
- train_loss (str in ['hinge', 'logistic', 'exponential']) – The training loss. The derivative of this loss is used to rescale the margin of the examples when making an update.
- structured_loss (function (y, y) -> float) – The structured loss to compute on the objects.
- eta0 (float) – The initial value of the learning rate.
- power_t (float) – The power of the iteration index when using an invscaling learning_rate.
- learning_rate (str in ['constant', 'optimal', 'invscaling']) – The learning-rate strategy. The constant strategy multiplies the updates by eta0; invscaling divides the updates by the iteration number raised to the power_t; the optimal strategy finds the best rate depending on alpha and train_loss (similar to Scikit-learn’s SGDRegressor optimal learning rate). A sketch of these schedules follows the parameter list.
- radius (float) – The radius of the ball enclosing the parameter space.
- projection (None or str in ['l1', 'l2']) – If None, no projection is applied; if ‘l1’ or ‘l2’ is given, the weights are projected back onto an L1 or an L2 ball respectively.
- init_w (str in ['zeros', 'uniform', 'normal', 'laplace']) – Initialization strategy for the parameter vector. ‘zeros’ initializes the vector to all zero values; ‘uniform’, ‘normal’ and ‘laplace’ initialize the vector with random weights from, respectively, a uniform, normal, or Laplace distribution.
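To make the learning_rate, radius and projection options concrete, the following NumPy sketch spells out the formulas these descriptions imply. It is not pyconstruct’s internal code, and the exact ‘optimal’ schedule used by the library may differ.

    import numpy as np

    def step_size(t, learning_rate='optimal', eta0=1.0, power_t=0.5, alpha=0.01):
        """Step size at iteration t >= 1 for the three strategies above."""
        if learning_rate == 'constant':
            return eta0
        if learning_rate == 'invscaling':
            return eta0 / (t ** power_t)
        # 'optimal': a 1 / (alpha * t) schedule is the usual choice for a
        # regularized hinge loss (as in Pegasos [2]).
        return 1.0 / (alpha * t)

    def project(w, projection='l2', radius=1000.0):
        """Project the weights back onto an L2 or L1 ball of the given radius."""
        if projection == 'l2':
            norm = np.linalg.norm(w)
            return w if norm <= radius else w * (radius / norm)
        if projection == 'l1':
            # Euclidean projection onto the L1 ball, following Duchi et al. [3].
            if np.abs(w).sum() <= radius:
                return w
            u = np.sort(np.abs(w))[::-1]            # sorted magnitudes, descending
            css = np.cumsum(u)
            rho = np.nonzero(u * np.arange(1, len(u) + 1) > css - radius)[0][-1]
            theta = (css[rho] - radius) / (rho + 1.0)
            return np.sign(w) * np.maximum(np.abs(w) - theta, 0.0)
        return w  # projection=None: leave the weights untouched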
References
[1] Ratliff, Nathan D., J. Andrew Bagnell, and Martin A. Zinkevich. “(Online) Subgradient Methods for Structured Prediction.” Artificial Intelligence and Statistics, 2007.
[2] Shalev-Shwartz, Shai, et al. “Pegasos: Primal estimated sub-gradient solver for SVM.” Mathematical Programming 127.1 (2011): 3-30.
[3] Duchi, John, Shai Shalev-Shwartz, Yoram Singer, and Tushar Chandra. “Efficient Projections onto the L1-Ball for Learning in High Dimensions.” International Conference on Machine Learning (ICML 2008). http://www.cs.berkeley.edu/~jduchi/projects/DuchiSiShCh08.pdf
[4] Parker, Charles, Alan Fern, and Prasad Tadepalli. “Gradient boosting for sequence alignment.” Proceedings of the 21st National Conference on Artificial Intelligence, Volume 1. AAAI Press, 2006.

Methods
- decision_function(X, Y, **kwargs)
- fit(X, Y, **kwargs) – Fit a model with data (X, Y).
- get_params([deep]) – Get parameters for this estimator.
- loss(X, Y, Y_pred, **kwargs)
- partial_fit(X, Y[, Y_pred, Y_phi, Y_pred_phi]) – Updates the current model with a mini-batch (X, Y).
- phi(X, Y, **kwargs) – Computes the feature vector for the given input and output objects.
- predict(X, *args, **kwargs) – Computes the prediction of the current model for the given input.
- score(X, Y[, Y_pred]) – Compute the score as the average loss over the examples.
- set_params(**params) – Set the parameters of this estimator.
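For online training, partial_fit can be called repeatedly on mini-batches. The sketch below is illustrative; minibatches is a placeholder iterator, not part of pyconstruct.

    # Continue training the learner built above, one mini-batch at a time.
    for X_batch, Y_batch in minibatches(X, Y, batch_size=32):
        learner.partial_fit(X_batch, Y_batch)

    # Average structured loss of the current model over held-out examples.
    print(learner.score(X_test, Y_test))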