On expected error of randomized Nyström Kernel Regression

Aleksandar Trokicic, Branimir Todorovic


Kernel methods are a class of machine learning algorithms that learn and discover patterns in a high (possibly infinite) dimensional feature space obtained by an often nonlinear, possibly infinite-dimensional mapping of the input space. A major problem with kernel methods is their time complexity. For a data set with n input points, the time complexity of a kernel method is O(n³), which is intractable for large data sets. The random Nyström features method is an approximation that reduces the time complexity to O(nl² + l³), where l is the number of randomly selected input points. The O(l³) term comes from the spectral decomposition that must be performed on an l×l Gram matrix; if l is large, even an approximate algorithm for this step is time consuming. In this paper we apply the randomized SVD method in place of the spectral decomposition and further reduce the time complexity. The inputs to the randomized SVD algorithm are the l×l Gram matrix and a number m < l. The resulting time complexity is O(nm² + l²m + m³), and linear regression is performed on m-dimensional random features. We prove that the expected error of a predictor learned via this method is almost the same as the error of the exact kernel predictor. Additionally, we show empirically that this predictor outperforms one obtained with the Nyström method alone.
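The pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes an RBF kernel, uniform landmark sampling, and a basic Halko-style randomized SVD; all function names and parameter values here are hypothetical choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_kernel(X, Y, gamma=0.5):
    # Gaussian (RBF) kernel matrix between the rows of X and Y.
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * sq)

def randomized_svd(A, m, oversample=10):
    # Basic randomized SVD: sketch the range of A with a Gaussian
    # test matrix, then do an exact SVD of the small projected matrix.
    G = rng.standard_normal((A.shape[1], m + oversample))
    Q, _ = np.linalg.qr(A @ G)
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return Q @ U_small[:, :m], s[:m], Vt[:m]

def nystrom_rsvd_features(X, l, m, gamma=0.5):
    # Sample l landmark points, form the l x l Gram matrix W, and use a
    # rank-m randomized SVD of W instead of a full spectral decomposition.
    idx = rng.choice(len(X), size=l, replace=False)
    landmarks = X[idx]
    W = rbf_kernel(landmarks, landmarks, gamma)
    U, s, _ = randomized_svd(W, m)
    # Map all n points to m-dimensional features: K_nl @ U @ diag(s)^{-1/2}.
    K_nl = rbf_kernel(X, landmarks, gamma)
    return K_nl @ U / np.sqrt(s)

# Linear ridge regression on the m-dimensional random features.
X = rng.standard_normal((500, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(500)
Z = nystrom_rsvd_features(X, l=100, m=20)
w = np.linalg.solve(Z.T @ Z + 1e-3 * np.eye(Z.shape[1]), Z.T @ y)
pred = Z @ w
```

The final linear solve costs O(nm²), which together with the O(l²m + m³) randomized SVD matches the stated overall complexity.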
