Intro. to machine learning (CSI 5325)
Lecture 19: Instance-based learning

Greg Hamerly
Some content from Tom Mitchell.

Outline

1 Advanced topics in instance-based learning

PCA (2)

PCA is a linear transformation of the data into a new basis. It assumes that some orthogonal vector basis of the original space captures the variance of the data better than the original coordinates do. In the transformed data, the first dimension has the largest variance, the second dimension the second-largest variance, and so on. Equivalently, PCA finds a new set of orthogonal basis vectors in the original space that point in the directions of maximum variance.

PCA (3)

PCA is based on finding the eigenvectors and eigenvalues of the covariance matrix of the data:

  [v, lambda] = eig(cov(x));

Matlab makes it even easier than this (princomp has since been superseded by pca in newer Matlab releases):

  [v, z, lambda] = princomp(x);

If we find the principal components and keep only the top few, we have reduced the dimension of our data while retaining the majority of the information (i.e. the variance).

Random linear projections

Basic idea:
- choose k random vectors in the original space, where k ≪ d
- linearly project all the data onto the k new vectors
- learn in this lower-dimensional space

Even though it sounds crazy, this approach has amazing properties.

Johnson–Lindenstrauss lemma: if we project n points into k = O(log n) dimensions (the constant depends on how much distortion we tolerate), then with high probability all O(n²) inter-point distances remain nearly the same.
- note: no dependence on the original dimension d!

Multidimensional scaling

Given a set of pairwise distances (not the original points), find a set of vectors which preserve those distances.

Given: d(x_i, x_j), the pairwise distances for n points (1 ≤ i, j ≤ n).

Find the minimum of

  S(y_1, y_2, ..., y_n) = ∑_{i≠j} (‖y_i − y_j‖ − d(x_i, x_j))²

where the y_i are vectors in some space of a chosen dimension.
- we want sufficient dimension for the y_i, but not too high!
- method: minimize S by gradient descent on the y_i vectors.

Nonlinear manifold learning

Basic idea: what if the data lives in some non-linear subspace?

[figure: the 'swiss roll' dataset]

Nonlinear manifold learning (2)

Common approaches (ISOMAP, etc.):
- construct a nearest-neighbor graph of all n points
- use this graph to estimate pairwise geodesic distances (e.g. using shortest-path length)
- use MDS to obtain a new representation in a vector space

Matlab sketches of each of these techniques (PCA, random projection, gradient-descent MDS, and an ISOMAP-style pipeline) follow.
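To make the PCA recipe concrete, here is a minimal Matlab sketch of the full pipeline: center the data, eigendecompose the covariance, sort the components by variance, and project. The data matrix X, its layout (one point per row), and the target dimension p are assumptions for illustration, not from the slides.

  % Minimal PCA sketch: X is an assumed n-by-d data matrix, one point per row.
  Xc = X - mean(X, 1);                  % center (implicit expansion, R2016b+)
  [V, D] = eig(cov(Xc));                % eigenvectors/values of the covariance
  [vars, order] = sort(diag(D), 'descend');
  V = V(:, order);                      % components sorted by variance
  p = 2;                                % assumed number of components to keep
  Z = Xc * V(:, 1:p);                   % n-by-p reduced representation
  % fraction of variance retained: sum(vars(1:p)) / sum(vars)

In newer Matlab, [V, Z, vars] = pca(X) (Statistics and Machine Learning Toolbox) does the centering, sorting, and projection in one call.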
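A random linear projection needs only a few lines. The sketch below uses assumed sizes n, d, k and Gaussian stand-in data; none of these names come from the slides.

  % Random linear projection sketch; n, d, k and the data are assumed.
  n = 1000; d = 500; k = 50;          % k is much smaller than d
  X = randn(n, d);                    % stand-in data, one point per row
  R = randn(d, k) / sqrt(k);          % k random directions, scaled by 1/sqrt(k)
  Y = X * R;                          % the n-by-k projected data
  % By Johnson-Lindenstrauss, norm(Y(i,:) - Y(j,:)) is close to
  % norm(X(i,:) - X(j,:)) for all pairs (i, j) with high probability.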
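The MDS objective S can be minimized directly by gradient descent, as the slides suggest. The sketch below assumes a symmetric n-by-n matrix D of target distances, an embedding dimension p, a fixed step size eta, and a fixed iteration count; all of these are illustrative choices.

  % Gradient-descent MDS sketch; D, p, eta, and the loop bound are assumed.
  n = size(D, 1); p = 2; eta = 1e-3;
  Y = randn(n, p);                        % random initial embedding
  for iter = 1:500
      G = zeros(n, p);                    % gradient of S w.r.t. each y_i
      for i = 1:n-1
          for j = i+1:n
              diff = Y(i,:) - Y(j,:);
              dij  = max(norm(diff), eps);    % avoid dividing by zero
              % each unordered pair appears twice in the sum over i ~= j,
              % so its total gradient contribution carries a factor of 4
              g = 4 * (dij - D(i,j)) * diff / dij;
              G(i,:) = G(i,:) + g;
              G(j,:) = G(j,:) - g;
          end
      end
      Y = Y - eta * G;                    % gradient step on the y_i vectors
  end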
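Finally, an ISOMAP-style pipeline in the same spirit: build a k-nearest-neighbor graph, approximate geodesic distances by shortest paths, then hand the result to MDS. The data matrix X and neighbor count k are assumed; pdist and squareform come from the Statistics Toolbox, and the Floyd–Warshall step relies on implicit expansion (Matlab R2016b or later).

  % ISOMAP-style geodesic distance sketch; X (n-by-d) and k are assumed.
  n = size(X, 1); k = 10;
  E = squareform(pdist(X));           % all pairwise Euclidean distances
  [~, idx] = sort(E, 2);              % each row: neighbors by distance
  G = inf(n);                         % k-NN graph; inf means "no edge"
  G(1:n+1:end) = 0;                   % zero distance from a point to itself
  for i = 1:n
      nbrs = idx(i, 2:k+1);           % k nearest neighbors, skipping self
      G(i, nbrs) = E(i, nbrs);
      G(nbrs, i) = E(i, nbrs)';       % keep the graph symmetric
  end
  % Floyd-Warshall: shortest-path lengths approximate geodesic distance
  for m = 1:n
      G = min(G, G(:, m) + G(m, :));
  end
  % G now holds estimated geodesic distances (inf if the graph is
  % disconnected); feed it to MDS, e.g. the gradient-descent sketch above.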