Subhabrata Bhattacharya
MAP 6398
PROJECT 04

1. Theory

The method of diffusion maps attempts to provide a parametrization of an abstract mathematical space that has a complex global structure but in which each data-point has a Euclidean neighborhood associated with it. In other words, diffusion maps are used to establish relationships between the members of a multidimensional data-set by fusing several dimensions into fewer. This is achieved through a discrete approximation of the Laplace-Beltrami operator. The role of the kernel function used in this method, Wij = exp(-||xi - xj||^2 / epsilon), is to quantify how strongly each member of a data-set is connected to the others in the same data-set.

2. Application

We are required to perform two numerical experiments using diffusion maps. In both experiments we detect the true dimension of a given data-set. For both experiments we synthetically generate the data-sets as directed.

(a) In this experiment we generate a 3 x 1000 data-set whose data-points vary linearly. A normally distributed noise term is then added to the data-points. We then apply the diffusion map algorithm to find the six most important eigenvectors, and we compare this representation with the significant eigenvectors obtained by Principal Component Analysis (PCA). We select the second smallest eigenvalue of the graph Laplacian (equivalently, the second largest eigenvalue of the normalized kernel matrix) and its corresponding eigenvector, because the smallest eigenvalue belongs to the trivial constant representation. This representation tallies with the one derived from the largest eigenvalue of PCA. We plot both results.

Listing 1: Matlab script for experiment 1

clear all; close all; clc;

% Construct X: n points along the line T*A + B in R^3
n = 1000;
seed = 10;
A = [3; 2; 1.4];
B = [0; 4; 2];

T = seed * rand(1, n);
X = zeros(3, n);
for icnt = 1:n
    X(:, icnt) = T(icnt) .* A + B;
end
% Add the Gaussian noise term
X = X + randn(3, n);

% Plot the raw data, uncolored and colored by a random ordering
colorlist = sort(rand(1, n), 'descend');
figure;
subplot(1, 2, 1);
scatter3(X(1, :)', X(2, :)', X(3, :)', 10);
xlabel('x');
ylabel('y');
zlabel('z');

subplot(1, 2, 2);
scatter3(X(1, :)', X(2, :)', X(3, :)', 10, colorlist);
xlabel('x');
ylabel('y');
zlabel('z');
colorbar;

% X is our data-set, generated by a linear equation with additive Gaussian
% noise; X is a 3 x 1000 matrix. Our job is to represent X -> Y such that
% Y is 3 x m where m << 1000. This is divided into the following steps.

% Treat X as a graph of 1000 nodes. We create a weighted adjacency matrix
% W for X whose entries are the edge weights between the nodes of X:
% Wij = exp(-||xi - xj||^2 / epsilon).
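% The scale parameter epsilon must be estimated from the data. (A common
% alternative heuristic, not used in this script: take epsilon as the
% median of all squared pairwise distances, e.g.
%   epsilon = median(pdist(X') .^ 2);  % pdist is in the Statistics Toolbox
% Any such choice only sets the length scale of the kernel.)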
% Here we select epsilon as the mean, over all points, of the smallest
% nonzero squared distance to any other point.
W = zeros(n, n);
sumW = 0;
for icnt = 1:n
    for jcnt = 1:n
        % squared Euclidean distance between points i and j
        W(icnt, jcnt) = norm(X(:, icnt) - X(:, jcnt), 2)^2;
    end
    sumW = sumW + min(nonzeros(W(icnt, :)));
end
epsilon = sumW / n;
W = exp(-W / epsilon);

% Now we create the normalization matrix D and the symmetrically
% normalized kernel S. D is a diagonal matrix whose i-th diagonal element
% is the sum of the i-th row of W.
v = ones(n, 1);
D = diag(W * v);
%L = D - W;   % unnormalized graph Laplacian (not needed below)
Dinvsqrt = diag(1 ./ sqrt(diag(D)));
S = Dinvsqrt * W * Dinvsqrt;

% Compute the eigenvalues and eigenvectors of the generalized eigenvalue
% problem L*y = lambda*D*y (y being the column vectors of Y). We need the
% 6 most important eigenvectors, i.e. those of the 6 largest eigenvalues
% of S, which correspond to the 6 smallest eigenvalues of the normalized
% Laplacian I - S.
k = 6;
[Evect, Eval] = eigs(S, k);
Evect = Dinvsqrt * Evect;
Evect = Evect ./ Evect(1, 1);   % scale so the trivial eigenvector is all ones

% Apply a PCA algorithm to find the six largest eigenvectors of the
% (column-centered) kernel matrix.
data = S;
mData = mean(data, 1);
for icnt = 1:size(data, 1)
    data(icnt, :) = data(icnt, :) - mData;
end
[pEvect, pSval] = svd(data, 'econ');   % pSval holds the singular values (unused)
pEvect = pEvect ./ pEvect(1, 1);

% Plot the points colored by the 2nd diffusion-map eigenvector (the 1st
% is the trivial constant one).
cldiff = Evect(:, 2);
figure;
subplot(1, 2, 1);
scatter3(X(1, :)', X(2, :)', X(3, :)', 10, cldiff);
xlabel('x');
ylabel('y');
zlabel('z');
colorbar;

% Plot the points colored by the largest eigenvector obtained from PCA.
clpca = pEvect(:, 1);

subplot(1, 2, 2);
scatter3(X(1, :)', X(2, :)', X(3, :)', 10, clpca);
xlabel('x');
ylabel('y');
zlabel('z');
colorbar;

(b) Here we generate a 3 x 1000 data-set whose data-points vary non-linearly, using sinusoidal functions. A normally distributed noise term is again added to the data-points. With methods similar to those of problem (a), we compare the representations.
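The preview ends before the script for experiment 2, so the exact sinusoidal parametrization it uses is not reproduced here. The following is a minimal sketch of the data-generation step only, assuming a helix-like curve; cos t, sin t, the 0.2 scale, and the noise level are illustrative choices, not the original ones. The rest of the pipeline in Listing 1, from the kernel construction through the eigendecomposition and the PCA comparison, applies unchanged.

% Sketch: a possible non-linear, sinusoidally varying data-set for
% experiment 2. The parametrization below is an assumption; the original
% script's exact functions are not included in this preview.
n = 1000;
T = 4 * pi * rand(1, n);           % parameter along the curve
X = [cos(T); sin(T); 0.2 * T];     % hypothetical helix in R^3
X = X + 0.05 * randn(3, n);        % additive Gaussian noise term
% From here, reuse the steps of Listing 1: build W from squared pairwise
% distances, form S, take its leading eigenvectors with eigs, and compare
% with PCA.

Since the curve is one-dimensional but embedded non-linearly in R^3, the second diffusion-map eigenvector should again recover the curve parameter, whereas a single linear PCA direction generally cannot.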