Support Vector Machine 2 - Computer Sciences Applications - Project Report

This project report is part of a degree completion in computer science at Ambedkar University, Delhi. Its main topics are support vector machines, pattern recognition, regression, classification, the ε-insensitive loss function, and predictive modeling.

Section 1
SUPPORT VECTOR MACHINE

1.1 Introduction

There are many pattern recognition problems that can be solved using artificial neural networks (ANNs). Pattern classification and regression are two major problems in the field of pattern recognition [1]. The task of a classifier is to assign each input pattern to a particular class. The classification task is not simple, because perfect classification performance is usually not attainable. Regression is used to find a functional description of the data with the aim of predicting values for new patterns. Traditional approaches to regression develop mathematical models that approximate an unknown function of one or many variables. As discussed in the previous report, ANNs have a natural ability to solve regression problems, but they also have several disadvantages. A major one is limited generalization capability: a machine learning technique is said to generalize well if it performs on novel data samples about as well as it does on the training samples. Another disadvantage is that the backpropagation algorithm can become trapped in local minima, which leads to improper updating of the synaptic weights and hence improper learning.

The Support Vector Machine (SVM) is another machine learning technique that can solve both pattern classification and regression problems. In support vector classification (SVC), the input vectors are first mapped into a feature space of higher dimensionality through the choice of a kernel function. In that feature space, a hyperplane (or a set of hyperplanes) is constructed to separate the data into two or more classes. In support vector regression (SVR), the aim is to find the functional form of the data so that it can be used to predict values for samples never presented to the SVR before. The technique has its roots in statistical learning theory [2]. It has two advantages over ANNs: first, better generalization capability, and second, no problem of trapping in local minima. Like an ANN, an SVM is first trained on training data samples and then tested on data samples not present in the training dataset.

1.2 Support Vector Machine for Nonlinear Regression

Support vector machines can also be applied to both linear and nonlinear function approximation problems [3, 4]. An important concept in support vector regression is the ε-insensitive loss function proposed by Vapnik [5, 6].

ε-Insensitive Loss Function

The concept of a loss function is not new in optimization theory; various types of loss functions have been proposed by different researchers. For example, the quadratic loss function has been used in the optimization of radial-basis function networks [7]. The main reason for using loss functions is computational convenience. ANNs usually employ least-squares estimators. The problem with a least-squares estimator is that it performs poorly in the presence of outliers; in addition, the performance of the model degrades when the underlying distribution of the additive noise has a long tail. In other words, it is sensitive to small changes in the model. To overcome this problem, a robust estimator that is insensitive to small changes in the model is needed.
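To see concretely why a quadratic loss is fragile in the presence of outliers, the following minimal Python/NumPy sketch (with invented residual values, not data from this report) compares the contribution of a single large residual under a quadratic loss and under a loss that grows only linearly in the tail:

    import numpy as np

    # Invented residuals (d - y) from some fitted model; the last one is an outlier.
    residuals = np.array([0.10, -0.20, 0.05, 0.15, 5.00])

    # Quadratic (least-squares) loss: the single outlier dominates the total cost.
    quadratic = 0.5 * residuals ** 2
    print(quadratic.sum(), quadratic[-1] / quadratic.sum())   # outlier contributes ~99% of the loss

    # A loss that grows only linearly for large residuals bounds the outlier's influence.
    linear_tail = np.abs(residuals)
    print(linear_tail.sum(), linear_tail[-1] / linear_tail.sum())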
To achieve this goal, Huber [8, 9] proposed a loss function that can be used as a robust estimator. The Huber loss function has the form

    L(d, y) = |d - y|    (1.1)

where d is the desired response and y is the estimator output. Vapnik [5, 6] proposed an extension of the Huber loss function, called the ε-insensitive loss function, for use in the construction of support vector machines for approximating the desired output. The ε-insensitive loss function has the form

    L_\epsilon(d, y) = \begin{cases} |d - y| - \epsilon, & \text{for } |d - y| \ge \epsilon \\ 0, & \text{otherwise} \end{cases}    (1.2)

[...]

    - \sum_{i=1}^{N} (\gamma_i \xi_i + \gamma_i' \xi_i')

where the \gamma_i and \gamma_i' are the Lagrange multipliers. The last term on the right-hand side of Eq. 1.12, containing \gamma_i and \gamma_i', is included to ensure that the optimality constraints on the Lagrange multipliers \alpha_i and \alpha_i' assume variable forms. It is required to minimize J(w, \xi, \xi', \alpha, \alpha', \gamma, \gamma') with respect to the weight vector w and the slack variables \xi_i and \xi_i'; it must also be maximized with respect to \alpha_i and \alpha_i', and with respect to \gamma_i and \gamma_i'. Carrying out this optimization yields, respectively:

    w = \sum_{i=1}^{N} (\alpha_i - \alpha_i') \varphi(x_i)    (1.13)

    \gamma_i = C - \alpha_i    (1.14)

and

    \gamma_i' = C - \alpha_i'    (1.15)

The optimization of J(w, \xi, \xi', \alpha, \alpha', \gamma, \gamma') is the primal problem for regression. We can formulate the corresponding dual problem by substituting Eqs. 1.13-1.15 into Eq. 1.12. After simplification, this gives the convex functional

    Q(\alpha, \alpha') = \sum_{i=1}^{N} d_i (\alpha_i - \alpha_i') - \epsilon \sum_{i=1}^{N} (\alpha_i + \alpha_i') - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} (\alpha_i - \alpha_i')(\alpha_j - \alpha_j') K(x_i, x_j)    (1.16)

where K(x_i, x_j) is the inner-product kernel defined in accordance with Mercer's theorem [7]:

    K(x_i, x_j) = \varphi^{T}(x_i) \varphi(x_j)

Thus, the solution to our constrained optimization problem is obtained by maximizing Q(\alpha, \alpha') with respect to the Lagrange multipliers \alpha_i and \alpha_i', subject to a new set of constraints that incorporate the constant C present in Eq. 1.11. The dual problem for nonlinear regression using a support vector machine can now be stated as follows.

Given the training sample {(x_i, d_i)}_{i=1}^{N}, find the Lagrange multipliers {\alpha_i}_{i=1}^{N} and {\alpha_i'}_{i=1}^{N} that maximize the objective function

    Q(\alpha, \alpha') = \sum_{i=1}^{N} d_i (\alpha_i - \alpha_i') - \epsilon \sum_{i=1}^{N} (\alpha_i + \alpha_i') - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} (\alpha_i - \alpha_i')(\alpha_j - \alpha_j') K(x_i, x_j)

subject to the following constraints:

(1) \sum_{i=1}^{N} (\alpha_i - \alpha_i') = 0

(2) 0 \le \alpha_i \le C,   i = 1, 2, ..., N
    0 \le \alpha_i' \le C,  i = 1, 2, ..., N

where C is a user-specified parameter. Constraint (1) arises from optimization of the Lagrangian with respect to the bias b = w_0 for \varphi_0(x) = 1. Thus, having obtained the optimum values of \alpha_i and \alpha_i', Eq. 1.13 can be used to determine the optimum weight vector w for a prescribed map \varphi(x). Note that only some of the coefficients in the expansion of Eq. 1.13 are nonzero; the data points for which \alpha_i \ne \alpha_i' define the support vectors of the regression machine. The two parameters ε and C are free parameters that control the VC dimension of the approximating function

    F(x, w) = w^{T} \varphi(x) = \sum_{i=1}^{N} (\alpha_i - \alpha_i') K(x, x_i)    (1.17)

Both ε and C must be selected by the user; they serve as complexity-control parameters. The control of complexity is a difficult problem for regression for the following two reasons:

1. The parameters ε and C must be tuned simultaneously.
2. Regression is intrinsically a difficult problem.

The selection of ε and C is still an open research area [7].
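In practice the dual problem above is rarely solved by hand; off-the-shelf SVR implementations expose ε and C directly as hyperparameters. The following minimal sketch uses scikit-learn's SVR with an RBF kernel on an invented toy dataset, purely to make the roles of ε (the width of the insensitive tube) and C (the penalty on points lying outside the tube) concrete; it is not part of this report's methodology:

    import numpy as np
    from sklearn.svm import SVR

    # Invented 1-D toy regression problem for illustration only.
    rng = np.random.default_rng(0)
    X = np.sort(rng.uniform(-3.0, 3.0, size=(80, 1)), axis=0)
    y = np.sinc(X).ravel() + rng.normal(scale=0.1, size=80)

    # epsilon sets the insensitive-tube width; C penalizes deviations outside the tube.
    # Both must be chosen together, e.g. by cross-validation.
    for C in (0.1, 1.0, 10.0):
        for epsilon in (0.01, 0.1, 0.5):
            model = SVR(kernel="rbf", C=C, epsilon=epsilon).fit(X, y)
            n_sv = len(model.support_)  # points whose alpha_i differs from alpha_i'
            print(f"C={C:5.1f}  epsilon={epsilon:4.2f}  support vectors={n_sv}")

As a rule of thumb, a larger ε widens the tube and typically leaves fewer support vectors, while a larger C fits the training data more tightly.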
A support vector regression machine may be implemented using kernel functions; a list of a few commonly used kernel functions is given in Table 1.1.

Table 1.1: Summary of Inner Product Kernels

SVM type                       | Inner product kernel K(x, x_i)          | Comments
Polynomial learning machine    | (x^T x_i + 1)^p                         | The power p is specified a priori by the user.
Radial-basis function network  | \exp(-\|x - x_i\|^2 / (2\sigma^2))      | The width \sigma^2 is specified by the user.

1.3 Applications of Support Vector Machine

Research on support vector machines grows day by day, and its applications play an important role in daily life [10]. The main applications of SVM include:

- Support vector machine-based generalized predictive control
- Dynamic reconstruction of chaotic systems from inter-spike intervals using least-squares support vector machines
- SVM for geo- and environmental sciences
- SVM for protein fold and remote homology detection
- Content-based image retrieval
- Data classification using SVM
- DTREG - SVM and decision tree predictive modeling

[...]

Figure 2.1: ANN used for lattice constant prediction

After the construction of the neural network shown in Figure 2.1, it is trained on the training dataset using the Levenberg-Marquardt backpropagation algorithm. Once the neural network is trained, it is tested on the test dataset. The performance measure used by Chonghe is the percentage of absolute difference. Three such ANN models were created, one for each of the lattice constants a, b and c. The input pattern is the same for all three models; they differ only in the target values used during the training phase.

2.2 Support Vector Machine (SVM)

The proposed methodology consists of three major modules: a dataset generation module, a Support Vector Regression (SVR) training module, and an SVR testing module. The functional block diagram of the proposed methodology is shown in Figure 2.2. Three separate models are developed for the prediction of the lattice constants a, b and c.

Figure 2.2: Block diagram of the proposed methodology

The Percentage of Absolute Difference (PAD) is the performance measure used to evaluate the predictive accuracy and is defined as

    PAD(\%) = \frac{|experimental - predicted|}{experimental} \times 100

Section 3
CLASSIFICATION RESULTS OF ANN

3.1 ANN Methodology

In our implementation, we used a backpropagation neural network for the prediction of the lattice constants of perovskite materials, with one hidden layer of four neurons.

3.1.1 Training

The total dataset comprises 98 samples. It was divided into two parts: 73 instances for training and 25 instances for cross-validation. The training parameters are as follows:

- Number of hidden layers: 1
- Activation function: logsig
- Learning function: purelin
- Performance goal: 0.01
- Maximum epochs: 100

The training graph of the neural network's performance at each epoch is shown in Figure 3.1.

[...]

Figure 3.3: Individual RMSE values for testing instances

The individual PAD values for the testing instances are shown in Figure 3.4.

Figure 3.4: Individual PAD values for testing instances
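For reference, the individual PAD values reported in Figure 3.4 are computed per test instance as in the following minimal sketch; the experimental and predicted lattice constants below are invented placeholders, not values from this report:

    import numpy as np

    # Invented experimental and predicted lattice constants (in angstroms) for a few test instances.
    experimental = np.array([3.905, 4.212, 3.789, 4.001])
    predicted    = np.array([3.921, 4.188, 3.810, 3.970])

    # Percentage of Absolute Difference (PAD), as defined in Section 2.2, for each instance.
    pad = np.abs(experimental - predicted) / experimental * 100.0
    print(np.round(pad, 3))         # individual PAD values
    print(round(pad.mean(), 3))     # mean PAD over the test set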
REFERENCES

[1] Richard O. Duda, Peter E. Hart, and David G. Stork, "Pattern Classification", New York, (2001)
[2] S. R. Gunn, "Support Vector Machines for Classification and Regression", (1998)
[3] A. J. Smola, "Regression estimation with support vector learning machines", Master's thesis, Technische Universität München, (1996)
[4] V. Vapnik, S. Golowich, and A. Smola, "Support vector method for function approximation, regression estimation, and signal processing", in M. Mozer, M. Jordan, and T. Petsche, editors, "Advances in Neural Information Processing Systems 9", pages 281-287, Cambridge, MA, (1997)
[5] V. Vapnik, "The Nature of Statistical Learning Theory", Springer-Verlag, New York, (1995)
[6] V. Vapnik, "Statistical Learning Theory", Springer, New York, (1998)
[7] S. Haykin, "Neural Networks: A Comprehensive Foundation", 2nd ed., Pearson Education, Canada, (1999)
[8] P. J. Huber, "Robust estimation of a location parameter", Annals of Mathematical Statistics, vol. 35, pp. 73-101, (1964)
[9] P. J. Huber, "Robust Statistics", New York: Wiley, (1981)
[10] URL: http://www.clopinet.com/isabelle/Projects/SVM/applist.html