Department of Computer Science, University of Leicester
"Learning Program Models by Observing Their Behavior". Submitted by ... Leicester, UK.
Academic year 2021/2022.

Following our BMI example, the model inferred during the first iteration is fed with the misclassifications obtained during testing and is then retrained. The iterative improvement of the inferred model is shown in Fig. 13. The size of the tree clearly grows as the iterations progress. The number in brackets at the end of each leaf is the number of instances classified in that class.

At iteration 2 the inferred tree changes significantly from the one obtained during iteration 1: the training sample now contains enough information for the learner to distinguish the Normal and Underweight categories. This information came from the testing sample that was used to retrain the model.

Figure 13. Iterative Improvement of the BMI example

During iteration 4, no contradictions were predicted when testing the model with independent test samples; at that point the model was close to a perfect representation of the subject system. The tree structure did not change during iteration 5 because the model inferred nothing new from the training sample.

The BMI example is a comparatively small system, so the number of iterations needed to infer a perfect model was low. The case studies (Chapter 5) deal with larger and more complex systems, where we will see how the inferred model depends on the number of iterations taken. The ILUSTRATOR technique proved to be an effective learning process, allowing the model to learn from its own mistakes.

A learning-curve graph (Fig. 14) was plotted for the BMI example to show how the inference process progressed, with accuracy plotted against the number of test samples used.

Figure 14. Learning Curve for the BMI example

The graph is a classic S-curve of the kind commonly seen in machine learning.
It is evident that the learning process improved consistently as the number of test samples increased. The graph was plotted without any statistical curve-fitting algorithm. At a certain point the learning curve reached its peak threshold (maximum learning) and remained stable thereafter, irrespective of the number of test samples supplied. A number of other pieces of information are obtained while generating and testing the model using WEKA; these are discussed in detail in the case study in Chapter 5. In the next chapter we will see how other similar methodologies, namely MELBA and category partitioning, compare to the ILUSTRATOR technique.

    Options: -C 0.25 -M 2
    J48 pruned tree
    ------------------

    personal <= 5416
    |   age <= 64
    |   |   married = true: H (77.0)
    |   |   married = false: L (119.0)
    |   age > 64
    |   |   age <= 74
    |   |   |   married = true: V (10.0)
    |   |   |   married = false: P (14.0)
    |   |   age > 74: T (53.0)
    personal > 5416
    |   blind = true: T (290.0)
    |   blind = false
    |   |   age <= 74
    |   |   |   married = true: V (5.0)
    |   |   |   married = false: P (5.0)
    |   |   age > 74: T (32.0)

    Number of Leaves  :  9
    Size of the tree  : 17

    Time taken to build model: 0.29 seconds
    Time taken to test model on training data: 0.13 seconds

    === Error on training data ===

    Correctly Classified Instances        605      100 %
    Incorrectly Classified Instances        0        0 %
    Kappa statistic                         1
    Mean absolute error                     0
    Root mean squared error                 0
    Relative absolute error                 0 %
    Root relative squared error             0 %
    Coverage of cases (0.95 level)        100 %
    Mean rel. region size (0.95 level)     20 %
    Total Number of Instances             605

    === Confusion Matrix ===

    a b c d e   <-- classified as

Figure 16. Output for Tax Calculator

More detailed performance figures for the inference are provided after the tree size and number of nodes. Since the subject system has nominal class variables as its final output, we are most concerned with the first two lines of the final section. The first line shows the number and percentage of instances that were correctly predicted.
The next line shows the misclassifications. Since the figure shows the output of the final iteration, and the model inferred was perfect, the number of correctly classified instances is 100% (all 605 out of 605). This was achieved by teaching the model from its own mistakes through a series of continuous testing and retraining. During the first couple of iterations the proportion of misclassified instances varied from 20% to 70%; the number of misclassifications then fell at a steady rate as the number of training samples increased.

The time taken to infer the model from the training sample also increased with sample size, but the overall inference remained quick, never exceeding a minute. A graph of the inference times is shown in Fig. 17.

Figure 17. Time Graph for the Tax System

The third line in the final section shows the Kappa statistic, which measures the agreement of the predictions with the actual classes. This is not very informative on its own, as the statistic can take low values even when agreement is high; it is most appropriate for testing whether the agreement exceeds chance levels, i.e. whether the predicted and actual classes are correlated. We also obtained 100% coverage of all cases, along with the mean absolute error and the root mean-squared error of the class-probability estimates assigned by the tree. The root mean-squared error is the square root of the average quadratic loss [28], and the mean absolute error is calculated similarly using the absolute rather than the squared difference [29]. The relative errors are calculated with respect to prior probabilities, i.e. as if the inference were performed with some other baseline learning algorithm; they allow the two results to be compared.
The final part of the output shows a confusion matrix, which summarizes the result of the experiment at a glance. The columns represent the predictions and the rows the actual classes. It shows that all 605 instances were correctly predicted: the correct predictions always lie on the diagonal of the table, and these cases are called the "true positives". In the table we can see that 375 instances were classified with class value T, 19 instances with class value P, and so on. Since there are no misclassifications in our output, the off-diagonal cells are empty; those values would be the "false positives". When misclassifications occur they appear in the confusion matrix, giving us a count of the instances that were misclassified. In general, one should adjust misclassification costs and threshold levels so that sufficient accuracy and sensitivity in the desired class is obtained [29].

Figure 18. Learning Curve for the Tax Calculator

… having to look into the code. Fig. 19 shows part of the tree obtained by applying the ILUSTRATOR technique. This was the closest inference to the perfect model, as there were no contradictions when it was tested with random test samples. Two different tests (Test 1 and Test 2) were performed with random test samples. To assess the technique's ability to infer models and correct misclassifications, we used training and testing sets of different sizes: in Test 1 the training and testing sets had size 20, and in Test 2 the size was increased to 50. Both tests took more than 20 iterations to infer a perfect model. Owing to technical problems while performing the tests and a shortage of memory for storing the training sample, the tests had to be stopped after 25 iterations; given more memory and a faster machine, ILUSTRATOR would infer a perfect model.

Figure 20. Learning Curve for the Credit Rating System

The accuracy of the tree obtained after 4 iterations was more than 97% in both tests, and it increased at a steady pace with the growing size of the training sample. The graph shows the learning curve obtained over the iterations. The initial models, inferred from small training samples, had minimal accuracy; the model was refined most after roughly 200 training samples, i.e. around iterations 5 or 6, when it learned from its own mistakes and the learning curve became steeper. Fig. 21 shows the time taken to infer models from the training samples. Since this subject system is larger and more complex, inference took longer than in Case Study 1, but ILUSTRATOR still managed to infer models from the training samples in less than a minute.

Figure 21. Time Graph for the Credit Rating System

In both case studies the ILUSTRATOR technique inferred perfect models that captured the behavior of the subject systems. The training sample obtained at the end of inference was a finite set of test cases that exercised the system to the fullest. The scalability of the technique was under scrutiny, but it performed well and the results prove it: we achieved 99.14% accuracy on the credit rating system and 100% accuracy on the tax calculator, and the average time to infer a perfect model by retraining was less than 5 minutes. The next chapter discusses related work that helped refine ILUSTRATOR further and compares key results.

Chapter 5

5. Related Work

We will discuss two areas of work related to the ILUSTRATOR technique in this section, namely the MELBA methodology and Category Partitioning.
We will also discuss results obtained with our technique and how our work differs from the others.

5.1. Category Partitioning

The category-partition (CP) method, proposed by Thomas J. Ostrand and Marc J. Balcer in "The category-partition method for specifying and generating functional tests" [31], specifies rules to generate test suites from categories and choices. A number of research activities have built on the CP method. CP requires us to identify properties of a system that affect its execution behavior, and hence its output. The basic idea is to identify the categories of a given system and their corresponding choices. In our BMI example, a property might be how the BMI value differs between classes such as Underweight and Overweight; this is a "category", and each category has a distinct set of "choices". For this category, the choices could be that BMI >= 24 is Overweight and BMI <= 19 is Underweight. In addition to categories and choices, CP also requires "properties" and "selectors" to be identified; these capture the interdependencies between choices and can therefore be used to find impossible combinations of choices across categories [32]. Besides the categories and choices that describe the inputs of the subject system, CP also requires the environment conditions that may affect the program's behavior, such as the load on the network or the contents of the database. CP can be used to capture both functional and non-functional behavior. The steps involved in the CP method are [31]:

A. Analyze the specification – identify the parameters of the individual functional units and their characteristics, and the objects in the environment that might affect the system and their characteristics. These are collectively known as the categories.

B. Partition the categories into choices – identify the different cases that might occur within the above categories.

… more contradictions identified in the rules learnt by the machine learning algorithm. The drawback of the technique is that it assumes the initial test suite or test specification has already been executed and its failures corrected; in short, the initial test specification would have no contradictions when tested. But as training samples are added in subsequent iterations, failures can arise and faults can be detected. Though this is similar to the ILUSTRATOR technique, another major pitfall of MELBA is that, to define the categories and choices, the human tester must have a certain degree of understanding of the subject system's domain. The paper argues that this is not a pitfall, as the tester would know this information anyway when re-engineering the test suites, and that there is no way to reuse test suites without understanding the relationships between inputs and outputs. Though MELBA provides tool support to convert test cases into abstract test cases and heuristics to identify and remove potential contradictions, the methodology still requires a certain level of human intervention: the tester has to understand the system domain first and discover categories and choices in order to create an abstract test suite, which is not the case in our technique. Our technique attempts to cover some of the possible pitfalls of MELBA and the category-partition method; these are discussed in the next sections, along with some results obtained.

5.3. Our Results

As discussed in the previous section, ILUSTRATOR tries to overcome some of the drawbacks of other existing automated black-box testing techniques. We have applied the technique to a number of small systems and large, complex ones, and found it flexible, with positive results.
The biggest advantage of the technique is that it is fully automated and requires no human intervention: model generation, training, testing, and retraining on misclassifications are all achieved by executing a series of batch files. Compared to the MELBA methodology, which requires a certain degree of human involvement, ILUSTRATOR proves far more effective, though the number of iterations is higher. An iterative learning model that learns from its own mistakes using random test samples is more precise, as the system is exercised to its full limits. This is because the test cases generated are random and not based on the system's domain. When applying the MELBA methodology, extracting test cases from the model only covers certain test situations, arguably the test set the system was most designed for. Since the test cases generated by our technique are random, the inferred model is exercised with a wide range of input data that the system may or may not be designed to handle; the contradictions that arise when testing the inferred model tell us which data the system could not handle. Narrowing the test samples used to test the model would only exercise the system on a limited set of possible inputs, which is not a good testing practice.

To summarize, the following key aspects of ILUSTRATOR distinguish it from the related work above:

1. It generates models from non-specific training samples that are generated randomly and improved iteratively.

2. Test cases are generated randomly, exercising the subject system to the fullest rather than testing it with specific test samples.

3. The model learns from its own mistakes, adding the misclassified samples to its training set and retraining itself. In short, the more mistakes the model makes, the more it learns.

4. The final training sample, obtained once there are no more contradictions, acts as a finite set of test cases with a broad range of inputs that test the system to its full limits.

5. The entire technique is automated and requires no human intervention, removing the human component from the testing cycle.

Chapter 6

6. Conclusion

This paper introduced the ILUSTRATOR technique, a fully automated iterative methodology based on machine learning that helps developers understand program functionality and test unfamiliar black-box systems effectively. The technique is based on iterative learning using random test generators: it infers models from initial training samples that are generated randomly and then tests them with further random test cases. The model is perfected by retraining it with the misclassifications that were predicted and testing it again, creating a virtuous loop of training and testing. The process halts when there are no more misclassifications, and the model obtained is deemed a perfect model: the closest representation of the subject system. We used the C4.5 learning algorithm to infer models from training samples, represented in the form of decision trees (DT).

We have shown how the ILUSTRATOR technique is applied to realistic systems (the BMI calculator, the Tax Calculator, and the Credit Evaluation System) and evaluated the effectiveness of the test cases generated for them. The case studies showed how iterative learning improves the inference process and how the model corrects itself by learning from its own mistakes. We evaluated the results of each case study with learning graphs obtained by plotting inference accuracy against the number of training samples, and we also evaluated the time taken to infer a perfect model. The end product of each study was a finite set of test cases that the system was exercised on.
These test cases, along with the inferred model, can be used to improve the understandability of the subject system without having to look into the code. We also discussed other black-box specification approaches such as category partitioning, and methodologies based on them, such as MELBA, for generating test suites for unfamiliar systems. We compared the results obtained from both techniques and pointed out some pitfalls that ILUSTRATOR overcomes. The main advantage of ILUSTRATOR is that the entire process is automated and does not require a human tester. Also, since the test cases are generated using a random test generator, the range of test cases produced is wider and the system is exercised with all possible values rather than specific test suites. This enables us to infer the system's behavior in detail and hence understand its functionality. However, using random test case …

References

[17] S. Muggleton, "Learning from Positive Data", Proceedings of the Sixth International Workshop on Inductive Logic Programming, 1997.

[18] Ian H. Witten and Eibe Frank, Data Mining: Practical Machine Learning Tools and Techniques, 2nd Edition, Morgan Kaufmann, San Francisco, 2005.

[19] Attribute-Relation File Format, April 1, 2002. [Online] http://www.cs.waikato.ac.nz/~ml/weka/arff.html [Accessed 15 March 2011]

[20] J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann Publishers, 1993.

[21] J. R. Quinlan, "Induction of Decision Trees", Machine Learning 1, 1986.

[22] J. R. Quinlan, "Learning with Continuous Classes", in: 5th Australian Joint Conference on Artificial Intelligence, Singapore, pp. 343-348, 1992.

[23] S. B. Kotsiantis, "Supervised Machine Learning: A Review of Classification Techniques", Informatica 31, pp. 249-268, 2007.

[24] Ian H. Witten, Eibe Frank and Mark A. Hall, Data Mining: Practical Machine Learning Tools and Techniques, Third Edition.

[25] Payam Refaeilzadeh, Lei Tang and Huan Liu, Arizona State University, "Cross-Validation".

[26] Ron Kohavi, "A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection", Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence.

[27] Sebastian Danicic, Chris Fox, Mark Harman and Rob Hierons, "ConSIT: A Conditional Program Slicer".

[28] M. S. Nikulin, "Loss function", in: Michiel Hazewinkel (ed.), Encyclopaedia of Mathematics, Springer, 2001, ISBN 978-1556080104, http://eom.springer.de/L/l060900.htm

[29] The Explorer Interface – Classification, 1999. [Online] http://wekadocs.com/node/13 [Accessed 4 April 2011]

[30] UCI Machine Learning Repository, Statlog (German Credit Data) Data Set, 1994. [Online] http://archive.ics.uci.edu/ml/datasets/Statlog+(German+Credit+Data) [Accessed 28 April 2011]

[31] Thomas J. Ostrand and Marc J. Balcer, "The Category-Partition Method for Specifying and Generating Functional Tests", Communications of the ACM, 31(6), June 1988.

[32] L. C. Briand et al., "Using machine learning to refine Category-Partition test specifications and test suites", Information and Software Technology, 2009, doi:10.1016/j.infsof.2009.06.006

Appendix

1. UK Income Taxation Calculator – Source Code

#include <stdio.h>

main()
{
    int age, blind, widow, married, income;
    int personal, pc10, t, tax;
    char code;

    scanf("%d", &age);
    scanf("%d", &blind);
    scanf("%d", &married);
    scanf("%d", &widow);
    scanf("%d", &income);

    if (age >= 75) personal = 5980;
    else if (age >= 65) personal = 5720;
    else personal = 4335;

    if ((age >= 65) && income > 16800) {
        t = personal - ((income - 16800) / 2);
        if (t > 4335) personal = t;
        else personal = 4335;
    }

    if (blind) personal = personal + 1380;

    if (married && age >= 75) pc10 = 6692;
    else if (married && (age >= 65)) pc10 = 6625;
    else if (married || widow) pc10 = 3470;
    else pc10 = 1500;

    if (married && age >= 65 && income > 16800) {
        t = pc10 - ((income - 16800) / 2);
        if (t > 3470) pc10 = t;
        else pc10 = 3470;
    }

    if (income <= personal) tax = 0;
    else {
        income = income - personal;
        if (income <= pc10) tax = income / 10;
        else {
            tax = pc10 / 10;
            income = income - pc10;
            if (income <= 28000) tax = ((tax + income) * 23) / 100;
            else {
                tax = ((tax + 28000) * 23) / 100;
                income = income - 28000;
                tax = ((tax + income) * 40) / 100;
            }
        }
    }

    if (!blind && !married && age < 65) code = 'L';
    else if (!blind && age < 65 && married) code = 'H';
    else if (age >= 65 && age < 75 && !married && !blind) code = 'P';
    else if (age >= 65 && age < 75 && married && !blind) code = 'V';
    else code = 'T';
}

2. German Credit Rating System – Source Code

if (Status == "'>=200'") Lending = "'Good'";
else if (Status == "'no checking'") Lending = "'Good'";
else if (Status == "'0<=X<200'") {
    if (LoanAmount > 9800) Lending = "'Bad'";
    else {
        if (Savings == "'>=1000'") Lending = "'Good'";
        else if (Savings == "'no known savings'") Lending = "'Good'";
        else if (Savings == "'500<=X<1000'") Lending = "'Good'";
        else if (Savings == "'100<=X<500'") {
            if (Purpose != "'business'") Lending = "'Good'";
            else {
                if (HousingType == "'own'") Lending = "'Good'";
                else if (HousingType == "'for free'") Lending = "'Good'";
                else {
                    if (ExistingCredit <= 1) Lending = "'Good'";
                    else Lending = "'Bad'";
                }
            }
        }
        else if (Savings == "'<100'") {
            if (Parties == "'co applicant'") Lending = "'Good'";
            else if (Parties == "'guarantor'") {
                if (Purpose == "'new car'") Lending = "'Bad'";
                else Lending = "'Good'";
            }
            else if (Parties == "'none'") {
                if (Duration > 42) Lending = "'Bad'";
                else {
                    if (MartStatus == "'female single'") Lending = "'Good'";
                    else if (MartStatus == "'male single'") Lending = "'Good'";
                    else if (MartStatus == "'male mar/wid'") {
                        if (Duration > 10) Lending = "'Bad'";
                        else Lending = "'Good'";
                    }
                    else if (MartStatus == "'male div/sep'") Lending = "'Bad'";
                    else if (MartStatus == "'female div/dep/mar'") {
                        if (Purpose == "'business'") {
                            if (ResidenceSince > 2)