Homework 4 | Introduction to Artificial Intelligence | CS 5804

Material Type: Assignment; Professor: Ramakrishnan; Class: Intro Artificial Intelligence; Subject: Computer Science; University: Virginia Polytechnic Institute and State University; Term: Fall 2006


CS 5804: Homework #4

Assigned: October 16, 2006
Date Due: November 6, 2006

1. (100 points) For the dataset given in the file chem.dataset, implement a boosting approach to learn to classify the given chemical reaction systems.

• Given the handout of Mon, Oct 16, 2006, first gain an overall understanding of the dataset. Write some basic code to parse a chemical reaction signature and infer simple properties of it (which you will then use as input features to your learning algorithms).

• Next, implement three machine learning algorithms: a decision tree algorithm with information gain as the tree-growing metric, a decision stump algorithm, and an alternating decision tree algorithm. As described in class, you really only have to implement the first; the second and third are minor variants of it. The decision stump algorithm learns only a single node (i.e., the root); a sketch of an information-gain stump appears after this problem statement. The alternating decision tree algorithm is simply a way to compress multiple decision trees into a single structure, again via boosting. See: [Y. Freund and L. Mason, The Alternating Decision Tree Learning Algorithm, in Proceedings of the Sixteenth International Conference on Machine Learning (ICML 1999), pages 124-133, 1999]. If you read this paper, you will notice that this algorithm already comes with boosting built in (see the next point below).

• Finally, implement the boosting algorithm. Use the AdaBoost algorithm given in your textbook for the decision tree and decision stump approaches (an AdaBoost sketch also follows below). For the alternating decision tree approach, the boosting is multiplexed with the tree learning, as explained in the [Freund and Mason, 1999] paper.

• Observe that the dataset has nearly 36,000 instances, but the distribution of positives and negatives is rather skewed: only about 10% of the examples are positive! (This means that, by default, you can always predict negative and obtain a weak learner.) You should split the given dataset into training and test sets yourself. The split must keep the distribution of positives/negatives the same across the training and test sets; only then do the conditions of PAC learning theory apply (a stratified-split sketch also follows below). Learning from such skewed datasets is a common theme in practical machine learning. You might be interested in reading [M.V. Joshi, R.C. Agarwal, and V. Kumar, Predicting Rare Classes: Can Boosting Make Any Weak Learner Strong?, in Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 297-306, 2002], just to understand the underlying context, not the gory mathematical details.

Track the performance of boosting as a function of the number of weak learners combined. Explain if/how the performance of boosting correlates with the choice of the underlying learning algorithm.
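A minimal sketch of the information-gain machinery and the decision stump follows. The format of chem.dataset is not specified here, so the sketch assumes examples have already been parsed into vectors of binary (0/1) features with labels in {+1, -1}; the function names (entropy, information_gain, learn_stump) are illustrative, not prescribed by the assignment. The full decision tree follows by recursing on the same feature selection; the stump stops at the root.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of +1/-1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, labels, feature):
    """Gain of splitting on a binary feature:
    H(S) - sum over v of (|S_v|/|S|) * H(S_v)."""
    gain = entropy(labels)
    for value in (0, 1):
        subset = [y for x, y in zip(examples, labels) if x[feature] == value]
        gain -= (len(subset) / len(labels)) * entropy(subset)
    return gain

def learn_stump(examples, labels, features=None):
    """One-node tree: split once on the feature with the highest
    information gain and predict the majority label on each side."""
    if features is None:
        features = range(len(examples[0]))
    best = max(features, key=lambda f: information_gain(examples, labels, f))

    def majority(value):
        # Fall back to the full label set if one side of the split is empty.
        side = [y for x, y in zip(examples, labels) if x[best] == value] or labels
        return Counter(side).most_common(1)[0][0]

    leaves = {0: majority(0), 1: majority(1)}
    return lambda x: leaves[x[best]]
```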
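For the boosting step, the following is a sketch of AdaBoost in the boosting-by-resampling style: instead of requiring the weak learner to accept example weights (as in the textbook's re-weighting presentation), each round draws a weighted bootstrap sample, so any unweighted learner such as learn_stump above can be plugged in unchanged. Labels are again assumed to be +1/-1.

```python
import math
import random

def adaboost(examples, labels, weak_learn, rounds):
    """AdaBoost by resampling: each round draws a bootstrap sample from the
    current weight distribution, fits a weak hypothesis, and re-weights
    examples according to its mistakes."""
    n = len(examples)
    w = [1.0 / n] * n
    alphas, hypotheses = [], []
    for _ in range(rounds):
        # Fit the weak learner on a sample drawn from the current weights.
        idx = random.choices(range(n), weights=w, k=n)
        h = weak_learn([examples[i] for i in idx], [labels[i] for i in idx])
        # Weighted error of h on the full dataset.
        err = sum(wi for wi, x, y in zip(w, examples, labels) if h(x) != y)
        if err >= 0.5:          # no longer even a weak learner; stop early
            break
        err = max(err, 1e-10)   # guard against log(0) on a perfect round
        alpha = 0.5 * math.log((1.0 - err) / err)
        # Up-weight mistakes, down-weight correct predictions, renormalize.
        w = [wi * math.exp(-alpha * y * h(x)) for wi, x, y in zip(w, examples, labels)]
        total = sum(w)
        w = [wi / total for wi in w]
        alphas.append(alpha)
        hypotheses.append(h)

    def classify(x):
        return 1 if sum(a * h(x) for a, h in zip(alphas, hypotheses)) >= 0 else -1

    return classify, list(zip(alphas, hypotheses))
```

To track performance as a function of the number of weak learners, classify with the sign of the partial sums over increasing prefixes of the returned (alpha, hypothesis) pairs rather than retraining from scratch for each ensemble size.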
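The class-preserving split can be done by sampling each class separately, so that both sets retain roughly the same 10% positive rate as the full dataset. The sketch below is one way to do it; the 25% test fraction and the fixed seed are arbitrary choices, not values given by the assignment.

```python
import random

def stratified_split(examples, labels, test_fraction=0.25, seed=0):
    """Split into train/test so the positive/negative ratio is
    (approximately) preserved in both sets."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in set(labels):
        # Shuffle each class's indices and cut off its share of the test set.
        pool = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(pool)
        cut = int(len(pool) * test_fraction)
        test_idx.extend(pool[:cut])
        train_idx.extend(pool[cut:])
    rng.shuffle(train_idx)
    rng.shuffle(test_idx)
    return ([examples[i] for i in train_idx], [labels[i] for i in train_idx],
            [examples[i] for i in test_idx],  [labels[i] for i in test_idx])
```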