The Role of Experimentation in Computer Science - Lecture Slides | CMSC 435, Study notes of Software Engineering

Material Type: Notes; Professor: Zelkowitz; Class: Software Engineering; Subject: Computer Science; University: University of Maryland; Term: Spring 2009;

The role of experimentation in computer science
Marvin Zelkowitz
Notes based upon a talk given in 2002-2004 (slides dated 2/10/2009)

So what is science?
– How does science move from conjectures to established theories?

Essence of science
– "All we can ask of a theory is to predict the results of events that can be measured. This sounds like an obvious point, but forgetting it leads to the so-called paradoxes that popular writers of our culture are fond of exploiting." – Leon Lederman, Nobel Laureate physicist

Scientific theory
– A set of rules that relates quantities to observations we make
– An idea, model, or explanation that has been tested and accepted by the scientific community
– A good theory is characterized by making predictions that can be disproved or falsified by observations

Scientific truth?
– Nothing said yet about "truth" in science; science doesn't deal with truth
– Science deals with obtaining the best predictions possible from observed data
– Future developments may change the underlying model as long as the observed relationships are maintained

So where does computer science come in?
Computer science needs to operate within this scientific model of theory formation and experimental validation. Software engineering's main laboratory is industrial development:
– Investigating new technologies means working with developers using those technologies
– The goal is to transfer new technologies to industry

Speed of tech transfer influenced by
– The nature of the communication channels used to increase awareness and knowledge of the technology
– The nature of the social system in which the potential user operates
– The extent of efforts to diffuse the technology throughout an organization
– The technology's attributes: relative advantage, compatibility, complexity, trialability, observability

Delphi technique
1. A group of experts receives the specification plus an estimation form.
2. The experts discuss product and estimation issues.
3. The experts produce individual estimates.
4. The estimates are tabulated and returned to the experts.
5. An expert is made aware only of his or her own estimate; the sources of the remaining estimates remain anonymous.
6. The experts meet to discuss the results.
7. The estimates are revised.
8. The experts cycle through steps 1 to 7 until an acceptable degree of convergence is obtained.
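The convergence loop in the Delphi steps can be sketched as a small simulation. Note the revision rule used here (each expert moving a fixed fraction of the way toward the anonymous group median) and the effort numbers are illustrative assumptions; the technique itself does not prescribe how experts revise.

```python
from statistics import median

def delphi_rounds(estimates, pull=0.5, tol=0.05, max_rounds=10):
    """Iterate Delphi-style revision until the estimates converge.

    Each round, the estimates are tabulated anonymously and every
    expert revises partway toward the group median (the `pull`
    revision rule is an illustrative assumption). Stops when the
    spread relative to the median drops below `tol`.
    """
    for rounds in range(1, max_rounds + 1):
        group_median = median(estimates)
        estimates = [e + pull * (group_median - e) for e in estimates]
        if (max(estimates) - min(estimates)) / group_median < tol:
            break
    return estimates, rounds

# Hypothetical effort estimates (person-months) from five experts.
final, rounds = delphi_rounds([10, 14, 18, 25, 40])
print(rounds, [round(e, 2) for e in final])
```

With these inputs the estimates pull in toward the median (18) over a handful of rounds, which is the "acceptable degree of convergence" the slide describes.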
Perceived strengths and weaknesses of the Delphi technique

Strengths:
– Experts with different backgrounds/perspectives
– Group discussion can correct mistakes
– Reconsideration of estimates
– Uses expert judgment
– Median better than mean
– Provides comparison with other estimates
– Anonymity/independence combined with group benefits

Weaknesses:
– A dominant individual can wrongly influence the others
– Depends upon the knowledge/expertise of the individuals
– Risk of erroneous assumptions
– Group discussion made little difference to the result (consensus group)
– High variability in predictions
– Inappropriate target; should be used for more detailed problems

Technology transfer
Players: Innovators (2.5%), Early adopters (13.5%), Early majority (34%), Late majority (34%), Laggards (16%). Innovators and early adopters make up the early market; the remaining groups make up the mainstream market.

But how does new technology get validated?
– Lots of technology development
– Rapid change today within our technological society
– But software failures are all too common
– Why such failures?
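The adopter percentages above are what you get by slicing a normal (bell-curve) distribution of adoption times at one and two standard deviations from the mean, which is how Rogers originally defined the categories; a quick sketch to check:

```python
from statistics import NormalDist

def adopter_shares(mean=0.0, sd=1.0):
    """Slice a normal adoption-time curve at mean-2sd, mean-sd,
    mean, and mean+sd to recover the five adopter categories."""
    cdf = NormalDist(mean, sd).cdf
    return {
        "innovators":     cdf(mean - 2 * sd),                   # ~2.3%, slide rounds to 2.5%
        "early adopters": cdf(mean - sd) - cdf(mean - 2 * sd),  # ~13.5%
        "early majority": cdf(mean) - cdf(mean - sd),           # ~34%
        "late majority":  cdf(mean + sd) - cdf(mean),           # ~34%
        "laggards":       1 - cdf(mean + sd),                   # ~16%
    }

for name, share in adopter_shares().items():
    print(f"{name:>14}: {100 * share:.1f}%")
```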
We need research laboratories for software engineering
– NASA Software Engineering Laboratory (1976-2002) is one such example

Technology transfer experience from the NASA Software Engineering Laboratory
Background of NASA/GSFC SEL:
– Began in 1976 to study software development
– Typical applications: ground support software for unmanned spacecraft
Characteristics:
– Size from 10K to 500K source lines
– 1970-1990: FORTRAN the dominant language; 1990-2000: C and C++
– Typically 10-15 people for 18-24 months
– Mixture of contractor and government personnel
– Over 125 projects; 500MB Oracle database
– Many studies of effects of process changes on development in the SEL environment

NASA technology transfer process
[Figure: diagram of the NASA technology transfer process]

Software engineering technology transfer
– Technology transfer is generally product oriented: in most engineering disciplines, the process is centered on a product. Software engineering does not yet achieve that.
– Processes describing actions to take are as important as the tools that are used. For example, many of the technologies explored by the SEL are procedures only and not tools:
  – Object oriented technology
  – Goals/Question/Metrics model
  – Measurement
  – Cleanroom
  – Inspections

Examples of technology infusion
[Figure: examples of technology infusion]

Examples of transferred technologies
Survey of software professionals: What 10 technologies (out of a list of over 100) have helped your productivity the most?

Total replies (44)           From NASA (12)
Workstations/PCs    27       Object oriented     12
Object oriented     21       Networks            10
GUIs                17       Workstations/PCs     8
Process models      16       Process models       7
Networks            16       Measurement          5
C and C++            8       GUIs                 4
CASE tools           8       Structured design    3
Databases            8       Databases            2
Desktop publishing   8       Desktop publishing   2
Inspections          7       Development meth.    2
Email                7       Reuse                2
Measurement          6       Cost estimation      2
                             Comm. software       2

Classes of methods
– Controlled method – Multiple instances of an observation in order to provide for statistical validity of the results. (Usually an active method.)
– Observational method – Collect relevant data as a project develops. In general, there is relatively little control over the development process. (Weakly active, although may be passive.)
– Historical method – Collect data from completed projects. (Passive methods.)
These three basic methods have been classified into 12 data collection models. (We will also consider one theoretical validation method, yielding 13 validation methods.)

Controlled methods
– Replicated – Several projects are observed as they develop (e.g., in industry) in order to determine the effects of the independent variable. Due to the high costs of such experiments, they are extremely rare.
– Synthetic environments – Replicated experiments in an artificial setting, often a university.
– Dynamic analysis – The project is replicated using real project data.
– Simulation – The project is replicated using artificial project data.
The first two of these generally apply to process experiments, while the last two generally apply to product experiments.

Observational methods
– Project monitoring – Collect data on a project with no preconceived notion of what is to be studied.
– Case study – Data collected as a project develops by individuals who are part of the development group. (Often used in the SEL.)
– Field study – An outside group collects data on a development. (A weaker form of case study.)

Historical methods
– Literature search – Review previously published papers in order to arrive at a conclusion. (E.g., meta-analysis: combining results from separate related studies.)
– Legacy data – Data from a completed project is studied in order to determine results.
– Lessons-learned data – Interviews with project personnel and a study of project documentation from a completed project can be used to determine qualitative results. (A weak form of legacy data.)
– Static analysis – Artifacts of a completed project are processed to determine characteristics.

But the list of methods is incomplete
– Assertions: What do software engineers often do? For a new technology, validation often consists of: "I tried it and I like it."
– Validation often consists of a few trivial examples of using the technology to show that it works.
– We added this validation as a weak form of case study under the observational methods:
  – Assertion – A simple form of case study that does not meet rigorous scientific standards of experimentation.
– Theoretical validation – A form of validation based upon mathematical proof.

Summary of validation methods
– 13 methods:
  – 11 experimental methods
  – assertion (weak experimental validation)
  – theoretical validation

Classification of 612 papers
[Figure: bar chart classifying 612 papers from 1985, 1990, and 1995 by validation method (replicated, synthetic, dynamic analysis, simulation, project monitoring, case study, assertion, field study, literature search, legacy data, lessons learned, static analysis, theoretical, not applicable, no experimentation), percentages 0%-40%]

Quantitative observations
– The most prevalent validation mechanisms were lessons learned and case studies, each about 10%
– Simulation was used in about 5% of the papers, while the remaining techniques were each used in under 3% of the papers
– About one-fifth of the papers had no experimental validation
– Assertions (a weak form of validation) were about one-third of the papers
– But the percentage with no experimentation dropped from 26.8% in 1985 to 19.0% in 1990 to only 12.2% in 1995. (Perhaps a favorable trend?)
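Tabulating such a classification is straightforward once each paper has a category label. The sketch below uses an invented 30-paper sample whose labels and counts are hypothetical, chosen only to mimic the proportions reported above; it is not the actual 612-paper data.

```python
from collections import Counter

# Hypothetical classification labels for a 30-paper sample; the real
# study classified 612 papers from 1985, 1990, and 1995.
papers = (["assertion"] * 10 + ["no experimentation"] * 6 +
          ["lessons learned"] * 3 + ["case study"] * 3 +
          ["simulation"] * 2 + ["theoretical"] * 2 +
          ["legacy data"] * 2 + ["field study"] * 1 +
          ["literature search"] * 1)

counts = Counter(papers)
total = len(papers)
for method, n in counts.most_common():
    print(f"{method:>18}: {100 * n / total:.1f}%")
```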
Qualitative observations
– We were able to classify every paper according to our 13 categories, although somewhat subjectively (e.g., assertion versus case study)
– Some papers could apply to two categories; we chose what we believed to be the major evaluation category
– Authors often fail to clearly state what their paper is about; it's hard to classify the validation if one doesn't know what is being validated
– Authors fail to state how they propose to validate their hypotheses
– Terms (e.g., experiment, case study, controlled experiment, lessons learned) are used very informally

Major caveat
– The papers that appear in a publication are influenced by the editor of that publication or its program committee
– The editors and program committees from 1985, 1990, and 1995 were all different
– This imposes a confounding factor in our analysis process that may have affected our outcome

Overall observations
– Many papers have no experimental validation at all (about one-fifth), but fortunately this number seems to be dropping
– BUT too many papers use an informal (assertion) form of validation; better experimental design needs to be developed and used
– Lessons learned and case studies are each used about 10% of the time; the other techniques are used only a few percent at most
– Terminology of how one experiments is sloppy; we hope a classification model, such as this one, can help to encourage more precision in describing empirical research

Comparison to other fields
– We decided to look at several other disciplines for comparison: an informal study, with no attempt at choosing the "best" journal in each field
Journals:
– J1 – Measurement Science and Technology (devices to perform measurements)
– J2 – American Journal of Physics (theory and application of new physical theories)
– J3 – Journal of Research of NIST (research on measurement and standardization issues)
– J4 – Management Science (queueing theory and scheduling problems)
– J5 – Behavior Therapy (clinical therapies)
– J6 – Journal of Anthropological Research (study of human cultures)

Basic results
[Figure: trends in validation, assertion, theory, and no experimentation, 1985-2005]
– Validation rose from 29% to 65%
– No experimentation dropped from 27% to 16%
– Assertions dropped from 35% to 19%
– The trend continues to improve

Why doesn't industry "buy" this validation?
– Industry:
  – Ignores results from archival journals
  – Believes in unsubstantiated rumors
– Research community:
  – Doesn't require validation
  – Doesn't perform validations as thoroughly as necessary
– There is a "disconnect" between these two cultures

Industrial methods
(Based on a paper by Binkley, Wallace and Zelkowitz)

Method                 Type        Method                Type
Case study             Observ.     Literature search     Hist.
Demonstrator projects  Contrl.     Pilot study           Contrl.
Education              Hist.       Project monitoring    Observ.
External               Informal    Replicated project    Contrl.
Expert opinions        Hist.       Synthetic benchmark   Contrl.
Feature benchmark      Hist.       Theoretical analysis  Formal
Field study            Observ.     Vendor opinion        Informal
Legacy data            Hist.

Industrial methods (1)
Additional methods often used by industry:
– Expert opinion – use the opinion of experts. This can take the form of hired consultants brought in to teach a new technology, or attendance at a trade show where various vendors demonstrate their products.
– Edicts – changes required by an outside agent.
– Feature analysis – a study of the features of the new technology and a subjective evaluation of its impact on the development process. Often used to compare two alternatives.

Industrial methods (2)
– Compatibility studies – studies used to test whether various technologies can be combined or whether they interfere with one another.
– Model problems – narrowly defined problems that the technology can address.
– Demonstrator study – a scaled-up application development, with some attributes (e.g., performance, documentation) reduced in order to limit costs or development time.
– Pilot study – a full-scale implementation using the new technology.

Relationship between methods

Research (exploratory) method    Industrial (confirmatory) method
Assertion                        Vendor opinion
Case study                       Case study
Dynamic analysis                 Synthetic benchmarks
Field study                      Field study
Legacy data                      Legacy data
Lessons learned                  Expert opinion
Literature search                Literature search
Project monitoring               Project monitoring
Replicated                       Replicated project
Simulation                       Pilot study
Static analysis                  Feature benchmark
Synthetic                        Demonstrator projects
Theoretical analysis             Theoretical analysis
(none)                           Education
(none)                           External