Experiment design: scientific method in one minute (study notes)

Typology: Study notes

2022/2023

Uploaded on 03/01/2023

Per Ola Kristensson
Research Methods, M.Phil. Advanced Computer Science
University of Cambridge, Michaelmas Term, 2009

Experimenting: experiment design

Scientific method in one minute
1. Use experience and observations to gain insight about a phenomenon
2. Construct a hypothesis
3. Use hypothesis to predict outcomes
4. Test hypothesis by experimenting
5. Analyse outcome of experiment
6. Go back to step 1

Typical computer science scenario
• A particular task needs to be solved by a software system
• This task is currently solved by an existing system (a baseline)
• You propose a new, in your opinion, better system
• You argue why your proposed system is better than the baseline
• You support your arguments by providing evidence that your system indeed beats the baseline

Running example in this lecture
• Text entry on a Tablet PC
  A. Handwriting recognition
  B. Software keyboard

Why experiments?
• Substantiate claims
  – A research paper needs to provide evidence to convince other researchers of the paper’s main points
• Strengthen or falsify hypotheses
  – “My system/technique/algorithm is [in some aspect] better than previously published systems/techniques/algorithms”
• Evaluate and improve/revise/reject models
  – “The published model predicts users will type at 80 wpm on average after 40 minutes of practice with a thumb keyboard. In our experiment no one surpassed 25 wpm after several hours of practice.”
• Gain further insights, stimulate thinking and creativity

Back to our example
• Why this experiment?
  – Despite decades of research there is no empirical data of text entry performance of handwriting recognition
  – An inappropriate study of handwriting (sans recognition) from 1967 keeps getting cited in the literature, often through secondary or tertiary sources (handbooks, etc.)
  – Based on these numerous citations in research papers, handwriting recognition is perceived to be rather slow
  – However, there is no empirical evidence that supports this claim

Different kinds of experiments
• Surveys
• Field studies
• Simulations and computational experiments
• Controlled experiments
• … and quasi-experiments, and many more…

Controlled experiments and hypotheses
• A controlled experiment tests the validity of one or more hypotheses
• Here we will consider the simplest case:
  – One method vs. another method
  – Each method is referred to as a condition
• The null hypothesis H0 states there is no difference between the conditions
• Our hypothesis H1 states there is a difference between the conditions
• To show a statistically significant difference the null hypothesis H0 needs to be rejected

Between-subjects design
• Each participant is exposed to only one condition
• One of the simplest experimental designs
• Advantages:
  – No risk of confounds or skill transfer from one condition to the other
  – Therefore no need to do counter-balancing or check for asymmetrical skill-transfer effects
• Disadvantages:
  – Variance is not controlled within the participant
  – Therefore demands more participants than a within-subjects design to show a statistically significant difference

Within-subjects design
• Each participant is exposed to all conditions
• One of the most common experimental designs in practice
• Advantages:
  – Variance is controlled within the participant
  – Therefore requires fewer participants than a between-subjects design
• Disadvantages:
  – More involved, requires counter-balancing of start condition to avoid transfer effects
  – Risk of asymmetrical skill transfer

Mixed designs
• It is also possible to combine within- and between-subjects experimental designs
• Such designs are called mixed designs
• These are harder to design because they are harder to control
• A mixed design can be a symptom of no clear set of hypotheses, or lack of
  ability to prioritise among them
• Often a mixed design can be broken down into smaller studies that study isolated phenomena separately

Single session vs. longitudinal
• Do you believe participants will improve significantly over time?
• If so, how much will they improve?
• How are previous related studies set up in the literature?

Pilot study
• A pilot study (sometimes called “formative study”) is a small study conducted before the actual controlled experiment
• A pilot study may be designed like the controlled experiment and typically requires far fewer participants (perhaps only one participant)
• A pilot study is important for many reasons:
  – Gives some idea of whether rejecting the null hypothesis is feasible
  – Enables you to ensure the apparatus and software are working correctly
  – Enables you to ensure instructions to participants are clear
  – Can inform certain parameters of the controlled experiment, such as appropriate session length

Explaining what you did
• An experiment needs to be reproducible by others
• It is your responsibility to ensure that you explained your experimental procedure in enough detail
• Choices made in the experimental design need to be motivated
• This part of a research paper is typically referred to as the Method section

Method
• Participants
• Apparatus
• Procedure

Participants
• How many?
• How is the sample constructed?
  – Is it representative of the population we believe will use the interface?
  – Are potential problematic confounds taken care of?
• Did participants receive any compensation?
• Was the study approved by the university ethics committee? [if applicable]

Participants, our example
We recruited 12 volunteers from the university campus. We intentionally wanted a rather broad sample and recruited participants from many different departments with many different backgrounds. Six were men and six were women. Their ages ranged from 22 to 37 (mean = 27, sd = 4).
Participants were screened for dyslexia and repetitive strain injury (RSI). Seven participants were native English speakers and five participants had English as their second language. No participant had used a handwriting recognition interface before. One participant had used a software keyboard before. No participant had regularly used a software keyboard before. Participants were compensated £10 per session.

Apparatus
• Which equipment and which software?
  – Needs to be described in sufficient detail to enable other researchers to replicate your experiment
• Typical information:
  – Physical and logical screen size
  – Sensor device characteristics
  – CPU clock speed
  – Computer brand/model
• Choices that are not obvious need to be motivated

Apparatus, our example
We used a Dell Latitude XT Tablet PC running Windows Vista Service Pack 1. The 12.1" color touch-screen had a resolution of 1280 × 800 pixels and a physical screen size of 261 × 163 mm. Participants used a capacitance-based pen to write directly onto the screen in both conditions. … Both the handwriting recognizer and the software keyboard were docked to the lower part of the screen. The dimensions of the software keyboard were 1266 × 244 pixels and 257 × 50 mm. The dimensions of the handwriting recognizer writing area measured 1266 × 264 pixels and 257 × 55 mm.
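
One way a reader might sanity-check an apparatus description like the one above is to confirm that the logical (pixel) and physical (mm) sizes agree with each other. The sketch below is not part of the original lecture. It uses only the numbers reported in the example, and the function names and the 5% tolerance are illustrative assumptions.

```python
import math

MM_PER_INCH = 25.4

# (width_px, height_px, width_mm, height_mm), copied from the apparatus example
REPORTED = {
    "screen": (1280, 800, 261, 163),
    "software keyboard": (1266, 244, 257, 50),
    "handwriting area": (1266, 264, 257, 55),
}


def pixels_per_mm(width_px, height_px, width_mm, height_mm):
    """Return the horizontal and vertical pixel density in pixels per mm."""
    return width_px / width_mm, height_px / height_mm


def diagonal_inches(width_mm, height_mm):
    """Return the diagonal of a width x height rectangle, in inches."""
    return math.hypot(width_mm, height_mm) / MM_PER_INCH


def check_consistency(reported, tolerance=0.05):
    """Return True if every reported density lies within `tolerance`
    (relative) of the screen's horizontal density.

    The tolerance is an assumption made for this sketch; millimetre
    values rounded to whole numbers make small deviations expected.
    """
    screen = reported["screen"]
    reference = screen[0] / screen[2]
    for dims in reported.values():
        for density in pixels_per_mm(*dims):
            if abs(density - reference) / reference > tolerance:
                return False
    return True


if __name__ == "__main__":
    for name, dims in REPORTED.items():
        horizontal, vertical = pixels_per_mm(*dims)
        print(f"{name}: {horizontal:.2f} x {vertical:.2f} px/mm")
    print(f"screen diagonal: {diagonal_inches(261, 163):.1f} in")
    print("consistent:", check_consistency(REPORTED))
```

With the reported figures, all densities fall between 4.8 and about 4.93 px/mm, and the computed diagonal rounds to the stated 12.1". This suggests the description is internally consistent, which is the kind of detail that makes replication possible.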