The Music Notepad: A System for 2D Gestural Input in Music Notation

Andrew Forsberg, Mark Dieterich, and Robert Zeleznik
Brown University, Department of Computer Science
Providence, RI 02912
(401) 863-7693; {asf,mkd,bcz}@cs.brown.edu

ABSTRACT

We present a system for entering common music notation based on 2D gestural input. The key feature of the system is the look-and-feel of the interface, which approximates sketching music with paper and pencil. A probability-based interpreter integrates sequences of gestural input to perform the most common notation and editing operations. In this paper, we present the user's model of the system, the components of the high-level recognition system, and a discussion of the evolution of the system, including user feedback.

KEYWORDS: user interface, interaction, music notation, gestural input, gesture recognition, handwriting recognition, direct displays.

INTRODUCTION

There are a number of situations that revolve around informally notating music (see [1] and [13] for further information on musical terms and common music notation), such as when a composer wants to jot down an idea, when a teacher explains theory to students, or when a musician wants to visualize (i.e., notate) a musical idea. Despite the many advantages of applying computers to music notation (e.g., for synthesizing sound, and for neatly formatting and rendering notation), people in fact resort to using just paper and pencil for many tasks even when computer solutions are available. This paradox derives from the nature of the interfaces of typical music applications.

Most computerized music notation systems employ standard windows, icons, menus, and point-and-click (WIMP) user interfaces (UIs), with a transition to keyboard "hotkeys" for frequently used functions. In some cases, these systems offer advantages over paper and pencil notation, such as rapid data entry, editing flexibility, automatic formatting, synthesized sound, and high-quality printing. However, the user's model for computerized systems is very different from the model of paper and pencil notation. Based on discussions with a number of musicians and composers, we believe a fundamentally different music notation interface based on a pen-based UI will be more desirable and of equal or greater value than a WIMP-based UI.

The Music Notepad attempts to be an interactive electronic sheet of music paper. Unlike WIMP UIs, the Music Notepad is characterized by a portable display surface (a Wacom PL-300 Display Tablet) that can be drawn upon directly with a stylus. To support what can be done with pencil and paper interfaces, the Music Notepad interprets gestures specified by the user with a stylus to create notation. Moreover, gestures can also be used to perform more powerful editing operations, to professionally format notation, and to synthesize instrumental sounds based on the notation. The following sections present previous work, the user's model of the system, the details of our recognition methodology, and a discussion of the formative design of the system through user feedback.

PREVIOUS WORK

There are two common approaches to music notation: paper and pencil, and computer-based systems. Paper and pencil has many advantages, notably low cost, simplicity, and portability. Music can be notated by drawing symbols on inexpensive paper, and sheets of paper can be copied and distributed very easily.
However, producing high-quality publishable documents by hand requires great skill, and editing operations are difficult to perform. Other desirable features, such as automatically performing written notes, are not possible. (Written music can be performed by one or more skilled performers, but there are often significant barriers to becoming a skilled performer, such as years of practice and expense.)

There are two main flavors of computer-based systems: sequencers and notation systems. The goal of sequencers is to enter and perform synthesized music, whereas notation programs are intended only to produce high-quality printed scores. Both are successful in addressing some aspects of the problems of paper and pencil, such as improved editing operations (e.g., editing individual or groups of symbols, and transposing), and synthesizing or printing a high-quality version of the music that has been entered.

However, the handwriting techniques for creating standard music notation [13] learned by many musicians bear little resemblance to music software UIs, which instead tend to use the well-established WIMP style, sometimes augmented with a MIDI input device such as a piano keyboard. The primary advantage of a WIMP interface is its simplicity and learnability. In addition, many applications such as Finale [6] or Cakewalk [3] provide mechanisms that allow users to transition from the WIMP interface to keyboard "shortcuts." Shortcuts are typically learned over time by displaying each shortcut key next to the equivalent WIMP interface command. Gradually, users tend to transition to using only the shortcuts, resulting in very fast, although indirect, user input.

The use of MIDI devices for input to a notation system at first seems appropriate, but it is still not ideal because nearly all input from MIDI devices requires editing. This problem is rooted in the need to be a skilled performer of a particular MIDI device.

There have been several research systems for music notation. The Mockingbird system [12] was a pioneer in the use of a graphical UI and a MIDI keyboard. However, because this system relies heavily on MIDI keyboard input, nearly all input requires skilled performance and editing. The system also depends on a WIMP-style interface to edit notation.

Buxton [2] developed a system which included a set of gestures for specifying notes and rests (see Figure 1). While this is an effective set of gestures for very basic note entry, the system does not provide gestures for other fundamental notations such as stem direction, accidentals, and beams. Although we have integrated Buxton's gestures into the Music Notepad, there are some situations where the gesture scheme is cumbersome. For example, multiple short notes often appear in long sequences, but the gesture for creating short notes is unfortunately relatively complicated.

Figure 1: A set of gestures developed in [2] for creating notes of various durations. Rests are created by mirroring these same gestures around the horizontal axis.

A similar gestural component is embedded in an otherwise conventional WIMP interface in both the NoteWriter and the NoteAbility systems [14]. These systems provide simple gestural alphabets that are related to Buxton's gestures in both functionality and limitations.

As we prepared the final version of this paper, we learned about a similar system, GSCORE [17]. GSCORE provided both WIMP-based and gesture-based methods for music notation. Although both systems share many comparable features, the Music Notepad provides unique functionality, such as allowing the user to retain a "non-finished" look (making the result appear closer to pen and paper), extensive support for editing notation, and score playback.

SYSTEM DESCRIPTION

This section describes the user model of the system, followed by the details of the components used to integrate the various types of input.

Figure 2: The stylus has four buttons for controlling all Notepad operations: drawn gestures, moving symbols, sheet manipulation, and interactive playback.

User Model

The Music Notepad is intended to appear to the user as an interactive sheet of music paper; thus the user accesses all functionality gesturally with a stylus. There are four classes of gestural operations, corresponding to the four buttons of the stylus (see Figure 2). Marking gestures, drawn with just the tip of the stylus, leave ink trails on the display; sequences of these gestures are interpreted as handwritten commands, as operations for creating and deleting musical notation, as marking menu [9] invocation and selection, or as region selection. The lower button of the pen is used to perform direct manipulation operations for changing note pitches and the graphical placement of symbols. The second lowest button allows the user to slide the "music paper" across the display screen. Finally, the eraser button of the pen is used for playback of the entire score or for interactive playback of regions of musical notation.
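The four-way mapping from stylus buttons to operation classes can be pictured as a small input dispatcher. The listing below is a minimal sketch of that structure only; the button names, event format, and handler functions are our assumptions for illustration, not the Music Notepad's actual code.

    from enum import Enum, auto

    class StylusButton(Enum):
        TIP = auto()            # marking gestures: ink trails, notation, menus, selection
        LOWER = auto()          # direct manipulation: note pitches, symbol placement
        SECOND_LOWEST = auto()  # sliding the "music paper" across the screen
        ERASER = auto()         # playback of the score or of a region

    # Hypothetical handlers standing in for the four operation classes.
    def interpret_marking_gesture(samples): print("marking gesture,", len(samples), "samples")
    def drag_symbol(samples):               print("dragging symbol")
    def scroll_sheet(samples):              print("scrolling sheet")
    def play_region(samples):               print("playing region")

    HANDLERS = {
        StylusButton.TIP: interpret_marking_gesture,
        StylusButton.LOWER: drag_symbol,
        StylusButton.SECOND_LOWEST: scroll_sheet,
        StylusButton.ERASER: play_region,
    }

    def dispatch(button, samples):
        """Route a stylus stroke to the operation class of the button it was drawn with."""
        HANDLERS[button](samples)

    dispatch(StylusButton.TIP, [(0, 0), (1, 2), (2, 3)])

Keeping the routing in a single table mirrors the user model: the meaning of a stroke is determined first by which button it was drawn with, and only then by the stroke's shape.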
Creating notation symbols

The most basic operation in the Music Notepad is the creation of notes. Users can create notes using the gestures shown in Figure 3. Gestures convey both spatial and symbolic information. Note creation gestures consisting of only distinct line segments are centered on the first point of the gesture. Creation gestures involving drawn noteheads (called "scribbled" noteheads) are placed

[...]

the Calligrapher [4] system for handwriting recognition, and the In-Cube speech recognition system [7]. Additional support code translates the interpreted input into our token data structures, which are then posted to the accumulator.

DISCUSSION AND USER FEEDBACK

The design of the Music Notepad has undergone a number of iterations guided by formative evaluations by small groups of musicians and composers. The following discussion presents some of the results of our user experiences and a description of how our designs changed in response to that feedback.

A variety of user gestures

Although all users were instructed with both a demonstration and a description of the gestures used in the system, we found wide variation between individual performances of the same gesture. Figure 8 illustrates some of the varieties of four common, simple gestures.

Figure 8: Different user styles for drawing the same gesture. Clockwise from top left: erasing notes by squiggling on them, selecting notes with a lasso, scribbling to create noteheads, and erasing notes with a single-gesture lasso.

In response to the range of gesture styles, we were able to redesign our gestures and develop more robust gesture recognition algorithms. By analyzing the actual drawn gestures from a number of users, we identified which types of features were important in different situations and used this information to fine-tune the recognition probabilities in specific gesture RECOGs.
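The recognition pipeline, as described, lets individual recognizers ("RECOGs") post tokens with probabilities to an accumulator, which keeps the most probable interpretation. The sketch below illustrates that style of integration under our own assumptions: the features, scoring constants, and gesture classes are hypothetical stand-ins, not the system's actual RECOGs or tuned probabilities.

    import math

    def gesture_features(samples):
        """Crude stroke features: ink path length, bounding-box diagonal,
        and their ratio (a rough "squiggliness" measure). Hypothetical."""
        path = sum(math.dist(a, b) for a, b in zip(samples, samples[1:]))
        xs = [p[0] for p in samples]
        ys = [p[1] for p in samples]
        diag = math.dist((min(xs), min(ys)), (max(xs), max(ys))) or 1e-6
        return {"path": path, "diag": diag, "ratio": path / diag}

    # Each RECOG-like scorer maps stroke features to a (token, probability) pair.
    # The constants stand in for probabilities tuned from observed user gestures.
    def recog_scribbled_notehead(f):
        return ("create notehead", min(1.0, f["ratio"] / 6.0))   # dense back-and-forth ink

    def recog_erase_squiggle(f):
        return ("erase", min(1.0, f["ratio"] / 10.0))            # even denser squiggle

    def recog_lasso(f):
        return ("select (lasso)", 1.0 / (1.0 + abs(f["ratio"] - 3.5)))

    RECOGS = [recog_scribbled_notehead, recog_erase_squiggle, recog_lasso]

    def interpret(samples):
        """Let every recognizer post a token, then keep the most probable one."""
        f = gesture_features(samples)
        return max((recog(f) for recog in RECOGS), key=lambda token: token[1])

    stroke = [(x * 0.1, math.sin(x)) for x in range(60)]
    print(interpret(stroke))

Tuning in this scheme amounts to adjusting each scorer's constants against a corpus of real user strokes, which matches the paper's account of refining per-gesture probabilities from observed drawing styles.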
Based on informal interviews of users who had tried the Music Notepad, we found no negative reactions to learning and using the gestural style of interaction. In addition, some users, including musicians, indicated they would prefer using a completed version of the Music Notepad over existing alternatives. The PalmPilot reflects similar learnability issues and has gained widespread acceptance despite its use of the idiosyncratic Graffiti alphabet and the time required to master it. Based on these reactions and the similarly idiosyncratic style of gesturing in the two systems, we believe the Music Notepad would achieve similar acceptance.

Accurate placement of symbols

Some music notation symbols, such as noteheads, must be accurately positioned. If the user specifies a position with a single mouse sample (e.g., by pointing and clicking), we found they often misplace the notehead. Since users want staffs to be drawn at a standard printed size, the spacing between staff lines is relatively small. Consequently, as predicted by Fitts's law, users have increasing difficulty placing notes accurately as they work faster. Each time a note is misplaced, the user must perform at least one additional editing operation to correct the mistake.
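Fitts's law makes this speed-accuracy trade-off concrete. In the standard Shannon formulation (a general HCI result, supplied here for context rather than quoted from this paper), the time to move to a target of width W at distance D is

    MT = a + b \log_2\left(\frac{D}{W} + 1\right)

where a and b are empirically fitted device constants. A notehead's effective target height W is on the order of half the staff-line spacing, so standard-sized staffs yield a high index of difficulty \log_2(D/W + 1); attempting to shorten MT by working faster then necessarily costs accuracy.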
In response to our early user experiences with gestures like those of Buxton (see Figure 1), we developed an alternate method for entering notes. With this method, users position a notehead by "scribbling in" a gesture that looks like a notehead. There are several differences in this method of entering notes. First, the gesture is more accurate than the point-and-click approach because the position of the note is specified by the average position of the multiple samples defining the scribbled notehead. Second, the gesture maps directly to how a notehead is drawn on paper. Third, since the gesture can act as the image for a notehead, we can avoid the distraction of replacing the drawn gesture with a different image. Last, this technique can be slower than the point-and-click technique, and it does not convey the note duration.
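Averaging is what makes this placement robust: the centroid of many stylus samples has far less variance than any single click. A minimal sketch of that placement step follows; the function name, the (x, y) sample format, and the staff-spacing snapping parameter are our assumptions, since the paper states only that the note position is the average of the samples.

    def place_scribbled_notehead(samples, staff_spacing=8.0):
        """Position a notehead at the centroid of the scribble's samples,
        snapped to the nearest staff line or space.

        `samples` is a list of (x, y) stylus points; `staff_spacing` is an
        assumed distance in pixels between adjacent staff lines.
        """
        n = len(samples)
        cx = sum(x for x, _ in samples) / n
        cy = sum(y for _, y in samples) / n
        # Lines and the spaces between them sit on a half-spacing grid.
        step = staff_spacing / 2.0
        snapped_y = round(cy / step) * step
        return cx, snapped_y

    # A noisy scribble still lands on the intended line:
    scribble = [(100 + dx, 41 + dy) for dx in range(6) for dy in (-2, 0, 2)]
    print(place_scribbled_notehead(scribble))  # -> (102.5, 40.0)

The snap-to-grid step is our addition for illustration; the averaging of the samples is the mechanism the paper itself describes.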
Visual Representation

In addition to supporting what-you-see-is-what-you-get (WYSIWYG), the Music Notepad also supports what-you-see-is-what-you-entered (WYSIWYE). The goal of WYSIWYE is to minimize the time a user spends understanding the effects of an action. There are two instances of WYSIWYE in the Music Notepad: sketchy noteheads and delayed auto-formatting.

Sketchy noteheads: When notes are entered by sketching a notehead, the notehead is represented by exactly the line the user sketched instead of replacing the line with a perfect notehead.

Delayed auto-formatting: In our evaluation of existing music software systems, we found that each one reformats some subset of the notation every time a new symbol is created. While automatic reformatting is a useful feature, it can also be distracting, especially since the look of the document is often irrelevant during informal music entry. In response to the reaction of some users, we delay this automatic formatting until the user specifically requests it.

A major issue with WYSIWYE is ensuring that gestures are initially interpreted correctly. If they are not, the result may be reformatting errors that must be painstakingly located by the user. Filled and hollow notes, for example, can be difficult to distinguish from a sketched notehead, even for humans. We think it may be effective to incorporate feedback for marking ambiguous notes, similar to Microsoft Word's "squiggly underline" technique for highlighting misspelled words.

Marking Menus and Direct Input

There are many advantages to a direct-draw environment; however, one disadvantage is that the user's hand can block part of the display. This is a problem when using marking menus in the traditional manner because candidate menu items are obstructed by the user's hand. We propose two solutions to this problem. First, allow the user to lift their hand and the stylus tip from the tablet to view the choices of the radial menu; after viewing the choices, the user can select an item by touching it with the stylus tip or cancel the operation with a different gesture. The second solution is to not display items underneath the user's hand. This requires sensing where the user's hand is, which might be accomplished with a Wacom tablet by using the data that reports the orientation of the stylus.

Mouse versus Stylus Input

To validate the need for a stylus-based gestural interface, we also prototyped our system using a three-button mouse for input. We found that although some aspects of the mouse-based interface proved beneficial (e.g., the user's hand does not occlude the display), users considered most gestural interactions to be more difficult. Users had particular difficulty drawing gestures with the mouse that involved curved lines, especially handwriting and lassoing. Despite these difficulties, users found that the character of pencil and paper sketching was still preserved.

FUTURE WORK

There are many areas of future work for the Music Notepad:

- approaches to learning the gestural interface
- accounting for many different styles of drawn gestures
- more extensive use of natural speech
- applying the framework to other 2D applications
- incorporating a MIDI keyboard interface and voice input
- greater music functionality in order to perform user studies: supporting multiple voices per staff, visual management of the score, and better playback that incorporates notation and dynamics
- user studies comparing the system with traditional alternatives (e.g., paper and pencil, and Finale)
- applying the system to a specific area of music, e.g., jazz

CONCLUSIONS

The Music Notepad brings a paper-and-pencil look-and-feel to a powerful computer music notation engine. Thus users can informally jot down music as well as edit, professionally format, and synthesize it. Although the system is still incomplete, it has benefited from multiple stages of formative evaluation and development. Musicians and composers provided feedback that was used to redesign our gesture recognition algorithms and to improve access to the available functionality.

ACKNOWLEDGMENTS

This work is supported in part by the NSF Graphics and Visualization Center, Advanced Networks and Services, Alias/Wavefront, Autodesk, Microsoft, Sun Microsystems, and TACO.

REFERENCES

1. Ammer, C., "The HarperCollins Music Dictionary," New York: Harper Perennial, 1991.
2. Buxton, W., Sniderman, R., Reeves, W., Patel, S., and Baecker, R., "The Evolution of the SSSP Score Editing Tools," Computer Music Journal, 3(4), pp. 14-25, 1979.
3. Cakewalk Pro Audio, Cakewalk, Inc., http://www.cakewalk.com/.
4. Calligrapher, ParaGraph International, Inc., http://www.paragraph.com/.
5. "Common Music Notation," a free western music notation package written in Common Lisp, Stanford University Center for Computer Research in Music and Acoustics (CCRMA), http://ccrma-www.stanford.edu/.
6. Finale, Coda Music Technology, http://www.codamusic.com/.
7. The In-Cube User Guide, available from Command Corporation, Inc., Atlanta, GA, 1997.
8. Johnston, M., Cohen, P. R., McGee, D., Oviatt, S. L., Pittman, J. A., and Smith, I., "Unification-based Multimodal Integration," 35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics, Madrid, Spain, July 7-12, 1997.
9. Kurtenbach, G. and Buxton, W., "User Learning and Performance with Marking Menus," In Proceedings of ACM CHI '94 Conference on Human Factors in Computing Systems, pp. 258-264, 1994.
10. MacKenzie, S., Sellen, A., and Buxton, W., "A Comparison of Input Devices in Elemental Pointing and Dragging Tasks," ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 161-166, 1991.
11. Mackinlay, J. D., Robertson, G. G., and Card, S. K., "The Perspective Wall: Detail and Context Smoothly Integrated," In Proceedings of ACM CHI '91 Conference on Human Factors in Computing Systems, pp. 173-179, 1991.
12. Maxwell, J. T. and Ornstein, S. M., "Mockingbird: A Composer's Amanuensis," available as a technical report from Xerox Corporation, January 1983.
13. McGrain, M., "Music Notation," Berklee Press Publications, Milwaukee, 1986.
14. Personal conversation with Keith Hamel, Opus1 Music Software, http://debussy.music.ubc.ca/opus1/.
15. Rekimoto, J., "Pick-and-Drop: A Direct Manipulation Technique for Multiple Computer Environments," In Proceedings of UIST '97, pp. 31-39, October 1997.
16. Rubine, D., "Specifying Gestures by Example," In Proceedings of ACM SIGGRAPH '91, pp. 329-337, July 1991.
17. Rubine, D., The Automatic Recognition of Gestures, PhD Thesis, School of Computer Science, Carnegie Mellon University, December 1991.
18. Vo, M. T. and Wood, C., "Building an Application Framework for Speech and Pen Input Integration in Multimodal Learning Interfaces," In Proceedings of ICASSP '96, Atlanta, GA, May 1996.