Lecture 12: File Systems
Geoffrey M. Voelker
November 12, 2001
CSE 120 – Lecture 12 – File Systems

Last time we talked about the physical characteristics of disks and their performance
Today we’ll talk about file systems
    Files
    Directories
    Sharing
    Protection
    File System Layouts
    File Buffer Cache
    Read Ahead

File systems
    Implement an abstraction (files) for secondary storage
    Organize files logically (directories)
    Permit sharing of data between processes, people, and machines
    Protect data from unwanted access (security)

A file is data with some properties
    Contents, size, owner, last read/write time, protection, etc.
A file can also have a type
    Understood by the file system
        » Block, character, device, portal, link, etc.
    Understood by other parts of the OS or runtime libraries
        » Executable, dll, source, object, text, etc.
A file’s type can be encoded in its name or contents
    Windows encodes type in name
        » .com, .exe, .bat, .dll, .jpg, etc.
    Unix encodes type in contents
        » Magic numbers, initial characters (e.g., #! for shell scripts)

Unix
    Directories implemented in files
        Use file ops to create dirs
    C runtime library provides a higher-level abstraction for reading directories
        opendir(name)
        readdir(DIR)
        seekdir(DIR)
        closedir(DIR)
NT
    Explicit dir operations
        CreateDirectory(name)
        RemoveDirectory(name)
    Very different method for reading directory entries
        FindFirstFile(pattern)
        FindNextFile()

Let’s say you want to open “/one/two/three”
What does the file system do?
    Open directory “/” (well known, can always find)
    Search for the entry “one”, get location of “one” (in dir entry)
    Open directory “one”, search for “two”, get location of “two”
    Open directory “two”, search for “three”, get location of “three”
    Open file “three”
Systems spend a lot of time walking directory paths
    This is why open is separate from read/write
    OS will cache prefix lookups for performance
        » /a/b, /a/bb, /a/bbb, etc., all share “/a” prefix

File sharing has been around since timesharing
    Easy to do on a single machine
    PCs, workstations, and networks get us there (mostly)
File sharing is incredibly important for getting work done
    Basis for communication and synchronization
Two key issues when sharing files
    Semantics of concurrent access
        » What happens when one process reads while another writes?
        » What happens when two processes open a file for writing?
    Protection

File systems implement some kind of protection system
    Who can access a file
    How they can access it
More generally…
    Objects are “what”, subjects are “who”, actions are “how”
A protection system dictates whether a given action performed by a given subject on a given object should be allowed
    You can read and/or write your files, but others cannot
    You can read “/etc/motd”, but you cannot write it

Access Control Lists (ACL)
    For each object, maintain a list of subjects and their permitted actions
Capabilities
    For each subject, maintain a list of objects and their permitted actions

                /one    /two    /three
    Alice       rw      -       rw
    Bob         w       -       r
    Charlie     w       r       rw

    Rows are subjects, columns are objects
    A column is an ACL, a row is a capability

The approaches differ only in how the table is represented
    What approach does Unix use?
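
The point that the two approaches differ only in representation can be sketched in a few lines of Python. This is a minimal illustration, not how any real file system stores protection data: the function names (as_acls, as_capabilities, allowed_acl, allowed_cap) are hypothetical, and the permission values are illustrative. The same access matrix is grouped once by object (ACLs) and once by subject (capabilities), and both answer every access check identically.

```python
# Access matrix: (subject, object) -> set of permitted actions.
# The entries are illustrative values, not taken from a real system.
MATRIX = {
    ("alice", "/one"): {"r", "w"},
    ("alice", "/three"): {"r", "w"},
    ("bob", "/one"): {"w"},
    ("bob", "/three"): {"r"},
    ("charlie", "/one"): {"w"},
    ("charlie", "/two"): {"r"},
    ("charlie", "/three"): {"r", "w"},
}


def as_acls(matrix):
    """Group by object: each object keeps its subjects and their actions."""
    acls = {}
    for (subject, obj), actions in matrix.items():
        acls.setdefault(obj, {})[subject] = set(actions)
    return acls


def as_capabilities(matrix):
    """Group by subject: each subject keeps its objects and their actions."""
    caps = {}
    for (subject, obj), actions in matrix.items():
        caps.setdefault(subject, {})[obj] = set(actions)
    return caps


def allowed_acl(acls, subject, obj, action):
    """Check access by looking at the object's list."""
    return action in acls.get(obj, {}).get(subject, set())


def allowed_cap(caps, subject, obj, action):
    """Check access by looking at the subject's list."""
    return action in caps.get(subject, {}).get(obj, set())


acls = as_acls(MATRIX)
caps = as_capabilities(MATRIX)
print(allowed_acl(acls, "bob", "/three", "r"), allowed_cap(caps, "bob", "/three", "r"))
print(allowed_acl(acls, "bob", "/three", "w"), allowed_cap(caps, "bob", "/three", "w"))
```

In this sketch, removing everyone's access to one object means deleting a single entry of acls, while the capability form requires visiting every subject's list.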
Capabilities are easier to transfer
    They are like keys: they can be handed off, and access does not depend on the subject
In practice, ACLs are easier to manage
    Object-centric, easy to grant, revoke
    To revoke capabilities, have to keep track of all subjects that have the capability, which is a challenging problem
ACLs have a problem when objects are heavily shared
    The ACLs become very large
    Use groups (e.g., Unix)

Applications exhibit significant locality for reading and writing files
Idea: Cache file blocks in memory to capture locality
    This is called the file buffer cache
    Cache is system wide, used and shared by all processes
    Reading from the cache makes a disk perform like memory
    Even a 4 MB cache can be very effective
Issues
    The file buffer cache competes with VM (tradeoff here)
    Like VM, it has limited size
    Need replacement algorithms again (LRU usually used)

On a write, some applications assume that data makes it through the buffer cache and onto the disk
    As a result, writes are often slow even with caching
Several ways to compensate for this
    “write-behind”
        » Maintain a queue of uncommitted blocks
        » Periodically flush the queue to disk
        » Unreliable
    Battery backed-up RAM (NVRAM)
        » As with write-behind, but maintain queue in NVRAM
        » Expensive
    Log-structured file system
        » Always write next block after last block written
        » Complicated

Many file systems implement “read ahead”
    FS predicts that the process will request next block
    FS goes ahead and requests it from the disk
    This can happen while the process is computing on previous block
        » Overlap I/O with execution
    When the process requests block, it will be in cache
    Complements the disk cache, which also is doing read ahead
For sequentially accessed files, can be a big win
    Unless blocks for the file are scattered across the disk
    File systems try to prevent that, though (during allocation)

Files
    Operations, access methods
Directories
    Operations, using directories to do path searches
Sharing
Protection
    ACLs vs. capabilities
File System Layouts
    Unix inodes
File Buffer Cache
    Strategies for handling writes
Read Ahead

Read Sections 6.3.8 and 5.4.1
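
As a rough illustration of the file buffer cache with LRU replacement described earlier in the lecture, here is a toy Python sketch. It is an assumption-laden model, not a real kernel design: the class name BufferCache, the read_block callback that stands in for the disk, and the fake_disk helper are all hypothetical, and the cache counts blocks rather than bytes.

```python
from collections import OrderedDict


class BufferCache:
    """Toy block cache with LRU replacement (hypothetical names throughout)."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity        # maximum number of cached blocks
        self.read_block = read_block    # block number -> bytes; stands in for the disk
        self.blocks = OrderedDict()     # block number -> data, least recently used first
        self.hits = 0
        self.misses = 0

    def read(self, block_no):
        if block_no in self.blocks:
            # Hit: serve from memory and mark the block most recently used.
            self.hits += 1
            self.blocks.move_to_end(block_no)
            return self.blocks[block_no]
        # Miss: go to the "disk", then cache the block.
        self.misses += 1
        data = self.read_block(block_no)
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)   # evict the least recently used block
        return data


disk_reads = []


def fake_disk(block_no):
    """Pretend disk: records each access and returns dummy contents."""
    disk_reads.append(block_no)
    return b"block-%d" % block_no


cache = BufferCache(capacity=2, read_block=fake_disk)
for n in (0, 1, 0, 2, 1):
    cache.read(n)
print(disk_reads, cache.hits, cache.misses)
```

With a capacity of two blocks, the second read of block 0 is served from memory, reading block 2 evicts block 1 (the least recently used), and the final read of block 1 has to go back to the disk.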