Notes 7
Secondary Memory Devices --- Tapes, Disks, and Drums
TABLE: Overview of Storage Media
Organization of a Magnetic Tape:
- Tapes are suitable only for sequential files.
- Magnetic tapes for use as secondary memory typically have nine tracks: eight data tracks and
one parity track.
- Older tapes may have only eight tracks total.
- (Other layouts are possible, but at best uncommon.)
- In each location of the tape:
- Eight data tracks store a byte of data.
- The ninth track, the parity track, stores a 0 if the number of 1's in the data tracks is even
and a 1 if it is odd (assuming even parity --- with odd parity, the situation is reversed).
- There may also be a "longitudinal" or "horizontal" parity bit at the end of each block, or the
end of the tape.
- Physical organization:
- The tape has physical markers at both start (load-point marker) and end (end-of-tape
marker), but is otherwise physically homogeneous.
- There is a dedicated block at the start of the tape for volume label and similar information,
and there may be a label block at the end with directory information.
- The rest of the tape consists of a number of files (plus possibly some unused space, usually at
the end).
- File organization:
- Each file in turn consists of a file header and file trailer label block (see Miller for contents),
with alternating data blocks and gaps.
- The gaps allow the tape to stop and restart after reading a block, by inserting unused space
which will be traversed as the tape slows down, and then speeds back up to operating speed.
- In the normal case, in which the data blocks are of uniform length, no block-header
information is needed.
- It is also possible to have variable-length blocks at the cost of additional overhead (a header
per block).
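The parity-track rule described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the longitudinal check is assumed here to be one even-parity bit per data track (the XOR of all bytes in the block), which is one common scheme but is not spelled out in these notes.

```python
def parity_bit(byte: int, even: bool = True) -> int:
    """Bit for the parity track, given the byte held on the eight data tracks."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")
    ones = bin(byte).count("1")      # number of 1's across the data tracks
    bit = ones % 2                   # even parity: 0 if that count is even, else 1
    return bit if even else 1 - bit  # odd parity reverses the bit


def longitudinal_parity(block: bytes) -> int:
    """Assumed scheme: one even-parity bit per data track (XOR of all bytes)."""
    check = 0
    for b in block:
        check ^= b
    return check


print(parity_bit(0b10110000))              # three 1's -> 1 under even parity
print(parity_bit(0b10110000, even=False))  # -> 0 under odd parity
```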
Tape parameters:
- A tape has the following parameters:
- space parameters:
- density (bytes/inch)
- gap size (inches, typically 0.5-1.0)
- length (inches)
- volume header/trailer size (bytes*)
- file header/trailer size (bytes*)
- time parameters:
- transfer rate (inches/sec)
- stop time
- start time
- fly time (secs)
- other parameters:
- parity (even/odd*)
These can be used for time and space computations as indicated in both texts. (We will ignore the
factors labelled * in these computations.)
The stop time is the time it takes for the tape to slow from running speed to a stop; the
start time is the time it takes to speed back up from a stop to running speed. If the tape
stops between each pair of blocks, accessing a block requires a start, the transfer of the data,
and a stop. The fly time is the time it takes to cross the inter-block gap at running speed,
possibly as a result of a skip command (see below).
A given homogeneous file on a given tape also has the following derived parameters:
- blocking factor (integer)
- block size (bytes)
- data density (percent)
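As a rough sketch of how these parameters combine, the following Python computes the derived quantities for a homogeneous file. It assumes the tape speed is given in inches/sec (as the transfer rate is above), ignores the starred factors, and takes the fly time to be gap size divided by speed. The class, the function names, and the sample numbers are all hypothetical; the texts may define the computations slightly differently.

```python
from dataclasses import dataclass


@dataclass
class Tape:
    density: float     # bytes per inch
    gap: float         # inter-block gap, inches
    length: float      # usable tape length, inches
    speed: float       # inches per second at running speed
    start_time: float  # seconds to reach running speed from a stop
    stop_time: float   # seconds to stop from running speed


def block_inches(tape: Tape, record_size: int, blocking_factor: int) -> float:
    """Length of tape occupied by one data block."""
    return record_size * blocking_factor / tape.density


def data_density(tape: Tape, record_size: int, blocking_factor: int) -> float:
    """Fraction of the tape holding data rather than gaps."""
    b = block_inches(tape, record_size, blocking_factor)
    return b / (b + tape.gap)


def blocks_that_fit(tape: Tape, record_size: int, blocking_factor: int) -> int:
    """Whole (block + gap) units that fit on the tape."""
    b = block_inches(tape, record_size, blocking_factor)
    return int(tape.length // (b + tape.gap))


def stop_start_seconds(tape: Tape, n_blocks: int, record_size: int,
                       blocking_factor: int) -> float:
    """Each block costs a start, the data transfer, and a stop."""
    b = block_inches(tape, record_size, blocking_factor)
    return n_blocks * (tape.start_time + b / tape.speed + tape.stop_time)


def streaming_seconds(tape: Tape, n_blocks: int, record_size: int,
                      blocking_factor: int) -> float:
    """Tape never stops: each block costs its transfer plus one fly time."""
    b = block_inches(tape, record_size, blocking_factor)
    fly_time = tape.gap / tape.speed
    return n_blocks * (b / tape.speed + fly_time)


# Hypothetical tape: 1600 bytes/inch, 0.5 inch gaps, 2400 feet, 100 inches/sec.
tape = Tape(density=1600, gap=0.5, length=28800, speed=100,
            start_time=0.005, stop_time=0.005)
print(data_density(tape, 100, 16))
print(blocks_that_fit(tape, 100, 16))
print(stop_start_seconds(tape, 1000, 100, 16))
print(streaming_seconds(tape, 1000, 100, 16))
```

With 100-byte records and a blocking factor of 16, each block is one inch long, so a third of the tape is gap. Raising the blocking factor improves the data density at the cost of larger buffers.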
Disks and Drums:
In a very loose sense, the difference between a disk and a drum is the difference between the
early Edison cylinders, and more recent records or CDs.
- Data, however, is organized in concentric (for disks) or parallel (for drums) regions
("tracks"), not as a single thread of data as in either of the two musical media.
- If the medium were organized in the latter fashion, disks and drums would, like tapes, largely
be capable of providing only sequential files and processing.
- The tracks themselves are frequently formatted into sectors of fixed size.
- There is, moreover, one other key difference between a disk and a phonograph or CD
platter:
- Both old-fashioned Edison cylinders and drums have only a single surface, the outside of the
cylinder.
- Phonograph records and music CDs are single-platter media. Records use both sides, while
CDs currently use one side only.
- "floppy" disks and portable CDs tend to have one physical platter, and either one or two
usable surfaces.
- Disks in general consist of a set of parallel platters connected by a central post (see
illustrations in texts), often as many as 10.
- In most cases, both sides ("surfaces") of a platter can be written.
- In disks with a large number of platters, frequently one or both of the external surfaces are
not written for reasons of safety.
- For similar reasons, the outer few tracks are likewise not written.
- Even optical (CD-related) media standardly used both sides of a platter.
- Since an inner track has less area than an outer track, but tracks are all of the same width and
must hold the same amount of data, the further toward the center of the disk, the higher the data
density.
- This (more so than even the need to support the central post) renders the inner third or so of
each surface unusable.
- Likewise, we can see why platters are not extended forever --- the data density toward the
rim would be exceptionally low.
- Drums, on the other hand, have uniform density --- one of their attractions.
- There are typically 50--1000 usable tracks on a surface of a disk (the number on a drum is
more variable).
- Frequently several tracks (usually the outermost or innermost data tracks) are not part of
basic addressable space, but can be used to remap bad blocks or bad tracks.
- One track may also be set aside for system information.
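The point about data density rising toward the center of a disk can be checked with a little arithmetic. The sketch below assumes every track holds the same number of bytes and treats a track as a circle of circumference 2*pi*r; the function name and the sample radii are hypothetical.

```python
import math


def linear_density(bytes_per_track: int, radius_inches: float) -> float:
    """Bytes per inch along a circular track of the given radius."""
    return bytes_per_track / (2 * math.pi * radius_inches)


# Same amount of data on a track 2 inches and a track 6 inches from the center.
inner = linear_density(10_000, 2.0)
outer = linear_density(10_000, 6.0)
print(inner / outer)  # the inner track is about three times as dense
```

Density is inversely proportional to radius, so a track at one third the radius must be recorded three times as densely. On a drum every track has the same radius, so the density is uniform.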
Fixed and Movable disks and read/write heads:
- Information on a disk or a drum is accessed by a read-write head, located a small
distance away from the track containing the data.
- Disks and drums are either fixed or (re)movable --- the name reflects a number of related
design issues.
- Movable disks typically have one read-write head per surface, which must be moved to the
appropriate track (inward toward the center or out toward the edge) to access data.
- The read-write arm typically moves all of the heads at once.
- Since the arm has been designed to move, it's relatively easy to design in the capacity to
swing the arm free, allowing the disk to be removed.
- The downside is that moving the arm takes a significant amount of time.
- In addition, it's in general still not as easy to remove and store or transport a multi-platter
disk as for a floppy disk or a tape, which is one significant reason why tapes are still in wide use.
- Fixed disks/drums, in contrast, have one read-write head per track.
- The read-write arm doesn't move at all once the disk is installed, and it's a major job to
remove a fixed disk.
- Fixed disks tend to have fewer platters than movable disks.
- For drums, data density is constant (why?), and the drum can pretty easily be removed
regardless of the arrangement of read-write heads (again, why?).
- Nonetheless, drums may not use a fixed arm, but instead a movable arm with a number of
heads capable of reading consecutive tracks, so that the head has to move less frequently.
- Drums are probably used less frequently than they were 15 years ago.
Use of disks and drums:
- All three of these alternatives (drum, fixed disk, removable disk) support all of the file
organizations we have discussed, and are subject to roughly the same constraints.
- For speed and ease, the general rule is: drum beats fixed beats removable, but of course
costs and/or physical space are in reverse order.
- In general:
- Drums would be used for moderately sized files which need frequent and quick access, and
which change only slowly and without changing much in size (such as systems files or compilers).
- Fixed disks are used for long-lived applications, for files for which access time is significant.
- Removable disks are used when access time is less important, for very large files, or when an
application will not be accessed as frequently (allowing for the possibility of changing disks).
Cylinders:
- On movable disks, the cost of moving the arms is large enough that it is preferable to read as
much data as possible from the same track location.
- Thus, the preferred layout for a file is not as consecutive tracks on a single surface, but in the
same track location on adjacent surfaces.
- A cylinder on a disk consists of the same track on each surface; a sector on a disk is
addressed as
(cylinder, track, sector),
where the track number selects a surface within the cylinder.
- (This is still true for fixed disks, but the gain is not so dramatic. Since drums have only a
two-dimensional data space, there is nothing like a cylinder on a drum.)
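To illustrate why this addressing order keeps a file within a cylinder, here is a minimal sketch that maps a file's consecutive sector numbers to (cylinder, track, sector) addresses. It assumes zero-based numbering and that a cylinder is filled completely before moving to the next; the function names are hypothetical.

```python
def to_address(sector_no: int, tracks_per_cylinder: int,
               sectors_per_track: int) -> tuple[int, int, int]:
    """Map a linear sector number to (cylinder, track, sector)."""
    sectors_per_cylinder = tracks_per_cylinder * sectors_per_track
    cylinder, rest = divmod(sector_no, sectors_per_cylinder)
    track, sector = divmod(rest, sectors_per_track)
    return cylinder, track, sector


def to_sector_no(cylinder: int, track: int, sector: int,
                 tracks_per_cylinder: int, sectors_per_track: int) -> int:
    """Inverse mapping, back to the linear sector number."""
    return (cylinder * tracks_per_cylinder + track) * sectors_per_track + sector


# 4 surfaces (tracks per cylinder), 10 sectors per track.
print(to_address(9, 4, 10))   # (0, 0, 9): last sector of the first track
print(to_address(10, 4, 10))  # (0, 1, 0): next surface, same cylinder, no seek
print(to_address(40, 4, 10))  # (1, 0, 0): only now does the arm have to move
```

The arm moves only when the cylinder number changes, which here happens once every 40 sectors rather than once every 10.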
Fixed sectors versus blocks:
- On most disks, the tracks are organized into a fixed number of sectors of fixed size.
- On the plus side, this allows for easy hardware allocation, addressing, and synchronization.
- On the minus side, we've already seen how the difference between sector size and record size
can result in wasted space, can require a record to be split across multiple sectors, or can require
blocking.
- The first merely entails a space overhead for storing a file, but the latter two will typically
require software handling.
- In addition, fixed size sectors and blocking are not necessarily optimal for data bases of
records widely varying in size.
- The alternative is to allow for variable-sized blocks in a track. There are two possibilities:
- Let each application define a block size, but have all blocks in the file be of the same size.
- The size is declared via program declarations or using a job control language.
- The file header consists of a home address area, and a header record, and
stores information on the block size, number of records, etc.
- The remaining blocks each include a distinguishable address marker which won't be
used in the rest of the file; a count area, which gives the format and address of the block;
an optional key area, specifying the length of the key area (if the file is keyed); and a
data area, consisting of the file's proper data and then the keys.
- There are gaps between areas, and between consecutive blocks; these are long enough
for stop-start, and also synchronize the next field to an accessible boundary. See the illustration
of file layout.
- This description assumes the blocks themselves are unblocked; blocking would complicate
the description.
- The alternative is to let the blocks in a file actually have different lengths. This will
significantly complicate the software access, and really makes sense only if the record can be
broken into several pieces, where each indicates whether another piece is coming or not (this is
almost like the nil/non-nil distinction for linked lists), or if several records of possibly
unknown length can be fit into a single block of fixed length.
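The space cost of a mismatch between sector size and record size, mentioned at the start of this section, can be made concrete with a short sketch. It covers two of the cases: several whole records blocked into one sector, and one record split across several sectors. The function names and sizes are hypothetical.

```python
def records_per_sector(record_size: int, sector_size: int) -> tuple[int, int]:
    """Blocking without splitting: (whole records per sector, bytes wasted)."""
    if record_size > sector_size:
        raise ValueError("record does not fit in one sector; it must be split")
    count = sector_size // record_size
    return count, sector_size - count * record_size


def sectors_per_record(record_size: int, sector_size: int) -> tuple[int, int]:
    """Splitting: (sectors needed, bytes unused in the last sector)."""
    count = (record_size + sector_size - 1) // sector_size  # integer ceiling
    return count, count * sector_size - record_size


print(records_per_sector(100, 512))   # (5, 12): five records, 12 bytes wasted
print(records_per_sector(300, 512))   # (1, 212): over 40% of each sector wasted
print(sectors_per_record(1200, 512))  # (3, 336): record split over three sectors
```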
Space and time parameters for a disk:
- Space parameters:
- surfaces
- tracks/surface
- sectors/track
- bytes/sector
- Equivalently, but changing the order of addressing: tracks/cylinder, cylinders, sectors/track,
bytes/sector
- Time parameters:
- seek time
- time to move the read/write arm
- zero for fixed disks/drums
- worst case, average, and best-case all of interest
- best-case time represents switching to an adjacent cylinder
- activation time
- time to select the proper head
- usually negligible
- revolutions/sec
- equivalently, time per revolution (msec)
- transfer rate
- latency
- the time before the data comes under the active read/write head
- half a revolution on average (msec)
- varies from 0 to one full revolution.
(See the text on space and time computations.)
Note that we can assume that most accesses to successive cylinders for a file have only a
best-case seek time cost, since these cylinders will typically be adjacent.
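The parameters above combine into capacity and access-time estimates roughly as follows. This is a sketch under stated assumptions, not the texts' exact method: activation time is ignored as negligible, data is assumed to transfer at the speed it passes under the head, and the class name, method names, and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    surfaces: int
    tracks_per_surface: int
    sectors_per_track: int
    bytes_per_sector: int
    revs_per_sec: float
    avg_seek: float  # seconds; zero for fixed-head disks and drums

    def capacity(self) -> int:
        """Total bytes: surfaces x tracks x sectors x bytes."""
        return (self.surfaces * self.tracks_per_surface
                * self.sectors_per_track * self.bytes_per_sector)

    def revolution_time(self) -> float:
        return 1.0 / self.revs_per_sec

    def avg_latency(self) -> float:
        """Half a revolution on average."""
        return self.revolution_time() / 2

    def transfer_time(self, n_sectors: int) -> float:
        """Time for n consecutive sectors on one track to pass under the head."""
        return self.revolution_time() * n_sectors / self.sectors_per_track

    def random_access_time(self, n_sectors: int = 1) -> float:
        """Average seek + average latency + transfer."""
        return self.avg_seek + self.avg_latency() + self.transfer_time(n_sectors)


disk = Disk(surfaces=10, tracks_per_surface=400, sectors_per_track=32,
            bytes_per_sector=512, revs_per_sec=60, avg_seek=0.030)
print(disk.capacity())                   # 65536000 bytes
print(disk.avg_latency() * 1000)         # average latency in msec
print(disk.random_access_time() * 1000)  # one random sector, in msec
```

For these sample values the seek dominates: about 30 msec of seek against roughly 8 msec of latency and half a msec of transfer.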
Synchronization and hopscotching:
- Under appropriate circumstances, most of the latency cost for sequential access can be
eliminated through use of gaps or careful placement of records.
- On sector-organized disks/drums, one can make an assumption on typical processing time
(say, that it is almost always going to be less than twice the I/O time), and store data in such a
way that gaps are not needed.
- In this case, we could store consecutive blocks in every third sector.
- By the time the current buffer has been processed, and new data is needed, the sector with
the new block is just coming under the read/write head.
- Advantages are: get space efficiency by eliminating gaps; allow chain reads by trading off
processing time and latency.
- Disadvantages include: the need for a more sophisticated addressing scheme; greater need
for application information.
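The every-third-sector placement can be sketched as a layout table for one track. The function below is a hypothetical illustration: it places consecutive logical blocks `step` sectors apart, and when the target sector is already taken (which happens when the step divides the track size) it moves on to the next free sector.

```python
def hopscotch_layout(sectors_per_track: int, step: int) -> list[int]:
    """Return a list where entry p is the logical block stored in physical sector p."""
    layout = [-1] * sectors_per_track  # -1 marks a sector not yet assigned
    pos = 0
    for logical in range(sectors_per_track):
        while layout[pos] != -1:       # slot taken: slide to the next free one
            pos = (pos + 1) % sectors_per_track
        layout[pos] = logical
        pos = (pos + step) % sectors_per_track
    return layout


print(hopscotch_layout(8, 3))  # [0, 3, 6, 1, 4, 7, 2, 5]
print(hopscotch_layout(8, 1))  # [0, 1, 2, 3, 4, 5, 6, 7]: ordinary layout
```

With a step of 3, two sectors pass under the head while the program processes the block it has just read, and the next logical block arrives just as it is needed. The whole track is read in about three revolutions, instead of roughly one revolution per block when consecutive blocks sit in adjacent sectors and each is missed.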
Multi-processing:
- The above assumptions on synchronization and on seek time between cylinders (and in fact
time computations in general) are invalidated if multiple processes or users need to access
different parts of the disk at the same time.
- While all operating systems will prevent another access to a device currently in use, it is
possible that another application's requests will intervene between every pair of requests from the
current application, in which case each access will require an (average-case) seek and latency.
- Some operating systems, however, will grant applications priorities, and may allow an
application to reserve a resource (most often the CPU, but sometimes other resources, particularly
for time-constrained systems) for a longer interval.
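The effect of intervening requests can be estimated with a back-of-the-envelope comparison. The sketch below contrasts two idealized extremes for reading n blocks: one application with the disk to itself, reading consecutive sectors after a single seek and latency, and the worst case above, where every access pays an average seek and latency. The function names and timings are hypothetical.

```python
def uninterrupted_seconds(n_blocks: int, seek: float, latency: float,
                          transfer: float) -> float:
    """One seek and one latency, then the blocks stream past the head."""
    return seek + latency + n_blocks * transfer


def contended_seconds(n_blocks: int, seek: float, latency: float,
                      transfer: float) -> float:
    """Another request intervenes each time: every block pays seek + latency."""
    return n_blocks * (seek + latency + transfer)


# Sample values: 30 msec average seek, 60 revolutions/sec, 32 sectors/track.
seek, latency, transfer = 0.030, 1 / 120, 1 / (60 * 32)
alone = uninterrupted_seconds(100, seek, latency, transfer)
shared = contended_seconds(100, seek, latency, transfer)
print(alone, shared, shared / alone)
```

For these sample values the contended case is tens of times slower, which is why careful layout alone cannot guarantee performance on a shared disk.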