Notes #9 --- Direct Access Files

Direct access files allow rapid look-up of a single keyed record (typically O(1) average cost). The main issue is the method for converting the key to an address. There are three families of schemes:

  1. Use the key as the address.
  1. Compute the address using the key --- hash tables.
  1. Use lookup --- dictionary techniques.

 

Questions:

Given a file of n records from a space of N possible keys, and assuming that

  1. any address computations are unit time,
  2. keys can be stored with BF = m, and
  3. the hash table has c cells with blocking factor b:

Additional considerations:

  1. Direct vs. indirect storage: store pointers to data rather than data.
  1. Reorganization

Modifications of hash tables.

  1. Perfect hashing.
  1. Dynamic hashing.

 

HOMEWORK.

Consider a disk with sector size 256 bytes, and a data set of records of 2788 bytes, with possible keys 0 to 999, where records with the following keys are received, and are to be inserted in an empty structure, in order:

87, 214, 711, 62, 500, 915, 443, 459, 902,

817, 715, 377, 91, 350, 882, 266, 281.

  1. Suppose we want to use relative record addressing. How many sectors are needed, if
  1. Suppose we want to use a dictionary, where each dictionary record takes 32 bytes, and the dictionary are maintained in key order. Now suppose the keys of problem #1 are stored in the dictionary.
  1. Enter these keys in a hash table of size 29, using probe = key mod 29 as primary key, and resolving collisions using
  1. Now consider the following (single) sequence of accesses to the table of problem #3.

insert 451, insert 606, find 91, insert 817,

find 300, delete 459, insert 727, insert 701,

delete 4, insert 973, insert 240, find 843

 

Answers. \\Mathsci\sys\PROF\MARLOWTO\COURSES\C2113\C2113N9A.DOC

PROGRAMMING

Preferably in C (but optionally, for less credit, in Pascal) write program(s) to manage the hash table in parts (c) and (d). The table will be an array of size 29 of records/structures consisting of a key, a pointer to a data record (which we won't actually use), and some flags (e.g., for initialization or delete). For external chaining, the structure will need an additional pointer for overflow. You may work in teams.