TITLE:Back-to-Basics: Hash-test and Istat

ISSUE:Multi-value Solutions Jul '98

AUTHOR:Nathan Rector

COMPANY:Natec Systems

EMAIL:nater@northcoast.com

HTTP:www.northcoast.com/~nater/

Syntax

hash-test file.reference {itemlist} {sellist} {(options)}

test modulo:modulo

istat file.reference {itemlist} {(options)}

As a general rule, MultiValue data is stored in groups of frames, or blocks. Each frame holds a fixed number of characters, which lets the operating system use a mathematical algorithm called "hashing" to go directly to the group that holds an item instead of searching the whole file.
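To make the idea of hashing concrete, here is a minimal Python sketch. The function toy_hash and the sample item-ids are hypothetical and exist only for illustration; real MultiValue systems use their own hash functions, which this does not reproduce. It only shows the general principle that an item-id is reduced to a number and the remainder after dividing by the number of groups selects the group.

```python
def toy_hash(item_id: str, modulo: int) -> int:
    """Map an item-id to a group number in the range 0..modulo-1.

    Hypothetical hash for illustration only; not the algorithm
    used by any real MultiValue system.
    """
    total = 0
    for ch in item_id:
        # Fold each character code into a running number.
        total = total * 10 + ord(ch)
    # The remainder picks one of the available groups.
    return total % modulo


# Hypothetical item-ids spread over a file with 3 groups.
groups = {}
for item_id in ["1001", "1002", "1003", "SMITH", "JONES"]:
    groups.setdefault(toy_hash(item_id, 3), []).append(item_id)

for group_number in sorted(groups):
    print(group_number, groups[group_number])
```

Because the same item-id always produces the same group number, a read only needs to search one group rather than the whole file.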

When working with a MultiValue database, it is important to size your files correctly. A correctly sized file gives faster data access and lower system overhead. A file sized too small forces the operating system to search overflow frames to find the data you want. A file sized too large leaves many frames empty or nearly empty, which wastes disk space and slows any operation that must read the whole file.

HASH-TEST is a TCL command that can help you size your files correctly. It produces a histogram showing how many items fall in each frame of a file. The command is useful when reallocating a file because it shows how the item-ids are distributed across the file and provides a suggested modulo (the number of groups allocated to the file).

The histogram shows the number of items in each frame and the number of frames used to store those items. In the histogram, each ">" represents one item. If one frame holds a single item while another holds ten, the file is probably out of balance and needs resizing.
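The following Python sketch imitates the histogram idea described above: one line per frame, with one ">" per item. The function names histogram_lines and spread, the column width, and the sample counts are assumptions made for illustration; the real command's layout may differ.

```python
def histogram_lines(items_per_group):
    """Return one text line per group: the item count, then one '>' per item."""
    return [f"{count:5d} *{'>' * count}" for count in items_per_group]


def spread(items_per_group):
    """Difference between the fullest and the emptiest group."""
    return max(items_per_group) - min(items_per_group)


balanced = [4, 5, 4]   # hypothetical counts for a well-sized file
skewed = [1, 10, 2]    # hypothetical counts for a file out of balance

for label, counts in (("balanced", balanced), ("skewed", skewed)):
    print(label, "spread =", spread(counts))
    for line in histogram_lines(counts):
        print(line)
```

A large spread between the shortest and longest bars is the visual cue that the file may need resizing.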

HASH-TEST prompts you for a test modulo. This lets you try different modulos to find the best fit for a file.
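Trying several test modulos and comparing the results can be sketched in Python as follows. Everything here is an assumption for illustration: toy_hash is a hypothetical hash function (not the one a MultiValue system uses), the item-ids are made up, and the candidate modulos are arbitrary. The sketch only shows the kind of comparison involved, using the standard library's statistics.pstdev as one possible measure of how evenly items are spread.

```python
from statistics import pstdev


def toy_hash(item_id: str, modulo: int) -> int:
    """Hypothetical hash for illustration only."""
    total = 0
    for ch in item_id:
        total = total * 10 + ord(ch)
    return total % modulo


def distribution(item_ids, modulo):
    """Count how many item-ids land in each of the `modulo` groups."""
    counts = [0] * modulo
    for item_id in item_ids:
        counts[toy_hash(item_id, modulo)] += 1
    return counts


# Hypothetical sequential item-ids.
item_ids = [str(n) for n in range(1000, 1100)]

for modulo in (3, 7, 11, 13):  # arbitrary candidate modulos
    counts = distribution(item_ids, modulo)
    print(f"modulo={modulo:3d} min={min(counts)} max={max(counts)} "
          f"std={pstdev(counts):.2f}")
```

A candidate whose groups hold similar numbers of items (small gap between min and max, low deviation) is a better fit under this simple measure.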

If you want to use the existing modulo, use the ISTAT command instead. The ISTAT TCL command does exactly the same thing as HASH-TEST, except that it does not prompt you for a test modulo.

Options

s - Displays summary statistics only.

Example

:hash-test testfile

test modulo: 3

file= testfile modulo= 3 10:07:24 03 Mar 1992

frames  bytes  items
     1    444      1 *>
     1    772      2 *>>
     1   1742      2 *>>
     3

HASH-TEST/ISTAT Statistics

total item count = 5 byte count = 2958

avg. bytes/item = 591.6 avg. items/group = 1.6

std. deviation = .5 avg. bytes/group = 986.0

suggested modulo = 3
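The figures in the statistics block can be checked by hand from the three histogram lines. The Python sketch below repeats that arithmetic. The variable names are illustrative, and the use of a population standard deviation is an assumption: the source does not say which deviation formula the command uses, or how it rounds or truncates its output.

```python
from statistics import pstdev

# (bytes, items) for each group, copied from the example histogram.
groups = [(444, 1), (772, 2), (1742, 2)]

total_items = sum(items for _, items in groups)
total_bytes = sum(size for size, _ in groups)

avg_bytes_per_item = total_bytes / total_items
avg_items_per_group = total_items / len(groups)
avg_bytes_per_group = total_bytes / len(groups)

# Assumption: population standard deviation of items per group.
item_deviation = pstdev([items for _, items in groups])

print("total item count =", total_items)
print("byte count       =", total_bytes)
print("avg. bytes/item  =", round(avg_bytes_per_item, 1))
print("avg. items/group =", round(avg_items_per_group, 2))
print("avg. bytes/group =", round(avg_bytes_per_group, 1))
print("std. deviation   =", round(item_deviation, 2))
```

The totals and the byte averages match the report exactly. The items-per-group average works out to about 1.67 and the deviation to about 0.47, against 1.6 and .5 in the report, which is consistent with the command truncating or rounding to one decimal place.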