Appendix 6
Glossary of terms and abbreviations

list of general acronyms

For definitions of the following acronyms refer to the glossary.

AMP
Access Module Processor, Teradata's name for a storage node.
ATM
Automated Teller Machine.
DBA
Database Administrator.
DBM
DataBase Machine.
DRAT
Dynamically Reconfigurable Array of Transputers, a forerunner of IDIOMS (J. Kerridge, "A proposal for a dynamically reconfigurable array of transputers to support database applications", Proc. 7th technical occam user group, Grenoble 1987, IOS, Amsterdam).
DSDL
Data Storage Description Language, a language for defining the allocation of data to storage media.
IDIOMS
Intelligent Decision making In Online Management Systems.
MIS
Management Information Service.
OLTP
Online Transaction Processing.
ORQ
Orthogonal Range Query.
POS
Point of Sale.
SQL
Structured Query Language.
SKI
Single Key Index.
TP
Transaction Processing.
TSB
Trustee Savings Bank, collaborators in the IDIOMS project.
VRP
Value Range Partitioning.

glossary of terms

access frequency
the frequency with which a data object is accessed.
access plan
used in this thesis to mean the graph which represents the implementation of a logical query tree. For example, a single logical relational operation may be carried out on more than one processor: the query tree shows a single node for the process, whereas the access plan shows all the replicated instances of it.
associative memory device
a hardware architecture which retrieves data on the basis of the value of data items, rather than by using data location pointers. Occasionally the term is used by researchers to refer to software simulation of this.
automated teller machine
a machine which provides banking services, typically for withdrawal of funds by the user.
B+-tree
a tree structure consisting of an index set and a sequence set. The index set is a tree structured index (a B-tree), and the sequence set is a list of pointers to data buckets.
bucket
the unit of transfer between disc storage and main memory, also known as data page.
cell
a) an alternative name for a (logical) data partition b) a hardware unit of a cellular database machine.
cellular database machine
consists of a set of cells, each of which is composed of memory and a processor. There is one cell for each track of a rotating device (the term is used in a loose sense to mean disc, drum or magnetic bubble memory). If the whole database is stored on a set of these cells, then it can be searched in a single revolution. Data is physically accessed by value, rather than address. An example is the RAP machine of Ozkarahan.
clustered/clustering
the placement of records physically close together on disc, based (usually) on some attribute value(s).
condition clause
see predicate clause.
data allocation
the collective term for partitioning and placement of data on storage devices.
data dictionary
contains metadata—data about data, for example, database and query statistics, data partitioning and placement information, the machine state.
dataflow
in multi-processor database terms is used to describe a pipelined producer-consumer system. As soon as data is produced it is (usually) sent to the next process to be further processed. The use of the word should not be confused with the term as it is used to describe dataflow computer architectures (i.e. a computer which does not use an Instruction Pointer).
dataflow graph
(in multi-processor database terminology) a graph where the nodes represent processes and the arcs represent transfers of data between processes. Often used synonymously with query tree.
data fragmentation
see data partitioning.
data migration
the transfer of a data object from one partition to another.
data mining
the scanning of large amounts of data in order to extract information e.g. statistical information on ATM usage.
data partitioning
partitioning of a (possibly notional) data file or partition into a number of smaller files or partitions.
data placement
placement of data partitions on storage media.
data skew
uneven distribution of data values from the data domain.
data storage description language
a language which defines data partitioning and placement.
database administrator
person responsible for technical administration of a database.
database machine
specialised software or hardware configuration designed to manage a database system.
database management system
"… can be defined as a software package that provides all data management facilities for database creation, retrieval, manipulation, and maintenance of databases". Su, Database Computers, p20.
declustered/declustering
an alternative name for horizontally partitioned/horizontal partitioning.
DegDecl
is the degree of declustering, the Bubba team's term for the number of processors over which data is horizontally partitioned.
directory
a) the Grid File structure which dynamically maps logical cells to physical data buckets b) in general, a structure which maps logical partitions to physical media c) a structure defining the location of data objects.
distributed database
"implies several computers, each one with a DBMS managing data stored on attached permanent storage devices; a general or local network…; and some facilities to manage data across the network". C. Esculier, Distributed Databases: state of the art, Computer Bulletin vol. 3, page 3, June 1987.
foreign key
is an attribute (or combination of attributes) in one relation whose values are required to match those of the primary key in some other relation.
hash partitioning
the partitioning of data dependent upon the value of the result of applying a randomising (i.e. hash) function to a data value or set of values of a record.
heat
an alternative name for access frequency.
head-per-track
an architecture in which each track of a rotating device has its own read/write head.
horizontal partitioning
the fragmentation of a (possibly notional) single file by placing complete records in different partitions.
management information service
used in this thesis to mean those parts of a business which use or require access to large amounts of data (e.g. planning, product targeting, analysis of consumer trends, direct marketing).
multikey hash function
a hash function based on many fields of a record.
online transaction processing
the processing of user queries (transactions) in an interactive manner (i.e. with fast response). See transaction processing.
orthogonal range query
a range query in more than one dimension.
partition cardinality
the number of records in a data partition.
point query
used in this thesis to mean a query which references a single point from the domain of an attribute. The term should not be confused with single hit which refers to a query accessing just a single record, although if an attribute has unique values the terms are equivalent.
point of sale terminal
is a machine which, in conjunction with a special card, debits a user's bank account directly to the value of the goods purchased.
predicate
the condition under which a statement is either true or false, thus used to specify conditions under which data is required.
predicate clause
also known as condition clause. That part of a query which contains or defines the query predicate (e.g. the SQL WHERE clause).
primary key
a unique identifier composed of one or more attributes.
processing element
a processor and associated local memory.
query
used in this thesis to refer to MIS queries, not OLTP transactions.
query selectivity
is the fraction of records referenced by a query (i.e. satisfying a condition). See query size.
query size
used in this thesis to mean the fraction of records referenced by a range query in a given dimension.
range query
a query which accesses data based on a range of data values specified in the predicate clause.
range partitioned/partitioning
placement of data in partitions based upon the value of one or more attributes, each partition containing data within a given range of values.
rotational latency
also known as rotational delay. The time taken for a disc to rotate from the current position (on the required track) to the position containing the first required data block.
round robin
the placement of records on disc (or in a partition) in a (notionally) sequential manner.
scan time
the time taken to access data by means of a sequential scan of the data, rather than by the use of an index to specify which data pages to access.
schema
the view, or definition, or description of data.
shared-disc
a generic multi-processor database machine architecture in which all of the discs are accessible by all the processors, but memory is local to each of the processors.
shared-everything
a generic multi-processor database machine architecture in which all discs and memory are directly accessible by all processors.
shared-nothing
a generic multi-processor database machine architecture in which each processor has its own disc(s) and memory, the only shared resource being the interconnection network.
seek time
time needed to reposition the disc arm from its current track to the required track.
single key index
Liou and Yao's term for an index on the primary key [LIOU77].
table handler
the name of the process which deals with file handling in the IDIOMS machine; essentially a small distributed file handler.
transaction
used in this thesis to refer to OLTP transactions cf. query.
transaction processing
the processing of small fixed queries which access one or perhaps a few records in a database.
transaction processor
the set of processes in IDIOMS that deals with the processing of OLTP transactions.
transputer
a microprocessor with built-in communications links which can operate concurrently with the main processor, developed and produced by INMOS.
value range partitioning
the round robin placement of records from each of a set of (possibly notional) partitions, the partitions themselves containing records which were placed (possibly notionally) using range partitioning.
vertical partitioning
the fragmentation of a file into subfiles by splitting each record into two or more subrecords, and placing these subrecords in different partitions.
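
The horizontal partitioning terms defined above (hash partitioning, range partitioning, round robin and value range partitioning) can be illustrated by a minimal sketch. It is not the IDIOMS implementation: the function names, the record layout (dictionaries with hypothetical "acct" and "balance" fields), the use of Python's built-in hash as the randomising function, and the reading of round robin as cyclic assignment of successive records to partitions are all assumptions made for illustration only.

```python
from bisect import bisect_right


def hash_partition(records, key, n):
    """Hash partitioning: the partition is chosen by hashing the key value.
    Python's built-in hash() stands in for the randomising function (assumption)."""
    parts = [[] for _ in range(n)]
    for r in records:
        parts[hash(r[key]) % n].append(r)
    return parts


def range_partition(records, key, bounds):
    """Range partitioning: 'bounds' is a sorted list of boundary values;
    partition i holds records with bounds[i-1] <= value < bounds[i]."""
    parts = [[] for _ in range(len(bounds) + 1)]
    for r in records:
        parts[bisect_right(bounds, r[key])].append(r)
    return parts


def round_robin(records, n):
    """Round robin, read here as cyclic placement: record i goes to partition i mod n."""
    parts = [[] for _ in range(n)]
    for i, r in enumerate(records):
        parts[i % n].append(r)
    return parts


def value_range_partition(records, key, bounds, n_discs):
    """Value range partitioning: records are first (notionally) range partitioned,
    then the records of each range partition are placed round robin over the discs.
    Each disc entry is a (range_id, record) pair so the range of origin stays visible."""
    discs = [[] for _ in range(n_discs)]
    for range_id, part in enumerate(range_partition(records, key, bounds)):
        for disc_id, chunk in enumerate(round_robin(part, n_discs)):
            discs[disc_id].extend((range_id, r) for r in chunk)
    return discs


# Hypothetical example data: twelve account records.
records = [{"acct": i, "balance": 10 * i} for i in range(12)]

by_range = range_partition(records, "balance", [40, 80])
by_vrp = value_range_partition(records, "balance", [40, 80], 2)
```

In this sketch each disc produced by value_range_partition holds a share of every value range, so a range query on "balance" touches all discs while reading only the relevant range on each of them.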