Appendix 5
List of publications

M. Unwalla, J. Kerridge, "Control of a Large Massively Parallel Database Machine using SQL Catalogue Extensions, and a DSDL in Preference to an Operating System", in Advanced Database Systems: Proc. BNCOD 10 (Aberdeen), P. M. D. Gray, R. J. Lucas (eds.), 138-155, Springer-Verlag (LNCS 618), 1992.

The IDIOMS parallel database machine supports large applications that require integrated OLTP and MIS. It can be considered a relational engine, and SQL is used as the MIS query language. We make some comparisons between IDIOMS and other database machines. We justify why IDIOMS does not use an operating system, and why a Data Storage Description Language (DSDL) is used to control data placement. Our implementation extends the SQL2 information schema tables. These extensions, which are described in detail, can be used by a Data Dictionary process to control resource allocation and data access. We discuss the general principles behind further extensions that can be used to improve data partitioning. By means of examples, we show how our extensions support multi-column partitioning, and how such a partitioning strategy can reduce MIS query access time.

M. Unwalla, J. Kerridge, "Number of Partitions Accessed by Range Queries in Partitioned Files", submitted for publication, 1993.

The formula (1+s/d) presented in the literature for calculating the average number of pages accessed by range queries in range partitioned files does not hold when it is applied to large queries in files with a small number of partitions (as opposed to data pages). It also implies that scan time reductions due to further partitioning are independent of the size of a query. We present an improved formula, which shows that both the number of partitions and the size of a query influence the reduction in scan time that can be achieved by increasing the number of partitions. We conclude with an examination of the results obtained, and show a simple modification that allows the basic formula (1+s/d) and its derivatives to be used with confidence when the number of partitions is not very small.
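
The limitation described in this abstract can be illustrated with a small brute-force count. The sketch below is not the improved formula from the paper. It assumes that s is the query size and d is the partition width, that the file covers an integer domain split into equal-width partitions, and that every query start position that fits inside the file is equally likely. All function and parameter names are hypothetical.

```python
def partitions_touched(start, size, width):
    """Count the equal-width partitions overlapped by values start..start+size-1."""
    first = start // width
    last = (start + size - 1) // width
    return last - first + 1


def average_partitions_touched(domain, partitions, size):
    """Average partitions touched over every query start that fits in the file.

    Assumption: `partitions` divides `domain` evenly.
    """
    width = domain // partitions
    starts = range(domain - size + 1)
    total = sum(partitions_touched(a, size, width) for a in starts)
    return total / len(starts)


def basic_estimate(size, width):
    """The basic formula 1 + s/d (assumed: s = query size, d = partition width)."""
    return 1 + size / width


if __name__ == "__main__":
    domain = 100  # hypothetical file covering the values 0..99
    for partitions in (2, 50):
        width = domain // partitions
        for size in (2, 50, 100):
            exact = average_partitions_touched(domain, partitions, size)
            estimate = basic_estimate(size, width)
            print(partitions, size, round(exact, 3), round(estimate, 3))
```

Under these assumptions the exact count can never exceed the number of partitions, whereas 1+s/d can: a query covering the whole of a two-partition file touches two partitions, but the basic formula gives three.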

J. M. Kerridge, S. D. North, R. Guiton, M. Unwalla, "A Data Storage Description Language for Database Language SQL", Department of Computer Science, University of Sheffield, Internal Report CS-91-05, 1991.

This document specifies the syntax and semantics of a language that is used to define the storage requirements of a database that is defined using the Database Language SQL-DDL. The language provides an implementation independent method of defining how the data associated with the tables of an SQL database is stored.

J. Kerridge, S. North, M. Unwalla, R. Guiton, "Table Placement in a Large Massively Parallel Database Machine", submitted for publication.

Computer systems that support large databases usually do so as their only task, because the maintenance of the database is sufficiently complex to consume all the computing resources. Data placement in such an environment is restricted by the lack of functionality within traditional operating systems. Operating systems are a general tool used to manage the resources of a computer system. A database machine undertakes a specific task for which general tools are inappropriate. The elimination of the operating system interface means that we are able to access the data directly by value rather than by the use of files. As a consequence, multi-column range partitioning, which has commonly been discounted in other database machines, can be used effectively. It also has the advantage of simplicity. These two requirements, that the data must be placed directly on the storage media and that it be placed by value, necessitated the design and implementation of a Data Storage Description Language (DSDL). We discuss the DSDL and its accompanying toolset. We also present examples of its use.