Using the York Binary library in nhc13

This document describes the York Binary library. (See also the BinArray library for an example of the use of Binary to build other abstractions.)

The York Binary library

    module Binary where

    data BinPtr a    = ...
    data BinLocation = Memory | File FilePath BinIOMode
    data BinIOMode   = RO | RW | WO
    data BinHandle   = ...

    stdmem    :: BinHandle
    openBin   :: BinLocation -> IO BinHandle
    freezeBin :: BinHandle -> IO ()        -- changes BinIOMode to RO
    closeBin  :: BinHandle -> IO ()
    copyBin   :: BinHandle -> BinLocation -> IO BinHandle
    isEOFBin  :: BinHandle -> IO Bool
    seekBin   :: BinHandle -> BinPtr a -> IO ()
    tellBin   :: BinHandle -> IO (BinPtr a)

    class Binary a where
        put    :: BinHandle -> a -> IO (BinPtr a)
        get    :: BinHandle -> IO a
        putAt  :: BinHandle -> BinPtr a -> a -> IO ()
        getAt  :: BinHandle -> BinPtr a -> IO a
        getFAt :: BinHandle -> BinPtr a -> a

Programming model

Both in-heap data compression and binary I/O can be achieved using the York Binary library. The basic model is rather like file I/O: binary data resides in a separate space which is accessed only through a BinHandle acting like a buffering file descriptor. Each item of binary data lies at a particular position within the space, the position being denoted by a BinPtr. Data can be written and read sequentially just as with ordinary files. Also, like ordinary files, we allow random-access reading and writing. However, the particular beauty of this scheme is the ability to engage in pure, lazy, random-access reading when a BinHandle is in the appropriate RO (read-only) mode. (A BinHandle which is already open for writing can be changed to RO mode with the freezeBin call.)

BinHandles do not just denote files - they can also refer to areas of heap memory. One such area is available by default - called stdmem - but new areas can be opened in just the same way as files. They are opened in the default mode RW.
Binary heap areas grow automatically to fit the data placed in them, and, like files, they are naturally garbage-collected when they are no longer in use. (The closeBin operation is an explicit means to close a file or discard some memory.)

The Binary class is derivable for any datatype defined in a program except functions. (Please note, however, that cyclic or infinite values will cause the compressing function to diverge.)

The class member functions come in two varieties, one for sequential access, the other for random access. A BinHandle contains a hidden state, including the current position in the file or memory. Understanding the notion of the current position is important for using the sequential operations correctly: put and get always start reading or writing from the current position. All operations, including the random-access ones, set the current position on return to the end of the value that has just been read or written.
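The effect of the current position on mixed sequential and random-access use can be illustrated with a small pure model. This is only a sketch and does not use the York Binary library: Handle, Ptr, putM, getM, getAtM and demo are hypothetical names invented for the illustration, and each stored value is assumed to occupy exactly one slot, whereas the real library works in bits.

```haskell
-- Toy model only: none of these names belong to the York Binary library.

-- The hidden state of a handle: the stored values and the current position.
-- Assumption: every value occupies exactly one slot.
data Handle = Handle { cells :: [Int], pos :: Int } deriving (Show, Eq)

-- A pointer is simply a slot index in this model.
type Ptr = Int

emptyHandle :: Handle
emptyHandle = Handle [] 0

-- Sequential write: store the value at the current position, return a
-- pointer to it, and leave the current position just past the value.
-- (Pointers only ever come from putM, so the position is always in range.)
putM :: Int -> Handle -> (Ptr, Handle)
putM x (Handle cs p) = (p, Handle (take p cs ++ [x] ++ drop (p + 1) cs) (p + 1))

-- Sequential read: read from the current position and advance past the value.
getM :: Handle -> (Int, Handle)
getM (Handle cs p) = (cs !! p, Handle cs (p + 1))

-- Random-access read: read at the given pointer. As in the text, the current
-- position afterwards is the end of the value just read.
getAtM :: Ptr -> Handle -> (Int, Handle)
getAtM q (Handle cs _) = (cs !! q, Handle cs (q + 1))

-- Write three values, read the first by pointer, then read sequentially.
demo :: (Int, Int, Int)
demo =
  let (pa, h1)  = putM 10 emptyHandle  -- pa = 0, position is now 1
      (_pb, h2) = putM 20 h1           -- position is now 2
      (_pc, h3) = putM 30 h2           -- position is now 3
      (a, h4)   = getAtM pa h3         -- reads 10, position moves back to 1
      (b, h5)   = getM h4              -- continues from position 1, reads 20
  in (a, b, pos h5)
```

In this model demo evaluates to (10, 20, 2): the random-access read moves the current position, so the following sequential read continues from just after the first value rather than from the end of the data.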
Transferring bits in bulk

The easiest way to transfer bits in bulk is with the copyBin operation. It takes an active BinHandle and copies its entire contents into the given BinLocation, returning a fresh BinHandle denoting the copy.

A completely different method exists to transfer individual binary values in bulk. By recording the size of each binary value explicitly (which costs a certain amount of space), we can transfer just the bits belonging to that value, rather than the entire BinHandle. (This method predates the introduction of BinHandles, so it may turn out to be less space-efficient, less quick, and more complex than the newer copyBin method. Tell us which method you prefer.)

There are "sized" variations of two of the binary operations:

    data SizedBin a = SB Size BinHandle (BinPtr a)

    sizedPut    :: Binary a => BinHandle -> a -> IO (SizedBin a)
    sizedGetFAt :: Binary a => SizedBin a -> a

The main purpose of the sized operations is to enable efficient bulk transfer of binary data between two BinHandles (typically between memory and file, although it can equally well occur between two files or two memory spaces). The sized operations do not implement this bulk transfer themselves - however bulk transfer is efficient only when the amount of data is known beforehand. Hence there is a standard instance of class Binary for the type SizedBin a: the operations put and get, when used on sized binary values, effect the bulk transfer of bits from one BinHandle to another.

It should be noted that getting a sized binary value automatically allocates a fresh BinHandle in memory, since there is no other way of specifying a destination for a transfer in this direction. This fresh BinHandle (enclosed anonymously within the SizedBin a value) is also frozen into RO mode after the transfer, ready for a later sizedGetFAt operation.
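Why a recorded size makes it possible to move one value without copying the whole space can be seen in a small pure model. This is a sketch under stated assumptions, not the library's implementation: Space, Sized, encode, decode, sizedPutM, transferM, sizedGetM and demoSized are hypothetical names, a binary space is modelled as a list of bits, and values are non-negative Ints written at a fixed width.

```haskell
-- Toy model only: none of these names belong to the York Binary library.

-- A binary space, modelled as a plain list of bits.
type Space = [Bool]

-- Loosely mirrors SB Size BinHandle (BinPtr a): the size in bits, the space
-- holding the value, and the offset at which the value starts.
data Sized = Sized { size :: Int, space :: Space, offset :: Int }

-- Encode a non-negative Int in exactly w bits, least significant bit first.
encode :: Int -> Int -> [Bool]
encode w n = [ odd (n `div` (2 ^ i)) | i <- [0 .. w - 1] ]

-- Inverse of encode: sum the weights of the bits that are set.
decode :: [Bool] -> Int
decode bits = sum [ 2 ^ i | (i, True) <- zip [0 :: Int ..] bits ]

-- Append a value to a space, remembering its size and starting offset.
sizedPutM :: Int -> Int -> Space -> Sized
sizedPutM w n sp = Sized w (sp ++ encode w n) (length sp)

-- Bulk transfer: because the size is known, only the bits belonging to the
-- value are copied, into a fresh space where the value starts at offset 0.
transferM :: Sized -> Sized
transferM (Sized n sp off) = Sized n (take n (drop off sp)) 0

-- Decode the value that a sized reference points at.
sizedGetM :: Sized -> Int
sizedGetM (Sized n sp off) = decode (take n (drop off sp))

-- Store a value after 8 bits of unrelated data, then transfer only that value.
demoSized :: (Int, Int, Int)
demoSized =
  let busy   = encode 8 200         -- 8 bits of unrelated data
      sized  = sizedPutM 8 77 busy  -- the value starts at offset 8
      copied = transferM sized      -- fresh space holding just 8 bits
  in (length (space sized), length (space copied), sizedGetM copied)
```

In this model demoSized evaluates to (16, 8, 77): the original space holds 16 bits, the transferred copy holds only the 8 bits of the value, and the value decodes unchanged from the copy. The fresh space returned by transferM plays the role of the freshly allocated memory BinHandle described above.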
Defining your own compression

If you want to play with defining your own instances of Binary, have a look at some of the instances for standard types like Int and Lists in src/prelude/Binary/Instances.hs to see how things work. One need define only three of the member functions - the others are defined in terms of them. The lower-level tools used in defining instances are:

    class Binary a where
        ...
        unsafeGetAt :: BinHandle -> BinPtr a -> (a,BinPtr b)

    getBits :: BinHandle -> Int -> BinPtr a -> IO Int
    putBits :: BinHandle -> Int -> Int -> IO (BinPtr a)
    getAux  :: BinHandle -> Int -> BinPtr a -> (Int,BinPtr a)
    (<<)    :: ((a->b),c) -> (c->(a,d)) -> (b,d)

Read and write modes

A file BinHandle can be opened in one of three modes: read-only (RO), write-only (WO), or read-write (RW). A memory BinHandle is always opened in RW mode, but may be changed to RO mode by the freezeBin operation. These modes differ from those of ordinary textual files:
An example program which uses binary I/O in RW mode is ZooQuiz.hs. The latest updates to these pages are available on the WWW from http://www.cs.york.ac.uk/fp/nhc13/
1998.03.26