OCaml Stew Utility Library

Shawn Wagner

Introduction

Stew is a general-purpose library of useful utility and extension routines. Highlights include random number generators and distributions, locale support, improved time printing, and locale-dependent character classification routines. Enjoy.

Interface for module Math

Modules that allow functors to do generic math without depending on a specific numeric type.
These types describe the kind of math (integer or real).
type integer
type real

module type Ops = sig
   type t
   type num_type
   val add : t → t → t
   val sub : t → t → t
   val succ : t → t
   val pred : t → t
   val mul : t → t → t
   val div : t → t → t
   val rem : t → t → t
   val abs : t → t
   val zero : t
   val one : t
   val min : t
   val max : t
   val print : out_channel → t → unit
   val to_float : t → float
   val of_float : float → t
end

module IntOps : (Ops with type num_type = integer and type t = int)
module Int32Ops : (Ops with type num_type = integer and type t = int32)
module Int64Ops : (Ops with type num_type = integer and type t = int64)
module NativeOps : (Ops with type num_type = integer and type t = nativeint)
module FloatOps : (Ops with type num_type = real and type t = float)
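As a rough illustration of what "generic math in a functor" can look like, here is a minimal sketch. It does not use Stew itself: OPS is a local stand-in for a subset of the Ops signature, and the IntOps and FloatOps defined here, along with the Stats functor, are hypothetical names made up for the example.

```ocaml
(* Local stand-in for a subset of the Ops signature, so this compiles alone. *)
module type OPS = sig
  type t
  val add : t -> t -> t
  val mul : t -> t -> t
  val zero : t
  val to_float : t -> float
end

(* Hypothetical instances, loosely mirroring Math.IntOps and Math.FloatOps. *)
module IntOps : OPS with type t = int = struct
  type t = int
  let add = ( + )
  let mul = ( * )
  let zero = 0
  let to_float = float_of_int
end

module FloatOps : OPS with type t = float = struct
  type t = float
  let add = ( +. )
  let mul = ( *. )
  let zero = 0.0
  let to_float x = x
end

(* A functor that only touches the numeric type through Ops. *)
module Stats (Ops : OPS) = struct
  let sum xs = List.fold_left Ops.add Ops.zero xs

  let sum_of_squares xs =
    List.fold_left (fun acc x -> Ops.add acc (Ops.mul x x)) Ops.zero xs

  (* Mean as a float; the list is assumed to be non-empty. *)
  let mean xs = Ops.to_float (sum xs) /. float_of_int (List.length xs)
end

module IntStats = Stats (IntOps)
module FloatStats = Stats (FloatOps)
```

The same Stats code then works for ints and floats, which is the point of passing the arithmetic in as a module.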

Useful functions
succ_float n returns n+1.0 and pred_float n returns n-1.0
val succ_float : float → float
val pred_float : float → float

Like compare, but takes an extra epsilon value used to decide whether the floats are 'close enough' to be considered equal. See The Art of Computer Programming, 4.2.2. As an example, fcmp ~epsilon:0.00001 5.000005 5.000006 returns 0, meaning 5.000005 ∼ 5.000006.
val fcmp : epsilon:float → float → float → int
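One way such a comparison can be written, in the spirit of Knuth's 4.2.2, is to scale epsilon by the larger binary exponent of the two operands. This is only a sketch under that assumption; fcmp_sketch is a made-up name and the library's own fcmp may differ in detail.

```ocaml
(* Approximate comparison: returns -1, 0 or 1.
   The tolerance is epsilon * 2^e, where e is the larger of the two
   binary exponents reported by frexp (an assumption of this sketch). *)
let fcmp_sketch ~epsilon x y =
  let (_, ex) = frexp x in
  let (_, ey) = frexp y in
  let delta = ldexp epsilon (max ex ey) in
  let diff = x -. y in
  if diff > delta then 1
  else if diff < -. delta then -1
  else 0
```

With epsilon = 0.00001 and operands near 5, the tolerance is 0.00001 * 8, so values that differ by 0.000001 compare as equal.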

Useful constants
val e : float
val pi : float

Interface for module Rand

The idea here is to provide a variety of different random number distributions, for any math type you'd ever want to use, using any source of random numbers you'd ever need.

The *Dist modules are, for the most part, two-argument functors. The first parameter is one of the Ops modules from the Math module (or a user-provided module with a compatible signature) and tells the code how to do basic math on a numeric type. The second parameter is a source of random numbers. The Dist module transforms the numbers it generates to get the proper distribution.

There are several Source modules in the MtRand and FileRand packages, and SysSource in this file is a Source for the standard Random generator.

Each fully functorized module has two values (min and max, aliases for Source.min and Source.max) and one class, rng. The constructor arguments of rng vary depending on the distribution. The class has three methods: min and max, for the lowest and highest numbers an object will return, and genrand, which returns one random number according to the proper distribution.

Note that these distributions do not do any seeding of the underlying Source generator.

They also haven't been tested exceedingly well. Most of the implementations are based on algorithms and descriptions from Knuth's The Art of Computer Programming, but I might have made errors in the translation. One of these days I'll have to look into doing some real testing to make sure the distributions are accurate.


Example:

Set up a new module that works with ints using the Mersenne Twister:

module MyRNG = Rand.UniformDist2(Math.IntOps)(MtRand.IntSource)

Create a uniform RNG that returns values in the range 0 ≤ x < 100 and print one number:

let myrand = new MyRNG.rng 0 100 in print_int myrand#genrand; print_newline ()


All sources must provide this signature:
module type RNGSource =
   sig type t val genrand : unit → t val min : t val max : t end
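A user-provided source only has to match this signature. The sketch below repeats the signature locally so it compiles on its own, and wraps a small linear congruential generator. LcgSource and its constants are illustrative assumptions, not part of Stew, and no claim is made about the quality of its output.

```ocaml
(* Local copy of the signature above, with ASCII arrows. *)
module type RNGSource = sig
  type t
  val genrand : unit -> t
  val min : t
  val max : t
end

(* Hypothetical source: a 30-bit linear congruential generator. *)
module LcgSource : RNGSource with type t = int = struct
  type t = int

  let mask = 0x3FFFFFFF          (* keep the low 30 bits *)
  let state = ref 12345          (* fixed seed, for the example only *)

  let genrand () =
    state := (!state * 1103515245 + 12345) land mask;
    !state

  let min = 0
  let max = mask
end
```

Any module of this shape with type t = Ops.t could then be handed to a distribution functor as its second argument.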

A source that wraps another source whose numbers may be less than 0 and forces them to be positive
module S2USSource :
   functor (Ops : Math.Ops) →
     functor
       (Source : RNGSource with type t = Ops.t) →
         (RNGSource with type t = Ops.t)

Source for the standard Random generator.
module SysSource : (RNGSource with type t = int)
module SysFloatSource: (RNGSource with type t = float)

Some of the Distribution modules that work on integer ranges will implement this signature.
module type IDIST =
   functor (Ops : Math.Ops with type num_type = Math.integer) →
     functor (Source : RNGSource with type t = Ops.t) →
       sig
         type t = Source.t
         val min : t
         val max : t
         class rng :
           t →
           t →
           object
             method genrand : t
             method max : t
             method min : t
           end
       end

new UniformDist.rng low high returns an object that will generate random numbers in the range low ≤ x ≤ high. I need to test more.
module UniformDist : IDIST

new UniformDist2.rng low high returns an object that will generate random numbers in the range low ≤ x < high. See above about testing, especially since these uniform algorithms didn't come from Knuth.
module UniformDist2 : IDIST
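To make the shape of these modules concrete, here is a sketch of a functor that produces a class rng low high with the half-open range low ≤ x < high. It is not the library's algorithm: UniformSketch and Bits16 are hypothetical, the functor takes only a source (no Ops parameter), and rejection sampling is simply one common way to avoid modulo bias.

```ocaml
module type RNGSource = sig
  type t
  val genrand : unit -> t
  val min : t
  val max : t
end

(* Hypothetical 16-bit source built on the standard Random module. *)
module Bits16 : RNGSource with type t = int = struct
  type t = int
  let genrand () = Random.bits () land 0xFFFF
  let min = 0
  let max = 0xFFFF
end

module UniformSketch (Source : RNGSource with type t = int) = struct
  type t = int
  let min = Source.min
  let max = Source.max

  (* Requires 0 < high - low <= number of distinct source values. *)
  class rng (low : t) (high : t) =
    let span = high - low in
    let range = Source.max - Source.min + 1 in
    (* Largest multiple of span that fits in range; draws at or above
       it are rejected so every residue is equally likely. *)
    let limit = range - (range mod span) in
    object
      method min = low
      method max = high - 1
      method genrand =
        let rec draw () =
          let r = Source.genrand () - Source.min in
          if r < limit then low + (r mod span) else draw ()
        in
        draw ()
    end
end

module MyRNG = UniformSketch (Bits16)
```

Usage follows the example earlier in this section: new MyRNG.rng 0 100 gives an object whose genrand method returns values from 0 to 99.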

Module type for distributions that need a uniform float source but return integer types and whose rng class constructor takes one argument.
module type IFDIST =
   functor (Ops : Math.Ops with type num_type = Math.integer) →
     functor (Source : RNGSource with type t = float) →
sig
   type t = Ops.t
   val min : t
   val max : t
   class rng : t → object
     method min : t
     method max : t
     method genrand : t
   end
end

new GeometricDist.rng mean returns an object that will generate random numbers in the geometric distribution using the given mean.
module GeometricDist : IFDIST

new PoissonDist.rng mean returns an object that will generate random numbers in the Poisson distribution using the given mean.
module PoissonDist : IFDIST
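For a feel of how an integer result can come out of a uniform float source, here is a sketch of the multiplication method for Poisson deviates described by Knuth. Whether PoissonDist uses this particular method is not stated, so treat it as an assumption; poisson_sketch and its uniform argument (a function returning floats in the range 0 ≤ x ≤ 1) are made-up names.

```ocaml
(* Multiply uniform deviates together until the product drops to
   exp (-mean) or below; the number of multiplications minus one is
   the result. Running time grows with the mean. *)
let poisson_sketch ~mean uniform =
  let limit = exp (-. mean) in
  let rec loop k p =
    let p = p *. uniform () in
    if p > limit then loop (k + 1) p else k
  in
  loop 0 1.0
```

For example, poisson_sketch ~mean:4.0 (fun () -> Random.float 1.0) draws one value using the standard Random generator as the source.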

Signature used by many distributions that work entirely with floats. These expect a generator that works in the range 0 ≤ x ≤ 1 unless otherwise noted.
module type FDIST =
   functor (Source : RNGSource with type t = float) →
sig
   type t = float
   class rng : object
     method genrand : t
     method min : t
     method max : t
   end
   val min : t
   val max : t
end

new NormalDist.rng returns an object that will generate random floats in the normal distribution (mean 0, standard deviation 1). This is also known as the Gaussian distribution.
module NormalDist : FDIST
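One classic way to turn uniform floats into normal deviates is the polar method given by Knuth. The sketch below assumes that method; the source does not say which algorithm NormalDist actually uses, and normal_sketch and its uniform argument are hypothetical names.

```ocaml
(* Polar method: pick a point uniformly in the unit disc (rejecting
   points outside it) and scale one coordinate.
   uniform is assumed to return floats in the range 0 <= x <= 1. *)
let rec normal_sketch uniform =
  let v1 = 2.0 *. uniform () -. 1.0 in
  let v2 = 2.0 *. uniform () -. 1.0 in
  let s = v1 *. v1 +. v2 *. v2 in
  if s >= 1.0 || s = 0.0 then normal_sketch uniform
  else v1 *. sqrt (-2.0 *. log s /. s)
```

The method yields a second independent deviate (using v2 in place of v1) that this sketch simply discards.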

new ExponentialDist.rng mean returns an object that will generate random floats in an exponential distribution with the given mean.
module ExponentialDist :
   functor (Source : RNGSource with type t = float) →
sig
   type t = float
   val min : t
   val max : t
   class rng : t → object
     method min : t
     method max : t
     method genrand : t
   end
end
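An exponential deviate with a given mean can be obtained from a uniform one with a logarithm. The sketch assumes this standard inverse-transform approach, which may or may not match the library's code; exponential_sketch and uniform are made-up names.

```ocaml
(* x = -mean * ln u for u in the range 0 < u <= 1.
   A draw of exactly 0 is rejected, since ln 0 is not finite. *)
let rec exponential_sketch ~mean uniform =
  let u = uniform () in
  if u <= 0.0 then exponential_sketch ~mean uniform
  else -. mean *. log u
```

Results are never negative, and larger values become exponentially less likely.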

Future possibilities for distributions include
  1. GammaDist (float)
  2. BinomialDist (int)

Interface for module MtRand

OCaml version of the Mersenne Twister random number generator, based on mt19937ar.c. See http://www.math.keio.ac.jp/matumoto/emt.html


The seed functions. Only one call to one of these is needed. This should be enough variety.
Seed from a single value
val init32 : int32 → unit
val init : int → unit

Seed from an array of any length
val init_array32 : int32 array → unit
val init_array : int array → unit
val init_bigarray32 : (int32, Bigarray.int32_elt, α) Bigarray.Array1.t → unit
val init_genarray : (α → int32) → α array → unit

Seed from raw data in a file. The files used should hold at least 700 ints
val init_from_file : string → unit
val init_from_channel : in_channel → unit
val init_from_descr : Unix.file_descr → unit

Seed from /dev/urandom
val urandom_found : unit → bool
val urandom_init : unit → unit

Seed from /dev/urandom, or a default seed based on time and pid if /dev/urandom isn't available
val self_init : unit → unit
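Seeding from raw file data can be pictured roughly as follows. This sketch uses only the standard library: it checks for /dev/urandom, reads a few 32-bit words from it, and hands them to the standard Random generator as a stand-in for an array-seeding call. The names urandom_found_sketch and read_seed_words are hypothetical, and the way MtRand itself reads its seed data is not documented here.

```ocaml
let urandom_path = "/dev/urandom"

(* True when the random device exists on this system. *)
let urandom_found_sketch () = Sys.file_exists urandom_path

(* Read n 32-bit words of raw data from path.
   Raises End_of_file if the file is too short. *)
let read_seed_words path n =
  let ic = open_in_bin path in
  Fun.protect
    ~finally:(fun () -> close_in ic)
    (fun () -> Array.init n (fun _ -> input_binary_int ic))

(* Seed the standard generator from the device when it is available,
   otherwise fall back to Random.self_init. *)
let seed_standard_random () =
  if urandom_found_sketch () then
    Random.full_init (read_seed_words urandom_path 16)
  else
    Random.self_init ()
```

Fun.protect needs OCaml 4.08 or later; on older compilers a try ... with that closes the channel does the same job.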

Functions returning random numbers, of the given types and in the given ranges. All of these but the int64 ones produce the same output as the reference C code.
Range: 0 ≤ x ≤ 0xffffffff
val uint32 : unit → int32

Range: 0 ≤ x ≤ 0x7fffffff
val int32 : unit → int32

Range: 0 ≤ x ≤ 0xffffffffffffffff
val uint64 : unit → int64

Range: 0 ≤ x ≤ 0x7fffffffffffffff
val int64 : unit → int64

The 32-bit or 64-bit ranges above, whichever is appropriate for the platform
val unativeint : unit → nativeint
val nativeint : unit → nativeint

Range: 0 ≤ x ≤ 0x7fffffff or 0 ≤ x ≤ 0x7fffffffffffffff
val uint : unit → int

Range: 0 ≤ x ≤ 0x3fffffff or 0 ≤ x ≤ 0x3fffffffffffffff
val int : unit → int

Range: 0 ≤ x ≤ 1
val real1 : unit → float

Range: 0 ≤ x < 1
val real2 : unit → float

Range: 0 < x < 1
val real3 : unit → float

Range: 0 ≤ x < 1 with 53-bit resolution
val res53 : unit → float
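The four float ranges differ only in how a raw 32-bit output is scaled. The sketch below follows the conversions used in the reference mt19937ar.c, taking already-generated 32-bit values as arguments instead of calling a generator. The *_sketch names are hypothetical, and the assumption is that the OCaml port does the equivalent arithmetic.

```ocaml
(* Interpret an int32 as an unsigned 32-bit value and convert to float. *)
let u32_to_float (x : int32) : float =
  let f = Int32.to_float x in
  if f < 0.0 then f +. 4294967296.0 else f

(* 0 <= x <= 1: divide by 2^32 - 1 *)
let real1_sketch x = u32_to_float x *. (1.0 /. 4294967295.0)

(* 0 <= x < 1: divide by 2^32 *)
let real2_sketch x = u32_to_float x *. (1.0 /. 4294967296.0)

(* 0 < x < 1: shift by one half before dividing by 2^32 *)
let real3_sketch x = (u32_to_float x +. 0.5) *. (1.0 /. 4294967296.0)

(* 0 <= x < 1 with 53-bit resolution: combine the top 27 bits of one
   output with the top 26 bits of another, then divide by 2^53. *)
let res53_sketch a b =
  let hi = Int32.to_float (Int32.shift_right_logical a 5) in
  let lo = Int32.to_float (Int32.shift_right_logical b 6) in
  (hi *. 67108864.0 +. lo) *. (1.0 /. 9007199254740992.0)
```

Note that res53 consumes two 32-bit outputs per float, which is why it is listed separately from real2 even though the range is the same.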

Sources for use with the Rand distributions
module IntSource : (Rand.RNGSource with type t = int)
module Int32Source : (Rand.RNGSource with type t = int32)
module FloatSource : (Rand.RNGSource with type t = float)

Interface for module FileRand

This generator reads raw bits from an RNG character device file such as /dev/urandom.

Example usage:

module RNG = Rand.UniformDist(Math.Int32Ops)(FileRand.Int32Source(FileRand.Dev_Urandom))


class rng : string → object
   method min : int32
   method max : int32
   method genrand : int32
end

module type FILE = sig
   val name : string
end

module Dev_Urandom : FILE

The sources for the Rand Distribution modules take the filename of the device as a module. Weird, but hey. I'm fooling around with the capabilities of functors.

module IntSource (File : FILE) : (Rand.RNGSource with type t = int)
module Int32Source (File : FILE): (Rand.RNGSource with type t = int32)
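Passing a filename as a module looks like this in practice. The sketch is self-contained and reads single bytes instead of ints to keep it short; ByteSource and UrandomBytes are hypothetical names, the local Dev_Urandom is assumed to name /dev/urandom, and reopening the file on every call is done only for simplicity.

```ocaml
module type FILE = sig
  val name : string
end

module type RNGSource = sig
  type t
  val genrand : unit -> t
  val min : t
  val max : t
end

(* Hypothetical source functor: the device path arrives as File.name. *)
module ByteSource (File : FILE) : RNGSource with type t = int = struct
  type t = int
  let min = 0
  let max = 255

  (* Opens the file on each call; a real source would likely keep the
     channel open between calls. *)
  let genrand () =
    let ic = open_in_bin File.name in
    Fun.protect
      ~finally:(fun () -> close_in ic)
      (fun () -> input_byte ic)
end

(* Assumed to correspond to what FileRand.Dev_Urandom provides. *)
module Dev_Urandom : FILE = struct
  let name = "/dev/urandom"
end

module UrandomBytes = ByteSource (Dev_Urandom)
```

Pointing the source at a different device is then a matter of writing another one-line FILE module.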

Interface for module SkipList

Module that provides the skiplist data structure.

You should compile this with -labels. It can be used in programs compiled with or without -labels.


module type OrderedType =
   sig
     type t
     val compare : t → t → int
   end

module type S =
   sig
     type key
     type α t
       (* Create a new skiplist with given max height, or 5 if none is given. *)
     val create : ?size:int → unit → α t
       (* Wipe out the skiplist *)
     val clear : α t → unit
       (* The number of elements in the skiplist *)
     val size : α t → int
       (* Insert an element *)
     val add : α t → key:key → data:α → unit
       (* Look up an element *)
     val find : α t → key → α
       (* Remove an element *)
     val remove : α t → key → unit
       (* Like the list and map functions... *)
     val mem : α t → key → bool
     val iter : α t → f:(key → α → unit) → unit
     val fold : α t → f:(key → α → β → β) → init:β → β
   end

module Make (Ord : OrderedType) : (S with type key = Ord.t)

Interface for module IOExtras

Open a file for output and auto-flush at exit
val open_out : string → out_channel

Like print_endline, but on the given channel, or on stdout if you don't give one
val output_endline : ?ostream:out_channel → string → unit

Interface for module StrExtras

Returns the first word of a string
val first_word : string → string

Cuts off the first character of a string and returns the rest
val cut_first_char : string → string

Cuts off the first n characters of a string and returns the rest
val cut_first_n : string → int → string

Cuts off the last character of a string and returns the rest
val cut_last_char : string → string

Cuts off the last n characters of a string and returns the rest
val cut_last_n : string → int → string

Cuts off the first word of a string and returns the rest
val cut_first_word : string → string

Returns everything in str after the character sep
val split_at : str:string → sep:char → string

Concatenates a list of strings with a blank separator
val combine : string list → string

Remove all trailing whitespace
val chomp : string → string
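The descriptions above leave edge cases open (for example, what happens when n exceeds the string length). The following sketch shows plausible standard-library equivalents for a few of the functions. The *_sketch names are hypothetical, and returning the empty string for an oversized n is an assumption, not documented behaviour.

```ocaml
(* Drop the first n characters; an n beyond the length gives "". *)
let cut_first_n_sketch s n =
  let len = String.length s in
  let n = max 0 (min n len) in
  String.sub s n (len - n)

(* Drop the last n characters; an n beyond the length gives "". *)
let cut_last_n_sketch s n =
  let len = String.length s in
  let n = max 0 (min n len) in
  String.sub s 0 (len - n)

(* Join strings with a single blank between them. *)
let combine_sketch words = String.concat " " words

(* Remove trailing spaces, tabs, newlines and carriage returns. *)
let chomp_sketch s =
  let is_space c = c = ' ' || c = '\t' || c = '\n' || c = '\r' in
  let rec last i = if i > 0 && is_space s.[i - 1] then last (i - 1) else i in
  String.sub s 0 (last (String.length s))
```

For instance, chomp_sketch applied to a line read with a trailing newline gives the line without it.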

Interface for module Syslog

The assorted logging facilities. The default is LOG_USER. You can set a new default with openlog, or give a specific facility per syslog call.
type facility = LOG_KERN | LOG_USER | LOG_MAIL | LOG_DAEMON | LOG_AUTH
                 | LOG_SYSLOG | LOG_LPR | LOG_NEWS | LOG_UUCP | LOG_CRON
                 | LOG_AUTHPRIV | LOG_FTP | LOG_NTP | LOG_SECURITY 
                 | LOG_CONSOLE | LOG_LOCAL0 | LOG_LOCAL1 | LOG_LOCAL2 
                 | LOG_LOCAL3 | LOG_LOCAL4 | LOG_LOCAL5 | LOG_LOCAL6 
                 | LOG_LOCAL7

Flags to pass to openlog. LOG_CONS isn't implemented yet.
type flag = LOG_CONS | LOG_NDELAY | LOG_PERROR | LOG_PID

The priority of the error.
type level = LOG_EMERG | LOG_ALERT | LOG_CRIT | LOG_ERR | LOG_WARNING
               | LOG_NOTICE | LOG_INFO | LOG_DEBUG

If your syslogd Unix socket isn't /dev/log, call this before openlog or syslog to change it to the proper file
val set_logpath : string → unit

Same as openlog(3)
val openlog : string → flag list → facility → unit

Same as syslog(3), except that format strings are not supported.
val syslog : ?fac:facility → level → string → unit

Close the log
val closelog : unit → unit
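In the traditional syslog protocol a facility and a level are combined into one priority number, facility code times 8 plus level code, which is sent in angle brackets at the start of the message. The sketch below illustrates that combination for a few facilities. The types are a cut-down local copy, and priority and frame are hypothetical helpers; how this module builds its messages internally is not documented here.

```ocaml
(* Subset of the facilities, with their conventional numeric codes. *)
type facility = LOG_KERN | LOG_USER | LOG_MAIL | LOG_DAEMON | LOG_AUTH

type level = LOG_EMERG | LOG_ALERT | LOG_CRIT | LOG_ERR | LOG_WARNING
           | LOG_NOTICE | LOG_INFO | LOG_DEBUG

let facility_code = function
  | LOG_KERN -> 0 | LOG_USER -> 1 | LOG_MAIL -> 2
  | LOG_DAEMON -> 3 | LOG_AUTH -> 4

let level_code = function
  | LOG_EMERG -> 0 | LOG_ALERT -> 1 | LOG_CRIT -> 2 | LOG_ERR -> 3
  | LOG_WARNING -> 4 | LOG_NOTICE -> 5 | LOG_INFO -> 6 | LOG_DEBUG -> 7

(* priority = facility * 8 + level *)
let priority fac lev = facility_code fac * 8 + level_code lev

(* Prefix a message with its priority; the facility defaults to
   LOG_USER, like the default described above. Real messages usually
   carry a timestamp and tag as well, omitted here. *)
let frame ?(fac = LOG_USER) lev msg =
  Printf.sprintf "<%d>%s" (priority fac lev) msg
```

The optional ?fac argument mirrors the signature of syslog above: omit it to use the default facility, or pass ~fac:LOG_DAEMON for a single call.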

Interface for module CharExtras

Set and query the current CTYPE locale
val set_chartype : string → string option
val get_chartype : unit → string

Test characters for certain properties. Wrapper for the C <ctype.h> header, basically.
external is_alpha : char → bool = "stew_is_alpha"
external is_space : char → bool = "stew_is_space"
external is_number : char → bool = "stew_is_number"
external is_lower : char → bool = "stew_is_lower"
external is_upper : char → bool = "stew_is_upper"
val is_alphanumeric : char → bool
external is_punctation : char → bool = "stew_is_punct"
external is_printable : char → bool = "stew_is_print"
external is_graphical : char → bool = "stew_is_graph"
external is_hexadecimal : char → bool = "stew_is_xdigit"
external to_lower : char → char = "stew_to_lower"
external to_upper : char → char = "stew_to_upper"

Interface for module Locale

Locale types.
type category = LC_ALL | LC_COLLATE | LC_CTYPE | LC_MESSAGES | LC_MONETARY
                 | LC_NUMERIC | LC_TIME

type numeric_lconv = {
   decimal_point : string;
   thousands_sep : string;
   grouping : string;
}

type sign_pos = SurroundBoth | SignPrecedesBoth | SignSucceedsBoth
                 | SignPrecedsCS | SignSucceedsCS | UnknownOrder

type monetary_lconv = {
   int_curr_symbol : string;
   currency_symbol : string;
   decimal_point : string;
   thousands_sep : string;
   grouping : string;
   positive_sign : string;
   negative_sign : string;
   int_frac_digits : int;
   frac_digits : int;
   p_cs_precedes : bool;
   p_sep_by_space : bool;
   n_cs_precedes : bool;
   n_sep_by_space : bool;
   p_sign_posn : sign_pos;
   n_sign_posn : sign_pos;
}

If no new locale name is provided, returns Some name, where name is the current locale for the category. An empty name sets the locale based on environment variables. Returns None when setting the locale fails.
external set : ?name:string → category → string option = "stew_set_locale"

Set the locale based on environment variables
val set_from_env : category → string option

Get the current locale for a category
val get : category → string

external numeric_info : unit → numeric_lconv = "stew_localeconv_n"
external monetary_info : unit → monetary_lconv = "stew_localeconv_m"

Interface for module Time

Time utility functions to fill in gaps in the Unix library and to be more to my liking.
Like the corresponding Unix functions, but storing the time in an int32 rather than a float.
external time : unit → int32 = "stew_time_int32"

external gmtime : int32 → Unix.tm = "stew_gmtime"
external localtime : int32 → Unix.tm = "stew_localtime"
external mktime : Unix.tm → int32 × Unix.tm = "stew_mktime"

Wrapper for the C strftime() function. See your local strftime() documentation for details of the string argument (strftime()'s format argument)
external format_tm : string → Unix.tm → string = "stew_strftime_tm"
val format_time : string → int32 → string

Returns the time as a string with trailing newline
external ctime : int32 → string = "stew_ctime"
external asctime : Unix.tm → string = "stew_asctime"

Returns the time as a string without trailing newline
val time_string : int32 → string
val tm_string : Unix.tm → string

Interface for module UnixExtras

Runs f in a new daemon process. The calling process will exit if the second argument is true. For example, let _ = UnixExtras.make_daemon server_func true will start a new process and run server_func in it, while the original process will exit. See Stevens, Advanced Programming in the Unix Environment for details.

val make_daemon : (unit → unit) → bool → unit

Reads len bytes from the src descriptor, starting at offset start, and copies them to the dest descriptor. Currently implemented only for Linux and FreeBSD. The FreeBSD sendfile() system call requires that dest be a socket.

Raises a Unix_error on failure.

external send_file : src:Unix.file_descr → dest:Unix.file_descr → start:int → len:int → int = "stew_sendfile"

More servent functions not provided by the Unix module. They do the same thing as the C versions.
external getservent : unit → Unix.service_entry = "stew_getservent"
external setservent : bool → unit = "stew_setservent"
external endservent : unit → unit = "stew_endservent"

