OCaml Stew Utility Library
Shawn Wagner
Introduction
Stew is a general-purpose library of useful utility and extension
routines. Highlights include random number generators and
distributions, locale support, improved time printing, and character
classification routines that are locale-dependent. Enjoy.
Interface for module Math
Modules that let functors do generic math without depending on a specific numeric type.
These abstract types tag the kind of math a module performs: integer or real.
type integer
type real
module type Ops = sig
type t
type num_type
val add: t → t → t
val sub: t → t → t
val succ: t → t
val pred: t → t
val mul: t → t → t
val div: t → t → t
val rem: t → t → t
val abs: t → t
val zero: t
val one: t
val min: t
val max: t
val print: out_channel → t → unit
val to_float: t → float
val of_float: float → t
end
module IntOps : (Ops with type num_type = integer and type t = int)
module Int32Ops : (Ops with type num_type = integer and type t = int32)
module Int64Ops : (Ops with type num_type = integer and type t = int64)
module NativeOps : (Ops with type num_type = integer and type t = nativeint)
module FloatOps : (Ops with type num_type = real and type t = float)
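As a rough illustration of the idea, here is a minimal, self-contained sketch of a functor written once against a cut-down signature and then applied to two numeric types. The names OPS, IntOpsSketch, FloatOpsSketch and Stats are hypothetical stand-ins, not part of Stew; with the library installed you would pass Math.IntOps or Math.FloatOps to a functor over Math.Ops instead.

```ocaml
(* Cut-down, hypothetical stand-in for Math.Ops: only what this sketch needs. *)
module type OPS = sig
  type t
  val add : t -> t -> t
  val zero : t
  val to_float : t -> float
end

(* Hypothetical stand-ins for Math.IntOps and Math.FloatOps. *)
module IntOpsSketch : OPS with type t = int = struct
  type t = int
  let add = ( + )
  let zero = 0
  let to_float = float_of_int
end

module FloatOpsSketch : OPS with type t = float = struct
  type t = float
  let add = ( +. )
  let zero = 0.0
  let to_float x = x
end

(* Generic code written once, with no mention of a concrete numeric type. *)
module Stats (Ops : OPS) = struct
  let sum xs = List.fold_left Ops.add Ops.zero xs
  let mean xs = Ops.to_float (sum xs) /. float_of_int (List.length xs)
end

module IntStats = Stats (IntOpsSketch)
module FloatStats = Stats (FloatOpsSketch)
```

IntStats.sum works on int lists and FloatStats.sum on float lists, while the body of Stats is shared.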
Useful functions
succ_float n returns n+1.0 and pred_float n returns n-1.0
val succ_float: float → float
val pred_float: float → float
Like compare, but takes an extra epsilon value to use in figuring
out if the floats are 'close enough' to be considered equal. See The
Art of Computer Programming, 4.2.2. As an example,
fcmp ~epsilon:0.00001 5.000005 5.000006 returns 0, meaning
5.000005 ∼ 5.000006.
val fcmp: epsilon:float → float → float → int
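A sketch of one well-known way to do this kind of comparison: scale epsilon by the binary exponent of the larger operand, in the spirit of Knuth 4.2.2, and compare the difference against that scaled tolerance. Whether Stew's fcmp is implemented exactly like this is an assumption; fcmp_sketch is a hypothetical name.

```ocaml
(* Returns -1, 0 or 1, like compare, but treats x1 and x2 as equal when
   they differ by no more than epsilon scaled to their magnitude. *)
let fcmp_sketch ~epsilon x1 x2 =
  (* Use the operand with the larger magnitude to pick the scale. *)
  let larger = if abs_float x1 > abs_float x2 then x1 else x2 in
  (* frexp returns (mantissa, exponent) with larger = mantissa * 2^exponent. *)
  let _, exponent = frexp larger in
  (* delta = epsilon * 2^exponent is the neighbourhood size at this scale. *)
  let delta = ldexp epsilon exponent in
  let diff = x1 -. x2 in
  if diff > delta then 1
  else if diff < -. delta then -1
  else 0
```

With the values from the text, the exponent of 5.000006 is 3, so delta is 0.00008 and the difference of 0.000001 falls inside it.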
Useful constants
val e: float
val pi: float
Interface for module Rand
The idea here is to provide a variety of different random number
distributions, for any math type you'd ever want to use, using any
source of random numbers you'd ever need.
The *Dist modules are, for the most part, two-argument functors. The
first parameter is one of the Ops modules in the Math module (or a
user-provided one with a compatible signature) that tells the code
how to do basic math on a numeric type. The second parameter is a
source of random numbers. The Dist module transforms the numbers the
source generates to get the proper distribution.
There are several Source modules in the MtRand and
FileRand packages, and SysSource in this file is a
Source for the standard Random generator.
Each fully functorized module has two values (min and max,
aliases for Source.min and Source.max) and one class,
rng. rng's constructor arguments vary depending on the
distribution. The class has three methods: min and max, for the
lowest and highest numbers an object will return, and genrand, which
returns one random number according to the proper distribution.
Note that these distributions do not do any seeding of the
underlying Source generator.
They also haven't been tested very thoroughly. Most of the
implementations are based on algorithms and descriptions from
Knuth's The Art of Computer Programming, but I might have made
errors in the translation. One of these days I'll have to look into
doing some real testing to make sure the distributions are accurate.
Example:
Set up a new module that works with ints using the Mersenne Twister:
module MyRNG = Rand.UniformDist2(Math.IntOps)(MtRand.IntSource)
Create a uniform RNG that returns values in the range 0≤ x<100
and print one number:
let myrand = new MyRNG.rng 0 100 in
print_int myrand#genrand;
print_newline ()
All sources must provide this signature:
module type RNGSource =
sig
type t
val genrand : unit → t
val min : t
val max : t
end
A functor that wraps a source which can return numbers less than 0 and
forces them to be positive
module S2USSource :
functor (Ops : Math.Ops) →
functor
(Source : RNGSource with type t = Ops.t) →
(RNGSource with type t = Ops.t)
Source for the standard Random generator.
module SysSource : (RNGSource with type t = int)
module SysFloatSource: (RNGSource with type t = float)
Some of the Distribution modules that work on integer
ranges will implement this signature.
module type IDIST =
functor (Ops : Math.Ops with type num_type = Math.integer) →
functor (Source : RNGSource with type t = Ops.t) →
sig
type t = Source.t
val min : t
val max : t
class rng :
t →
t →
object
method genrand : t
method max : t
method min : t
end
end
new UniformDist.rng low high returns an object that will generate
random numbers in the range low ≤ x ≤ high. This needs more
testing.
module UniformDist : IDIST
new UniformDist2.rng low high returns an object that will
generate random numbers in the range low ≤ x < high. See above
about testing, especially since these uniform algorithms didn't come
from Knuth.
module UniformDist2 : IDIST
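To make the source/distribution split concrete, here is a minimal sketch of a source over the standard Random generator and a half-open uniform distribution built on top of it. BitsSource and Uniform2Sketch are hypothetical names, the functor is fixed to int rather than parameterized over Math.Ops, and the modulo reduction is a simple stand-in that is slightly biased; none of this is claimed to be how UniformDist2 works internally.

```ocaml
module type RNGSource = sig
  type t
  val genrand : unit -> t
  val min : t
  val max : t
end

(* Hypothetical source: Random.bits returns 30 random bits, 0 <= x < 2^30. *)
module BitsSource : RNGSource with type t = int = struct
  type t = int
  let genrand () = Random.bits ()
  let min = 0
  let max = 0x3FFFFFFF
end

(* Hypothetical distribution: values in low <= x < high.
   Assumes high - low is no larger than the source's range. *)
module Uniform2Sketch (Source : RNGSource with type t = int) = struct
  type t = int
  let min = Source.min
  let max = Source.max

  class rng (low : t) (high : t) =
    object
      method min = low
      method max = high - 1
      (* Shift the raw value to start at 0, fold it into the requested
         width, then shift it up to start at low. *)
      method genrand = low + ((Source.genrand () - Source.min) mod (high - low))
    end
end

module MyRNGSketch = Uniform2Sketch (BitsSource)
```

As with the real modules, nothing here seeds the underlying generator; call Random.self_init or Random.init yourself before drawing numbers.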
Module type for distributions that need a uniform float source but
return integer types and whose rng class constructor takes one argument.
module type IFDIST =
functor (Ops: Math.Ops with type num_type = Math.integer) →
functor (Source: RNGSource with type t = float) →
sig
type t = Ops.t
val min: t
val max: t
class rng: t → object
method min: t
method max: t
method genrand: t
end
end
new GeometricDist.rng mean returns an object that will generate
random numbers in the geometric distribution using the given mean.
module GeometricDist: IFDIST
new PoissonDist.rng mean returns an object that will generate
random numbers in the Poisson distribution using the given mean.
module PoissonDist: IFDIST
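For a feel of how a uniform float source can drive an integer-valued distribution, here is a sketch of the classic multiplication method for Poisson deviates that Knuth describes for small means. It is written as a plain function over a hypothetical uniform argument rather than as an IFDIST functor, and it is an assumption that PoissonDist uses this particular method.

```ocaml
(* Draws one Poisson-distributed int with the given mean.
   uniform must return floats in the range 0 <= x <= 1. *)
let poisson_sketch ~mean (uniform : unit -> float) : int =
  let limit = exp (-. mean) in
  (* Multiply uniforms together until the product drops below e^(-mean);
     the number of multiplications needed, minus one, is the result. *)
  let rec loop k product =
    let product = product *. uniform () in
    if product < limit then k else loop (k + 1) product
  in
  loop 0 1.0
```

The running time grows with the mean, so this approach only suits small means.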
Signature used by many distributions that work entirely with
floats. These expect a generator that works in the range 0 ≤ x
≤ 1 unless otherwise noted.
module type FDIST =
functor (Source: RNGSource with type t = float) →
sig
type t = float
class rng: object
method genrand: t
method min: t
method max: t
end
val min: t
val max: t
end
new NormalDist.rng returns an object that will generate random
floats in the normal distribution (mean 0, standard deviation 1).
This is also known as the Gaussian distribution.
module NormalDist: FDIST
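One classic way to turn uniform floats into standard normal deviates is the polar method, sketched below as a plain function. The name normal_sketch and the uniform argument are hypothetical, and it is an assumption that NormalDist uses this method rather than another one.

```ocaml
(* Draws one float from the normal distribution with mean 0 and
   standard deviation 1. uniform must return floats in 0 <= x <= 1. *)
let rec normal_sketch (uniform : unit -> float) : float =
  (* Pick a point uniformly in the square [-1, 1] x [-1, 1]. *)
  let v1 = (2.0 *. uniform ()) -. 1.0 in
  let v2 = (2.0 *. uniform ()) -. 1.0 in
  let s = (v1 *. v1) +. (v2 *. v2) in
  (* Reject points outside the unit circle (and the origin) and retry. *)
  if s >= 1.0 || s = 0.0 then normal_sketch uniform
  else v1 *. sqrt (-2.0 *. log s /. s)
```

A second independent deviate, v2 times the same square-root factor, is available from each accepted point; this sketch simply discards it.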
new ExponentialDist.rng mean returns an object that will generate
random floats in an exponential distribution with the given mean.
module ExponentialDist:
functor (Source: RNGSource with type t = float) →
sig
type t = float
val min: t
val max: t
class rng: t → object
method min: t
method max: t
method genrand: t
end
end
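The usual way to get an exponential deviate from a uniform one is the inverse transform x = -mean * ln(u). The sketch below shows that technique as a plain function; exponential_sketch and its uniform argument are hypothetical, and whether ExponentialDist does the same internally is an assumption.

```ocaml
(* Draws one float from the exponential distribution with the given mean.
   uniform must return floats in 0 <= x <= 1. *)
let rec exponential_sketch ~mean (uniform : unit -> float) : float =
  let u = uniform () in
  (* log 0 is negative infinity, so redraw if the source returns 0. *)
  if u <= 0.0 then exponential_sketch ~mean uniform
  else -. mean *. log u
```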
Future possibilities for distributions include
- GammaDist (float)
- BinomialDist (int)
Interface for module MtRand
OCaml version of the Mersenne Twister random number generator based
on mt19937ar.c. See http://www.math.keio.ac.jp/matumoto/emt.html
The seed functions. Only one call to one of these is needed. This should
be enough variety.
Seed from a single value
val init32: int32 → unit
val init: int → unit
Seed from an array of any length
val init_array32: int32 array → unit
val init_array: int array → unit
val init_bigarray32: (int32, Bigarray.int32_elt, α) Bigarray.Array1.t → unit
val init_genarray: (α → int32) → α array → unit
Seed from raw data in a file.
The files used should hold at least 700 ints
val init_from_file : string → unit
val init_from_channel: in_channel → unit
val init_from_descr: Unix.file_descr → unit
Seed from /dev/urandom
val urandom_found: unit → bool
val urandom_init: unit → unit
Seed from /dev/urandom, or a default seed based on time and pid if
/dev/urandom isn't available
val self_init: unit → unit
Functions returning random numbers, of the given types and in the
given ranges. All of these but the int64 ones produce the same output
as the reference C code.
Range: 0≤ x≤0xffffffff
val uint32: unit → int32
Range: 0≤ x≤0x7fffffff
val int32: unit → int32
Range: 0≤ x≤0xffffffffffffffff
val uint64: unit → int64
Range: 0≤ x≤0x7fffffffffffffff
val int64: unit → int64
Range: whichever of the 32-bit or 64-bit ranges above matches the platform's nativeint size
val unativeint: unit → nativeint
val nativeint: unit → nativeint
Range: 0≤ x≤0x7fffffff or 0≤ x ≤0x7fffffffffffffff
val uint: unit → int
Range: 0≤ x≤0x3fffffff or 0≤ x≤0x3fffffffffffffff
val int: unit → int
Range: 0≤ x ≤ 1
val real1: unit → float
Range: 0≤ x < 1
val real2: unit → float
Range: 0<x<1
val real3: unit → float
Range: 0≤ x<1 with 53-bit resolution
val res53: unit → float
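The four float ranges differ only in how raw 32-bit words are scaled. The sketch below mirrors the conversions used by the reference mt19937ar.c code; unlike the MtRand functions, these hypothetical helpers take the raw word (held in an int64, 0 ≤ w ≤ 0xffffffff) as an argument so that the arithmetic can be seen in isolation.

```ocaml
(* 0 <= x <= 1: divide by 2^32 - 1 so the largest word maps to 1. *)
let real1_sketch w = Int64.to_float w *. (1.0 /. 4294967295.0)

(* 0 <= x < 1: divide by 2^32 so the largest word stays below 1. *)
let real2_sketch w = Int64.to_float w *. (1.0 /. 4294967296.0)

(* 0 < x < 1: add one half before dividing so 0 is never produced. *)
let real3_sketch w = (Int64.to_float w +. 0.5) *. (1.0 /. 4294967296.0)

(* 0 <= x < 1 with 53-bit resolution: combine the top 27 bits of one
   word with the top 26 bits of another, then divide by 2^53. *)
let res53_sketch w1 w2 =
  let a = Int64.to_float (Int64.shift_right_logical w1 5) in
  let b = Int64.to_float (Int64.shift_right_logical w2 6) in
  ((a *. 67108864.0) +. b) *. (1.0 /. 9007199254740992.0)
```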
Sources for use with the Rand distributions
module IntSource : (Rand.RNGSource with type t = int)
module Int32Source : (Rand.RNGSource with type t = int32)
module FloatSource : (Rand.RNGSource with type t = float)
Interface for module FileRand
This generator reads raw bits from a RNG character device file, e.g., /dev/urandom.
Example usage:
module RNG =
Rand.UniformDist(Math.Int32Ops)(FileRand.Int32Source(FileRand.Dev_Urandom))
class rng: string → object
method min: int32
method max: int32
method genrand: int32
end
module type FILE = sig
val name: string
end
module Dev_Urandom : FILE
The sources for the Rand Distribution modules take the filename of the device as a module. Weird, but hey. I'm fooling around with the capabilities of functors.
module IntSource (File: FILE): (Rand.RNGSource with type t = int)
module Int32Source (File : FILE): (Rand.RNGSource with type t = int32)
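Passing a filename as a module looks like this in practice. The sketch below is a self-contained approximation: Int32SourceSketch is a hypothetical name, the min and max values and the big-endian decoding via input_binary_int are assumptions, and the real FileRand sources may read and decode the bytes differently.

```ocaml
module type FILE = sig
  val name : string
end

(* Hypothetical source that reads 4 raw bytes per number from File.name. *)
module Int32SourceSketch (File : FILE) = struct
  type t = int32

  (* Raw 32-bit words can be any int32 value. *)
  let min = Int32.min_int
  let max = Int32.max_int

  (* Open the file lazily, on the first call to genrand. *)
  let chan = lazy (open_in_bin File.name)

  let genrand () =
    (* input_binary_int reads 4 bytes as a big-endian signed integer. *)
    Int32.of_int (input_binary_int (Lazy.force chan))
end

(* The device path is supplied as a module, as with FileRand.Dev_Urandom. *)
module Dev_Urandom_Sketch : FILE = struct
  let name = "/dev/urandom"
end
```

Applying the functor, as in Int32SourceSketch (Dev_Urandom_Sketch), yields a module that fits the Rand.RNGSource shape with t = int32.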
Interface for module SkipList
Module that provides the skiplist data structure.
You should compile this with -labels. It can be used in programs
compiled with or without -labels.
module type OrderedType =
sig
type t
val compare: t → t → int
end
module type S =
sig
type key
type α t
(* Create a new skiplist with given max height, or 5 if none is given. *)
val create : ?size:int → unit → α t
(* Wipe out the skiplist *)
val clear : α t → unit
(* The number of elements in the skiplist *)
val size : α t → int
(* Insert an element *)
val add : α t → key:key → data:α → unit
(* Look up an element *)
val find : α t → key → α
(* Remove an element *)
val remove : α t → key → unit
(* Like the list and map functions... *)
val mem : α t → key → bool
val iter : α t → f:(key → α → unit) → unit
val fold : α t → f:(key → α → β → β) → init:β → β
end
module Make(Ord: OrderedType): (S with type key = Ord.t)
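For readers unfamiliar with skiplists, here is a compact sketch of the data structure behind a subset of this interface (create, size, add, find). MakeSketch is a hypothetical name and the node layout, coin-flip height selection and replace-on-duplicate behaviour are assumptions; Stew's own implementation may differ in all of these.

```ocaml
module type OrderedType = sig
  type t
  val compare : t -> t -> int
end

module MakeSketch (Ord : OrderedType) = struct
  type key = Ord.t

  type 'a node = {
    key : key option;                 (* None only for the head sentinel *)
    mutable data : 'a option;
    forward : 'a node option array;   (* forward.(i): next node at level i *)
  }

  type 'a t = { head : 'a node; max_height : int; mutable size : int }

  let create ?(size = 5) () =
    { head = { key = None; data = None; forward = Array.make size None };
      max_height = size; size = 0 }

  let size t = t.size

  (* Flip coins to pick a height between 1 and max_height. *)
  let random_height t =
    let h = ref 1 in
    while !h < t.max_height && Random.bool () do incr h done;
    !h

  (* For each level, find the last node whose key is smaller than k. *)
  let predecessors t k =
    let update = Array.make t.max_height t.head in
    let cur = ref t.head in
    for i = t.max_height - 1 downto 0 do
      let rec advance () =
        match (!cur).forward.(i) with
        | Some ({ key = Some k'; _ } as next) when Ord.compare k' k < 0 ->
            cur := next;
            advance ()
        | _ -> ()
      in
      advance ();
      update.(i) <- !cur
    done;
    update

  let add t ~key ~data =
    let update = predecessors t key in
    match update.(0).forward.(0) with
    | Some ({ key = Some k; _ } as n) when Ord.compare k key = 0 ->
        n.data <- Some data           (* key already present: replace *)
    | _ ->
        let h = random_height t in
        let n = { key = Some key; data = Some data;
                  forward = Array.make h None } in
        (* Splice the new node in after its predecessor on each level. *)
        for i = 0 to h - 1 do
          n.forward.(i) <- update.(i).forward.(i);
          update.(i).forward.(i) <- Some n
        done;
        t.size <- t.size + 1

  let find t k =
    let update = predecessors t k in
    match update.(0).forward.(0) with
    | Some { key = Some k'; data = Some d; _ } when Ord.compare k' k = 0 -> d
    | _ -> raise Not_found
end
```

Usage follows the same pattern as the standard Map functor: apply the functor to a module with a compare function, then call create, add with ~key and ~data labels, and find.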
Interface for module IOExtras
Open a file for output and auto-flush at exit
val open_out : string → out_channel
print_endline on a generic channel or stdout if you don't give one
val output_endline : ?ostream:out_channel → string → unit
Interface for module StrExtras
Returns the first word of a string
val first_word: string → string
Cuts off the first character of a string and returns the rest
val cut_first_char: string → string
Cuts off the first n characters of a string and returns the rest.
val cut_first_n: string → int → string
Cuts off the last character of a string and returns the rest
val cut_last_char: string → string
Cuts off the last n characters of a string and returns the rest
val cut_last_n: string → int → string
Cuts off the first word of a string and returns the rest
val cut_first_word: string → string
Returns everything in str after the sep character
val split_at: str:string → sep:char → string
Concatenates a list of strings with a blank separator
val combine: string list → string
Remove all trailing whitespace
val chomp: string → string
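To pin down what a few of these descriptions mean, here are minimal standard-library sketches of four of them. The _sketch names are hypothetical, the whitespace set used by chomp_sketch is an assumption, and the real functions may treat edge cases (such as n larger than the string length) differently.

```ocaml
(* Drop the first n characters; raises Invalid_argument if n is out of range. *)
let cut_first_n_sketch s n = String.sub s n (String.length s - n)

(* Drop the last n characters; raises Invalid_argument if n is out of range. *)
let cut_last_n_sketch s n = String.sub s 0 (String.length s - n)

(* Join strings with a single blank between them. *)
let combine_sketch strings = String.concat " " strings

(* Remove trailing spaces, tabs, newlines and carriage returns. *)
let chomp_sketch s =
  let is_space c = c = ' ' || c = '\t' || c = '\n' || c = '\r' in
  let n = ref (String.length s) in
  while !n > 0 && is_space s.[!n - 1] do
    decr n
  done;
  String.sub s 0 !n
```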
Interface for module Syslog
The assorted logging facilities. The default is LOG_USER. You can set a new default with openlog, or give a specific facility per syslog call.
type facility = LOG_KERN | LOG_USER | LOG_MAIL | LOG_DAEMON | LOG_AUTH
| LOG_SYSLOG | LOG_LPR | LOG_NEWS | LOG_UUCP | LOG_CRON
| LOG_AUTHPRIV | LOG_FTP | LOG_NTP | LOG_SECURITY
| LOG_CONSOLE | LOG_LOCAL0 | LOG_LOCAL1 | LOG_LOCAL2
| LOG_LOCAL3 | LOG_LOCAL4 | LOG_LOCAL5 | LOG_LOCAL6
| LOG_LOCAL7
Flags to pass to openlog. LOG_CONS isn't implemented yet.
type flag = LOG_CONS | LOG_NDELAY | LOG_PERROR | LOG_PID
The priority of the error.
type level = LOG_EMERG | LOG_ALERT | LOG_CRIT | LOG_ERR | LOG_WARNING
| LOG_NOTICE | LOG_INFO | LOG_DEBUG
If your syslogd unix socket isn't /dev/log, call this before openlog or syslog to change it to the proper file
val set_logpath: string → unit
Same as openlog(3)
val openlog: string → flag list → facility → unit
Same as syslog(3), except there are no format strings.
val syslog: ?fac:facility → level → string → unit
Close the log
val closelog: unit → unit
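For background on how a facility and a level end up in a log message: in the traditional syslog wire format, the two are combined into one priority number, facility code times 8 plus level code, written in angle brackets at the start of the message. The sketch below shows that arithmetic on a subset of the constructors above. The frame_sketch function is hypothetical and only illustrates the encoding; that Stew builds its messages this way is an assumption.

```ocaml
(* Subset of the facilities above, with their conventional numeric codes. *)
type facility = LOG_KERN | LOG_USER | LOG_MAIL | LOG_DAEMON | LOG_LOCAL0

type level = LOG_EMERG | LOG_ALERT | LOG_CRIT | LOG_ERR | LOG_WARNING
           | LOG_NOTICE | LOG_INFO | LOG_DEBUG

let facility_code = function
  | LOG_KERN -> 0
  | LOG_USER -> 1
  | LOG_MAIL -> 2
  | LOG_DAEMON -> 3
  | LOG_LOCAL0 -> 16

let level_code = function
  | LOG_EMERG -> 0
  | LOG_ALERT -> 1
  | LOG_CRIT -> 2
  | LOG_ERR -> 3
  | LOG_WARNING -> 4
  | LOG_NOTICE -> 5
  | LOG_INFO -> 6
  | LOG_DEBUG -> 7

(* Prefix the message with <priority>, defaulting to LOG_USER. *)
let frame_sketch ?(fac = LOG_USER) lev msg =
  Printf.sprintf "<%d>%s" ((facility_code fac * 8) + level_code lev) msg
```

The optional ?fac argument mirrors the signature of syslog above: leave it out to use the default facility, or pass ~fac to override it for one call.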
Interface for module CharExtras
Set and query the current CTYPE locale
val set_chartype : string → string option
val get_chartype : unit → string
Test characters for certain properties. Wrapper for the C <ctype.h>
header, basically.
external is_alpha : char → bool = "stew_is_alpha"
external is_space : char → bool = "stew_is_space"
external is_number : char → bool = "stew_is_number"
external is_lower : char → bool = "stew_is_lower"
external is_upper : char → bool = "stew_is_upper"
val is_alphanumeric : char → bool
external is_punctation : char → bool = "stew_is_punct"
external is_printable : char → bool = "stew_is_print"
external is_graphical : char → bool = "stew_is_graph"
external is_hexadecimal : char → bool = "stew_is_xdigit"
external to_lower : char → char = "stew_to_lower"
external to_upper : char → char = "stew_to_upper"
Interface for module Locale
Locale types.
type category = LC_ALL | LC_COLLATE | LC_CTYPE | LC_MESSAGES | LC_MONETARY
| LC_NUMERIC | LC_TIME
type numeric_lconv = {
decimal_point: string;
thousands_sep: string;
grouping: string;
}
type sign_pos = SurroundBoth | SignPrecedesBoth | SignSucceedsBoth
| SignPrecedsCS | SignSucceedsCS | UnknownOrder
type monetary_lconv = {
int_curr_symbol: string;
currency_symbol: string;
decimal_point: string;
thousands_sep: string;
grouping: string;
positive_sign: string;
negative_sign: string;
int_frac_digits: int;
frac_digits: int;
p_cs_precedes: bool;
p_sep_by_space: bool;
n_cs_precedes: bool;
n_sep_by_space: bool;
p_sign_posn: sign_pos;
n_sign_posn: sign_pos;
}
If a new locale name is not provided, returns Some name, where name is the
current locale for the category. An empty name sets the locale based
on environment variables. Returns None when setting the locale fails.
external set : ?name:string → category → string option = "stew_set_locale"
Set the locale based on environment variables
val set_from_env : category → string option
Get the current locale for a category
val get : category → string
external numeric_info: unit → numeric_lconv = "stew_localeconv_n"
external monetary_info: unit → monetary_lconv = "stew_localeconv_m"
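As an example of what the numeric_lconv record is good for, the sketch below inserts the locale's thousands separator into an integer. The record type is repeated so the snippet stands alone, group_thousands_sketch is a hypothetical helper, and it assumes fixed groups of three digits instead of interpreting the grouping field. With Stew you would pass the result of Locale.numeric_info () as info.

```ocaml
type numeric_lconv = {
  decimal_point : string;
  thousands_sep : string;
  grouping : string;
}

(* Format n with info.thousands_sep between groups of three digits.
   Simplification: the grouping field is ignored. *)
let group_thousands_sketch info n =
  let digits = string_of_int (abs n) in
  let len = String.length digits in
  let buf = Buffer.create (len + (len / 3) + 1) in
  if n < 0 then Buffer.add_char buf '-';
  String.iteri
    (fun i c ->
      (* A separator goes before every digit that starts a full group. *)
      if i > 0 && (len - i) mod 3 = 0 then
        Buffer.add_string buf info.thousands_sep;
      Buffer.add_char buf c)
    digits;
  Buffer.contents buf
```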
Interface for module Time
Time utility functions to fill in gaps in the Unix library and to
be more to my liking.
Like the Unix functions of the same names, but storing the time in int32 rather than float.
external time : unit → int32 = "stew_time_int32"
external gmtime: int32 → Unix.tm = "stew_gmtime"
external localtime: int32 → Unix.tm = "stew_localtime"
external mktime: Unix.tm → int32 × Unix.tm = "stew_mktime"
Wrapper for the C strftime() function. See your system's strftime()
documentation for details on the string argument (strftime()'s format
argument).
external format_tm: string → Unix.tm → string = "stew_strftime_tm"
val format_time: string → int32 → string
Returns the time as a string with trailing newline
external ctime: int32 → string = "stew_ctime"
external asctime: Unix.tm → string = "stew_asctime"
Returns the time as a string without trailing newline
val time_string: int32 → string
val tm_string: Unix.tm → string
Interface for module UnixExtras
Runs f in a new daemon process. The calling process will exit
if the second argument is true.
For example:
let _ = UnixExtras.make_daemon server_func true
will start a new process and run server_func in it, while the original
process will exit. See Stevens, Advanced Programming in the Unix Environment
for details.
val make_daemon: (unit → unit) → bool → unit
Reads len bytes from the src descr with offset start and
copies them to the dest descr. Currently implemented only for Linux
and FreeBSD. The FreeBSD sendfile() system call requires that dest be
a socket.
Raises a Unix_error on failure.
external send_file: src:Unix.file_descr → dest:Unix.file_descr → start:int → len:int → int = "stew_sendfile"
More servent functions not provided by the Unix
module. They do the same thing as the C versions.
external getservent: unit → Unix.service_entry = "stew_getservent"
external setservent: bool → unit = "stew_setservent"
external endservent: unit → unit = "stew_endservent"