Module Float_intf

module Float_intf: sig .. end
Floating-point representation and utilities.


module Binable: Binable0
module type S = sig .. end


max and min will return nan if either argument is nan.

The validate_* functions always fail if class is Nan or Infinite.

The results of robust comparisons on nan should be considered undefined.

validate_ordinary fails if class is Nan or Infinite.

equal to infinity

equal to neg_infinity

See Robust_compare

The difference between 1.0 and the smallest exactly representable floating-point number greater than 1.0. That is:

epsilon_float = (one_ulp `Up 1.0) -. 1.0

This gives the relative accuracy of type t, in the sense that for numbers on the order of x, the roundoff error is on the order of x *. epsilon_float.

See also: http://en.wikipedia.org/wiki/Machine_epsilon
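The definition above can be checked with the standard library alone. This is a minimal sketch that assumes Stdlib's Float.succ is an acceptable stand-in for one_ulp `Up; the names eps_from_ulp and spacing_at_1024 are illustrative and not part of this interface.

```ocaml
(* Assumption: Float.succ (Stdlib, OCaml >= 4.08) plays the role of
   [one_ulp `Up]. *)
let eps_from_ulp = Float.succ 1.0 -. 1.0

(* Spacing between 1024.0 and the next representable float: the local
   roundoff scale for numbers on the order of 1024. *)
let spacing_at_1024 = Float.succ 1024.0 -. 1024.0

let () =
  Printf.printf "eps = %h, spacing at 1024 = %h\n"
    eps_from_ulp spacing_at_1024
```

Since 1024.0 is a power of two, the spacing there is exactly 1024 times the spacing at 1.0, which is the "x *. epsilon_float" scaling described above.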

min_positive_subnormal_value = 2 ** -1074
min_positive_normal_value = 2 ** -1022

An order-preserving bijection between all floats except for nans, and all int64s with absolute value smaller than or equal to 2**63 - 2**52. Note both 0. and -0. map to 0L.

returns nan if the absolute value of the argument is too large
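One way to obtain such a bijection is to reuse the IEEE bit pattern of the absolute value. The sketch below uses only Stdlib; the function names and the option return type are assumptions made for illustration, and nothing here is claimed to be the library's own implementation.

```ocaml
(* 2**63 - 2**52, which is also the bit pattern of infinity. *)
let limit = 0x7FF0_0000_0000_0000L

(* Hypothetical name. nan has no image, so this sketch returns None for it. *)
let to_int64_preserve_order (x : float) : int64 option =
  if Float.is_nan x then None
  else if x = 0. then Some 0L                      (* both 0. and -0. *)
  else if x > 0. then Some (Int64.bits_of_float x)
  else Some (Int64.neg (Int64.bits_of_float (-. x)))

(* Hypothetical name. Out-of-range arguments yield nan. *)
let of_int64_preserve_order (n : int64) : float =
  if Int64.compare n limit > 0 || Int64.compare n (Int64.neg limit) < 0
  then nan
  else if Int64.compare n 0L >= 0 then Int64.float_of_bits n
  else -. Int64.float_of_bits (Int64.neg n)
```

The mapping is order-preserving because, for non-negative floats, the IEEE bit patterns read as integers increase with the float value.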

The next or previous representable float. ULP stands for "unit of least precision", and is the spacing between floating point numbers. Both one_ulp `Up infinity and one_ulp `Down neg_infinity return a nan.
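A rough Stdlib-only sketch of this behaviour follows. It assumes Float.succ and Float.pred as the underlying primitives and adds the nan results at the infinities by hand; it is an illustration, not the library's implementation.

```ocaml
(* Sketch: Float.succ infinity stays at infinity in Stdlib, so the two
   edge cases documented above are special-cased to return nan. *)
let one_ulp dir x =
  match dir with
  | `Up -> if x = infinity then nan else Float.succ x
  | `Down -> if x = neg_infinity then nan else Float.pred x

let () =
  Printf.printf "next after 1.0: %h\n" (one_ulp `Up 1.0);
  Printf.printf "previous before 1.0: %h\n" (one_ulp `Down 1.0)
```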

round rounds a float to an integer float. iround{,_exn} rounds a float to an int. Both round according to a direction dir, with default dir being `Nearest.

        | `Down    | rounds toward Float.neg_infinity                             |
        | `Up      | rounds toward Float.infinity                                 |
        | `Nearest | rounds to the nearest int ("round half-integers up")         |
        | `Zero    | rounds toward zero                                           |
     

iround_exn raises when trying to handle nan or trying to handle a float outside the range [float min_int, float max_int).

Here are some examples for round for each direction:

        | `Down    | [-2.,-1.)   to -2. | [-1.,0.)   to -1. | [0.,1.) to 0., [1.,2.) to 1. |
        | `Up      | (-2.,-1.]   to -1. | (-1.,0.]   to -0. | (0.,1.] to 1., (1.,2.] to 2. |
        | `Zero    | (-2.,-1.]   to -1. | (-1.,1.)   to 0.  | [1.,2.) to 1.                |
        | `Nearest | [-1.5,-0.5) to -1. | [-0.5,0.5) to 0.  | [0.5,1.5) to 1.              |
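
The table can be reproduced with a small Stdlib-only sketch. The function below is an illustration under the assumption that `Nearest means "round half-integers up" as stated earlier; it is not the library's implementation, and signed-zero results may differ from it.

```ocaml
(* Sketch of [round] with a direction argument, default `Nearest. *)
let round ?(dir = `Nearest) x =
  match dir with
  | `Down -> floor x
  | `Up -> ceil x
  | `Zero -> Float.trunc x
  | `Nearest ->
    (* Compare the distance to the floor against 0.5 instead of computing
       floor (x +. 0.5), which misrounds inputs just below 0.5. *)
    let f = floor x in
    if x -. f >= 0.5 then f +. 1. else f

let () =
  List.iter
    (fun x -> Printf.printf "%g -> %g\n" x (round x))
    [ -1.5; -0.5; 0.5; 1.5 ]
```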
     

For convenience, versions of these functions with the dir argument hard-coded are provided. If you are writing performance-critical code you should use the versions with the hard-coded arguments (e.g. iround_down_exn). The _exn ones are the fastest.

The following properties hold:



If f <= iround_lbound || f >= iround_ubound, then iround* functions will refuse to round f, returning None or raising as appropriate.
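The refusal behaviour can be sketched as follows. The exact values of iround_lbound and iround_ubound are not given here, so this sketch assumes the interval [float min_int, float max_int) mentioned earlier; the function names are illustrative.

```ocaml
(* Sketch: round toward zero, returning None when the float is nan or
   outside the assumed range [float min_int, float max_int). *)
let iround_towards_zero x =
  if x >= float_of_int min_int && x < float_of_int max_int
  then Some (int_of_float x)
  else None

(* The raising variant; the exception type is an assumption. *)
let iround_towards_zero_exn x =
  match iround_towards_zero x with
  | Some i -> i
  | None -> invalid_arg "iround_towards_zero_exn: nan or out of range"
```

Because every comparison involving nan is false, nan falls into the None branch without a separate check.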

includes positive and negative Float.infinity

min and max that return the other value if one of the values is a nan. Returns nan if both arguments are nan.
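A minimal Stdlib-only sketch of the nan-ignoring variants, contrasted with Float.min, which propagates nan. The names min_inan and max_inan are illustrative assumptions.

```ocaml
(* Return the other argument if one is nan; nan only if both are nan. *)
let min_inan x y =
  if Float.is_nan y then x
  else if Float.is_nan x then y
  else if x < y then x else y

let max_inan x y =
  if Float.is_nan y then x
  else if Float.is_nan x then y
  else if x > y then x else y

let () =
  Printf.printf "min_inan nan 1. = %g, Float.min nan 1. = %g\n"
    (min_inan nan 1.) (Float.min nan 1.)
```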

Returns the fractional part and the whole (i.e. integer) part. For example, modf (-3.14) returns { fractional = -0.14; integral = -3.; }. Note that both parts carry the sign of the argument.
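For comparison, Stdlib's modf returns the same two parts as a pair rather than a record. This sketch uses the Stdlib function only and makes no claim about how this interface represents the result internally.

```ocaml
(* Stdlib.modf returns (fractional part, integral part). *)
let fractional, integral = modf (-3.14)

let () =
  Printf.printf "fractional = %g, integral = %g\n" fractional integral
```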

mod_float x y returns a result with the same sign as x. It returns nan if y is 0. It is basically

 let mod_float x y = x -. float(truncate(x/.y)) *. y

not

 let mod_float x y = x -. floor(x/.y) *. y 

and therefore resembles mod on integers more than %.
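The difference between the two definitions only shows up when the operands have different signs. The sketch below names them trunc_mod and floor_mod (illustrative names) and compares them with Stdlib's mod_float.

```ocaml
(* Truncation-based remainder: result has the sign of x. *)
let trunc_mod x y = x -. Float.trunc (x /. y) *. y

(* Floor-based remainder: result has the sign of y. *)
let floor_mod x y = x -. floor (x /. y) *. y

let () =
  Printf.printf "trunc_mod (-7.) 3. = %g\n" (trunc_mod (-7.) 3.);
  Printf.printf "floor_mod (-7.) 3. = %g\n" (floor_mod (-7.) 3.);
  Printf.printf "mod_float (-7.) 3. = %g\n" (mod_float (-7.) 3.)
```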

Ordinary functions for arithmetic operations

These are for modules that inherit from t; otherwise, the infix operators are more convenient.

A sub-module designed to be opened to make working with floats more convenient.

Like to_string, but guaranteed to be round-trippable.

It usually yields as few significant digits as possible. That is, it won't print 3.14 as 3.1400000000000001243. The only exception is that occasionally it will output 17 significant digits when the number can be represented with just 16 (but not 15 or fewer) of them.
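One scheme that is consistent with this description is to try 15 significant digits first and fall back to 17 when the result does not parse back to the same float. The sketch below is an assumption about how such a function could work, not a statement of the library's implementation; the function name is illustrative.

```ocaml
(* Try 15 significant digits; if that loses information, use 17, which is
   always enough for a 64-bit float. 16 is never tried, which would explain
   the exception noted above. *)
let to_string_round_trippable x =
  let s = Printf.sprintf "%.15g" x in
  if float_of_string s = x then s else Printf.sprintf "%.17g" x

let () =
  print_endline (to_string_round_trippable 3.14);
  print_endline (to_string_round_trippable (0.1 +. 0.2))
```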

Pretty-prints a float. For example:

        to_string_hum ~decimals:3 1234.1999 = "1_234.200"
        to_string_hum ~decimals:3 ~strip_zero:true 1234.1999 = "1_234.2"

No delimiters are inserted to the right of the decimal point.

delimiter defaults to '_'

decimals defaults to 3

strip_zero defaults to false
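
The behaviour described above can be approximated with Stdlib alone. The sketch below handles finite inputs only; dropping the decimal point when strip_zero removes every fractional digit is an assumption of this sketch, not documented behaviour.

```ocaml
(* Sketch of to_string_hum: format with a fixed number of decimals, insert
   a delimiter every three digits of the integer part, optionally strip
   trailing zeros of the fractional part. *)
let to_string_hum ?(delimiter = '_') ?(decimals = 3) ?(strip_zero = false) x =
  let s = Printf.sprintf "%.*f" decimals x in
  let neg = String.length s > 0 && s.[0] = '-' in
  let body = if neg then String.sub s 1 (String.length s - 1) else s in
  let int_part, frac_part =
    match String.index_opt body '.' with
    | Some i ->
      (String.sub body 0 i,
       String.sub body (i + 1) (String.length body - i - 1))
    | None -> (body, "")
  in
  let n = String.length int_part in
  let buf = Buffer.create (n + n / 3) in
  String.iteri
    (fun i c ->
       if i > 0 && (n - i) mod 3 = 0 then Buffer.add_char buf delimiter;
       Buffer.add_char buf c)
    int_part;
  let frac_part =
    if strip_zero then begin
      let j = ref (String.length frac_part) in
      while !j > 0 && frac_part.[!j - 1] = '0' do decr j done;
      String.sub frac_part 0 !j
    end else frac_part
  in
  (if neg then "-" else "")
  ^ Buffer.contents buf
  ^ (if frac_part = "" then "" else "." ^ frac_part)
```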

Produce a lossy compact string representation of the float. The float is scaled by an appropriate power of 1000 and rendered with one digit after the decimal point, except that the decimal point is written as '.', 'k', 'm', 'g', 't', or 'p' to indicate the scale factor. (However, if the digit after the "decimal" point is 0, it is suppressed.) The smallest scale factor that allows the number to be rendered with at most 3 digits to the left of the decimal is used. If the number is too large for this format (i.e., the absolute value is at least 999.95e15), scientific notation is used instead. E.g.:

        to_padded_compact_string     (-0.01) =  "-0  "
        to_padded_compact_string       1.89  =   "1.9"
        to_padded_compact_string 999_949.99  = "999k9"
        to_padded_compact_string 999_950.    =   "1m "
      

In the case where the digit after the "decimal", or the "decimal" itself are omitted, the numbers are padded on the right with spaces to ensure the last two columns of the string always correspond to the decimal and the digit afterward (except in the case of scientific notation, where the exponent is the right-most element in the string and could take up to four characters).

        to_padded_compact_string    1. =    "1  ";
        to_padded_compact_string  1.e6 =    "1m ";
        to_padded_compact_string 1.e16 = "1.e+16";
        to_padded_compact_string max_finite_value = "1.8e+308";
      

Numbers in the range -.05 < x < .05 are rendered as "0  " or "-0  ".

Other cases:

        to_padded_compact_string nan          =  "nan  "
        to_padded_compact_string infinity     =  "inf  "
        to_padded_compact_string neg_infinity = "-inf  "
      

Exact ties are resolved to even in the decimal:

        to_padded_compact_string      3.25 =  "3.2"
        to_padded_compact_string      3.75 =  "3.8"
        to_padded_compact_string 33_250.   = "33k2"
        to_padded_compact_string 33_350.   = "33k4"
      


ldexp x n returns x *. 2 ** n

frexp f returns the pair of the significand and the exponent of f. When f is zero, the significand x and the exponent n of f are equal to zero. When f is non-zero, they are defined by f = x *. 2 ** n and 0.5 <= x < 1.0.
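Stdlib provides functions with the same names and contracts, which makes the relationship easy to check. The sketch below assumes the Stdlib versions behave as described here; the value names are illustrative.

```ocaml
(* 12.0 = 0.75 *. 2 ** 4, with the significand in [0.5, 1.0). *)
let significand, exponent = frexp 12.0

(* ldexp undoes frexp. *)
let rebuilt = ldexp significand exponent

let () =
  Printf.printf "12.0 = %g *. 2 ** %d, rebuilt = %g\n"
    significand exponent rebuilt
```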

Returns the Class.t of a float. Excluding nan, the floating-point "number line" looks like:
               t                Class.t    example
             ^ neg_infinity     Infinite   neg_infinity
             | neg normals      Normal     -3.14
             | neg subnormals   Subnormal  -. 2. ** -1023.
             | (-/+) zero       Zero       0.
             | pos subnormals   Subnormal  2. ** -1023.
             | pos normals      Normal     3.14
             v infinity         Infinite   infinity
     


is_finite t returns true iff classify t is in [Normal; Subnormal; Zero].
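The same classification is available in Stdlib as classify_float, whose constructors (FP_normal, FP_subnormal, FP_zero, FP_infinite, FP_nan) correspond to the classes above. The sketch below defines is_finite on top of it as an illustration; it is not the library's implementation.

```ocaml
(* Finite means normal, subnormal, or zero. *)
let is_finite x =
  match classify_float x with
  | FP_normal | FP_subnormal | FP_zero -> true
  | FP_infinite | FP_nan -> false

(* 2 ** -1023 lies below the smallest normal value 2 ** -1022. *)
let tiny = ldexp 1.0 (-1023)

let () =
  Printf.printf "is_finite tiny = %b, is_finite infinity = %b\n"
    (is_finite tiny) (is_finite infinity)
```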

These functions construct and destruct 64-bit floating-point numbers based on their IEEE representation with a sign bit, an 11-bit non-negative (biased) exponent, and a 52-bit non-negative mantissa (or significand). See Wikipedia for details of the encoding: http://en.wikipedia.org/wiki/Double-precision_floating-point_format.

In particular, if 1 <= exponent <= 2046, then:

create_ieee_exn ~negative:false ~exponent ~mantissa = 2 ** (exponent - 1023) * (1 + (2 ** -52) * mantissa)
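
The encoding can be sketched with Stdlib's Int64 bit conversions. In this sketch the mantissa is an int64 and range errors raise Invalid_argument; both choices are assumptions, as are the names of the destructors.

```ocaml
let mantissa_mask = 0xF_FFFF_FFFF_FFFFL (* low 52 bits *)

(* Assemble sign bit, 11-bit biased exponent and 52-bit mantissa. *)
let create_ieee_exn ~negative ~exponent ~mantissa =
  if exponent < 0 || exponent > 2047 then
    invalid_arg "create_ieee_exn: exponent out of range";
  if Int64.compare mantissa 0L < 0
  || Int64.compare mantissa mantissa_mask > 0 then
    invalid_arg "create_ieee_exn: mantissa out of range";
  let sign = if negative then Int64.min_int else 0L in
  let exp_bits = Int64.shift_left (Int64.of_int exponent) 52 in
  Int64.float_of_bits (Int64.logor sign (Int64.logor exp_bits mantissa))

(* Hypothetical destructors, reading the same three fields back. *)
let ieee_negative x = Int64.compare (Int64.bits_of_float x) 0L < 0

let ieee_exponent x =
  Int64.to_int
    (Int64.logand (Int64.shift_right_logical (Int64.bits_of_float x) 52) 0x7FFL)

let ieee_mantissa x = Int64.logand (Int64.bits_of_float x) mantissa_mask
```

For example, exponent 1023 with mantissa 2 ** 51 gives 2 ** 0 * (1 + 0.5) = 1.5, in line with the formula above.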

S-expressions contain at most 8 significant digits.