Feature Name: rescript-integers
Start Date: 2024-11-19
RFC PR: #1
ReScript Issue: (leave this empty)

Summary

Semantic definition of ReScript's int type and its integer primitives.

Motivation

ReScript has three numeric primitive types, int, float, and bigint.

The semantics of float and bigint match JavaScript's exactly, but int is unique to ReScript and originally came from OCaml's int type.

int stands for 32-bit signed integers. It is unusual for a language to offer int32 as its only integer precision. This is mostly for historical reasons, and the type's behavior is not obvious because it differs from that of JavaScript's number.

This RFC describes its semantics and chosen trade-offs as precisely as possible.

Definition

int is a built-in type.

type int

A numeric literal with only an integer part has type int.

let n = 100

The valid range of an integer literal is the range of signed 32-bit integers, $[-2^{31}, 2^{31}-1]$.

A literal outside this range results in a compile-time error with a message such as "Integer literal exceeds the range of representable integers of type int."

Primitives

Let min_value be $-2^{31}$ and max_value be $2^{31}-1$.

fromNumber: (x: number) => int

  1. Let int32 be ToInt32(x), return int32.

The ToInt32 operation follows its definition in ECMA-262 as is. In practice, the ReScript compiler implements it as a bitwise OR with zero, which appears in the output as number | 0. This drops the fractional part, wraps out-of-range values modulo $2^{32}$, and maps the IEEE-754 special values (NaN, Infinity, -Infinity, and -0) to 0.
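To make the truncation concrete, here is a minimal TypeScript sketch of fromNumber as it would behave in the JavaScript output. The `Int` alias and the lowercase function name are illustrative assumptions, not names taken from the compiler.

```typescript
// Illustrative alias only: TypeScript does not enforce the int32 range.
type Int = number;

// Hypothetical model of the fromNumber primitive: ToInt32 via `x | 0`.
const fromNumber = (x: number): Int => x | 0;

console.log(fromNumber(1.9));        // 1  (fraction dropped, toward zero)
console.log(fromNumber(-1.9));       // -1
console.log(fromNumber(NaN));        // 0
console.log(fromNumber(Infinity));   // 0
console.log(fromNumber(2147483648)); // -2147483648 (2^31 wraps modulo 2^32)
console.log(fromNumber(4294967301)); // 5 (2^32 + 5 wraps to 5)
```

None of these inputs produces an error, which is why silent truncation is easy to miss.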

fromNumber shouldn't be directly exposed to users. Applying the ToInt32 operation to special numeric values, such as Infinity, can lead to subtle bugs (see footnotes 1, 2, and 3).

Instead, public APIs should wrap it and perform bounds-checking where necessary, then either raise errors (explained in the "API consideration" section below) or notify the user via a compiler warning.

int never contains the following values:

  • -0
  • NaN
  • Infinity and -Infinity
  • any $x$ less than min_value
  • any $x$ greater than max_value

fromNumber(x) must be idempotent: fromNumber(fromNumber(x)) equals fromNumber(x) for every number x.
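The excluded values and the idempotence requirement can be checked mechanically. The sketch below assumes fromNumber is modeled as `x | 0`. `isValidInt` and the sample list are hypothetical helpers introduced only for this illustration.

```typescript
const fromNumber = (x: number): number => x | 0;
const MIN_VALUE = -2147483648; // -2^31
const MAX_VALUE = 2147483647;  // 2^31 - 1

// True when v is a value that an int is allowed to hold:
// an integer, not -0, and within [MIN_VALUE, MAX_VALUE].
const isValidInt = (v: number): boolean =>
  Number.isInteger(v) && !Object.is(v, -0) && v >= MIN_VALUE && v <= MAX_VALUE;

const samples = [-0, NaN, Infinity, -Infinity, 0.5, -2147483649, 2147483648, 1e21, 42];

// Every result is a valid int, and applying fromNumber twice changes nothing.
const allValid = samples.every((x) => isValidInt(fromNumber(x)));
const idempotent = samples.every((x) => fromNumber(fromNumber(x)) === fromNumber(x));

console.log(allValid, idempotent); // true true
```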

minus: (x: int) => int

  1. Let number be mathematically $-x$.
  2. Let int32 be fromNumber(number), return int32.

add: (x: int, y: int) => int

  1. Let number be mathematically $x + y$.
  2. Let int32 be fromNumber(number), return int32.

subtract: (x: int, y: int) => int

  1. Let number be mathematically $x - y$.
  2. Let int32 be fromNumber(number), return int32.
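As a sketch of how minus, add, and subtract wrap at the boundaries, the following TypeScript models them on top of `x | 0`. The sum or difference of two int32 values is always exactly representable as a JavaScript number, so this direct translation agrees with the mathematical definition. The constant names are assumptions for illustration.

```typescript
const fromNumber = (x: number): number => x | 0;
const MIN_VALUE = -2147483648; // -2^31
const MAX_VALUE = 2147483647;  // 2^31 - 1

const minus = (x: number): number => fromNumber(-x);
const add = (x: number, y: number): number => fromNumber(x + y);
const subtract = (x: number, y: number): number => fromNumber(x - y);

console.log(add(MAX_VALUE, 1));      // -2147483648 (wraps to MIN_VALUE)
console.log(subtract(MIN_VALUE, 1)); // 2147483647  (wraps to MAX_VALUE)
console.log(minus(MIN_VALUE));       // -2147483648 (negation has no int32 counterpart)
console.log(Object.is(minus(0), 0)); // true (the result is +0, never -0)
```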

multiply: (x: int, y: int) => int

  1. Let number be mathematically $x * y$.
  2. Let int32 be fromNumber(number), return int32.

multiply(x, y) must produce the same result as adding x to an accumulator y times, starting from 0.

let multiply = (x, y) => {
  let id = 0
  let rec multiply = (x, y, acc) => {
    switch y {
    | 0 => acc
    | n => multiply(x, n - 1, add(x, acc))
    }
  }
  multiply(x, y, id)
}
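One consequence worth illustrating: in JavaScript, `(x * y) | 0` does not always satisfy this requirement, because the floating-point product of two large int32 values can exceed 53 bits of precision before truncation. The sketch below compares a naive translation with `Math.imul`, which performs exact 32-bit multiplication. The three function names are hypothetical, and this is not a statement about what the compiler emits.

```typescript
const fromNumber = (x: number): number => x | 0;
const add = (x: number, y: number): number => fromNumber(x + y);

// Naive translation: the double-precision product may be rounded before ToInt32.
const multiplyNaive = (x: number, y: number): number => fromNumber(x * y);

// Exact 32-bit multiplication.
const multiplyExact = (x: number, y: number): number => Math.imul(x, y);

// Reference by repeated addition; only practical for small non-negative y.
const multiplyByAddition = (x: number, y: number): number => {
  let acc = 0;
  for (let n = y; n !== 0; n = fromNumber(n - 1)) acc = add(x, acc);
  return acc;
};

console.log(multiplyByAddition(65536, 65536));     // 0 (2^32 wraps to 0)
console.log(multiplyExact(65536, 65536));          // 0
console.log(multiplyExact(2147483647, 2147483647)); // 1 (exact result modulo 2^32)
console.log(multiplyNaive(2147483647, 2147483647)); // 0 (low bit lost to rounding)
```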

exponentiate: (x: int, y: int) => int

  1. Let number be mathematically $x ^ y$.
  2. Let int32 be fromNumber(number), return int32.

exponentiate(x, y) must produce the same result as multiplying an accumulator by x y times, starting from 1.

let exponentiate = (x, y) => {
  let id = 1
  let rec exponentiate = (x, y, acc) => {
    switch y {
    | 0 => acc
    | n => exponentiate(x, n - 1, multiply(x, acc))
    }
  }
  exponentiate(x, y, id)
}
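The same accumulation can be sketched in TypeScript for non-negative exponents, using `Math.imul` as an exact 32-bit multiply. Negative exponents are deliberately left out of this illustration, and the function body is an assumption rather than the compiler's implementation.

```typescript
// Repeated exact 32-bit multiplication, starting from the identity 1.
// Assumes y is a non-negative int.
const exponentiate = (x: number, y: number): number => {
  let acc = 1;
  for (let n = 0; n < y; n++) acc = Math.imul(x, acc);
  return acc;
};

console.log(exponentiate(7, 0));  // 1
console.log(exponentiate(2, 10)); // 1024
console.log(exponentiate(2, 31)); // -2147483648 (2^31 wraps to min_value)
console.log(exponentiate(2, 32)); // 0
console.log(exponentiate(3, 21)); // 1870418611 (10460353203 modulo 2^32)
```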

divide: (x: int, y: int) => int

  1. If y equals 0, raise Divide_by_zero.
  2. Let number be mathematically $x / y$.
  3. Let int32 be fromNumber(number), return int32.

remainder: (x: int, y: int) => int

  1. If y equals 0, raise Divide_by_zero.
  2. Let number be mathematically $x \mod y$.
  3. Let int32 be fromNumber(number), return int32.
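A minimal sketch of divide and remainder follows, assuming the remainder takes the sign of the dividend as JavaScript's `%` operator does (the text above does not state which convention `mod` denotes). Divide_by_zero is modeled as a plain `Error` with that message purely for illustration.

```typescript
const fromNumber = (x: number): number => x | 0;
const MIN_VALUE = -2147483648; // -2^31

const divide = (x: number, y: number): number => {
  if (y === 0) throw new Error("Divide_by_zero");
  return fromNumber(x / y); // the quotient is truncated toward zero
};

const remainder = (x: number, y: number): number => {
  if (y === 0) throw new Error("Divide_by_zero");
  return fromNumber(x % y); // `| 0` also normalizes a -0 result to +0
};

console.log(divide(7, 2));             // 3
console.log(divide(-7, 2));            // -3
console.log(divide(MIN_VALUE, -1));    // -2147483648 (2^31 wraps back to min_value)
console.log(remainder(-7, 2));         // -1
console.log(remainder(7, -2));         // 1
console.log(remainder(MIN_VALUE, -1)); // 0
```

The divide(min_value, -1) case is the only quotient of two ints that falls outside the int32 range, so it is the only one affected by wrapping.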

abs: (x: int) => int

  1. If x is min_value, raise Overflow_value.
  2. If x is less than 0, return minus(x).
  3. Return x.
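The steps for abs can be sketched as follows. The explicit min_value check matters because minus(min_value) wraps back to min_value, which would otherwise make abs return a negative number. Overflow_value is modeled as a plain `Error` here as an assumption.

```typescript
const fromNumber = (x: number): number => x | 0;
const MIN_VALUE = -2147483648; // -2^31

const minus = (x: number): number => fromNumber(-x);

const abs = (x: number): number => {
  // |min_value| = 2^31 is not representable as an int.
  if (x === MIN_VALUE) throw new Error("Overflow_value");
  return x < 0 ? minus(x) : x;
};

console.log(abs(-5));          // 5
console.log(abs(5));           // 5
console.log(abs(-2147483647)); // 2147483647
```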

API consideration

These primitive operations on int often don't behave as users intend because of the fromNumber truncation.

Public APIs should make them safer by providing appropriate bounds-checking and reporting errors with standard types.

Standard error types

type int_error =
  /** If the operand represents IEEE-754 `NaN`. */
  | NaN
  /** If the operation cannot be performed in the bounds of int32. */
  | Overflow_value
  /** If the operation divides a value (the first argument, carried as the payload) by `0`. */
  | Divide_by_zero(int)

int_error can be used as the error payload in a result, and also as an exception payload.

exception ConversionError(int_error)

let fromFloatUnsafe = (x: float) => {
  // NaN never compares equal to anything, including itself, so test it explicitly.
  if Float.isNaN(x) {
    throw ConversionError(NaN)
  } else if x > Primitive_number.max_int32 || x < Primitive_number.min_int32 {
    throw ConversionError(Overflow_value)
  } else {
    Primitive_int32.fromNumber(x)
  }
}

let fromFloat = (x: float) => {
  switch fromFloatUnsafe(x) {
  | exception ConversionError(e) => Error(e)
  | x => Ok(x)
  }
}
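For readers more familiar with the JavaScript side, the same bounds-checked conversion can be sketched in TypeScript with a result value instead of an exception. The `IntError` and `Result` shapes below are hypothetical stand-ins for int_error and result, not an existing API.

```typescript
// Hypothetical TypeScript counterparts of int_error and result.
type IntError =
  | { kind: "NaN" }
  | { kind: "Overflow_value" }
  | { kind: "Divide_by_zero"; dividend: number };

type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

const MIN_VALUE = -2147483648; // -2^31
const MAX_VALUE = 2147483647;  // 2^31 - 1

const fromFloat = (x: number): Result<number, IntError> => {
  // NaN must be tested with isNaN: `x === NaN` is always false.
  if (Number.isNaN(x)) return { ok: false, error: { kind: "NaN" } };
  // Infinity and -Infinity are rejected by the range check as well.
  if (x > MAX_VALUE || x < MIN_VALUE) return { ok: false, error: { kind: "Overflow_value" } };
  return { ok: true, value: x | 0 };
};

console.log(JSON.stringify(fromFloat(1.5)));        // {"ok":true,"value":1}
console.log(JSON.stringify(fromFloat(NaN)));        // {"ok":false,"error":{"kind":"NaN"}}
console.log(JSON.stringify(fromFloat(2147483648))); // {"ok":false,"error":{"kind":"Overflow_value"}}
```

Checking the range before truncating means the caller sees an explicit error rather than a wrapped value.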

Questions

Why do we even use int?

Using int is primarily for backward compatibility — not with OCaml, but with all existing ReScript codebases.

Additionally, using int benefits JavaScript programs since major JavaScript engines treat integers differently.

Depending on the implementation, integer values (especially 32-bit integers) may have a distinct memory representation compared to floating-point numbers. For example, V8 (the JavaScript engine used by Chromium and Node.js) has an internal representation called "Smi" (small integer), which is also reflected in its array element kinds. It stores small signed integers directly in a tagged value, which avoids heap allocation and improves runtime performance. The exact Smi range is 31 or 32 bits depending on the build configuration.

At compile time, the compiler ensures that certain operations are restricted to int values. This increases the likelihood of hitting the optimized execution paths for Smis and reduces the potential for runtime de-optimization caused by element-kind transitions.

Why do we truncate values instead of bounds-checking?

It is also for backward compatibility.

Bounds-checking and failing early may be more useful for a fast feedback loop, but we don't want to break any programs that (accidentally) worked before.

The number | 0 form is the most concise output we can use consistently. Introducing any other runtime code universally would lead to significant code bloat in the output.

Can we somehow make it match JavaScript's number?

Perhaps we could make our number literals match JavaScript's number semantics. We could also rename int to int32 and give it a separate literal form such as 0l, as in OCaml syntax.

However, this will not happen in the near future. It won't occur until we are confident in our migration strategy to avoid breaking existing codebases. If done incorrectly, it could completely break compatibility with existing code or cause significant performance degradation.

Future possibilities

Guaranteeing the use of int32 types may offer additional advantages in the future when targeting WebAssembly or alternative native backends.

Footnotes

  1. https://github.com/rescript-lang/rescript/issues/6038

  2. https://github.com/rescript-lang/rescript/issues/6736

  3. https://github.com/rescript-lang/rescript/issues/6737