Stack types

Overview

TypeType constantType mask constantDescription
(none)DUK_TYPE_NONEDUK_TYPE_MASK_NONEno type (missing value, invalid index, etc)
undefinedDUK_TYPE_UNDEFINEDDUK_TYPE_MASK_UNDEFINEDundefined
nullDUK_TYPE_NULLDUK_TYPE_MASK_NULLnull
booleanDUK_TYPE_BOOLEANDUK_TYPE_MASK_BOOLEANtrue and false
numberDUK_TYPE_NUMBERDUK_TYPE_MASK_NUMBERIEEE double
stringDUK_TYPE_STRINGDUK_TYPE_MASK_STRINGimmutable (plain) string or (plain) Symbol
objectDUK_TYPE_OBJECTDUK_TYPE_MASK_OBJECTobject with properties
bufferDUK_TYPE_BUFFERDUK_TYPE_MASK_BUFFERmutable (plain) byte buffer, fixed/dynamic/external; mimics an Uint8Array
pointerDUK_TYPE_POINTERDUK_TYPE_MASK_POINTERopaque pointer (void *)
lightfuncDUK_TYPE_LIGHTFUNCDUK_TYPE_MASK_LIGHTFUNCplain Duktape/C function pointer (non-object); mimics a Function

Memory allocations

The following stack types involve additional heap allocations:

While strings and buffers are considered a primitive (pass-by-value) types, they are a heap allocated type from a memory allocation viewpoint.

Pointer stability

Heap objects allocated by Duktape have stable pointers: the objects are not relocated in memory while they are reachable from a garbage collection point of view. This is the case for the main heap object, but not necessarily for any additional allocations related to the object, such as dynamic property tables or dynamic buffer data area. A heap object is reachable e.g. when it resides on the value stack of a reachable thread or is reachable through the global object. Once a heap object becomes unreachable any pointers held by user C code referring to the object are unsafe and should no longer be dereferenced.

In practice the only heap allocated data directly referenced by user code are strings, fixed buffers, and dynamic buffers. The data area of strings and fixed buffers is stable; it is safe to keep a C pointer referring to the data even after a Duktape/C function returns as long the string or fixed buffer remains reachable from a garbage collection point of view at all times. Note that this is not the case for Duktape/C value stack arguments, for instance, unless specific arrangements are made.

The data area of a dynamic buffer does not have a stable pointer. The buffer itself has a heap header with a stable address but the current buffer is allocated separately and potentially relocated when the buffer is resized. It is thus unsafe to hold a pointer to a dynamic buffer's data area across a buffer resize, and it's probably best not to hold a pointer after a Duktape/C function returns (as there would be no easy way of being sure that the buffer hadn't been resized).

For external buffers the stability of the data pointer is up to the user code which sets and updates the pointer.

Type masks

Type masks allows calling code to easily check whether a type belongs to a certain type set. For instance, to check that a certain stack value is a number, string, or an object:

if (duk_get_type_mask(ctx, -3) & (DUK_TYPE_MASK_NUMBER |
                                  DUK_TYPE_MASK_STRING |
                                  DUK_TYPE_MASK_OBJECT)) {
    printf("type is number, string, or object\n");
}

There is a specific API call for matching a set of types even more conveniently:

if (duk_check_type_mask(ctx, -3, DUK_TYPE_MASK_NUMBER |
                                 DUK_TYPE_MASK_STRING |
                                 DUK_TYPE_MASK_OBJECT)) {
    printf("type is number, string, or object\n");
}

These are faster and more compact than the alternatives:

// alt 1
if (duk_is_number(ctx, -3) || duk_is_string(ctx, -3) || duk_is_object(ctx, -3)) {
    printf("type is number, string, or object\n");
}

// alt 2
int t = duk_get_type(ctx, -3);
if (t == DUK_TYPE_NUMBER || t == DUK_TYPE_STRING || t == DUK_TYPE_OBJECT) {
    printf("type is number, string, or object\n");
}

None

The none type is not actually a type but is used in the API to indicate that a value does not exist, a stack index is invalid, etc.

Undefined

The undefined type maps to ECMAScript undefined, which is distinguished from a null.

Values read from outside the active value stack range read back as undefined.

Null

The null type maps to ECMAScript null.

Boolean

The boolean type is represented in the C API as an integer: zero for false, and non-zero for true.

Whenever giving boolean values as arguments in API calls, any non-zero value is accepted as a "true" value. Whenever API calls return boolean values, the value 1 is always used for a "true" value. This allows certain C idioms to be used. For instance, a bitmask can be built directly based on API call return values, as follows:

/* this works and generates nice code */
int bitmask = (duk_get_boolean(ctx, -3) << 2) |
              (duk_get_boolean(ctx, -2) << 1) |
              duk_get_boolean(ctx, -1);

/* more verbose variant not relying on "true" being represented by 1 */
int bitmask = ((duk_get_boolean(ctx, -3) ? 1 : 0) << 2) |
              ((duk_get_boolean(ctx, -2) ? 1 : 0) << 1) |
              (duk_get_boolean(ctx, -1) ? 1 : 0);

/* another verbose variant */
int bitmask = (duk_get_boolean(ctx, -3) ? (1 << 2) : 0) |
              (duk_get_boolean(ctx, -2) ? (1 << 1) : 0) |
              (duk_get_boolean(ctx, -1) ? 1 : 0);

Number

The number type is an IEEE double, including +/- Infinity and NaN values. Zero sign is also preserved. An IEEE double represents all integers up to 53 bits accurately.

IEEE double allows NaN values to have additional signaling bits. Because these bits are used by Duktape internal tagged type representation (when using 8-byte packed values), NaN values in the Duktape API are normalized. Concretely, if you push a certain NaN value to the value stack, another (normalized) NaN value may come out. Don't rely on NaNs preserving their exact form.

String

The string stack type is used to represent both plain strings and plain Symbols (introduced in ES2015). A string is an arbitrary byte sequence of a certain length which may contain internal NUL (0x00) values. Strings are always automatically NUL terminated for C coding convenience. The NUL terminator is not counted as part of the string length. For instance, the string "foo" has byte length 3 and is stored in memory as { 'f', 'o', 'o', '\0' }. Because of the guaranteed NUL termination, strings can always be pointed to using a simple const char * as long as internal NULs are not an issue for the application; if they are, the explicit byte length of the string can be queried with the API. Calling code can refer directly to the string data held by Duktape. Such string data pointers are valid (and stable) for as long as a string is reachable in the Duktape heap.

Strings are interned for efficiency: only a single copy of a certain string ever exists at a time. Strings are immutable and must NEVER be changed by calling C code. Doing so will lead to very mysterious issues which are hard to diagnose.

Calling code most often deals with ECMAScript strings, which may contain arbitrary 16-bit codepoints in the range U+0000 to U+FFFF but cannot represent non-BMP codepoints. This is how strings are defined in the ECMAScript standard. In Duktape, ECMAScript strings are encoded with CESU-8 encoding. CESU-8 matches UTF-8 except that it allows codepoints in the surrogate pair range (U+D800 to U+DFFF) to be encoded directly (prohibited in UTF-8). CESU-8, like UTF-8, encodes all 7-bit ASCII characters as-is which is convenient for C code. For example:

Duktape also uses extended strings internally. Codepoints up to U+10FFFF can be represented with UTF-8, and codepoints above that up to full 32 bits can be represented with extended UTF-8. The extended UTF-8 encoding used by Duktape is described in the table below. The leading byte is shown in binary (with "x" marking data bits) while continuation bytes are marked with "C" (indicating the bit sequence 10xxxxxx):

Codepoint rangeBitsByte sequenceNotes
U+0000 to U+007F70xxxxxxx
U+0080 to U+07FF11110xxxxx C
U+0800 to U+FFFF161110xxxx C CU+D800 to U+DFFF allowed (unlike UTF-8)
U+1 0000 to U+1F FFFF2111110xxx C C CAbove U+10FFFF allowed (unlike UTF-8)
U+20 0000 to U+3FF FFFF26111110xx C C C C
U+400 0000 to U+7FFF FFFF311111110x C C C C C
U+8000 0000 to U+F FFFF FFFF3611111110 C C C C C COnly 32 bits used in practice (up to U+FFFF FFFF)

The downside of the encoding for codepoints above U+7FFFFFFF is that the leading byte will be 0xFE which conflicts with Unicode byte order marker encoding. This is not a practical concern in Duktape's internal use.

Finally, invalid extended UTF-8 byte sequences are used for special purposes such as representing Symbol values. Invalid extended UTF-8/CESU-8 byte sequences never conflict with standard ECMAScript strings (which are CESU-8) and will remain cleanly separated within object property tables. For more information see Symbols and symbols.rst.

Strings with invalid extended UTF-8 sequences can be pushed on the value stack from C code and also passed to ECMAScript functions, with two caveats:

Object

The object type includes ECMAScript objects and arrays, functions, threads (coroutines), and buffer objects. In other words, anything with properties is an object. Properties are key-value pairs with a string key and an arbitrary value (including undefined).

Objects may participate in garbage collection finalization.

Buffer

The plain buffer type is a raw buffer for user data. It's more memory efficient than standard buffer object types like Uint8Array or Node.js Buffer. There are three plain buffer sub-types:

Buffer sub-type Data pointer Resizable Memory managed by Description
Fixed Stable, non-NULL No Duktape Buffer size is fixed at creation, memory is managed automatically by Duktape. Fixed buffers have an unchanging (stable) non-NULL data pointer.
Dynamic Unstable, may be NULL Yes Duktape Buffer size can be changed after creation, memory is managed automatically by Duktape. Requires two memory allocations internally to allow resizing. Data pointer may change if buffer is resized. Zero-size buffer may have a NULL data pointer.
External Unstable, may be NULL Yes Duktape and user code Buffer data is externally allocated. Duktape allocates and manages a heap allocated buffer header structure, while the data area pointer and length are configured by user code explicitly. External buffers are useful to allow a buffer to point to data structures outside Duktape heap, e.g. a frame buffer allocated by a graphics library. Zero-size buffer may have a NULL data pointer.

Unlike strings, buffer data areas are not automatically NUL-terminated and calling code must never access bytes beyond the currently allocated buffer size. The data pointer of a zero-size dynamic or external buffer may be NULL; fixed buffers always have a non-NULL data pointer.

Fixed and dynamic buffers are automatically garbage collected. This also means that C code must not hold onto a buffer data pointer unless the buffer is reachable to Duktape, e.g. resides in an active value stack. The data area of an external buffer is not automatically garbage collected so user code is responsible for managing its life cycle. User code must also make sure that no external buffer value is accessed when the external buffer is no longer (or not yet) valid.

Plain buffers mimic Uint8Array objects to a large extent, so they are a non-standard, memory efficient alternative to working with e.g. ArrayBuffer and typed arrays. The standard buffer objects are implemented on top of plain buffers, so that e.g. an ArrayBuffer will be backed by a plain buffer. See Buffer objects for more discussion.

A few notes:

Pointer

The pointer type is a raw, uninterpreted C pointer, essentially a void *. Pointers can be used to point to native objects (memory allocations, handles, etc), but because Duktape doesn't know their use, they are not automatically garbage collected. You can, however, put one or more pointers inside an object and use the object finalizer to free the native resources related to the pointer(s).

Lightfunc

The lightfunc type is a plain Duktape/C function pointer and a small set of control flags packed into a single tagged value which requires no separate heap allocations. The control flags (16 bits currently) encode: (1) number of stack arguments expected by the Duktape/C function (0 to 14 or varargs), (2) virtual length property value (0 to 15), and (3) a magic value (-128 to 127).

Lightfuncs are a separate tagged type in the Duktape C API, but behave mostly like Function objects for ECMAScript code. They have significant limitations compared to ordinary Function objects, the most important being:

Lightfuncs are useful for very low memory environments where the memory impact of ordinary Function objects matters. See Function objects for more discussion.