mymap={"a"=>1,"b"=>2} or mylist=[3,4,5]), autodoc (e.g.
Javadoc), and so on. Yet, while memory management is indeed Miller’s
trickiest aspect, its garbage-collection needs are well-delineated and so the
absence of GC is no great loss. Miller’s performance relies on
the principles of touching each byte as few times as possible, and
copying bytes only when necessary. This results in a baton-passing,
free-on-last-use memory-management pattern which works well enough. (See also
https://github.com/johnkerl/miller/blob/master/c/README.md.)
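As a rough illustration of the baton-passing, free-on-last-use pattern described above, here is a minimal sketch. The names record_t, record_from_string, record_upcase, and record_write_and_free are hypothetical and are not taken from Miller's source; the point is only that each stage hands the same heap-allocated record to the next without copying, and the last stage to use it frees it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical record type for illustration; this is not Miller's lrec_t.
typedef struct record {
    char* line; // heap-allocated; owned by whoever currently holds the record
} record_t;

// Reader stage: allocates the record and passes ownership to the caller.
record_t* record_from_string(const char* s) {
    assert(s != NULL);
    record_t* prec = malloc(sizeof(record_t));
    assert(prec != NULL); // sketch only; real code would handle failure
    size_t n = strlen(s) + 1;
    prec->line = malloc(n);
    assert(prec->line != NULL);
    memcpy(prec->line, s, n); // the one and only copy of the bytes
    return prec;
}

// Transformer stage: modifies the bytes in place and passes the same
// record along, so nothing is copied and nothing is freed here.
record_t* record_upcase(record_t* prec) {
    for (char* p = prec->line; *p != '\0'; p++) {
        if (*p >= 'a' && *p <= 'z')
            *p = (char)(*p - 'a' + 'A');
    }
    return prec;
}

// Writer stage: the last user of the record, so it frees it.
// Returns the number of bytes written (as reported by fprintf).
int record_write_and_free(record_t* prec, FILE* out) {
    int nwritten = fprintf(out, "%s\n", prec->line);
    free(prec->line);
    free(prec);
    return nwritten;
}
```

A caller would chain the stages, e.g. record_write_and_free(record_upcase(record_from_string("x=1")), stdout), and must not touch the record after the final call.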
Miller doesn’t require a complex collections library: it needs mostly simple
hash maps, hash sets, and linked lists, which aren’t difficult to code.
Moreover, Miller’s primary data structure, the lrec_t, is hand-tuned to
Miller’s use case and would have required hand-coding in any case.
Specifically, I did simple experiments in several languages: Ruby,
Python, Lua, Rust, Go, D. In one I just read lines and printed them back out
(a line-oriented cat). In another I consumed input lines like x=1,y=2,z=3
one at a time, split them on commas and equals signs to populate hash maps,
transformed them (e.g. remove the y field), and emitted them. Basically
mlr cut -x -f y with DKVP format. I didn’t do anything fancy, just using each
language’s getline, string-split, hashmap-put, etc. And nothing was as fast as
C, so I used C. Here are the experiments I kept (I failed to keep the
Lua code, for example):
C cat,
another C cat,
D cat,
Go cat,
another Go cat,
Rust cat,
Nim cat,
D cut,
Go cut,
Nim cut.
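For readers who want a feel for the cut experiment described above, here is a minimal sketch in C. It is not one of the kept experiments: the function names dkvp_cut_exclude and dkvp_cut_stream are hypothetical, and as a simplification it scans each line directly rather than populating a hash map as the original experiments did.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Copies a DKVP line such as "x=1,y=2,z=3" into out, omitting every field
// whose key equals exclude_key. Returns the length of the output string.
// Assumes out has room for at least strlen(line) + 1 bytes.
size_t dkvp_cut_exclude(const char* line, const char* exclude_key, char* out) {
    assert(line != NULL && exclude_key != NULL && out != NULL);
    size_t exclude_len = strlen(exclude_key);
    size_t out_len = 0;
    const char* p = line;
    while (*p != '\0') {
        size_t field_len = strcspn(p, ",");  // bytes up to next comma or end
        size_t key_len = strcspn(p, "=,");   // bytes up to '=' (or field end)
        int excluded = key_len == exclude_len
            && memcmp(p, exclude_key, key_len) == 0;
        if (!excluded) {
            if (out_len > 0)
                out[out_len++] = ',';        // separator between kept fields
            memcpy(out + out_len, p, field_len);
            out_len += field_len;
        }
        p += field_len;
        if (*p == ',')
            p++;                             // step over the field separator
    }
    out[out_len] = '\0';
    return out_len;
}

// Line-oriented driver: reads lines from in and writes the cut lines to out.
// Assumes lines fit in the fixed buffer; a real tool would grow it as needed.
void dkvp_cut_stream(FILE* in, FILE* out, const char* exclude_key) {
    char line[4096];
    char cut[4096];
    while (fgets(line, sizeof line, in) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';  // strip the line ending
        dkvp_cut_exclude(line, exclude_key, cut);
        fprintf(out, "%s\n", cut);
    }
}
```

Calling dkvp_cut_stream(stdin, stdout, "y") gives roughly the behavior of mlr cut -x -f y on DKVP input, for well-formed lines shorter than the buffer.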
apt-get or yum install away from it). This, I hope, bodes well for uptake
of Miller.
// comments; enough said.)
this pointers and attributes
this pointers into method calls: for example

    class MyClass {
    private:
        char* a;
    public:
        MyClass(char* a) {
            this->a = strdup(a);
        }
        ~MyClass() {
            free(a);
        }
        int myMethod(char* b) {
            return strlen(a) + strlen(b);
        }
    };
    ...
    MyClass* myObj = new MyClass("hello");
    int x = myObj->myMethod("world");

results in something like

    void MyClass$constructorcharptr(MyClass* this, char* a) {
        this->a = strdup(a);
    }
    void MyClass$destructor(MyClass* this) {
        free(this->a);
    }
    int MyClass$myMethod(MyClass* this, char* b) {
        return strlen(this->a) + strlen(b);
    }
    // Allocate the object, then pass it to the constructor as its this-pointer
    MyClass* myObj = malloc(sizeof(MyClass));
    MyClass$constructorcharptr(myObj, "hello");
    int x = MyClass$myMethod(myObj, "world");

It’s easy enough to imitate this: simply use the coding convention of prepending the class name to all methods, and placing this-pointers as the first arguments to methods. Miller uses precisely this approach. For example:

    typedef struct _lrec_t {
        ...
    } lrec_t;

    // Constructors
    lrec_t* lrec_csv_alloc(...) {
        lrec_t* prec = malloc(sizeof(lrec_t));
        ...
        prec->attribute = ...;
        return prec;
    }
    lrec_t* lrec_dkvp_alloc(...) {
        ...
    }

    // Destructor
    void lrec_free(lrec_t* prec) {
        ...
        free(prec->attribute);
        ...
        free(prec);
    }

    // Methods
    int lrec_foo(lrec_t* prec, ...) {
        return prec->...;
    }
    void lrec_bar(lrec_t* prec, ...) {
        prec->...;
    }

This implements the object-oriented principle of encapsulation.
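Since the listing above elides its details, here is a small complete sketch of the same convention, mirroring the MyClass example. The type myclass_t and the myclass_* functions are hypothetical names chosen for illustration and do not appear in Miller; allocation failures are handled with assert only to keep the sketch short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct myclass {
    char* a; // attribute; by convention touched only by myclass_* functions
} myclass_t;

// Constructor: the class name is prepended to the function name.
myclass_t* myclass_alloc(const char* a) {
    assert(a != NULL);
    myclass_t* pobj = malloc(sizeof(myclass_t));
    assert(pobj != NULL);
    size_t n = strlen(a) + 1;
    pobj->a = malloc(n);
    assert(pobj->a != NULL);
    memcpy(pobj->a, a, n); // private copy, like strdup in the C++ version
    return pobj;
}

// Destructor: frees the attributes first, then the object itself.
void myclass_free(myclass_t* pobj) {
    if (pobj == NULL)
        return;
    free(pobj->a);
    free(pobj);
}

// Method: the object pointer plays the role of this, as the first argument.
size_t myclass_my_method(const myclass_t* pobj, const char* b) {
    return strlen(pobj->a) + strlen(b);
}
```

Usage follows the C++ shape: allocate with myclass_alloc("hello"), call myclass_my_method(pobj, "world"), and release with myclass_free(pobj).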

    #include <stdio.h>
    #include <containers/lrec.h>

    typedef lrec_t* reader_func_t(FILE* fp, void* pvstate, context_t* pctx);
    typedef void    reset_func_t(void* pvstate);
    typedef void    reader_free_func_t(void* pvstate);

    typedef struct _reader_t {
        void*               pvstate;
        reader_func_t*      preader_func; // Interface method
        reset_func_t*       preset_func;  // Interface method
        reader_free_func_t* pfree_func;   // Interface method
    } reader_t;

A class implementing this interface might look like

    // Attributes are private to this file
    typedef struct _reader_csv_state_t {
        ...
    } reader_csv_state_t;

    // Implementation of interface methods. Marked static (file-scope) to not
    // pollute the global namespace; exposed only via function pointers.
    static lrec_t* reader_csv_func(FILE* input_stream, void* pvstate, context_t* pctx) {
        reader_csv_state_t* pstate = pvstate;
        ... use various pstate->attributes ...
    }
    static void reset_csv_func(void* pvstate) {
        reader_csv_state_t* pstate = pvstate;
        ... use various pstate->attributes ...
    }
    static void reader_csv_free(void* pvstate) {
        reader_csv_state_t* pstate = pvstate;
        ... use various pstate->attributes ...
    }

    // Constructor
    reader_t* reader_csv_alloc(...) {
        reader_t* preader = mlr_malloc_or_die(sizeof(reader_t));
        reader_csv_state_t* pstate = mlr_malloc_or_die(sizeof(reader_csv_state_t));
        ... set various pstate->attributes ...
        preader->pvstate      = (void*)pstate;
        preader->preader_func = &reader_csv_func;
        preader->preset_func  = &reset_csv_func;
        preader->pfree_func   = &reader_csv_free;
        return preader;
    }

    // Factory method
    ...
    reader_t* preader = reader_csv_alloc(...);
    ...

    // Method call
    ...
    lrec_t* pinrec = preader->preader_func(input_stream, preader->pvstate, pctx);
    ...

This implements the object-oriented principles of polymorphism and runtime binding. More details are at https://github.com/johnkerl/miller/tree/master/c/containers.
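To make the function-pointer interface pattern concrete without the elided parts, here is a small complete sketch with two implementations behind one interface. Everything in it (acc_t, acc_sum_alloc, acc_count_alloc, acc_run, acc_free) is a hypothetical example and not part of Miller; it assumes each implementation's state is a single malloc'd block so that one shared destructor suffices.

```c
#include <assert.h>
#include <stdlib.h>

// Interface: an opaque state pointer plus function pointers.
typedef void   acc_ingest_func_t(void* pvstate, double value);
typedef double acc_emit_func_t(void* pvstate);

typedef struct acc {
    void*              pvstate;
    acc_ingest_func_t* pingest_func; // Interface method
    acc_emit_func_t*   pemit_func;   // Interface method
} acc_t;

// Shared allocation helper; static so it stays private to this file.
static acc_t* acc_alloc(void* pvstate, acc_ingest_func_t* pingest_func,
                        acc_emit_func_t* pemit_func) {
    acc_t* pacc = malloc(sizeof(acc_t));
    assert(pacc != NULL && pvstate != NULL); // sketch-level error handling
    pacc->pvstate = pvstate;
    pacc->pingest_func = pingest_func;
    pacc->pemit_func = pemit_func;
    return pacc;
}

// Implementation 1: running sum.
typedef struct acc_sum_state { double sum; } acc_sum_state_t;
static void acc_sum_ingest(void* pvstate, double value) {
    acc_sum_state_t* pstate = pvstate;
    pstate->sum += value;
}
static double acc_sum_emit(void* pvstate) {
    acc_sum_state_t* pstate = pvstate;
    return pstate->sum;
}
acc_t* acc_sum_alloc(void) {
    acc_sum_state_t* pstate = calloc(1, sizeof(acc_sum_state_t));
    return acc_alloc(pstate, acc_sum_ingest, acc_sum_emit);
}

// Implementation 2: count of values seen.
typedef struct acc_count_state { size_t count; } acc_count_state_t;
static void acc_count_ingest(void* pvstate, double value) {
    acc_count_state_t* pstate = pvstate;
    (void)value; // the value itself is irrelevant to a count
    pstate->count++;
}
static double acc_count_emit(void* pvstate) {
    acc_count_state_t* pstate = pvstate;
    return (double)pstate->count;
}
acc_t* acc_count_alloc(void) {
    acc_count_state_t* pstate = calloc(1, sizeof(acc_count_state_t));
    return acc_alloc(pstate, acc_count_ingest, acc_count_emit);
}

// Destructor shared by both implementations.
void acc_free(acc_t* pacc) {
    if (pacc == NULL)
        return;
    free(pacc->pvstate);
    free(pacc);
}

// Caller code: binds to the implementation at run time via the pointers.
double acc_run(acc_t* pacc, const double* values, size_t n) {
    for (size_t i = 0; i < n; i++)
        pacc->pingest_func(pacc->pvstate, values[i]);
    return pacc->pemit_func(pacc->pvstate);
}
```

The caller acc_run never names a concrete implementation; which ingest and emit functions run is decided by whichever constructor built the acc_t.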