Are contextual’s arms zero- or one-based?

In contextual, arms are one-based. In other words, k arms are represented by the indices 1, 2, ..., k, for example c(1, 2, 3, 4) when k is 4.
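As a small illustration in base R (the weights vector and variable names are made up for this sketch and are not part of contextual), a one-based arm index can be used directly to index an R vector:

```r
k <- 4
arms <- seq_len(k)                   # one-based arm indices: 1, 2, 3, 4

# Hypothetical per-arm reward probabilities, for illustration only.
weights <- c(0.1, 0.5, 0.2, 0.9)

set.seed(42)
choice <- sample(arms, 1)            # pick one arm at random
chosen_weight <- weights[choice]     # no offset needed, R vectors are one-based too
```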

What is this “self$theta_to_arms” list I see in some Policies?

In the set_parameters() initialisation function of many of contextual’s Policies, a list of parameters is assigned to “self$theta_to_arms”. The parameters themselves, however, are later only available through self$theta.

The comments in the following source code explain how this little helper variable is used:
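The idea can be sketched in plain R. This is not the package source: expand_to_arms() is a hypothetical helper, and the sketch assumes that each default in theta_to_arms is replicated once per arm to build theta.

```r
k <- 3

# Per-arm defaults, as a Policy's set_parameters() might declare them.
theta_to_arms <- list(n = 0, mean = 0)

# Hypothetical helper: replicate every default once per arm.
expand_to_arms <- function(theta_to_arms, k) {
  lapply(theta_to_arms, function(default) rep(list(default), k))
}

theta <- expand_to_arms(theta_to_arms, k)

# Update the running mean of arm 2 after observing a reward of 1.
arm <- 2
reward <- 1
theta$n[[arm]] <- theta$n[[arm]] + 1
theta$mean[[arm]] <- theta$mean[[arm]] +
  (reward - theta$mean[[arm]]) / theta$n[[arm]]
```

Under this assumption, theta holds one entry per arm for every parameter named in theta_to_arms, which is why a Policy reads and writes self$theta rather than self$theta_to_arms.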

“Cannot add bindings to a locked environment”?

Contextual is built using the R6 class system. In R6, you cannot add new members to an object just by assigning to $ indexed values, as you can in S3. Every member has to be declared in the class definition first. More information on R6 in general, and on public and private members in particular, can be found in the Introduction to R6 classes vignette.
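A minimal sketch of the behaviour, using only the R6 package (the Counter class is invented for this example):

```r
library(R6)

Counter <- R6Class("Counter",
  public = list(
    count = 0,                       # declared up front, so it can be assigned later
    add = function() {
      self$count <- self$count + 1
      invisible(self)
    }
  )
)

counter <- Counter$new()
counter$add()                        # fine: count is a declared member

# Assigning to an undeclared member fails, because the object is locked.
msg <- tryCatch({
  counter$label <- "a"
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)
```

The fix is to declare the member in the public (or private) list of the class, as done for count here.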

Messages from Bandits and Policies do not show in the console?

By default, contextual’s agents run in separate parallel worker processes. As these worker processes have no direct access to the R console, their messages cannot be shown there. There are two ways to keep track of Bandit and Policy progress and messages. The first is to set Simulator’s do_parallel argument to FALSE. The second is to set Simulator’s progress_file argument to TRUE. Simulator then writes workers_progress.log, agents_progress.log and parallel.log files to the current working directory, allowing you to keep track of workers, agents, and potential errors, respectively, when running a Simulator in parallel mode.
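Both options might be set as in the following sketch. It assumes the contextual package is installed and uses EpsilonGreedyPolicy and BasicBernoulliBandit purely as stand-ins; the horizon and simulation counts are arbitrary small values.

```r
library(contextual)

policy <- EpsilonGreedyPolicy$new(epsilon = 0.1)
bandit <- BasicBernoulliBandit$new(weights = c(0.6, 0.1, 0.1))
agent  <- Agent$new(policy, bandit)

# Option 1: run sequentially in the current R session,
# so messages from the Bandit and Policy reach the console.
simulator <- Simulator$new(agents = agent, horizon = 50, simulations = 5,
                           do_parallel = FALSE)
history <- simulator$run()

# Option 2: keep the default parallel mode and track progress through
# the log files in getwd(). Only constructed here, not run.
logged_simulator <- Simulator$new(agents = agent, horizon = 50, simulations = 5,
                                  progress_file = TRUE)
```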

How do I use R packages within custom Bandits and Policies?

If one of your custom Bandits or Policies requires a specific R package, use Simulator’s include_packages option to distribute the package(s) to each of a Simulator instance’s workers:
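A minimal sketch, assuming the contextual package is installed. The choice of MASS as the required package, and of the policy and bandit classes, is only for illustration:

```r
library(contextual)

policy <- EpsilonGreedyPolicy$new(epsilon = 0.1)
bandit <- BasicBernoulliBandit$new(weights = c(0.6, 0.1, 0.1))
agent  <- Agent$new(policy, bandit)

# Make the MASS package available on every parallel worker.
simulator <- Simulator$new(agents = agent, horizon = 50, simulations = 5,
                           include_packages = c("MASS"))
```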

It is then good practice to use the double colon operator to call functions from the included packages:
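For instance, a helper inside a custom Bandit could look like this sketch (draw_context() is a hypothetical function, and MASS is again just an example package):

```r
# Draw one d-dimensional context vector from a standard multivariate normal.
# MASS::mvrnorm is called with an explicit namespace, so the helper works
# on a worker regardless of whether MASS has been attached there.
draw_context <- function(d) {
  MASS::mvrnorm(n = 1, mu = rep(0, d), Sigma = diag(d))
}

set.seed(1)
x <- draw_context(3)
```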

Why are my custom classes slower than contextual’s?

Contextual makes ample use of the data.table package, which, if used the “data.table way”, can speed up in-memory data-related operations substantially. For more information on data.table, see the Intro to data.table vignette, and, for instance, this data.table cheat sheet.

That does not imply that this is the only way to go. Contextual’s Bandits and Policies make use of basic (fast) matrix-based operations as well, and its Simulation and offline Bandits have been extended to, among other things, read from and write to the very fast open-source column-oriented Monet database management system.
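As one example of the “data.table way”, a custom class that logs one row per time step can preallocate its table and fill it by reference, rather than growing it with rbind() inside a loop. The sketch below uses only data.table; the column names and the simulated rewards are invented for illustration.

```r
library(data.table)

n <- 1000L

# Preallocate all rows once.
log_dt <- data.table(t = integer(n), arm = integer(n), reward = numeric(n))

set.seed(1)
for (i in seq_len(n)) {
  arm    <- sample.int(3L, 1L)
  reward <- as.numeric(stats::rbinom(1L, 1L, 0.5))
  # set() assigns by reference, so log_dt is not copied on each iteration.
  set(log_dt, i = i, j = c("t", "arm", "reward"),
      value = list(i, arm, reward))
}

# Grouped aggregation in a single data.table expression.
mean_by_arm <- log_dt[, .(mean_reward = mean(reward)), by = arm]
```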

Where do I instantiate a Bandit’s simulation-level randomization?

A Bandit’s initialize() function is called before Bandit and Policy clones are distributed to the individual simulations. So generally, it is advisable to randomize within a Bandit’s post_initialization() function, which is called right before the start of every simulation.

Here, contextual’s Simulator class ensures that every Policy and Bandit pair (bound by their shared Agent) will run its simulations on a deterministic set of seeds, ensuring both replicability and fair comparisons between agents.
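A sketch of a custom Bandit that follows this advice is given below. It assumes the contextual package is installed. RandomWeightsBandit is a made-up class, and the structure of the get_context() and get_reward() return values follows the pattern of the package’s basic Bernoulli bandit as far as I recall it, so check it against the Bandit documentation before relying on it.

```r
library(contextual)

RandomWeightsBandit <- R6::R6Class(
  inherit = Bandit,
  class = FALSE,
  public = list(
    weights = NULL,
    class_name = "RandomWeightsBandit",   # used internally by contextual
    initialize = function(k) {
      # Called once, before clones are distributed: keep this deterministic.
      self$k <- k
    },
    post_initialization = function() {
      # Called right before every simulation: randomize here.
      self$weights <- runif(self$k)
      invisible(self)
    },
    get_context = function(t) {
      list(k = self$k)
    },
    get_reward = function(t, context, action) {
      rewards <- as.double(runif(self$k) < self$weights)
      list(
        reward         = rewards[action$choice],
        optimal_reward = rewards[which.max(self$weights)]
      )
    }
  )
)

bandit <- RandomWeightsBandit$new(k = 3)
bandit$post_initialization()
```

With this layout, each simulation draws its own arm weights, while the number of arms is fixed once at construction.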

What is this public “class_name” member in every Bandit and Policy?

Contextual uses the public “class_name” member internally to keep track of Policies and Bandits, and to, among other things, generate names for Agents that have not been named explicitly. So don’t forget to include it in your own custom Bandits and Policies!