Welcome to Big Data Processor’s documentation!

Towards the next generation of the web workbench for data scientists

Big Data Processor (BDP) is a generic, light-weight, yet complete and scalable workflow platform. This web workbench mainly focuses on the workflow portability, reproducibility and scalability. We try to break the borderlines of workflow management systems and traditional web service applications. Each analysis package of BDP is portable and can be easily installed on every BDP host. Also, a package carries its own developer-customized web pages that allow users to set parameters, execute workflows, and visualize workflow results interactively. The user experience of each BDP package can be just like a traditional web service app, so that developers no longer need to create one web service for each workflow.

Portable and customizable web interface for workflows

Most workflow management systems only help us to execute and monitor workflows. After that, users offten have to find their way to explore results. Here, with BDP, workflow builders are allowed to freely develop web pages for their workflows both to provide guidances for workflow executions and to provide interactive data visualizations. With these customized web pages, users can therefore visualize results interactively right after workflow executions.

In addition, a workflow might not be fully automated. Some steps may require additional checks to decide parameters for the following analysis. For such semi-automatic workflows, a traditional solution may be to construct a web services for each workflow. Now, developers can have another choise. With BDP, developers need NOT build another web application. Instead, developers can freely develop web pages to communicate to BDP server without worrying about the backend server scripts or task scheduling. We provide a javascript library so that developers do not have to handle HTTP request and response events by themselves.

Our goal is to make development and execution of data analysis workflow all on web pages!

This web workbench provides built-in web interfaces for users to manage projects, execute workflows, monitoring tasks, and install packages. As for developers, BDP also offers built-in web pages to write scripts, define tasks, piping tasks into workflows, design web interfaces, and export these componets into portable packages. Packages of BDP are in a portable format that can be installed on every BDP hosts by just mouse clicks. Then, these workflows in the packages are ready to run with near-zero configurations!

The coolest part is that our Page system also provides the web proxy functionality. Our Page system can not only serve the above mentioned customized web pages, but can also serve containerized web services or even desktop applications, such as R/Shiny apps, RStudio, Jupyter Notebook or Matlab IDE and many more!