Qubole Auto-Scaling and Cluster Lifecycle Management

Through this Quickstart you have imported structured data from MySQL into Qubole and analyzed it together with unstructured data stored in S3, all without having to manage the clusters needed to run the analysis. Qubole automatically provisions clusters as soon as users submit workloads and scales them up and down to satisfy workload demands. This Quickstart configured a hadoop2 cluster with a minimum of 1 node and a maximum of {{ config['hadoop_max_nodes_count'] }} nodes. This section shows auto-scaling in action.
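To make the role of the minimum and maximum node counts concrete, here is a minimal Python sketch of how a demanded node count can be bounded by a configured range. It is an illustration only and not Qubole's actual scaling algorithm; the function name `target_cluster_size` and the example maximum of 4 are hypothetical.

```python
def target_cluster_size(demanded_nodes, min_nodes, max_nodes):
    """Bound a demanded node count by the configured minimum and maximum.

    Illustrative only: the real autoscaler weighs more than this one number.
    """
    if min_nodes > max_nodes:
        raise ValueError("min_nodes must not exceed max_nodes")
    # Never drop below the minimum, never grow past the maximum.
    return max(min_nodes, min(demanded_nodes, max_nodes))


if __name__ == "__main__":
    # Hypothetical configuration: minimum 1 node, maximum 4 nodes.
    for demand in (0, 3, 10):
        print(demand, "->", target_cluster_size(demand, 1, 4))
```

Under these assumptions, an idle cluster stays at the minimum size, and a burst of simultaneous queries can grow it only up to the configured maximum.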

Qubole auto-scaling

  1. Click to submit simultaneous queries to the {{ config['hadoop_cluster_name'] }} cluster.
  2. Switch to Clusters, locate the {{ config['hadoop_cluster_name'] }} cluster, click resources on the right side, and select AutoScaling logs.
  3. This log shows how Qubole automatically started and scaled the {{ config['hadoop_cluster_name'] }} cluster. Look for entries similar to these:
    Current cluster size (excluding master node) is: 2
    INFO - UPSCALE: Adding to provision new nodes: 1
    INFO - UPSCALE: Required nodes from autoscaling: 1 nodes
  4. These entries indicate that Qubole automatically scaled up the cluster in response to the demand from the queries you submitted in this section.
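If you copy the log text out of the AutoScaling logs view, a short script can summarize the scale-up activity. The Python sketch below assumes the log lines look exactly like the sample entries above; the real log may contain other line formats, and the function name `summarize_autoscaling` is hypothetical.

```python
import re

# Patterns modeled on the sample log entries shown above (an assumption
# about the log format, not a documented specification).
SIZE_RE = re.compile(r"Current cluster size \(excluding master node\) is: (\d+)")
UPSCALE_RE = re.compile(r"UPSCALE: Adding to provision new nodes: (\d+)")


def summarize_autoscaling(log_text):
    """Collect reported cluster sizes and total nodes added by upscale events."""
    observed_sizes = []
    nodes_added = 0
    for line in log_text.splitlines():
        size_match = SIZE_RE.search(line)
        if size_match:
            observed_sizes.append(int(size_match.group(1)))
            continue
        upscale_match = UPSCALE_RE.search(line)
        if upscale_match:
            nodes_added += int(upscale_match.group(1))
    return {"observed_sizes": observed_sizes, "nodes_added": nodes_added}


if __name__ == "__main__":
    sample = """\
Current cluster size (excluding master node) is: 2
INFO - UPSCALE: Adding to provision new nodes: 1
INFO - UPSCALE: Required nodes from autoscaling: 1 nodes
"""
    print(summarize_autoscaling(sample))
```

The "Required nodes from autoscaling" line is deliberately ignored so that each scale-up is counted once, from the "Adding to provision new nodes" entry.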