Query Structured Data

In this section, we will:

Top 10 most popular product categories

  1. Run query by clicking this button:
    USE {{ config['qubole_database_name'] }};
    
    SELECT c.category_name AS Category, COUNT(order_item_quantity) AS QuantityOrdered
    FROM order_items oi
    INNER JOIN products p ON oi.order_item_product_id = p.product_id
    INNER JOIN categories c ON c.category_id = p.product_category_id
    GROUP BY c.category_name
    ORDER BY QuantityOrdered DESC
    LIMIT 10;
    
  2. Switch to Analyze History tab and click on Results tab at the bottom to see query results

Top 10 revenue generating products

  1. Run query by clicking this button:
    USE {{ config['qubole_database_name'] }};
    
    SELECT p.product_name AS Product, r.revenue AS TotalRevenue
    FROM products p
    INNER JOIN (
      SELECT oi.order_item_product_id, CAST(round(SUM(oi.order_item_subtotal)) AS Int) AS revenue
      FROM order_items oi INNER JOIN orders o
      ON oi.order_item_order_id = o.order_id
      WHERE o.order_status <> 'CANCELED'
      AND o.order_status <> 'SUSPECTED_FRAUD'
      GROUP BY order_item_product_id
    ) r
    ON p.product_id = r.order_item_product_id
    ORDER BY TotalRevenue DESC
    LIMIT 10;
    
  2. Switch to Analyze History tab and click on Results tab at the bottom to see query results

What you’ve learned

How to create and query structured data using Hive and use familiar interfaces and tools such as SQL within the QDS environment.

You can save or re-run any query by switching to Analyze History tab, clicking on a query, and using buttons on the right: to save click on Save, to re-run query click on Run