Pentaho Data Integration

Welcome
Meet the Family
Credits
Many Reasons to Get Enterprise Edition

Get the Most From Pentaho

Pentaho Community Projects and Capabilities

Open source at its core

There's No Better Time

Want to take your implementation to the next level? Try our comprehensive data integration, visualization,
and analysis tools for yourself, and create customizable reports and interactive dashboards.

Get Pentaho Enterprise Edition

Getting Started

New user? Want to know how to get started? Check out our documentation. These documents are also available in the Pentaho Data Integration installation directory, under docs/.

Documentation

Both the Pentaho Help and the Kettle Wiki are extensive repositories of information. To learn more, spend some time browsing their content.

Samples

PDI is bundled with numerous samples. To view them, click File > Import from an XML file in the menu bar and navigate to the samples under your PDI installation directory.
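
As a rough sketch, a short Python script can list the bundled sample files before you import one. It relies on Kettle saving transformations as .ktr files and jobs as .kjb files. The default folder name "samples" and the helper name list_samples are assumptions made for illustration, so adjust the path to match your installation.

```python
from pathlib import Path
import sys

# Kettle saves transformations as .ktr files and jobs as .kjb files (both XML).
SAMPLE_SUFFIXES = {".ktr": "transformation", ".kjb": "job"}


def list_samples(samples_dir):
    """Return sorted (kind, relative path) pairs for sample files under samples_dir.

    list_samples is a hypothetical helper, not part of PDI.
    """
    root = Path(samples_dir)
    found = []
    for path in root.rglob("*"):
        kind = SAMPLE_SUFFIXES.get(path.suffix.lower())
        if kind and path.is_file():
            found.append((kind, path.relative_to(root).as_posix()))
    return sorted(found)


if __name__ == "__main__":
    # Assumed default: a "samples" folder in the current directory.
    target = sys.argv[1] if len(sys.argv) > 1 else "samples"
    for kind, rel in list_samples(target):
        print(f"{kind:15} {rel}")
```

Pass the samples folder as the first argument, then open any listed file through File > Import from an XML file.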

Forums

The Pentaho forums are an excellent resource not only to get help from experts but also to pass on your expertise to less experienced users.

IRC

Internet Relay Chat, aka the Facebook of the ’90s. Good old IRC is, surprisingly, not dead. On the contrary, since generic “chit-chat” moved elsewhere, it has become one of the best technical resources for real-time collaborative communication.

Use your favorite client to meet us on server irc.freenode.net, channel ##pentaho or use the webchat client.

Mailing List

This is the core developers’ classic communication method. To engage in low-level coding and architecture decisions, this is the way to go.

Jira

Think you found a bug? Report it! We might not know about it and it’s the simplest way to help with the project.

Get the Source

Want to build Kettle from source? Fork it from GitHub at git@github.com:pentaho/pentaho-kettle.git. Bring the bug fixes on.

Continuous Integration

Curious about what the next version will look like? Want to see if a bug is fixed? Download and test the most up-to-date version from our CI builds.

Build Your Own Plugin

Can’t find a particular step? Do you want to extend PDI? If you have some Java knowledge, go ahead and build it! PDI is a pluggable platform, and it’s simple to add new steps. We’ll help you with the process; then you can submit your plugin back to the community.


Embed PDI

To embed the PDI engine into your own Java applications, check out our PDI SDK.

Blogs

Follow some of our most active bloggers.

Books

Take a look at our literature on Kettle.

Contact Us

Want to get in touch with comments or suggestions? We would love to hear from you.

BI Platform

The BI Platform, also referred to as the Pentaho Business Analytics Platform, serves as the connection point for all other projects. The platform enables the delivery of a unified, end-to-end solution, from data integration to visualization and consumption of data. The Pentaho Business Analytics Platform runs in the Tomcat Java Application Server and can be embedded into other Java Application Servers.

Kettle Project

Pentaho Data Integration, codenamed Kettle, consists of a core data integration (ETL) engine and GUI applications that let users define data integration jobs and transformations and deliver outstanding performance. It is the premier data integration tool for ETL, running jobs, and complex logic orchestration in the Business Analytics Platform.

Mondrian Project

Pentaho Analysis Services, codenamed Mondrian, is an open source OLAP (online analytical processing) server written in Java. Mondrian supports the MDX (multidimensional expressions) query language, as well as the XML for Analysis (XMLA) and olap4j interface specifications.

Mondrian is the centerpiece of multiple projects around the world.

Pentaho Reporting

Pentaho Report Designer is a visual, banded report writer for creating printable, pixel-perfect reports. The Report Designer queries and uses data from multiple sources and outputs to several formats. It includes a core reporting engine capable of generating reports based on an XML definition file.

Weka Project

Pentaho Data Mining uses the Waikato Environment for Knowledge Analysis (Weka) to mine data, identify patterns, and predict future outcomes. Weka consists of machine learning algorithms for a broad set of data mining tasks, including data processing, regression analysis, classification methods, cluster analysis, and visualization.

CTools

The Community Tools project includes an array of projects that leverage the Pentaho platform. Each tool package has an abbreviated name that begins with the letter "C" to designate its community origins, indicating that it is open source and available free of charge. These tools are produced and managed by Webdetails, a Pentaho company.


We thank our Pentaho Data Integration contributors, whose significant efforts in writing code and documentation and in testing have been instrumental in creating and maintaining the product, its associated development kits, our build tools and web sites.

  • Aaron Phillips
  • Alex Silva
  • Andrew Hoesley
  • Angelo Rodriguez
  • Audin Chan
  • Ben Lienig
  • Bernardo Arlandis
  • Bill Seyler
  • Biswapesh Chattopadhyay
  • Bo Conroy
  • Bryan Hagan
  • Bryan Rosander
  • Curtis Boyden
  • Daniel Einspanjer
  • David Kincade
  • Dennis Van Roeyen
  • Doug Moran
  • Ezequiel Cuellar-Ojeda
  • Gretchen Moran
  • Henri Dupre
  • Hiroyuki Kawaguchi
  • Holger Hymmen
  • Itzik Pailis
  • Jake Cornelius
  • James Dixon
  • Jay Goldman
  • Jean-Francois Daune
  • Jeffrey Thomas
  • Jens Bleuel
  • Jianjun Chu
  • John Doe
  • Johnny Vanhentenryk
  • Jordan Ganoff
  • Kasper Sørensen
  • Kurtis Walker
  • Lee Cheng
  • Luc Boudreau
  • Manfred Olm
  • Marc Batchelor
  • Maria Carina
  • Mark Hall
  • Mathias Stödtler
  • Mat Lowery
  • Matt Burgess
  • Matt Casters
  • Michael Gugerell
  • Michel Jansen
  • Mike D’Amour
  • Nicholas Goodman
  • Nick Baker
  • Nick Hudak
  • Nicola Benaglia
  • Paul Stoellberger
  • Paul Sung
  • Pedro Alves
  • Pentaho Build Guy
  • Phillip Cole
  • Pieter van der Merwe
  • Robert Mansoor
  • Rob Fellows
  • Roland Bouman
  • Samatar Hassan
  • Sean Flatley
  • Shingo Yamagami
  • Slawomir Chodnicki
  • Steven Barkdull
  • Sulaiman Karmali
  • Sven Boden
  • Sven Thiergen
  • Sylvain Decloix
  • Tomas Di Domenico
  • Tom Qin
  • Tony Cook
  • Will Gorman
  • Wim De Clercq
  • Wintner Robert
  • YoungWoo Kim

STARTBUTTONIZE upgradeToEE
For Kettle

Enhanced Functionality

Expanded Core Capabilities

  • Includes enterprise repository that provides security and revision management
  • Includes the DI Server for scheduling and running the enterprise repository
  • Integrates job and transformation scheduling in a visual interface
  • Provides logging and monitoring features
  • Supports the JBoss platform

New Capabilities in 5.0 & 5.1

  • Supports R scripts in transformations
  • Includes security enhancements such as support for Kerberos and AES
  • Provides fine-grained security for connections; includes new execute permission
  • Provides row-level load balancing at the transformation step level
  • Includes job-level database transaction support, commit, and rollback
  • Provides a thin-client JDBC driver
  • Includes steps that can only be found in the Enterprise Edition such as:
    • R Script Executor
    • JMS Consumer and Producer
    • IBM Websphere MQ Consumer and Producer
    • Google Docs Input
    • Splunk Input and Output
    • Knowledge Flow
    • Weka Scoring
    • ARFF Output

Big Data Capabilities

  • Includes Pentaho Visual MapReduce for drag-and-drop development for Hadoop, with no coding required
  • In-Hadoop deployment for dramatic performance improvements
  • Includes Pentaho Big Data support and expertise to ensure your success

To learn more about these or any other capabilities, see the Pentaho Help.

For Pentaho BI Suite

Enhanced Functionality

With Pentaho Enterprise Edition you can move from core business intelligence functionality and individual components to advanced functionality, managed releases, and a single install of a certified, quality-assured release. Take a look at the Enterprise Edition capabilities to get started now.

Data Discovery, Analysis and Visualization

With Pentaho Analyzer, an intuitive, interactive web user interface enables free exploration and visualization of all data, including big data.

  • Interactive visual analysis allows decision makers to drill into data for greater insight
  • Advanced visualizations including geo-mapping, heat grids and scatter/bubble charts
  • Drag and drop operational report creation
  • Extreme scale in-memory data caching for speed-of-thought analysis

Dashboards

Delivering key performance indicators in a highly graphical, interactive visual interface, Pentaho dashboards provide the critical information business users need to understand and improve organizational performance.

  • Rich graphical visualizations with navigation, drill through and a rich library of filter controls
  • Web-based drag and drop dashboard designer for business users
  • Portal and mash-up integration to seamlessly integrate business analytics with other applications

Administration and Deployment Options

The Administration Perspective provides centralized management tools to easily and efficiently develop, deploy and manage the Pentaho platform.

  • Analytic content permissions, versioning, locking and expiration
  • Backup and recovery
  • Performance monitoring and usage auditing

Built on a contemporary, lightweight, high-performance platform, Pentaho can be flexibly deployed on-premises or in the cloud, or seamlessly embedded into other software applications. It is also available on the iPad for viewing and creating content, for a true mobile experience.

Support and Expertise

Pentaho Expertise

Content

  • Pentaho InfoCenter includes complete product documentation to ensure you have the right information when you need it.

People

  • Enterprise Edition Online Forum
  • Remote Assistance “check points”
  • Professional training
  • Professional services

Professional Support

A Pentaho subscription includes professional technical support to help answer and resolve any product-related issues and questions with your Pentaho Business Analytics solution.

Professional Support You Can Depend On

Pentaho support plans encompass bug fixes, technical issues and developer assistance directly from our experts. With professional support, you will be up and running quickly and avoid downtime. Bug fixes and patches are made available on a monthly basis and are incorporated into future versions of the product.

Customer satisfaction is our highest priority. Pentaho hires, trains and certifies only the brightest and highest-quality support staff, so that you receive the expertise and assistance you need to succeed.

Choose our Professional Support

  • On-site or remote design, development and deployment assistance.
  • Assistance with configuration and resolving performance bottlenecks.
  • Quick response to bug fixes and patches that automatically roll into future product releases.
  • Upgrade assistance.
  • Rapid response times for business-critical applications.
  • Alerts for relevant bug fixes and patches, security holes, optimization tips, and more.
  • Extended lifecycle support, minimum 2 years per major release.
  • Solution architecture and implementation roadmap collaboration.
  • Best practice sharing and guidance.

Training and Certification

Training is critical to the success of your Pentaho implementation. Proper training not only reduces implementation risks and accelerates user productivity, it also ensures you reap all of the benefits Pentaho’s tools can provide for your company.

Pentaho offers two certification programs, Pentaho Solution and Pentaho Data Integration, that confirm and recognize your skills and competence in building Pentaho solutions.

ENDBUTTONIZE upgradeToEE