Pentaho Data Integration

Engage
Meet the Family
Credits
Upgrade to Enterprise Edition

Pentaho Data Integration delivers powerful Extraction, Transformation, and Loading (ETL) capabilities using an innovative, metadata-driven approach. If you are new to Pentaho Data Integration, please use the links below to learn more and become active in the community.

Pentaho community projects and capabilities continue to grow, evolve and expand every day. Each component plays an important role in delivering the most comprehensive and robust business analytics platform available. Take a look below to learn more about the major components of the projects.

All Pentaho projects are open source at their core. As a result, they require a huge collaborative effort.
Listed below are the main developers and contributors responsible for building the project and creating its success today. We would love to add your name to the list.

Want to take your implementation to the next level? Experience our comprehensive data integration, visualization and analysis tools for yourself, and create customizable reports and interactive dashboards. Pentaho provides professional support, services and add-ons that will get you up and running in no time.

Try Pentaho

Thanks for choosing Pentaho Data Integration and joining a huge community of users and developers contributing to the long-term success of this project. Your contributions will help us innovate and improve every day.

Here are some instructions to help you engage and move our collective project forward.

Matt Casters, Chief of Data Integration at Pentaho, and Pedro Alves, SVP Community at Pentaho

Getting Started

New user? Want to know how to get started? Check out our documentation. These documents are also available in the Pentaho Data Integration installation directory, under docs/.

Documentation

Both the Pentaho InfoCenter and the Kettle Wiki are extensive repositories of information. To learn more, spend some time browsing the content behind these links.

Samples

PDI is bundled with numerous samples. To view them, click File > Import from an XML file in the menu bar and navigate to the samples under your PDI installation directory.

Forums

The Pentaho forums are an excellent resource, not only for getting help from experts but also for passing on your expertise to less experienced users.

IRC

Internet Relay Chat, aka the Facebook of the ’90s. Surprisingly, good old IRC is not dead. On the contrary, since generic “chit-chat” moved elsewhere, it has become one of the best technical resources for real-time collaborative communication.

Use your favorite client to meet us on server irc.freenode.net, channel ##pentaho, or use the webchat client.

Mailing List

This is the core developers’ classic communication channel. To engage in low-level coding and architecture decisions, this is the way to go.

Jira

Think you found a bug? Report it! We might not know about it and it’s the simplest way to help with the project.

Get the Source

Want to build Kettle from source? Fork it from GitHub at git@github.com:pentaho/pentaho-kettle.git. Bring the bug fixes on.

Continuous Integration

Curious about what the next version will look like? Want to see if a bug is fixed? Download and test the most up-to-date version from our CI builds.

Build Your Own Plugin

Can’t find a particular step? If you have some Java knowledge, go ahead and build it! PDI is a pluggable platform, and it’s simple to add new steps. We’ll help you with the process, and then you can submit it back to the community.

Blogs

Follow some of our most active bloggers.

Books

Take a look at our literature on Kettle.

Get in Touch

Want to get in touch with comments or suggestions? We would love to hear from you.

BI Platform

The BI Platform, also referred to as the Pentaho Business Analytics Platform, serves as the connection point for all other projects. The platform enables the delivery of a unified, end-to-end solution from data integration to visualization and consumption of data. The Pentaho BA Platform runs in the Tomcat Java Application Server and can be embedded into other Java Application Servers.

Kettle Project

Pentaho Data Integration, codenamed Kettle, consists of a core data integration (ETL) engine and GUI applications that allow users to define data integration jobs and transformations and that deliver outstanding performance. It is the premier data integration tool both for standalone ETL and for running jobs or heavier logic in the Business Analytics platform.

Mondrian Project

Pentaho Analysis Services, codenamed Mondrian, is an open source OLAP (online analytical processing) server written in Java. Mondrian supports the MDX (multidimensional expressions) query language and the XML for Analysis and OLAP4J interface specifications.

Mondrian is the centerpiece of multiple projects around the world.

Pentaho Reporting

Pentaho Report Designer is a visual, banded report writer for creating printable, pixel-perfect reports. The Report Designer queries and uses data from multiple sources and outputs to several formats. It includes a core reporting engine capable of generating reports based on an XML definition file.

Weka Project

Pentaho Data Mining uses the Waikato Environment for Knowledge Analysis (Weka) to mine data, identify patterns and predict future outcomes. Weka consists of machine learning algorithms for a broad set of data mining tasks, including data processing, regression analysis, classification methods, cluster analysis, and visualization.

Ctools

The Community tools project includes an array of projects that leverage the Pentaho platform. Each tool has an abbreviated name beginning with C to designate its community origins, meaning that it is open source and available free of charge. These tools are produced and managed by Webdetails, a Pentaho company.


We thank our Pentaho Data Integration contributors, whose significant efforts in writing code and documentation and in testing have been instrumental in creating and maintaining the product, its associated development kits, our build tools and web sites.

  • Aaron Phillips
  • Alex Silva
  • Andrew Hoesley
  • Angelo Rodriguez
  • Audin Chan
  • Ben Lienig
  • Bernardo Arlandis
  • Bill Seyler
  • Biswapesh Chattopadhyay
  • Bo Conroy
  • Bryan Hagan
  • Bryan Rosander
  • Curtis Boyden
  • Daniel Einspanjer
  • David Kincade
  • Dennis Van Roeyen
  • Doug Moran
  • Ezequiel Cuellar-Ojeda
  • Gretchen Moran
  • Henri Dupre
  • Hiroyuki Kawaguchi
  • Holger Hymmen
  • Itzik Pailis
  • Jake Cornelius
  • James Dixon
  • Jay Goldman
  • Jean-Francois Daune
  • Jeffrey Thomas
  • Jens Bleuel
  • Jianjun Chu
  • John Doe
  • Johnny Vanhentenryk
  • Jordan Ganoff
  • Kasper Sørensen
  • Kurtis Walker
  • Lee Cheng
  • Luc Boudreau
  • Manfred Olm
  • Marc Batchelor
  • Maria Carina
  • Mark Hall
  • Mathias Stödtler
  • Mat Lowery
  • Matt Burgess
  • Matt Casters
  • Michael Gugerell
  • Michel Jansen
  • Mike D’Amour
  • Nicholas Goodman
  • Nick Baker
  • Nicola Benaglia
  • Paul Stoellberger
  • Paul Sung
  • Pedro Alves
  • Pentaho Build Guy
  • Phillip Cole
  • Pieter van der Merwe
  • Robert Mansoor
  • Rob Fellows
  • Roland Bouman
  • Samatar Hassan
  • Sean Flatley
  • Shingo Yamagami
  • Slawomir Chodnicki
  • Steven Barkdull
  • Sven Boden
  • Sven Thiergen
  • Sylvain Decloix
  • Tomas Di Domenico
  • Tom Qin
  • Tony Cook
  • Will Gorman
  • Wim De Clercq
  • Wintner Robert
  • YoungWoo Kim
STARTBUTTONIZE upgradeToEE
For Kettle

Enhanced Functionality

Expanded Core Capabilities

  • Enterprise repository with third-party security, revision management and teamwork
  • Integrated interface for scheduling
  • Data Integration Server for scheduling and running the enterprise repository
  • Specific documentation
  • Logging and monitoring dashboards

Pentaho 5.0 Capabilities

  • Job level database transaction support, commit and rollback
  • Checkpoint and restart support for jobs, with auto-restart after failure
  • Row level load balancing at the transformation step level
  • Thin client JDBC driver

Big Data Capabilities

  • Pentaho Visual MapReduce for drag-and-drop development for Hadoop, no coding required
  • In-Hadoop deployment for dramatic performance improvements
  • Instaview for big data discovery on the leading big data stores including Hadoop, Cassandra, HBase, MongoDB and more
  • Pentaho Big Data support and expertise to ensure your success
For Pentaho BI Suite

Enhanced Functionality

With Pentaho Enterprise Edition you can move from core business intelligence functionality and individual components to advanced functionality, managed releases and a single install with a certified, quality-assured release. Take a look at the Enterprise Edition capabilities to get started now.

Data Discovery, Analysis and Visualization

With Pentaho Analyzer, an intuitive, interactive web user interface enables free exploration and visualization of all data, including big data.

  • Interactive visual analysis allows decision makers to drill into data for greater insight
  • Advanced visualizations including geo-mapping, heat grids and scatter/bubble charts
  • Drag and drop operational report creation
  • Extreme scale in-memory data caching for speed-of-thought analysis

Dashboards

Delivering key performance indicators in a highly graphical, interactive visual interface, Pentaho dashboards provide the critical information business users need to understand and improve organizational performance.

  • Rich graphical visualizations with navigation, drill through and a rich library of filter controls
  • Web-based drag and drop dashboard designer for business users
  • Portal and mash-up integration to seamlessly integrate business analytics with other applications

Administration and Deployment Options

The Administration Perspective provides centralized management tools to easily and efficiently develop, deploy and manage the Pentaho platform.

  • Analytic content permissions, versioning, locking and expiration
  • Backup and recovery
  • Performance monitoring and usage auditing

Built on a contemporary, lightweight, high-performance platform, Pentaho can be flexibly deployed on-premise or in the cloud, or seamlessly embedded into other software applications. It is also available on the iPad for viewing and creating content for a true mobile experience.

Support and Expertise

Pentaho Expertise

Content

  • Pentaho InfoCenter includes complete product documentation to ensure you have the right information when you need it.

People

  • Enterprise Edition Online Forum
  • Remote Assistance “check points”
  • Professional training
  • Professional services

Professional Support

Pentaho subscriptions include professional technical support plans designed to help resolve and anticipate product-related issues when designing, developing, deploying and supporting your Pentaho Business Analytics implementation.

Professional Support You Can Depend On

Pentaho support plans encompass bug fixes, technical issues and developer assistance directly from our experts. With professional support you will be up and running quickly and avoid downtime. Bug fixes and patches are delivered directly and are incorporated into future versions of the product.

Customer satisfaction is our highest priority. Pentaho hires, trains and certifies only the brightest and highest quality support staff so that you receive the expertise and assistance needed to ensure your success.

Choose our Professional Support

  • Remote design, development or deployment assistance
  • Assistance with configuration and resolving performance bottlenecks
  • Quick response to bug fixes and patches that automatically roll into future product releases
  • Upgrade assistance
  • Rapid response times for business-critical applications
  • Alerts for relevant bug fixes and patches, security holes, optimization tips, and more
  • Extended lifecycle support, minimum 2 years per major release

Training

Training is critical to the success of your Pentaho implementation. Proper training not only reduces implementation risks and accelerates user productivity, it also ensures you reap all of the benefits Pentaho’s tools can provide for your company.

ENDBUTTONIZE upgradeToEE