Thanks for choosing Pentaho Data Integration and joining a huge community of users and developers contributing to the long-term success of this project. Your contributions will help us innovate and improve every day.
Here are some instructions to help you engage and move our collective project forward.
Matt Casters, Chief of Data Integration at Pentaho, and Pedro Alves, SVP Community at Pentaho
Getting Started
New user? Want to know how to get started? Check out our documentation. These documents are also available in the Pentaho Data Integration installation directory, under docs/.
Documentation
Both the Pentaho InfoCenter and the Kettle Wiki are extensive repositories of information. To learn more, explore the following links.
- InfoCenter, Create DI Solutions
- Transformation Steps Documentation
- Job Steps Documentation
- Community Plugins
Samples
PDI is bundled with numerous samples. To view them, click File > Import from an XML file in the menu bar and navigate to the samples under your PDI installation directory.
Forums
The Pentaho forums are an excellent resource, not only for getting help from experts but also for passing on your expertise to less experienced users.
IRC
Internet Relay Chat, aka the Facebook of the ’90s. Surprisingly, good old IRC is not dead. On the contrary, since generic “chit-chat” moved elsewhere, it has become one of the best technical resources for real-time collaborative communication.
Use your favorite client to meet us on server irc.freenode.net, channel ##pentaho, or use the webchat client.
Mailing List
The mailing list is the core developers’ classic communication channel. To engage in low-level coding and architecture decisions, this is the way to go.
Jira
Think you found a bug? Report it! We might not know about it, and reporting it is the simplest way to help the project.
Get the Source
Want to build Kettle from source? Fork it on GitHub at git@github.com:pentaho/pentaho-kettle.git. Bring on the bug fixes.
Continuous Integration
Curious about what the next version will look like? Want to see if a bug is fixed? Download and test the most up-to-date version from our CI builds.
Build Your Own Plugin
Can’t find a particular step? If you have some Java knowledge, go ahead and build it! PDI is a pluggable platform, and it’s simple to add new steps. We’ll help you with the process, and then you can submit your step back to the community.
Blogs
Follow some of our most active bloggers:
- Matt Casters on Data Integration
- Matt Burgess on Fun with PDI
- Jens Bleuel about Kettle
- Diethard Steiner on Business Intelligence
- Pentaho Corp blog
- Will Gorman’s blog
- James Dixon’s blog
- Pedro Alves on Business Intelligence
Books
Take a look at our literature on Kettle:
- Pentaho Kettle Solutions by Matt Casters, Roland Bouman and Jos van Dongen
- Pentaho Data Integration Beginner’s Guide - Second Edition by María Carina Roldán
- Pentaho Data Integration 4 Cookbook by María Carina Roldán and Adrián Sergio Pulvirenti
- Instant Pentaho Data Integration Kitchen by Sergio Ramazzina
Get in Touch
Want to get in touch with comments or suggestions? We would love to hear from you.
BI Platform
The BI Platform, also referred to as the Pentaho Business Analytics (BA) Platform, serves as the connection point for all other projects. The platform enables the delivery of a unified, end-to-end solution, from data integration to visualization and consumption of data. The Pentaho BA Platform runs in the Tomcat Java application server and can be embedded into other Java application servers.
Kettle Project
Pentaho Data Integration, codenamed Kettle, consists of a core data integration (ETL) engine and GUI applications that allow users to define data integration jobs and transformations and to deliver outstanding performance. It is the premier data integration tool both for standalone ETL and for running jobs or heavier logic in the Business Analytics platform.
Mondrian Project
Pentaho Analysis Services, codenamed Mondrian, is an open-source OLAP (online analytical processing) server written in Java. Mondrian supports the MDX (Multidimensional Expressions) query language and the XML for Analysis (XMLA) and olap4j interface specifications.
Mondrian is the centerpiece of multiple projects around the world.
Pentaho Reporting
Pentaho Report Designer is a visual, banded report writer for creating printable, pixel-perfect reports. The Report Designer queries and uses data from multiple sources and outputs to several formats. It includes a core reporting engine capable of generating reports based on an XML definition file.
Weka Project
Pentaho Data Mining uses the Waikato Environment for Knowledge Analysis (Weka) to mine data, identify patterns and predict future outcomes. Weka consists of machine learning algorithms for a broad set of data mining tasks, including data processing, regression analysis, classification methods, cluster analysis, and visualization.
CTools
The Community Tools project includes an array of projects that leverage the Pentaho platform. Each tool has an abbreviated name that begins with C, which designates its community origins and indicates that it is open source and available free of charge. These tools are produced and managed by Webdetails, a Pentaho company.
We thank our Pentaho Data Integration contributors, whose significant efforts in writing code and documentation and in testing have been instrumental in creating and maintaining the product, its associated development kits, our build tools and websites.
- Aaron Phillips
- Alex Silva
- Andrew Hoesley
- Angelo Rodriguez
- Audin Chan
- Ben Lienig
- Bernardo Arlandis
- Bill Seyler
- Biswapesh Chattopadhyay
- Bo Conroy
- Bryan Hagan
- Bryan Rosander
- Curtis Boyden
- Daniel Einspanjer
- David Kincade
- Dennis Van Roeyen
- Doug Moran
- Ezequiel Cuellar-Ojeda
- Gretchen Moran
- Henri Dupre
- Hiroyuki Kawaguchi
- Holger Hymmen
- Itzik Pailis
- Jake Cornelius
- James Dixon
- Jay Goldman
- Jean-Francois Daune
- Jeffrey Thomas
- Jens Bleuel
- Jianjun Chu
- John Doe
- Johnny Vanhentenryk
- Jordan Ganoff
- Kasper Sørensen
- Kurtis Walker
- Lee Cheng
- Luc Boudreau
- Manfred Olm
- Marc Batchelor
- Maria Carina
- Mark Hall
- Mathias Stödtler
- Mat Lowery
- Matt Burgess
- Matt Casters
- Michael Gugerell
- Michel Jansen
- Mike D’Amour
- Nicholas Goodman
- Nick Baker
- Nicola Benaglia
- Paul Stoellberger
- Paul Sung
- Pedro Alves
- Pentaho Build Guy
- Phillip Cole
- Pieter van der Merwe
- Robert Mansoor
- Rob Fellows
- Roland Bouman
- Samatar Hassan
- Sean Flatley
- Shingo Yamagami
- Slawomir Chodnicki
- Steven Barkdull
- Sven Boden
- Sven Thiergen
- Sylvain Decloix
- Tomas Di Domenico
- Tom Qin
- Tony Cook
- Will Gorman
- Wim De Clercq
- Wintner Robert
- YoungWoo Kim