Pentaho 5.0 CE is released – What’s New in PDI & Big Data?

In addition to the release of the Enterprise Edition (EE), Pentaho has released the stable build of the 5.0 Community Edition (CE).

If you can’t wait to get to the download, have a look at our new It hosts all information and the download link. And for the download of a free 30-day trial of Pentaho EE, visit

Among all the other great additions in the Pentaho Business Analytics Suite 5.0, here are the highlights for the new PDI 5.0 release:

PDI Community Edition includes the following new features:

  • Inline Help links on transformation steps and job entries
  • Simplified looping – call jobs and transformations within a transformation and loop through the rows
  • Better previewing with a dedicated window in the run tab
  • Detailed timing metrics for low-level operations – helps to analyze bottlenecks in more detail, e.g. database connect & query time vs. transformation time
  • Extended monitoring of sub jobs and transformations in the Carte- and DI-Server (Expand remote job option)
  • Introduction of REST services (Carte and DI-Server)
  • Several new Transformation Steps and Job Entries including: Table compare, ZIP file, OpenERP input/output, Telnet, Nagios traps…
  • Lots of new configuration options for tuning the engine back-ends
  • Marketplace – Share and/or download new plug-ins
  • A series of new plugin systems to make it easier to extend Kettle (data types, extension points, carte servlets, logging tables, step-to-step row distributions…)
  • and many more…

PDI Enterprise Edition includes the following new features in addition to the CE features:

  • Kettle JDBC driver – Query data using SQL/JDBC from any transformation. See how this can be used to blend data at the source in Pentaho’s Big Data Blend of the Week
  • Job Restartability – Set checkpoints and restart jobs from the last successful checkpoint
  • Transactional Job Execution – Provides the ability to roll back job execution on failure
  • Security on Database Connections – Full permissions (R/W/D); securely share connections with other users or groups
  • Load Balancing of data within Transformations
  • Splunk input/output steps
  • WebSphere MQ / MQSeries integration
  • 4.4 to 5.0 Migration – Ease-of-use for Administrators

Matt Casters, Pentaho’s Chief Architect of PDI & Kettle Project Founder, will post more details over the coming weeks, but just in case you can’t wait, test it yourself and go to or

And what about big data?

Pentaho continues to lead in embracing and extending the big data ecosystem.

Enhanced Big Data Functionality (5.0):

  • New InstaView use case templates for Hadoop and Splunk
  • Expanded NoSQL Integration

Expanded Big Data Ecosystem Support (5.0):

  • Hadoop High Availability support
  • New Integrations: RedShift, Impala, Splunk
  • New Hadoop Certs: Intel, Hortonworks, DataStax
  • Support for latest versions of CDH, MapR, MongoDB and Cassandra

Adaptive Big Data Layer (Transparent Access to & Integration of Big Data):

  • Insulates customers from changing versions, vendors and data stores
  • Gives customers broad flexibility of choice, rapid time to value and reduced risk
  • Provides native integration into the big data ecosystem
  • Offers the broadest, deepest big data support

We can deliver support for new technologies as plug-ins very quickly, without depending on the release cycles of our core products. For the latest additions, see also the New and updated content in the Pentaho Big Data wiki space – stay tuned!

Learn more about Pentaho & Big Data and check out:
