What’s New in PDI 4.4?

Alongside Mobile and the many other feature enhancements in the Pentaho Business Analytics Suite 4.8, here are the highlights of the new PDI 4.4 release:

Pentaho Instaview

Pentaho Instaview is the fastest way to start using Pentaho Data Integration to analyze and visualize data. Instaview uses templates to manage the complexities of data access and preparation. You can focus on selecting and filtering the data you want to explore, rather than spending time creating source connections and identifying measure and dimension fields. Once the data has been selected, Instaview automatically generates transformation and metadata models, executes them, and launches Pentaho Analyzer. This allows you to explore your data in the Analyzer desktop user interface.
As your data requirements become more advanced, you can create your own templates and use the full power of Pentaho Data Integration (PDI).
To learn more about Pentaho Instaview, watch this video and see the Getting Started with Pentaho Data Integration Instaview Guide.

PDI Operations Mart

The PDI Operations Mart enables administrators to collect PDI log data into one centralized data mart and query it for easy reporting and analysis. The operations mart has predefined samples for Pentaho Analyzer, Interactive Reporting, and Dashboards. You can also create individualized reports to meet your specific needs.

Sample inquiries include:

  • How many jobs or transformations have been successful compared to how many failed in a given period?
  • How many jobs or transformations are currently running?
  • What are the longest running jobs or transformations in a given period?
  • What is the highest failure rate of jobs or transformations in a given period?
  • How many rows have been processed in a particular time period? This lets you see the trend of rows processed or run time as a time series for selected transformations.

The operations mart provides setup procedures for MySQL, Oracle, and PostgreSQL databases. Installation instructions for the PDI Operations Mart are available in the Pentaho InfoCenter.
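To give a feel for the sample inquiries listed above, here is a minimal sketch using an in-memory SQLite database. The table name `execution_log`, its columns, and the data are all hypothetical and chosen for illustration only; the real operations mart schema is described in the Pentaho InfoCenter and will differ.

```python
import sqlite3

# Hypothetical, simplified log table; NOT the real operations mart schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE execution_log ("
    "name TEXT, kind TEXT, status TEXT, start_date TEXT, "
    "duration_s REAL, rows_processed INTEGER)"
)
conn.executemany(
    "INSERT INTO execution_log VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("load_sales", "job", "success", "2012-11-05", 120.0, 50000),
        ("load_sales", "job", "failed", "2012-11-06", 15.0, 0),
        ("clean_customers", "transformation", "success", "2012-11-07", 300.5, 120000),
        ("clean_customers", "transformation", "success", "2012-12-01", 280.0, 110000),
    ],
)

period = ("2012-11-01", "2012-11-30")

# How many runs succeeded compared to how many failed in the period?
status_counts = dict(
    conn.execute(
        "SELECT status, COUNT(*) FROM execution_log "
        "WHERE start_date BETWEEN ? AND ? GROUP BY status",
        period,
    ).fetchall()
)

# Which run took the longest in the period?
longest = conn.execute(
    "SELECT name, duration_s FROM execution_log "
    "WHERE start_date BETWEEN ? AND ? ORDER BY duration_s DESC LIMIT 1",
    period,
).fetchone()

# How many rows were processed in the period?
total_rows = conn.execute(
    "SELECT SUM(rows_processed) FROM execution_log "
    "WHERE start_date BETWEEN ? AND ?",
    period,
).fetchone()[0]

print(status_counts, longest, total_rows)
```

In practice you would answer these questions with the predefined Analyzer, Interactive Reporting, and Dashboard samples rather than hand-written SQL.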

Concat Fields Step

The Concat Fields step is used to join multiple fields into one target field. The fields can be delimited by a separator and the enclosure logic is completely compatible with the Text File Output step.

This step is very useful for joining fields as key/value pairs for the Hadoop MapReduce Output step.
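The idea behind the step can be sketched in a few lines of Python. This is only an illustration of joining fields with a separator and an optional enclosure; the function `concat_fields` and its quoting rule (enclose a value only when it contains the separator or the enclosure, and double any embedded enclosure) are assumptions, not the exact logic of the Concat Fields or Text File Output steps.

```python
def concat_fields(row, field_names, separator=";", enclosure='"'):
    """Join the selected fields of one row into a single target string."""
    parts = []
    for name in field_names:
        value = "" if row[name] is None else str(row[name])
        # Assumed rule: enclose only values containing the separator or enclosure.
        if enclosure and (separator in value or enclosure in value):
            value = enclosure + value.replace(enclosure, enclosure * 2) + enclosure
        parts.append(value)
    return separator.join(parts)


# Build a key/value pair from one row, as you might before a MapReduce output.
row = {"customer_id": 42, "city": "Orlando", "note": "a;b"}
key = concat_fields(row, ["customer_id"])
value = concat_fields(row, ["city", "note"])
print(key, value)
```

Here the key is built from one field and the value from two, with the third field enclosed because it contains the separator.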

SAS Input Step

The SAS Input step reads files in the sas7bdat format created by SAS software, so PDI developers can import these files directly.

EDI to XML Step

The EDI to XML step converts EDI message text, which conforms to the ISO 9735 standard, to generic XML. The XML text is more accessible and enables selective data extraction using XPath and the Get Data From XML step.
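To show why generic XML is easier to work with, here is a small sketch of selective extraction with XPath-style expressions, using Python's standard `xml.etree.ElementTree` module. The XML layout below (`edifact`, `segment`, `element`) and the sample values are hypothetical; the element names actually produced by the EDI to XML step may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout, for illustration only.
xml_text = """
<edifact>
  <segment tag="NAD">
    <element>BY</element>
    <element>5412345000013</element>
  </segment>
  <segment tag="NAD">
    <element>SU</element>
    <element>4012345500004</element>
  </segment>
  <segment tag="QTY">
    <element>21</element>
    <element>48</element>
  </segment>
</edifact>
"""

root = ET.fromstring(xml_text.strip())

# Select only the NAD segments and read the second element of each.
party_ids = [
    segment.findall("element")[1].text
    for segment in root.findall("./segment[@tag='NAD']")
]

# Select the second element of the QTY segment directly (positions are 1-based).
quantity = root.find("./segment[@tag='QTY']/element[2]").text

print(party_ids, quantity)
```

Inside PDI, the same kind of expressions would be configured in the Get Data From XML step rather than written in code.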

New Pentaho Data Integration Software Development Kit (PDI SDK)

Extending and Embedding Pentaho Data Integration enables developers to use PDI beyond its out-of-the-box functionality. This guide explains the mechanics of extending PDI with plugins. It also explains how to embed PDI functionality directly into Java applications. Pentaho provides sample code for all plugin types and embedding scenarios. More details can be found in the Pentaho InfoCenter under Embedding and Extending Pentaho Data Integration.

The complete PDI 4.4 change log: Now with JIRA Components and PDI Sub-Components

This is the first release with detailed information about what has changed at a more granular level, so you can query JIRA by components and PDI sub-components such as steps, job entries, and specific database topics. This helps with the upgrade process and with finding existing issues in JIRA. More details can be found on the Pentaho Wiki page JIRA PDI Components, Sub-components and Labels, and the complete PDI 4.4 change log can be found in JIRA. You need to be logged into JIRA to include the fields in your default report, but you can also download the complete list by selecting “View” / “Excel (All fields)”.

Some background about versioning and compatibility for Kettle core

From the Kettle core perspective, 4.3.0 and 4.4.0 are still bug fix releases for 4.2.x but with significant new features. As such, we have decided to call the release version 4.4.0 rather than 4.3.1. As always, please see the PDI Upgrade Guide for specific topics when upgrading.

Download PDI 4.4

If you don’t have it already, you can download it here: www.pentaho.com/download (Enterprise Edition). The Community Edition should be available soon on SourceForge.

