PDI/Kettle Telemetry (Presented at #PCM14)

This was presented at the Pentaho Community Meeting 2014 (#PCM14).

There are many reasons to collect usage statistics, for example:

  • It can help improve the product in its most-used areas and features (steps, job entries, database types, etc.)
  • It can help users determine whether features they use are affected by a planned upgrade (the upgrade notes for each release list the affected steps, job entries, etc.)
  • Combined with usage statistics from development, test and production, it can also show whether some jobs or transformations are never used

Here is one solution, with a how-to:

Analyze the used steps, job entries and database types

  • Download the solution analyze_trans_job
  • Within PDI/Kettle, please open the job _analyze_trans_job/transformations_jobs/0_analyze_trans_job.kjb
  • Read the comment within the job; it gives you all the usage information.
  • It is also possible to anonymize file names, transformation and step names: please see the option anonymize_names within the parameters.txt file.
  • Sending the results (../data/analyze_result_*.zip) to the e-mail address given in the job is completely anonymous; you can opt in to include your personal data if you like.
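
To give a rough idea of what such an analysis involves, here is a minimal Python sketch. It is not the analyze_trans_job solution itself (that is a Kettle job), and it rests on assumptions: that .ktr and .kjb files are XML in which steps appear as top-level <step> elements, job entries as <entries>/<entry> and databases as top-level <connection>, each with a <type> child. The function names and the hashing used for anonymization are hypothetical and only illustrate the idea behind the anonymize_names option.

```python
import hashlib
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def anonymize(name: str) -> str:
    """Replace a name with a short, stable SHA-256 digest (illustrative only)."""
    return hashlib.sha256(name.encode("utf-8")).hexdigest()[:12]


def analyze_file(path: Path, usage: dict) -> None:
    """Count step, job entry and database types found in one .ktr/.kjb file."""
    root = ET.parse(path).getroot()
    # Assumed layout for transformations: <step><type>...</type></step>
    for step in root.findall("step"):
        usage["steps"][step.findtext("type", default="UNKNOWN")] += 1
    # Assumed layout for jobs: <entries><entry><type>...</type></entry></entries>
    for entry in root.findall("entries/entry"):
        usage["job_entries"][entry.findtext("type", default="UNKNOWN")] += 1
    # Assumed layout for databases: <connection><type>...</type></connection>
    for conn in root.findall("connection"):
        usage["databases"][conn.findtext("type", default="UNKNOWN")] += 1


def analyze_tree(base_dir: str, anonymize_names: bool = False) -> dict:
    """Walk base_dir on the file system and aggregate usage over all files."""
    usage = {
        "steps": Counter(),
        "job_entries": Counter(),
        "databases": Counter(),
        "files": [],
    }
    for path in sorted(Path(base_dir).rglob("*")):
        if path.suffix.lower() in (".ktr", ".kjb"):
            analyze_file(path, usage)
            label = anonymize(path.name) if anonymize_names else path.name
            usage["files"].append(label)
    return usage
```

Calling analyze_tree("/path/to/your/jobs") returns counters that can be printed or written to a file; with anonymize_names=True the file names in the result are replaced by digests, while the step, job entry and database types stay readable.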

If you want to contribute to this solution, the jobs/transformations are hosted on GitHub.

Note: This is currently limited to the file system and does not yet support a repository or a repository export file (todo).

Further information on the user statistics that can be obtained with the Pentaho Operations Mart can be found on the Pentaho Community Wiki.

There is also an option for uploading your file (instead of sending it by e-mail):

