Build a Streamlined Data Refinery

by Zachary Zeus
October 29, 2014

With the surge of data over the last few years, it has been a complex task for many businesses to get real value from Big Data. 

Simple batch reporting isn’t up to scratch anymore: consumers want easy-to-understand visual analytics, delivered on demand and in real time in their preferred format, and working alongside their existing software.

This puts strain on the IT department and slows business users down, which has led them to adopt visualisation tools to help themselves, although their demands are only partially met. While they can view a subset of data, trusting that data and getting approval from IT are challenges they have had to face, until now.

Introduced in Pentaho’s latest 5.2 release, the innovative Streamlined Data Refinery (SDR) is a flexible, economical way to process and automate the delivery of information to large numbers of users for many analytic purposes. It sets a new standard of data delivery by streamlining the process and empowering business users. The design pattern accommodates an on-demand process that starts with user-initiated data requests and covers blending and refining the data, automatic analysis schema generation, and publishing analytic data sets in any format.

Innovative features

  • JDBC drivers are simpler to install
  • New data source locations have improved visibility
  • The PDI repository’s performance has improved
  • Simplified R script integration in data science pack
  • Enhanced documentation and samples for embedded analytics

Pentaho Data Integration

Pentaho’s highly scalable data integration engine, managed through its intuitive end user interface, provides the glue between the various data sources and stores in this architecture. This process can be actioned on-demand using PDI:

Blending & Orchestration: PDI absorbs data from any data source and then processes, cleanses and blends the data to drive insight.

Automatic Modelling & Publishing: PDI, as part of the data orchestration process, creates an OLAP schema and publishes it to the Pentaho Business Analytics server for end user visualisation.

Governance: IT can promptly validate data sources blended at the source, allowing for the right measure of control. Governed Data Delivery is the delivery of blended, trusted and timely data to power analytics, regardless of a user’s position or role.
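To make the blend, model and publish flow described above more concrete, here is a minimal sketch in plain Python. It is not PDI code and does not use any Pentaho API. Every name in it (`Schema`, `blend`, `infer_schema`, `publish`, `refine`, and the in-memory `catalog`) is a hypothetical stand-in chosen for illustration, and the rule "numeric columns are measures, everything else is a dimension" is an assumption, not how Pentaho generates its schemas.

```python
from dataclasses import dataclass, field


@dataclass
class Schema:
    """Hypothetical stand-in for a generated analysis (OLAP-style) schema."""
    dimensions: list = field(default_factory=list)
    measures: list = field(default_factory=list)


def blend(left, right, key):
    """Inner-join two lists of row dicts on a shared key column."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


def infer_schema(rows):
    """Assumed rule: numeric columns become measures, the rest dimensions."""
    schema = Schema()
    if not rows:
        return schema
    for column, value in rows[0].items():
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        (schema.measures if is_number else schema.dimensions).append(column)
    return schema


def publish(catalog, name, rows, schema):
    """Register the refined data set in an in-memory catalog (illustrative only)."""
    catalog[name] = {"rows": rows, "schema": schema}
    return catalog[name]


def refine(request, sources, catalog):
    """Handle one user-initiated request: blend, model, then publish."""
    rows = blend(sources[request["left"]], sources[request["right"]], request["key"])
    return publish(catalog, request["name"], rows, infer_schema(rows))


if __name__ == "__main__":
    # Made-up sample data standing in for two source systems.
    sources = {
        "orders": [{"customer_id": "C1", "amount": 120.0}],
        "customers": [{"customer_id": "C1", "region": "APAC"}],
    }
    request = {"name": "sales_by_region", "left": "orders",
               "right": "customers", "key": "customer_id"}
    print(refine(request, sources, {}))
```

In a real deployment each of these steps would be a PDI transformation or job step, and the governance checkpoint would sit between blending and publishing so that IT can validate the blended sources before they reach end users.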

How does the SDR solution compare to our competitors?

Pentaho’s SDR solution is unlike anything else on the market because it is a complete solution. By combining data integration, orchestration, Big Data connectivity and governed data delivery through an open, web-enabled platform, the SDR is a differentiator. For example:

  • Informatica and Tableau: Informatica provides data integration and Tableau provides visualisations, but together they can’t deliver on-demand data blended across many sources upon user request in a web UI.
  • Alteryx: Delivers lightweight data integration in an analyst drag/drop UI, but cannot provide for robust Big Data transformation and orchestration or high-performance embeddable visual analytics.
Zachary Zeus

Zachary Zeus is the Co-CEO & Founder of BizCubed. He brings the business more than 20 years' engineering experience and a solid background in providing large financial services organisations with data capability. He maintains a passion for providing engineering solutions to real-world problems, lending his considerable experience to enabling people to make better data-driven decisions.
