How to set up Mongo on Pentaho

by Zachary Zeus
August 18, 2014
Mongo DB

With the Pentaho 5.1 release came a really exciting innovation: analytics directly against a MongoDB cluster.  Mongo on Pentaho is exciting because it means you can do advanced data discovery directly against Mongo collections.

Setting this up can be a bit tricky, as you have to understand how both Pentaho and Mongo work.  Additionally, Pentaho has introduced some new technologies in order to make this capability robust, scalable and extensible to other data sources.

New Pentaho Technologies to make this work:

– Mondrian 4.0 – this is the next-generation OLAP engine that is part of the Pentaho suite.  This is the first time that it has made it into a Pentaho production release.  It also has a new schema syntax.

– OSGi – This allows the Mondrian 4.0 engine to be deployed alongside the existing Mondrian implementations.  It also opens up more options in the platform's pluggable layer.

– PentahoMongoOlap Layer.  This is a mapping from the OLAP engine and its syntax (expressed on the front-end in MDX) to the Mongo functions and query syntax.  Historically, this mapping has only been done to SQL.  Now that there are implementations for two syntaxes (SQL and Mongo), the door is open to doing OLAP analyses across any syntax.

See this article by Will Gorman for the technical details of all of this, and for how to get the Mondrian 4 package installed if you’re using the Pentaho CE version.

– Mondrian 4, OSGi in Pentaho 5.1 CE

Assumptions

Make sure you have these in place before you get started.

1.  Running Pentaho Business Analytics server with Mondrian 4 installed and functioning correctly

2.  Running MongoDB database v 2.6 or above

Resources:

1.  Will Gorman's blog post (helps with getting Mondrian 4 running)

– Mondrian 4, OSGi in Pentaho 5.1 CE

2.  This JIRA task – (preliminary documentation on configuring Mongo on Pentaho)

– http://jira.pentaho.com/browse/MONDRIAN-1902

Steps (taken straight from the doc):

– Import sample data

– Upload Mondrian 4.0 Schema file

– Define the data connection

Import Sample Data

– If you have Pentaho 5.1 EE the sample data is located in /pentaho-solutions/system/samples/mondrian-data-foodmart-json-0.3.3.zip

– Unzip “mondrian-data-foodmart-json-0.3.3.zip”

– This gives you a folder with a bunch of .json files.  The ones that need to be loaded are:

Filename                                 Mongo collection name
sales_fact_1997_collapsed.json           sales
foodmart_data_sales_transactions.json    sales_transactions
agg_g_ms_pcat_sales_fact_1997.json       agg_g_ms_pcat_sales_fact_1997
agg_c_10_sales_fact_1997.json            agg_c_10_sales_fact_1997

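Here is a sample command, to be repeated once per file.  Whatever database name you use here has to match the dbname in the connection string later on.

/> mongoimport --db foodmart --collection agg_c_10_sales_fact_1997 --type json --file agg_c_10_sales_fact_1997.json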
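
Since there are four files to load, the sample command can be generated in a loop rather than typed four times. The following is a minimal sketch in Python, assuming mongoimport is on the PATH and the unzipped .json files are in the current directory. The function names and the database name foodmart are placeholders of mine, not part of the Pentaho documentation. By default it only prints the commands; pass run=True to actually execute them.

```python
import subprocess

# File-to-collection mapping, copied from the table above.
FILES_TO_COLLECTIONS = {
    "sales_fact_1997_collapsed.json": "sales",
    "foodmart_data_sales_transactions.json": "sales_transactions",
    "agg_g_ms_pcat_sales_fact_1997.json": "agg_g_ms_pcat_sales_fact_1997",
    "agg_c_10_sales_fact_1997.json": "agg_c_10_sales_fact_1997",
}


def build_import_commands(db_name):
    """Return one mongoimport argument list per sample file."""
    return [
        ["mongoimport", "--db", db_name, "--collection", collection,
         "--type", "json", "--file", filename]
        for filename, collection in FILES_TO_COLLECTIONS.items()
    ]


def import_all(db_name, run=False):
    """Print each command; only execute it when run=True."""
    for cmd in build_import_commands(db_name):
        print(" ".join(cmd))
        if run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Assumption: "foodmart" is a placeholder; use the db name that
    # your connection string's dbname will point at.
    import_all("foodmart")
```

Running it as a dry run first makes it easy to eyeball the collection names before anything is written to Mongo.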

Upload Mondrian 4.0 Schema

– If you have Pentaho 5.1 EE the sample schema is located in /pentaho-solutions/system/samples/FoodMart.mongo.xml

– Upload it into the BI Server from the Pentaho User Console (PUC) (instructions stolen from the Pentaho doc):

1. Login to the User Console using the admin username and password.

2. Open the Browse perspective by selecting this from the upper-left menu.

3. In the Folders panel, select the location where you want to store the schema. Click on Upload… in the Folder Actions panel. The Upload dialog box appears.

4. In the dialog box, click Browse to go to the location of the schema for upload. Double-click on the schema. If needed, set specific permissions on the schema using the Advanced Options settings.

5. Click OK. The schema is uploaded and available to specified users. 

Add the connection details

You need to edit the olap4j.properties file, which is located under the pentaho-solutions/system folder.  This is what I ended up with:

foodmart.name=MongoFoodmart
foodmart.className=org.pentaho.platform.plugin.services.connections.PentahoSystemDriver
foodmart.connectString=jdbc:mondrian4:Host=127.0.0.1;dbname=test;authenticationDatabase=admin;DataServicesProvider=com.pentaho.analysis.mongo.MongoDataServicesProvider;Catalog=solution:/home/admin/pentaho-mongolap/test-data/FoodMart.mongo.xml;username=bizreporter;password=ENC:Yml6M3ViZWQ=

Breaking this down:

– MongoFoodmart    needs to match the <Schema name="MongoFoodmart" in the Mondrian schema definition.

– foodmart                this uniquely identifies the connection.  It needs to be the same prefix at the beginning of each of the three property lines.

– Host                       127.0.0.1 – my mongo instance is on the same machine as the BI server – most implementations would have it on a separate machine.

– dbname                  I loaded all of the json data into a test db

– Catalog                  This is the location on the server where I uploaded the FoodMart.mongo.xml file.  The key thing to notice is the “solution:” prefix in “solution:/home” at the beginning of the path.  This tells the platform to look in its own repository directories rather than the ones on the file system.

– username               This is the specific user that has access to this data

– password                I’ve used an encrypted password, generated with the password encryption utility here: http://localhost:8080/pentaho/api/password/encrypt
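
The connect string packs a lot of settings into one line, so it is easy to mistype a separator or drop a key. As a rough sanity check, here is a small Python sketch that reads the three property lines and splits the connect string into its parts. The function names are my own, and the list of keys it checks is simply the ones used in this post, not an official list of required settings.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only; values contain "=" themselves.
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def parse_connect_string(connect_string):
    """Split 'jdbc:mondrian4:k=v;k=v' into a dict of its settings."""
    prefix = "jdbc:mondrian4:"
    if not connect_string.startswith(prefix):
        raise ValueError("expected a jdbc:mondrian4: connect string")
    settings = {}
    for part in connect_string[len(prefix):].split(";"):
        if part:
            key, _, value = part.partition("=")
            settings[key] = value
    return settings


# The three lines from this post, as they appear in olap4j.properties.
SAMPLE = """\
foodmart.name=MongoFoodmart
foodmart.className=org.pentaho.platform.plugin.services.connections.PentahoSystemDriver
foodmart.connectString=jdbc:mondrian4:Host=127.0.0.1;dbname=test;authenticationDatabase=admin;DataServicesProvider=com.pentaho.analysis.mongo.MongoDataServicesProvider;Catalog=solution:/home/admin/pentaho-mongolap/test-data/FoodMart.mongo.xml;username=bizreporter;password=ENC:Yml6M3ViZWQ=
"""

props = parse_properties(SAMPLE)
settings = parse_connect_string(props["foodmart.connectString"])

# Assumption: these are just the keys discussed in this post.
expected = ("Host", "dbname", "Catalog", "DataServicesProvider")
missing = [key for key in expected if key not in settings]

print("missing settings:", missing)
print("catalog uses solution: prefix:", settings["Catalog"].startswith("solution:"))
```

To check a real installation, replace SAMPLE with the contents of your own olap4j.properties file and adjust the foodmart prefix to whatever connection name you chose.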

Once these have been added to the olap4j.properties file, restart the server.
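
A mismatch between the name property and the schema name is only discovered after a restart, so it can be worth checking before restarting. The sketch below, again in Python, reads the name attribute from the root Schema element and compares it with the connection name. The schema string used here is a cut-down stand-in of mine, not the real FoodMart.mongo.xml, and the function names are illustrative.

```python
import xml.etree.ElementTree as ET


def schema_name(schema_xml):
    """Return the name attribute of the root <Schema> element."""
    root = ET.fromstring(schema_xml)
    if root.tag != "Schema":
        raise ValueError("root element is %r, expected 'Schema'" % root.tag)
    return root.get("name")


def names_match(schema_xml, connection_name):
    """True when the schema name equals the <prefix>.name property value."""
    return schema_name(schema_xml) == connection_name


# Assumption: a minimal stand-in for the uploaded schema file.
SAMPLE_SCHEMA = '<Schema name="MongoFoodmart" metamodelVersion="4.0"></Schema>'

print(names_match(SAMPLE_SCHEMA, "MongoFoodmart"))  # True
print(names_match(SAMPLE_SCHEMA, "MongoFoodmar"))   # False
```

For a real check, read the schema file you uploaded into a string and pass in the value of your .name property.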

And we’re done.  We test by selecting “Create New” -> “Analysis Report” and looking for the MongoFoodmart datasource.

Our free Proof of Concept Offer

BizCubed currently has a free proof of concept offer: to help you evaluate Pentaho, we’d like to offer you one of our experts for two days. Our expert will work alongside you, loading your data, making sense of it and building some impressive dashboards. This work will give you first-hand experience with the tools, their power and how quickly you can deliver outcomes.

We will also provide a secure hosted environment. Alternatively, we can get things set up for you internally. This will be provided to you free of risk and at no cost.

The next blog post will be focused on building a Mongo schema and connection from scratch – not just using the examples. 

Zachary Zeus

Zachary Zeus is the Co-CEO & Founder of BizCubed. He provides the business with more than 20 years' engineering experience and a solid background in providing large financial services with data capability. He maintains a passion for providing engineering solutions to real world problems, lending his considerable experience to enabling people to make better data driven decisions.