The Pentaho Reporting Camp

Project news and updates directly from the source

What is Pentaho Reporting

Pentaho Reporting is a suite of open-source reporting tools that lets you create relational and analytical reports from a wide range of data sources.

The Pentaho Reporting Engine can create PDF, Excel, HTML, text, Rich Text Format (RTF), XML and CSV outputs of your data. Our OpenFormula/Excel-formula expressions help you create more dynamic reports exactly the way you want them. Our open architecture, powerful API and extension points make sure this system can grow with your requirements.

Thursday, June 26, 2008

Taking small steps to cross the tab

After a long silence, let's have something positive today: Pentaho Reporting now officially talks to Mondrian and any other OLAP4J data source.

We now ship with two flavors of MDX access. The existing MDX capabilities are covered by the BandedMDXDataFactory, while the crosstabbing functionality will rely on a new DenormalizedMDXDataFactory.

The BandedMDXDataFactory takes a two-dimensional MDX query result and maps the multidimensional dataset into a flat table. The approach is reasonable if you want to access the cube row by row, but it fails badly as soon as your query has more than two dimensions or your query result contains a ragged hierarchy. The report designer used this mode for a very long time to provide at least some access to Mondrian data sources. The banded mode is still great if you need banded reporting over MDX data sources.
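To make the idea of "mapping into a flat table" concrete, here is a minimal sketch of flattening a two-dimensional result into a Swing TableModel. This is not the actual BandedMDXDataFactory code; the class name `BandedFlatteningSketch`, the `flatten` method and its parameters are hypothetical, and the input is assumed to be already split into row members, column members and a cell grid.

```java
import javax.swing.table.DefaultTableModel;
import javax.swing.table.TableModel;

// Hypothetical illustration only; not part of Pentaho Reporting.
public class BandedFlatteningSketch {

    /**
     * Turns a two-dimensional result (row members x column members) into a
     * flat table: one table row per row member, one table column per
     * column member. The column members survive only as column names.
     */
    static TableModel flatten(String rowAxisName,
                              String[] rowMembers,
                              String[] columnMembers,
                              double[][] cells) {
        // Header: the row-axis name followed by one column per column member.
        Object[] header = new Object[columnMembers.length + 1];
        header[0] = rowAxisName;
        System.arraycopy(columnMembers, 0, header, 1, columnMembers.length);

        DefaultTableModel model = new DefaultTableModel(header, 0);
        for (int r = 0; r < rowMembers.length; r++) {
            Object[] row = new Object[columnMembers.length + 1];
            row[0] = rowMembers[r];
            for (int c = 0; c < columnMembers.length; c++) {
                row[c + 1] = cells[r][c];
            }
            model.addRow(row);
        }
        return model;
    }
}
```

Note that in such a table the column axis is reduced to plain column names, so a consumer can no longer tell which columns belong to which axis or hierarchy level.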

However, with Version 0.8.11 of the reporting engine, we finally have to provide real crosstabbing capabilities.

At that point, the pre-chewed data provided by the BandedMDXDataFactory is totally unsuitable for anything sophisticated. You cannot reconstruct a cow from a steak. In the same way, we cannot use the banded data to reconstruct the axis and hierarchy information provided by the real MDX result set. At the same time, the complex (and in some respects ambiguous) nature of the data processing that happens inside the BandedMDXDataFactory makes it next to impossible to use plain queries as the source for a crosstabbed report.

The goals for our crosstab-implementation are straightforward:

  • It has to work on existing data-sources using only TableModels as input. (Don't over-architect.)

  • The internal data-source structures must be simple so that any source-system is capable of providing the data in the correct format. (Don't exclude anyone.)

  • Provide only simple aggregation as built-in functions (Don't copy Mondrian.)

  • Make sure that functions and expressions work exactly like in relational reports. (Don't be special.)


The new denormalized MDX-DataFactory provides a streaming view over the MDX cells. Any data source can provide a similar view by simply joining the fact table with all dimensions (and by sorting the rows according to the desired axis structure). The denormalized view now makes it possible to treat MDX columns and rows (and any of the other 253 possible axes) as relational groupings, which just happen to be displayed in a non-banded manner.
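As a rough illustration of how a non-MDX source could produce such a view, the following sketch joins a tiny in-memory fact table with two dimensions and sorts the result by the intended axis order. Everything here is made up for the example: the class `DenormalizedViewSketch`, the `Fact` type, and the column names "Region", "Product" and "Sales". It is not the DenormalizedMDXDataFactory itself, which works on MDX results.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import javax.swing.table.DefaultTableModel;
import javax.swing.table.TableModel;

// Hypothetical illustration only; not part of Pentaho Reporting.
public class DenormalizedViewSketch {

    /** One fact row: keys into two dimensions plus a measure. */
    static final class Fact {
        final int productId;
        final int regionId;
        final double sales;

        Fact(int productId, int regionId, double sales) {
            this.productId = productId;
            this.regionId = regionId;
            this.sales = sales;
        }
    }

    /**
     * Joins each fact with its dimension members and sorts the rows by
     * region, then product, so that a report can group on those columns
     * exactly as it would on any relational result.
     */
    static TableModel denormalize(List<Fact> facts,
                                  Map<Integer, String> products,
                                  Map<Integer, String> regions) {
        List<Object[]> rows = new ArrayList<>();
        for (Fact f : facts) {
            rows.add(new Object[] {
                regions.get(f.regionId), products.get(f.productId), f.sales
            });
        }

        // Sort according to the desired axis structure: region first, then product.
        Comparator<Object[]> byRegion = Comparator.comparing(r -> (String) r[0]);
        Comparator<Object[]> byProduct = Comparator.comparing(r -> (String) r[1]);
        rows.sort(byRegion.thenComparing(byProduct));

        DefaultTableModel model =
            new DefaultTableModel(new Object[] {"Region", "Product", "Sales"}, 0);
        for (Object[] row : rows) {
            model.addRow(row);
        }
        return model;
    }
}
```

Each table row carries one cell value together with every dimension member it belongs to, so the grouping information stays in the data instead of being folded into column names.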

Now with the data-problem solved, displaying the data will be quite easy, even for huge result-sets.
