Multidimensional Databases
David M. Raab
DM News
September, 1995
Direct marketing systems are often handle very sophisticated selections, but their analysis capabilities rarely extend beyond cross tab reports. By contrast, marketers in other industries routinely rotate, summarize, subdivide, calculate and otherwise fiddle with five or six dimensions of data without understanding the complicated logical statements needed for typical query generators. These marketers are using “multidimensional” databases.
At the simplest level–the only one mere mortals need understand–a multidimensional database summarizes information into a limited number of categories, called “dimensions”. Each dimension is an intuitive business concept, such as time, product, region, or sales channel, and one dimension can be split into units such as days, weeks, months, quarters, and years. The system creates a cell for each possible combination of these units–a kind of gigantic many-way cross tab–and stores specific “measures” (revenues, expenses, units sold, etc.) for each cell. Analysts can drill down to the lowest individual cell or look at summarized data. The summary levels are determined by assigning hierarchies to the dimensions: for example, days roll up into weeks, months, quarters and years.
The user sees something like a spreadsheet with choices for which dimensions and what level of detail to display. Users can add calculations, select measures, “rotate” by swapping the row and column dimensions, “drill down” to the detail behind a selected cell or column, “drill up” to a higher level, or “slice” the data by including only certain values. The more deluxe systems also let the user “drill down” by pointing to a spot on a graph or map, provide time-series and forecasting functions, and automatically highlight selected conditions.
The multidimensional interface is the only tool for sophisticated data analysis that has been successfully adopted by non-technical users. This is because dimensions are familiar business entities and users don’t have to explicitly describe logical relationships in order to analyze their data. Multidimensional systems are also very fast: because most summaries are precalculated, screens are usually redrawn in a couple of seconds.
Multidimensional databases are often referred to as “On Line Analytical Processing” (OLAP) products. Purists argue that “OLAP” should be limited only to multidimensional products that meet additional criteria such as client/server architecture and unrestricted cross-dimensional calculations. But they seem to be losing the fight: the term has become so fashionable that OLAP is often applied to any fast analytical tool for large databases, even non-multidimensional ones.
There are dozens of multidimensional tools available. They can be split into three broad categories.
– desktop tools, such as Cognos PowerPlay (800?426?4667), Trinzic Forest & Trees (800-952-8779) or Andyne PaBLO (800-267-0665), load the multidimensional data onto the user’s PC. They have snazzy graphical interfaces but are limited to relatively small data sets–usually a couple hundred thousand cells–and can take many hours to create a new file. Cost per user is usually around $600.
– multidimensional engines, such as Arbor essBase (800-858-1666), Comshare Commander (800-922-7979), Oracle/IRI Express (800-765-7227), Pilot Lightship (617-374-9400), Kenan Acumate (800-775-3626), Sinper TM/1 (800-822-1596) and Holistic Systems Holos (908-321-6500). These create a proprietary multidimensional structure that resides on a central server. This allows them to handle files with over a billion cells and several gigabytes. These tools usually offer a proprietary user interface, although some–including TM/1 and essBase–can be accessed from a spreadsheet such as Microsoft Excel. Costs range from $2,000 to $10,000 per user.
– multidimensional-on-relational tools, such as Microstrategy DSS Agent (703?848?8600), Information Advantage Axsys (800-959-7015), Prodea Beacon (800-776-3321) and Stanford Technology Group MetaCube (415-288-7960), place a multidimensional interface on a standard relational database such as Oracle or DB2. This is attractive to buyers who are wary of proprietary multidimensional engines, although there is still plenty of proprietary technology involved and the relational databases must use special “star” or “snowflake” designs to get adequate performance. With proper design, summary tables and indexes, multidimensional-on-relational users report response times from 30 seconds to five minutes range on even complex queries. This is much faster than conventional relational databases, but slower than a proprietary multidimensional server. Prices are similar to pure multidimensional products.
Boundaries among these categories are increasingly blurred. Several products, including Holos and Express, can access data in either proprietary multidimensional or standard relational formats. Many vendors can actually mix the two data types by storing summary data in a multidimensional format but “reaching through” to relational files for details when necessary. And vendors have begun to develop standards that allow one firm’s front-end tool can read data in another vendor’s engine.
What can direct marketers do with all this? The first step is to REMAIN CALM: multidimensional databases cannot replace your main marketing database because they are singularly unsuited to storing data that cannot be meaningfully summarized, such as customer lists. At best, users could use the multidimensional interface to identify cells with desirable customer segments and then “reach through” to actual customer records in a relational database.
But “reach through” is a new and fragile technique for many vendors, and all the capabilities needed to generate a mailing list are not necessarily present. Anyone bold enough to consider such a project must consider a long list of requirements including:
-create sufficiently precise selections within the multidimensional interface
-support specialized capabilities such as Nth selects and random samples
-return large numbers of customer name and address records
-view a sample of the selected records on the desktop
-create output files on the server in appropriate predefined formats
-update the underlying relational database with promotion history
– rebuild the database with fresh information in a timely fashion
No one vendor today fully meets these needs, although a few–including Holistic and Kenan–may come close. An intriguing alternative is Cross Z (516-228-8500), whose Fractal database resembles a multidimensional product although the firm does not describe it that way. Cross/Z has long offered segmentation and selection abilities and is working other tools for multi-cell campaign management.
On the other hand, direct marketers have successfully employed multidimensional products for exploratory analysis and end-user reporting. Marketing database vendors including Customer Development Corporation and Miglautsch Marketing have integrated PowerPlay with their systems and report that users are extremely pleased. As pure analytical tools, multi-dimensional databases are an excellent option to consider.
* * *
David M. Raab is a Principal at Raab Associates Inc., a consultancy specializing in marketing technology and analytics. He can be reached at draab@raabassociates.com.
Leave a Reply
You must be logged in to post a comment.