NeuralWare Professional II/Plus
HNC Database Mining Workstation
Advanced Software Applications ModelMAX

David M. Raab
DM News
September, 1993

Neural networks are inherently intriguing. For one thing, they have a really snappy name. More important, they also promise a low-cost, easy-to-use, highly effective alternative to traditional statistical modeling. There have been enough reports of success that many direct marketers now want to try neural nets for themselves. They have several choices.

But first, a brief explanation of what a neural net really is. Basically, it’s a modeling technique that automatically splits the range of situations to be modeled into different groups, and then builds separate statistical models for each group. This roughly mimics the view that the human brain works by changing the strength of outputs from different groups of neurons, in response to the pattern of stimulus that is presented.

Clear as mud? Imagine a project to identify frequent pizza buyers, working with pizza purchase history plus age, income, sex, marital status, number of children, etc. Further imagine that the heaviest users actually fall into two groups: single moms with young children, and dual career couples with no kids.

A traditional regression model would need a single mathematical equation that gives high scores to members of both groups–not easy to write, given the complex relationships among the variables. A neural network, by contrast, would apply different equations to each group. Since each equation must be accurate only within the group, it is much easier to find formulas that give the desired answers. (In neural network terms, each of these equations is a “node” or “neuron”.)

Great, but where do the formulas come from, and how does the system know which formula goes with which group? The initial formulas are generated more or less at random; what’s important is less the contents of any particular formula than having a range of alternative formulas to choose from. The system determines which formula to apply in a particular circumstance by “training” itself on cases where the correct answer is known. Initially, it gives equal weight to the results of each formula; later, it adjusts the weights so that formulas yielding the correct answer for a case are given more weight when that type of case is presented. At the same time, the system is also adjusting the formulas themselves to further reduce the error.

After many cases are presented, the system has “learned” which formulas to weigh heavily in which circumstances. When changing the weights or formulas yields no further increase in accuracy, the training stops.
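The training loop described above can be sketched in miniature. This is a hypothetical toy, not any vendor's implementation: two "formula" neurons, back-propagation-style weight adjustment, and inputs that stand for (married, has_children) from the pizza example, where the heavy buyers are single-with-kids or married-without-kids.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Initial formulas are generated more or less at random, as described above.
w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # 2 "formulas", incl. bias
w_out = [random.uniform(-1, 1) for _ in range(3)]                         # output weights, incl. bias

def forward(x):
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_hidden]
    y = sigmoid(w_out[0] * h[0] + w_out[1] * h[1] + w_out[2])
    return h, y

def train_step(x, target, rate=0.5):
    h, y = forward(x)
    d_out = (y - target) * y * (1 - y)
    d_hid = [d_out * w_out[i] * h[i] * (1 - h[i]) for i in range(2)]
    # Reweight the formulas that gave the right answer for this case...
    for i in range(2):
        w_out[i] -= rate * d_out * h[i]
    w_out[2] -= rate * d_out
    # ...and adjust the formulas themselves to further reduce the error.
    for i in range(2):
        w_hidden[i][0] -= rate * d_hid[i] * x[0]
        w_hidden[i][1] -= rate * d_hid[i] * x[1]
        w_hidden[i][2] -= rate * d_hid[i]

# Cases where the correct answer is known: (married, has_children) -> heavy buyer?
cases = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
err_before = sum((forward(x)[1] - t) ** 2 for x, t in cases)
for _ in range(5000):
    for x, t in cases:
        train_step(x, t)
err_after = sum((forward(x)[1] - t) ** 2 for x, t in cases)
print(err_before, "->", err_after)
```

Each pass through the known cases nudges the weights, so the total error falls as training proceeds, which is the "learning" described above.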

The advantage of this approach is that it automatically adjusts for data interactions and non-linear relationships. With traditional techniques like regression, these relationships must be discovered and defined by a statistician before the model is built. Because this type of work accounts for much of the effort in building a traditional model, neural networks can be built faster–in days instead of weeks–and statistical skills are less necessary. The result is greater speed and less cost.

Performance is another issue. Although some vendors report their neural network models outperform traditional statistical techniques by 5% to 15%, the general experience in direct marketing seems to be that a well-executed neural net will perform about the same as a well-executed statistical model. More substantial improvements are possible in circumstances that play to the neural nets’ particular strengths–say, when patterns are very complex or data contains a great deal of “noise”.

Because performance is not necessarily better, firms with sophisticated in-house modeling staffs sometimes see little benefit in neural networks. These firms–especially if most of their projects use familiar files–already spend relatively little time on preparation. Some statisticians are also uncomfortable with neural network methods, which require them to abandon familiar tools and techniques.

In fact, the most enthusiastic users of neural networks tend to be firms that face many different types of modeling problems, lack in-house modeling staff, or need multiple models more quickly than traditional methods can deliver. These firms get the most benefit from the speed and cost advantages of neural networks.

Despite their capabilities, results of a neural network model still depend heavily on human skill. Even though the network will automatically adjust for certain types of data interactions, it cannot create important “derived” variables such as ratios (for example: purchases per year) or combinations (for example: individual products collapsed into groups). This must be done beforehand by someone who understands the business situation.
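The derived-variable step amounts to ordinary arithmetic applied before the data ever reaches the network. A minimal sketch, with made-up field names:

```python
# Hypothetical derived-variable step: the network cannot invent ratios
# like "purchases per year" or product groupings, so someone who knows
# the business computes them before training.
def add_derived(record):
    rec = dict(record)
    years = max(rec["years_on_file"], 1)              # avoid divide-by-zero
    rec["purchases_per_year"] = rec["total_purchases"] / years
    # Collapse individual products into a broader group.
    rec["bought_food"] = int(rec["pizza_orders"] + rec["pasta_orders"] > 0)
    return rec

customer = {"total_purchases": 12, "years_on_file": 4,
            "pizza_orders": 3, "pasta_orders": 0}
print(add_derived(customer)["purchases_per_year"])    # 3.0
```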

But this person need not be a statistical expert. Neural networks can be built by a business analyst who has been trained in the traditional modeling skills of data preparation, model design and variable selection. The best results often come from a collaboration of business and statistical experts.

Results are also affected by the software itself. Key considerations include:

Data preparation: the source data must be moved from its original format into a flat file suitable for analysis; contents must be analyzed and derived variables created; variables must be defined to the system; and values must be scaled and otherwise massaged to meet neural network requirements. Some systems include extensive tools to help with these activities, while others pretty much leave users to their own devices. Even the most helpful systems still require substantial, skilled preparation, although some are moving to automate portions of those procedures.
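As one example of the massaging step, networks typically want every input in a fixed range such as 0 to 1. A minimal min-max scaling sketch (an assumed routine, not any product's):

```python
def scale_column(values, lo=0.0, hi=1.0):
    """Min-max scale one variable into the range the network expects."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:                          # constant column: map to midpoint
        return [(lo + hi) / 2.0] * len(values)
    span = vmax - vmin
    return [lo + (v - vmin) * (hi - lo) / span for v in values]

incomes = [18000, 32000, 55000, 90000]
print(scale_column(incomes))    # smallest maps to 0.0, largest to 1.0
```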

User skills: some neural network products are designed primarily for researchers who have extensive statistical background and deep understanding of neural network issues. Other systems are designed primarily for end-users who are business, but not statistical, experts. The most specialized systems are targeted at particular applications, such as direct marketing. These tend to require the least technical understanding from a user. However, all systems rely on the user’s business knowledge to prepare the input and to interpret results.

Technical options: systems built for technical users allow control over innumerable details including the ways that weights are assigned and reassigned, the type of equations used in the formulas, how many layers the network has, the number of neurons per layer, how many trials are attempted, and methods to avoid overtraining (that is, memorizing the test data, which makes the system less accurate when scoring new cases). Systems may provide defaults that sophisticated users can change, use built-in logic to make the choices automatically, or just implement a single method. In addition, some systems give very extensive diagnostic tools for examining the network and for statistically evaluating the quality of its results.

Reports: in addition to diagnostic procedures, some systems provide user-oriented reports such as gains charts (complete with profit and loss impact) and sensitivity analyses that highlight the impact of different variables. Other systems do not provide those reports as part of the package, but either include general report-writing tools that could produce them, or allow data export into standard spreadsheets or report-writing systems.

Scoring: once a model is trained, it can be used to score new records. In all systems, the software that built the model can also apply the scores. But many users will want to export the scoring logic to work within other computer systems. This might be done by creating a custom program with the variables and weights embedded, or by exporting weight and variable files into a standard program. Sometimes the exported networks can be trained on new cases without going back to the original model building system. Most of the export options create “C” programs, and some vendors support COBOL as well.
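However it is exported, the generated program amounts to the trained weights plus the arithmetic to apply them; no training machinery travels with it. A Python sketch of the idea (the actual exports are C or COBOL programs, and these weights are invented):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical exported model: weights are fixed at export time and
# embedded as constants, so scoring a new record is pure arithmetic.
HIDDEN_WEIGHTS = [[0.8, -1.2, 0.1], [-0.5, 0.9, 0.3]]   # per neuron, incl. bias
OUTPUT_WEIGHTS = [1.4, -1.1, 0.2]                       # incl. bias

def score(record):
    h = [sigmoid(w[0] * record[0] + w[1] * record[1] + w[2])
         for w in HIDDEN_WEIGHTS]
    return sigmoid(OUTPUT_WEIGHTS[0] * h[0] + OUTPUT_WEIGHTS[1] * h[1]
                   + OUTPUT_WEIGHTS[2])

print(score([1.0, 0.0]))    # a probability-like score between 0 and 1
```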

Services: all vendors offer substantial training, both in the mechanics of using their software and in the underlying skills of data preparation and model design. In addition, some offer custom consulting and project management. Even clients with extensive experience in traditional modeling methods need some help in adjusting to neural nets.

Cost: charges for these systems range from a few thousand dollars for a general network building tool to nearly $100,000 for a specialized system and extensive consulting support. Any comparison should include not only the system cost, but also the training and staff time that will be invested in learning to use it. Of course, the ultimate payoff–in terms of lower costs and higher profits–is an even more important financial consideration.

Here is a look at three neural network systems that are available to direct marketers.

NeuralWare (412-787-8222) provides a general purpose neural network tool kit, aimed at R&D groups, universities and other technically advanced users. The main product, Professional II/Plus, allows users to create standard back-propagation networks plus over two dozen alternate configurations. If you suspect you really need a radial basis function network with Gaussian pattern units, Euclidean summations, hyperbolic tangent transfers, and a generalized delta learning rule with momentum, then NeuralWare will let you find out. If you want to invent your very own network paradigm by writing C code, you can do that, too.

On the other hand, NeuralWare also provides default settings for common network types. But the system is clearly aimed at users who are already sophisticated or who want to explore the intricacies of neural networks for themselves.

Professional II/Plus does not have substantial data preparation capabilities. But NeuralWare recently introduced DataSculptor, a graphically-oriented tool with extensive data import, transformation, analysis and display functions.

NeuralWare’s great strength is its richness and flexibility. It offers a huge array of network configuration options, with no limit to the size of the training data set. In addition, users can select from dozens of diagnostic tools, which can be updated on the screen as the network is trained or viewed after training is complete. To manage the variety of options effectively, the system uses a highly graphical, icon-driven interface.

Professional II/Plus also includes a SaveBest feature to return to a prior configuration if the network overtrains itself by memorizing the training data. This is a particular danger with neural networks.

Other than diagnostics, reporting in NeuralWare is largely limited to the ExplainNet facility, which shows the impact of each variable on individual cases and the data set as a whole. It can also show the impact of slight changes in any variable, in a form of sensitivity analysis. For other reports, the user must export results via an ASCII file into spreadsheet or report-writing programs.
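Sensitivity analysis of this sort amounts to nudging one input slightly and measuring how far the score moves. A hypothetical sketch (the toy scoring function stands in for a trained network):

```python
# Nudge one variable by a small delta and approximate the score's
# partial derivative with respect to it. All names are illustrative.
def sensitivity(score_fn, record, index, delta=0.01):
    base = score_fn(record)
    bumped = list(record)
    bumped[index] += delta
    return (score_fn(bumped) - base) / delta

# Toy linear score function standing in for a trained network.
toy_score = lambda r: 0.6 * r[0] - 0.2 * r[1] + 0.1
print(sensitivity(toy_score, [0.5, 0.3], 0))    # about 0.6
```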

Professional II/Plus can convert a trained network into C code, which allows new records to be scored from within another system. The optional Designer Pack provides the ability to create C programs that both execute the network as built and allow it to be further trained on new cases.

NeuralWare’s products run on platforms including DOS PC, Mac, DEC VAX, Sun, HP, IBM RS/6000, and Silicon Graphics. Training a 40-variable file might take six to eight hours on a 486 PC; more powerful workstations will be much faster.

Prices for Professional II/Plus range from $1,995 on the PC and Mac to $4,995 on high-end systems. Other products cost from $495 to $2,995 additional, depending on the product and hardware platform.

The firm also offers a $149 Explorer program, which is designed for education and training. This system has nearly all the features of Professional II/Plus, but does not allow a complex network to be saved or converted to C code.

In all, NeuralWare has sold over 10,000 copies of its systems.

Beyond software, NeuralWare offers several courses on using its products, neural computing principles, and specific applications including target marketing. These run two to four days each. A five-day program combining Professional II/Plus training with an application workshop costs $2,000. The firm does not take on consulting projects, but offers “technology transfer” arrangements to help customers learn how to apply neural networks to their particular situations.

HNC (619-546-8877) offers the DataBase Mining Workstation, a general purpose tool designed to let statisticians and non-statisticians build neural network models for scoring applications. About 60 copies have been sold to users in a variety of financial, insurance, process control and other industries, including a handful of direct marketers.

The Workstation comes with a range of tools that help move a project through the development cycle. These start with a data conversion package, DBMS Copy, that takes files in relational and other structures and puts them into the required flat ASCII format. A user then typically employs non-HNC tools to analyze the data and add any derived variables that are important.

Once the data is ready, the Data module lets users import it, define the variables to the Workstation and perform additional transformations. The next step is typically to run a “kitchen sink” model with all the variables, and then use Workstation’s Relationship Discovery and Automated Variable Selection modules to analyze the results and help determine which variables should really be included.

One or several new models are then built with the most important variables, and evaluated with several other modules. The Sensitivity module shows the impact that changing a variable has on a single case; the Explanation module ranks the importance of each variable; and the Evaluate module shows the performance of the model using gains charts and several statistical measures. Other reporting is done by exporting results as a flat ASCII file.

When the model is complete, scoring can be done within the Workstation or by using DeployNet to read the model files and score additional cases. A C version of DeployNet is included with the system, and a COBOL version costs extra.

All the modules are accessed via a graphical interface that runs under Windows on a PC. Versions also exist for Sun and RS/6000 workstations. To speed performance, the PC version uses a Balboa i860 add-in processing board. This reduces the time to train a typical network to under one hour, compared with three to six hours without a board. Balboa boards can also be installed in the Sun and RS/6000 versions, although the benefit will be less because these systems are themselves more powerful.

Being designed for business analysts rather than neural network researchers, the DataBase Mining Workstation limits itself to back-propagation networks and no more than three middle layers. This is more than adequate for most real-world situations.

The system is also limited by the memory of the Balboa board, which must hold the entire data set during training. The board can be loaded with up to 64 megabytes of data, which would hold about 150,000 records with 100 four-byte variables each. Since modeling is nearly always done on much smaller sets, this will rarely be a practical problem. There is no limit to the number of records the Workstation can score after the model is built.

Most business users are probably less concerned with the limits on their options than with how to choose among the options they have. The Workstation provides defaults for all settings. In addition, it can automatically create several versions of a model, each using different numbers of neurons and layers. This would typically be done overnight. At the end, the system automatically selects the version with the best results.

Because having too many layers often leads to overtraining, the Workstation’s limits provide a degree of automatic protection. In addition, the system checks the model against an independent “test” data set while training is underway, and stops training when results start to deteriorate. Both the test and training sets are loaded onto the Balboa board.
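The stop-when-results-deteriorate logic can be sketched roughly as follows. Everything here is invented for illustration, including the simulated error curve and the "patience" setting; the idea is the same one behind NeuralWare's SaveBest feature, namely keeping the weights from the best point seen so far.

```python
# Early-stopping sketch (assumed logic, not any vendor's): watch error on
# an independent test set and stop when it starts to rise.
def train_with_early_stopping(epochs, val_error, patience=3):
    best_err, best_epoch, bad_streak = float("inf"), 0, 0
    for epoch in range(1, epochs + 1):
        err = val_error(epoch)            # error on the held-out test set
        if err < best_err:
            best_err, best_epoch, bad_streak = err, epoch, 0
            # ...a real system would snapshot the network weights here...
        else:
            bad_streak += 1
            if bad_streak >= patience:    # results have started to deteriorate
                break
    return best_epoch, best_err

# Simulated validation curve: improves until epoch 10, then overtrains.
curve = lambda e: abs(e - 10) + 1.0
print(train_with_early_stopping(50, curve))    # (10, 1.0)
```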

The Workstation provides basic statistics to evaluate network results, though nothing like NeuralWare’s tools to explore what’s going on inside the network. But Workstation capabilities should suffice for the non-technical users it is intended to serve.

HNC provides two weeks of training for new users, split evenly between the mechanics of using the system and basic modeling techniques. New users are expected to have basic PC skills and to understand the business problem they want to solve, but do not need a modeling or statistical background.

The $80,000 price of the Workstation includes enough on-site consulting to get the user through an initial project–usually ten to twenty days. It also includes the Balboa board with 16 megabytes of RAM and a two year warranty, two years of software upgrades, and the C version of DeployNet. Adding Balboa RAM to 64 megabytes costs another $8,000, while the RS/6000 and Sun systems need special interfaces for the Balboa board which add $3,000 to $8,500. The Sun and RS/6000 versions can also be bought without the Balboa board, which reduces the price by $5,500. The COBOL version of DeployNet costs $20,000 additional.

HNC also offers custom model building and consulting services. In addition, it offers specialized systems for credit fraud detection, real estate appraisals, retail inventory management and mortgage underwriting.

Advanced Software Applications (412-227-5300) was founded a little over a year ago to develop systems tailored to end-users in specific industries. Its ModelMAX software is aimed at direct marketers who want to build response and performance models. (Response models predict a yes-or-no result, while performance models predict a continuously-varying quantity such as lifetime value.) About a dozen copies of the system have been sold, although aggressive marketing is just beginning. Over fifty evaluation copies have been placed.

Like the other neural network products, ModelMAX requires the user to prepare the variables before data is imported to the system. Once this is done, though, ModelMAX carefully leads the user through the model building process. The progression is made as painless as possible, with automatic prompting for the “Next Step” as each task is completed (and a “Congratulations!” message when the project is done). System menus show a check mark next to completed steps for the current project, giving a quick status report.

Running ModelMAX is simplicity itself. Once the data are loaded, the user has nothing to do except select each Next Step when it is presented. ModelMAX automatically chooses which variables to use, applying rules that both check for significance and screen out suspect data. The system then creates a back-propagation model with a single middle layer. The user has literally no choices to make, although he or she can force the model to include particular variables.

ModelMAX can handle up to 30,000 records in its training set, or a total of 40,000 records in training and validation sets combined. The system has internal safeguards to prevent overtraining. Records can contain an unlimited number of variables, although no more than 25 will be used in any single model.

The limits on middle layers, number of variables and training set size cause some concern, although solutions to most real-world problems should fit comfortably within ModelMAX’s range. ASA says its tests show that ModelMAX performance is competitive with other products. The firm is also considering ways to loosen some of the constraints.

ModelMAX provides several attractive reports. To check imported data before the model is run, the system shows the distribution of values for each variable in both table and graph formats. After training is complete, ModelMAX produces a gains chart that splits the file into segments and shows the expected profit or loss for each segment, based on user-provided mail cost and revenue per order. The user can also define cut-off levels for list selection.
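The gains-chart arithmetic itself is straightforward: rank the file by score, split it into segments, and apply the user-supplied mail cost and revenue per order. A hypothetical sketch with made-up scores and economics:

```python
# Rank records by model score, cut the file into equal segments, and
# compute each segment's expected profit or loss. Figures are invented.
def gains_chart(scores, responses, mail_cost, revenue, segments=5):
    ranked = sorted(zip(scores, responses), reverse=True)
    size = len(ranked) // segments
    chart = []
    for s in range(segments):
        seg = ranked[s * size:(s + 1) * size]
        orders = sum(r for _, r in seg)
        profit = orders * revenue - len(seg) * mail_cost
        chart.append({"segment": s + 1, "mailed": len(seg),
                      "orders": orders, "profit": profit})
    return chart

scores    = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
responses = [1,   1,   0,   1,   0,   0,   0,   0,   0,   0]
for row in gains_chart(scores, responses, mail_cost=0.5, revenue=30.0):
    print(row)
```

A user would set the mail/no-mail cut-off at the last segment whose expected profit is positive.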

Once the segments are defined, additional graphs show the distribution of one variable among all file segments, and compare the distributions of all variables in two user-selected segments. These reports give some indication of the importance the system has attached to different variables, but do not explicitly rank the variables by importance or show sensitivity to changes. Although ranking and sensitivity reports are traditionally important tools for understanding a model’s contents, they are less central in neural networks, which look at patterns rather than each variable in isolation.

ModelMAX has an extremely useful Evaluation Report, which compares the expected mail quantity, responses and profits from three different strategies: mailing all records; using the ModelMAX score with the user-defined cut-off; and using a different system (presumably, the current segmentation method) whose mail/no-mail decisions were stored on each record when the data set was imported. This report can be run against either the training or validation set. A similar report gives the same information for projections based on a mailing universe size provided by the user.

Once a satisfactory model is complete and the cut-off is set, ModelMAX can either score records internally or create C code for export to other systems. This code can be rewritten in another language if no C compiler is available. There is no limit to the size of the file that can be scored internally.

ModelMAX runs under Microsoft Windows on a PC; for speed, compute-intensive processes can run in native DOS instead. On a 486/33, it takes 30 minutes to two hours to train a model. Other functions take minutes each, so the total time to complete a project is typically three to four hours (excluding data preparation). Scoring on a PC runs at about 50,000 records per minute.

ASA provides ModelMAX users with a 2 1/2 day training course that gives a general background in modeling and focuses mostly on variable selection and definition. No previous modeling experience is required.

In addition to training, ASA provides consulting and custom system development using advanced technologies including neural nets, expert systems, genetic algorithms, fuzzy logic, and chaos. The firm has developed end-user applications for credit scoring, fraud detection and process control as well as direct marketing.

ASA offers a free 30-day evaluation copy of ModelMAX, which is fully functional but cannot be used to score records. A version that creates response and performance models, allows scoring on the PC and can create a C program to export a model to other platforms is $25,000 for a single user. Annual maintenance costs another 16.5%.

ModelMAX is available from ASA and through Group 1 Software (800-368-5806).

* * *

David M. Raab is a Principal at Raab Associates Inc., a consultancy specializing in marketing technology and analytics. He can be reached at draab@raabassociates.com.
