MapInfo Corporation MapInfo
Strategic Mapping, Inc. Atlas GIS for Windows
Environmental Systems Research Institute ArcView
David M. Raab
DM News
May, 1994
Direct marketers have a huge appetite for data, but few seem to have developed a taste for mapping systems. In a way this is surprising, since so much direct marketing analysis is based on Zip Codes, which are inherently geographical. But most direct marketers have used such alternative techniques as regression to discover the relationships that maps could make apparent. And direct marketing businesses often lack an obvious geographic component such as retail trading areas or sales force territories.
Still, interest in maps (more properly, “geographic information systems”, or GISs) has been growing. One reason is purely practical: costs have fallen sharply in the past two years, as vendors have replaced proprietary data with public domain information based on the 1990 Census and the associated TIGER geographical database. A package of street-level data that cost over $70,000 in 1992 can now be had for less than $10,000.
The software itself has also gained in appeal. Windows-based systems allow vendors to add more features and let users exploit them more easily. More powerful hardware gives improved performance, and more demanding managers want the efficiency that maps can supply. A single map can replace reams of tabular data, and be vastly more comprehensible in the bargain.
But just what, precisely, would a direct marketer do with a map? Some applications are obvious–choosing newspaper zones for free-standing inserts; purchasing radio or TV time; selecting sites for distribution centers. Other uses are more subtle–for example, one garden supply catalog uses mapping software to relate customer location to climate zones, so it can identify the right prospects for specialized products.
These examples point to some of the unique advantages of mapping systems. It’s easier to pick broadcast markets or newspaper zones with a map than with columns of numbers. Traditional databases do a poor job with the distance calculations required for distribution analysis. And although the gardening cataloger could have manually coded 40,000 Zip codes with climate zones, it was much easier to place a climate map on top of a Zip map, and let the software do the work.
The way mapping systems achieve these miracles is somewhat complicated, but worth taking the time to understand.
The central concept is that mapping systems relate all types of data to a common reference point–specifically, the latitude and longitude point that defines a position on the earth’s surface. Items such as a street address have a single point, while geographic regions (such as a Zip code or census tract) are defined by a series of boundary points. A particular point or region can have non-geographic data associated with it, such as the demographics of a block group or the purchase history of a customer site. The common geographic reference allows previously unrelated information to be combined. This is how the gardening catalog was able to append climate zones to Zip codes even though no explicit table existed.
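To make the overlay idea concrete, here is a minimal sketch of how a climate zone might be attached to a Zip code purely through shared coordinates. It assumes each Zip is represented by a single centroid point and each climate zone by a simple boundary polygon; the zone names, Zip codes and coordinates are invented for illustration, and the ray-casting test is a textbook technique, not the method of any product discussed here.

```python
from typing import Optional

Point = tuple[float, float]  # (longitude, latitude)


def point_in_polygon(point: Point, polygon: list[Point]) -> bool:
    """Ray-casting test: count the boundary edges crossed by a ray running east from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > y) != (y2 > y):
            # Longitude at which this edge crosses the point's latitude.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def zone_for(point: Point, zones: dict[str, list[Point]]) -> Optional[str]:
    """Return the name of the first zone whose boundary contains the point, if any."""
    for name, boundary in zones.items():
        if point_in_polygon(point, boundary):
            return name
    return None


# Hypothetical climate zones, each defined by a series of boundary points.
climate_zones = {
    "zone_5": [(-90.0, 40.0), (-80.0, 40.0), (-80.0, 45.0), (-90.0, 45.0)],
    "zone_7": [(-90.0, 35.0), (-80.0, 35.0), (-80.0, 40.0), (-90.0, 40.0)],
}

# Hypothetical Zip centroids (made-up codes and coordinates).
zip_centroids = {
    "00001": (-85.0, 42.0),
    "00002": (-85.0, 37.0),
    "00003": (-100.0, 30.0),
}

# No Zip-to-zone table exists; the link comes only from the coordinates.
zip_to_zone = {z: zone_for(c, climate_zones) for z, c in zip_centroids.items()}
```

A production system would work with full Zip boundaries rather than centroids, but the principle is the same: the two files are related only by position.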
The process of applying geographic reference points to a data set is called “geocoding”. Generally, it works like the Zip+4 or carrier route coding processes familiar to most direct marketers–each address is read by the system, which tries to match it against a set of standard address tables. Instead of postal codes, these tables contain latitude and longitude. Interestingly, the mapping software vendors do not rely on the Postal Service’s national address files, but instead use proprietary street databases.
Like other matching systems, geocoding software needs to handle abbreviations, misspellings, alternate formats and other real-world data problems. The leading vendors have built fairly sophisticated algorithms, which reportedly yield match rates of 60% to 95% for street addresses. The best of these probably equal the performance of CASS-certified postal coding systems, although there is quite a range in capabilities. One vendor, Strategic Mapping, has actually embedded Group 1’s AccuMail Zip+4 encoding and address standardization system, although it still uses separate algorithms to match specific street addresses.
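The matching step can be sketched roughly as follows. This is only an illustration of the general approach (standardize the address, look for an exact match, then fall back to a near match); the abbreviation list, the reference table and the similarity cutoff are all assumptions, and Python's standard `difflib` stands in for the vendors' proprietary algorithms.

```python
import difflib
import re
from typing import Optional

# Hypothetical abbreviation rules; real systems use far larger tables.
ABBREVIATIONS = {
    "STREET": "ST", "AVENUE": "AVE", "ROAD": "RD", "BOULEVARD": "BLVD",
    "NORTH": "N", "SOUTH": "S", "EAST": "E", "WEST": "W",
}

# Hypothetical reference table: standardized address -> (latitude, longitude).
STREET_TABLE = {
    "100 N MAIN ST": (40.0010, -75.0020),
    "25 OAK AVE": (40.0105, -75.0230),
}


def normalize(address: str) -> str:
    """Upper-case, strip punctuation and apply standard abbreviations."""
    words = re.sub(r"[^A-Z0-9 ]", " ", address.upper()).split()
    return " ".join(ABBREVIATIONS.get(w, w) for w in words)


def geocode(address: str, cutoff: float = 0.85) -> Optional[tuple[float, float]]:
    """Try an exact match on the standardized address, then a fuzzy match."""
    key = normalize(address)
    if key in STREET_TABLE:
        return STREET_TABLE[key]
    # Fuzzy fallback to tolerate misspellings; the cutoff is an arbitrary assumption.
    close = difflib.get_close_matches(key, list(STREET_TABLE), n=1, cutoff=cutoff)
    return STREET_TABLE[close[0]] if close else None
```

Under these assumptions, `geocode("100 North Main Street")` and the misspelled `geocode("100 N Mian St")` both resolve to the same point, while an address with no close counterpart in the table returns `None`. Raising or lowering the cutoff is one crude form of the matching "tightness" discussed later in this column.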
Geocoding is only needed on files that the user creates independently, such as customer lists. Most files used in a mapping system are purchased with the codes already in place. A huge variety is available, split broadly among lines (streets, highways, rivers, railroads), boundaries (states, area codes, census blocks, political districts, Zip Codes), and points (cities, banks, hospitals, airports). Any of these can be linked to statistical data such as demographics, purchasing behavior, lifestyle clusters, or bank deposits. A file might cost a few hundred dollars or tens of thousands, depending mostly on how many sources there are for its type of information.
Some systems can also import raster and vector image data, such as satellite photographs or topographic maps. These are given latitude/longitude coordinates so they can be related to other maps, but typically don’t have additional data attached.
To manage all this information, mapping systems organize their data into “layers”, which roughly correspond to transparent overlays on a conventional printed map. Typically, each map layer contains a single type of information–say, interstate highways–which can have both a graphic component (the lines that would show on the map) and data elements (name, tolls, exit numbers, etc.). Several layers can be used in a single map, and the user can control how each layer is displayed to achieve the desired effect.
Once the user has selected the data to appear on the map, the real fun begins. Sometimes the goal is just to print out the data–for example, to plot the locations of all customers. Mapping systems provide many tools to help format these maps for presentations. The most powerful can include tabular data and graphs along with the maps themselves.
In other cases, the system must do some type of analytical work. This may involve “thematic” mapping, such as using different colors to indicate areas with high or low response rates. Or it may use some type of “aggregation”, such as calculating all sales to customers within a certain area. Aggregation can get quite complex–for example, finding the average income of all households within three miles of a stretch of highway. To support these abilities, mapping systems have query languages that include both standard data operations and specialized geographic concepts such as “near”, “touching”, “within” and “outside”.
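A simple "within" aggregation might look like the sketch below, which totals sales to customers inside a given radius of one location. It simplifies the highway example to a single point (measuring distance to a line takes more geometry), uses the standard haversine great-circle formula with an approximate earth radius, and relies on made-up customer records.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean radius of the earth


def haversine_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def sales_within(customers: list[dict], center: tuple[float, float], radius_miles: float) -> int:
    """Sum the sales of every customer located within radius_miles of center."""
    lat0, lon0 = center
    return sum(
        c["sales"]
        for c in customers
        if haversine_miles(lat0, lon0, c["lat"], c["lon"]) <= radius_miles
    )


# Hypothetical geocoded customers, placed due north of the center point
# at roughly 0.7, 2.1 and 6.9 miles.
customers = [
    {"id": "A", "lat": 40.01, "lon": -75.0, "sales": 120},
    {"id": "B", "lat": 40.03, "lon": -75.0, "sales": 80},
    {"id": "C", "lat": 40.10, "lon": -75.0, "sales": 500},
]

total_nearby = sales_within(customers, center=(40.0, -75.0), radius_miles=3.0)
```

Here only customers A and B fall inside the three-mile circle. A mapping system's query language wraps this kind of calculation in a single "within" operator.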
A really large mapping project can run to thousands of images–say, five different views of the trading area around each store in a retail chain. To make these practical, most mapping systems have a programming language that allows map production to be automated. These languages can frequently be used to build self-contained specialized systems. Since mapping systems are extremely complex–about on par with desktop publishing software–a simplified shell is necessary to make them accessible to casual users.
Mapping systems originally ran on mainframe computers, and then migrated to mini-computers and Unix workstations. These powerful systems are still used for specialized activities such as truck dispatching or urban planning. But there are about a half dozen general-purpose PC-based mapping systems that can handle most marketing projects. These systems compete fiercely on features–so any detailed description of existing products will quickly become outdated. Instead, this column will limit itself to a quick look at three of the leaders. These products are often reviewed in computer publications, so look there to find more depth.
MapInfo (MapInfo Corporation, 800-327-8627) is generally considered the leader in PC mapping systems. Version 3.0 of its Windows product is scheduled to ship in June. Since the DOS version was introduced in 1987, about 40,000 copies of MapInfo have been sold for DOS, Windows, Macintosh and Unix.
MapInfo offers especially powerful data management capabilities. Data is stored in a proprietary format, but the system can also directly read dBASE, Excel and Lotus files. MapInfo can handle vector, raster and “bitmap” files such as corporate logos. In addition, alone among these systems, MapInfo can relate multiple files on a common key–so a dBASE file with customer numbers could be linked to another file that had customer numbers and was already geocoded, without having to import or geocode the dBASE file itself. A single map layer can include several “joined” files.
The system can also query external database files in Oracle, Sybase and ODBC-compliant systems through the optional “SQL DataLink”. DataLink can only connect to one external system at a time, however.
The other main option available to MapInfo users is the MapBasic programming language. This lets users write programs to automate repetitive tasks, and can link several tasks to create complete applications.
MapInfo offers a very powerful set of query and analysis tools. One query can access data in multiple map layers, and the system can perform sophisticated functions such as “area weighted aggregations” (assigning fractional weights to objects that are only partly within the target region). The system can produce graphics such as pie and bar charts, and can even use charts in a thematic map–for example, place pie charts showing market share on each county of a state map, and vary the size of the charts based on total market size in each county.
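The idea behind area-weighted aggregation can be shown with a small sketch. To keep the geometry trivial it assumes every region is an axis-aligned rectangle in arbitrary map units, and that each region's value is spread evenly over its area; the shapes and populations are invented, and nothing here reflects how MapInfo itself performs the calculation.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle in arbitrary map units (a simplifying assumption)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def area(self) -> float:
        return (self.max_x - self.min_x) * (self.max_y - self.min_y)


def overlap_area(a: Rect, b: Rect) -> float:
    """Area shared by two rectangles, or 0.0 if they do not overlap."""
    width = min(a.max_x, b.max_x) - max(a.min_x, b.min_x)
    height = min(a.max_y, b.max_y) - max(a.min_y, b.min_y)
    return width * height if width > 0 and height > 0 else 0.0


def area_weighted_sum(regions: list[tuple[Rect, float]], target: Rect) -> float:
    """Add each region's value, weighted by the fraction of its area inside the target."""
    total = 0.0
    for shape, value in regions:
        total += value * overlap_area(shape, target) / shape.area()
    return total


# Hypothetical trading area and three census-style regions with populations.
trading_area = Rect(0, 0, 10, 10)
regions = [
    (Rect(0, 0, 4, 4), 1000.0),     # entirely inside: counts in full
    (Rect(8, 0, 12, 4), 2000.0),    # half inside: counts at 50%
    (Rect(20, 20, 24, 24), 500.0),  # entirely outside: ignored
]

estimated_population = area_weighted_sum(regions, trading_area)
```

The second region contributes only half of its 2,000 people, so the estimate is 2,000 in total rather than the 3,000 a simple "touching" query would report.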
The system’s geocoding is somewhat less impressive. It does not allow the user to control the “tightness” of matching through rules for handling directionals, spelling errors, etc.; nor can it automatically apply different matching methods in a single pass. This means that a user who wants to first find exact street-level matches, then Zip+4 matches, then Zip matches, or to look in different fields for the street address line, must make separate geocoding runs for each method.
MapInfo comes with maps of world country boundaries, 1994 world capitals, European provincial/municipal boundaries, U.S. state boundaries, interstate highways, location and population for the top 1,000 U.S. cities, and total U.S. statistics for population, income, retail sales and business establishments. As a sample of what else is available, U.S. buyers also get detailed data on San Francisco, including street maps, census block groups, census tracts, 5-digit Zip boundaries, a satellite image, a scanned map image, Claritas trendline and PRIZM data, plus northern California congressional districts and area codes. Outside the U.S., buyers get detailed information on London, Munich or Tokyo instead of San Francisco.
The system is priced at $1,295 for Windows and Mac versions, and $2,495 for Unix. MapBasic costs $795 for Windows and Mac, and $1,595 on Unix, while SQL DataLink costs $595 for Windows and Mac and $1,195 for Unix.
The company provides a variety of training classes and technical support options for additional fees.
Atlas GIS for Windows (Strategic Mapping, Inc., 408-970-9600) was originally introduced late last year, and shipped its 2.0 version in April. The system follows Atlas’s original DOS product, launched in 1983. Strategic Mapping gained notice among direct marketers last year when it purchased Donnelley Marketing Information Services.
Atlas stores its data in the standard dBASE format, and can also read Excel and Lotus files. The files cannot be linked, however, and only one file can be open per map layer. The system’s built-in “SQL Link” gives access to external ODBC-compliant databases.
Geocoding is quite sophisticated, including the ability to match at street, Zip+4, 5-digit Zip or other levels in a single pass, some user control over matching “tightness”, and a code that shows the type of match for each geocoded record. As with any vendor’s street-level matching, these advanced functions only work if you have spent $10,000 or so to buy street or Zip+4-level databases. The company also plans to release a system that places Donnelley ClusterPlus codes on geocoded records. Atlas can handle bitmaps, but not raster or vector images.
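Single-pass matching at several levels amounts to a fallback chain: try the most precise table first, drop to coarser ones as needed, and record which level succeeded. The sketch below illustrates that logic only; the lookup tables, field names and match codes are hypothetical and are not taken from Atlas or any other product.

```python
from typing import Optional

Coordinates = tuple[float, float]  # (latitude, longitude)

# Hypothetical reference tables, from most to least precise.
STREET_POINTS = {("100 MAIN ST", "00001"): (40.0010, -75.0020)}
ZIP4_CENTROIDS = {"00001-1234": (40.0015, -75.0030)}
ZIP5_CENTROIDS = {"00001": (40.0100, -75.0100), "00002": (41.0000, -76.0000)}


def geocode_record(record: dict) -> tuple[Optional[Coordinates], str]:
    """Return (coordinates, match_code), trying each level in turn."""
    street = record.get("street", "").upper().strip()
    zip5 = record.get("zip5", "")
    zip4 = record.get("zip4", "")

    hit = STREET_POINTS.get((street, zip5))
    if hit is not None:
        return hit, "STREET"

    hit = ZIP4_CENTROIDS.get(f"{zip5}-{zip4}")
    if hit is not None:
        return hit, "ZIP+4"

    hit = ZIP5_CENTROIDS.get(zip5)
    if hit is not None:
        return hit, "ZIP"

    return None, "NONE"


# Hypothetical customer records, processed in one pass.
records = [
    {"street": "100 Main St", "zip5": "00001", "zip4": "1234"},
    {"street": "9 Unknown Ln", "zip5": "00001", "zip4": "1234"},
    {"street": "9 Unknown Ln", "zip5": "00002", "zip4": ""},
    {"street": "9 Unknown Ln", "zip5": "99999", "zip4": ""},
]
match_codes = [geocode_record(r)[1] for r in records]
```

The match code matters downstream: a record placed at a Zip centroid should not be treated as precisely as one matched to a street address.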
Atlas offers strong geographic analysis features, including area-weighted aggregation, the ability to create multiple buffers in a single command, and automatic separation of points that are on top of each other. The system also lets users select objects by clicking on them directly, rather than first requiring the user to specify which layer is “active”. However, Atlas lacks charting facilities, cannot query more than one file at a time, and cannot show tabular data on printed maps.
The product comes with far more data than its competitors: current year (as opposed to now-outdated 1990 Census data) population, income and ethnicity and boundary files by state, county, 3-digit Zip, television Area of Dominant Influence (ADI), television Designated Marketing Area (DMA), telephone area code, census Metropolitan Statistical Area and 100 top cities, plus centroids of 5-digit Zips. It also provides demographics on 1,500 international countries and cities. Buyers also get detailed market, product usage and business location data for Charlotte, NC.
The new version of Atlas is especially strong at application development. In addition to its own scripting language, it supports interfaces to Visual Basic and C, which let applications written in those languages call up Atlas functions. A separate product allows Atlas functions to be treated as “objects” that are embedded in other systems.
Atlas GIS for Windows is priced at $1,595. Training and telephone support beyond the first 90 days are additional.
ArcView (Environmental Systems Research Institute, 909-793-5953) was introduced in 1992 as an entry-level Windows product from ESRI, which is better known for its more powerful Unix-based ARC/INFO. About 20,000 copies of ArcView have been sold, and the company is now preparing a much more powerful version for release some time this year.
ArcView stores data in a proprietary format that combines dBASE data files with a specialized format for geographic information. The system can read dBASE data files directly, and the new version will be able to access other database formats via SQL queries. The current version supports raster, vector and bitmap files.
Geocoding gives the user some control over how matches are defined, but needs separate passes for matches at the street or Zip Code levels. The new version will also be able to apply Zip+4 codes and handle more complex address formats.
ArcView offers a somewhat limited set of analytical functions: queries are limited to a single layer at a time, there is no area-weighted aggregation and the system lacks charting capabilities. Printed output is limited to maps and text, excluding even tabular data listings. Most of these functions will be enhanced in the new version.
The system does have a good interface. For example, an on-screen “table of contents” allows the user to easily determine which layers are selected for a particular operation.
ArcView is currently delivered with only a small set of sample data, although the new version is expected to provide a competitive selection. The new version will also add a customization capability.
Despite its limits, ArcView does have advantages. Most important may be its ability to read the many databases already prepared in the ARC/INFO format. The existing version also has a major price edge–$495 for Windows, including lifetime telephone support, and $995 for Unix. Price and support for the new system will probably be more in line with its competitors, however.
* * *
David M. Raab is a Principal at Raab Associates Inc., a consultancy specializing in marketing technology and analytics. He can be reached at draab@raabassociates.com.