2000 Jun 01
Intarka ProspectMiner
David M. Raab
DM News
June, 2000

In recent years, the Deep Thinkers of marketing have focused almost exclusively on managing relationships with existing customers. Since those relationships are affected by just about everything a company does, the result has been a steady inflation of marketing systems’ size and scope. Today’s grandest visions involve all-encompassing product suites that merge marketing and traditional operations in a seamless, if somewhat Orwellian, whole.

One byproduct of this evolution toward customer management has been the atrophy of features associated with prospecting. Like other evolutionary relics, these haven’t vanished entirely, but they have not matured and in some cases have even regressed. In fact, many of today’s marketing systems treat prospects as if they were just customers without a purchase history.

As Darwin would have predicted, new products have appeared to fill this vacant ecological niche. Since prospecting is one of the few areas outside the range of the ever-broader marketing suites, it is particularly attractive to small companies that cannot compete head-on with the behemoths. These firms display the wide variety of approaches expected in an environment where natural selection has not yet had time to eliminate the weak.

ProspectMiner (Intarka Inc., 408-232-1000, www.intarka.com) is one of the most specialized competitors vying for a place in the world of prospecting systems. It does just one thing: build business-to-business prospect lists from the World Wide Web. And it seems to do it remarkably well.

Of course, there is no shortage of either business prospect lists or tools to search the Internet. What sets Intarka apart is its ability to generate highly targeted lists that contain relevant details extracted from live Web pages. As anyone who has ever attempted a manual Web search knows, this is quite a feat.

Intarka works its magic by splitting the process into three steps. The first involves identifying appropriate companies, which it does by first using existing search engines to find potential matches and then applying text-based filters to eliminate inappropriate entries. The second step is extracting information from both structured and unstructured data sources and putting it into a standard format. The third is distributing and presenting the information to users.

The technology underlying these processes is impressive but well hidden from the casual user. To build a list, the user specifies conventional key words, exclusion filters based on geography, business types and specific companies or terms, and up to three Web sites that match the desired profile. The system then automatically analyzes these Web sites using proprietary methods that look at how often, how prominently and in what context different words and phrases appear. One output of this analysis is a list of additional key words that users can add to the search list. This setup takes an experienced user from five to 15 minutes.
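Intarka's analysis methods are proprietary, so the following is only a rough sketch of the general idea of scoring words by how often and how prominently they appear on the sample sites. The function name, the stop-word list, and the rule that a word in a page title counts three times as much as a word in the body are all assumptions made for illustration, not details of the product.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a much larger one.
STOPWORDS = {"the", "and", "of", "a", "to", "in", "for", "our", "we", "is"}

def suggest_keywords(pages, known, top_n=3, title_weight=3):
    """Suggest extra key words from sample pages.

    pages is a list of (title, body) pairs. Each word scores 1 per
    occurrence in a body and title_weight per occurrence in a title
    (an assumed stand-in for "prominence"). Words the user has already
    entered (known) and stop words are skipped.
    """
    scores = Counter()
    for title, body in pages:
        for weight, text in ((title_weight, title), (1, body)):
            for word in re.findall(r"[a-z]+", text.lower()):
                if word not in STOPWORDS and word not in known:
                    scores[word] += weight
    return [word for word, _ in scores.most_common(top_n)]

# Made-up sample pages standing in for the user's example Web sites.
pages = [
    ("Acme Mailing Lists", "We sell mailing lists and postal lists for catalogs"),
    ("Mailing Services", "mailing lists for catalogs and brochures"),
]
suggestions = suggest_keywords(pages, known={"lists"}, top_n=1)
```

Here the user has already entered "lists", so the sketch proposes the next most prominent word for the user to accept or reject.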

Once the initial settings are complete, the system uses the specified key words to query 19 standard search engines such as Yahoo!, Altavista, and Google. It builds a master list of all search engine hits, and then eliminates duplicates, dead links, sites that are not businesses, and sites that match the exclusion filters. It ranks the remaining sites on their similarity to the original user-specified Web sites, again using the key words and phrases it identified during the automated analysis. The system also notes any additional relevant key words or phrases it finds in the new sites.
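The merge, filter and rank sequence described above might look something like the sketch below. It is an illustration under simplifying assumptions, not Intarka's method: the function names are invented, duplicates are detected by host name only, exclusion filters are plain word matches, similarity is a simple sum of key-word weights, and the checks for dead links and non-business sites are left out.

```python
from urllib.parse import urlsplit

def normalize(url):
    """Reduce a URL to a lowercase host name without a leading 'www.'."""
    host = urlsplit(url if "//" in url else "//" + url).netloc.lower()
    return host[4:] if host.startswith("www.") else host

def merge_hits(results_by_engine, excluded_terms):
    """Build one master list from several engines' hits.

    results_by_engine maps an engine name to a list of (url, page_text)
    pairs. Sites already seen, and sites whose text contains any
    excluded term, are dropped.
    """
    merged = {}
    for hits in results_by_engine.values():
        for url, text in hits:
            site = normalize(url)
            words = set(text.lower().split())
            if site in merged or words & excluded_terms:
                continue
            merged[site] = text
    return merged

def rank(merged, profile):
    """Order sites by the summed weight of the profile words they contain."""
    def score(text):
        words = set(text.lower().split())
        return sum(weight for term, weight in profile.items() if term in words)
    return sorted(merged, key=lambda site: score(merged[site]), reverse=True)

# Made-up hits from two hypothetical engines.
results = {
    "engine_a": [("http://www.acme.com/", "direct marketing software"),
                 ("http://medco.com", "doctor clinic")],
    "engine_b": [("http://acme.com/about", "marketing"),
                 ("http://lists.example", "marketing lists")],
}
merged = merge_hits(results, excluded_terms={"clinic"})
ranking = rank(merged, profile={"marketing": 1.0, "software": 0.5})
```

The second acme.com hit is discarded as a duplicate and medco.com falls to the exclusion filter, leaving two sites to rank.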

At this point the user again steps in to review the rankings assigned to individual sites and to assess the additional key words or phrases. The user can increase or decrease the ranking of a site and also determine whether a particular word or phrase adds to or detracts from a site’s score. (A common word might detract from a score if it distinguishes sites that are easily confused. For example, a search for members of the American Marketing Association would probably also find members of the American Medical Association; penalizing the word “doctor” would help eliminate some of the latter.) The system can then rerun the search and ranking processes using the adjusted criteria. Intarka reports it usually takes one or two iterations, each ranking five to ten sites, before a search is acceptably accurate. To speed this portion of the process, the system can run in a test mode that returns about 30 sites in half an hour.
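The effect of penalizing a word can be shown with a toy scoring function. Everything here is assumed for the sake of the example: the weights, the cut-off, the site names and the rule that a site's score is the sum of the weights of the distinct words it contains. The product's actual scoring is not described in that detail.

```python
def score_site(text, weights):
    """Sum the weights of the distinct words on a page; unknown words count 0."""
    return sum(weights.get(word, 0.0) for word in set(text.lower().split()))

def sites_above(sites, weights, cutoff):
    """Return the sites scoring above the cut-off, best first."""
    scored = {name: score_site(text, weights) for name, text in sites.items()}
    keep = [name for name in scored if scored[name] > cutoff]
    return sorted(keep, key=lambda name: scored[name], reverse=True)

# Two made-up sites that share most of their vocabulary.
sites = {
    "marketing-assoc.example": "american marketing association chapter events",
    "medical-assoc.example": "american medical association doctor directory",
}

weights = {"american": 1.0, "association": 1.0, "marketing": 2.0}
first_pass = sites_above(sites, weights, cutoff=1.5)

# The user marks "doctor" as a word that detracts from the score.
adjusted = dict(weights, doctor=-3.0)
second_pass = sites_above(sites, adjusted, cutoff=1.5)
```

On the first pass both sites clear the cut-off because they share "american" and "association". Once "doctor" carries a negative weight, the medical site drops below the cut-off and only the marketing site remains.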

Once the user is satisfied with the quality of the list being generated, a full search usually runs overnight. During this run, the system also completes its second task: gathering specific information about the selected companies.

ProspectMiner draws from online sources including the company’s own site, corporate directories such as Hoover’s, news sources such as C-Net and Marketwatch, and SEC filings. It again uses proprietary text-analysis methods to identify and extract the company address and phone number, names of corporate officials, financial data, a company description, and a list of recent news stories, complete with links to the stories themselves. Although results vary depending on what is actually available on the Web, the system can usually glean at least basic contact information, and sometimes a remarkably complete dossier. Since the gathering process takes five to ten minutes per site, it is limited to sites that rank above a user-specified cut-off.
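To give a feel for the simplest kind of extraction involved, the sketch below pulls U.S.-style phone numbers out of free text with a regular expression and writes them in one standard format. This is a generic technique offered only as an illustration; the pattern and the function name are assumptions, and ProspectMiner's own text-analysis methods are proprietary and presumably far more elaborate.

```python
import re

# Matches forms such as 408-232-1000, (408) 232-1000 and 408.232.1000.
PHONE = re.compile(r"\(?\b(\d{3})\)?[-. ](\d{3})[-. ](\d{4})\b")

def extract_phones(text):
    """Return every phone number found in text, normalized to NNN-NNN-NNNN."""
    return ["-".join(groups) for groups in PHONE.findall(text)]

phones = extract_phones("Intarka Inc., 408-232-1000, www.intarka.com")
```

Addresses, officer names and financial figures are much harder to recognize than phone numbers, which is part of why results vary from site to site.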

Once the data is assembled, it can be reviewed online, emailed to appropriate individuals, or exported as a file. The system can also rerun the process automatically at regular intervals to find new sites. It automatically excludes sites it has already processed.

ProspectMiner was released in December 1998 and has been sold to about 40 companies. The system is currently offered as stand-alone software with a single user license starting at $15,000 for six months or $25,000 for one year. It runs on a Windows 95 or later workstation with a reasonably fast Internet connection. The vendor plans to switch to a transaction-based model by the end of the year, allowing users to access the system over the Web and pay based on the number of items found.

* * *

David M. Raab is a Principal at Raab Associates Inc., a consultancy specializing in marketing technology and analytics. He can be reached at draab@raabassociates.com.
