Friday, July 23, 2010

Data Mining: What's The Big Deal?

I was participating in a meeting with a new client yesterday and we were discussing some preliminary results of a data-mining exercise. Since some of the team members were unfamiliar with me or the project, I explained the process we used, and then we brainstormed for an hour on what they would like to learn and how they would apply it immediately to their normal practices. As I told them at the end, it was a rare (and gratifying) meeting when "data" and "really excited" occur together in the same sentence.

What I found most fascinating about this meeting was the latent desire among staff to have the database tell them something, ANYTHING. This isn't an unsophisticated or small organization. Without giving away state secrets, they have 3.26 million membership dues transactions over the past 17 years, for example, and the quality of their questions and thoughts regarding how they'd apply learnings fell somewhere between pretty good and truly outstanding. But their environment (waiting six months for IT to download their data, receiving only what they refer to as 'sales reports' and little else) is very typical in associations, and it neither rewards nor cultivates expertise working with data among association staff.

No wonder the "data driven decision making" approach recommended in ASAE's 7 Measures book never took off. Even in other ASAE publications, common advice regarding data seems to be "don't collect or store what you don't know how to use." This might be good advice and promote efficiency, if it weren't so easy to actually snapshot your data (all of it) in an environment where it can facilitate ad hoc queries, periodic dashboarding, developing product purchaser profiles, measuring the migration patterns of your recent graduates into full membership, exploring the relationship between product purchases or event attendance and membership conversion, measuring customer repeat rates, doing market basket analysis, creating an RFM matrix for your fundraising, etc.
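To make one item on that list concrete, here is a minimal sketch of an RFM (recency, frequency, monetary) matrix in plain Python. The donor records, the 1-to-3 scoring scale, and the cutoffs are all made up for illustration; a real analysis would set cutoffs from the association's own giving data.

```python
from collections import defaultdict
from datetime import date

# Hypothetical gift records: (donor_id, gift_date, amount)
GIFTS = [
    ("A", date(2010, 6, 1), 500.0),
    ("A", date(2009, 12, 15), 250.0),
    ("A", date(2008, 3, 10), 250.0),
    ("B", date(2007, 5, 20), 50.0),
    ("C", date(2010, 1, 5), 100.0),
    ("C", date(2009, 1, 5), 100.0),
]


def summarize(gifts, as_of):
    """Return {donor: (days_since_last_gift, gift_count, total_amount)}."""
    last = {}
    count = defaultdict(int)
    total = defaultdict(float)
    for donor, when, amount in gifts:
        last[donor] = max(last.get(donor, when), when)
        count[donor] += 1
        total[donor] += amount
    return {d: ((as_of - last[d]).days, count[d], total[d]) for d in last}


def score(value, cutoffs, higher_is_better=True):
    """Score a value 1-3 against two cutoffs (illustrative, not tuned)."""
    low, high = cutoffs
    if higher_is_better:
        return 1 if value < low else 2 if value < high else 3
    # For recency, fewer days since the last gift earns the higher score.
    return 3 if value <= low else 2 if value <= high else 1


def rfm_matrix(gifts, as_of):
    """Group donors into (recency, frequency, monetary) score cells."""
    cells = defaultdict(list)
    for donor, (recency, freq, money) in summarize(gifts, as_of).items():
        r = score(recency, (365, 730), higher_is_better=False)
        f = score(freq, (2, 3))
        m = score(money, (100, 500))
        cells[(r, f, m)].append(donor)
    return dict(cells)


matrix = rfm_matrix(GIFTS, as_of=date(2010, 7, 23))
for cell, donors in sorted(matrix.items(), reverse=True):
    print(cell, donors)
```

Each donor lands in one cell of the matrix; the donors in the high-scoring cells are the ones a fundraiser would typically approach first.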

The key is actually very simple: download your data just once from your AMS into a series of flat files (comma or tab delimited) and import them into a decision support tool. I use SAS, whose basic product for a single user runs $3,200ish in the first year, then $1,600 for an annual license. John Dorman and the folks down at Texas Medical prefer to use the MS SQL that comes free with their network, but he describes the cost of upgrades and of training a staffer with at least some expertise in programming & analysis as a one-time expense of maybe $8,000 to $10,000. I find that loading and reprogramming an association's file takes me 2 to 10 hours, depending on the number of modules the data is stored in and how much of the data we need to simplify or eliminate (since you don't really need to know the name of the event they registered for on July 2, 2003; you just need to know it's one of ten they attended early in their membership tenure before they stopped attending but continued paying their dues). Querying it, including re-sorts and the creation of new variables, categories, etc. in new datasets, might take 5 to 20 minutes, even for files with hundreds of thousands of members or millions of transactions. Of course, most consulting isn't iterative: most of our reports have to be large and episodic, rather than small and applied, because we're paid to do projects rather than programs. But if you added this capability in-house (my recommendation), any association that takes this approach could have answers literally on demand, without annoying the IT staff with requests or annoying everyone by slowing down a production server.
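For readers who want to see the shape of that workflow, here is a rough sketch using Python's built-in SQLite as a stand-in for SAS or MS SQL (it is not the tooling described above). The column names and the sample rows are hypothetical; a real export would come from the AMS as a file on disk rather than a string in the script.

```python
import csv
import io
import sqlite3

# Hypothetical comma-delimited dues export; in practice this is a file from the AMS.
SAMPLE_EXPORT = """member_id,paid_on,amount
1001,2008-01-15,195.00
1001,2009-01-12,195.00
1001,2010-01-20,210.00
1002,2009-03-02,195.00
1003,2010-02-11,210.00
"""


def load_dues(conn, handle):
    """Load a delimited dues file into a local table, away from production."""
    conn.execute("CREATE TABLE dues (member_id TEXT, paid_on TEXT, amount REAL)")
    rows = [
        (r["member_id"], r["paid_on"], float(r["amount"]))
        for r in csv.DictReader(handle)
    ]
    conn.executemany("INSERT INTO dues VALUES (?, ?, ?)", rows)
    return len(rows)


def dues_by_year(conn):
    """Ad hoc query: paying members and dues revenue per calendar year."""
    query = """
        SELECT substr(paid_on, 1, 4) AS year,
               COUNT(DISTINCT member_id) AS members,
               SUM(amount) AS revenue
        FROM dues
        GROUP BY year
        ORDER BY year
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")  # a local snapshot, not the live AMS database
loaded = load_dues(conn, io.StringIO(SAMPLE_EXPORT))
for year, members, revenue in dues_by_year(conn):
    print(year, members, revenue)
```

Once the snapshot is loaded, each new question is just another query against the local copy, so nobody has to file a request with IT or touch the production server.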

The sad part of this is that the technology is actually ancient. It's been this cheap for at least the almost 25 years I've worked with these systems. At first nobody believed it was possible because we were using AS400 mini-computers ('mini' really just meant 'not a mainframe': they weighed 1,000 pounds and were sometimes fed by magnetic tape reels or cartridges). But once you turn the corner on this, in 1987 or 2010, seeing is believing. It's a simple process and I promise you'll never miss your 80-page reports again, nor do you have to pay $50,000 for 'data integration' or other support to integrate Cognos, Crystal Reports, or any other tool du jour. And after doing it inside several associations, I also never ran into the problem of being regarded as the nerdy 'stapler guy' once we proved the ease and power of real, daily data mining.


  1. You make it sound easy Kevin ... easier than getting into the thing called social media! Isn't it interesting that people talk more about social media than data mining. Thanks for opening the conversation gate!

  2. Sadly, it IS easy. Most staff could learn more from three hours with a downloaded spreadsheet of their iMIS data than they could wading through a week of tweets. The sad part is that the latter would feel fun for many of us while the former sounds like torture. It's always funny for me to hear that younger staff & people are not technology-phobic and then to go to a technology conference and see 12 sessions on social media & 1 on data mining, when we've had systematic data for about 30 years and social media effectively for 2-3 years. Most of us are ahead of the curve on the latter & never left the starting gate on the former. I was never sure why, except that the latter feels easier and human nature always takes the path of least resistance.