DM: teaching data mining
From: Joseph Albert Brady
Date: Sat, 20 Dec 1997 10:05:44 -0500 (EST)

I had written to this list asking for help (see below). I received many potentially helpful replies, and I thank all those who responded. List readers may be interested in the replies, which are excerpted below. I have stripped the senders' names from the private correspondence. Company names were not stripped away in all cases.

My questions were:

I teach MIS subjects in the University of Delaware's College of Business. We teach fundamental database topics to our students. We'd like to move past that stuff and have our students do some data mining work. So far, I have not been able to figure out how to do that economically. Three questions:

1. Are any of you aware of good introductory texts on data mining that include data mining software?
2. Are any of you aware of public domain data mining software? Failing that, very reasonably priced data mining software?
3. Are any of you aware of any industrial strength data mining packages that would be deeply discounted for educational institution use?

Excerpted replies follow:

***

Cognos Inc. has a reasonably priced Business Intelligence Suite (Impromptu, PowerPlay, and Scenario) which might meet your needs. I am not sure if there is an educational discount, but you can contact the sales department and ask about it.

***

If you are interested in OLAP, I would suggest that you look at PowerPlay. If you want to set up an advanced graphical query environment, we offer a tool called Impromptu. If you are looking for real data mining, then try Scenario. Each of these packages is very sophisticated, flexible, and easy to use.

Scenario: Designed for spotting patterns and exceptions in business data that might otherwise be missed, Scenario's sophisticated interface allows users to readily visualize the business information being uncovered.
It automates the discovery and ranking of critical factors impacting a business, exposes hidden relationships between factors, and establishes thresholds and benchmarks. An intuitive, cost-effective desktop tool, Scenario liberates data mining from what is typically an expensive and time-consuming process. Insights derived using Scenario are achieved directly by those best positioned to use the knowledge and effect rapid change. Scenario 1.0 runs on Windows 95 and Windows NT and requires an IBM-compatible 486 PC and 8 MB of RAM.

http://www.cognos.com/busintell/products/scenario_overview.html

Point your browser to www.cognos.com for more information about our software.

***

A good site to check out for software (free and commercial) is http://www.kdnuggets.com
It also has links to conferences, data and other related sites.

***

You may consider WizSoft's data mining products --
(1) WizWhy for revealing rules and issuing predictions
(2) WizRule for revealing rules and discovering errors.

Both products --
1. Read ASCII, dBase, Access and any ODBC-compliant database
2. Reveal ALL if-then rules (association rules)
3. Reveal mathematical formula rules
4. Run on Win 95 / Win NT

Both products have full working demo versions that are limited by the number of records (WizWhy - 250 records, WizRule - 1,000 records). These demo versions may be used to teach data mining.

The prices of the full versions are --
WizWhy - $3,995
WizRule - $1,395

However, as an educational institution you are entitled to a 90% discount, and so in your case the prices are:
WizWhy - $399
WizRule - $139

***

Predictive Data Mining by Sholom Weiss and a co-author (sorry, I do not have the book with me) is the best book I've read so far. It also comes with software (though I have not tried it!)

Check out www.kdnuggets.com for software. Probably most vendors will discount for academic use.
I've heard of Silicon Graphics giving its MineSet software away free for academic use. I also know that Darwin has a 90% discount for academic use (but that's 90% of $50K!)

***

Re 3. SAS costs educational institutions about 10% of the cost for industrial organisations.

***

We have an economical solution for your entire campus. For a $10,000/year license fee we offer a campus a universal site license -- ALL campus machines, UNIX or PC, copies for ALL faculty and staff, no limit.

Each PC licensee is required to purchase the documentation and disks from the campus bookstore (about $60 wholesale -- the bookstore will mark up).

A single point of contact for tech support on campus is required (we will offer several training sessions per year for campus tech support folks here in San Diego at no charge for the class).

This is a brand new program. If you can enlist the computer center and the departments which have an interest (computer science, business IT, statistics, economics, medical school, any department with an applied stat component) you might be able to get this one going.

A similar program offered to a large mid-western campus was financed by the bookstore charging $125 for the package and returning $50 to the university to cover the site license. They had about 300 registrants in the first year.
Salford Systems
8880 Rio San Diego Dr, Suite 1045
San Diego, CA 92108
FAX (619) 543 8888
VOICE (619) 543-8880
email: dstein@salford-systems.com
web: www.salford-systems.com

Developers of CART® (tm) for Windows, DOS, MacOS, Unix
Comprehensive Statistical Consulting and Database Services
Database Mining Solutions
Discrete Choice Experiment Design & Analysis

***

For a data mining tool that uses rule induction, visit http://www.azmy.com

You can download SuperQuery and try it for 7 days. The Office Edition of SuperQuery costs only $49.95! And the Discovery Edition is $449.95. You can download only one edition per PC.

SuperQuery is a new professional commercial product that is very reasonably priced.

There is a white paper, http://www.azmy.com/wp1.htm, that explains rule induction and the principles behind the Inference Engine in SuperQuery.

Let me know your comments after you download SuperQuery and try it. It contains examples and a step-by-step tutorial.

AZMY Thinkware, Inc.
1450 Palisade Ave. #M1D
Fort Lee, NJ 07024
http://www.azmy.com
201 947 1881

***

Dear data mining and database marketing instructors,

Our company, Megaputer Intelligence, is a world leader in providing data mining software and solutions. Megaputer is the developer of PolyAnalyst, one of the most popular and powerful data mining systems on the market.

I would like to offer you the cooperation of Megaputer Intelligence in educating students, as well as the broad business community, about the new opportunities opened by the introduction of automated machine learning technology in the fields of database marketing, risk analysis, quality control, etc.
The objective of this offer is not to sell a huge number of copies of the software, but rather to become a part of the educational process. We are ready to discuss any form of cooperation. We have special, very low educational rates for PolyAnalyst for universities that become our partners.

As an example, PolyAnalyst is being used at the Kelley School of Business at Indiana University as the main data mining tool for a course in database marketing. Megaputer is a provider of data mining solutions for The Center for Education and Research in Retailing at IU, sponsored by Sears.

The Megaputer team could furnish its thorough expertise in data mining, carry out some sample data exploration projects, and provide you with the latest version of the next generation data mining solution, PolyAnalyst, at a special introductory educational rate. In addition, a FREE evaluation copy of PolyAnalyst 3.2 is available for downloading from http://www.megaputer.ru

PolyAnalyst represents a technological breakthrough in the field of knowledge discovery in databases. The system automatically discovers the EXPLICIT SYMBOLIC FORM OF RELATIONS hidden in data.

As a first step of the proposed cooperation, please visit our website to learn more about PolyAnalyst and its applications, and download the tutorial and the program itself. Next we could discuss what joint efforts we are ready to undertake for promoting the new leading edge technology.

Megaputer Intelligence, USA
http://www.megaputer.ru
812-325-3026 tel (not available 12/19/97 - 01/14/98)
812-339-1646 FAX
mailto:megaputers@aol.com or megaputer@glas.apc.org

***

One of the better books I have seen on the subject just came out. It is called "Predictive Data Mining: A Practical Approach". It was written by Sholom M. Weiss & Nitin Indurkhya. ISBN 1-55860-478-2. Morgan Kaufmann publishes it.

The nice thing about it is that you can order it with a software option.
They have a bunch of command line tools for neural networks, decision trees, and association rules, along with some data reduction techniques.

I just bought the book last week with the software option. The book is $39.95 and the codes to download the software are $24.95.

You can reach the publisher at 1-800-745-7323 or you can look at the book's website at: http://www.data-miner.com

***

See Snob on my mixture modelling page.

Dept of Computer Science, Monash University, Clayton, Victoria 3168, Australia
dld@cs.monash.edu.au
Fax: +61 3 9905-5146
http://www.cs.monash.edu.au/~dld/
http://www.cs.monash.edu.au/~dld/Snob.html
http://www.cs.monash.edu.au/~dld/mixture.modelling.page.html

***

I know some good texts:
1. "Computer Systems That Learn" by Weiss and Kulikowski
2. "Machine Learning" by Tom Mitchell
3. "Predictive Data Mining" by Weiss and Indurkhya

***

There is a new book out:

Data Mining: A Hands On Approach for Business Professionals
by Robert Groth
Published by Prentice Hall PTR in their Data Warehouse Institute series.

It is written at a relatively elementary level but gives a good overview. The most interesting piece is the inclusion of a CD-ROM with three commercial products in student versions:

Data Mind
Angoss Knowledge Seeker
Neural Network Predict

The ISBN is 013-756412-0

---

The best source of ongoing information about KDD is the KDD newsletter. It contains announcements of both free and low cost software. The following is copied from this newsletter. If you don't subscribe, you should. It is archived, so you can go into the files and get back issues.

Knowledge Discovery Nuggets (tm) is a free electronic newsletter for the Data Mining and Knowledge Discovery community, focusing on the latest research and applications.

Submissions are most welcome and should be emailed, with a DESCRIPTIVE subject line (and a URL), to gps@kdnuggets.com. Please keep CFP and meetings announcements short and provide a URL for details.
To subscribe, see http://www.kdnuggets.com/subscribe.html

***

I work for ISL, the producers and suppliers of Clementine. I know that we offer substantial educational discounts, but I'm not sure what the situation is in the US. You may find it worthwhile to contact our US office:

ISL Decision Systems Inc
630 Freedom Business Center
Suite 314
King of Prussia PA 19406
Contact: Frank V. Borrelli
Tel +1 610 768 7725
Fax +1 610 768 7774
Email: isldsi@isl.co.uk

***

One interesting book which I am currently reviewing for "PC AI" magazine: "Predictive Data Mining", co-authored by Sholom Weiss, published by Morgan Kaufmann (approx. $35). I would recommend either that book or "Computer Systems That Learn" by Weiss and Kulikowski, easily a classic in the literature, but very readable.

The cheapest usable software of which I am aware is DMSK ("Data Mining Software Toolkit"), weighing in at approx. $25, which is a disk companion to "Predictive Data Mining" from Morgan Kaufmann. The interface is command-line driven, so it's not the most inviting software, especially for a generation that has never used DOS, but it is not that bad. The underlying modeling algorithms are fairly capable and the whole thing runs off of data files in text format, so it should be capable of handling very large data sets. DMSK provides a nice mix of data mining technologies (many commercial tools concentrate on one or two), including neural networks, decision tree induction, rule induction, clustering, text mining, association rule discovery and a variety of data preparation methods. There is a Web site for the book and software, which should not be too hard to find.

Barring DMSK, I think your next stop on the price scale would be roughly at $200, where one can purchase BrainMaker from California Scientific Software. This is a pretty nice neural network package. I use the $795 BrainMaker Professional version for professional work, so I can vouch for this tool.
Some companies do offer either an academic discount or some sort of site license, but you'd have to contact vendors directly for that information. I will check with some friends of mine at Unica (who make PRW ["Pattern Recognition Workbench"]) to see what sort of deal they might be willing to make.

***

MLC++ can be used freely for research purposes, such as your course.
http://www.sgi.com/Technology/mlc/
Compiled versions are available for SGI, SUN, NT. Source is available, but it's not trivial to compile.

MineSet from Silicon Graphics is under a varsity agreement that makes it extremely cheap for universities ($20,000 otherwise). It requires Silicon Graphics hardware. See mineset.sgi.com/ under more information.

***

Thinking Machines has a 90% educational discount program. That would make our Darwin data mining software (Windows NT/95 client, UNIX server; parallel algorithms, neural net, CART® and k-nearest neighbor), regularly $50K, cost the university only $5K, a real value.

Thinking Machines Corporation
16 New England Executive Park
Burlington, MA 01803
phone: 781.238.3418
fax: 781.238.3440
web: http://www.think.com

***

Our book, "Data Mining Techniques for Marketing, Sales, and Customer Support" (John Wiley, ISBN 0-471-17980-9), covers data mining from both the business perspective and various algorithms, with case studies. It is being used for similar courses at Rice and UBC.

A course using our book is also being taught at the business school of Dalhousie University. The syllabus is on the web at http://ttg.sba.dal.ca/Courses/mba6522/ complete with homework assignments and everything.