Thursday, March 11, 2010

SQL Server Data Mining Code Posting Digest

As promised, here is a digest of my old blog’s postings about coding with respect to SQL Server Data Mining.  I actually thought there would be a lot more, but it turns out that most of my evangelizing in that space ended up as Tips & Tricks on – what do you think?  Should I digest those as well?

Anyway, here are the relevant postings from “old blog” delivered to you on my new blog.

DMX Queries – The Datasource Hole – this is probably the most important coding post.  This post provides the source code for a stored procedure to allow you to create datasources from a DMX call, which are required in order to query external data.  Since almost all data you would mine is external, this is pretty important!

Tree Utilities in Analysis Services Stored Procedures – this post provides a set of stored procedures for getting a variety of information from tree models, for example, the shortest path, longest path, etc.  Neat stuff that I used to help reduce the size of a gargantuan online questionnaire.

The amazing flexibility of DMX Table Valued Parameters – this post shows how table-valued parameters were meant to be done and how you can use them.  No offense to the SQL relational engine – natch.

Automatic Generation of CREATE MINING MODEL statements – this post shows how to generate the DMX for a CREATE MINING MODEL statement given the model.  This is particularly useful, for example, when the model was created with BI Dev Studio or some other interface that uses XMLA.

The next set of links isn’t my own code, but references to other people’s great work in adding to the SQL Server Data Mining experience:

Support Vector Machines for SQL Server Data Mining – A reference SVM plug-in implementation available on CodePlex by Joris Valkonet

Visual Numerics integration into SQL Server Data Mining – A great whitepaper by Visual Numerics discussing the C# plug-in algorithm model

Automatically Labeling Clusters Using Analysis Services Stored Procedures – another CodePlex project – this time from furmangg – giving sprocs containing some cluster labeling heuristics


So that’s it for this digest and I think I’ve covered the most important posts – maybe next I’ll create a digest of the fluff pieces?  Let me know what you think….
