Showing posts with label mining. Show all posts

Sunday, March 11, 2012

custom plugin error: how to pass information to Mining Model Viewer?

Good morning,

As I said in other topics, I'm writing a clustering plug-in for text mining. I'm running into many problems and, with your help, solving them one by one.

First of all, thanks a lot again.

I've written a clustering function that is working very well. For now, though, I only export its results to a log file that I use as an algorithm trace for debugging.

My clustering method returns a vector that records which cluster each register belongs to. For instance:

vector[0] = 1 -> The register of index 0 belongs to cluster 1.

vector[1] = 9 -> The register of index 1 belongs to cluster 9.

vector[2] = 2 -> The register of index 2 belongs to cluster 2.

...

And so on.

However, I know that none of the navigation methods accepts a structure like the one described above. I only use it to log the results while debugging the algorithm.

So how can I pass this information (which register, or test case, belongs to which cluster) to the navigator?

Thanks a lot again; any help would be much appreciated.

Actually, you can use the GetNodeDistribution method of the navigator, then use the ATTRIBUTE_NAME and ATTRIBUTE_VALUE properties of a distribution row to present, say, the register index and the cluster. Effectively, NODE_DISTRIBUTION is a table inside a content node.

The tutorial for managed plug-in algorithms (as well as the sample) explains how to add custom rows to a NODE_DISTRIBUTION (rows that are not attribute/value pairs). The information is useful even if you use the C++ infrastructure.
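
As a rough illustration of the idea above (not taken from the tutorial), the assignment vector from the question could first be regrouped per cluster, so that each cluster's content node can hand back its own list of name/value rows. The sketch below is plain C#. `DistributionRow` and `ClusterRows` are hypothetical names invented here, not types from the plug-in framework, and mapping these rows onto the framework's real distribution objects is left to the tutorial.

```csharp
using System.Collections.Generic;
using System.Globalization;

// Hypothetical row type for illustration only; NOT a plug-in framework type.
// It mirrors the ATTRIBUTE_NAME / ATTRIBUTE_VALUE pair of a distribution row.
public sealed class DistributionRow
{
    public DistributionRow(string attributeName, string attributeValue)
    {
        AttributeName = attributeName;
        AttributeValue = attributeValue;
    }

    public string AttributeName { get; }
    public string AttributeValue { get; }
}

public static class ClusterRows
{
    // assignments[i] is the cluster id of the register at index i.
    // Returns, for each cluster id, the rows describing its members:
    // name = register index, value = cluster id.
    public static SortedDictionary<int, List<DistributionRow>> Build(int[] assignments)
    {
        var rowsByCluster = new SortedDictionary<int, List<DistributionRow>>();
        for (int index = 0; index < assignments.Length; index++)
        {
            int cluster = assignments[index];
            List<DistributionRow> rows;
            if (!rowsByCluster.TryGetValue(cluster, out rows))
            {
                rows = new List<DistributionRow>();
                rowsByCluster[cluster] = rows;
            }
            rows.Add(new DistributionRow(
                index.ToString(CultureInfo.InvariantCulture),
                cluster.ToString(CultureInfo.InvariantCulture)));
        }
        return rowsByCluster;
    }
}
```

With the three values from the question, `ClusterRows.Build(new[] { 1, 9, 2 })` gives three clusters with one row each; the row under cluster 9 has name "1" and value "9". The navigator for a given cluster node would then only need to look up that cluster's list.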

Wednesday, March 7, 2012

Custom Data Mining Functions

I would like to write a custom mining function, which takes a string, queries the database, and returns an answer based upon those queries. So the basic function is then:

[MiningFunction("Performs Foo")]
public string Foo(string param)
{
    // process parameters
    // query database
    // calculate answer from query results
    string answer = string.Empty; // placeholder for the computed answer

    // return the answer
    return answer;
}

And is executed from the client using:

SELECT Foo("X Y Z") FROM FooModel

This arrangement is so that resource-intensive calculations are performed server-side.

My question is: what is the preferred method for executing the database query from within the custom mining function?

Custom mining functions are not really designed for this kind of operation. They are intended for predictive features that are related to the mining model, and typically such operations do not need external access (such as a database query). I assume that the calculation part of your function will use some information from the mining model and apply it to the database query results.

I think you should use a stored procedure. Inside the stored procedure, you should use the server-side object model (add a reference to Microsoft.AnalysisServices.AdomdServer). With the server-side object model, you can perform the following operations:

- use AdomdCommand to execute calls such as CALL SystemOpenQuery(DataSource, Query), which is the recommended way of querying a relational database from Analysis Services

- also use AdomdCommand to execute calls such as SELECT .... FROM YourModel PREDICTION JOIN OPENQUERY(DataSource, Query)

This would allow you to get the information from the database together with a score for each row (the score being computed as a prediction from your model). Your code could then use the results and perform aggregations or more complex computations on them.
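
The aggregation step mentioned above could look roughly like the following. It is only a sketch under assumptions: it supposes that the reader you get back from executing the command can be consumed through the standard `System.Data.IDataReader` interface, and that the score sits in a column whose position you know. `ScoreAggregation` and `AverageColumn` are made-up names for illustration.

```csharp
using System;
using System.Data;
using System.Globalization;

public static class ScoreAggregation
{
    // Averages the numeric values found in one column of a reader,
    // skipping NULLs. Returns null when no non-NULL value was read.
    // Assumption: the query results are exposed as an IDataReader.
    public static double? AverageColumn(IDataReader reader, int ordinal)
    {
        double sum = 0.0;
        int count = 0;

        while (reader.Read())
        {
            if (reader.IsDBNull(ordinal))
            {
                continue; // ignore rows without a score
            }

            sum += Convert.ToDouble(reader.GetValue(ordinal), CultureInfo.InvariantCulture);
            count++;
        }

        return count == 0 ? (double?)null : sum / count;
    }
}
```

Writing the helper against the interface rather than a concrete reader class keeps it independent of the server, so it can be tried out against an in-memory `DataTable` (via `CreateDataReader()`) before it is wired into the stored procedure.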

If, in the code of your stored procedure, you need to get extra information from your model, you can traverse the content of the mining model using the object model.

The article at http://www.sqlserverdatamining.com/DMCommunity/TipsNTricks/4264.aspx contains such a stored procedure, which requires both data and model content information, so I think it may be a good example. The data is coming directly from the model, with a drillthrough query. You can replace that query with a CALL SystemOpenQuery or prediction against an OPENQUERY statement.

Hope this helps.

Yes, this is helpful. I'm looking at the material you referenced to see if it completely answers my question. Unfortunately, I don't seem to be able to get past the logon screen at sqlserverdatamining at the moment, so I can't get the .cs example...

Do you have an account on sqlserverdatamining.com? Do you have problems logging in with your account, or with creating a new account?