Deployment of Weka models to frontline

Hi

Does anyone on the list have any experience or ideas on how best to deploy Weka models for use in an operational environment? At the moment we're thinking of refreshing and revalidating the models every month, but when it comes to interfacing with production systems we simply don't know how best to go about it. An RMI call from our applications to a little data-mining server that runs a JDBC-compliant database for querying and analysis of results? Or a direct call of Weka's classes? Help! There must be pros and cons for each option.

We're a small data- and text-mining team in a large government Ministry that offers many services to the citizens of NZ (benefits, student loans, child protection etc.), and the models we're building are designed to support the decision-making of frontline staff. We have terabytes of data in one of NZ's biggest data warehouses and we have lots of models to build, so if there is anyone out there wanting to apply their data-mining skills to real-life problems, then please send me your CV as we are hiring!

Dr. Kip Marks
CSRE Forecasting & Modelling
Ministry of Social Development
Wellington
NZ
DDI: +64-4-9163594

-------------------------------
This email and any attachments may contain information that is confidential and subject to legal privilege. If you are not the intended recipient, any use, dissemination, distribution or duplication of this email and attachments is prohibited. If you have received this email in error please notify the author immediately and erase all copies of the email and attachments. The Ministry of Social Development accepts no responsibility for changes made to this message or attachments after transmission from the Ministry.
-------------------------------
_______________________________________________
Wekalist mailing list
Send posts to: Wekalist@...
List info and subscription status: https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
|
|
|
Re: Deployment of Weka models to frontline

Hi Mark,

Pentaho offers a platform for deploying Weka models for scoring and for refreshing/rebuilding models via their Kettle ETL tool. Kettle is a streaming, process-flow style ETL engine that can interface with many data sources. There are components for Kettle that allow serialized Weka models to be loaded from disk or repositories at run time and used to score data as part of an ETL process. Similarly, another Kettle component can execute Weka Knowledge Flow processes to rebuild models. The ETL processes can be scheduled using your OS's scheduling utilities or deployed on the Pentaho BI server.

There is a white paper on this at:
http://www.pentaho.com/products/demo/data_mining_models_with_pentaho.php?asset=data-mining-models-pdf

Documentation on the Weka-related components for Kettle can be found at:
http://wiki.pentaho.com/display/DATAMINING/Pentaho+Data+Mining+Community+Documentation

Cheers,
Mark.

On 6/11/09 8:15 AM, Kip Marks wrote:
> Does anyone on the list have any experience or ideas on how best to deploy Weka models for use in an operational environment?
[snip]

--
Mark Hall
Senior Developer/Consultant, Pentaho Open Source Business Intelligence
Citadel International, Suite 340, 5950 Hazeltine National Dr., Orlando, FL 32822, USA
+64 7 348-7099 office, +64 21 399-132 mobile, +1 815 550-8637 fax, Skype: mark.andrew.hall, Yahoo: mark_andrew_hall
Download the latest release today <http://www.sourceforge.net/projects/pentaho>
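The "serialize once, load at run time, score" pattern described in this reply can be sketched with plain Java serialization. This is only an illustration under stated assumptions: `ThresholdModel`, `ModelStore` and `ModelStoreDemo` are hypothetical names invented here, and the toy model stands in for a trained Weka classifier. It does not show how the Kettle components work internally. Weka itself provides a comparable read/write pair in `weka.core.SerializationHelper`.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for a trained model; a real deployment would
// serialize and deserialize a Weka classifier instead.
class ThresholdModel implements Serializable {
    private static final long serialVersionUID = 1L;
    private final double threshold;

    ThresholdModel(double threshold) {
        this.threshold = threshold;
    }

    // Returns 1.0 when the value exceeds the threshold, otherwise 0.0.
    double score(double value) {
        return value > threshold ? 1.0 : 0.0;
    }
}

class ModelStore {
    // Writes the model to disk; this corresponds to the periodic refresh step.
    static void save(ThresholdModel model, Path file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(model);
        }
    }

    // Reads the model back at run time; this is what the scoring side does.
    static ThresholdModel load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (ThresholdModel) in.readObject();
        }
    }
}

class ModelStoreDemo {
    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("model", ".ser");
        ModelStore.save(new ThresholdModel(0.5), file);

        // The scoring process only needs the file, not the training code path.
        ThresholdModel model = ModelStore.load(file);
        System.out.println(model.score(0.7));
        Files.delete(file);
    }
}
```

Keeping the model in a file means a monthly rebuild only has to replace that file, while the scoring process stays unchanged.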
|
|
|
Re: Deployment of Weka models to frontline

2009/11/5 Kip Marks <Kip.Marks007@...>

If you are working with huge datasets, it might be more appropriate to choose one technique and to program it in a language such as C. As far as I know, Weka is only available in Java, which is an excellent programming language for many situations, but not when it comes to speed. This is a typical result of interpretation (e.g. Java) instead of compilation (e.g. C).

However, if you do choose Java (or any other programming language), I'd keep the querying load separate from the mining load. This means I'd go for at least two servers, not counting the computer/server that interacts with the end user. It also means you need a central database where features can be stored and indexed. The database then becomes the central point: it answers queries from the user, while the mining server sends its results to the database on a regular basis (e.g. in batches or once per time unit). You can then easily spread the mining work over more servers when they become available.

RMI only looks interesting when you want to develop an application for a client computer that does not have much CPU power. In that case, RMI allows you to move the load of interpreting the query results to a central server, so that the client only has to visualise. However, when many computers are connected to such a server, expect delays.

In any case, there are many possibilities, but it is more important to define what the goal is. Otherwise, you risk creating a great application that is anything but performant.

--
Thomas Debray | Theoretical Epidemiology | Julius Center | Stratenum 6.131 | University Medical Center Utrecht | P.O.Box 85500 | 3508 GA Utrecht | The Netherlands | www.juliuscenter.nl | www.thomasdebray.be | www.netstorm.be
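The split suggested in this reply (mining server writes scores in batches, user-facing queries only read them) might look roughly like the sketch below. Everything in it is an assumption made for illustration: `ScoreTable` is an in-memory stand-in for the central database, and `CaseRecord`, `BatchScorer` and the toy model are hypothetical names, not part of Weka or of the poster's system.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.ToDoubleFunction;

// Stand-in for the central database table of scores, keyed by case id.
// A real system would use an indexed table reached over JDBC.
class ScoreTable {
    private final Map<String, Double> rows = new ConcurrentHashMap<>();

    void upsert(String caseId, double score) {
        rows.put(caseId, score);
    }

    // User-facing queries read stored scores; they never run the model.
    Double lookup(String caseId) {
        return rows.get(caseId);
    }
}

// Hypothetical input record: a case id plus its feature vector.
class CaseRecord {
    final String id;
    final double[] features;

    CaseRecord(String id, double[] features) {
        this.id = id;
        this.features = features;
    }
}

// Runs on the mining server: scores a whole batch and pushes the results.
class BatchScorer {
    private final ToDoubleFunction<double[]> model;
    private final ScoreTable table;

    BatchScorer(ToDoubleFunction<double[]> model, ScoreTable table) {
        this.model = model;
        this.table = table;
    }

    // Returns the number of records written to the table.
    int scoreBatch(List<CaseRecord> batch) {
        for (CaseRecord record : batch) {
            table.upsert(record.id, model.applyAsDouble(record.features));
        }
        return batch.size();
    }
}

class BatchScorerDemo {
    public static void main(String[] args) {
        ScoreTable table = new ScoreTable();
        // Toy model (an assumption): the score is the sum of two features.
        BatchScorer scorer = new BatchScorer(f -> f[0] + f[1], table);

        scorer.scoreBatch(Arrays.asList(
                new CaseRecord("case-1", new double[] {1.0, 2.0}),
                new CaseRecord("case-2", new double[] {0.5, 0.25})));

        // The query side only touches the table.
        System.out.println(table.lookup("case-1"));
    }
}
```

Because the scorer depends only on the table, additional mining servers could run the same batch job against different slices of the data.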
|
|
|
Re: Deployment of Weka models to frontline

> If you are working with huge datasets, it might be more appropriate to
> choose one technique, and to program it in a language such as C. As far as
> I know, Weka is only available in Java, which is an excellent programming
> language for many situations, but not when it comes to speed. This is a
> typical result of interpretation (eg Java) instead of compilation (eg C).

Please stop spreading myths! Do you have a *recent* comparison reference to back up your speed claims?

Java running on your mobile phone might be interpreted, but Java running on a server will be compiled, and with modern HotSpot technology it will be optimized using immediate runtime performance feedback, i.e. it will be specifically optimized for the current problem set. In other words, it uses profiling information on the fly. The same could of course be done for C, but I am not aware of any C product actually implementing this.

http://java.sun.com/javase/technologies/hotspot/

Bernhard

---------------------------------------------------------------------
Bernhard Pfahringer, Dept. of Computer Science, University of Waikato
http://www.cs.waikato.ac.nz/~bernhard +64 7 838 4041
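One informal way to observe the runtime compilation described above is to time the same hot loop over several rounds. The sketch below is an assumption-laden illustration, not a rigorous benchmark: `WarmupDemo` and its loop sizes are invented here, timings vary by machine and JVM, and a harness such as JMH would be the appropriate tool for real measurements. On a HotSpot JVM the early rounds are often slower than the later ones, and running with the `-Xint` flag (interpreter only) gives a point of comparison.

```java
class WarmupDemo {
    // Deliberately simple hot loop: sums i*i for i in [0, n).
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        long sink = 0; // accumulated so the work cannot be discarded as unused
        for (int round = 1; round <= 10; round++) {
            long start = System.nanoTime();
            for (int rep = 0; rep < 200; rep++) {
                sink += sumOfSquares(100_000);
            }
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println("round " + round + ": " + micros + " us");
        }
        System.out.println("checksum: " + sink);
    }
}
```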
|
|
|
Re: Deployment of Weka models to frontline

On Thu, Nov 05, 2009 at 11:24:09PM +0100, Thomas Debray wrote:
[snip]
> As far as I know, Weka is only available in Java, which is an
> excellent programming language for many situations, but not when
> it comes to speed.

It has little bearing on the speed issue, though it might be of interest or news to some that Weka is "available" in any programming language that has a JVM implementation, of which there are many. I'm currently accessing Weka classes through JRuby.

--
Michael C. Harris, School of CS&IT, RMIT University
http://twofishcreative.com/michael/blog
IRC: michaeltwofish #habari
|
|
|
Re: Deployment of Weka models to frontline

2009/11/5 Bernhard Pfahringer <bernhard.pfahringer@...>

I had never heard of HotSpot. However, after some research I must admit that recent developments have put Java in a better position.

--
www.juliuscenter.nl | www.thomasdebray.be | www.netstorm.be
|
|
|
Exception when calling classifyInstance method
|
|
|
Re: Exception when calling classifyInstance method

> When I call the following method
>
>     try {
>         ((AdaBoostM1) myClassifier).classifyInstance(testing_instances.firstInstance());
>     }
>     catch (Exception e) {
>         System.out.println(" nothing ");
>     }
>
> it throws an exception -> " nothing ". I verified my classifier, it is ok. In fact, the AdaBoostM1 classifier performs only one iteration (it then stops because the error is too small).
>
> The testing_instances is empty and is created based on the dataset of the training instances. Then I add a new instance to the testing_instances and set the classValue to 0. Everything seems ok, so why does it result in an exception?
>
> Any suggestion?

Can you please post the full stacktrace of the exception? That will most likely shed some more light on the problem. Use the following code to output the stacktrace:

    try {
        ((AdaBoostM1) myClassifier).classifyInstance(testing_instances.firstInstance());
    }
    catch (Exception e) {
        // print the exception type, message and call chain instead of a fixed string
        e.printStackTrace();
    }

Cheers, Peter

--
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/ Ph. +64 (7) 858-5174
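When the stack trace has to be pasted into an email or written to a log rather than printed to the console, it can be captured as a string. The following is a minimal sketch with hypothetical names: `StackTraces.render` is a helper invented here, and `classify` merely simulates a failing call, since the real cause of the poster's exception is not known from the thread.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class StackTraces {
    // Renders the full stack trace, including any "Caused by" chain, as a string.
    static String render(Throwable t) {
        StringWriter buffer = new StringWriter();
        t.printStackTrace(new PrintWriter(buffer, true));
        return buffer.toString();
    }

    // Hypothetical stand-in for the failing classifyInstance call.
    static double classify(double[] instance) {
        if (instance == null) {
            throw new IllegalStateException("simulated failure");
        }
        return 0.0;
    }

    public static void main(String[] args) {
        try {
            classify(null);
        } catch (Exception e) {
            // Unlike printing a fixed string, this keeps the exception type,
            // the message and the line that threw.
            System.err.println(render(e));
        }
    }
}
```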