Ocrad on Windows

I'm one of the developers of SpamBayes (http://www.spambayes.org/). A
frequent source of spam these days is messages with essentially no text (or random gibberish) and one or more GIF images containing a pitch for cheap pharmaceuticals or penny stocks. I recently added code to use Ocrad to extract the text from these images:

    http://mail.python.org/pipermail/spambayes-dev/2006-August/003697.html
    http://mail.python.org/pipermail/spambayes-dev/2006-August/003699.html
    http://mail.python.org/pipermail/spambayes-dev/2006-August/003715.html

Even though ocrad doesn't do a great job at extracting human-readable text from these images, it does a good enough job, and I expect it will get better over time. For this technique to be broadly useful in the SpamBayes community, it will need to be available on Windows. A couple of developers have compiled ocrad on Windows using cygwin with one small code change ("std::fprintf" -> "fprintf"). Can we distribute that executable on the SpamBayes SF site (or convince you to do so) so that we can get Windows users to test out my new additions?

Related to that, is there any interest in making an OCR library which can be linked into other applications instead of requiring the program to be run?

Thanks,

--
Skip Montanaro - skip@... - http://www.mojam.com/
"On the academic side, effort is too often expended on finding precise answers to the wrong questions." Baxter & Rennie, in "Financial Calculus"

_______________________________________________
Bug-ocrad mailing list
Bug-ocrad@...
http://lists.gnu.org/mailman/listinfo/bug-ocrad
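For readers following along, the technique described in this message (run Ocrad on the spam image, then hand whatever text comes out to the classifier) might look roughly like the sketch below. This is not the SpamBayes code from the linked posts. The function names and the `ocr:` token prefix are hypothetical, and the sketch assumes the GIF has already been converted to a PNM file (pbm/pgm/ppm), which is the input format Ocrad reads.

```python
import re
import subprocess


def ocr_image(pnm_path, run=subprocess.run):
    """Run the ocrad executable on one PNM file and return its text output.

    Assumes an `ocrad` binary is on PATH. Returns an empty string if
    ocrad exits with a non-zero status.
    """
    result = run(["ocrad", pnm_path], capture_output=True, text=True, check=False)
    return result.stdout if result.returncode == 0 else ""


def ocr_tokens(text):
    """Turn noisy OCR output into classifier tokens (hypothetical scheme).

    Keeps lowercase runs of three or more letters and prefixes them with
    'ocr:' so they stay distinct from tokens found in the message body.
    """
    return ["ocr:" + word for word in re.findall(r"[a-z]{3,}", text.lower())]
```

Because the recognized text only has to be good enough to produce a few recurring tokens, imperfect OCR output can still be useful to a statistical filter. The `run` parameter exists only so the external call can be replaced in tests.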
Re: Ocrad on Windows

skip@... wrote:
> For this technique to be broadly useful in the SpamBayes
> community, it will need to be available on Windows. A couple developers
> have compiled ocrad on Windows using cygwin with one small code change
> ("std::fprintf" -> "fprintf"). Can we distribute that executable on the
> SpamBayes SF site (or convince you to do so) so that we can get Windows
> users to test out my new additions?

Of course you may distribute the executable, as long as you also distribute the modified source as required by the GPL.

I have never tried ocrad with spam images, but I suppose they are created to be seen on a monitor, and perhaps the text size is too small for ocrad. Did you try to enlarge the images with the --scale option?

> Related to that, is there any interest in making an OCR library which can be
> linked into other applications instead of requiring the program to be run?

Short answer, no. Ocrad is currently too experimental to develop a consistent library interface based on it.

Best regards,
Antonio.
Re: Ocrad on Windows

    Antonio> skip@... wrote:
    >> For this technique to be broadly useful in the SpamBayes community,
    >> it will need to be available on Windows. A couple developers have
    >> compiled ocrad on Windows using cygwin with one small code change
    >> ("std::fprintf" -> "fprintf"). Can we distribute that executable on
    >> the SpamBayes SF site (or convince you to do so) so that we can get
    >> Windows users to test out my new additions?

    Antonio> Of course you may distribute the executable, as long as you
    Antonio> also distribute the modified source as required by the GPL.

Cool. Once we're set up I'll send you a pointer as a courtesy. We have no intention of forking Ocrad; we just want to make it available so Windows users can help us test recent changes to SpamBayes' scoring.

    Antonio> I have never tried ocrad with spam images, but I suppose they
    Antonio> are created to be seen on a monitor, and perhaps the text size
    Antonio> is too small for ocrad. Did you try to enlarge images with the
    Antonio> --scale option?

So far I've only run it with no command line args. A quick check with one image I have lying about suggests that scaling up by two or three should help recognition a bit. I'll do some more tests with it. Thanks for the suggestion.

    >> Related to that, is there any interest in making an OCR library which
    >> can be linked into other applications instead of requiring the
    >> program to be run?

    Antonio> Short answer, no. Ocrad is currently too experimental to
    Antonio> develop a consistent library interface based on it.

Not a problem. We're in the early stages of our endeavour as well.

Skip
Re: Ocrad on Windows

    Antonio> Of course you may distribute the executable, as long as you
    Antonio> also distribute the modified source as required by the GPL.

Done. I added a release named "ocrad-cygwin" here:

    http://sourceforge.net/project/showfiles.php?group_id=61702

It's a simple zip file containing Ocrad 0.15, the compiled ocrad.exe file and a patch for the source.

    Antonio> I have never tried ocrad with spam images, but I suppose they
    Antonio> are created to be seen on a monitor, and perhaps the text size
    Antonio> is too small for ocrad. Did you try to enlarge images with the
    Antonio> --scale option?

Scaling by a factor of two helped quite a bit. Thanks again for the suggestion.

Skip
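The kind of comparison reported here (no scaling versus scaling by two or three) could be automated along the lines of the sketch below. It is only an illustration, not what was actually run in this thread. The scoring heuristic, which counts runs of three or more letters, is an assumption, as are the function names; the only Ocrad feature relied on is the --scale option mentioned above, passed as `--scale=N`. The input is assumed to be a PNM file and `ocrad` is assumed to be on PATH.

```python
import re
import subprocess

WORD = re.compile(r"[A-Za-z]{3,}")


def run_ocrad(pnm_path, scale, run=subprocess.run):
    """Run ocrad on a PNM file, enlarging it by `scale` when scale > 1."""
    cmd = ["ocrad"]
    if scale > 1:
        cmd.append("--scale=%d" % scale)
    cmd.append(pnm_path)
    result = run(cmd, capture_output=True, text=True, check=False)
    return result.stdout if result.returncode == 0 else ""


def word_count(text):
    """Crude recognition score: number of runs of 3+ letters (assumed heuristic)."""
    return len(WORD.findall(text))


def best_scale(pnm_path, scales=(1, 2, 3), run=subprocess.run):
    """Return (scale, score) for the factor whose output scores highest.

    On a tie the earliest entry in `scales` wins, so smaller factors
    are preferred when listed first.
    """
    scored = [(word_count(run_ocrad(pnm_path, s, run)), s) for s in scales]
    score, scale = max(scored, key=lambda pair: pair[0])
    return scale, score
```

A word count is a blunt measure and will reward gibberish that happens to look like words, so it is best treated as a quick way to pick a default factor for a sample of images rather than as a measure of OCR accuracy.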