0.18-pre2

Hi, Antonio,
We upgraded to 0.18-pre2 yesterday, and as a first, quick impression, we are running into a lot of trouble with recognition of the serif lower-case 't'. It comes out as lower-case 'l' in too many cases. If I zoom in on the images, I can clearly see the features of the 't', such as the horizontal stroke clearly below the height of the letter, the round, upwards-pointing right-hand lower serif, etc.

I'll send you the files in a separate mail.

Uwe
_______________________________________________
Bug-ocrad mailing list
Bug-ocrad@...
http://lists.gnu.org/mailman/listinfo/bug-ocrad
Re: 0.18-pre2

Hello Uwe,
Uwe Dippel wrote:
> We have upgraded to 0.18-pre2 yesterday, and as a first, fast,
> impression, we encounter a bunch of troubles at recognizing the serif
> lower-case 't'. It comes out as lower-case 'l' in too many cases.

I have tried the option --scale=2 and it recognizes all the t's correctly. Keep in mind that 't' is a difficult letter, and the faxes you feed to ocrad are not the easiest images to recognize.

Regards,
Antonio.
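A minimal sketch of the invocation described above, wrapped in Python so the scale factor is easy to vary. Only the --scale option comes from the thread; the function names and the file name fax.pbm are hypothetical, and the sketch assumes ocrad is on the PATH and writes the recognized text to standard output.

```python
import subprocess

def ocrad_command(image_path, scale=2):
    """Build the ocrad command line; --scale is the option named in the thread."""
    return ["ocrad", f"--scale={scale}", image_path]

def recognize(image_path, scale=2):
    """Run ocrad and return its standard output (assumes ocrad is installed)."""
    result = subprocess.run(
        ocrad_command(image_path, scale),
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# Hypothetical usage: text = recognize("fax.pbm", scale=2)
```

Keeping the command construction separate from the call makes it simple to compare several scale factors on the same image.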
Re: 0.18-pre2

Antonio Diaz Diaz wrote:
> I have tried the option --scale=2 and it recognizes all the t's
> correctly. Think that 't' is a difficult letter, and the faxes you feed
> to ocrad are not the easiest images to recognize.

I know, I know. :)

I never thought scaling would be an advantage. Does it do some interpolation? (I read the info pages, but didn't find much.) It would be good to have a short write-up on what to do to increase accuracy. I have tried some other options, but rather blindly, just playing around. Different character sets, for example, have not resulted in any improvement.

Is there any particular reason why you tried scaling by a factor of 2?

And one more question: is there a way to scale with different factors for x and y? I am asking because we are doing OCR on faxes, and 'normal' resolution for faxes means 204x98 dpi.

Thanks again,
Uwe
Re: 0.18-pre2

Uwe Dippel wrote:
> I never thought scaling would be of an advantage? Does it do some
> interpolation? (I read the info pages, but didn't find much.)

Scaling does some interpolation and rounding. It is very useful when letters are not big enough.

> Any good reason, why you tried scaling at a factor of 2?

Well, it is the first thing I try when I find small or poorly defined letters, because it often improves things.

> And still another one: is there a chance to scale with different factors
> for x and y? I am asking, since we are doing OCR on faxes, and 'normal'
> at faxes means 204x98 dpi.

Ocrad can't currently scale by a different factor along each axis, and I can't remember a case where this could help. But perhaps you could find one. :)

Regards,
Antonio.
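The fax case comes down to simple arithmetic: at 204x98 dpi each pixel is roughly twice as tall as it is wide, so the image needs a vertical stretch of about 2 to restore the aspect ratio. The sketch below only illustrates that calculation and a plain row-doubling stretch done outside ocrad; the function names are hypothetical, and this is not how ocrad itself scales.

```python
def stretch_factor(x_dpi, y_dpi):
    """Vertical stretch needed to make the pixels square."""
    return x_dpi / y_dpi

def stretch_rows(bitmap, factor):
    """Repeat every row `factor` times (nearest neighbour, no smoothing)."""
    return [list(row) for row in bitmap for _ in range(factor)]

# 204 / 98 is about 2.08, so an integer factor of 2 is a close approximation.
factor = round(stretch_factor(204, 98))

small = [[0, 1, 0],
         [1, 1, 1]]
tall = stretch_rows(small, factor)   # each row now appears twice
```

Row duplication leaves jagged edges, so a smoothing pass before recognition may still be worthwhile.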
Re: 0.18-pre2

On Sat, Mar 29, 2008 at 8:08 AM, Antonio Diaz Diaz <ant_diaz@...> wrote:
> > Any good reason, why you tried scaling at a factor of 2?
>
> Well, it is the first thing I try when I find small or not well defined
> letters, because many times improves things.

Interesting, thanks. Scaling by 2 gives a number of better results. Unfortunately, the scaling done by ocrad has turned a few 'l' (lower-case L) into '}'. So I fell back to pbmpscale (N=2), which seems to do a good job. But scaling really *is* a great advantage!

Does ocrad allow setting a 'dictionary' of valid characters? What I mean is something like declaring the valid characters to be uppercase, lowercase, digits, punctuation, or alphanumeric, or passing a string containing all valid characters. In some cases, this can be used to increase accuracy as well. In our application, for example, we use low resolution, but we only need alphanumerics, '.', '_' and '-'. Any ambiguous character could fall back to one of these.

> > And still another one: is there a chance to scale with different factors
> > for x and y? I am asking, since we are doing OCR on faxes, and 'normal'
> > at faxes means 204x98 dpi.
>
> Ocrad can't currently scale differently in both axes, and I can't
> remember a case where this could help. But perhaps you could fin one. :)

See above. I have now inserted pnmstretch -yscale=2, followed by pnmsmooth. It is still not optimal, but at least the aspect ratio is correct.

Uwe
Re: 0.18-pre2

Uwe Dippel wrote:
> Does ocrad allow to set a 'dictionary' of valid characters?

No. The closest thing to your needs that ocrad currently provides is "--charset=ascii".

Regards,
Antonio.
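Since ocrad has no such option, one workaround is to filter the recognized text afterwards. The following is only a sketch of that idea, not an ocrad feature: the allowed set matches the characters named in the thread (alphanumerics, '.', '_', '-'), while the fallback table and all names are hypothetical. Only the '}' for 'l' confusion was actually reported; the '|' entry is an assumed example.

```python
import string

# Characters the application accepts, as listed in the thread.
ALLOWED = set(string.ascii_letters + string.digits + "._-")

# Hypothetical confusion table: '}' for 'l' was reported in the thread,
# '|' for 'l' is an assumption added for illustration.
FALLBACK = {"}": "l", "|": "l"}

def restrict(text, allowed=ALLOWED, fallback=FALLBACK, unknown="?"):
    """Map every character outside the allowed set to a fallback or a marker."""
    out = []
    for ch in text:
        if ch in allowed or ch.isspace():
            out.append(ch)                         # valid character or whitespace
        else:
            out.append(fallback.get(ch, unknown))  # known confusion or marker
    return "".join(out)

cleaned = restrict("fi}e_01.txt")   # the '}' is mapped back to 'l'
```

Characters with no known substitute are replaced by a visible marker rather than dropped, so doubtful spots can be reviewed by hand.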