0.18-pre2

0.18-pre2

by Uwe Dippel

Hi, Antonio,

We upgraded to 0.18-pre2 yesterday, and as a first, quick impression,
we are having a lot of trouble recognizing the serif lower-case 't'.
It comes out as lower-case 'l' in too many cases.

If I zoom in on the images, I can clearly see the features of the 't',
like the horizontal stroke clearly below the height of the letter, the
round, upwards-pointing right-hand lower serif, etc.

I'll send you the files in a separate mail.

Uwe


_______________________________________________
Bug-ocrad mailing list
Bug-ocrad@...
http://lists.gnu.org/mailman/listinfo/bug-ocrad

Re: 0.18-pre2

by Bugzilla from ant_diaz@teleline.es

Hello Uwe,

Uwe Dippel wrote:
> We upgraded to 0.18-pre2 yesterday, and as a first, quick impression,
> we are having a lot of trouble recognizing the serif lower-case 't'.
> It comes out as lower-case 'l' in too many cases.

I have tried the option --scale=2 and it recognizes all the t's
correctly. Keep in mind that 't' is a difficult letter, and the faxes
you feed to ocrad are not the easiest images to recognize.


Regards,
Antonio.
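
For anyone who wants to script the --scale=2 suggestion, here is a minimal Python sketch. It assumes ocrad is installed and on the PATH and that the input is a PNM file; the helper names ocrad_command and run_ocrad and the file name fax_page.pbm are made up for illustration and are not part of ocrad.

```python
import subprocess


def ocrad_command(image_path, scale=2):
    """Build the argument list for an ocrad run with the given scale factor."""
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    return ["ocrad", f"--scale={scale}", image_path]


def run_ocrad(image_path, scale=2):
    """Run ocrad and return the recognized text (ocrad writes it to stdout)."""
    result = subprocess.run(
        ocrad_command(image_path, scale),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # "fax_page.pbm" is a placeholder file name.
    print(ocrad_command("fax_page.pbm"))
```

Comparing the output of run_ocrad(path, 1) and run_ocrad(path, 2) on the same page is a quick way to see whether scaling helps for a given batch of faxes.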




Re: 0.18-pre2

by Uwe Dippel

Antonio Diaz Diaz wrote:

> I have tried the option --scale=2 and it recognizes all the t's
> correctly. Keep in mind that 't' is a difficult letter, and the faxes
> you feed to ocrad are not the easiest images to recognize.

I know, I know. :)

I never thought scaling would be an advantage. Does it do some
interpolation? (I read the info pages, but didn't find much.)

It would be good to have a short write-up on what to do to increase the
accuracy. I have tried some other options, but rather blindly, just
playing around. Different character sets, for example, have not resulted
in any improvement.

Any good reason why you tried scaling by a factor of 2?

And still another one: is there a way to scale with different factors
for x and y? I am asking because we are doing OCR on faxes, and 'normal'
resolution for faxes means 204x98 dpi.

Thanks again,

Uwe
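
The 204x98 dpi figure means the pixels are roughly twice as tall as they are wide, which is why a separate vertical factor comes up. A small sketch of the arithmetic, assuming only the resolution quoted above (the function name aspect_correction is hypothetical):

```python
def aspect_correction(x_dpi, y_dpi):
    """Return the vertical factor that makes pixels square, exact and rounded."""
    exact = x_dpi / y_dpi
    return exact, max(1, round(exact))


exact, rounded = aspect_correction(204, 98)
print(f"exact y factor: {exact:.3f}, nearest integer: {rounded}")
```

The exact ratio is about 2.08, so an integer vertical factor of 2 brings the image to 204x196 dpi, which is close to square pixels.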



Re: 0.18-pre2

by Bugzilla from ant_diaz@teleline.es

Uwe Dippel wrote:
> I never thought scaling would be an advantage. Does it do some
> interpolation? (I read the info pages, but didn't find much.)

Scaling does some interpolation and rounding. It is very useful when
letters are not big enough.


> Any good reason why you tried scaling by a factor of 2?

Well, it is the first thing I try when I find small or poorly defined
letters, because it often improves things.


> And still another one: is there a way to scale with different factors
> for x and y? I am asking because we are doing OCR on faxes, and 'normal'
> resolution for faxes means 204x98 dpi.

Ocrad can't currently scale by a different factor on each axis, and I
can't remember a case where this could help. But perhaps you could find
one. :)


Regards,
Antonio.



Re: 0.18-pre2

by Uwe Dippel

On Sat, Mar 29, 2008 at 8:08 AM, Antonio Diaz Diaz <ant_diaz@...> wrote:

>  > Any good reason why you tried scaling by a factor of 2?
>
>  Well, it is the first thing I try when I find small or poorly defined
>  letters, because it often improves things.

Interesting, thanks. Scaling by 2 gives better results in a number of
cases. Unfortunately, the scaling done by ocrad has turned a few
'l' (lower-case L) into '}'. So I fell back to pbmpscale (N=2), which
seems to do a good job. But scaling really *is* a great advantage!

Does ocrad allow setting a 'dictionary' of valid characters? What I
mean is something like declaring the valid characters to be uppercase,
lowercase, digits, punctuation, or alphanumeric, or passing a string
containing all valid characters. In some cases, this can be used to
increase the accuracy as well. In our application, for instance, we use
low resolution, but we only need alphanumerics, '.', '_' and '-'. Any
ambiguity could fall back to one of these.

>  > And still another one: is there a way to scale with different factors
>  > for x and y? I am asking because we are doing OCR on faxes, and 'normal'
>  > resolution for faxes means 204x98 dpi.
>
>  Ocrad can't currently scale by a different factor on each axis, and I
>  can't remember a case where this could help. But perhaps you could find
>  one. :)

See above. I have now inserted pnmstretch -yscale=2, followed by
pnmsmooth. It is still not optimal, but at least the ratio is correct.

Uwe
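
To make the idea of vertical-only scaling concrete, here is a rough Python sketch that doubles the height of a bitmap by repeating rows. This is plain nearest-neighbour duplication, not the interpolation and smoothing that the netpbm tools mentioned above perform, and the function name stretch_rows and the tiny bitmap are invented for illustration.

```python
def stretch_rows(bitmap, yscale=2):
    """Repeat each row yscale times (nearest-neighbour vertical scaling)."""
    if yscale < 1:
        raise ValueError("yscale must be a positive integer")
    return [list(row) for row in bitmap for _ in range(yscale)]


# A tiny 3x2 bitmap (1 = black, 0 = white), invented for illustration.
page = [
    [0, 1, 0],
    [1, 1, 1],
]
stretched = stretch_rows(page, 2)
for row in stretched:
    print("".join("#" if px else "." for px in row))
```

The width is untouched and the height doubles, so a 204x98 dpi fax ends up with nearly square pixels before it reaches the OCR step.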



Re: 0.18-pre2

by Bugzilla from ant_diaz@teleline.es

Uwe Dippel wrote:
> Does ocrad allow setting a 'dictionary' of valid characters?

No. The closest thing to your needs that ocrad currently provides is
"--charset=ascii".


Regards,
Antonio.
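
Since ocrad itself offers no such dictionary, one possible workaround is to filter its text output afterwards. The Python sketch below is only an illustration of that idea: the valid set matches the characters named earlier in the thread (alphanumerics, '.', '_', '-'), while the FALLBACK table and the function name restrict are assumptions. Only the '}' for 'l' confusion was actually reported here; the '|' entry is a guess.

```python
import string

# Characters the application accepts: alphanumerics plus '.', '_' and '-'.
VALID = set(string.ascii_letters + string.digits + "._-")

# Hypothetical confusion table; '}' -> 'l' is the one case reported in the
# thread, the other entry is only an example.
FALLBACK = {"}": "l", "|": "l"}


def restrict(text, valid=VALID, fallback=FALLBACK, unknown="?"):
    """Map each character outside `valid` to a fallback, or to `unknown`."""
    out = []
    for ch in text:
        if ch in valid or ch.isspace():
            out.append(ch)  # accepted characters and whitespace pass through
        else:
            out.append(fallback.get(ch, unknown))
    return "".join(out)


print(restrict("fi}e_01.txt"))
```

Characters with no known fallback are replaced by a visible marker rather than dropped, so doubtful spots can still be reviewed by hand.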

