problem with search results containing pdf files with unknown characters

by yellowtrolley

When PDF documents are indexed, graphical elements such as lines or bullets are stored as unknown characters (�). If these characters appear in a search result's excerpt, the divs rendered after the search result (e.g. the footer) are misplaced.
Has anyone faced this problem?
Pablo



_______________________________________________
This mail is sent to you from the opencms-dev mailing list
To change your list options, or to unsubscribe from the list, please visit
http://lists.opencms.org/mailman/listinfo/opencms-dev

Re: problem with search results containing pdf files with unknown characters

by Marc Johnen

Hi Pablo,

It looks like the encoding of your site is different from the encoding of the PDF.
In this case your site is probably encoded in ISO-8859-1 and the PDF in UTF-8,
or the other way round; as far as I know, that kind of mismatch is when "�" appears.
It might help to set the content-encoding property on the PDF to either
ISO-8859-1 or UTF-8, so that it matches the encoding of your site.
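
To illustrate the kind of mismatch I mean, here is a minimal plain-Java sketch. It is not OpenCms code: the class name and the cleanExcerpt helper are made up for this example, and I am assuming you can get at the excerpt as a String before it is written to the page. It shows how "�" (U+FFFD) can appear when ISO-8859-1 bytes are decoded as UTF-8, and one possible way to strip the character from an excerpt as a stopgap.

```java
import java.nio.charset.StandardCharsets;

// Illustration only; the names here are hypothetical and not part of OpenCms.
final class EncodingMismatchDemo {

    // Hypothetical helper: replaces U+FFFD with a space, then collapses
    // whitespace runs so the excerpt stays readable.
    static String cleanExcerpt(String excerpt) {
        return excerpt
                .replace('\uFFFD', ' ')
                .replaceAll("\\s+", " ")
                .trim();
    }

    public static void main(String[] args) {
        // "café au lait" encoded as ISO-8859-1: the é becomes the single byte 0xE9.
        byte[] latin1Bytes = "caf\u00e9 au lait".getBytes(StandardCharsets.ISO_8859_1);

        // Decoding those bytes as UTF-8 is the mismatch: 0xE9 followed by a
        // space is not a valid UTF-8 sequence, so the String constructor
        // substitutes the replacement character U+FFFD.
        String misdecoded = new String(latin1Bytes, StandardCharsets.UTF_8);
        System.out.println("contains U+FFFD: " + (misdecoded.indexOf('\uFFFD') >= 0));

        // Decoding with the charset the bytes were written in round-trips cleanly.
        String correct = new String(latin1Bytes, StandardCharsets.ISO_8859_1);
        System.out.println("round trip ok: " + correct.equals("caf\u00e9 au lait"));

        // Stopgap: strip the replacement character before rendering the excerpt.
        System.out.println("cleaned: " + cleanExcerpt(misdecoded));
    }
}
```

Stripping the character only hides the symptom; if the text was decoded with the wrong charset, fixing the encoding setting is the better fix.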

Greetings
Marc



