Index weightings of different types of text node (h1, h2, anchor, etc.)

by JoelGrrrr

Hi, would I be correct in thinking that Nutch, when indexing an HTML
document, does not weight the different text nodes (h1, h2, anchor, etc.)
differently, and instead lumps all the text together as one? That is
the impression I get from looking at
org.apache.nutch.parse.html.HtmlParser.

Regards,
Joel
