Re: Framesets

by Charles François Rey

Here's an example:
---
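// Bare minimum; there are lots of ways to improve how things are handled.
import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpUriRequest;
import org.apache.http.impl.client.DefaultHttpClient;
import org.htmlcleaner.DomSerializer;
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

// Fetch the frameset page with HttpClient.
DefaultHttpClient httpclient = new DefaultHttpClient();
String url = "http://www.w3schools.com/HTML/tryit.asp?filename=tryhtml_frame_cols";
HttpUriRequest request = new HttpGet(url);
HttpResponse response = httpclient.execute(request);

// Note that HtmlCleaner can download a URL itself, but let's assume
// that you need HttpClient to do it (e.g. POST request, special settings, ...).
HtmlCleaner cleaner = new HtmlCleaner();
TagNode rootNode = cleaner.clean(response.getEntity().getContent());
Document doc = new DomSerializer(cleaner.getProperties(), true).createDOM(rootNode);

// XPath is very useful when dealing with HTML/XML.
// Here we're just going to display the target URLs of the frames.
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xpath.evaluate("//frame/@src", doc, XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++) {
    System.out.println(nodes.item(i).getNodeValue());
}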
---

This example should print the "src" values of the 3 frames on the page at
the given URL, i.e.:
        frame_a.htm
        frame_b.htm
        frame_c.htm

Those are relative paths, so you would have to resolve them against the
URL of the frameset page before you can fetch them.
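
One way to do that resolution is java.net.URI from the JDK. The following is only a sketch: the class name FrameUrlResolver and the method resolveAll are made up for illustration, and it assumes the page has no <base> element and that the "src" values are already valid URI strings (URI.resolve throws IllegalArgumentException on, say, unescaped spaces).

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of HttpClient or HtmlCleaner.
class FrameUrlResolver {

    // Resolves each frame "src" value against the URL of the page it came from.
    // Assumes every src is a syntactically valid URI reference.
    static List<String> resolveAll(String pageUrl, List<String> frameSrcs) {
        URI base = URI.create(pageUrl);
        List<String> resolved = new ArrayList<String>();
        for (String src : frameSrcs) {
            // Relative references are merged with the base path;
            // absolute URLs are returned unchanged.
            resolved.add(base.resolve(src).toString());
        }
        return resolved;
    }

    public static void main(String[] args) {
        String pageUrl = "http://www.w3schools.com/HTML/tryit.asp?filename=tryhtml_frame_cols";
        List<String> srcs = Arrays.asList("frame_a.htm", "frame_b.htm", "frame_c.htm");
        for (String u : resolveAll(pageUrl, srcs)) {
            System.out.println(u); // e.g. http://www.w3schools.com/HTML/frame_a.htm
        }
    }
}
```

Each resolved URL can then be passed to new HttpGet(...) in the same way as the original page URL.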

Note on the packages used: XPath comes from the standard package
javax.xml.xpath, Document and NodeList come from the standard package
org.w3c.dom, and the HtmlCleaner library comes from
http://htmlcleaner.sourceforge.net/.
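
To try the XPath step on its own with nothing but those standard packages, here is a small sketch. It assumes the markup is already well-formed XML, which real-world HTML usually is not (that is the reason for running HtmlCleaner first). The class name FrameSrcXPathDemo and the method frameSources are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; uses only the JDK.
class FrameSrcXPathDemo {

    // Returns the "src" attribute of every <frame> element in the markup.
    // Assumes the input is well-formed XML (no HtmlCleaner pass here).
    static List<String> frameSources(String wellFormedMarkup) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(wellFormedMarkup)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate("//frame/@src", doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<String>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getNodeValue());
            }
            return result;
        } catch (Exception e) {
            // Parser or XPath failure; wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String markup = "<html><frameset cols=\"25%,50%,25%\">"
                + "<frame src=\"frame_a.htm\"/>"
                + "<frame src=\"frame_b.htm\"/>"
                + "<frame src=\"frame_c.htm\"/>"
                + "</frameset></html>";
        System.out.println(frameSources(markup));
    }
}
```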

On 4 June 09, at 17:12, Charles François Rey wrote:

> Frameset is an HTML concept. HttpClient takes care of HTTP, not HTML.
>
> That being said, it is possible to follow Framesets: just download the
> HTML file, parse it and follow the Frame definitions.
>
> If I had to do it, I'd use HttpClient to retrieve the HTML,
> HTMLCleaner to clean the HTML, and XPath to filter the Frame "src"
> attributes.
>
> On 4 June 09, at 16:44, Scott Ward wrote:
>
>> Is it possible to follow framesets using HttpClient?  I have searched
>> all over and haven't found anything, so I thought that I would try this.
>>
>> If it is, can you direct me to it in the API or show an example?
>>
>> Any help is greatly appreciated.
>> __________________________________________
>>
>> ~Sward
