Framesets


by Scott Ward

Is it possible to follow framesets using HttpClient?  I have searched all
over and haven't found anything, so I thought I would ask here.

If it is, can you direct me to it in the API or show an example?

Any help is greatly appreciated.
__________________________________________

~Sward

Re: Framesets

by Charles François Rey

Frameset is an HTML concept. HttpClient takes care of HTTP, not HTML.

That being said, it is possible to follow framesets: download the
HTML file, parse it, and follow the frame definitions.

If I had to do it, I'd use HttpClient to retrieve the HTML,
HtmlCleaner to clean the HTML, and XPath to filter the frame "src"
attributes.
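
As a rough sketch of what "following" the frame definitions could look like, here is a small recursive walk. It is not part of HttpClient: the class name FramesetWalker and the fetch hook are made up for illustration, and the pages are assumed to be well-formed markup so that the JDK's own XML parser can stand in for an HTML cleaner. In real code the fetch hook would wrap an HttpClient GET followed by a cleaning step.

```java
import java.io.StringReader;
import java.net.URI;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FramesetWalker {

    // Returns every page reachable through <frame src="..."> links, in visit order.
    // "fetch" is a hypothetical hook that maps a URL to well-formed markup
    // (or null when the page cannot be retrieved).
    static List<URI> walk(URI start, Function<URI, String> fetch) {
        List<URI> visited = new ArrayList<>();
        visit(start, fetch, visited);
        return visited;
    }

    private static void visit(URI page, Function<URI, String> fetch, List<URI> visited) {
        if (visited.contains(page)) {
            return; // guard against framesets that reference each other
        }
        visited.add(page);
        String markup = fetch.apply(page);
        if (markup == null) {
            return; // nothing to parse for this page
        }
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
            NodeList srcs = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//frame/@src", doc, XPathConstants.NODESET);
            for (int i = 0; i < srcs.getLength(); i++) {
                // frame targets are usually relative, so resolve them against the current page
                visit(page.resolve(srcs.item(i).getNodeValue()), fetch, visited);
            }
        } catch (Exception e) {
            throw new IllegalStateException("could not process " + page, e);
        }
    }

    public static void main(String[] args) {
        // In-memory stand-in for the network; the URLs are placeholders.
        Map<URI, String> pages = new HashMap<>();
        pages.put(URI.create("http://example.com/index.htm"),
                "<html><frameset><frame src=\"a.htm\"/><frame src=\"b.htm\"/></frameset></html>");
        pages.put(URI.create("http://example.com/a.htm"), "<html><body>A</body></html>");
        pages.put(URI.create("http://example.com/b.htm"), "<html><body>B</body></html>");
        for (URI u : walk(URI.create("http://example.com/index.htm"), pages::get)) {
            System.out.println(u);
        }
    }
}
```

The visited list is what keeps the walk from looping when a frame points back at a page that has already been seen.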

On 4 June 2009, at 16:44, Scott Ward wrote:

> Is it possible to follow framesets using HttpClient?  I have searched
> all over and haven't found anything so I thought that I would try this.
>
> If it is can you direct me to it in the API or show an example.
>
> Any help is greatly appreciated.
> __________________________________________
>
> ~Sward


---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-users-unsubscribe@...
For additional commands, e-mail: httpclient-users-help@...


Re: Framesets

by Charles François Rey

Here's an example:
---
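import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpUriRequest;
import org.apache.http.impl.client.DefaultHttpClient;
import org.htmlcleaner.DomSerializer;
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

// bare minimum, lots of ways to improve how things are handled;
// the statements below belong inside a method that declares "throws Exception"
DefaultHttpClient httpclient = new DefaultHttpClient();
String url = "http://www.w3schools.com/HTML/tryit.asp?filename=tryhtml_frame_cols";
HttpUriRequest request = new HttpGet(url);
HttpResponse response = httpclient.execute(request);
HtmlCleaner cleaner = new HtmlCleaner();
// note that HtmlCleaner can download a URL itself, but let's assume
// that you need httpclient to do it (e.g. POST request,
// special settings, ...)
TagNode rootNode = cleaner.clean(response.getEntity().getContent());
// convert HtmlCleaner's tree into a standard W3C DOM so XPath can query it
Document doc = new DomSerializer(cleaner.getProperties(), true).createDOM(rootNode);
// we're just going to display the target urls of the frames
XPath xpath = XPathFactory.newInstance().newXPath();
// XPath is very useful when dealing with HTML/XML
NodeList nodes = (NodeList) xpath.evaluate("//frame/@src", doc, XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++) {
    System.out.println(nodes.item(i).getNodeValue());
}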
---

This should print the "src" values of the 3 frames in the page at the
given URL, i.e.:
        frame_a.htm
        frame_b.htm
        frame_c.htm

Those are relative paths, so you would have to resolve them against the
URL of the page that contains the frameset before you can fetch them.
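
A minimal sketch of that resolution step, using only java.net.URI from the JDK. The class name FrameUrlResolver and the example.com URL are placeholders, and it assumes the page does not declare a <base> element that would change the base URL.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FrameUrlResolver {

    // Resolve each frame "src" value against the URL of the page that contained it.
    static List<URI> resolveAll(String pageUrl, List<String> frameSrcs) {
        URI base = URI.create(pageUrl);
        List<URI> resolved = new ArrayList<>();
        for (String src : frameSrcs) {
            // trim() guards against stray whitespace around the attribute value
            resolved.add(base.resolve(src.trim()));
        }
        return resolved;
    }

    public static void main(String[] args) {
        // Placeholder URL standing in for the page that holds the frameset
        String pageUrl = "http://www.example.com/html/frames.htm";
        List<String> srcs = Arrays.asList("frame_a.htm", "frame_b.htm", "frame_c.htm");
        for (URI u : resolveAll(pageUrl, srcs)) {
            System.out.println(u);
        }
    }
}
```

Each resolved URI can then be passed to a new HttpGet in the same way as the original URL.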

Note on the packages used: XPath comes from the standard package
javax.xml.xpath, Document and NodeList come from the standard package
org.w3c.dom, and the HtmlCleaner library comes from
http://htmlcleaner.sourceforge.net/.

On 4 June 2009, at 17:12, Charles François Rey wrote:

> Frameset is an HTML concept. HttpClient takes care of HTTP, not HTML.
>
> That being said, it is possible to follow Framesets, just download the
> HTML file, parse it and follow the Frame definitions.
>
> If I had to do it, I'd use HttpClient to retrieve the HTML,
> HTMLCleaner to clean the HTML, and XPath to filter the Frame "src"
> attributes.


Re: Framesets

by Scott Ward

Thank you, that helped a lot.
__________________________________________

~Sward


On Thu, Jun 4, 2009 at 11:14 AM, Charles François Rey <charlesfr.rey@...> wrote:

> Here's an example:
> ---
> // bare minimum, lots of ways to improve how things are handled
> DefaultHttpClient httpclient = new DefaultHttpClient();
> String url = "http://www.w3schools.com/HTML/tryit.asp?filename=tryhtml_frame_cols";
> HttpUriRequest request = new HttpGet(url);
> HttpResponse response = httpclient.execute(request);
> HtmlCleaner cleaner = new HtmlCleaner();
> // note that HtmlCleaner is capable to download a URL, but let's assume
> // that you need httpclient to do it .. (e.g. POST request,
> // special settings, ...)
> TagNode rootNode = cleaner.clean(response.getEntity().getContent());
> Document doc = (new DomSerializer(cleaner.getProperties(),
> true)).createDOM(rootNode);
> // we're just going to display the target urls of the frames
> XPath xpath = XPathFactory.newInstance().newXPath();
> // XPath is very useful when dealing with HTML/XML ..
> NodeList nodes = ( NodeList )xpath.evaluate("//frame/@src", doc,
> XPathConstants.NODESET);
> for(int i = 0; i<nodes.getLength(); i++) {
>        System.out.println(nodes.item(i).getNodeValue());
> }
> ---
>
> This example should display the 3 frames of this example at the given URL,
> i.e.:
>        frame_a.htm
>        frame_b.htm
>        frame_c.htm
>
> Those are relative paths, so you would have to prefix with the correct
> basepath to fetch them.
>
> Note on the packages used: XPath comes from the standard package
> javax.xml.xpath, and the HtmlCleaner library comes from
> http://htmlcleaner.sourceforge.net/, Document and NodeList come from the
> standard package org.w3c.dom.