Framesets

Is it possible to follow framesets using HttpClient? I have searched all over and haven't found anything, so I thought that I would try this. If it is, can you direct me to it in the API or show an example? Any help is greatly appreciated.

__________________________________________
~Sward
Re: Framesets

Frameset is an HTML concept. HttpClient takes care of HTTP, not HTML.

That being said, it is possible to follow framesets: just download the HTML file, parse it, and follow the frame definitions. If I had to do it, I'd use HttpClient to retrieve the HTML, HtmlCleaner to clean the HTML, and XPath to filter the frame "src" attributes.

On 4 June 2009, at 16:44, Scott Ward wrote:
> Is it possible to follow framesets using HttpClient? I have searched all
> over and haven't found anything so I thought that I would try this.
>
> If it is can you direct me to it in the API or show an example.
>
> Any help is greatly appreciated.
> __________________________________________
>
> ~Sward

---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-users-unsubscribe@...
For additional commands, e-mail: httpclient-users-help@...
Re: Framesets

Here's an example:

---
// imports for HttpClient 4.x, HtmlCleaner, and the standard XPath/DOM APIs
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpUriRequest;
import org.apache.http.impl.client.DefaultHttpClient;
import org.htmlcleaner.DomSerializer;
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// bare minimum, lots of ways to improve how things are handled
DefaultHttpClient httpclient = new DefaultHttpClient();
String url = "http://www.w3schools.com/HTML/tryit.asp?filename=tryhtml_frame_cols";
HttpUriRequest request = new HttpGet(url);
HttpResponse response = httpclient.execute(request);
HtmlCleaner cleaner = new HtmlCleaner();
// note that HtmlCleaner can download a URL itself, but let's assume
// that you need httpclient to do it (e.g. POST request,
// special settings, ...)
TagNode rootNode = cleaner.clean(response.getEntity().getContent());
Document doc = (new DomSerializer(cleaner.getProperties(), true)).createDOM(rootNode);
// we're just going to display the target URLs of the frames
XPath xpath = XPathFactory.newInstance().newXPath();
// XPath is very useful when dealing with HTML/XML
NodeList nodes = (NodeList) xpath.evaluate("//frame/@src", doc, XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++) {
    System.out.println(nodes.item(i).getNodeValue());
}
---

This example should display the 3 frames of the page at the given URL, i.e.:

frame_a.htm
frame_b.htm
frame_c.htm

Those are relative paths, so you would have to resolve them against the correct base path to fetch them.

Note on the packages used: XPath comes from the standard package javax.xml.xpath; Document and NodeList come from the standard package org.w3c.dom; and the HtmlCleaner library comes from http://htmlcleaner.sourceforge.net/.

On 4 June 2009, at 17:12, Charles François Rey wrote:
> Frameset is an HTML concept. HttpClient takes care of HTTP, not HTML.
> [...]
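The last step mentioned above, turning the relative "src" values into URLs that can actually be fetched, can be sketched with the standard java.net.URI class. This is only an illustration, not part of the original example: the class name FrameUrlResolver and the method resolveAll are made up here, and the sketch assumes that the frameset page's own URL is the right base (i.e. the page declares no <base href>) and that the "src" values are already valid URI references.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not from the thread): resolves frame "src" values
// against the URL of the page that contains the frameset.
final class FrameUrlResolver {

    // Assumption: pageUrl is the correct base, i.e. no <base href> is in play.
    static List<String> resolveAll(String pageUrl, List<String> frameSrcs) {
        URI base = URI.create(pageUrl);
        List<String> resolved = new ArrayList<String>();
        for (String src : frameSrcs) {
            // trim() only removes surrounding whitespace; a src with embedded
            // spaces would still need escaping before it can be resolved.
            resolved.add(base.resolve(src.trim()).toString());
        }
        return resolved;
    }

    public static void main(String[] args) {
        String pageUrl = "http://www.w3schools.com/HTML/tryit.asp?filename=tryhtml_frame_cols";
        List<String> srcs = Arrays.asList("frame_a.htm", "frame_b.htm", "frame_c.htm");
        for (String frameUrl : resolveAll(pageUrl, srcs)) {
            System.out.println(frameUrl);
        }
    }
}
```

Each resolved URL could then be passed to a new HttpGet, exactly like the original page URL in the example above. Absolute "src" values pass through resolve unchanged, so the same loop handles both kinds.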
Re: Framesets

Thank you, that helped a lot.

__________________________________________
~Sward

On Thu, Jun 4, 2009 at 11:14 AM, Charles François Rey <charlesfr.rey@...> wrote:
> Here's an example:
> [...]