best way to download an entire site with Ruby

Re: best way to download an entire site with Ruby

by Christian Bradley

med addame wrote:
> What is the best way to download all Web Site resources (html, images,
> ...) to a local directory using Ruby?
> Thanks,
> A.

I'd say Hpricot + Net::HTTP
http://github.com/whymirror/hpricot

Something like:

class Page
  attr_accessor :html, :links

  def open
    # ... fetch the page with Net::HTTP here
  end

  def save
    # save the HTML as a .html file
    # save the images ...
    # for each link that comes from an anchor tag:
    # - create a new Page
    # - call open/parse_links/save on it
  end

  def parse_links
    # ... use Hpricot here to parse the links
    # ... add them to the @links array
    # ... give each one a type (img, anchor, etc.)
  end
end
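
The outline above could be fleshed out roughly as follows. This is only a sketch under some loud assumptions: it uses a crude regex scan in place of Hpricot so that it runs on the standard library alone (it will miss unusual markup), it ignores redirects, query strings, and network errors, and the names extract_links, local_path_for, mirror, the "mirror" directory, and the page limit are all made up for illustration.

```ruby
require 'net/http'
require 'uri'
require 'fileutils'

# Crude stand-in for an HTML parser (NOT Hpricot): pull href/src values
# out of <a> and <img> tags and resolve them against the page URL.
def extract_links(html, base_url)
  links = []
  html.scan(/<(a|img)\b[^>]*?\b(?:href|src)\s*=\s*["']([^"']+)["']/i) do |tag, ref|
    begin
      uri = URI.join(base_url, ref)
      uri.fragment = nil # "#top" points back at the same page
    rescue URI::Error
      next # skip references that are not valid URIs
    end
    links << { type: tag.downcase == 'a' ? :anchor : :img, url: uri.to_s }
  end
  links
end

# Map a URL to a file path under root (hypothetical layout: root/host/path).
def local_path_for(url, root)
  uri = URI.parse(url)
  path = uri.path.to_s
  path = '/' if path.empty?
  path += 'index.html' if path.end_with?('/')
  File.join(root, uri.host, path)
end

# Breadth-first crawl: save every fetched resource, and parse only the
# HTML responses for further same-host links (anchors and images alike).
def mirror(start_url, root: 'mirror', limit: 50)
  host = URI.parse(start_url).host
  queue = [start_url]
  seen = {}
  until queue.empty? || seen.size >= limit
    url = queue.shift
    next if seen[url]
    seen[url] = true
    response = Net::HTTP.get_response(URI.parse(url))
    next unless response.is_a?(Net::HTTPSuccess)
    path = local_path_for(url, root)
    FileUtils.mkdir_p(File.dirname(path))
    File.binwrite(path, response.body)
    next unless response['content-type'].to_s.include?('html')
    extract_links(response.body, url).each do |link|
      queue << link[:url] if URI.parse(link[:url]).host == host
    end
  end
  seen.keys
end
```

Calling mirror("http://www.lua.org/manual/5.1/") would then write files under ./mirror/www.lua.org/. Swapping the regex for a real parser such as Hpricot only changes extract_links; the rest stays the same.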
--
Posted via http://www.ruby-forum.com/.


best way to download an entire site with Ruby

by addame

What is the best way to download all Web Site resources (html, images,
...) to a local directory using Ruby?
Thanks,
A.


Re: best way to download an entire site with Ruby

by Brian Candler

med addame wrote:
> What is the best way to download all Web Site resources (html, images,
> ...) to a local directory using Ruby?

Quickest and easiest:

  system("wget -p -np -r http://www.lua.org/manual/5.1/")

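Also useful are the -I (restrict to a list of directories), -P (set the
download directory), and -nH (don't create a host-name directory) options.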
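
If the URL or target directory comes from a variable, it may be safer to pass the arguments to system as a list, so nothing goes through the shell. A minimal sketch, assuming wget is installed and on the PATH; the helper names wget_args and mirror_with_wget and their keyword options are made up for illustration:

```ruby
# Build the wget argument list. The flags are the ones mentioned above:
# -p (page requisites), -np (no parent), -r (recursive),
# -P (directory prefix), -nH (no host directories), -I (include directories).
def wget_args(url, dest: nil, no_host_dir: false, include_dirs: [])
  args = ['wget', '-p', '-np', '-r']
  args += ['-P', dest] if dest
  args << '-nH' if no_host_dir
  args += ['-I', include_dirs.join(',')] unless include_dirs.empty?
  args << url
end

# Run wget without a shell; system returns nil if the command cannot be
# started and false on a non-zero exit status.
def mirror_with_wget(url, **opts)
  system(*wget_args(url, **opts)) or raise 'wget failed or is not installed'
end
```

For example, mirror_with_wget("http://www.lua.org/manual/5.1/", dest: "lua", no_host_dir: true) should put the manual under ./lua/ instead of ./www.lua.org/.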