database and xml parsers

by Marnen Laibow-Koser-2

Hi,

I'm converting a RoR app to run under JRuby.  Things are going well so
far.

There are some data-intensive portions to this intranet site, parsing
large XML files and populating database (MySQL) tables with the content
from the file.  Previously, I was using a native XML parser, which I
cannot do in JRuby.

I am trying to find a replacement SAX-based XML parser (Hpricot?) that
works under JRuby...any suggestions?

Another possibility is to re-write the XML parsing and database
insertion routines in a Java/JAR package and call that from JRuby.  Is
it possible from Java to access the MySQL configuration information that
is contained in RoR database.yml?  Or does that information need to be
hard-coded in the Java code as well?

Thanks,
Kevin
--
Posted via http://www.ruby-forum.com/.

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email



Re: database and xml parsers

by Nick Sieger-2

On Thu, Oct 22, 2009 at 9:53 AM, Kevin Tambascio <lists@...> wrote:
> Hi,
>
> I'm converting a RoR app to run under JRuby.  Things are going well so
> far.
>
> There are some data-intensive portions to this intranet site, parsing
> large XML files and populating database (MySQL) tables with the content
> from the file.  Previously, I was using a native XML parser, which I
> cannot do in JRuby.

By native, do you mean libxml?

> I am trying to find a replacement SAX-based XML parser (Hpricot?) that
> works under JRuby...any suggestions?

On the Ruby side, you could try Nokogiri [1] or  REXML + JREXML [2].
The latter is probably at best a stop-gap.
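
For reference, here is a minimal sketch of the REXML route, using REXML's stream (SAX-style) listener. The `ItemListener` class and the `<item>` element are hypothetical and exist only for illustration. JREXML is not loaded here, so this shows plain REXML only.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener: collects the text content of each <item> element.
class ItemListener
  include REXML::StreamListener

  attr_reader :items

  def initialize
    @items = []
    @in_item = false
    @buffer = +''
  end

  # Called for every opening tag; attrs is a Hash of attribute values.
  def tag_start(name, _attrs)
    return unless name == 'item'
    @in_item = true
    @buffer = +''
  end

  # Called for character data; may fire more than once per element.
  def text(data)
    @buffer << data if @in_item
  end

  # Called for every closing tag.
  def tag_end(name)
    return unless name == 'item'
    @items << @buffer.strip
    @in_item = false
  end
end

xml = '<items><item>alpha</item><item>beta</item></items>'
listener = ItemListener.new
REXML::Document.parse_stream(xml, listener)
```

The listener only holds the text of the element it is currently inside, so it does not build a full document tree. For a large file on disk, passing an open `File` to `parse_stream` instead of a string should work the same way.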

> Another possibility is to re-write the XML parsing and database
> insertion routines in a Java/JAR package and call that from JRuby.  Is
> it possible from Java to access the MySQL configuration information that
> is contained in RoR database.yml?  Or does that information need to be
> hard-coded in the Java code as well?

A third option you didn't mention would be to drive a Java SAX parser
from Ruby. You can even extend the DefaultHandler class in Ruby and
hand it to the Java XML parser:

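require 'java'

# Subclass the Java SAX DefaultHandler directly in Ruby and override
# only the callbacks you need.
class RubyHandler < org.xml.sax.helpers.DefaultHandler
  def startElement(namespace, local, qname, attrs)
    # called once for each opening tag
  end
  # ...
end

factory = javax.xml.parsers.SAXParserFactory.newInstance
# configure factory if desired
parser = factory.newSAXParser
parser.parse("my/file.xml", RubyHandler.new)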

Cheers,
/Nick

[1]: http://wiki.github.com/tenderlove/nokogiri
[2]: http://github.com/nicksieger/jrexml




Re: database and xml parsers

by Marnen Laibow-Koser-2

Nick Sieger wrote:

> On Thu, Oct 22, 2009 at 9:53 AM, Kevin Tambascio <lists@...>
> wrote:
>> Hi,
>>
>> I'm converting a RoR app to run under JRuby.  Things are going well so
>> far.
>>
>> There are some data-intensive portions to this intranet site, parsing
>> large XML files and populating database (MySQL) tables with the content
>> from the file.  Previously, I was using a native XML parser, which I
>> cannot do in JRuby.
>
> By native, do you mean libxml?

Yes, we were using libxml (expat) previously.

>> I am trying to find a replacement SAX-based XML parser (Hpricot?) that
>> works under JRuby...any suggestions?
>
> On the Ruby side, you could try Nokogiri [1] or  REXML + JREXML [2].
> The latter is probably at best a stop-gap.

I believe I tried to install the Nokogiri gem through JRuby, but it had
native elements (the web site you provided a link to says it uses
libxml2).  Is there a fully working JRuby solution yet?  I found some
posts from Jan 09 that said it was still in progress.

>
>> Another possibility is to re-write the XML parsing and database
>> insertion routines in a Java/JAR package and call that from JRuby.  Is
>> it possible from Java to access the MySQL configuration information that
>> is contained in RoR database.yml?  Or does that information need to be
>> hard-coded in the Java code as well?
>
> A third option you didn't mention would be to drive a Java SAX parser
> from Ruby. You can even extend the DefaultHandler class in Ruby and
> hand it to the Java XML parser:
>
> class RubyHandler < org.xml.sax.helpers.DefaultHandler
>   def startElement(namespace, local, qname, attrs)
>   end
>   # ...
> end
>
> factory = javax.xml.parsers.SAXParserFactory.newInstance
> # configure factory if desired
> parser = factory.newSAXParser
> parser.parse("my/file.xml", RubyHandler.new)

Since I already have the SAX state machine coded up in Ruby, using the
Java SAX parser seems like a good idea as well.  Fewer dependencies/gems
to worry about.  Thanks so much, I would never have thought of this
option.
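
The database.yml question from the start of the thread is not answered by anyone here. One possible approach is to read the file on the Ruby side and pass the values into the Java code, rather than hard-coding them there. A minimal sketch follows, assuming a conventional Rails-style database.yml; the helpers `db_settings` and `jdbc_url` and the sample values are hypothetical.

```ruby
require 'yaml'
require 'erb'

# Hypothetical helper: returns the settings hash for one environment
# from the text of a Rails-style database.yml (ERB tags are evaluated).
def db_settings(yaml_text, env)
  config = YAML.load(ERB.new(yaml_text).result)
  config.fetch(env)
end

# Hypothetical helper: builds a MySQL JDBC URL from those settings,
# falling back to localhost:3306 when host or port is absent.
def jdbc_url(settings)
  host = settings.fetch('host', 'localhost')
  port = settings.fetch('port', 3306)
  "jdbc:mysql://#{host}:#{port}/#{settings.fetch('database')}"
end

# Sample content standing in for config/database.yml (made-up values).
sample = <<~YML
  development:
    adapter: mysql
    database: myapp_development
    username: root
    password: secret
    host: localhost
YML

settings = db_settings(sample, 'development')
url = jdbc_url(settings)
```

In a real app the text would come from something like `File.read('config/database.yml')`, and `url`, `settings['username']` and `settings['password']` could then be handed to the Java side as ordinary strings. Files that use YAML anchors and aliases may need extra handling on newer Ruby versions.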
