minidom: Genius or just plain bad?

I was puzzled when I tripped over the following:

>>> NS = 'http://phihag.de/2009/test/python/ns'
>>> s = '<rootelem a="val" xmlns="' + NS + '" />'
>>> import xml.dom.minidom
>>> doc = xml.dom.minidom.parseString(s)
>>> doc.documentElement.getAttributeNS(NS, 'a')
'' # wtf?
>>> doc.documentElement.getAttribute('a')
u'val'

Looking in the implementation, it seems that minidom is essentially a DOM Level 1 implementation, with very limited support for namespaces.

Wouldn't it be nice to have a full-fledged XML implementation in the Python stdlib? Probably not (yet) including validation, XSLT and similar auxiliary technologies, but come on, XML namespaces and DOM 3 L/S should be supported.

I noticed that important minidom features such as http://bugs.python.org/issue1621421 are not going anywhere. Is this because of performance considerations or lack of manpower?

Also, it seems strange that minidom.py is full of comments referencing outdated 2002 working drafts. I'm intrigued by the idea of overriding __setattr__ to do crazy stuff (including invalidating a document-wide cache that probably stays valid in >99% of the cases, although a local check for attribute name = id would improve performance here) instead of using properties, and then avoiding actually using it "for performance" reasons. Additionally, the comment "nodeValue and value are set elsewhere" in Attr.__init__ neatly conveys the intention of allowing extremely fast creation of value-less attributes. Similarly, the opening comment of expatbuilder.py is an excellent example of the little-known

Alternative Zen of Python

Ugly is better than beautiful.
Implicit is better than explicit.
Performance is better than anything.
Code needs comments explaining and defending it.
Constants are great, especially when depending on their value.¹
Code first, then think about the interface.²
Or don't think about the interface at all.
Fixing bugs in dependencies is bad.
Unless you fix by changing your code.
But do not allow others to do that.
Modularization is good.
As long as you access internals of other modules.
Import from many modules.
Whose names all sound the same.
If self.childnodes (:return True else return False)
That's how I spell pain.

¹ minidom.prefix
² grep "not sure this is meaningful"

Regards,
Philipp

_______________________________________________
XML-SIG maillist - XML-SIG@...
http://mail.python.org/mailman/listinfo/xml-sig
Re: minidom: Genius or just plain bad?

Philipp Hagemeister wrote:
> I was puzzled when I tripped over the following:
>
> >>> NS = 'http://phihag.de/2009/test/python/ns'
> >>> s = '<rootelem a="val" xmlns="' + NS + '" />'
> >>> import xml.dom.minidom
> >>> doc = xml.dom.minidom.parseString(s)
> >>> doc.documentElement.getAttributeNS(NS, 'a')
> '' # wtf?

Why do you think this is incorrect? The root element
has no attribute named 'a' in the NS namespace.

Regards,
Martin
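To make Martin's point concrete, here is a minimal sketch, assuming a current Python 3 interpreter (the original session was Python 2, hence the u'val'). A default namespace declaration applies to elements only, so the unprefixed attribute a is in no namespace and is found under the namespace None. The prefix p and the attribute b are not part of the original post; they are made up here to show what a namespaced attribute would look like.

```python
import xml.dom.minidom

NS = 'http://phihag.de/2009/test/python/ns'

# Same document as in the original post, plus a hypothetical prefixed
# attribute p:b (an illustration only, not from the thread).
s = ('<rootelem a="val" p:b="other" '
     'xmlns="' + NS + '" xmlns:p="' + NS + '" />')
root = xml.dom.minidom.parseString(s).documentElement

# The element itself picks up the default namespace.
element_ns = root.namespaceURI

# The unprefixed attribute does not: it lives in no namespace at all.
a_in_ns = root.getAttributeNS(NS, 'a')
a_no_ns = root.getAttributeNS(None, 'a')

# Only an explicitly prefixed attribute is in the NS namespace.
b_in_ns = root.getAttributeNS(NS, 'b')

print(repr(a_in_ns), repr(a_no_ns), repr(b_in_ns))
```

Here a_in_ns is the empty string, while a_no_ns is 'val' and b_in_ns is 'other'.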
Re: minidom: Genius or just plain bad?

Martin v. Löwis wrote:
>> >>> NS = 'http://phihag.de/2009/test/python/ns'
>> >>> s = '<rootelem a="val" xmlns="' + NS + '" />'
>> >>> import xml.dom.minidom
>> >>> doc = xml.dom.minidom.parseString(s)
>> >>> doc.documentElement.getAttributeNS(NS, 'a')
>
> Why do you think this is incorrect? The root element
> has no attribute named 'a' in the NS namespace.

Oops, my bad. You are perfectly right, and this part of my argument is moot. http://www.rpbourret.com/xml/NamespaceMyths.htm#myth4 refutes my misconception in depth.

minidom's code is still yucky though.

Cheers,
Philipp
Re: minidom: Genius or just plain bad?

Philipp Hagemeister wrote:
> Wouldn't it be nice to have a full-fledged XML implementation in the Python
> stdlib? Probably not (yet) including validation, XSLT and similar
> auxiliary technologies, but come on, XML namespaces and DOM 3 L/S should
> be supported.

This has been rejected on python-dev lately, given that such an implementation would almost certainly introduce a major dependency overhead if it's not written in plain Python.

There's also the historical problem that the stdlib XML support is there and quite a bit of existing code depends on it. Replacing that with a new implementation would break all that. Extending it is a, well, rather large project, as would be any kind of major performance improvement.

It's not too hard to install lxml these days, though. The fact that it *doesn't* use the DOM3 API is actually a major strength.

http://codespeak.net/lxml/

Stefan
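As a rough illustration of the non-DOM style Stefan refers to, here is a sketch using the standard library's xml.etree.ElementTree as a stand-in. lxml.etree is designed to be largely compatible with the ElementTree API, but this snippet was written against the stdlib module only, so treat the lxml equivalence as an assumption. The document is the one from the first post.

```python
import xml.etree.ElementTree as ET

NS = 'http://phihag.de/2009/test/python/ns'
root = ET.fromstring('<rootelem a="val" xmlns="' + NS + '" />')

# Namespaced names are spelled "{uri}localname", so the namespace
# of the element is visible directly in its tag.
element_tag = root.tag

# The unprefixed attribute is stored under its plain name ...
plain_attr = root.get('a')

# ... and there is no attribute "a" in the NS namespace.
qualified_attr = root.get('{' + NS + '}a')

print(element_tag, plain_attr, qualified_attr)
```

Because the namespace is part of every name, the difference between "a" and "{NS}a" is visible in the lookup key itself, with no separate getAttribute/getAttributeNS pair to mix up.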