Hello


Hello

by Tony Nelson

I'm Tony Nelson.  I've been using Python for several years and the email
package for a couple of years.  I have one patch in Python for the socket
module ("[issue1519025] New ver. of 1102879: Fix for 926423: socket
timeouts").  Recently I've found that some odd things I was blaming on
other causes are bugs in the email package.  I'll be filing issues with
patches.

Currently I've filed "[issue5610] email feedparser.py CRLFLF bug: $ vs \Z".
Repeatedly parsing and saving multipart messages was chewing off the
trailing lines from submessage bodies.  I'd like a procedural review of
that issue and its attached files before I file more issues, so that I do
them properly.
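
For readers following along, here is a minimal sketch of the `$` vs `\Z` difference named in the issue title. The patterns and the `strip_eol` helper are illustrative assumptions, not copied from feedparser.py; they only show how `$` can match before a final newline while `\Z` cannot.

```python
import re

# Illustrative patterns only; not copied from feedparser.py.
EOL_DOLLAR = re.compile(r'(\r\n|\r|\n)$')
EOL_STRICT = re.compile(r'(\r\n|\r|\n)\Z')

def strip_eol(pattern, text):
    """Cut the text at the first line ending that `pattern` matches, if any."""
    m = pattern.search(text)
    return text[:m.start()] if m else text

text = 'last line\r\n\n'   # CRLF followed by a bare LF

# `$` also matches just before a final '\n', so the CRLF is matched first
# and the cut removes both the CRLF and the LF after it.
print(repr(strip_eol(EOL_DOLLAR, text)))   # 'last line'

# `\Z` matches only at the very end, so only the final LF is removed.
print(repr(strip_eol(EOL_STRICT, text)))   # 'last line\r\n'
```

With `$`, each parse and save round trip can lose more than one line ending, which is consistent with the trailing lines being chewed off.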

Next is probably a fix and test for "[issue1721862]
email.FeedParser.BufferedSubFile improperly handles '\r\n'" (when split
across calls to .feed()).  The error should be rare, only happening about
every 8K messages for messages longer than 8K when parsed via
parser.parse() or parser.parsestr() or email.message_from_file() or
email.message_from_string().  The fix is for .push() to treat a last line
ending with \r as ._partial.  The test will call feedparser.feed()
directly, so it can use short messages and not depend on the buffer size in
parser.parse().
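
A rough sketch of the idea, assuming a much simplified stand-in for BufferedSubFile (the `LineBuffer` class and its attribute names are hypothetical, not the real implementation): a chunk whose last line ends in a bare \r is held back, because the matching \n may arrive in the next call.

```python
class LineBuffer:
    """Hypothetical, simplified stand-in for BufferedSubFile."""

    def __init__(self):
        self._partial = ''   # text held back until its line ending is known
        self.lines = []      # complete lines, line endings preserved

    def push(self, data):
        data = self._partial + data
        self._partial = ''
        parts = data.splitlines(True)
        # Hold back the last piece if it is unterminated or ends with a
        # bare '\r', which may be the first half of a split '\r\n'.
        if parts and not parts[-1].endswith('\n'):
            self._partial = parts.pop()
        self.lines.extend(parts)

    def close(self):
        # At end of input, whatever is left is a line in its own right.
        if self._partial:
            self.lines.append(self._partial)
            self._partial = ''

buf = LineBuffer()
buf.push('a\r')        # '\r\n' split across two calls
buf.push('\nb\r\n')
buf.close()
print(buf.lines)       # ['a\r\n', 'b\r\n']
```

Without the hold-back, 'a\r' would be emitted as a complete line and the following '\n' would show up as a spurious empty line.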

I might be persuaded to review or fix other open issues.
--
____________________________________________________________________
TonyN.:'                       <mailto:tonynelson@...>
      '                              <http://www.georgeanelson.com/>
_______________________________________________
Email-SIG mailing list
Email-SIG@...
Your options: http://mail.python.org/mailman/options/email-sig/lists%40nabble.com

Re: Hello

by Barry Warsaw

On Mar 30, 2009, at 3:10 PM, Tony Nelson wrote:

> I'm Tony Nelson.  I've been using Python for several years and the email
> package for a couple of years.  I have one patch in Python for the socket
> module ("[issue1519025] New ver. of 1102879: Fix for 926423: socket
> timeouts").  Recently I've found that some odd things I was blaming on
> other causes are bugs in the email package.  I'll be filing issues with
> patches.

Hi Tony, welcome to the email sig!  I'm actually sprinting on the  
email package today at Pycon.  Chris Withers joined me until he had to  
fly home.  Bug 1974 was Chris's particular itch and I now think I have  
a fix for this that isn't horrible, though unfortunately it will only  
land in 2.7 and shouldn't be back ported.

> Currently I've filed "[issue5610] email feedparser.py CRLFLF bug: $ vs \Z".
> Repeatedly parsing and saving multipart messages was chewing off the
> trailing lines from submessage bodies.  I'd like a procedural review of
> that issue and its attached files before I file more issues, so that I do
> them properly.
>
> Next is probably a fix and test for "[issue1721862]
> email.FeedParser.BufferedSubFile improperly handles '\r\n'" (when split
> across calls to .feed()).  The error should be rare, only happening about
> every 8K messages for messages longer than 8K when parsed via
> parser.parse() or parser.parsestr() or email.message_from_file() or
> email.message_from_string().  The fix is for .push() to treat a last line
> ending with \r as ._partial.  The test will call feedparser.feed()
> directly, so it can use short messages and not depend on the buffer size in
> parser.parse().
>
> I might be persuaded to review or fix other open issues.

Very cool, thanks.  I'll look at the above issues after I land the  
patch for 1974.

My plan for the email package is:

* Fix what we can for Python 2.7 but be very conservative with back  
ports to 2.6
* Ignore 3.0
* Work on a new API so that we can actually fix the horrible  
brokenness of email in Python 3.

Barry


Re: Hello

by Tony Nelson

At 16:41 -0500 2009/03/30, Barry Warsaw wrote:
>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: SHA1
>
>On Mar 30, 2009, at 3:10 PM, Tony Nelson wrote:
>
>>I'm Tony Nelson. ...
 ...

>Hi Tony, welcome to the email sig!

Thank you.

>I'm actually sprinting on the
>email package today at Pycon.  Chris Withers joined me until he had to
>fly home.  Bug 1974 was Chris's particular itch and I now think I have
>a fix for this that isn't horrible, though unfortunately it will only
>land in 2.7 and shouldn't be back ported.

A worthy issue.  Hopefully header parsing and generation can be cleaned up
more before 2.7/3.1 so that proper RFC2822 2.2.3 folding can be the norm.
For example, unstructured header fields such as Subject: have whitespace as
part of the unstructured token, and structured fields can skip whitespace,
so leading whitespace should not be stripped by
FeedParser._parse_headers().  This would help with idempotency.
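
To make the folding point concrete, a small sketch of RFC 2822 section 2.2.3 unfolding, which removes only a CRLF that is immediately followed by whitespace and leaves the whitespace itself in place. The `unfold` helper is a hypothetical name for illustration, not a function in the email package.

```python
import re

def unfold(raw_header):
    """Remove each CRLF that is immediately followed by a space or tab."""
    return re.sub(r'\r\n(?=[ \t])', '', raw_header)

folded = 'Subject: This is a test\r\n  with two leading spaces'

# The whitespace after the fold stays part of the unstructured value.
print(repr(unfold(folded)))
# 'Subject: This is a test  with two leading spaces'
```

If the parser strips that leading whitespace instead, the value can no longer be refolded into the original bytes, which is the idempotency problem.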


>>Currently I've filed "[issue5610] email feedparser.py CRLFLF bug: $ vs
>>\Z". ...
 ...
>> I might be persuaded to review or fix other open issues.
>
>Very cool, thanks.  I'll look at the above issues after I land the
>patch for 1974.

Ack.  Only one issue yet [issue5610]; I want to know if I'm doing it right
before filing others.


>My plan for the email package is:
>
>* Fix what we can for Python 2.7 but be very conservative with back
>ports to 2.6
>* Ignore 3.0
>* Work on a new API so that we can actually fix the horrible
>brokenness of email in Python 3.

Hmm, I haven't used Python 3 yet, and didn't know about that.  I suppose it
is due to bytes/unicode confusion?

There should be an "obvious" place for users to get a current email package
suitable for at least the last few Python 2.x, at least if it starts
getting more love again.  I don't know quite where that should be, whether
a SourceForge (or similar) page, a listing on PyPI, both, or what.  Just
something simpler than a SVN checkout.
--
____________________________________________________________________
TonyN.:'                       <mailto:tonynelson@...>
      '                              <http://www.georgeanelson.com/>

Re: Hello

by Barry Warsaw

Commenting on parts of your message, since I haven't looked at 5610 yet.

On Mar 30, 2009, at 5:46 PM, Tony Nelson wrote:

> A worthy issue.  Hopefully header parsing and generation can be cleaned up
> more before 2.7/3.1 so that proper RFC2822 2.2.3 folding can be the norm.
> For example, unstructured header fields such as Subject: have whitespace as
> part of the unstructured token, and structured fields can skip whitespace,
> so leading whitespace should not be stripped by
> FeedParser._parse_headers().  This would help with idempotency.

While I completely agree with you here, I don't think it will be  
possible to fix this in Python 2.7.  That doesn't mean that we can't  
provide a working email package for Python 2.x though.

I think doing structured folding will require API changes and I think  
I know the API I want.  I'm trying to get some cycles to write about  
it or create some working code.

>> My plan for the email package is:
>>
>> * Fix what we can for Python 2.7 but be very conservative with back
>> ports to 2.6
>> * Ignore 3.0
>> * Work on a new API so that we can actually fix the horrible
>> brokenness of email in Python 3.
>
> Hmm, I haven't used Python 3 yet, and didn't know about that.  I suppose it
> is due to bytes/unicode confusion?

Yes.  The email package has a really broken notion of bytes vs. text.  
Grep for raw-unicode-escape for the brain-hurty.  Fixing this too  
really requires an API change, and again I've talked with folks so I  
think I know where to go with this.  I can haz free hacking cycles?
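
For anyone who has not met the codec mentioned above, a short sketch of what raw-unicode-escape does in Python 3. This shows the codec's behaviour only and assumes nothing about where or how the email package calls it; the sample string is made up.

```python
text = 'caf\u00e9 \u20ac'   # 'café €'

encoded = text.encode('raw-unicode-escape')

# Code points below 256 become single Latin-1 bytes; anything higher is
# written out as a literal backslash-u escape sequence.
print(encoded)   # b'caf\xe9 \\u20ac'

# Decoding reverses the mapping, so text and bytes can be shuttled back
# and forth without ever naming the message's real charset.
print(encoded.decode('raw-unicode-escape') == text)   # True
```

The round trip works, but the bytes are neither the original wire bytes nor a proper encoding of the text, which is the kind of blurring that Python 3's strict bytes/str split makes visible.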

> There should be an "obvious" place for users to get a current email package
> suitable for at least the last few Python 2.x, at least if it starts
> getting more love again.  I don't know quite where that should be, whether
> a SourceForge (or similar) page, a listing on PyPI, both, or what.  Just
> something simpler than a SVN checkout.

We've done standalone email package releases in the past, and I think  
we'll do the same with the new version, distributing it on the  
cheeseshop.  We'll either do this out of the 3.1 tree or from the  
sandbox.  The tricky part will be dealing with the Python 2 back  
porting.  Hopefully we'll be able to use the mythical 3to2 tool that  
folks are starting to talk about/work on, otherwise we'll have to  
manually maintain a Python 2 port.  I definitely think it's better to  
work the details out for Py3 first though; it'll force us to be  
explicit about bytes vs. strings, so we won't fall into the sloppiness  
of the current code.

Barry
