Using different hash algorithm in purple_util_get_image_checksum()


by Mark Doliner

The purple_util_get_image_checksum() function in libpurple/util.c
currently uses SHA-1 to generate a checksum for a chunk of image data.
SHA-1 is a cryptographic hash function, which means it's hard for
someone to engineer a chunk of data that matches a given hash.  It
also means it's slow.

Do we need to be using a cryptographic hash function here?  This hash
function is one of the more expensive parts of libpurple.  I think
it's called once for each buddy icon we receive.  Adler-32 is much
faster when you're not concerned about security (it's maybe 8 times
faster than SHA-1).  zlib contains an Adler-32 implementation.  I
think GLib's g_string_hash() function is also pretty fast (but not as
fast as Adler-32 when hashing image data).  I haven't really
investigated what problems we would have switching hash functions... I
think we would have to migrate or purge buddy icons from
~/.purple/icons/, because the icon filename is the hash.  And there
might be other problems.
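
For reference, Adler-32 is small enough to sketch in full. The version below is an unoptimized illustration of the algorithm as described in RFC 1950, not zlib's actual implementation (which defers the modulo for speed). The name `adler32_sketch` is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_MOD 65521u  /* largest prime below 2^16 */

/* Illustrative Adler-32: two running sums, each reduced mod 65521.
 * "a" is 1 plus the sum of all bytes; "b" is the sum of every
 * intermediate value of "a". */
uint32_t adler32_sketch(const unsigned char *data, size_t len)
{
    uint32_t a = 1, b = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        a = (a + data[i]) % ADLER_MOD;
        b = (b + a) % ADLER_MOD;
    }
    return (b << 16) | a;
}
```

Calling `adler32_sketch((const unsigned char *)"Wikipedia", 9)` returns 0x11E60398, the commonly cited test value. The whole thing is two additions per byte, which is where the speed advantage over SHA-1 comes from.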

But, uh, how do people feel about this change?

-Mark

_______________________________________________
Devel mailing list
Devel@...
http://pidgin.im/cgi-bin/mailman/listinfo/devel

Re: Using different hash algorithm in purple_util_get_image_checksum()

by Ethan Blanton-3

Mark Doliner spake unto us the following wisdom:

> The purple_util_get_image_checksum() function in libpurple/util.c
> currently uses SHA-1 to generate a checksum for a chunk of image data.
>  SHA-1 is a cryptographic hash function, which means it's hard for
> someone to engineer a chunk of data that matches a given hash.  It
> also means it's slow.
>
> Do we need to be using a cryptographic hash function here?  This hash
> function is one of the more expensive parts of libpurple.  I think
> it's called once for each buddy icon we receive.  Adler-32 is much
> faster when you're not concerned about security (it's maybe 8 times
> faster than SHA-1).  zlib contains an Adler-32 implementation.  I
> think GLib's g_string_hash() function is also pretty fast (but not as
> fast as Adler-32 when hashing image data).  I haven't really
> investigated what problems we would have switching hash functions... I
> think we would have to migrate or purge buddy icons from
> ~/.purple/icons/, because the icon filename is the hash.  And there
> might be other problems.
>
> But, uh, how do people feel about this change?

A 32-bit hash brings collisions into the realm of possibility, as it
requires only about 2^16 icons in the store for the probability of
collision to rise to about 50%.
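
To put numbers on that estimate, here is a small sketch that computes the exact birthday-problem probability for an ideal, uniformly distributed hash. That uniformity is an assumption: Adler-32 on short inputs is not uniform, so real odds are likely worse. The function names are invented for illustration.

```c
#include <stddef.h>

/* Probability that at least two of n items share a value when each item
 * gets an independent, uniformly random hash of the given bit width.
 * bits must be below 64 so the shift is well defined. */
double collision_probability(size_t n, unsigned bits)
{
    const double space = (double)(1ULL << bits);
    double p_unique = 1.0;
    size_t i;

    for (i = 1; i < n; i++)
        p_unique *= 1.0 - (double)i / space;
    return 1.0 - p_unique;
}

/* Smallest n at which the collision probability reaches 50%.
 * Only practical for small widths such as 32 bits. */
size_t items_for_even_odds(unsigned bits)
{
    const double space = (double)(1ULL << bits);
    double p_unique = 1.0;
    size_t n = 1;

    while (p_unique > 0.5) {
        /* Adding one more item must avoid the n values already taken. */
        p_unique *= 1.0 - (double)n / space;
        n++;
    }
    return n;
}
```

For a 32-bit hash this gives roughly 39% at 2^16 icons and crosses 50% at about 77,000, so 2^16 is the right order of magnitude.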

We don't necessarily need a SHA-sized hash, but we need more than
Adler32.  I suggest, however, that there's not much in between that
will have *practical* implementations which are faster than SHA.  (SHA
implementations in crypto libraries are heavily optimized, as are many
Adler32 implementations.  I am unaware of, for example, a 64-bit
checksum or hashing function with similar widespread optimization.)
It's possible that AES-128 would demonstrate benefit, or even DES with
a 56-bit key; you might want to benchmark.

(In reality, I submit that we *do* need a cryptographically secure hash
 function.  While this is not always the case for content-addressed
 storage, in our case we are storing objects which are provided by
 untrusted users from the network.  It is relatively easy to
 reverse-collide an Adler32 checksum, and possible for an individual
 to reverse-collide a DES "hashed" block.  This would mean that a
 malicious user with a timing advantage could pollute your icon store
 and prevent legitimate buddy icons from being fetched.  I sketch:

 Mallory sees that Alice has changed her buddy icon, but Bob is not
 online.  Mallory computes an icon which collides with Alice's in the
 Pidgin buddy icon CAS.  When Bob comes online, Mallory rapidly sends
 Bob this new buddy icon information as his own buddy icon, storing it
 in Bob's CAS.  When Bob engages in conversation with Alice, Mallory's
 buddy icon is now shown for Alice.

 The saving grace here is ... nobody cares that much about buddy
 icons.  Particularly now that Hulu is around.)
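
To illustrate how little work an Adler-32 collision takes, here is a sketch. Because the checksum is just two linear sums, adding (+1, -2, +1) to three consecutive bytes leaves both sums unchanged. The helper names are invented for this example, and the output is only a colliding byte string, not necessarily a valid image.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Unoptimized Adler-32, as described in RFC 1950. */
uint32_t adler32_sketch(const unsigned char *data, size_t len)
{
    uint32_t a = 1, b = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        a = (a + data[i]) % 65521u;
        b = (b + a) % 65521u;
    }
    return (b << 16) | a;
}

/* Copies src into dst, then adjusts the three bytes starting at offset
 * by +1, -2, +1.  The byte sum is unchanged, and the running sum is
 * off by +1 and then -1 before returning to its original value, so the
 * second sum is unchanged too.  Returns 1 on success, 0 if the range is
 * out of bounds or a byte would wrap around. */
int make_adler32_collision(const unsigned char *src, unsigned char *dst,
                           size_t len, size_t offset)
{
    if (len < 3 || offset > len - 3)
        return 0;
    if (src[offset] == 255 || src[offset + 1] < 2 || src[offset + 2] == 255)
        return 0;

    memcpy(dst, src, len);
    dst[offset] += 1;
    dst[offset + 1] -= 2;
    dst[offset + 2] += 1;
    return 1;
}
```

Applied to the three bytes "bcb" at offset 0, this yields "cac", which has the same Adler-32 value. Nothing comparable is known for SHA-1 preimages.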

Ethan

--
The laws that forbid the carrying of arms are laws [that have no remedy
for evils].  They disarm only those who are neither inclined nor
determined to commit crimes.
                -- Cesare Beccaria, "On Crimes and Punishments", 1764



Re: Using different hash algorithm in purple_util_get_image_checksum()

by Paul Aurich-4

And Ethan Blanton spoke on 06/30/2009 07:56 PM, saying:
<snip/>

> We don't necessarily need a SHA-sized hash, but we need more than
> Adler32.  I suggest, however, that there's not much in between that
> will have *practical* implementations which are faster than SHA.  (SHA
> implementations in crypto libraries are heavily optimized, as are many
> Adler32 implementations.  I am unaware of, for example, a 64-bit
> checksum or hashing function with similar widespread optimization.)
> It's possible that AES-128 would demonstrate benefit, or even DES with
> a 56-bit key; you might want to benchmark.

I believe we currently don't leverage the SSL libraries for hash functions,
which might be worth exploring. Or we might find out they're not
significantly faster than the current implementation.

>
> (In reality, I submit that we *do* need a cryptographically secure hash
>  function.

<snip/>

+1 to the whole part I cut out.

>  The saving grace here is ... nobody cares that much about buddy
>  icons.  Particularly now that Hulu is around.)

As this isn't an area where security is particularly important, we could
also use MD5, which is (for the moment still, I believe) resistant to
preimage attacks and faster than SHA1. : )

>
> Ethan
>

~Paul


Re: Using different hash algorithm in purple_util_get_image_checksum()

by Mark Doliner

On Tue, Jun 30, 2009 at 7:56 PM, Ethan Blanton<elb@...> wrote:

> Mark Doliner spake unto us the following wisdom:
>> The purple_util_get_image_checksum() function in libpurple/util.c
>> currently uses SHA-1 to generate a checksum for a chunk of image data.
>>  SHA-1 is a cryptographic hash function, which means it's hard for
>> someone to engineer a chunk of data that matches a given hash.  It
>> also means it's slow.
>>
>> Do we need to be using a cryptographic hash function here?  This hash
>> function is one of the more expensive parts of libpurple.  I think
>> it's called once for each buddy icon we receive.  Adler-32 is much
>> faster when you're not concerned about security (it's maybe 8 times
>> faster than SHA-1).  zlib contains an Adler-32 implementation.  I
>> think GLib's g_string_hash() function is also pretty fast (but not as
>> fast as Adler-32 when hashing image data).  I haven't really
>> investigated what problems we would have switching hash functions... I
>> think we would have to migrate or purge buddy icons from
>> ~/.purple/icons/, because the icon filename is the hash.  And there
>> might be other problems.
>>
>> But, uh, how do people feel about this change?
>
> A 32-bit hash brings collisions into the realm of possibility, as it
> requires only about 2^16 icons in the store for the probability of
> collision to rise to about 50%.
>
> We don't necessarily need a SHA-sized hash, but we need more than
> Adler32.  I suggest, however, that there's not much in between that
> will have *practical* implementations which are faster than SHA.  (SHA
> implementations in crypto libraries are heavily optimized, as are many
> Adler32 implementations.  I am unaware of, for example, a 64-bit
> checksum or hashing function with similar widespread optimization.)
> It's possible that AES-128 would demonstrate benefit, or even DES with
> a 56-bit key; you might want to benchmark.

At Meebo we chose to create two 32-bit hashes, one from one chunk of
the icon data and one from a different chunk, then concat them to
create a 64-bit checksum.  Vijay from Meebo tells me that he seems to
remember the sha-1 implementation in glib 2.16 is much faster than the
one in our cipher.c.
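
A rough sketch of that two-chunk idea follows. The message doesn't say which 32-bit function or which chunks were used, so Adler-32 and a split at the midpoint are assumptions made purely for illustration, as is the name `two_chunk_checksum64`.

```c
#include <stddef.h>
#include <stdint.h>

/* Unoptimized Adler-32, as described in RFC 1950. */
uint32_t adler32_sketch(const unsigned char *data, size_t len)
{
    uint32_t a = 1, b = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        a = (a + data[i]) % 65521u;
        b = (b + a) % 65521u;
    }
    return (b << 16) | a;
}

/* Hypothetical 64-bit checksum: hash the first and second halves of the
 * buffer separately and concatenate the two 32-bit results. */
uint64_t two_chunk_checksum64(const unsigned char *data, size_t len)
{
    const size_t half = len / 2;
    const uint32_t hi = adler32_sketch(data, half);
    const uint32_t lo = adler32_sketch(data + half, len - half);

    return ((uint64_t)hi << 32) | lo;
}
```

This widens the space against accidental collisions, but each half can still be collided on its own, so by itself it does not make deliberate collisions meaningfully harder.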

> (In reality, I submit that we *do* need a cryptographically secure hash
>  function.  While this is not always the case for content-addressed
>  storage, in our case we are storing objects which are provided by
>  untrusted users from the network.  It is relatively easy to
>  reverse-collide an Adler32 checksum, and possible for an individual
>  to reverse-collide a DES "hashed" block.  This would mean that a
>  malicious user with a timing advantage could pollute your icon store
>  and prevent legitimate buddy icons from being fetched.  I sketch:
>
>  Mallory sees that Alice has changed her buddy icon, but Bob is not
>  online.  Mallory computes an icon which collides with Alice's in the
>  Pidgin buddy icon CAS.  When Bob comes online, Mallory rapidly sends
>  Bob this new buddy icon information as his own buddy icon, storing it
>  in Bob's CAS.  When Bob engages in conversation with Alice, Mallory's
>  buddy icon is now shown for Alice.
>
>  The saving grace here is ... nobody cares that much about buddy
>  icons.  Particularly now that Hulu is around.)

Yeah, we thought about all of this and didn't deem it important enough
for us to keep using sha-1.  As you point out it is relatively easy to
create a chunk of data with an Adler32 checksum that matches another
chunk of data, and so it would be relatively easy to make someone's
icon stop appearing.  But it would be harder to replace someone's icon
with a different image because you would have to create a chunk of
data that 1. has the same checksum and 2. is a valid image.

I'm hoping a few other devs weigh in, but it sounds like we want to
stick with sha-1.  I'm not sure the performance advantage of md5 over
sha1 justifies the code changes that would be required.  And the
checksumming matters a lot less when you're running one instance on
your personal computer than when you have a few thousand users on a
single server.

-Mark


Re: Using different hash algorithm in purple_util_get_image_checksum()

by Ethan Blanton-3

Mark Doliner spake unto us the following wisdom:

> On Tue, Jun 30, 2009 at 7:56 PM, Ethan Blanton<elb@...> wrote:
> > A 32-bit hash brings collisions into the realm of possibility, as it
> > requires only about 2^16 icons in the store for the probability of
> > collision to rise to about 50%.
> >
> > We don't necessarily need a SHA-sized hash, but we need more than
> > Adler32.  I suggest, however, that there's not much in between that
> > will have *practical* implementations which are faster than SHA.  (SHA
> > implementations in crypto libraries are heavily optimized, as are many
> > Adler32 implementations.  I am unaware of, for example, a 64-bit
> > checksum or hashing function with similar widespread optimization.)
> > It's possible that AES-128 would demonstrate benefit, or even DES with
> > a 56-bit key; you might want to benchmark.
>
> At Meebo we chose to create two 32-bit hashes, one from one chunk of
> the icon data and one from a different chunk, then concat them to
> create a 64-bit checksum.  Vijay from Meebo tells me that he seems to
> remember the sha-1 implementation in glib 2.16 is much faster than the
> one in our cipher.c.

This is a good compromise.  And, in fact, can probably be made "secure
enough", particularly as the actual hash values are not leaked outside
of Meebo.  It sounds to me like a good solution for 3.x is maybe to
allow for Pidgin's CAS to be offloaded to a UI-provided function.
This would let installations such as Meebo (with differing concerns
from single-instance Pidgin) take more complicated addressing measures.

We could quite possibly fruitfully make the specific hash functions we
use conditional on available library implementations.

> > (In reality, I submit that we *do* need a cryptographically secure hash
> >  function.  While this is not always the case for content-addressed
> >  storage, in our case we are storing objects which are provided by
> >  untrusted users from the network.  It is relatively easy to
> >  reverse-collide an Adler32 checksum, and possible for an individual
> >  to reverse-collide a DES "hashed" block.  This would mean that a
> >  malicious user with a timing advantage could pollute your icon store
> >  and prevent legitimate buddy icons from being fetched.  I sketch:
> >
> >  Mallory sees that Alice has changed her buddy icon, but Bob is not
> >  online.  Mallory computes an icon which collides with Alice's in the
> >  Pidgin buddy icon CAS.  When Bob comes online, Mallory rapidly sends
> >  Bob this new buddy icon information as his own buddy icon, storing it
> >  in Bob's CAS.  When Bob engages in conversation with Alice, Mallory's
> >  buddy icon is now shown for Alice.
> >
> >  The saving grace here is ... nobody cares that much about buddy
> >  icons.  Particularly now that Hulu is around.)
>
> Yeah, we thought about all of this and didn't deem it important enough
> for us to keep using sha-1.  As you point out it is relatively easy to
> create a chunk of data with an Adler32 checksum that matches another
> chunk of data, and so it would be relatively easy to make someone's
> icon stop appearing.  But it would be harder to replace someone's icon
> with a different image because you would have to create a chunk of
> data that 1. has the same checksum and 2. is a valid image.

... which, with a 24-bit color uncompressed bitmap, is likely a
reasonable thing to do.  Compressed images are, of course, another
matter.

> I'm hoping a few other devs weigh in, but it sounds like we want to
> stick with sha-1.  I'm not sure the performance advantage of md5 over
> sha1 justifies the code changes that would be required.  And the
> checksumming matters a lot less when you're running one instance on
> your personal computer than when you have a few thousand users on a
> single server.

Agreed, re: MD5.  Using a more optimized version of SHA, if available,
seems like a good idea, however.

Ethan

--
The laws that forbid the carrying of arms are laws [that have no remedy
for evils].  They disarm only those who are neither inclined nor
determined to commit crimes.
                -- Cesare Beccaria, "On Crimes and Punishments", 1764



Re: Using different hash algorithm in purple_util_get_image_checksum()

by Richard Laager

On Wed, 2009-07-01 at 01:29 -0700, Mark Doliner wrote:
> Vijay from Meebo tells me that he seems to
> remember the sha-1 implementation in glib 2.16 is much faster than the
> one in our cipher.c.

If this is true and it would be possible, it'd be nice if we just called
that from our cipher API when libpurple is built against glib >= 2.16.

Richard


Re: Using different hash algorithm in purple_util_get_image_checksum()

by Marcus Lundblad

On Tue, 2009-06-30 at 14:25 -0700, Mark Doliner wrote:

> The purple_util_get_image_checksum() function in libpurple/util.c
> currently uses SHA-1 to generate a checksum for a chunk of image data.
>  SHA-1 is a cryptographic hash function, which means it's hard for
> someone to engineer a chunk of data that matches a given hash.  It
> also means it's slow.
>
> Do we need to be using a cryptographic hash function here?  This hash
> function is one of the more expensive parts of libpurple.  I think
> it's called once for each buddy icon we receive.  Adler-32 is much
> faster when you're not concerned about security (it's maybe 8 times
> faster than SHA-1).  zlib contains an Adler-32 implementation.  I
> think GLib's g_string_hash() function is also pretty fast (but not as
> fast as Adler-32 when hashing image data).  I haven't really
> investigated what problems we would have switching hash functions... I
> think we would have to migrate or purge buddy icons from
> ~/.purple/icons/, because the icon filename is the hash.  And there
> might be other problems.
>
If we are going to change how hashing of icons is done, maybe we should
allow the hashing algorithm to be "encoded" into the resulting file
names in the icon cache?
This way, custom smileys could be stored in there as well. As it is
today, the function that adds a custom smiley to a conversation takes
two arguments, "type" and "hash" (going from memory). It returns a
boolean telling whether the conversation needs the data. Currently it
will return TRUE. I suppose this could be enhanced to take advantage of
a hash type-aware cache.
Currently the type is set to "sha1" in the MSN prpl. In XMPP I have
taken the CID value from the BoB object and set the type to "cid",
although the recommended construction of CIDs in XEP-0231 is of the
form algo+hash@....
So potentially it could parse out the hash type and hash value if the
CID is of this form, maybe.
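
One way the hash type could be pulled out of such an identifier is sketched below. It assumes only the `algo+hash@domain` shape mentioned above; `parse_cid` and its buffer handling are hypothetical and not existing libpurple API.

```c
#include <stddef.h>
#include <string.h>

/* Splits "algo+hash@domain" into its algo and hash parts.
 * Returns 1 on success, 0 if the string does not have the expected
 * shape or an output buffer is too small. */
int parse_cid(const char *cid, char *algo, size_t algo_size,
              char *hash, size_t hash_size)
{
    const char *plus = strchr(cid, '+');
    const char *at;
    size_t algo_len, hash_len;

    if (plus == NULL)
        return 0;
    at = strchr(plus + 1, '@');
    if (at == NULL)
        return 0;

    algo_len = (size_t)(plus - cid);
    hash_len = (size_t)(at - (plus + 1));

    /* Reject empty parts, an '@' before the '+', and oversized parts. */
    if (algo_len == 0 || hash_len == 0)
        return 0;
    if (memchr(cid, '@', algo_len) != NULL)
        return 0;
    if (algo_len >= algo_size || hash_len >= hash_size)
        return 0;

    memcpy(algo, cid, algo_len);
    algo[algo_len] = '\0';
    memcpy(hash, plus + 1, hash_len);
    hash[hash_len] = '\0';
    return 1;
}
```

Given a made-up input such as "sha1+8f35fef110ffc5df08d579a50083ff9308fb6242@example.org", this fills `algo` with "sha1" and `hash` with the 40 hex digits. A CID that is not in this form makes it return 0, so a caller could fall back to the opaque "cid" type.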
 
Though, I guess this would require quite extensive changes to
PurpleImgStore...

> But, uh, how do people feel about this change?
>
> -Mark

Re: Using different hash algorithm in purple_util_get_image_checksum()

by Kevin Stange

Marcus Lundblad wrote:
> If we are going to change how hashing of icons is done, maybe we should
> allow the hashing algorithm to be "encoded" into the resulting file
> names in the icon cache?

I might have misunderstood your meaning, but we already use the hash for
the file name of the image?  Are you suggesting constructing a file name
containing more than just the hash?  Based on a quick glance, this is
what I see now:

kevin@bashir icons % ls ~/.purple/icons | head -n 3
000d74ac5c14566643eb26547675e11b5bde0b99.gif
00287eb6c6c7cfbb7b1acf3715b6ff6f889ebf94.gif
0081d358e6395ad089b2a538d8d9f83896eb4c57.gif

Mallory, don't get any ideas!

Kevin




Re: Using different hash algorithm in purple_util_get_image_checksum()

by Paul Aurich-4

And Richard Laager spoke on 07/01/2009 10:41 PM, saying:
> On Wed, 2009-07-01 at 01:29 -0700, Mark Doliner wrote:
>> Vijay from Meebo tells me that he seems to
>> remember the sha-1 implementation in glib 2.16 is much faster than the
>> one in our cipher.c.
>
> If this is true and it would be possible, it'd be nice if we just called
> that from our cipher API when libpurple is built against glib >= 2.16.

I've attached a patch that will conditionally use Glib for MD5, SHA1, and
SHA256 when available. `make check` passes with all three and it seems to
work fine.

In a very simplistic benchmark, I loaded the largest image in my
.purple/icons/ (32KiB) and calculated the SHA1sum 100,000 times (and stuck
a g_str_equal() in there just to make sure gcc didn't get cute with me). On
my AMD Athlon 64 4400+, the cipher.c implementation took about 41 seconds
and glib's implementation took about 14 seconds.

Anyone have any problems with my checking this in? The only thing left to
do is deal with the issue in adding data to the checksum, where GChecksum
uses a gssize and we use a gsize for the length, which will be bad when I
decide I want to use Pidgin to calculate the checksum for my 3GB file
(after I dig up a 32-bit system).
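
One possible way to handle that gsize/gssize mismatch is to feed the data in pieces that each fit in the signed type. The sketch below uses plain C stand-ins so it doesn't depend on GLib: `signed_update_fn` plays the role of a `g_checksum_update()`-style function taking a signed length, and `max_chunk` would be `G_MAXSSIZE` in real code. All names here are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an update function that takes a signed length. */
typedef void (*signed_update_fn)(void *ctx, const unsigned char *data,
                                 ptrdiff_t len);

/* Feeds len bytes to update() in pieces of at most max_chunk bytes, so
 * the unsigned length never has to be squeezed into the signed type. */
void append_in_chunks(signed_update_fn update, void *ctx,
                      const unsigned char *data, size_t len,
                      size_t max_chunk)
{
    if (max_chunk == 0)
        return;
    if (max_chunk > (size_t)PTRDIFF_MAX)
        max_chunk = (size_t)PTRDIFF_MAX;

    while (len > 0) {
        const size_t n = len < max_chunk ? len : max_chunk;

        update(ctx, data, (ptrdiff_t)n);
        data += n;
        len -= n;
    }
}

/* Example callback: records how many bytes arrived in how many calls. */
struct byte_counter {
    size_t total;
    size_t calls;
};

void count_bytes(void *ctx, const unsigned char *data, ptrdiff_t len)
{
    struct byte_counter *counter = ctx;

    (void)data;
    counter->total += (size_t)len;
    counter->calls++;
}
```

With a 10-byte buffer and `max_chunk` set to 4, `count_bytes` sees three calls totalling 10 bytes. A streaming checksum gives the same digest however the input is split, so chunking should not change the result.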

~Paul

#
# old_revision [5668e3b116ab205b1a3270d6aff7fca0332bccc6]
#
# patch "libpurple/cipher.c"
#  from [2f4817fd21f86ce8ab710174ee2418da1591348a]
#    to [3ddc29acfb4c36d387e1947a46b6e701cc96a33c]
#
# patch "libpurple/tests/test_cipher.c"
#  from [eecb8040f3ea55a72301e71aded541240e22f0cf]
#    to [c0799fa73cfe6b1361a634cb5d5f4e74398afc1f]
#
============================================================
--- libpurple/cipher.c 2f4817fd21f86ce8ab710174ee2418da1591348a
+++ libpurple/cipher.c 3ddc29acfb4c36d387e1947a46b6e701cc96a33c
@@ -61,11 +61,138 @@
 #include "signals.h"
 #include "value.h"
 
+#if GLIB_CHECK_VERSION(2,16,0)
+static void
+purple_g_checksum_init(PurpleCipherContext *context, GChecksumType type)
+{
+ GChecksum *checksum;
+
+ checksum = g_checksum_new(type);
+ purple_cipher_context_set_data(context, checksum);
+}
+
+static void
+purple_g_checksum_reset(PurpleCipherContext *context, GChecksumType type)
+{
+ GChecksum *checksum;
+
+ checksum = purple_cipher_context_get_data(context);
+ g_return_if_fail(checksum != NULL);
+
+#if GLIB_CHECK_VERSION(2,18,0)
+ g_checksum_reset(checksum);
+#else
+ g_checksum_free(checksum);
+ checksum = g_checksum_new(type);
+ purple_cipher_context_set_data(context, checksum);
+#endif
+}
+
+static void
+purple_g_checksum_uninit(PurpleCipherContext *context)
+{
+ GChecksum *checksum;
+
+ checksum = purple_cipher_context_get_data(context);
+ g_return_if_fail(checksum != NULL);
+
+ g_checksum_free(checksum);
+}
+
+static void
+purple_g_checksum_append(PurpleCipherContext *context, const guchar *data,
+                         gsize len)
+{
+ GChecksum *checksum;
+
+ checksum = purple_cipher_context_get_data(context);
+ g_return_if_fail(checksum != NULL);
+
+ /* FIXME: Handle len being more than a gssize can handle */
+ g_checksum_update(checksum, data, len);
+}
+
+static gboolean
+purple_g_checksum_digest(PurpleCipherContext *context, GChecksumType type,
+                         gsize len, guchar *digest, gsize *out_len)
+{
+ GChecksum *checksum;
+ const gssize required_length = g_checksum_type_get_length(type);
+
+ checksum = purple_cipher_context_get_data(context);
+
+ g_return_val_if_fail(len >= required_length, FALSE);
+ g_return_val_if_fail(checksum != NULL, FALSE);
+
+ g_checksum_get_digest(checksum, digest, &len);
+
+ purple_cipher_context_reset(context, NULL);
+
+ if (out_len)
+ *out_len = len;
+
+ return TRUE;
+}
+#endif
+
+
 /*******************************************************************************
  * MD5
  ******************************************************************************/
 #define MD5_HMAC_BLOCK_SIZE 64
 
+static size_t
+md5_get_block_size(PurpleCipherContext *context)
+{
+ /* This does not change (in this case) */
+ return MD5_HMAC_BLOCK_SIZE;
+}
+
+#if GLIB_CHECK_VERSION(2,16,0)
+
+static void
+md5_init(PurpleCipherContext *context, void *extra)
+{
+ purple_g_checksum_init(context, G_CHECKSUM_MD5);
+}
+
+static void
+md5_reset(PurpleCipherContext *context, void *extra)
+{
+ purple_g_checksum_reset(context, G_CHECKSUM_MD5);
+}
+
+static gboolean
+md5_digest(PurpleCipherContext *context, gsize in_len, guchar digest[16],
+           size_t *out_len)
+{
+ return purple_g_checksum_digest(context, G_CHECKSUM_MD5, in_len,
+                                digest, out_len);
+}
+
+static PurpleCipherOps MD5Ops = {
+ NULL, /* Set Option */
+ NULL, /* Get Option */
+ md5_init, /* init */
+ md5_reset, /* reset */
+ purple_g_checksum_uninit, /* uninit */
+ NULL, /* set iv */
+ purple_g_checksum_append, /* append */
+ md5_digest, /* digest */
+ NULL, /* encrypt */
+ NULL, /* decrypt */
+ NULL, /* set salt */
+ NULL, /* get salt size */
+ NULL, /* set key */
+ NULL, /* get key size */
+ NULL, /* set batch mode */
+ NULL, /* get batch mode */
+ md5_get_block_size, /* get block size */
+ NULL /* set key with len */
+};
+
+#else /* GLIB_CHECK_VERSION(2,16,0) */
+
 struct MD5Context {
  guint32 total[2];
  guint32 state[4];
@@ -327,13 +454,6 @@ md5_digest(PurpleCipherContext *context,
  return TRUE;
 }
 
-static size_t
-md5_get_block_size(PurpleCipherContext *context)
-{
- /* This does not change (in this case) */
- return MD5_HMAC_BLOCK_SIZE;
-}
-
 static PurpleCipherOps MD5Ops = {
  NULL, /* Set option */
  NULL, /* Get option */
@@ -355,6 +475,8 @@ static PurpleCipherOps MD5Ops = {
  NULL /* set key with len */
 };
 
+#endif /* GLIB_CHECK_VERSION(2,16,0) */
+
 /*******************************************************************************
  * MD4
  ******************************************************************************/
@@ -1613,6 +1735,61 @@ static PurpleCipherOps DES3Ops = {
  * SHA-1
  ******************************************************************************/
 #define SHA1_HMAC_BLOCK_SIZE 64
+
+static size_t
+sha1_get_block_size(PurpleCipherContext *context)
+{
+ /* This does not change (in this case) */
+ return SHA1_HMAC_BLOCK_SIZE;
+}
+
+
+#if GLIB_CHECK_VERSION(2,16,0)
+
+static void
+sha1_init(PurpleCipherContext *context, void *extra)
+{
+ purple_g_checksum_init(context, G_CHECKSUM_SHA1);
+}
+
+static void
+sha1_reset(PurpleCipherContext *context, void *extra)
+{
+ purple_g_checksum_reset(context, G_CHECKSUM_SHA1);
+}
+
+static gboolean
+sha1_digest(PurpleCipherContext *context, gsize in_len, guchar digest[20],
+            gsize *out_len)
+{
+ return purple_g_checksum_digest(context, G_CHECKSUM_SHA1, in_len,
+                                digest, out_len);
+}
+
+static PurpleCipherOps SHA1Ops = {
+ NULL, /* Set Option */
+ NULL, /* Get Option */
+ sha1_init, /* init */
+ sha1_reset, /* reset */
+ purple_g_checksum_uninit, /* uninit */
+ NULL, /* set iv */
+ purple_g_checksum_append, /* append */
+ sha1_digest, /* digest */
+ NULL, /* encrypt */
+ NULL, /* decrypt */
+ NULL, /* set salt */
+ NULL, /* get salt size */
+ NULL, /* set key */
+ NULL, /* get key size */
+ NULL, /* set batch mode */
+ NULL, /* get batch mode */
+ sha1_get_block_size, /* get block size */
+ NULL /* set key with len */
+};
+
+#else /* GLIB_CHECK_VERSION(2,16,0) */
+
+#define SHA1_HMAC_BLOCK_SIZE 64
 #define SHA1_ROTL(X,n) ((((X) << (n)) | ((X) >> (32-(n)))) & 0xFFFFFFFF)
 
 struct SHA1Context {
@@ -1833,13 +2010,6 @@ sha1_digest(PurpleCipherContext *context
  return TRUE;
 }
 
-static size_t
-sha1_get_block_size(PurpleCipherContext *context)
-{
- /* This does not change (in this case) */
- return SHA1_HMAC_BLOCK_SIZE;
-}
-
 static PurpleCipherOps SHA1Ops = {
  sha1_set_opt, /* Set Option */
  sha1_get_opt, /* Get Option */
@@ -1861,10 +2031,65 @@ static PurpleCipherOps SHA1Ops = {
  NULL /* set key with len */
 };
 
+#endif /* GLIB_CHECK_VERSION(2,16,0) */
+
 /*******************************************************************************
  * SHA-256
  ******************************************************************************/
 #define SHA256_HMAC_BLOCK_SIZE 64
+
+static size_t
+sha256_get_block_size(PurpleCipherContext *context)
+{
+ /* This does not change (in this case) */
+ return SHA256_HMAC_BLOCK_SIZE;
+}
+
+#if GLIB_CHECK_VERSION(2,16,0)
+
+static void
+sha256_init(PurpleCipherContext *context, void *extra)
+{
+ purple_g_checksum_init(context, G_CHECKSUM_SHA256);
+}
+
+static void
+sha256_reset(PurpleCipherContext *context, void *extra)
+{
+ purple_g_checksum_reset(context, G_CHECKSUM_SHA256);
+}
+
+static gboolean
+sha256_digest(PurpleCipherContext *context, gsize in_len, guchar digest[32],
+              gsize *out_len)
+{
+ return purple_g_checksum_digest(context, G_CHECKSUM_SHA256, in_len,
+                                digest, out_len);
+}
+
+static PurpleCipherOps SHA256Ops = {
+ NULL, /* Set Option */
+ NULL, /* Get Option */
+ sha256_init, /* init */
+ sha256_reset, /* reset */
+ purple_g_checksum_uninit, /* uninit */
+ NULL, /* set iv */
+ purple_g_checksum_append, /* append */
+ sha256_digest, /* digest */
+ NULL, /* encrypt */
+ NULL, /* decrypt */
+ NULL, /* set salt */
+ NULL, /* get salt size */
+ NULL, /* set key */
+ NULL, /* get key size */
+ NULL, /* set batch mode */
+ NULL, /* get batch mode */
+ sha256_get_block_size, /* get block size */
+ NULL /* set key with len */
+};
+
+#else /* GLIB_CHECK_VERSION(2,16,0) */
+
 #define SHA256_ROTR(X,n) ((((X) >> (n)) | ((X) << (32-(n)))) & 0xFFFFFFFF)
 
 static const guint32 sha256_K[64] =
@@ -2088,13 +2313,6 @@ sha256_digest(PurpleCipherContext *conte
  return TRUE;
 }
 
-static size_t
-sha256_get_block_size(PurpleCipherContext *context)
-{
- /* This does not change (in this case) */
- return SHA256_HMAC_BLOCK_SIZE;
-}
-
 static PurpleCipherOps SHA256Ops = {
  sha256_set_opt, /* Set Option */
  sha256_get_opt, /* Get Option */
@@ -2116,6 +2334,8 @@ static PurpleCipherOps SHA256Ops = {
  NULL /* set key with len */
 };
 
+#endif /* GLIB_CHECK_VERSION(2,16,0) */
+
 /*******************************************************************************
  * RC4
  ******************************************************************************/
============================================================
--- libpurple/tests/test_cipher.c eecb8040f3ea55a72301e71aded541240e22f0cf
+++ libpurple/tests/test_cipher.c c0799fa73cfe6b1361a634cb5d5f4e74398afc1f
@@ -168,6 +168,11 @@ END_TEST
  purple_cipher_context_destroy(context); \
 }
 
+START_TEST(test_sha1_empty_string) {
+ SHA1_TEST("", "da39a3ee5e6b4b0d3255bfef95601890afd80709");
+}
+END_TEST
+
 START_TEST(test_sha1_a) {
  SHA1_TEST("a", "86f7e437faa5a7fce15d1ddcb9eaeaea377667b8");
 }
@@ -190,6 +195,66 @@ END_TEST
 END_TEST
 
 /******************************************************************************
+ * SHA-256 Tests
+ *****************************************************************************/
+#define SHA256_TEST(data, digest) { \
+ PurpleCipher *cipher = NULL; \
+ PurpleCipherContext *context = NULL; \
+ gchar cdigest[65]; \
+ gboolean ret = FALSE; \
+ \
+ cipher = purple_ciphers_find_cipher("sha256"); \
+ context = purple_cipher_context_new(cipher, NULL); \
+ \
+ if((data)) { \
+ purple_cipher_context_append(context, (guchar *)(data), strlen((data))); \
+ } else { \
+ gint j; \
+ guchar buff[1000]; \
+ \
+ memset(buff, 'a', 1000); \
+ \
+ for(j = 0; j < 1000; j++) \
+ purple_cipher_context_append(context, buff, 1000); \
+ } \
+ \
+ ret = purple_cipher_context_digest_to_str(context, sizeof(cdigest), cdigest, \
+                                        NULL); \
+ \
+ fail_unless(ret == TRUE, NULL); \
+ \
+ fail_unless(strcmp((digest), cdigest) == 0, NULL); \
+ \
+ purple_cipher_context_destroy(context); \
+}
+
+START_TEST(test_sha256_empty_string) {
+ SHA256_TEST("", "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855");
+}
+END_TEST
+
+START_TEST(test_sha256_a) {
+ SHA256_TEST("a", "ca978112ca1bbdcafac231b39a23dc4da786eff8147c4e72b9807785afee48bb");
+}
+END_TEST
+
+START_TEST(test_sha256_abc) {
+ SHA256_TEST("abc", "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad");
+}
+END_TEST
+
+START_TEST(test_sha256_abcd_gibberish) {
+ SHA256_TEST("abcdbcdecdefdefgefghfghighijhijkijkljklmklmnlmnomnopnopq",
+  "248d6a61d20638b8e5c026930c3e6039a33ce45964ff2167f6ecedd419db06c1");
+}
+END_TEST
+
+START_TEST(test_sha256_1000_as_1000_times) {
+ SHA256_TEST(NULL, "cdc76e5c9914fb9281a1c7e284d73e67f1809a48a497200e046d39ccc7112cd0");
+}
+END_TEST
+
+/******************************************************************************
  * DES Tests
  *****************************************************************************/
 #define DES_TEST(in, keyz, out, len) { \
@@ -726,12 +791,22 @@ cipher_suite(void) {
 
  /* sha1 tests */
  tc = tcase_create("SHA1");
+ tcase_add_test(tc, test_sha1_empty_string);
  tcase_add_test(tc, test_sha1_a);
  tcase_add_test(tc, test_sha1_abc);
  tcase_add_test(tc, test_sha1_abcd_gibberish);
  tcase_add_test(tc, test_sha1_1000_as_1000_times);
  suite_add_tcase(s, tc);
 
+ /* sha256 tests */
+ tc = tcase_create("SHA256");
+ tcase_add_test(tc, test_sha256_empty_string);
+ tcase_add_test(tc, test_sha256_a);
+ tcase_add_test(tc, test_sha256_abc);
+ tcase_add_test(tc, test_sha256_abcd_gibberish);
+ tcase_add_test(tc, test_sha256_1000_as_1000_times);
+ suite_add_tcase(s, tc);
+
  /* des tests */
  tc = tcase_create("DES");
  tcase_add_test(tc, test_des_12345678);


Re: Using different hash algorithm in purple_util_get_image_checksum()

by Elliott Sales de Andrade-2

On Sat, Jul 4, 2009 at 1:33 AM, Paul Aurich <paul@...> wrote:
> And Richard Laager spoke on 07/01/2009 10:41 PM, saying:
>> On Wed, 2009-07-01 at 01:29 -0700, Mark Doliner wrote:
>>> Vijay from Meebo tells me that he seems to
>>> remember the sha-1 implementation in glib 2.16 is much faster than the
>>> one in our cipher.c.
>>
>> If this is true and it would be possible, it'd be nice if we just called
>> that from our cipher API when libpurple is built against glib >= 2.16.
>
> I've attached a patch that will conditionally use Glib for MD5, SHA1, and
> SHA256 when available. `make check` passes with all three and it seems to
> work fine.

`make check` works for me as well. I can also log in to XMPP (MD5?) and MSN (SHA1 in HMAC).

> In a very simplistic benchmark, I loaded the largest image in my
> .purple/icons/ (32KiB) and calculated the SHA1sum 100,000 times (and stuck
> a g_str_equal() in there just to make sure gcc didn't get cute with me). On
> my AMD Athlon 64 4400+, the cipher.c implementation took about 41 seconds
> and glib's implementation took about 14 seconds.

Sounds great!

> Anyone have any problems with my checking this in? The only thing left to
> do is deal with the issue in adding data to the checksum, where GChecksum
> uses a gssize and we use a gsize for the length, which will be bad when I
> decide I want to use Pidgin to calculate the checksum for my 3GB file
> (after I dig up a 32-bit system).

I think this works (though I didn't try any 3GiB files):

    while (len >= G_MAXSSIZE)
    {
        g_checksum_update(checksum, data, G_MAXSSIZE);
        len -= G_MAXSSIZE;
        data += G_MAXSSIZE;
    }
    if (len)
        g_checksum_update(checksum, data, len);

> ~Paul
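
For anyone who wants to poke at the chunking loop without a 3GiB file or a 32-bit box, here is a minimal sketch of the same loop shape in plain C. Everything in it is a stand-in and not libpurple or GLib code: `CHUNK_MAX` plays the role of G_MAXSSIZE (shrunk to 4 bytes so the edge cases are easy to hit), and `struct sink` / `sink_update()` are hypothetical replacements for the GChecksum and g_checksum_update().

```c
#include <assert.h>
#include <stddef.h>

/* Tiny per-call limit so the loop is easy to exercise. In the snippet
 * from the thread this role is played by G_MAXSSIZE (assumption: this
 * constant is only an illustration, not a GLib value). */
#define CHUNK_MAX ((size_t)4)

/* Hypothetical sink standing in for a GChecksum: it records how many
 * calls and bytes it saw, plus a trivial byte sum. */
struct sink {
    size_t calls;
    size_t bytes;
    unsigned long sum;
};

/* Hypothetical stand-in for g_checksum_update(): refuses (via assert)
 * any single call that is larger than CHUNK_MAX. */
static void sink_update(struct sink *s, const unsigned char *data, size_t len)
{
    size_t i;

    assert(len <= CHUNK_MAX);
    for (i = 0; i < len; i++)
        s->sum += data[i];
    s->calls++;
    s->bytes += len;
}

/* Same shape as the loop above: full-size chunks first, then whatever
 * remainder is left (if any). */
static void feed_in_chunks(struct sink *s, const unsigned char *data, size_t len)
{
    while (len >= CHUNK_MAX) {
        sink_update(s, data, CHUNK_MAX);
        len -= CHUNK_MAX;
        data += CHUNK_MAX;
    }
    if (len)
        sink_update(s, data, len);
}
```

Calling `feed_in_chunks()` on a zero-initialised `struct sink` with a 10-byte buffer should produce calls of 4, 4 and 2 bytes; a length that is an exact multiple of `CHUNK_MAX` leaves no remainder call, and a zero length makes no calls at all.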


Re: Using different hash algorithm in purple_util_get_image_checksum()

by Mark Doliner

On Fri, Jul 3, 2009 at 10:33 PM, Paul Aurich<paul@...> wrote:

> And Richard Laager spoke on 07/01/2009 10:41 PM, saying:
>> On Wed, 2009-07-01 at 01:29 -0700, Mark Doliner wrote:
>>> Vijay from Meebo tells me that he seems to
>>> remember the sha-1 implementation in glib 2.16 is much faster than the
>>> one in our cipher.c.
>>
>> If this is true and it would be possible, it'd be nice if we just called
>> that from our cipher API when libpurple is built against glib >= 2.16.
>
> I've attached a patch that will conditionally use Glib for MD5, SHA1, and
> SHA256 when available. `make check` passes with all three and it seems to
> work fine.
>
> In a very simplistic benchmark, I loaded the largest image in my
> .purple/icons/ (32KiB) and calculated the SHA1sum 100,000 times (and stuck
> a g_str_equal() in there just to make sure gcc didn't get cute with me). On
> my AMD Athlon 64 4400+, the cipher.c implementation took about 41 seconds
> and glib's implementation took about 14 seconds.
>
> Anyone have any problems with my checking this in? The only thing left to
> do is deal with the issue in adding data to the checksum, where GChecksum
> uses a gssize and we use a gsize for the length, which will be bad when I
> decide I want to use Pidgin to calculate the checksum for my 3GB file
> (after I dig up a 32-bit system).

Looks great to me.  I have no problems with you checking this in.  It
would be nice to be able to use a gsize for file size, although we
should probably be using a 64-bit type for that.  At some point in the
distant future we'll probably make file transfer work on more
protocols, and by the time that happens it might be common for people
to want to transfer 5GB files :-P

-Mark
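
To put numbers on the size-type concern, here is a rough sketch in plain C. It assumes a 32-bit build where gssize is a 32-bit signed type and gsize a 32-bit unsigned one (the usual case, but still an assumption), and reads "GB" as 10^9 bytes. The helper names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* File sizes mentioned in the thread, taking "GB" as 10^9 bytes. */
#define THREE_GB UINT64_C(3000000000)
#define FIVE_GB  UINT64_C(5000000000)

/* Would this byte count fit in a 32-bit signed length
 * (assumed to model gssize on a 32-bit system)? */
static int fits_in_signed_32(uint64_t n)
{
    return n <= (uint64_t)INT32_MAX;    /* 2,147,483,647 */
}

/* Would this byte count fit in a 32-bit unsigned length
 * (assumed to model gsize on a 32-bit system)? */
static int fits_in_unsigned_32(uint64_t n)
{
    return n <= (uint64_t)UINT32_MAX;   /* 4,294,967,295 */
}

/* The two cases from the thread, written as checks. */
static void check_thread_examples(void)
{
    /* 3GB: too big for the signed length, fine for the unsigned one. */
    assert(!fits_in_signed_32(THREE_GB));
    assert(fits_in_unsigned_32(THREE_GB));

    /* 5GB: too big for either 32-bit type, so it needs 64 bits. */
    assert(!fits_in_signed_32(FIVE_GB));
    assert(!fits_in_unsigned_32(FIVE_GB));
}
```

So under those assumptions the chunked update covers the 3GB case, while a 5GB file cannot even be described by a 32-bit unsigned length, which is the argument for a 64-bit type.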


Re: Using different hash algorithm in purple_util_get_image_checksum()

by datallah

On Sun, Jul 5, 2009 at 11:23 PM, Mark Doliner<mark@...> wrote:
> Looks great to me.  I have no problems with you checking this in.  It
> would be nice to be able to use a gsize for file size, although we
> should probably be using a 64-bit type for that.  At some point in the
> distant future we'll probably make file transfer work on more
> protocols, and by the time that happens it might be common for people
> to want to transfer 5GB files :-P

There is a 3.0.0 ticket related to this already (but it needs to be
retargeted to libpurple instead of bonjour):
http://developer.pidgin.im/ticket/8477

-D
