Re: Accepting curly quotes from McWord in textareas

Re: Accepting curly quotes from McWord in textareas

by t vainisi

Hi List,

The problem of invalid characters from MS Word has plagued me for
years.  Because of the nature of our web app, users are very apt to
try to copy and paste from Word.  It's not just curly quotes, it's math
symbols (such as greater than or equal to, or pretty much anything you
might find in an algebraic equation).

I've used the test source code that Bil was kind enough to offer in  
this thread (the emdash example) and everything I've pasted into the  
textarea comes back fine.

I built a new MySQL db using phpMyAdmin.  It says the database is
using utf8_unicode_ci collation, and the only table inside also
reports utf8_unicode_ci.  The varchar field in the table is also
labeled as utf8_unicode_ci collation.

In Lasso 8.5 SiteAdmin, I used the "Table Batch Change" to ensure the
UTF-8 encoding.  I then verified in the table listing - table
detail that the encoding is UTF-8 (Unicode).  Lasso is running on a Unix
box (CentOS 5, I believe), as is the MySQL db.

I added inlines to Bil's page, and immediately the output contains  
nasty boxes instead of the proper characters.  Here is the altered  
source:

[
lp_header_serveHTML('UTF-8');
var('text') = action_param('text')->trim&->substring(1,10);

if($text->Size);
var('sql')="Insert into wordchars (thestring) values ('" +  
(Encode_SQL:$text) + "');";
inline(-database="curricul_wordchars", -table="wordchars", -sql=($sql));
/inline;

/if;

var('emdash') = bytes;
$emdash->import8bits(226);
$emdash->import8bits(128);
$emdash->import8bits(148);

]
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Charset Demo</title>
</head>
<body style="background: white;">
<form method="post">
Paste some word chars like this dash "[$emdash]"<br><textarea  
name="text"></textarea>
<input type="submit" name="submit" value="submit">
</form>
[if($text->size);
'<hr>You pasted:<br><br>'+encode_html($text)+'<br>';
lp_string_radixprint($text);
var('sql')="Select * from wordchars order by createdate desc;";
inline(-database="curricul_wordchars", -table="wordchars", -sql=($sql));
"<hr>The db output:<br><br>"+(encode_html(field('thestring')))+"<br>";
/inline;
/if]
</body>
</html>


The pre-db output is still fine.  It's only the
encode_html(field('thestring')) output that has the bad chars.  Where is
my missing step?


Todd V


--
This list is a free service of LassoSoft: http://www.LassoSoft.com/
Search the list archives: http://www.ListSearch.com/Lasso/Browse/
Manage your subscription: http://www.ListSearch.com/Lasso/



Re: Accepting curly quotes from McWord in textareas

by t vainisi

Hi List,

After quite a bit more exploration, I discovered that if I remove the
Encode_SQL formatting when inserting into the db, suddenly all the
characters come back out of the db perfectly.  The problem then was
vulnerability to SQL injection attacks.  So, I ditched my inline
which used a SQL statement:

var('sql')="Insert into wordchars (thestring) values ('" +  
(Encode_SQL: $text) + "');";
inline(-database="curricul_wordchars", -table="wordchars", -sql=($sql));

and used a Lasso command inline (I don't know what you call it) like so:

inline(-database="curricul_wordchars", -table="wordchars",
-keyfield="id", "thestring"=($text), -Add);

And it works great!

So, I guess my question now is: what is the difference?  What is Lasso
doing to the string to prevent the SQL injection, and can I do that in
a self-written SQL command?  Any ideas?

Todd V





Re: Accepting curly quotes from McWord in textareas

by bilcorry

Todd Vainisi wrote on 7/9/2009 9:55 AM:
> The pre-db output is still fine.  Its only the
> encode_html(field('thestring')) output that has the bad chars.  Where is
> my missing step?

You didn't mention if the character is being correctly stored in the db -- so it could be it's corrupted on the way to the db, or it could be it's corrupted on the way back from the db, or both.

So take a look at the db using a tool other than Lasso: does it look correct?  If you manually enter the correct char into the db using a tool other than Lasso, does Lasso then display it properly?

One thing to try first: change your insert inline to this and see whether it then works:

        if($text->Size);
        inline(-database="curricul_wordchars", -table="wordchars", 'thestring'=$text, -add);
                if(error_code != 0);
                        'Error: ' + error_msg;
                        abort;
                /if;
        /inline;
        /if;
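
When checking the db with another tool, it can help to know what the em dash from the demo page should look like at the byte level. The sketch below is in Python rather than Lasso, and it only illustrates one common failure mode (UTF-8 bytes read back as Windows-1252). The cp1252 charset is an assumption for illustration, not a diagnosis of what is happening in this setup.

```python
# The same three byte values the demo page imports with import8bits.
emdash_bytes = bytes([226, 128, 148])

# Read as UTF-8, the three bytes form one character: U+2014 EM DASH.
emdash = emdash_bytes.decode("utf-8")

# Read as Windows-1252 (assumed wrong charset), each byte becomes
# its own character, so one dash turns into three junk characters.
mangled = emdash_bytes.decode("cp1252")

# If that junk is then stored as UTF-8, the value grows from 3 to 8 bytes.
double_encoded = mangled.encode("utf-8")

print("correct bytes: ", emdash_bytes.hex(), "chars:", len(emdash))
print("mangled chars: ", len(mangled))
print("double-encoded:", double_encoded.hex(), "bytes:", len(double_encoded))
```

Under that assumption, a stored value of exactly e2 80 94 would suggest the insert was fine and the damage happens on the way back out, while the longer sequence would suggest the text was mangled before it reached the table.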



- Bil




Re: Accepting curly quotes from McWord in textareas

by bilcorry

Todd Vainisi wrote on 7/9/2009 2:17 PM:
> After quite a bit more exploration, I discovered that if I remove the
> Encode_SQL formating when inserting into the db, suddenly all the
> characters come back out of the db perfect.

I knew there were issues with extended chars in SQL inlines, but I never knew they were due to encode_sql.  Maybe that's enough information for LassoSoft to finally track down and fix the issue.


- Bil
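
On the open question of doing the same thing in a self-written SQL command: this thread does not say how Lasso builds the statement for an -Add inline. The general technique most database APIs offer is to pass the value as a bound parameter instead of splicing it into the SQL string. The sketch below shows that idea in Python with the standard-library sqlite3 module; the in-memory SQLite database stands in for MySQL and the sample string is made up, so treat both as assumptions.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL db (assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wordchars (id INTEGER PRIMARY KEY, thestring TEXT)"
)

# Made-up input: a curly apostrophe, an em dash, a >= sign,
# and a classic injection attempt tacked on the end.
text = "O\u2019Reilly \u2014 x \u2265 2'); DROP TABLE wordchars; --"

# The ? placeholder keeps the value out of the SQL text entirely,
# so it needs no escaping and cannot change the statement.
conn.execute("INSERT INTO wordchars (thestring) VALUES (?)", (text,))
conn.commit()

stored = conn.execute("SELECT thestring FROM wordchars").fetchone()[0]
row_count = conn.execute("SELECT COUNT(*) FROM wordchars").fetchone()[0]

print("round trip intact:", stored == text)
print("rows in table:", row_count)
```

Because the value never passes through a string-escaping step, the non-ASCII characters are not rewritten on the way in, which is consistent with the observation that dropping Encode_SQL fixed the output.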