
Re: Accepting curly quotes from McWord in textareas

by t vainisi

Hi List,

The problem of invalid characters from MS Word has plagued me for
years. Because of the nature of our web app, users are very apt to
copy and paste from Word. It's not just curly quotes, it's math
symbols too (such as greater-than-or-equal-to, or pretty much anything
you might find in an algebraic equation).

I've used the test source code that Bil was kind enough to offer in  
this thread (the emdash example) and everything I've pasted into the  
textarea comes back fine.

I built a new MySQL db using phpMyAdmin. It reports that the database
uses utf8_unicode_ci collation, and the only table inside also
reports utf8_unicode_ci. The varchar field in the table is labeled
utf8_unicode_ci as well.

In Lasso 8.5 SiteAdmin, I used the "Table Batch Change" to ensure
UTF-8 encoding, then verified under table listing - table detail that
the encoding is UTF-8 (Unicode). Lasso is running on a Unix box
(CentOS 5, I believe), as is the MySQL db.

I added inlines to Bil's page, and immediately the output contains  
nasty boxes instead of the proper characters.  Here is the altered  
source:

[
lp_header_serveHTML('UTF-8');
var('text') = action_param('text')->trim&->substring(1,10);

if($text->Size);
var('sql')="Insert into wordchars (thestring) values ('" +  
(Encode_SQL:$text) + "');";
inline(-database="curricul_wordchars", -table="wordchars", -sql=($sql));
/inline;

/if;

/* Build an em dash (U+2014) from its three UTF-8 bytes: 226 128 148 (hex E2 80 94) */
var('emdash') = bytes;
$emdash->import8bits(226);
$emdash->import8bits(128);
$emdash->import8bits(148);

]
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Charset Demo</title>
</head>
<body style="background: white;">
<form method="post">
Paste some word chars like this dash "[$emdash]"<br><textarea  
name="text"></textarea>
<input type="submit" name="submit" value="submit">
</form>
[if($text->size);
'<hr>You pasted:<br><br>'+encode_html($text)+'<br>';
lp_string_radixprint($text);
var('sql')="Select * from wordchars order by createdate desc;";
inline(-database="curricul_wordchars", -table="wordchars", -sql=($sql));
"<hr>The db output:<br><br>"+(encode_html(field('thestring')))+"<br>";
/inline;
/if]
</body>
</html>


The pre-db output is still fine. It's only the
encode_html(field('thestring')) output that has the bad chars. What
step am I missing?


Todd V


--
This list is a free service of LassoSoft: http://www.LassoSoft.com/
Search the list archives: http://www.ListSearch.com/Lasso/Browse/
Manage your subscription: http://www.ListSearch.com/Lasso/

