Integrating ICU with a plugin?

by David Dancy

Hi experts

Does anyone have any comments/recommendations/advice about integrating
the ICU system into a 4D plugin?

On Windows (Vista + 7) I've managed to

* download & compile the ICU project (currently 4.2; I believe 4D v11 uses 3.8)
* reference it in a plugin (#include various files)
* use the UnicodeString class in the plugin to convert ANSI text to
Unicode and back again with the default system code page

I've followed the recommendation of setting the bin directory in the
ICU project in my PATH environment variable, so I assume that's what
enables the plugin to find the routines it uses in the various ICU DLL
files.

I've also noticed that the DLLs are quite big. The translation DLL is
over 15MB, and some of the code DLLs are over 1MB. If I include all of
them next to my plugin's .4DX file, will that mean I don't need the
PATH variable to be set any more?

Is the process much harder for MacOS? I think I remember seeing that
some aspects of text handling in MacOS are actually done using ICU, so
would I still need to include the ICU build with my plugin? I know
hardly anything about how MacOS uses libraries, so some simple steps
to follow (even if they are sub-optimal) would be very welcome.

Also, how compatible is the data stored in a PA_Unistring struct with
the ICU UnicodeString class? ICU seems to imply that std::wstring (at
least on Windows) is binary-compatible with its UnicodeString class,
and that's been borne out by my (admittedly limited) experience. I assume
that, since a PA_Unistring simply stores UTF-16 and a UnicodeString
does too, you can just copy data from one to the other and back. Is
this correct?

Thanks heaps for any and all contributions.

TIA

David Dancy
Sydney, Australia
**********************************************************************
4D Plugins hosted by 4D, Inc.                      http://www.4D.com/

    Get the speed and power of 4D v11 SQL
    before upgrade prices increase - http://www.4d.com

To Unsubscribe:                      mailto:4D-Plugins-off@...
***********************************************************************


Re: Integrating ICU with a plugin?

by MIYAKO

Hi,

some time ago I posted on the Forums
a plugin that uses the OpenSSL framework embedded in 4D,
cross platform.

perhaps the same can be done for ICU as well?
haven't tried yet, but seems worth exploring...

wchar_t* is UTF-16 on Windows but UTF-32 on Mac,
so I guess it's better to work with PA_Unistring all along.
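
To make the width difference concrete, here is a minimal sketch of normalising a std::wstring to UTF-16 code units whatever the size of wchar_t. The helper name to_utf16 is hypothetical, the code assumes a C++11 compiler (for char16_t and std::u16string), and it expects well-formed input rather than validating it.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: convert a wide string to UTF-16 code units.
// Where wchar_t is 16 bits (Windows) the units are copied unchanged;
// where it is 32 bits (Mac, Linux) each code point above U+FFFF is
// split into a surrogate pair. Input is assumed to be well-formed.
std::u16string to_utf16(const std::wstring& wide) {
    std::u16string out;
    out.reserve(wide.size());
    for (wchar_t wc : wide) {
        const unsigned long cp = static_cast<unsigned long>(wc);
        if (sizeof(wchar_t) == 2 || cp < 0x10000UL) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            assert(cp <= 0x10FFFFUL);  // anything larger is not Unicode
            const unsigned long v = cp - 0x10000UL;
            out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));
        }
    }
    return out;
}
```

With something like this at the boundary, the rest of the code can hold text as 16-bit units on both platforms, which is the same layout the thread describes for PA_Unistring.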

is there a particular API in the ICU you want to call?

miyako

On 2009/11/06, at 10:45, David Dancy wrote:

> Hi experts
>
> Does anyone have any comments/recommendations/advice about integrating
> the ICU system into a 4D plugin?


Re: Integrating ICU with a plugin?

by David Dancy

SSL Plugin: I'd be interested in looking at that to see how it's done
(I searched the forums briefly for "ssl" and couldn't find it).

ICU Strings: the rationale here is that I want to keep as much code as
possible identical for Mac and Win versions of each plugin that I
write. I'm also keen to support 4D v2004 (and by extension v2003),
which know nothing about Unicode. In summary, I want one code base
(within reason) to provide support for

* ANSI and UTF-16 (Win)
* MacOS native text format (not sure what that is - is it UTF-8?)
* 4D v2004 and earlier
* 4D v11 and later

I want to work in C++ rather than in C. To this end I've created a C++
Plugin API that sits on top of the C Plugin API provided by 4D. It's
not very sophisticated, but it allows me to create simple plugins very
quickly, especially on Windows.

I'd like now to extend my framework so it works on MacOS as well as
Windows. To date I have used the std:: library for strings, vectors,
maps, and such; and the boost:: library for shared pointers. These
have made life pretty easy when working with strings especially.

I've used std::string for ANSI text and std::wstring for Unicode
UTF-16 in my code up to now in the hope (which now seems forlorn)
that these will shield me from platform-specific nasties and enable me
to have one code base that adapts itself to each platform. I don't
really want to have to create (yet another) string class to do this
for me, but if I have to, I will.

That's where I was hoping that UnicodeString would come in. I was
hoping it would be the cross-platform class I'm looking for, that can
abstract away all the specifics of Unicode strings on each platform.
Ideally it would be compatible with the Windows native API (i.e. be
able to produce a null-terminated UTF-16 string on demand) and also
the MacOS API (i.e. be able to produce a CFString or whatever the API
requires on demand).

What I have now goes something like this:

4D calls my PluginMain in my plugin shell. The InitPlugin and
DeinitPlugin methods create and delete an instance of the plugin's
class. The constructor of the plugin class initialises its data
structures, and its destructor gets rid of them.
The plugin class itself instantiates a secondary class which is the
"worker" class for the plugin. In the worker class I put
platform-specific code.

The plugin class is therefore responsible for getting data out of 4D's
parameter blocks, translating it into whatever format the worker class
needs, getting results back from the worker, and translating them back
to what 4D needs. My plugin framework abstracts a lot of those
details.

The PluginMain method, the plugin class and the worker class are all
generated from a customised version of the 4D Plugin Wizard.
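
The split described above might look roughly like the following. This is only a sketch under stated assumptions: the class names Plugin and Worker, the text_length command, and the parameterless InitPlugin/DeinitPlugin signatures are stand-ins for illustration, std::unique_ptr (C++11) stands in for the boost shared pointers mentioned earlier, and a counted UTF-16 buffer stands in for whatever the 4D parameter block actually provides.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Platform-specific work lives here; it only ever sees neutral types.
class Worker {
public:
    // Hypothetical command: returns the number of UTF-16 code units.
    std::size_t length(const std::u16string& text) const {
        return text.size();
    }
};

// Owns the worker and translates between host data and neutral types.
class Plugin {
public:
    Plugin() : worker_(new Worker) {}

    // Stand-in for unpacking a parameter block: takes a counted buffer,
    // converts it to the neutral string type, and delegates to the worker.
    std::size_t text_length(const char16_t* chars, std::size_t count) const {
        assert(chars != nullptr || count == 0);
        return worker_->length(std::u16string(chars, count));
    }

private:
    std::unique_ptr<Worker> worker_;
};

// One plugin instance, created and destroyed by the shell's entry points.
static std::unique_ptr<Plugin> g_plugin;

void InitPlugin() { g_plugin.reset(new Plugin); }
void DeinitPlugin() { g_plugin.reset(); }
```

Keeping all host-specific types on the Plugin side of the boundary is what lets the same Worker interface serve both a Unicode and a non-Unicode host.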

I really don't want to have to code two separate interfaces between my
plugin shell and the working code unless I really have to. Of course
the specifics will depend heavily on the functionality of the plugin,
but in general I'd like to have as small a code base as possible. I
hate duplicated code!

* The 4D v2004 requirement means I can't pass PA_Unistring variables
between the plugin class and the worker class.
* The MacOS requirement seems to mean that I shouldn't use
std::wstring when I'm working in Unicode (it's wasteful of space,
because its underlying wchar_t type is 32 bits; and it's potentially
incompatible with the MacOS strings API).
* I can use the Windows-native ::MultiByteToWideChar and
::WideCharToMultiByte converters to get to Unicode and back, but they
don't produce identical results to those given by 4D (since it uses
ICU internally, and Windows does its own thing).
* From what I understand, MacOS uses aspects of ICU internally, so its
native strings API should get closer to 4D's results.

What to do? Maybe a wrapper class that uses std::wstring on Windows
and CFString on MacOS for its string storage, but that provides std::
library-compatible functions to manipulate them? That I think would
mean the fewest changes to my existing plugin code base. But it would
be a ton of work. Hence the thrashing about to see if UnicodeString
has the required functionality. On Windows, at least, it appears to;
but then I have concerns about the size of the text manipulation code
(possibly never used) that I have to include with the plugin to make
it work.
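
For a sense of scale, a wrapper of the kind contemplated above could start as small as this. It is a sketch, not a finished design: the class name UniText is hypothetical, only a handful of std::-style members are shown, and it assumes UTF-16 storage in a C++11 std::u16string on both platforms rather than std::wstring on one and CFString on the other.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical wrapper: UTF-16 storage behind a std::-style interface.
class UniText {
public:
    UniText() {}

    // Build from a counted run of UTF-16 code units.
    UniText(const char16_t* units, std::size_t count) {
        assert(units != nullptr || count == 0);
        if (count != 0) units_.assign(units, count);
    }

    std::size_t size() const { return units_.size(); }
    bool empty() const { return units_.empty(); }

    UniText& append(const UniText& other) {
        units_.append(other.units_);
        return *this;
    }

    // Position of the first occurrence, or std::u16string::npos.
    std::size_t find(const UniText& needle) const {
        return units_.find(needle.units_);
    }

    // Null-terminated UTF-16 code units for handing to a platform layer.
    const char16_t* c_str() const { return units_.c_str(); }

private:
    std::u16string units_;
};
```

On the Mac side, a platform layer could pass c_str() and size() to CFStringCreateWithCharacters when a CFString is needed, so the storage itself would not have to differ per platform.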

Cheers


David Dancy
Sydney, Australia





Re: Integrating ICU with a plugin?

by aparajita

> Does anyone have any comments/recommendations/advice about integrating
> the ICU system into a 4D plugin?

If you want to use Unicode, it's really the only way to go.


> I've also noticed that the DLLs are quite big. The translation DLL is
> over 15MB, and some of the code DLLs are over 1MB.

No way around that unfortunately. The ICU data files add about 25MB to  
4D's size, because 4D includes both little endian and big endian data.


> If I include all of
> them next to my plugin's .4DX file, will that mean I don't need the
> PATH variable to be set any more?

In 4D 2004, the DLLs will have to be installed in the System32  
directory on Windows. When an *application* loads in Windows, any DLLs  
in the same directory are accessible. But a plugin is itself a DLL  
loaded dynamically by the 4D application. If the working directory is  
set to your plugin, then the DLLs can be put next to it. I  
successfully pushed 4D into implementing that in v11.5 so that I could  
put the ICU DLLs in my plugin bundle. I don't know if you will have  
any success getting them to do that for 4D 2004.
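
One way to avoid relying on PATH, System32, or the working directory is for the plugin to load the ICU DLLs itself by full path. The sketch below assumes ICU is loaded explicitly or delay-loaded (DLLs linked through an import library are resolved before any plugin code runs, so this would come too late for them). The function names sibling_path and load_sibling_dll, and the file name icuuc42.dll used in the note afterwards, are illustrative assumptions.

```cpp
#include <cassert>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Replace the file name in `module_path` with `dll_name`.
// Pure string work, so it can be exercised on any platform.
std::wstring sibling_path(const std::wstring& module_path,
                          const std::wstring& dll_name) {
    const std::wstring::size_type slash = module_path.find_last_of(L"\\/");
    assert(slash != std::wstring::npos);  // expects a full path
    return module_path.substr(0, slash + 1) + dll_name;
}

#ifdef _WIN32
// Load a DLL that sits next to the module containing this function.
// Returns NULL on failure.
HMODULE load_sibling_dll(const wchar_t* dll_name) {
    HMODULE self = NULL;
    if (!GetModuleHandleExW(GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS |
                                GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT,
                            reinterpret_cast<LPCWSTR>(&load_sibling_dll),
                            &self)) {
        return NULL;
    }
    wchar_t path[MAX_PATH];
    const DWORD len = GetModuleFileNameW(self, path, MAX_PATH);
    if (len == 0 || len == MAX_PATH) return NULL;  // failed or truncated
    const std::wstring full = sibling_path(std::wstring(path, len), dll_name);
    // The altered search path lets the loaded DLL's own dependencies
    // be found in its directory as well.
    return LoadLibraryExW(full.c_str(), NULL, LOAD_WITH_ALTERED_SEARCH_PATH);
}
#endif
```

A plugin might call load_sibling_dll(L"icuuc42.dll") once during initialisation and treat a NULL result as "ICU not available".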


> Is the process much harder for MacOS?

Much easier. I built ICU as a framework (which is basically a special  
form of DLL that allows you to package code and data together) in such  
a way that it can be found by the OS when the plugin is loaded. I  
probably could be persuaded to make that Xcode project available.


> Also, how compatible is the data stored in a PA_Unistring struct with
> the ICU UnicodeString class?

PA_Unistring stores null-terminated UTF-16. UnicodeString stores UTF-16
that is not null-terminated. You can construct a UnicodeString from a
PA_Unistring as follows:

UnicodeString mystr(unistring.fString, unistring.fLength);

If you can count on the source PA_Unistring being available for the  
life of the UnicodeString, you can create a read-only alias of it.

UnicodeString mystr(TRUE, unistring.fString, unistring.fLength);

Or if you have an existing UnicodeString,

mystr.setTo(unistring.fString, unistring.fLength);
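
The reason for passing the length explicitly, rather than relying on the terminator, can be shown without ICU. The following sketch uses standard-library types only: CountedText is a hypothetical stand-in that mirrors just the fLength and fString fields used above (the real struct comes from the 4D Plugin API headers), and it is an assumption here that fLength counts code units without the terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the two PA_Unistring fields used above.
struct CountedText {
    long fLength;             // assumed: UTF-16 code units, terminator excluded
    const char16_t* fString;  // null-terminated UTF-16
};

// Copy by explicit length, in the spirit of the first constructor call:
// every counted unit is kept, including any embedded zero.
std::u16string copy_counted(const CountedText& src) {
    assert(src.fLength >= 0);
    assert(src.fString != nullptr || src.fLength == 0);
    if (src.fLength == 0) return std::u16string();
    return std::u16string(src.fString,
                          static_cast<std::size_t>(src.fLength));
}

// Copy up to the first terminator only, ignoring the stored length.
std::u16string copy_terminated(const CountedText& src) {
    assert(src.fString != nullptr);
    return std::u16string(src.fString);
}
```

For ordinary text the two give the same result; they differ only when the counted data contains a zero code unit, in which case the terminator-based copy stops early.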

Kind regards,

    Aparajita
    www.aparajitaworld.com

    "If you dare to fail, you are bound to succeed."
    - Sri Chinmoy   |   www.srichinmoy.org



Re: Integrating ICU with a plugin?

by David Dancy

Aparajita

Thanks very much - all that info is really helpful. As you're no doubt
aware, 4D is publishing the Plugin API as an open source project.
Would you be willing to contribute your ICU package Xcode project
there? I'm well aware of the effort involved in achieving something
like that, so if you're not inclined, no problem; it's really
something that 4D could usefully do themselves to help us all, and it
would also be nice coming from them as an indication that they are
serious about plugins.

What you've shared here is a great leg-up and it will help enormously
with my own learning process.

Cheers

David Dancy
Sydney, Australia



2009/11/6 Aparajita Fishman <aparajita@...>:

>> Does anyone have any comments/recommendations/advice about integrating
>> the ICU system into a 4D plugin?
>
> If you want to use Unicode, it's really the only way to go.


Re: Integrating ICU with a plugin?

by MIYAKO

Hi,

apparently you have to search for "openssl"
http://forums.4d.fr/Post/FR/3014839/1/3032827#3032827

you can also find the same project in this week's Tech Note.

miyako

On 2009/11/06, at 13:07, David Dancy wrote:

> SSL Plugin: I'd be interested in looking at that to see how it's done
> (had a brief look with "ssl" on the forums and couldn't find it).


Re: Integrating ICU with a plugin?

by aparajita

> Would you be willing to contribute your ICU package Xcode project
> there?

I'll send it to you privately first; if you can get it working, feel
free to contribute it.


> it's really
> something that 4D could usefully do themselves to help us all

Along with better documentation, bug fixes, etc...  :-)

In general 4D has been very helpful. I asked them long ago to give me  
the Xcode project they used to build ICU as a framework. They  
basically said, "Do it yourself." You win some, you lose some.

Kind regards,

    Aparajita
    www.aparajitaworld.com

    "If you dare to fail, you are bound to succeed."
    - Sri Chinmoy   |   www.srichinmoy.org
