"ocaml_beginners"::[] De-unifying variant types

by cultural_sublimation

Hi,

Here's the situation:  I am building a small module to encapsulate the
nasty type-unsafe task of extracting data stored in a Postgresql database.
(I am using the postgresql-ocaml bindings by Markus Mottl to connect
to Postgresql).

For simplicity's sake, let us assume there are only two tables in
the database: "movies" (containing 3 fields: title, year, and B&W,
respectively of types text, integer, and boolean), and "actors"
(containing 2 fields: name and age, of text and integer types).

Now, as far as I can tell, regardless of the Postgresql datatypes, any
row resulting from a SELECT statement will be seen by the OCaml program
as a "string array".  Therefore, I have made two simple functions that
convert the "string array" into proper tuples of movies and actors:


  type movie_t = string * int * bool
  type actor_t = string * int
  type query_result = Movie of movie_t | Actor of actor_t

  let array_to_movie a =
      Movie (a.(0), int_of_string a.(1), bool_of_string a.(2))
  let array_to_actor a = Actor (a.(0), int_of_string a.(1))


movie_t and actor_t were unified under the "query_result" variant
because there is a "process_query" function that should accept
results based on either movies or actors:


  let process_query array_converter converter_template result =
      let size = Array.length result in
      let converted = Array.make size converter_template in
      for i = 0 to size-1 do
          converted.(i) <- array_converter (result.(i))
      done;
      converted


Now, all of these functions are fairly low-level and are not meant to
be used directly by the users of the module.  Instead, the user should
handle all tasks via two other functions, process_movies and
process_actors, as follows:


  let process_movies result =
      let converter_template = Movie ("", 0, false) in
      process_query array_to_movie converter_template result

  let process_actors result =
      let converter_template = Actor ("", 0) in
      process_query array_to_actor converter_template result


This is all fine and dandy, except that because of the unification,
the signature of both process_movies and process_actors is the same:


  val process_movies : string array array -> query_result array
  val process_actors : string array array -> query_result array


My question is the following: how do I undo the unification of movie_t
and actor_t, so that externally, the signatures of these functions are
different? (listed below).  I don't want the caller to be bothering
with matches, and quite frankly, the fact that movie_t and actor_t were
unified is an implementation detail, used only to avoid the duplication
of the process_query code.


  val process_movies : string array array -> movie_t array
  val process_actors : string array array -> actor_t array


Thanks a lot for your help!!
Cheers,
C.S.




Re: "ocaml_beginners"::[] De-unifying variant types

by Jon Harrop

On Wednesday 25 July 2007 13:52:55 cultural_sublimation wrote:
>   type movie_t = string * int * bool
>   type actor_t = string * int

Use records here to be more descriptive:

module Movie = struct
  type t = {title: string; year: int; bw: bool}

  let of_array = function
    | [|title; year; bw|] ->
        {title=title; year=int_of_string year; bw=bool_of_string bw}
    | _ -> invalid_arg "Movie.of_array"
end

>   type query_result = Movie of movie_t | Actor of actor_t

Remove this type.

>   let process_query array_converter converter_template result =
>       let size = Array.length result in
>       let converted = Array.make size converter_template in
>       for i = 0 to size-1 do
>           converted.(i) <- array_converter (result.(i))
>       done;
>       converted

That is "Array.map".

>   let process_movies result =
>       let converter_template = Movie ("", 0, false) in
>       process_query array_to_movie converter_template result
>
>   let process_actors result =
>       let converter_template = Actor ("", 0) in
>       process_query array_to_actor converter_template result

The conversion is now trivial:

  Array.map Movie.of_array

so I would not bother factoring it out. Instead, have a query_movie function:

  let query_movie db query = Array.map Movie.of_array (Sql.query db query)
  let query_actor db query = Array.map Actor.of_array (Sql.query db query)

> This is all fine and dandy, except that because of the unification,
> the signature of both process_movies and process_actors is the same:
>
>
>   val process_movies : string array array -> query_result array
>   val process_actors : string array array -> query_result array
>
>
> My question is the following: how do I undo the unification of movie_t
> and actor_t, so that externally, the signatures of these functions are
> different? (listed below).

You must make them separate types (like my records).
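
A minimal, self-contained sketch of the separate-types approach described
above. The field names follow the tables in the first message; the name
"of_array" and the sample rows are assumptions made for illustration
("of" itself is a reserved word in OCaml and cannot name a function), and
the rows stand in for whatever the database bindings return.

```ocaml
(* Sketch only: one module per table, each with its own record type. *)
module Movie = struct
  type t = { title : string; year : int; bw : bool }

  (* Convert one database row; reject rows of the wrong shape. *)
  let of_array = function
    | [| title; year; bw |] ->
        { title; year = int_of_string year; bw = bool_of_string bw }
    | _ -> invalid_arg "Movie.of_array"
end

module Actor = struct
  type t = { name : string; age : int }

  let of_array = function
    | [| name; age |] -> { name; age = int_of_string age }
    | _ -> invalid_arg "Actor.of_array"
end

(* Hypothetical stand-ins for rows returned by the database bindings. *)
let movie_rows = [| [| "Casablanca"; "1942"; "true" |] |]
let actor_rows = [| [| "Humphrey Bogart"; "57" |] |]

(* The two results have different, unrelated types. *)
let movies : Movie.t array = Array.map Movie.of_array movie_rows
let actors : Actor.t array = Array.map Actor.of_array actor_rows
```

Because Array.map is polymorphic in the element type, no shared variant
is needed for the two conversions to reuse the same traversal code.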

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
OCaml for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists/?e


Re: "ocaml_beginners"::[] De-unifying variant types

by Richard Jones-4

Take a look at PG'OCaml (http://merjis.com/developers/pgocaml) as it
solves this problem already in a type-safe way.

Rich.

--
Richard Jones
Red Hat

Re: "ocaml_beginners"::[] De-unifying variant types

by cultural_sublimation

>
> Take a look at PG'OCaml (http://merjis.com/developers/pgocaml) as it
> solves this problem already in a type-safe way.
>

Hi,

Thanks for the reply.  I have looked at PG'OCaml before, and
though I liked the concept, I was a bit put off by the lock-in
into Postgresql.

But regardless of this particular example, is there a general
solution to my problem, that of "de-unifying" type variants?

Cheers,
C.S.



Re: "ocaml_beginners"::[] De-unifying variant types

by William D. Neumann

On Wed, 25 Jul 2007, cultural_sublimation wrote:

> My question is the following: how do I undo the unification of movie_t
> and actor_t, so that externally, the signatures of these functions are
> different? (listed below).  I don't want the caller to be bothering
> with matches, and quite frankly, the fact that movie_t and actor_t were
> unified is an implementation detail, used only to avoid the duplication
> of the process_query code.
>
>
>  val process_movies : string array array -> movie_t array
>  val process_actors : string array array -> actor_t array

Well, unfortunately, if they are the output of the same function, you're out of
luck, as they have to share the same output type.  You'd need to offer up
a set of refinement functions for each of the cases you want to separate
out, e.g.

let refine_movie = function Movie m -> m | _ -> assert false;;
let refine_actor = function Actor a -> a | _ -> assert false;;

and so on.

I know it's not what you want, but unless you do a lot more work to hide
the way the type system works in OCaml, you're stuck with it, as you won't
be able to come up with a useful function of type string array array -> 'a
array, which is what you're asking for.
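
As a rough sketch of how such refinement functions could be wired in, the
wrappers below box each row in the variant and unbox it straight away, so
that callers only see the tuple types. The types are those from the
original message; implementing process_movies and process_actors this way
is an assumption for illustration, not code from the thread.

```ocaml
type movie_t = string * int * bool
type actor_t = string * int
type query_result = Movie of movie_t | Actor of actor_t

let array_to_movie a = Movie (a.(0), int_of_string a.(1), bool_of_string a.(2))
let array_to_actor a = Actor (a.(0), int_of_string a.(1))

(* Refinement functions: unwrap one constructor, fail on any other. *)
let refine_movie = function Movie m -> m | _ -> assert false
let refine_actor = function Actor a -> a | _ -> assert false

(* Wrap and immediately unwrap, so callers never see query_result. *)
let process_movies (rows : string array array) : movie_t array =
  Array.map (fun row -> refine_movie (array_to_movie row)) rows

let process_actors (rows : string array array) : actor_t array =
  Array.map (fun row -> refine_actor (array_to_actor row)) rows
```

The "assert false" branches can never be reached here, which hints that
the boxing step is doing no useful work in this arrangement.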

William D. Neumann


"ocaml_beginners"::[] Re: De-unifying variant types

by cultural_sublimation

Hi Jon (and other readers),

Thanks for the reply.  Your suggestions did clean up the code quite a
lot, though I am afraid the crux of the matter still remains.  I may be
missing something really important here, but I still cannot find a way
to share the "process_query" function without first creating a variant
type unifying actors and movies, though the code is *exactly* the same
for both types.  Here's the new version, non-working because it lacks
the variant type:

module Actor = struct
  type t = {name: string; age: int}

  let create = function
    | [|name; age|] -> {name = name; age = int_of_string age}
    | _             -> invalid_arg "Actor.create"
end

module Movie = ... (* similar to Actor, just with the fields changed. *)

type

let process_query query create_fun =
  let result = conn#exec ~expect:[Tuples_ok] query in
  Array.map create_fun result#get_all

let get_actors () =
  let query = "SELECT * from actors" in
  process_query query Actor.create

let get_movies () =
  let query = "SELECT * from movies" in
  process_query query Movie.create


Now, as William D. Neumann mentioned, there may be no solution to this
problem in OCaml.  I will have to create a variant type, and then use
a specially crafted function such as the one below to extract only
the variant I am interested in:

let refine_movie = function Movie m -> m | _ -> assert false;;

Or is there a more elegant way?

Again, thanks in advance!
Cheers,
C.S.



Re: "ocaml_beginners"::[] De-unifying variant types

by Richard Jones-4

On Wed, Jul 25, 2007 at 02:53:45PM -0000, cultural_sublimation wrote:
> >
> > Take a look at PG'OCaml (http://merjis.com/developers/pgocaml) as it
> > solves this problem already in a type-safe way.
> Thanks for the reply.  I have looked at PG'OCaml before, and
> though I liked the concept, I was a bit put off by the lock-in
> into Postgresql.

That made me smile :-)

The idea of being "locked in" to free software.  Well, I guess
PG'OCaml does require PostgreSQL, but that's only because it is the
only database which has the necessary 'DESCRIBE' statement (which,
given a statement, parses it and tells you what types it takes and
returns).  If other databases have it, then you could add support for
them.

Rich.

--
Richard Jones
Red Hat

Re: "ocaml_beginners"::[] De-unifying variant types

by cultural_sublimation

> The idea of being "locked in" to free software.  Well, I guess
> PG'OCaml does require PostgreSQL, but that's only because it is the
> only database which has the necessary 'DESCRIBE' statement (which,
> given a statement, parses it and tells you what types it takes and
> returns).  If other databases have it, then you could add support for
> them.

Hi,

Well, if it means that you can't easily plug-out Postgresql and
plug-in another DB, it is a sort of lock-in, even if it is much
softer than the lock-in you get with proprietary products!

But anyway, does the type verification of PG'OCaml go all the
way into ensuring that SQL injection attacks are not possible?
And does it provide also for such things as prepared statements,
and the entire range of SQL statements?  If so, I am willing
to take another look into it...

Cheers,
C.S.



"ocaml_beginners"::[] Re: De-unifying variant types

by Zheng Li-2


Hi,

Maybe I misunderstood something, but polymorphic variants seem able to
describe what you want. With the following definitions (not tested)

type movie = [`Movie of string * int * bool]
type actor = [`Actor of string * int]
type result = [movie | actor]

then easily you get

val array_to_movie: string array -> movie
val array_to_actor: string array -> actor
val process_query: (string array -> 'a) -> 'a -> string array array -> 'a array
val process_movie: string array array -> movie array
val process_actor: string array array -> actor array

If you're afraid that the 'a in process_query is not restrictive
enough, you can restrict it with an annotation:

val process_query:
  (string array -> ([< result] as 'a)) -> 'a -> string array array -> 'a array
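
The signatures above can be filled in roughly as follows. This is only a
sketch: the function bodies are assumptions, and process_query is written
with Array.map, so it drops the converter_template argument that appears
in the signature above.

```ocaml
type movie = [ `Movie of string * int * bool ]
type actor = [ `Actor of string * int ]
type result = [ movie | actor ]

let array_to_movie a : movie =
  `Movie (a.(0), int_of_string a.(1), bool_of_string a.(2))

let array_to_actor a : actor = `Actor (a.(0), int_of_string a.(1))

(* The converter decides the element type; the annotation limits 'a
   to variant types whose tags are drawn from result. *)
let process_query (convert : string array -> ([< result ] as 'a))
    (rows : string array array) : 'a array =
  Array.map convert rows

(* Each wrapper gets its own, distinct return type. *)
let process_movie rows : movie array = process_query array_to_movie rows
let process_actor rows : actor array = process_query array_to_actor rows
```

Callers still see a single-tag variant, so a pattern such as
`Movie (title, year, bw) is needed to reach the fields, but that match
is exhaustive with one case.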

HTH.





Re: "ocaml_beginners"::[] Re: De-unifying variant types

by Jon Harrop

On Wednesday 25 July 2007 22:06:17 cultural_sublimation wrote:

> Hi Jon (and other readers),
>
> Thanks for the reply.  Your suggestions did clean up the code quite a
> lot, though I am afraid the crux of the matter still remains.  I may be
> missing something really important here, but I still cannot find a way
> to share the "process_query" function without first creating a variant
> type unifying actors and movies, though the code is *exactly* the same
> for both types.  Here's the new version, non-working because it lacks
> the variant type:
>
> module Actor = struct
>   type t = {name: string; age: int}
>
>   let create = function
>
>     | [|name; age|] -> {name = name; age = int_of_string age}
>     | _             -> invalid_arg "Actor.create"
>
> end
>
> module Movie = ... (* similar to Actor, just with the fields changed. *)
>
> type
>
> let process_query query create_fun =
>   let result = conn#exec ~expect:[Tuples_ok] query in
>   Array.map create_fun result#get_all
>
> let get_actors () =
>   let query = "SELECT * from actors" in
>   process_query query Actor.create
>
> let get_movies () =
>   let query = "SELECT * from movies" in
>   process_query query Movie.create
>
>
> Now, as William D. Neumann mentioned, there may be no solution to this
> problem in OCaml.  I will have to create a variant type, and then use
> a specially crafted function such as the one below to extract only
> the variant I am interested in:
>
> let refine_movie = function Movie m -> m | _ -> assert false;;
>
> Or is there a more elegant way?

There are two ways to interpret your question:

1. Must I box the result in the variant type only to unbox it?

2. Are the run-time tests necessary?

To which the separate answers are:

1. You do not need to box and then unbox. You can avoid this using
higher-order functions as I showed (generating the unboxed type directly).

2. The run-time tests are necessary because you've reached the boundary of
static typing: the database is not under OCaml's control and is not
statically typed so you must introduce run-time type checks yourself at some
point (but not at two points as you currently are).
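
One way to picture both answers together is the sketch below: a single
run-time check per row, placed at the database boundary, and a
higher-order function that yields the record type directly with no
boxing. The names of_row and convert_rows, the error strings, and the
sample rows are assumptions for illustration; int_of_string_opt and
bool_of_string_opt require OCaml 4.05 or later.

```ocaml
module Movie = struct
  type t = { title : string; year : int; bw : bool }

  (* The only place where untyped database strings are checked. *)
  let of_row (row : string array) : (t, string) result =
    match row with
    | [| title; year; bw |] -> (
        match (int_of_string_opt year, bool_of_string_opt bw) with
        | Some year, Some bw -> Ok { title; year; bw }
        | _ -> Error "Movie.of_row: malformed field")
    | _ -> Error "Movie.of_row: wrong number of columns"
end

(* Higher-order and polymorphic: the converter fixes the element type. *)
let convert_rows of_row rows = Array.map of_row rows

(* Hypothetical rows, one well-formed and one with a bad year field. *)
let good = convert_rows Movie.of_row [| [| "Casablanca"; "1942"; "true" |] |]
let bad = convert_rows Movie.of_row [| [| "Casablanca"; "nineteen"; "true" |] |]
```

Returning a result value instead of raising is a design choice made for
this sketch; raising Invalid_argument, as in the earlier record example,
keeps the check in the same single place.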

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
OCaml for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists/?e

Re: "ocaml_beginners"::[] De-unifying variant types

by Richard Jones-4

On Wed, Jul 25, 2007 at 09:49:05PM -0000, cultural_sublimation wrote:

> > The idea of being "locked in" to free software.  Well, I guess
> > PG'OCaml does require PostgreSQL, but that's only because it is the
> > only database which has the necessary 'DESCRIBE' statement (which,
> > given a statement, parses it and tells you what types it takes and
> > returns).  If other databases have it, then you could add support for
> > them.
>
> Hi,
>
> Well, if it means that you can't easily plug-out Postgresql and
> plug-in another DB, it is a sort of lock-in, even if it is much
> softer than the lock-in you get with proprietary products!

Once you start to strongly type your database statements you'll find
that most code requires porting, even between different versions of
the same DBMS, never mind between entirely different DBMSes.

For example, we discovered that Postgres likes to play with the length
of various integer/serial/... types between releases.  For this reason
COCANWIKI needs some source code modifications if you use it with PG
8.x versus PG 7.4 which is what I'm using.  The changes are trivial
and mechanical -- statements return int64's instead of int32's -- but
they are there nevertheless.

> But anyway, does the type verification of PG'OCaml go all the
> way into ensuring that SQL injection attacks are not possible?

Of course.  I'd regard this as a basic requirement of _any_ database
binding, whether or not it was type safe.

> And does it provide also for such things as prepared statements,
> and the entire range of SQL statements?  If so, I am willing
> to take another look into it...

Yes & yes.  Prepared statements are mandatory and invisible to the
programmer if you are using the high level (ie. normal) PG'OCaml
interface.  You can use the low level PG'OCaml interface which is not
type safe and allows you to separately prepare & execute statements if
you wish.  Any (well, almost any) Postgres statement can be used
directly with PG'OCaml.  There are a few exceptions but they are
pretty esoteric, non-portable, PG-specific features.

Rich.

--
Richard Jones
Red Hat

Re: "ocaml_beginners"::[] De-unifying variant types

by cultural_sublimation

> Yes & yes.  Prepared statements are mandatory and invisible to the
> programmer if you are using the high level (ie. normal) PG'OCaml
> interface.  You can use the low level PG'OCaml interface which is not
> type safe and allows you to separately prepare & execute statements if
> you wish.  Any (well, almost any) Postgres statement can be used
> directly with PG'OCaml.  There are a few exceptions but they are
> pretty esoteric, non-portable, PG-specific features.

Hi,

Well, I have downloaded PG'OCaml and managed to compile it after
some trial and error.  The first impression was not all that
positive: it feels a lot like work-in-progress, and since there
is no documentation whatsoever, progress is slow and one has to dig
through the source to figure out how to use it.

However, the concept itself is of course interesting.  I guess that
once it is finished and polished it will definitely be the best way
to connect to Postgresql from OCaml.  Keep up the good work!

Cheers,
C.S.



Re: "ocaml_beginners"::[] De-unifying variant types

by Richard Jones-4

On Thu, Jul 26, 2007 at 05:02:38PM -0000, cultural_sublimation wrote:

> > Yes & yes.  Prepared statements are mandatory and invisible to the
> > programmer if you are using the high level (ie. normal) PG'OCaml
> > interface.  You can use the low level PG'OCaml interface which is not
> > type safe and allows you to separately prepare & execute statements if
> > you wish.  Any (well, almost any) Postgres statement can be used
> > directly with PG'OCaml.  There are a few exceptions but they are
> > pretty esoteric, non-portable, PG-specific features.
>
> Hi,
>
> Well, I have downloaded PG'OCaml and managed to compile it after
> some trial and error.  The first impression was not all that
> positive: it feels a lot like work-in-progress, and since there
> is no documentation whatsoever, progress is slow and one has to dig
> through the source to figure out how to use it.

The version on the site was a bit old.  I wrote some documentation a
while back which is now in 0.8, here:
http://merjis.com/developers/pgocaml

Rich.

--
Richard Jones
Red Hat

Re: "ocaml_beginners"::[] De-unifying variant types

by cultural_sublimation

> The version on the site was a bit old.  I wrote some documentation a
> while back which is now in 0.8, here:
> http://merjis.com/developers/pgocaml

Hi,

Excellent!  Now that I've played a bit more with PG'OCaml, I am liking it more and more.
I think that with more extensive documentation and a GODI package this could very well
become the de facto Ocaml standard for connecting to Postgresql.

I'll keep you posted of suggestions/impressions once I start using it in "the real world".

Cheers,
C.S.

P.S. The dependencies list in the project homepage is missing the CSV library.

Re: "ocaml_beginners"::[] De-unifying variant types

by Richard Jones-4

On Fri, Jul 27, 2007 at 08:27:54AM -0700, cultural_sublimation wrote:
> P.S. The dependencies list in the project homepage is missing the CSV
> library.

Now fixed - thanks.

Rich.

--
Richard Jones
Red Hat