functions with 'named' arguments

by Dupuis

There was a recent discussion about evolutions for Octave 3.1. May I suggest the following one? But first, the context. High-level programming sometimes requires writing functions with a variable number of arguments. In C, there are constructs for this, but you have to be very careful about the types of the arguments. Try printf("%d\n", 1.23) and enjoy. Object-oriented languages introduce polymorphism, i.e. the ability to have one function with many 'signatures': the same name, differentiated by the number and types of the arguments. The deal is that the right implementation is called, and this is decided either at compile time or at run time.

Now let's turn to a simple problem: a function requires a variable number of vectors or matrices, and some optional flags. With the varargin construct, you can loop over the arguments and take action according to their types. Matrices are concatenated into one big matrix, and scalars are used as flags: the first one has this meaning, the second has that meaning, and so on. But wait. Optional means they should take default values if none is provided; their purpose is to fine-tune the inner workings of the algorithm. So far so good, but how do you modify only some of the optional arguments?

I see two ways. In Perl, interfaces like DBI accept hashes as input, so a call may look like
 $dbh = DBI->connect($dsn, $user, $password, { RaiseError => 1, AutoCommit => 0 });
The last part is a hash specifying some optional args and their values. From the programmer's perspective, a simple iteration over the keys of the hash is enough to scan those args. In R, the argument specification contains 3 parts: "usual" args, i.e. without default values, "optional", and "named". Named arguments are a special type: they are optional, with a default value. For instance, the prototype of "mean" is
mean(x, trim = 0, na.rm = FALSE, ...)
The 3 sorts are present, and the defaults for the named ones are defined in the prototype.

Would it be possible to implement one of those mechanisms in Octave? The problem for which I see no solution at present is passing multiple optional args of the same type, with different roles, in a free-form fashion.

Regards

Pascal

Re: functions with 'named' arguments

by Søren Hauberg


On Mon, 17 Mar 2008 at 09:19 -0700, Dupuis wrote:
> Would it be possible to implement one of those mechanisms in Octave ? The
> problem I don't see a solution at present is passing multiple optional args
> of the same type, with different roles, in a free-form fashion.
We already have default arguments. The following gives an example:

function hello (who = "World")
  printf("Hello, %s\n", who);
endfunction

>> hello()
Hello, World
>> hello("Wonderful World")
Hello, Wonderful World

Matlab has more or less always had named arguments, and you can do the same
in Octave. The standard syntax for calling a function then looks
something like this:

  my_function(some_normal_argument, "name_of_var", value_of_var)

That is, the user provides the name of the variable/option as a string, and
its value as the next parameter. The developer who implements the
function must then check the input for strings, and handle the values
correctly. This can be a bit tedious, but it works. And it's what Matlab
does.
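
A minimal sketch of what the function author has to write for this convention. The function name greet and the option names "greeting" and "closing" are made up for illustration; the leading "1;" only marks the file as a script so that the function can be defined inline.

```octave
1;  # script marker, so the function below can be defined inline

# Hypothetical function: one positional argument followed by
# optional "name", value pairs.
function msg = greet (who, varargin)
  # Defaults, used for every option the caller leaves out.
  opts = struct ("greeting", "Hello", "closing", "!");

  if (mod (numel (varargin), 2) != 0)
    error ("greet: options must be given as name/value pairs");
  endif

  # Walk the pairs: odd positions hold names, even positions hold values.
  for k = 1:2:numel (varargin)
    name = varargin{k};
    if (! ischar (name) || ! isfield (opts, name))
      error ("greet: unknown option");
    endif
    opts.(name) = varargin{k+1};
  endfor

  msg = sprintf ("%s, %s%s", opts.greeting, who, opts.closing);
endfunction

disp (greet ("World"));                                       # Hello, World!
disp (greet ("World", "closing", "?"));                       # Hello, World?
disp (greet ("World", "closing", "?", "greeting", "Goodbye")); # Goodbye, World?
```

The order of the pairs does not matter, and any subset of the options can be given, which is the free-form behaviour asked for above.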

Søren


Re: functions with 'named' arguments

by Dupuis


Søren Hauberg wrote:
> On Mon, 17 Mar 2008 at 09:19 -0700, Dupuis wrote:
> > Would it be possible to implement one of those mechanisms in Octave ? The
> > problem I don't see a solution at present is passing multiple optional args
> > of the same type, with different roles, in a free-form fashion.
> We already have default arguments. The following gives an example
>
> function hello (who = "World")
>   printf("Hello, %s\n", who);
> endfunction
>
> >> hello()
> Hello, World
> >> hello("Wonderful World")
> Hello, Wonderful World

Hey! But this is exactly what I am asking for! How long has it been implemented?

Regards

Pascal

Re: functions with 'named' arguments

by Søren Hauberg


On Tue, 18 Mar 2008 at 01:41 -0700, Dupuis wrote:
> Hey ! But this is exactly what I ask for ! Since how long is it implemented
> ?
This thread:
http://www.nabble.com/Default-arguments-td7869282.html#a7869282 dates
back to late 2006, when John added support for default arguments. So
it's been there for some time, but it's not used that much.

Søren


Re: functions with 'named' arguments

by Dupuis


Søren Hauberg wrote:
> This thread:
> http://www.nabble.com/Default-arguments-td7869282.html#a7869282 dates
> back to late 2006 where John added support for default arguments. So,
> its been there for some time, but its not used that much.

I played a bit with it and found inconsistencies with regard to R:

function hello (who = "World", closing = "!")
  printf("Hello, %s %s\n", who, closing);
endfunction

octave> hello(who = "Me")
Hello, Me !

octave> hello(closing="and the rest")
Hello, and the rest ! <= 'who' was set; 'closing' takes default value

octave> hello(who = "Me", closing="and the rest")
Hello, Me and the rest

octave> hello(closing="and the rest", who = "Me")
Hello, and the rest Me <= arguments were permuted

Would it be possible to have a strict mapping when using the named-parameters paradigm?

octave> function test(varargin, flag=false)
parse error: syntax error, invalid parameter list

function test(flag=false, varargin)
  printf('flag is %d\n', flag);
  for indi=1:length(varargin), disp(varargin{indi}); endfor
endfunction

octave> test(2, 3, 4, flag=true)
flag is 2
 3
 4
 1

Would it be possible
1) to declare named arguments AFTER varargin, and
2) to ensure the same strict mapping, i.e. unnamed parameters go to varargin, named ones are matched against the list of named arguments, and named ones without a match are flagged as errors?

The purpose would be to achieve something similar to this R session:
meanemp <- function(n, ..., dist = "Uniforme") {
  switch(dist, Normale = rnorm(n, ...), Uniforme = runif(n, ...), Exponentielle = rexp(n, ...))
}
meanemp(5, sd = .1, dist = "Normale")  # normal distribution, 5 samples, mean 0, standard deviation .1
[1]  0.02443245  0.07669428 -0.11214011 -0.01001278 -0.17726786
meanemp(5, 1, dist = "Normale")  # normal distribution, 5 samples, mean 1, sd 1
[1] 2.725002  1.775500  2.354803 -0.623507  1.873243

Notice that meanemp(5, dist="Normale", 1) and meanemp(dist="Normale", 5, 1) are identical ways of calling this function.

The mechanism is as follows:
meanemp has a normal parameter, 'n', varargs, '...', and a named argument, 'dist'. Named arguments matching the function definition are extracted from the list, whatever their position. Then normal parameters are matched based on their position. The rest, both unnamed and named, becomes the varargs, which are passed on to the functions called from the body. There, they are matched first by name, then by position.

My guess is that this respects the principle of least surprise, delivering results as the programmer expects them.

Greetings

Pascal

Re: functions with 'named' arguments

by Jaroslav Hajek-2

>  would it be possible
>  1) to be able to declare named arguments AFTER varargin
>  2) to ensure the same strict mapping, i.e. unnamed parameters go to
>  varargin, named ones are matched against the list of named arguments, named
>  ones without matching being flagged as errors ?

No. This would break compatibility. The call you used above:

hello(closing="and the rest")

is not a call by named argument like in R, Python or Fortran, rather
it is equivalent to

closing="and the rest" ; hello(closing)

i.e. you use an assignment expression. I guess there's little point
arguing which feature is more useful; certainly, both are useful, but
unfortunately they cannot be used clearly together.
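
A small sketch of that equivalence, assuming an interpreter that accepts an assignment expression as a call argument, as described above. The function show_args and the variable names x and y are hypothetical.

```octave
1;  # script marker, so the function below can be defined inline

# Hypothetical two-parameter function that reports what it received.
function s = show_args (a, b)
  s = sprintf ("a=%s b=%s", a, b);
endfunction

clear x y
# Each "name = value" is an assignment performed in the caller.
# Only the resulting values are passed on, in the order written,
# so the names have no influence on the binding inside show_args.
s = show_args (y = "B", x = "A");
disp (s);             # a=B b=A
disp (exist ("y"));   # 1: y is now a variable in the calling scope
```

Swapping the two arguments in the call swaps a and b, whatever the names on the left of "=" are.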



--
RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz

Re: functions with 'named' arguments

by Dupuis


Jaroslav Hajek-2 wrote:
> No. This would break compatibility. The call you used above:
>
> hello(closing="and the rest")
>
> is not a call by named argument like in R, Python or Fortran, rather
> it is equivalent to
>
> closing="and the rest" ; hello(closing)
>
> i.e. you use an assignment expression. I guess there's little point
> arguing which feature is more useful; certainly, both are useful, but
> unfortunately they cannot be used clearly together.

Excuse me? What am I doing when I use, in R:
meanemp(dist="Normale", 4, mean=10)
meanemp(mean=10, dist="Normale", 4)
? What is the difference from the suggested
hello(closing='bla')?

Pascal

Re: functions with 'named' arguments

by shaiay

On Tue, Mar 18, 2008 at 1:44 PM, Dupuis <Pascal.Dupuis@...> wrote:

>
>
>
>  Jaroslav Hajek-2 wrote:
>  >
>  > No. This would break compatibility. The call you used above:
>  >
>  > hello(closing="and the rest")
>  >
>  > is not a call by named argument like in R, Python or Fortran, rather
>  > it is equivalent to
>  >
>  > closing="and the rest" ; hello(closing)
>  >
>  > i.e. you use an assignment expression. I guess there's little point
>  > arguing which feature is more useful; certainly, both are useful, but
>  > unfortunately they cannot be used clearly together.
>  >
>  >
>
>  Excuse me ? What am I doing when I use, in R:
>  meanemp(dist="Normale", 4, mean=10)
>  meanemp(mean=10, dist="Normale", 4)
>  ? What is the difference with the suggested
>  hello(closing='bla') ?

In R this is a language feature, where R assigns the value "Normale"
to the PARAMETER dist of the function meanemp. In Octave, the same
syntax assigns the value "Normale" to the VARIABLE dist of the calling
function (NOT meanemp).
They are written the same, but produce different results. As Jaroslav
explained, in Octave both your examples are equivalent to:
dist="Normale"; mean=10 ; meanemp("Normale",4,10);
dist="Normale"; mean=10 ; meanemp(10,"Normale",4);

which I assume is not the behavior you are looking for.

Shai

Re: functions with 'named' arguments

by Jaroslav Hajek-2

>
>  Excuse me ? What am I doing when I use, in R:
>  meanemp(dist="Normale", 4, mean=10)
>  meanemp(mean=10, dist="Normale", 4)
>  ? What is the difference with the suggested
>  hello(closing='bla') ?
>

What I'm trying to explain is that hello(closing='bla') already *has* a
meaning in Octave, and it is *not* what you have in mind. In R, this is no
problem because the assignment is not "=". In Fortran and Python, this is
no problem because assignment is not an expression (not sure about Python,
but I think so).

It is perfectly legal in Octave (and sometimes useful) to write things like
x = exp (y = -y)
A possible solution would be to implement keyword arguments using a
different character than "=", but I think that the standard workaround
using a cell array (of the form {"name", value, "name", value}) is
already good enough.


Re: functions with 'named' arguments

by Dupuis


Jaroslav Hajek-2 wrote:
> What I'm trying to explain is that
> hello(closing='bla') already *has* a meaning in Octave, and it is
> *not* what you have
> on mind. In R, this is no problem because the assignment is not "=".
> In Fortran and Python, this is no problem because assignment is not an
> expression (not sure about Python, but I think so).
>
> It is perfectly legal in Octave (and sometimes useful) to write things like
> x = exp (y = -y)
> A possible solution would be to implement keyword arguments using
> different character than "=", but I think that the standard workaround
> using a cell array (of the form {"name", value, "name", value}) is
> already good enough.

OK, I see the point more clearly. In R, "=" is an assignment operator, which can be used at top level, or as a subexpression in a braced list of expressions.

My opinion is that having the assignment of a variable expressed in the middle of a function call is a bit strange. OTOH, the semantic meaning "initialize the function parameter with the corresponding name" makes more sense. If you look at R, Python or Perl, this adds a level of clarity, both for the programmer, because the bindings are performed automatically, and for the user, because each argument and its default value are evident from the function definition.

With the list of names and values approach, and given the availability of functions like "parseparams", it is up to the programmer to test for valid and invalid cases. From the user's perspective, default values and allowed names have to be searched for inside the function body.

So, having named args 'à la Python/R/...' would add a level of clarity. Regarding the other use, i.e. mixing variable assignment and function call, this can be accomplished by x = exp((y=-y)): the parens ensure the assignment is performed first, and its "return" value, the newly assigned variable, is used as the function parameter.

I insist on this because I was a Java programmer in a previous life, and dealing with objects usually means playing with a lot of so-called attributes. To this end, objects were created, and numerous calls were performed to modify attributes one by one. This was a nightmare to devise and to use. Later, I had opportunities to touch Python scripts, and found their approach to named args cleaner and safer.

Greetings

Pascal  

Re: functions with 'named' arguments

by Jaroslav Hajek-2

On Tue, Mar 18, 2008 at 3:01 PM, Dupuis <Pascal.Dupuis@...> wrote:

>
>
>
>  Jaroslav Hajek-2 wrote:
>  >
>  >
>  > What I'm trying to explain is that
>  > hello(closing='bla') already *has* a meaning in Octave, and it is
>  > *not* what you have
>  > on mind. In R, this is no problem because the assignment is not "=".
>  > In Fortran and Python, this is no problem because assignment is not an
>  > expression (not sure about Python, but I think so).
>  >
>  > It is perfectly legal in Octave (and sometimes useful) to write things
>  > like
>  > x = exp (y = -y)
>  > A possible solution would be to implement keyword arguments using
>  > different character than "=", but I think that the standard workaround
>  > using a cell array (of the form {"name", value, "name", value}) is
>  > already good enough.
>  >
>
>  OK, I see more clearly the point. In R, "=" is an assignment operator, which
>  can be used at top level, or as a subexpression in a braced list of
>  expression.
>
>  My opinion is that having an assignment of an variable expressed in the
>  middle of a function call is a bit strange.
It is simply modelled after C and its descendants - C++, Java, D, JavaScript ...
There are a lot of syntactic features of C/C++ that may be considered strange.
Still, many programmers are used to them.

> OTOH, the semantic meaning
>  "initialize the function parameter with the corresponding name" makes more
>  sense.
... to you.

> If you look at R, Python or Perl, this adds a level of clarity, both
>  for the programmer, because the bindings are performed automatically, and
>  for the user, because each arguments and their default value are evident
>  from the function definition.
>

Indeed. But in Octave (and C,C++) it would create a syntax clash.

>  With the list of names and values approach, and given the availability of
>  functions like "parseparams", it is up to the programmer to test for valid
>  and invalid cases. From the user perspective, default values and allowed
>  names are to be searched for inside the function body.
>

Not really. They should be documented in the function's inline documentation,
and in good functions, they will be. And notice that there may be more
complicated rules than a simple expression (like: "if x is not present but y
is supplied, x is calculated via the following formula ...").

>  So, having named args 'à la Python/R/...' would add a level of clarity.

Perhaps. Unfortunately, it would also add a level of confusion for
those not used to keyword argument calls, but used to assignment expressions.

>  Regarding the other use, i.e. mixing variable assignment and function call,
>  this can be accomplished by x = exp((y=-y)) : the parens ensure the
>  assignment is performed first, and the "return" value, the new assigned var,
>  used as function parameter.
>

Sure, it's a workaround. The cell array is a workaround for keyword arguments.

>  I insist on this because I was Java programmer in a previous life, and
>  dealing with objects usually means playing with a lot of so-called
>  attributes. To this end, objects were created, and numerous calls were
>  performed to modify attributes one-by-one. This was a nightmare to devise
>  and to use. Later, I had opportunities to touch Python scripts, and found
>  their approach of named args cleaner and safer.
>
>  Greetings
>
>  Pascal





Re: functions with 'named' arguments

by Dupuis


Jaroslav Hajek-2 wrote:
> Indeed. But in Octave (and C,C++) it would create a syntax clash.
>
> Perhaps. Unfortunately, it would also add a level of confusion for
> those not used to
> keyword argument calls, but used to assignment expressions.

For the first remark: a grep on all files under /usr/share/octave/3.0.0/ showed that the pattern
'([^<>!=)]*=[^=)]*)' is only found within strings or if clauses. So the pattern of default arguments seems not yet used inside Octave sources.

For the second remark: after reading an R tutorial, the notion doesn't look confusing at all.

Now for some more substantial comments. A possible target for this 'named arguments' paradigm is low-level functions, that is, functions which are called often. Supporting such a paradigm within the Octave interpreter means faster execution. Using parseparams means more code, more function calls, more computer cycles, and so forth.
My vision is that this named-argument paradigm brings a level of simplicity. Suppose, for instance, that you have to cope with various statistical functions, and you know they have parameters called 'mean' and 'sd'. You just construct the calls this way: somefunc(mean = this, sd = that). The interpreter will perform the matching. So if you have two sets of functions written by two teams with different calling conventions, you don't have to construct the call accordingly using either an if construct or eval(). Handled within the Octave interpreter, it's faster.

Regards

Pascal

Re: functions with 'named' arguments

by shaiay

On Wed, Mar 19, 2008 at 11:31 AM, Dupuis <Pascal.Dupuis@...> wrote:

>
>
>
>  Jaroslav Hajek-2 wrote:
>  >
>  > Indeed. But in Octave (and C,C++) it would create a syntax clash.
>  >
>
> > Perhaps. Unfortunately, it would also add a level of confusion for
>  > those not used to
>  > keyword argument calls, but used to assignment expressions.
>  >
>  >
>  For the first remark: a grep on all files under /usr/share/octave/3.0.0/
>  showed that the pattern
>  '([^<>!=)]*=[^=)]*)' is only found within strings or if clauses. So the
>  pattern of default arguments seems not yet used inside Octave sources.
>
>  For the second remark: after reading a R tutorial, the notion doesn't look
>  confusing at all.
>
>  Now for some more substancials comments. A possible target for this 'named
>  arguments' paradigm is low-level functions. That is, functions which are
>  called often. Using such paradigm within the octave interpreter means faster
>  execution. Using parseparams means more code, more function calls, more
>  computer cycles, and so forth.
>  My vision is that this named argument paradigm brings a level of simplicity.
>  You have f.i. to cope with various statistical functions, you know they have
>  parameters called 'mean' and 'sd'.  You just construct  the calls this way:
>  somefunc(mean = this, sd = that). The interpreter will perform the matching.
>  So if you have two sets of functions written by two teams with different
>  calling conventions, you don't have to construct the call accordingly and
>  use either if construct either eval(). Handled within the Octave
>  interpreter, it's faster.
>
>  Regards
>

Pascal,

I think that the main problem is that you are proposing an addition
which is an improvement over Matlab. You will have a hard time finding
someone to implement it, since most of the effort is currently aimed at
providing better Matlab compatibility.
So as a feature request this is nice, but it probably won't get
implemented in the near future.
However, if you instead submit a patch, it will greatly enhance
the probability of having this in Octave.

Shai

Re: functions with 'named' arguments

by Jaroslav Hajek-2

On Wed, Mar 19, 2008 at 10:31 AM, Dupuis <Pascal.Dupuis@...> wrote:

>
>
>
>  Jaroslav Hajek-2 wrote:
>  >
>  > Indeed. But in Octave (and C,C++) it would create a syntax clash.
>  >
>
> > Perhaps. Unfortunately, it would also add a level of confusion for
>  > those not used to
>  > keyword argument calls, but used to assignment expressions.
>  >
>  >
>  For the first remark: a grep on all files under /usr/share/octave/3.0.0/
>  showed that the pattern
>  '([^<>!=)]*=[^=)]*)' is only found within strings or if clauses. So the
>  pattern of default arguments seems not yet used inside Octave sources.
>

I didn't mean it was used in Octave itself. Even if it was, that could
easily be remedied. Still, there may be many scripts out there using
this construct.

>  For the second remark: after reading a R tutorial, the notion doesn't look
>  confusing at all.
>

Nor does it seem so after reading a Python tutorial or a Fortran (95)
tutorial. It is even simulated in the Boost library for C++ (though I
doubt it is used much).

>  Now for some more substancials comments. A possible target for this 'named
>  arguments' paradigm is low-level functions. That is, functions which are
>  called often. Using such paradigm within the octave interpreter means faster
>  execution. Using parseparams means more code, more function calls, more
>  computer cycles, and so forth.
>  My vision is that this named argument paradigm brings a level of simplicity.
>  You have f.i. to cope with various statistical functions, you know they have
>  parameters called 'mean' and 'sd'.  You just construct  the calls this way:
>  somefunc(mean = this, sd = that). The interpreter will perform the matching.
>  So if you have two sets of functions written by two teams with different
>  calling conventions, you don't have to construct the call accordingly and
>  use either if construct either eval(). Handled within the Octave
>  interpreter, it's faster.

Yeah, no objections to that. No doubt it would be faster than using
cell arrays (& parseparams), but the question is how much faster. I'd say
that unless someone implements this and evaluates the speed improvements,
this is too little of an argument. Further, even if you did, one could as
well implement parseparams as a built-in or DLD function, and I bet it
would be a competitive speed improvement (but a much smaller-scale change).

>
>  Regards
>
>  Pascal




Re: functions with 'named' arguments

by John W. Eaton

On 19-Mar-2008, Dupuis wrote:

| For the first remark: a grep on all files under /usr/share/octave/3.0.0/
| showed that the pattern
| '([^<>!=)]*=[^=)]*)' is only found within strings or if clauses. So the
| pattern of default arguments seems not yet used inside Octave sources.

It's used in one file in the current sources, and I would like to see
it used more frequently to replace code like

  if (nargin < 4)
    arg4 = default_val;
  endif

that we currently have in many Octave functions.
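
For reference, a complete function using this nargin idiom might look like the sketch below. The function scale_shift and its default values are made up for illustration.

```octave
1;  # script marker, so the function below can be defined inline

# Hypothetical function with two optional trailing parameters.
function y = scale_shift (x, factor, offset)
  if (nargin < 2)
    factor = 2;    # default when only x is given
  endif
  if (nargin < 3)
    offset = 0;    # default when offset is left out
  endif
  y = factor * x + offset;
endfunction

disp (scale_shift (3));         # 6
disp (scale_shift (3, 10));     # 30
disp (scale_shift (3, 10, 1));  # 31
```

Note that this style only allows trailing arguments to be omitted: there is no way to give offset while leaving factor at its default.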

In any case, I think the use of default function parameter values in
function definitions is completely separate from the issue of whether
to support named arguments, which are used in function calls, not
function definitions.

The only way that we could support named arguments in function calls is
with some syntax other than "name = value".  I think we could use
"name := value" but as Shai noted, this would be an extension to
Matlab and those things tend to come back to bite us later.

Also, as others have said, the typical Matlab style is to either use
keyword/value pairs or an option structure and although it is a bit
more work for the person writing the function, I think it does provide
approximately the same functionality as named arguments for the person
calling the function.
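
As an illustration of the option-structure style, here is a minimal sketch. The function opt_mean and the option names na_rm and scale are invented for the example and are not an existing Octave interface.

```octave
1;  # script marker, so the function below can be defined inline

# Hypothetical function taking data plus an optional struct of options.
function m = opt_mean (x, opts)
  cfg = struct ("na_rm", false, "scale", 1);   # defaults

  if (nargin >= 2)
    # Overwrite defaults with whatever the caller supplied,
    # rejecting option names that are not known.
    given = fieldnames (opts);
    for k = 1:numel (given)
      if (! isfield (cfg, given{k}))
        error ("opt_mean: unknown option '%s'", given{k});
      endif
      cfg.(given{k}) = opts.(given{k});
    endfor
  endif

  if (cfg.na_rm)
    x = x(! isnan (x));   # drop missing values before averaging
  endif
  m = cfg.scale * mean (x);
endfunction

disp (opt_mean ([1 2 3]));                            # 2
disp (opt_mean ([1 NaN 3], struct ("na_rm", true)));  # 2
```

The caller names only the options that differ from the defaults, in any order, which is close to what named arguments would offer.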

Finally, if you really want this feature, then I think you should
submit a patch.  But I think that adding this feature might not be
trivial.  Here are some things to think about that may cause trouble:

  * Is there a way to separate positional arguments from named
    arguments, or are all arguments allowed to be passed as named
    arguments?

  * Do functions need to declare that they accept named arguments, or
    are named arguments automatically allowed for all function calls?

  * How should named arguments be handled when the argument is not
    supplied when the function is called?  For example, given the
    function definition

      function foo (a, b)
        ...
      endfunction

    what happens when you call it with

      foo (b = 13)

    ?  What value is assigned to A?  What is the value of nargin?  If
    all functions are allowed to be called with named arguments, and
    some are allowed to be omitted when the function is called, then I
    think this will cause some trouble with the way nargin is usually
    used, since normally if nargin == 1, that means that the first
    argument has been provided.

So maybe adding named arguments is not just a simple matter of syntax.

jwe

Re: functions with 'named' arguments

by Dupuis


John W. Eaton wrote:
> In any case, I think the use of default function parameter values in
> function definitions is completely separate from the issue of whether
> to support named arguments, which are used in function calls, not
> function definitions.
>
> The only way that we could support named arguments in function calls is
> with some syntax other than "name = value".  I think we could use
> "name := value" but as Shai noted, this would be an extension to
> Matlab and those things tend to come back to bite us later.
>
> Also, as others have said, the typical Matlab style is to either use
> keyword/value pairs or an option structure and although it is a bit
> more work for the person writing the function, I think it does provide
> approximately the same functionality as named arguments for the person
> calling the function.
>
> Finally, if you really want this feature, then I think you should
> submit a patch.  But I think that adding this feature might not be
> trivial.  Here are some things to think about that may cause trouble:
>
>   * Is there a way to separate positional arguments from named
>     arguments, or are all arguments allowed to be passed as named
>     arguments?
>
>   * Do functions need to declare that they accept named arguments, or
>     are named arguments automatically allowed for all function calls?
>
>   * How should named arguments be handled when the argument is not
>     supplied when the function is called?  For example, given the
>     function definition
>
>       function foo (a, b)
>         ...
>       endfunction
>
>     what happens when you call it with
>
>       foo (b = 13)
>
>     ?  What value is assigned to A?  What is the value of nargin?  If
>     all functions are allowed to be called with named arguments, and
>     some are allowed to be omitted when the function is called, then I
>     think this will cause some trouble with the way nargin is usually
>     used, since normally if nargin == 1, that means that the first
>     argument has been provided.
>
> So maybe adding named arguments is not just a simple matter of syntax.

Hello John,

It seems that R uses this approach:
1) create two lists, one (later called the first list) with the arguments from the function definition, and one (later called the second list) with the arguments in the actual call
2) move the named arguments from the second list to a third list, and remove the arguments with corresponding names from the first list. Then perform a second matching step on what remains, on a first-come, first-served basis.
3) match the arguments from the third list to the arguments of the first list.
4) check for errors:
   - unmatched arguments in the first list without a default value => error message, stop processing
   - extraneous, unmatched arguments, either named or not, and no varargin => error message, stop processing
   - otherwise, extraneous arguments are moved to varargin
So:
- an argument without a default value is mandatory
- extraneous arguments are either processed by subfunctions, or cause an error message if unrecognised.

This means:
1) arguments are either positional or named, without anything special in the function definition
2) given
foo = function(a, b) {
  print(a);
  print(b);
}
foo(b=3) => error message, a is missing and no default value given
foo(b=3, a=2) => prints a, then b
foo(b=3, a=2, b=5) => error: formal argument b corresponding to more than one input argument
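
The matching order described above can be sketched in Octave as follows. This is only an illustration under simplifying assumptions: match_args is a hypothetical helper, not part of Octave or R, and it leaves out default values and the varargin case.

```octave
1;  # script marker, so the function below can be defined inline

# Hypothetical helper: bind call arguments to formal parameter names,
# named arguments first, then positional ones in order.
#   formals     cellstr of parameter names, e.g. {"a", "b"}
#   positional  cell array of unnamed values, in call order
#   named       cell array {"name", value, ...}
function bound = match_args (formals, positional, named)
  bound = struct ();

  # Pass 1: named arguments are matched by name, wherever they appear.
  for k = 1:2:numel (named)
    name = named{k};
    if (! any (strcmp (name, formals)))
      error ("match_args: unused argument %s", name);
    elseif (isfield (bound, name))
      error ("match_args: formal argument %s matched more than once", name);
    endif
    bound.(name) = named{k+1};
  endfor

  # Pass 2: the formals still free take the positional values in order.
  free = formals(! isfield (bound, formals));
  if (numel (positional) > numel (free))
    error ("match_args: too many arguments");
  endif
  for k = 1:numel (positional)
    bound.(free{k}) = positional{k};
  endfor

  # Pass 3: anything still unbound is missing (no defaults in this sketch).
  missing = formals(! isfield (bound, formals));
  if (! isempty (missing))
    error ("match_args: argument %s is missing, with no default", missing{1});
  endif
endfunction

b = match_args ({"a", "b"}, {}, {"b", 3, "a", 2});
printf ("a=%d b=%d\n", b.a, b.b);   # a=2 b=3

b2 = match_args ({"a", "b"}, {5}, {"b", 3});
printf ("a=%d b=%d\n", b2.a, b2.b); # a=5 b=3

try
  match_args ({"a", "b"}, {}, {"b", 3});
catch err
  disp (err.message);   # match_args: argument a is missing, with no default
end_try_catch
```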

Finally, about nargin: there should be something equivalent, but I have not yet found where.
Here are some examples of R calls: http://blog.moertel.com/articles/2006/01/20/wondrous-oddities-rs-function-call-semantics

Regards

Pascal

Re: functions with 'named' arguments

by John W. Eaton

On 21-Mar-2008, Dupuis wrote:

| This means:
| 1) arguments are either positionnal or named, without anything special at
| the function definition
| 2)  foo = function(a, b) {
| + print(a);
| + print(b);
| + }
| foo(b=3) => error message, a is missing and no default value given
| foo(b=3, a=2) => disp a then b
| foo(b=3, a=2, b=5) => error: formal argument b corresponding to more than
| one input argument
|
| Finally, about the nargin: there should be something equivalent, but I still
| didn't find where.
| Here are some examples of R calls:
| http://blog.moertel.com/articles/2006/01/20/wondrous-oddities-rs-function-call-semantics

Does R provide the equivalent of a nargin function?  If not, maybe
this is the reason?  If it does, then how does it work, and would it
be compatible with what Octave and Matlab currently do?

jwe

Re: functions with 'named' arguments

by Dupuis


John W. Eaton wrote:

| Does R provide the equivalent of a nargin function?  If not, maybe
| this is the reason?  If it does, then how does it work, and would it
| be compatible with what Octave and Matlab currently do?

Hello John,
sorry for the delay -- a few days of vacation.

I analysed R internals, more specifically the file src/bind.c, which provides the implementation of the c(...) function, a general-purpose constructor for vectors. Its only input argument is the "dotdotdot" (...) argument, and it returns a vector. It is basically similar to "[" in Octave.

It works as follows: the internal engine receives a list built the Lisp way: a set of nodes, each node made of a header and data. The header contains:
- a type indicator (float, complex, expression (formula), closure, environment, ...)
- an optional tag (the intended variable name, the left part of the = operator)
- pointers to previous and next element
The data is a union of the various types.

This list ends with a special node serving as end-of-list marker. The iteration is quite simple: process the current node, then its CDR, until the special marker is encountered.

With this approach, there is no (or better: I didn't find evidence of) nargin function, because the list is just processed linearly. The drawback is that the list length is not known in advance.

As I understand it, consider a function like
RP <- function(x=r*cos(theta), y=r*sin(theta), r=sqrt(x*x+y*y), theta=atan2(y,x)) {
  return(c(x, y, r, theta))
}

Example calls of this function:
- RP(3, 5) : there are data for two arguments (positional matching), then r and theta are evaluated.
- RP(r = 1, theta= pi/3) : r and theta are initialised from the elements with corresponding tags, then x and y are evaluated.
What is called 'lazy evaluation' is just delayed evaluation of expressions, permitting a very concise approach in this case at the level of definition and default values for the arguments.
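
The delayed evaluation of defaults can be modelled outside R. The Python sketch below is an illustration under stated assumptions, not R's actual implementation: Promise and rp are hypothetical names, each argument is wrapped in a memoised thunk, theta uses the usual polar convention atan2(y, x), and a default that ends up depending on itself is reported as an error. Unlike R, Python evaluates the caller's own expressions before the call, so only the defaults are delayed here.

```python
import math


class Promise:
    """A delayed expression: evaluated at most once, on first use."""

    def __init__(self, thunk):
        self._thunk = thunk
        self._state = "pending"  # pending -> running -> done
        self._value = None

    def force(self):
        if self._state == "running":
            # The expression depends on itself, directly or indirectly.
            raise RuntimeError("argument depends on itself")
        if self._state == "pending":
            self._state = "running"
            try:
                self._value = self._thunk()
            except Exception:
                self._state = "pending"
                raise
            self._state = "done"
        return self._value


def rp(x=None, y=None, r=None, theta=None):
    """Polar/Cartesian converter whose defaults refer to each other."""
    env = {}
    defaults = {
        "x": lambda: env["r"].force() * math.cos(env["theta"].force()),
        "y": lambda: env["r"].force() * math.sin(env["theta"].force()),
        "r": lambda: math.hypot(env["x"].force(), env["y"].force()),
        "theta": lambda: math.atan2(env["y"].force(), env["x"].force()),
    }
    supplied = {"x": x, "y": y, "r": r, "theta": theta}
    for name, value in supplied.items():
        if value is None:
            env[name] = Promise(defaults[name])      # use the delayed default
        else:
            env[name] = Promise(lambda v=value: v)   # caller-supplied value
    return tuple(env[n].force() for n in ("x", "y", "r", "theta"))
```

Calling rp(3, 5) forces only the defaults of r and theta, rp(r=1, theta=math.pi/3) forces only those of x and y, and rp() with no argument raises an error because every default needs another one.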

Regards

Pascal  

Re: functions with 'named' arguments

by John W. Eaton

On  3-Apr-2008, Dupuis wrote:

| I analysed R internals, more specifically the file src/bind.c, which
| provides the implementation of the c(...) function, a general-purpose
| constructor for vectors. Its only input argument is the "dotdotdot" (...)
| argument, and it returns a vector. It is basically similar to "[" in Octave.
|
| It works as follows: the internal engine receives a list built the Lisp
| way: a set of nodes, each node made of a header and data. The header
| contains:
| - a type indicator (float, complex, expression (formula), closure,
| environment, ...)
| - an optional tag (the intended variable name, the left part of the =
| operator)
| - pointers to previous and next element
| The data is a union of the various types.
|
| This list ends with a special node serving as end-of-list marker. The
| iteration is quite simple: process the current node, then its CDR, until the
| special marker is encountered.
|
| With this approach, there is no (or better: I didn't find evidence of)
| nargin function, because the list is just processed linearly. The drawback
| is that the list length is not known in advance.
|
| As I understand it, consider a function like
| RP <- function(x=r*cos(theta), y=r*sin(theta), r=sqrt(x*x+y*y), theta=
| atan2(y,x)) {
|   return(c(x, y, r, theta))
| }
|
| Example calls of this function:
| - RP(3, 5) : there are data for two arguments (positional matching), then r
| and theta are evaluated.
| - RP(r = 1, theta= pi/3) : r and theta are initialised from the elements
| with corresponding tags, then x and y are evaluated.
| What is called 'lazy evaluation' is just delayed evaluation of expressions,
| permitting a very concise approach in this case at the level of definition
| and default values for the arguments.

We don't need nargin to make argument lists work in Octave.  But we do
need to be able to provide a nargin function, and it must remain
compatible with the way that nargin currently works in Octave and
Matlab.  It doesn't matter what R does or doesn't do.

If you want named arguments to work in Octave, then I think you should
provide a patch that makes it work.  If you don't want to do that, or
you can't do it, then at the very least provide a proposal for how the
feature is supposed to work *in Octave*.  You don't need to get into
implementation details.  You need to explain what the syntax and the
semantics of the feature should be.  Your proposal must not break the
current semantics of nargin, as doing that would break too much
existing code.  If you provide a complete proposal for how the feature
is supposed to work, then perhaps someone else who can implement it
will agree that it is a useful feature and do the work to prepare a
patch.

jwe

Re: functions with 'named' arguments

by Dupuis


John W. Eaton wrote:

| We don't need nargin to make argument lists work in Octave.  But we do
| need to be able to provide a nargin function, and it must remain
| compatible with the way that nargin currently works in Octave and
| Matlab.  It doesn't matter what R does or doesn't do.
|
| If you want named arguments to work in Octave, then I think you should
| provide a patch that makes it work.  If you don't want to do that, or
| you can't do it, then at the very least provide a proposal for how the
| feature is supposed to work *in Octave*.  You don't need to get into
| implementation details.  You need to explain what the syntax and the
| semantics of the feature should be.  Your proposal must not break the
| current semantics of nargin, as doing that would break too much
| existing code.  If you provide a complete proposal for how the feature
| is supposed to work, then perhaps someone else who can implement it
| will agree that it is a useful feature and do the work to prepare a
| patch.

For the first part: historically, Matlab introduced mechanisms to provide generic functions, first with a fixed number of extra arguments (think of optim, which in the past accepted extra arguments P0 up to P9 for the function to be optimised); later this was replaced by the varargin mechanism. In both cases, this means:
- for the programmer: deciding the role and the meaning of each extra argument. Inside the function, (s)he has to take actions based upon the number of input arguments, their types, and so on.
- for the user: (s)he must know that the third extra argument is for such a variable, that for some special treatment the fourth argument must be set to logical one, and so on.
This mechanism is error-prone: for instance, if some argument may be either 'LS' or 'TLS', care has to be exercised in order to avoid confusion. Furthermore, unrecognised arguments must be explicitly detected. A last point is that evaluation occurs early: f(3*sin(x)) requires the computation to be performed before the function is called; the mechanism is basically call-by-value.

With the suggested approach, the function interface is clearly defined, the list of accepted extra arguments is public, and the matching between user-supplied values and function variables is performed by the internal engine. Unrecognised arguments are automatically trapped, and default values are evident from the function definition. This is a comfort for both the programmer and the user. The programmer knows the right variable will be initialised, with no need to care for nargin or testing. The user can modify default values with great flexibility. Evaluation occurs later, and only when required.

From a performance perspective:
- compilation will require a bit more work
- run-time won't change because the bindings must be known at compile-time  
- in both cases, the parsing and error-detection code will disappear from the function body

About the nargin function: the number of arguments is detectable at compile time, so this doesn't introduce compatibility problems. To me, the biggest change is a switch between two schemes:
- current: early evaluation, variables evaluated in the context of the caller
- proposed: delayed evaluation, binding occurs in the context of the callee

For the second part: I don't have enough free time to program the suggested features within a reasonable time, and I strongly agree that we should think first, and program later. So here is my tentative proposal:
This is a suggestion to introduce tagged arguments into Octave. The purpose is to define a mechanism where a function interface can be totally determined at compile-time. The interface will be defined by the three following parts, in the specified order:
1) [optional] required arguments
2) [optional] variable number of unspecified arguments
3) [optional] optional arguments, with a known default value if they're not initialised within the function call

-Required arguments are in the form tag_1, tag_2, and so on. Failure to provide them all must be detected as a compile-time error.
-Unspecified arguments are in the form "..."
-A tagged argument is of the form: tag_1 = some expression, tag_2 = another expression, and so on. The tag will be used to perform the binding between some user-supplied value or expression and the corresponding local variable in the function environment. The binding will be order-insensitive: f(tag_1 = some val, tag_2 = other val)
and
 f(tag_2 = other val, tag_1 = some val)
will result in the same initialisation for the local variables. Partial matching will be allowed, if unambiguous: let some function be defined as f(mean=1, median=2). Then
f(m=5); f(me=7) are ambiguous
f(mea=4); f(med=9) are unambiguous,
and f(meanvalue=3); f(sd=4) will result in an error. In the same way, f(mean=2, mean=4) must result in an error, 'Repeated input value'.
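
The partial-matching rule can be sketched in a few lines. This is a Python model of the proposed behaviour, not existing Octave code: resolve_tag and resolve_call are hypothetical names, and the rule that an exact match always wins over a prefix match is my own assumption (the proposal does not state it).

```python
def resolve_tag(tag, formals):
    """Map a possibly abbreviated tag to exactly one formal argument name."""
    if tag in formals:  # assumption: an exact match always wins
        return tag
    candidates = [name for name in formals if name.startswith(tag)]
    if not candidates:
        raise KeyError(f"unknown argument '{tag}'")
    if len(candidates) > 1:
        raise KeyError(f"ambiguous argument '{tag}': matches {candidates}")
    return candidates[0]


def resolve_call(tagged, formals):
    """Resolve (tag, value) pairs in call order, rejecting repeated arguments."""
    bound = {}
    for tag, value in tagged:
        name = resolve_tag(tag, formals)
        if name in bound:
            raise ValueError(f"repeated input value for '{name}'")
        bound[name] = value
    return bound
```

With formals = ["mean", "median"], the tags "mea" and "med" resolve to one name each, "m" and "me" are rejected as ambiguous, "meanvalue" and "sd" are rejected as unknown, and giving mean twice is rejected as a repeated input value.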

At the function call, the binding must occur this way:
1) tagged arguments are tentatively matched with the function definition. Mismatched or ambiguous ones must be assigned to the varargin list, if any, or flagged as errors.
2) untagged arguments are then matched on a first-come, first-served basis, skipping over the arguments already bound thanks to a matching tag
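
The two binding steps above can be sketched as follows. Again this is a Python model under assumptions, not an Octave implementation: bind_arguments and has_varargin are hypothetical names, and sending surplus untagged values to varargin (or raising an error when there is none) is my reading of the R behaviour described earlier in the thread.

```python
def bind_arguments(formals, positional, tagged, has_varargin=False):
    """Bind call arguments to formal names: tags first, then positions.

    formals    -- formal argument names, in declaration order
    positional -- untagged values, in call order
    tagged     -- (tag, value) pairs, in call order
    Returns (bound, varargin).
    """
    bound, varargin = {}, []

    # Step 1: tagged arguments, with unambiguous prefix matching.
    for tag, value in tagged:
        if tag in formals:
            matches = [tag]
        else:
            matches = [f for f in formals if f.startswith(tag)]
        if len(matches) == 1:
            name = matches[0]
            if name in bound:
                raise ValueError(f"repeated input value for '{name}'")
            bound[name] = value
        elif has_varargin:
            varargin.append((tag, value))  # left for the function body
        else:
            raise ValueError(f"unmatched or ambiguous argument '{tag}'")

    # Step 2: untagged arguments fill the still-unbound formals in order.
    free = [f for f in formals if f not in bound]
    for value in positional:
        if free:
            bound[free.pop(0)] = value
        elif has_varargin:
            varargin.append((None, value))
        else:
            raise ValueError("too many arguments")
    return bound, varargin
```

For example, with formals a, b, c, a call carrying the untagged values 1 and 2 plus the tagged argument b=9 binds b first, then gives 1 to a and 2 to c, which is the "skipping over" behaviour of step 2.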

The initialisation must be performed this way:
1) expressions in the function call must be evaluated in the caller context, in order to obtain a value for each argument. F.i., f(3*sin(x)) must become f(some_value)
2) local variables from the argument list, where a binding is available through the actual function call, will receive the bound value
3) local variables from the argument list without a binding and with a constant initialiser will receive this initialiser
4) local variables from the argument list without a binding and where the initialiser is an expression at the level of the function DEFINITION will receive the value computed from the expression, where the variables will be evaluated in the function context first. If, at the time of the evaluation, some of the variables are still not initialised, this must be trapped as an error

Basically:
1) is similar to the current behaviour
2) is not currently available, and we have to choose whether we introduce a new operator or break compatibility, as now f(x=3) will result in x being set in the caller context
3) is similar to the new syntax where initialisers are permitted
4) is a behavioural change: the function definition may provide some expression which will be evaluated, if required, in the function context


Regards

Pascal