On particles, Sources, fields


by Christos KK Loverdos

[moving to scala-debate, was Re: [scala-user] feedback on  
scala.io.Source]
Hi all,

This email is inspired by the Source.getLines situation.

I feel a little awkward with the Source API. I admit I have rarely  
used Source and also I have not followed the implementation details.  
So, think of my comments as being of a general design-oriented nature.

I believe we MAY need to decide the target audience of the Source API  
and I see two distinctions here. The reason I say this is because  
Source appears to be a low-level API, with all of its support for  
defining line endings etc AT THE CALL SITE, but it seems its main uses  
are more higher-level ones.

First, we have the regular, application user, who just wants to  
iterate over the lines of a text file in the most straightforward way.  
Such an application user does not even have to specify the line  
ending, simply because he is lazy and does not care. This is common
ground. Note that "does not have to specify" should mean "he doesn't
get a chance to specify" and the API must enforce the latter.
==> I feel this higher-level API is what Source should be.

Second, we have some library user, strictly lower-level than the
application user, who truly wants some better control of what is going
on underneath. This kind of user may need to use a few more parameters
to define the exact behavior of Source's implementation. Thus, we give
him the ability to control line endings etc at the call site.
==> I feel that currently the Source API fails to support such a
lower-level goal, although it pretends to. For instance, a lower-level
user would want the line endings to appear in the returned lines.
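
To illustrate the lower-level wish, here is a minimal Scala sketch that splits text into lines while keeping each terminator attached to its line. The names KeepTerminators and linesWithEndings are hypothetical and not part of scala.io.Source, and the sketch assumes only \n, \r and \r\n count as terminators.

```scala
object KeepTerminators {
  // Split after "\n", or after a "\r" that is not followed by "\n",
  // so every returned line still carries its own terminator.
  private val Boundary = "(?<=\n)|(?<=\r)(?!\n)"

  def linesWithEndings(text: String): List[String] =
    if (text.isEmpty) Nil else text.split(Boundary).toList
}
```

A caller who wants the bare lines can still strip the terminators afterwards, while the reverse is not possible once they have been dropped.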

But now look what is happening. I have a getLines method (actually  
"getLines()" with the parentheses for the casual, application-level  
user) which is broken. The line-ending defaults to the platform one,  
instead of defaulting to WHATEVER LINE ENDING THE FILE OF INTEREST  
has. So, if I am on Unix and read a file generated with the \r\n  
Windows endings, then I get the \r in my lines. This violates the  
principle of least surprise.
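
For comparison, a small sketch of the behavior I would expect by default: accept whatever terminator the input uses. It leans on java.io.BufferedReader.readLine, which treats \n, \r and \r\n as line ends. The object and method names are made up for illustration and this is not how Source is implemented.

```scala
import java.io.{BufferedReader, StringReader}

object AnyLineEnding {
  // Returns the lines of `text` without terminators, whichever of
  // "\n", "\r" or "\r\n" the text happens to use (even mixed).
  def lines(text: String): List[String] = {
    val reader = new BufferedReader(new StringReader(text))
    Iterator.continually(reader.readLine()).takeWhile(_ != null).toList
  }
}
```

With this approach a file with Windows endings read on Unix yields lines with no stray \r.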

I think the heart of the problem is we have not sat down and WRITTEN
the requirements of the Source API from a user perspective. We are in
such a hurry to delve into implementation and abstract away
implementation things, instead of abstracting away at a higher level.
Of course, you should understand by now (via my general involvement in
the Scala community) that I have the deepest respect for each and every
contributor, so the above statement is not a (poisonous) arrow. Or, to
put it otherwise, if your brain insists on considering it as such,
please count me in the target group.

I have the impression that another possible cause may be the fact that  
Source is burdened with too many concerns (is it true?). It could be  
better to separate those concerns into different implementation units.  
But then again, we need to write down those concerns first.

I am making all these comments because I do not want to see another  
lame (as the community has decided, AFAIK) Source creep into Scala  
3.0, which is a major release. I generally feel we do not have to  
stuff as many new features as possible into 3.0.

Of course, I may be off... But then again, this is debate and I beg  
for your...

...Thoughts?


Christos

P.S.1 As a side-note, the default line-endings approach could be more  
meaningful when we WRITE stuff, than when we READ.
P.S.2 The funny subject is inspired by Julian Schwinger's book title http://bit.ly/uUJSH

On Aug 18, 2009, at 2:36 AM, Paul Phillips wrote:

> On Mon, Aug 17, 2009 at 04:30:40PM -0700, Silvio Bierman wrote:
>> If this is something that is a consequence of default parameter  
>> values
>
> It is.
>
>> Is this a temporary imperfection of the current implementation?
>
> This was a conscious compromise among several competing concerns.  It
> was determined that you can't leave off the parameter list entirely if
> there are defaults.  You can probably find all the discussion in the
> scala-internals archive.
>
> --
> Paul Phillips      | Where there's smoke, there's mirrors!
> Apatheist          |
> Empiricist         |
> ha! spill, pupil   |----------* http://www.improving.org/paulp/ *----------
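
For readers who have not met the rule quoted above, here is a minimal sketch of a method whose only parameter has a default value. The Text class and its getLines are hypothetical stand-ins, not the real Source API.

```scala
object DefaultParams {
  // Hypothetical stand-in for a getLines-like method:
  // its single parameter has a default value.
  class Text(content: String) {
    def getLines(separator: String = "\n"): List[String] =
      content.split(java.util.regex.Pattern.quote(separator)).toList
  }
}
```

Calling `new DefaultParams.Text("a\nb").getLines()` applies the default separator. Written without the parentheses, `getLines` is not a call that applies the default, which is why the casual user ends up typing "getLines()".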

--
  __~O
-\ <,       Christos KK Loverdos
(*)/ (*)      http://ckkloverdos.com






Re: On particles, Sources, fields

by Paul Phillips-3

FYI I totally agree that it should just deal with whatever line ending
it finds, and that's the way I originally wrote it, and that's what will
be the default if it's up to me.  However most of the annoying work I'm
doing here revolves around the decision to leave the platform default
encoding in place when it's left unspecified.  Line ending defaults
aren't the same as charset defaults, but it is all tied together if one
seeks anything resembling consistency.
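
A rough sketch of the two defaults in question, using only JDK calls: the JVM reports a platform charset and a platform line separator, and a reader can either rely on them or take the charset explicitly and accept any terminator. The object and method names are illustrative assumptions, not the Source implementation.

```scala
import java.io.{BufferedReader, ByteArrayInputStream, InputStreamReader}
import java.nio.charset.Charset

object ExplicitDefaults {
  // The two platform defaults under discussion, as the JVM reports them.
  def platformCharset: Charset = Charset.defaultCharset()
  def platformLineSeparator: String = System.getProperty("line.separator")

  // Decode with a caller-supplied charset and accept any of
  // "\n", "\r" or "\r\n" as a line terminator.
  def readLines(bytes: Array[Byte], charset: Charset): List[String] = {
    val reader = new BufferedReader(
      new InputStreamReader(new ByteArrayInputStream(bytes), charset))
    try Iterator.continually(reader.readLine()).takeWhile(_ != null).toList
    finally reader.close()
  }
}
```

Here the charset is a genuine parameter because the bytes cannot say which one they use, whereas the terminator can be read off the data itself.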

--
Paul Phillips      | One way is to make it so simple that there are
Stickler           | obviously no deficiencies. And the other way is to make
Empiricist         | it so complicated that there are no obvious deficiencies.
i pull his palp!   |     -- Hoare

Re: On particles, Sources, fields

by Kevin Wright-4

Even at system level, I'd want to just have a bunch of lines that I can iterate over - without caring for the exact nature of the delimiters.  Including the line terminators in this scenario would be equivalent to putting "," into every other element of a string array.

I'm struggling to imagine a scenario where I could open a file that contained mixed line endings, or where I'd need to handle \r, \n, \r\n, or \n\r differently within a single file, unless you truly intend for \r to be a carriage return without a line feed so that you can overwrite text to give underlined or bold characters.  I believe this technique died out with punch cards, although I fully appreciate the need to process and import old data sets...

Agreed, once a file is opened then any subsequent appended lines should match whatever encoding is already in place.  This is an important requirement and should be clearly specified.  It should also be transparent to the user of the library.
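
A minimal sketch of that requirement, reading "encoding" here as the line-ending convention (my assumption): detect the first terminator already present and reuse it when appending. MatchExisting, detect and appendLine are hypothetical names, and the "\n" fallback for text with no terminator is an arbitrary choice.

```scala
object MatchExisting {
  // "\r\n" is listed first so a Windows ending is not mistaken for a bare "\r".
  private val Terminator = "\r\n|\n|\r".r

  // The first terminator found in the text, if any.
  def detect(text: String): Option[String] = Terminator.findFirstIn(text)

  // Append `line` using the terminator the text already uses.
  def appendLine(text: String, line: String, fallback: String = "\n"): String = {
    val eol = detect(text).getOrElse(fallback)
    val endsWithEol = text.isEmpty || text.endsWith("\n") || text.endsWith("\r")
    val base = if (endsWithEol) text else text + eol
    base + line + eol
  }
}
```

The caller never names a terminator, which is the transparency asked for above.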


Anything deeper than this I would argue that you aren't actually working with lines at all, and should really be looking at a character-level API.



On Wed, Aug 19, 2009 at 4:57 PM, Christos KK Loverdos <loverdos@...> wrote:
> [snip]

Re: On particles, Sources, fields

by Ricky Clarkson

> I'm struggling to imagine a scenario where I could open a file that
> contained mixed line endings

Most of our code and related activity uses whatever Linux uses for
line endings.  One pom.xml contains Windows line endings.  I edited
this with vim in Linux, adding in Linux-style line endings.  This file
now has mixed line endings.  I am quite glad that this does not
perturb anything I use to open files.

--
Ricky Clarkson
Java Programmer, AD Holdings
+44 1565 770804
Skype: ricky_clarkson
Google Talk: ricky.clarkson@...

Re: On particles, Sources, fields

by Kevin Wright-4



On Wed, Aug 19, 2009 at 6:36 PM, Ricky Clarkson <ricky.clarkson@...> wrote:
> > I'm struggling to imagine a scenario where I could open a file that
> > contained mixed line endings
>
> Most of our code and related activity uses whatever Linux uses for
> line endings.  One pom.xml contains Windows line endings.  I edited
> this with vim in Linux, adding in Linux-style line endings.  This file
> now has mixed line endings.  I am quite glad that this does not
> perturb anything I use to open files.

Wow, now that's fast feedback with an example.  So close to home as well...
 



Re: On particles, Sources, fields

by Randall Schulz

On Wednesday August 19 2009, Ricky Clarkson wrote:
> > I'm struggling to imagine a scenario where I could open a file that
> > contained mixed line endings
>
> Most of our code and related activity uses whatever Linux uses for
> line endings.  One pom.xml contains Windows line endings.  I edited
> this with vim in Linux, adding in Linux-style line endings.  This
> file now has mixed line endings.  I am quite glad that this does not
> perturb anything I use to open files.

I guess Vim has lots of modes, 'cause in my experience (on Linux) it
adapts to the line endings of the file being edited and preserves them
when you save it unless you explicitly do something like:

:set ff=unix

or

:set ff=dos


And if issuing one of these commands actually changes the file's bytes,
you'll have to save it (or explicitly discard the changes) to quit,
just as if you'd used a Vim editing command.


Randall Schulz

Re: On particles, Sources, fields

by Kevin Wright-4


On Wed, Aug 19, 2009 at 6:36 PM, Ricky Clarkson <ricky.clarkson@...> wrote:
> > I'm struggling to imagine a scenario where I could open a file that
> > contained mixed line endings
>
> Most of our code and related activity uses whatever Linux uses for
> line endings.  One pom.xml contains Windows line endings.  I edited
> this with vim in Linux, adding in Linux-style line endings.  This file
> now has mixed line endings.  I am quite glad that this does not
> perturb anything I use to open files.
 
I just thought... Doesn't SVN reformat line endings as appropriate for the local environment?