Re: [TECH] language - was [PIC] using BREAK in 'C'

On Thu, 16 Jul 2009, Gerhard Fiedler wrote:

> sergio masci wrote:
>
> >>> It is now clear to me that you are talking about intrinsic
> >>> functions. Yes the function is defined in a /standard/ library but
> >>> the compiler also knows about the function independently of the
> >>> library.
> >>
> >> Would you call the C++ standard library functions (like
> >> std::list::insert) "intrinsic functions"? At least the meaning that
> >> this term seems to have in the Microsoft VC++ and gcc compiler
> >> documentation is not what I'm talking about.
> >
> > No.
>
> But this is what I'm talking about. So I was right with my suspicion
> that I wasn't talking about what you called "intrinsics".
>
> > home in on "'knows' the intent of the line"
>
> This is what I'm trying to do. This is the reason why I want to
> understand what it is that makes a function that is defined by the
> standard (and is implemented in a library) different from a construct
> that is defined in the same standard and implemented inside the
> compiler. I haven't yet seen an example by you that I could understand
> -- and that wasn't about something else.

*** YOU *** are saying "'knows' the intent of the line". *** I *** am
saying "'knows' the intent of several separate lines as one unit".

You are happy to see:

    X = X + 0;

reduced to NOTHING (not a criticism). You have an expectation of this
because the compiler understands the intent of the line.

I am saying something like:

    j = x * 2 + y * 2;

    temp = arr[j];

    j = (x + y) * 2;

    arr[j] = temp;

should also be reduced to nothing *** AND *** the compiler should warn the
user that this piece of code has no effect and so may be in error. This is
not possible if you understand only the intent of single lines but can be
fudged by the compiler by keeping track of what is evaluated and where the
result is placed.
So although the compiler might be able to reduce it to zero, it doesn't
understand that this might be an error.

Things get much more difficult if we re-write the above as:

    int func_get(int arr[], int x, int y)
    {
        int j;

        j = x * 2 + y * 2;

        return arr[j];
    }

    void func_set(int arr[], int x, int y, int val)
    {
        int k;

        k = (x + y) * 2;

        arr[k] = val;
    }

    temp = func_get(arr, x, y);

    func_set(arr, x, y, temp);

Ok the compiler *** MIGHT *** still be able to hack this if it is very
smart and all the source is available for the functions. BUT change the
type of the array from a straightforward int to a struct and things get
incredibly complicated (I mean they are really complicated now with int
but they will get MUCH worse with a struct :)

Here having built-in types like STRING, LIST etc reduces the complication
because the language and compiler control the way they are used. It's like
the difference between a menu interface and a command line interface. With
the menu interface you guide the user's interactions and there can be no
unexpected commands issued that will break things (kind of).

> >> Right. My point was that if 8-bit strings are built-in and Unicode
> >> strings are in a library, and you are claiming (elsewhere) that the
> >> built-in syntax can be different from library syntax, then I need to
> >> make the Unicode string syntax completely different -- structurally
> >> different -- from 8-bit string syntax. Can you imagine that? What a
> >> pain.
> >
> > Yes you are right it would be a pain. But it is a move in the right
> > direction.
>
> I'm not sure. I think I wouldn't like it. For example, it seems that for
> some things Pascal-style strings are more efficient than C-style
> strings. There's nothing that prevents me from working with Pascal-style
> strings efficiently in C++ (and no matter whether ASCII, 8-bit with
> different codepages, different encodings of Unicode) -- in the same
> idiom that is used for standard strings in C++.
> This is because strings are /not/ built into the language (among other
> things).
>
> Being able to work with similar types in a similar way is important for
> code quality. If you need to use a different idiom for different string
> encodings, code quality goes down -- and code quality is important.

But the same argument could be made for floating point. Many people
require different precision (whether greater or less than that provided
by C) yet they either use what is available in a way that suits them or
they use a specialised library *** AS WELL *** as the supported floating
point.

Often people will fit the solution to the tools available.

> > I think I understand what you mean. But just to clarify: You are
> > suggesting that instead of simply defining a function as:
> >
> > substr(char *, int);
> > substr(char *, int, int);
> >
> > That I could instead write:
> >
> > substr(char * 'FROM' int);
> > substr(char * 'FROM' int 'TO' int);
> > substr(char * 'FROM' int 'LENGTH' int);
> >
> > Interesting.
>
> That's one possible form, but it's not quite what I meant. What I meant
> is that you have a way to declare the function and syntax so that you
> actually can write
>
> SUBSTR str1 FROM pos1 TO end
>
> and the compiler knows what library to call in which way. This doesn't
> look too complicated, and it's not actually that different from
>
> SUBSTR_FROM_TO str1, pos1, end
>
> Just a slightly different syntax.
>
> > But you'd still need to be able to attach a boat load of attributes to
> > the function to give the compiler the same capabilities it would have
> > if these functions were actually built-in language constructs and the
> > compile times would be horrendous.
>
> I don't understand why. This doesn't seem to be much more complicated to
> parse than a normal function call. It's a starting token SUBSTR,
> followed by five tokens that have to be three expressions interspersed
> with FROM and TO.

It's not the parsing that's the problem.
The problem is providing information to the compiler about the way these
functions interact. I'm not talking about the calling protocol (the way
parameters are evaluated and passed on the stack or the result is
returned), I'm talking about how functions relate to each other over
several lines of code.

Consider this:

    void process_substr(char *str, int pos1, int pos2)
    {
        char *tmp_str;
        int len;

        len = pos2 - pos1;

        tmp_str = malloc(len + 1);

        strncpy(tmp_str, (str + pos1), len);
        tmp_str[len] = '\0';    /* strncpy does not terminate the copy */

        ...
        ...
        ...

        free(tmp_str);
    }

Now turn this into a real example:

    char * string_alloc(int len)
    {
        char *tmp_str;

        tmp_str = malloc(len + 1);

        return tmp_str;
    }

    void string_assign(char *dst_str, char *src_str)
    {
        strcpy(dst_str, src_str);
    }

    char * string_substr(char *str, int pos1, int pos2)
    {
        /* buffer owned by this function and reused on every call */
        static char *own_str;
        int len;

        len = pos2 - pos1;

        own_str = realloc(own_str, len + 1);

        strncpy(own_str, (str + pos1), len);
        own_str[len] = '\0';

        return own_str;
    }

    void string_release(char *str)
    {
        free(str);
    }

    void process_substr(char *str, int pos1, int pos2)
    {
        char *tmp_str;
        int len;

        len = pos2 - pos1;

        tmp_str = string_alloc(len + 1);

        string_assign(tmp_str, string_substr(str, pos1, pos2));

        ...
        ...
        ...

        string_release(tmp_str);
    }

In the above, how would I tell the compiler the relationship between
string_alloc, string_assign, string_substr and string_release such that
the compiler is able to:

(1) ... indicate an error if tmp_str is not initialised with string_alloc
before string_assign is used on it (simply assigning a value to tmp_str is
not what I mean, I mean actually using string_alloc to initialise it).

(2) ... indicate an error if tmp_str goes out of scope before it is
cleaned up with string_release

(3) ... optimise the combined use of string_assign and string_substr

(4) ... recognise that tmp_str is undefined after the use of
string_release

This is why you need to be able to add special attributes to the
functions, to be able to tell the compiler about all this stuff.
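Checks (1), (2) and (4) above amount to tracking a small state machine for
each string variable. As a rough sketch only (not how any particular
compiler does it), the rules can be written as a checker that walks the
sequence of calls applied to one variable. The names str_state, str_op and
check_sequence are invented for this illustration, and point (3), the
optimisation, is not covered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states a string variable can be in. */
enum str_state { UNINIT, ALLOCATED, RELEASED };

/* Hypothetical operations, one per function in the example above. */
enum str_op { OP_ALLOC, OP_ASSIGN, OP_RELEASE };

/*
 * Walk the operations applied to one variable, in order, and return the
 * number of rule violations found.
 */
int check_sequence(const enum str_op *ops, size_t n)
{
    enum str_state state = UNINIT;
    int errors = 0;
    size_t i;

    assert(ops != NULL || n == 0);

    for (i = 0; i < n; i++) {
        switch (ops[i]) {
        case OP_ALLOC:
            state = ALLOCATED;
            break;
        case OP_ASSIGN:
            /* rules (1) and (4): only valid on an allocated string */
            if (state != ALLOCATED)
                errors++;
            break;
        case OP_RELEASE:
            /* releasing something never allocated, or releasing twice */
            if (state != ALLOCATED)
                errors++;
            state = RELEASED;
            break;
        }
    }

    /* rule (2): variable leaves scope while still allocated */
    if (state == ALLOCATED)
        errors++;

    return errors;
}
```

A sequence of alloc, assign, release yields 0 violations; dropping the
release, or assigning after the release, yields 1. The point of the sketch
is that this table of states and transitions is exactly the information
that a plain function prototype does not carry.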
And yes I know about C++ (I've been using it for many years) but creating
a boat load of classes to *** TRY *** to do the same thing is not the
same.

> > I think it would be do-able but you'd still need some kind of meta
> > language to describe how these functions would interact with each
> > other, their parameters and local variables that are used with them
> > (from outside the call and not just as a parameter).
>
> I'm not sure what you mean by "interaction". The interaction of
> functions is defined by the language standard that defines what they do.
> Their arguments would be described just as they are in any procedural
> language. You'd need a bit of a meta language to define such a
> construct, but not much I think. I don't think this goes much further
> than a normal function declaration; just add to the "<typename>
> <identifier>" pairs a possible pair "KEYWORD <identifier>".

One of the reasons we have a simplified common protocol that allows any
function to be called from anywhere in the program is to eliminate the
complexity that would otherwise come about by trying to juggle the actual
parameters used in-line (where the function is called) with the formal
parameters (how the parameters are defined within the function).

Consider this:

    int func_A(float x)
    {
        ...
    }

    int val_1;
    unsigned char val_2;
    float val_3;

    func_A(val_1);
    func_A(val_2);
    func_A(val_3);

In each of these cases C ensures that the actual parameters val_1, val_2
and val_3 are floats before passing them to func_A. If they are not
already floats, it promotes them. Furthermore the actual parameter
(val_1, val_2 or val_3) is copied (promoted first if necessary) to a place
where the function expects it to be when it is called. The function does
not operate on val_1, val_2 or val_3 directly. Something similar happens
for the return value.

All of this is the calling protocol.
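The two halves of the calling protocol described above (convert the
argument, then hand the function its own copy) can be seen in a small
sketch. The body of func_A and the helper caller_value_survives are made
up for this illustration. The conversion to float relies on the prototype
of func_A being visible at the call.

```c
#include <assert.h>

/* Stand-in for func_A from the example: it modifies its own copy of x. */
int func_A(float x)
{
    x = x + 0.5f;              /* changes only the local copy */
    return (int)(x * 2.0f);
}

/*
 * Calls func_A with an int argument and returns 1 if the caller's
 * variable still holds its original value afterwards.
 */
int caller_value_survives(void)
{
    int val_1 = 3;
    int result = func_A(val_1);    /* val_1 is converted to 3.0f and copied */

    assert(result == 7);           /* (3.0 + 0.5) * 2.0 = 7.0 */
    return val_1 == 3;
}
```

Passing an unsigned char works the same way: func_A((unsigned char)200)
receives 200.0f. In both cases the variable at the call site is untouched.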
So yes the compiler "understands" a standard library function in as much
as it "understands" its parameter requirements. Other than ensuring that
actual parameters are promoted to the correct type and copied to the
correct place the compiler does very little else across the function
boundary. Some compilers are much more intelligent than others and try to
see through (or even punch through) the function boundary but there again
there is only so much that the compiler can do because of the complexity
of the function. Regardless it is still very difficult for the compiler to
combine information across the function boundary because the function
needs to stick to the calling protocol if it is to be used elsewhere.

Inline functions reduce the calling protocol burden because it becomes
possible to tailor each instance of the called function to the place where
it is called. However there is only so much you can do with this because
these inline functions may themselves call other system functions which
are not inline. AND this doesn't help with the "understanding" of the
relationships between functions (see the string_assign example I gave
above). For this you still need some way of adding specialised attributes
to a function which tell the compiler about these relationships.

C++ templates give the illusion that the compiler really understands
what's going on - it still doesn't. What's happening is that a lot of
stuff gets expanded "inline" (not just functions). So the compiler is able
to do lots of optimisations. For example it might see that an int is being
promoted to a float so that it can be passed to a function (method) which
is then converting it to an int to pass it to another function (method).
The compiler might then simply allow the promoting / converting to cancel
each other and use the original. This looks to the user as if the compiler
understood what is going on. It doesn't.
It no more understands than a calculator does when you press the -/+ key
twice to get to the original value.

The C++ compiler still doesn't understand the relationships between
functions. The programmer does and he arranges all the objects to get
sensible optimised results.

> > An intrinsic function is one that the compiler understands intimately.
> > Not just what parameters it takes and what result it returns.
>
> Next step is to define what "understands intimately" means. Does a
> specification (like in a language standard) qualify?

No.

> > In C++ you can actually write your own add ('+') operator function.
>
> Which would be different from the standard operator+() functions, in
> that the compiler doesn't "know" what they are supposed to be doing and
> has to treat them as normal function calls (in the case of C++ with the
> special rules that the language standard defines for operator+(), of
> course).
>
> >> Take any standardized language with a standardized library. For
> >> example, the C++ standard (the real one costs money, but here is a
> >> not quite up to date one <http://www.open-std.org/jtc1/sc22/open/n2356/>).
> >> And of course, anything that's missing /could/ be there -- it just
> >> isn't.
> >>
> >> Or C# (ECMA-334) and the .NET CLI (ECMA-335), even though that's
> >> possibly a bit different.
> >
> > Ok I've had a look on the net and it's just template stuff. As I've
> > said before, the compiler isn't really understanding intent here it's
> > just mechanically churning out code and reducing it as much as
> > possible. It gives the illusion that it understands what is going on
> > because so much source code is being condensed into a small
> > executable but the reality is that all that source code is hand
> > holding the compiler and telling it exactly what to generate.
>
> Not sure whether the standard can tell you what kind of optimizations
> actual compilers implement. Much of this is probably a trade secret.
> Then I'm not convinced a casual look at this is enough to find out what
> /could/ be implemented. They don't talk about optimizations in the
> standard; they just say what has to be the result.
>
> Going back to one of your original arguments, the one that prompted my
> question about what the difference is between list operations that are
> implemented by the compiler and list operations that are implemented in
> a standard library: the substitution of a delete from a list followed by
> an insert into a list at the same location by a replace.
>
> If you want to implement such an optimization in your compiler, what's
> the difference between having these functions as part of the compiler,
> or having them defined as standard library functions in the standard
> that the compiler is based upon? In both cases, the compiler "knows"
> that this substitution is possible and the exact implementation of the
> three functions is not necessary to be known for this optimization to be
> possible.

Yes I understand what you mean here. But in the case of functions this
relies on the compiler actually recognising the function names and
executing some internal code to perform checks, issue warnings (or errors)
and perform optimisations.

That internal compiler code needs to be activated somehow and it needs to
perform a very specific task (not something that would be there anyway -
something that is written specially for that one purpose).

If you really wanted to you could say that the function names are
undefined to the compiler but then you need some way of telling the
compiler which functions are special (what their names are) and what is
special about them (under special circumstances you can replace func_A +
func_B with func_C).

This is why you would need to be able to add special attributes to a
function, so that you can tell the compiler that these are special
functions that need to be handled in a special way (execute the special
internal code for them).
(sorry about all the specials :-)

Other than having special attributes you really would need to make these
functions intrinsic (known and understood by the compiler) so that it can
execute its special internal code when it sees them in the user's source
code.

Friendly Regards
Sergio Masci
--
http://www.piclist.com PIC/SX FAQ & list archive
View/change your membership options at
http://mailman.mit.edu/mailman/listinfo/piclist
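The "replace func_A + func_B with func_C" rewrite discussed in the message
above can be sketched as a small pass over a sequence of list operations.
This is only an illustration under stated assumptions: the list_op
representation and the name fuse_delete_insert are invented here, and
"insert at p" is assumed to mean that the new element ends up at
position p.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list operations as a compiler might record them. */
enum list_op_kind { LIST_DELETE, LIST_INSERT, LIST_REPLACE };

struct list_op {
    enum list_op_kind kind;
    int pos;      /* position in the list the operation applies to */
    int value;    /* value inserted or written; unused for LIST_DELETE */
};

/*
 * Rewrite every adjacent pair "delete at p; insert v at p" into a single
 * "replace p with v". The array is compacted in place and the new number
 * of operations is returned.
 */
size_t fuse_delete_insert(struct list_op *ops, size_t n)
{
    size_t in = 0, out = 0;

    assert(ops != NULL || n == 0);

    while (in < n) {
        if (in + 1 < n &&
            ops[in].kind == LIST_DELETE &&
            ops[in + 1].kind == LIST_INSERT &&
            ops[in].pos == ops[in + 1].pos) {
            /* out <= in, so ops[in + 1] is still intact when read here */
            ops[out].kind = LIST_REPLACE;
            ops[out].pos = ops[in + 1].pos;
            ops[out].value = ops[in + 1].value;
            in += 2;
        } else {
            ops[out] = ops[in];
            in += 1;
        }
        out++;
    }
    return out;
}
```

The pass itself only needs to know that the three operations are related
in this way. Where that knowledge comes from (built into the compiler,
attached as attributes, or taken from a language standard) is the open
question in this thread.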
Re: [TECH] language - was [PIC] using BREAK in 'C'

sergio masci wrote:

> *** YOU *** are saying "'knows' the intent of the line". *** I ***
> am saying "'knows' the intent of several separate lines as one unit"

Ok. But so far you didn't explain how this is different for language
elements that are implemented in the compiler versus language elements
that are implemented in a library. The compiler can only "know" a priori
what a single language element is doing (for both the built-in elements
and the ones that are implemented in a standard library); the rest it has
to figure out based on the program -- independently of where exactly the
language elements are implemented.

> I am saying something like:
>
> j = x * 2 + y * 2;
>
> temp = arr[j];
>
> j = (x + y) * 2;
>
> arr[j] = temp;
>
> should also be reduced to nothing *** AND *** the compiler should warn
> the user that this piece of code has no effect and so may be in
> error. This is not possible if you understand only the intent of
> single lines but can be fudged by the compiler by keeping track of
> what is evaluated and where the result is placed. So although the
> compiler might be able to reduce it to zero it doesn't understand
> that this might be an error.

How is this dependent on where the involved operators are defined -- as
long as the compiler "knows" what they are supposed to do?

> Things get much more difficult if we re-write the above as:
>
> int func_get(int arr[], int x, int y)
> {
>     int j;
>
>     j = x * 2 + y * 2;
>
>     return arr[j];
> }
>
> void func_set(int arr[], int x, int y, int val)
> {
>     int k;
>
>     k = (x + y) * 2;
>
>     arr[k] = val;
> }
>
> temp = func_get(arr, x, y);
>
> func_set(arr, x, y, temp);
>
> Ok the compiler *** MIGHT *** still be able to hack this if it is very
> smart and all the source is available for the functions.
> BUT change the type of the array from a straightforward int to a struct
> and things get incredibly complicated (I mean they are really
> complicated now with int but they will get MUCH worse with a struct :)

I don't doubt that it's incredibly complicated, but I don't see how the
exact implementation of the various operators affects this optimization.
This optimization is based on the compiler "knowing" that (x * 2 + y * 2)
== ((x + y) * 2), no matter how exactly they are calculated. If this
identity can be derived from what's defined in the standard about the
involved operators, then this optimization can be made, no matter where
the operators are implemented. If it's not defined in the standard (for
example, for typical floating point variables, this identity is generally
not given), such an optimization can't be made -- no matter where the
operators are implemented.

> Here having built-in types like STRING, LIST etc reduces the
> complication because the language and compiler control the way they
> are used. It's like the difference between a menu interface and a
> command line interface. With the menu interface you guide the user's
> interactions and there can be no unexpected commands issued that will
> break things (kind of).

I still don't see this. You repeat that there is a difference, but don't
really say where the difference is. The analogy with the menu doesn't seem
to be very helpful; if I have a list defined with the operations insert,
delete, replace, find, the "menu" is the same whether they are implemented
in the compiler or in a standard library.

It seems you forget that the library I'm talking about has an interface
that's as strictly and precisely defined as the language itself, by the
same language standard.
Can you give an example where a programmer could give an "unexpected
command" in the case the list is implemented in a standard library, versus
the programmer not being able to give that "unexpected command" in the
case it is implemented in the compiler? (Keep in mind that in both cases
it is defined in the language standard what the programmer can do with the
list -- and it's the exact same definition.)

> But the same argument could be made for floating point. Many people
> require different precision (whether greater or less than that
> provided by C) yet they either use what is available in a way that
> suits them or they use a specialised library *** AS WELL *** as the
> supported floating point.
>
> Often people will fit the solution to the tools available.

I didn't understand the point you're making here (WRT the discussion about
built-in versus standard library). Yes, I think it's helpful for both
strings and floating-point when a user can use custom libraries instead of
the implementations built into the compiler. And I also think that this
only makes real sense if the syntax is the same in both cases. And IMO
this means that there can't be any syntax advantage for the
implementations that are built into the compiler.

> in the above, how would I tell the compiler the relationship between
> string_alloc, string_assign, string_substr and string_release such that
> the compiler is able to:
>
> (1) ... indicate an error if tmp_str is not initialised with string_alloc
> before string_assign is used on it (simply assigning a value to tmp_str is
> not what I mean, I mean actually using string_alloc to initialise it).
>
> (2) ... indicate an error if tmp_str goes out of scope before it is
> cleaned up with string_release
>
> (3) ... optimise the combined use of string_assign and string_substr
>
> (4) ...
> recognise that tmp_str is undefined after the use of
> string_release
>
> This is why you need to be able to add special attributes to the
> functions, to be able to tell the compiler about all this stuff.

If I understand you correctly, this wouldn't be defined as an attribute of
the function, but it would be defined in the language standard, /before/
any implementation of a compiler or a standard library. Every library
function I'm talking about is assumed to be defined in the same language
standard where you define how your built-in list works.

> in each of these cases C ensures that the actual parameters val_1,
> val_2 and val_3 are floats before passing them to func_A.

FWIW, I'm not a friend of this. I prefer explicit casts, and if I had a
way to switch off all implicit casts in C or C++, I'd probably do it. (At
least I'd start doing it to see how it works out.) But this is not related
to our discussion about built-in vs library.

> [...] All of this is the calling protocol.
>
> So yes the compiler "understands" a standard library function in
> as much as it "understands" its parameter requirements.

This is not what I mean. This is what the compiler "knows" about any
function, standard or user-defined. It "knows" more (or could "know" more)
about standard library functions -- as much as there is defined in the
language standard.

> The C++ compiler still doesn't understand the relationships between
> functions. The programmer does and he arranges all the objects to get
> sensible optimised results.

Whether it actually does or doesn't doesn't really matter here. What
matters is whether this is /possible/ -- and whether there's a difference
between implemented in the compiler or in a standard library. One of your
first points was the replacement of a delete followed by an insert with a
replace, for a list. This is possible because the compiler "knows" what
delete, insert and replace do to a list, and how the list is supposed to
work. It needs a definition for this.
If this definition is given in the language standard, I still fail to see
how it matters whether the list is implemented in a standard library or in
the compiler itself.

>>> An intrinsic function is one that the compiler understands
>>> intimately. Not just what parameters it takes and what result it
>>> returns.
>>
>> Next step is to define what "understands intimately" means. Does a
>> specification (like in a language standard) qualify?
>
> No.

I guess we don't really need to define what intrinsic functions are (and
consequently what "understands intimately" means for you), as it's not
what seems to be important in the question of what exactly the difference
is in optimizing sequences of commands, depending on whether they are
implemented in the compiler or in a standard library (always given that
both implementations follow the same language standard).

>> Not sure whether the standard can tell you what kind of optimizations
>> actual compilers implement. Much of this is probably a trade secret.
>> Then I'm not convinced a casual look at this is enough to find out
>> what /could/ be implemented. They don't talk about optimizations in
>> the standard; they just say what has to be the result.
>>
>> Going back to one of your original arguments, the one that prompted
>> my question about what the difference is between list operations
>> that are implemented by the compiler and list operations that are
>> implemented in a standard library: the substitution of a delete from
>> a list followed by an insert into a list at the same location by a
>> replace.
>>
>> If you want to implement such an optimization in your compiler,
>> what's the difference between having these functions as part of the
>> compiler, or having them defined as standard library functions in
>> the standard that the compiler is based upon?
>> In both cases, the compiler "knows" that this substitution is possible
>> and the exact implementation of the three functions is not necessary
>> to be known for this optimization to be possible.
>
> Yes I understand what you mean here. But in the case of functions
> this relies on the compiler actually recognising the function names
> and executing some internal code to perform checks, issue warnings (or
> errors) and perform optimisations.

Yes.

> That internal compiler code needs to be activated somehow and it needs
> to perform a very specific task (not something that would be there
> anyway - something that is written specially for that one purpose).

Yes. Just as with your insert+delete==replace example. The compiler can do
this, independently of where these functions are implemented. Pretty much
every C++ compiler does this, for example, for custom implementations of
the operator new. This is a special operator, and no matter whether I use
a built-in version or a standard library version that came with the
compiler or my own custom version, there are a number of special checks
that are related to this operator that are executed by the compiler, based
on what the language standard says about operator new. This internal code
is activated by the use of an operator new; this is a reserved word, just
like all other tokens/identifiers in the language standard.

> If you really wanted to you could say that the function names are
> undefined to the compiler but then you need some way of telling the
> compiler which functions are special (what their names are) and what
> is special about them (under special circumstances you can replace
> func_A + func_B with func_C).

No, I don't want to. The function names are defined in the language
standard, just as the language elements are defined in the (same) language
standard. For example, the C++ std::list::insert. In C++, this /is/ a
special function, and it always has this name.
There's nothing that prevents a C++ compiler from implementing some
optimization that is based on what the standard says this function is
doing. There's also nothing that would prevent a compiler writer from
providing an implementation of this function that is built into the
compiler itself. But it still couldn't do anything with this function
that's contrary to the standard. (If there's something lacking in the
standard to make its definitions more useful, it should be added to the
standard -- not to a single implementation.)

> This is why you would need to be able to add special attributes to a
> function, so that you can tell the compiler that these are special
> functions that need to be handled in a special way (execute the
> special internal code for them). (sorry about all the specials :-)

No. I'm talking /exclusively/ about /standard/ libraries, which contain
implementations for functions that have their behavior defined in the
language standard. Like the C++ std::list member functions, for example.
Their names are known to the compiler implementers; no special attributes
are needed to identify these functions.

> Other than having special attributes you really would need to make
> these functions intrinsic (known and understood by the compiler) so
> that it can execute its special internal code when it sees them in
> the user's source code.

No. If their interface is defined accordingly, the compiler can assume
that insert followed by delete can be replaced by replace in a given
situation. The functions must be defined in a way in the language standard
that the compiler can assume this -- but this must be the case no matter
whether the list is implemented in the compiler or in a standard library.
This optimization can then be performed, independently of whether the
functions are implemented by the compiler or in a standard library. (The
compiler's optimization stage that deals with this type of high-level
optimization doesn't even have to "know" this.)

Gerhard
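The remark in the message above, that (x * 2 + y * 2) == ((x + y) * 2) is
not generally given for floating point, can be checked with a short
sketch. It assumes IEEE 754 doubles; the function names are invented for
this illustration. For unsigned integers the identity always holds because
the arithmetic wraps modulo a power of two. For doubles, one way it fails
is overflow of an intermediate result.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>

/* Unsigned arithmetic wraps around, so both sides always agree. */
int identity_holds_unsigned(unsigned x, unsigned y)
{
    return x * 2u + y * 2u == (x + y) * 2u;
}

/*
 * Returns 1 if both sides compare equal for these doubles. The volatile
 * temporaries force every intermediate result to be rounded to double.
 */
int identity_holds_double(double x, double y)
{
    volatile double a, b, sum, lhs, rhs;

    assert(x == x && y == y);   /* NaN inputs are outside this sketch */

    a = x * 2.0;
    b = y * 2.0;
    lhs = a + b;

    sum = x + y;
    rhs = sum * 2.0;

    return lhs == rhs;
}

/* DBL_MAX * 2 overflows to infinity; inf + -inf is NaN, not 0. */
int overflow_case(void)
{
    return identity_holds_double(DBL_MAX, -DBL_MAX);
}

/* Same kind of extreme input for unsigned: the wrap-around cancels out. */
int wraparound_case(void)
{
    return identity_holds_unsigned(UINT_MAX, 7u);
}
```

So a compiler may fold the two expressions together for unsigned operands,
but for doubles it would have to prove that no intermediate overflows, or
be told by the language definition that it is allowed to ignore the
difference.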
Re: [TECH] language - was [PIC] using BREAK in 'C'

Olin Lathrop wrote:

> Gerhard Fiedler wrote:
>>> I'd rather write
>>>
>>> strb <-- stra + stru + " more stuff"
>>>
>>> than
>>>
>>> strb <-- StringConcatenate(stra, stru, " more stuff");
>>
>> So do I, but I fail to understand why this requires that the string
>> handling is built into the compiler.
>>
>> Your preferred syntax looks (in principle) quite similar to the C++
>> std::string syntax, where normal string handling is implemented in a
>> standard library.
>
> First, that only works when a language allows user defined operators
> like C++ does, but most other compiled languages don't.

We're not talking here about specific features of specific languages, but
about what exactly the difference is between e.g. a string implementation
being built into the compiler or implemented in a standard library --
focus on /standard/ library (which means that the string behavior is in
both cases defined by the same language standard).

(In any case, allowing user-defined operators has a tremendous advantage
-- no matter whether the original operator is implemented by the compiler
or by a standard library. But this is somewhat beside the point here.)

> Even then, the compiler only knows that some library routine has to be
> called to perform whatever the operation is. It doesn't know it's
> string concatenation and therefore doesn't know as much about the
> intent of the statement as it would if it was built in.

Yes, it does. In C++, for example, the commonly used string implementation
is std::string. I can define my own std::string class (using Pascal-style
strings, for example), and the compiler can (and should) assume that
whatever is defined in the standard is true. If the standard defines the
operator '+' to be concatenation, this is what it is -- built-in,
compiler-vendor-supplied library, or custom implementation.

The key here is "/standard/ library", as in "library with an interface
that's defined in the language standard, just like all the other language
elements".

Gerhard
Re: [TECH] language - was [PIC] using BREAK in 'C'

On Sat, 18 Jul 2009, Gerhard Fiedler wrote:

> sergio masci wrote:
>
> > *** YOU *** are saying "'knows' the intent of the line". *** I ***
> > am saying "'knows' the intent of several separate lines as one unit"
>
> Ok. But so far you didn't explain how this is different for language
> elements that are implemented in the compiler versus language elements
> that are implemented in a library. The compiler can only "know" a priori
> what a single language element is doing (for both the built-in elements
> and the ones that are implemented in a standard library); the rest it
> has to figure out based on the program -- independently of where exactly
> the language elements are implemented.
>
> > I am saying something like:
> >
> > j = x * 2 + y * 2;
> >
> > temp = arr[j];
> >
> > j = (x + y) * 2;
> >
> > arr[j] = temp;
> >
> > should also be reduced to nothing *** AND *** the compiler should warn
> > the user that this piece of code has no effect and so may be in
> > error. This is not possible if you understand only the intent of
> > single lines but can be fudged by the compiler by keeping track of
> > what is evaluated and where the result is placed. So although the
> > compiler might be able to reduce it to zero it doesn't understand
> > that this might be an error.
>
> How is this dependent on where the involved operators are defined -- as
> long as the compiler "knows" what they are supposed to do?
>
> > Things get much more difficult if we re-write the above as:
> >
> > int func_get(int arr[], int x, int y)
> > {
> >     int j;
> >
> >     j = x * 2 + y * 2;
> >
> >     return arr[j];
> > }
> >
> > void func_set(int arr[], int x, int y, int val)
> > {
> >     int k;
> >
> >     k = (x + y) * 2;
> >
> >     arr[k] = val;
> > }
> >
> > temp = func_get(arr, x, y);
> >
> > func_set(arr, x, y, temp);
> >
> > Ok the compiler *** MIGHT *** still be able to hack this if it is very
> > smart and all the source is available for the functions.
BUT change > > the type of the array from a straight forward int to a struct and > > things get incredibly complicated (I mean they are really complicated > > now with int but they will get MUCH worse with a struct :) > > I don't doubt that it's incredibly complicated, but I don't see how the > exact implementation of the various operators affects this optimization. > This optimization is based on the compiler "knowing" that (x * 2 + y * > 2) == ((x + y) * 2), no matter how exactly they are calculated. If this > identity can be derived from what's defined in the standard about the > involved operators, then this optimization can be made, no matter where > the operators are implemented. If it's not defined in the standard (for > example, for typical floating point variables, this identity is > generally not given), such an optimization can't be made -- no matter > where the operators are implemented. > > > > Here having built-in type like STRING, LIST etc reduces the > > complication because the lanuage and compiler control the way they > > are used. It's like the difference between a menu interface and a > > command line interface. With the menu interface you guide the users > > interactions and there can be no unexpected commands issued that will > > break things (kind of). > > I still don't see this. You repeat that there is a difference, but don't > really say where the difference is. The analogy with the menu doesn't > seem to be very helpful; if I have a list defined with the operations > insert, delete, replace, find, the "menu" is the same whether they are > implemented in the compiler or in a standard library. > > It seems you forget that the library I'm talking about has an interface > that's as strictly and precisely defined as the language itself, by the > same language standard. 
Can you give an example where a programmer could > give an "unexpected command" in the case the list is implemented in a > standard library, versus the programmer not being able to give that > "unexpected command" in the case it is implemented in the compiler? > (Keep in mind that in both cases it is defined in the language standard > what the programmer can do with the list -- and it's the exact same > definition.) > > > > But the same argument could be made for floating point. Many people > > require different precission (wheather greater or less than that > > provided by C) yet they either use what is available in a way that > > suits them of they use a specialised library *** AS WELL *** as the > > supported floating point. > > > > Often people will fit the solution to the tools available. > > I didn't understand the point you're making here (WRT the discussion > about built-in versus standard library). Yes, I think it's helpful for > both strings and floating-point when a user can use custom libraries > instead of the implementations built into the compiler. And I also think > that this only makes real sense if the syntax is the same in both cases. > And IMO this means that there can't be any syntax advantage for the > implementations that are built into the compiler. > > > > in the above, how would I tell the compiler the relationship between > > string_alloc, string_assign, string_substr and string_release such that > > the compiler is able to: > > > > (1) ... indicate an error if tmp_str is not initialised with string_alloc > > before string_assign is used on it (simply assigning a value to tmp_str is > > not what I mean, I mean actually using string_alloc to initialise it). > > > > (2) ... indicate an error if tmp_str goes out of scope before it is > > cleaned up with string_release > > > > (3) ... optimise the combined use of string_assign and string_substr > > > > (4) ... 
recognise that tmp_str is undefined after the use of > > string_release > > > > This is why you need to be able to add special attributes to the > > functions, to be able to tell the compiler about all this stuff. > > If I understand you correctly, this wouldn't be defined as attribute of > the function, but it would be defined in the language standard, /before/ > any implementation of a compiler or a standard library. Every library > function I'm talking about is assumed to be defined in the same language > standard where you define how your built-in list works. > > > > in each of these cases C ensures that the actual paraemeters val_1, > > val_2 and val_3 are floats before passing them to func_A. > > FWIW, I'm not a friend of this. I prefer explicit casts, and if I had a > way to switch off all implicit casts in C or C++, I'd probably do it. > (At least I'd start doing it to see how it works out.) But this is not > related to our discussion about built-in vs library. > > > [...] All of this is the calling protocol. > > > > So yes the the compiler "understands" a standard library function in > > as much as it "understands" its parameter requirements. > > This is not what I mean. This is what the compiler "knows" about any > function, standard or user-defined. It "knows" more (or could "know" > more) about standard library functions -- as much as there is defined in > the language standard. > > > > The C++ compiler still doesn't understand the relationships between > > functions. The programmer does and he arranges all the objects to get > > sensible optimised results. > > Whether it actual does or doesn't doesn't really matter here. What > matters is whether this is /possible/ -- and whether there's a > difference between implemented in the compiler or in a standard library. > > One of your first points was the replacement of a delete followed by an > insert with a replace, for a list. 
This is possible because the compiler > "knows" what delete, insert and replace do to a list, and how the list > is supposed to work. It needs a definition for this. If this definition > is given in the language standard, I still fail to see how it matters > whether the list is implemented in a standard library or in the compiler > itself. > > > >>> An intrinsic function is one that the compiler understands > >>> intimately. Not just what parameters it takes and what result it > >>> returns. > >> > >> Next step is to define what "understands intimately" means. Does a > >> specification (like in a language standard) qualify? > > > > No. > > I guess we don't really need to define what intrinsic functions are (and > consequently what "understands intimately" means for you), as it's not > what seems to be important in the question what exactly is the > difference in optimizing sequences of commands, depending on whether > they are implemented in the compiler or in a standard library (always > given that both implementations follow the same language standard). > > > >> Not sure whether the standard can tell you what kind of optimizations > >> actual compilers implement. Much of this is probably a trade secret. > >> Then I'm not convinced a casual look at this is enough to find out > >> what /could/ be implemented. They don't talk about optimizations in > >> the standard; they just say what has to be the result. > >> > >> Going back to one of your original arguments, the one that prompted > >> my question about what the difference is between list operations > >> that are implemented by the compiler and list operations that are > >> implemented in a standard library: the substitution of a delete from > >> a list followed by an insert into a list at the same location by a > >> replace. 
> >> > >> If you want to implement such an optimization in your compiler, > >> what's the difference between having these functions as part of the > >> compiler, or having them defined as standard library functions in > >> the standard that the compiler is based upon? In both cases, the > >> compiler "knows" that this substitution is possible and the exact > >> implementation of the three functions is not necessary to be known > >> for this optimization to be possible. > > > > Yes I understand what you mean here. But in the case of functions > > this relies on the compiler actually recognising the function names > > and executing some internal code to perform checks, issue warning (or > > errors) and perform optimisations. > > Yes. > > > That internal compiler code needs to be activated somehow and it needs > > to perform a very specific task (not something that would be there > > anyway - somthing that is written specially for that one perpose). > > Yes. Just as with your insert+delete==replace example. The compiler can > do this, independently of where these functions are implemented. > > Pretty much every C++ compiler does this, for example, for custom > implementations of the operator new. This is a special operator, and no > matter whether I use a built-in version or a standard library version > that came with the compiler or my own custom version, there are a number > of special checks that are related to this operator that are executed by > the compiler, based on what the language standard says about operator > new. > > This internal code is activated by the use of an operator new; this is a > reserved word, just like all other tokens/identifiers in the language > standard. 
> > > If you really wanted to you could say that the function names are > > undefined to the compiler but then you need some way of telling the > > compiler which functions are special (what their names are) and what > > is special about them (under special circustances you can replace > > func_A + func_B with func_C). > > No, I don't want to. The function names are defined in the language > standard, just as the language elements are defined in the (same) > language standard. For example, the C++ std::list::insert. In C++, this > /is/ a special function, and it always has this name. There's nothing > that prevents a C++ compiler from implementing some optimization that is > based on what the standard says this function is doing. There's also > nothing that would prevent a compiler writer from providing an > implementation of this function that is built into the compiler itself. > But it still couldn't do anything with this function that's contrary to > the standard. (If there's something lacking in the standard to make its > definitions more useful, it should be added to the standard -- not to a > single implementation.) > > > This is why you would need to be able to add special attributes to a > > function, so that you can tell the compiler that these are special > > functions that need to be handled in a special way (execute the > > special internal code for them). (sorry about all the specials :-) > > No. I'm talking /exclusively/ about /standard/ libraries, which contain > implementations for functions that have their behavior defined in the > language standard. Like the C++ std::list member functions, for example. > Their names are known to the compiler implementers; no special > attributes are needed to identify these functions. 
> > > Other than having special attributes you really would need to make > > these functions intrinsic (known and understood by the compiler) so > > that it can execute its special internal code when it sees them in > > the users source code. > > No. If their interface is defined accordingly, the compiler can assume > that insert followed by delete can be replaced by replace in a given > situation. The functions must be defined in a way in the language > standard that the compiler can assume this -- but this must be the case > no matter whether the list is implemented in the compiler or in a > standard library. This optimization can then be performed, independently > of whether the functions are implemented by the compiler or in a > standard library. (The compiler's optimization stage that deals with > this type of high-level optimization doesn't even have to "know" this.) > If I understand you correctly what you are saying is that if the compiler writer is already aware of the /standard/ library before he starts writing the compiler AND the /standard/ library definition is set in concrete just as the language definition is then the compiler writer is able to use intimate knowledge of the /standard/ library functions within the compiler without incorporating the code generation of the /standard/ library functions within the compiler but instead leaving this implemented external to the compiler proper (so that these functions can be written by someone else and code generated at compile time). Please confirm this. Friendly Regards Sergio Masci -- http://www.piclist.com PIC/SX FAQ & list archive View/change your membership options at http://mailman.mit.edu/mailman/listinfo/piclist |
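The delete-then-insert versus replace equivalence discussed above can be checked on an ordinary C++ `std::list`. The following is only a minimal sketch: the helper names `erase_then_insert` and `replace_at` are hypothetical, and nothing here claims that any existing compiler performs this rewrite. It shows only that the two sequences leave the list with the same contents, which is the property such an optimization would rely on.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

// Hypothetical helper: erase the element at index n, then insert v at
// the same position (the "delete followed by insert" sequence).
inline std::list<int> erase_then_insert(std::list<int> l, std::size_t n, int v) {
    assert(n < l.size());
    auto it = std::next(l.begin(), static_cast<std::ptrdiff_t>(n));
    it = l.erase(it);   // erase returns an iterator to the following element
    l.insert(it, v);    // insert places v before that element
    return l;
}

// Hypothetical helper: overwrite the element at index n in place
// (the "replace" operation).
inline std::list<int> replace_at(std::list<int> l, std::size_t n, int v) {
    assert(n < l.size());
    *std::next(l.begin(), static_cast<std::ptrdiff_t>(n)) = v;
    return l;
}
```

Calling both helpers with the same list, index and value yields lists that compare equal with `==`. For element types whose copy, construction or destruction has visible side effects the two sequences would no longer be interchangeable, which is one reason the semantics would have to be pinned down in the standard.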
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'

sergio masci wrote:

> If I understand you correctly what you are saying is that if the
> compiler writer is already aware of the /standard/ library before he
> starts writing the compiler AND the /standard/ library definition is
> set in concrete just as the language definition is then the compiler
> writer is able to use intimate knowledge of the /standard/ library
> functions within the compiler without incorporating the code
> generation of the /standard/ library functions within the compiler
> but instead leaving this implemented external to the compiler proper
> (so that these functions can be written by someone else and code
> generated at compile time).

Correct.

I thought this all is basically understood when talking about standard libraries (that is, libraries with an interface that is part of the language standard).

Gerhard
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'

On Tue, 21 Jul 2009, Gerhard Fiedler wrote:

> sergio masci wrote:
>
> > If I understand you correctly what you are saying is that if the
> > compiler writer is already aware of the /standard/ library before he
> > starts writing the compiler AND the /standard/ library definition is
> > set in concrete just as the language definition is then the compiler
> > writer is able to use intimate knowledge of the /standard/ library
> > functions within the compiler without incorporating the code
> > generation of the /standard/ library functions within the compiler
> > but instead leaving this implemented external to the compiler proper
> > (so that these functions can be written by someone else and code
> > generated at compile time).
>
> Correct.
>
> I thought this all is basically understood when talking about standard
> libraries (that is, libraries with an interface that is part of the
> language standard).

Ok so I'm starting to get on the same page as you (I may not agree but at least I now understand how you are seeing things :)

Furthermore you seem to be saying that using a function call syntax rather than a verbose statement syntax (e.g. SQL) should be equally easy for the compiler to understand, e.g. a compiler should be able to interpret the following three statements as identical:

(1)... // function call syntax
        if( lt(x, 0), assign(y, 0), assign(y, x) )

(2)... // C syntax
        if (x<0)
        {       y = 0;
        }
        else
        {       y = x;
        }

(3)... // verbose syntax
        if x < 0 then
                y = 0
        else
                y = x
        endif

AND because the compiler should be able to "understand" functions as easily as other language statements, that it is a convenient way to extend the language.

Could you please confirm this.

Friendly Regards
Sergio Masci
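Forms (1) and (2) can be put side by side in C++ as a rough sketch. The names `lt`, `assign` and `if_` are hypothetical stand-ins for the function-call syntax in the message; they are not part of any standard library. One assumption is worth stating loudly: an ordinary C++ function evaluates all of its arguments before the call, so the two branches are passed as callables here to ensure that only the selected one runs.

```cpp
#include <cassert>
#include <functional>

// Hypothetical function-call forms of '<' and '='.
inline bool lt(int a, int b) { return a < b; }
inline void assign(int& dst, int src) { dst = src; }

// Hypothetical function form of 'if'. The branches are callables so that
// exactly one of them is executed, as with the built-in statement.
inline void if_(bool cond,
                const std::function<void()>& then_branch,
                const std::function<void()>& else_branch) {
    if (cond) {
        then_branch();
    } else {
        else_branch();
    }
}

// Form (1): function call syntax.
inline int clamp_call_syntax(int x) {
    int y = 0;
    if_(lt(x, 0), [&] { assign(y, 0); }, [&] { assign(y, x); });
    return y;
}

// Form (2): C statement syntax.
inline int clamp_statement_syntax(int x) {
    int y;
    if (x < 0) {
        y = 0;
    } else {
        y = x;
    }
    assert(y >= 0);  // both branches leave y non-negative
    return y;
}
```

Both functions return the same value for every `int` input. Whether a compiler would treat the two forms identically in its optimizer is exactly the open question in the thread; the sketch only shows that the function form can express the same behaviour.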
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'

Olin Lathrop wrote:

> Gerhard Fiedler wrote:
>> I thought this all is basically understood when talking about
>> standard libraries (that is, libraries with an interface that is
>> part of the language standard).
>
> Normally "standard library" refers to a documented set of subroutines
> that have a widely agreed upon interface and are supplied with the
> compiler.

Define "normally" and "documented" and "widely agreed upon" for this context :)

Or, alternatively, have a look at a language standard that contains library interface definitions (e.g. the C++ standard -- you'll find a link in this thread to a draft).

If the interface is defined ("documented" in a "standard") -- both in syntax and semantics --, the compiler writer can rely on this definition for any high-level optimizations.

I have written what I understand in this context as "standard library" repeatedly in this thread; just search for it.

> You seem to want intrinsic functions that you can also write yourself
> if you want to.

No. I've written this already. Conclusions based on this have nothing to do with what I wrote.

Gerhard
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'

sergio masci wrote:

>> sergio masci wrote:
>>
>>> If I understand you correctly what you are saying is that if the
>>> compiler writer is already aware of the /standard/ library before
>>> he starts writing the compiler AND the /standard/ library
>>> definition is set in concrete just as the language definition is
>>> then the compiler writer is able to use intimate knowledge of the
>>> /standard/ library functions within the compiler without
>>> incorporating the code generation of the /standard/ library
>>> functions within the compiler but instead leaving this implemented
>>> external to the compiler proper (so that these functions can be
>>> written by someone else and code generated at compile time).
>>
>> Correct.
>>
>> I thought this all is basically understood when talking about
>> standard libraries (that is, libraries with an interface that is
>> part of the language standard).
>
> Ok so I'm starting to get on the same page as you (I may not agree
> but at least I now understand how you are seeing things :)

I thought that there was some disconnect, and I hoped that we would get there eventually :)

> Furthermore you seem to be saying that using a function call syntax
> rather than a verbose statement syntax (e.g. SQL) should be equally
> easy for the compiler to understand
> e.g.
> A compiler should be able to interpret the following three
> statements as identical
>
> (1)... // function call syntax
>         if( lt(x, 0), assign(y, 0), assign(y, x) )
>
> (2)... // C syntax
>         if (x<0)
>         {       y = 0;
>         }
>         else
>         {       y = x;
>         }
>
> (3)... // verbose syntax
>         if x < 0 then
>                 y = 0
>         else
>                 y = x
>         endif

I suspect that in many compilers (3) and (2) end up (as an intermediate representation) in something that's equivalent to (1). But even if not, it shouldn't be too difficult to make all three end up in the same, whatever the compiler's internal representation of this statement is.

It's probably a bit more difficult to write a parser for (2) and (3) than it is to write one for (1), but in the great scheme of a compiler, I don't think that this difference is crucial.

So, yes, I think for a compiler these three could be identical. I don't see that a compiler could derive any information from any of them that it couldn't derive from the other two.

Provided, of course, that the functions used in (1) are just as defined as the operators and statements used in (2) and (3).

> AND because the compiler should be able to "understand" functions as
> easily as other language statements that it is a convenient way to
> extend the language.

Yes, extend or customize. That's approximately the C++ standard library way.

(However, in C++ this isn't done thoroughly. Due to notation conventions and backwards compatibility, many language elements are special operators with special features; for example, no matter the specific meaning given through overloading, the relative precedence of operators is fixed by the language. Probably a good and necessary thing, but it breaks the extensibility. Not having any precedence and requiring parentheses for everything would be the "clean" way, but that's more cumbersome to write in a large majority of the cases, so this is not only backwards compatibility, but also pragmatism.)

> Could you please confirm this.

Basically confirmed.

Gerhard
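The remark that operator precedence stays fixed by the language regardless of what an overloaded operator means can be illustrated with a small sketch. The type `Term` and the function `grouping_demo` are hypothetical names introduced only for this example; the overloaded operators simply record how the compiler grouped the expression.

```cpp
#include <cassert>
#include <string>

// Hypothetical type whose overloaded operators record the grouping that
// the compiler chose when parsing the expression.
struct Term {
    std::string text;
};

inline Term operator+(const Term& a, const Term& b) {
    return Term{"(" + a.text + "+" + b.text + ")"};
}

inline Term operator*(const Term& a, const Term& b) {
    return Term{"(" + a.text + "*" + b.text + ")"};
}

// Builds a + b * c without parentheses and returns the recorded grouping.
inline std::string grouping_demo() {
    const Term a{"a"}, b{"b"}, c{"c"};
    return (a + b * c).text;
}

// '*' still binds tighter than '+', whatever the overloads do.
inline void self_check() {
    assert(grouping_demo() == "(a+(b*c))");
}
```

A user-defined `operator+` therefore cannot be given a higher precedence than `operator*`; only its meaning can be customised, not its place in the grammar.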
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'

On Wed, 22 Jul 2009, Gerhard Fiedler wrote:

> Olin Lathrop wrote:
>
> > Gerhard Fiedler wrote:
> >> I thought this all is basically understood when talking about
> >> standard libraries (that is, libraries with an interface that is
> >> part of the language standard).
> >
> > Normally "standard library" refers to a documented set of subroutines
> > that have a widely agreed upon interface and are supplied with the
> > compiler.
>
> Define "normally" and "documented" and "widely agreed upon" for this
> context :)
>
> Or, alternatively, have a look at a language standard that contains
> library interface definitions (e.g. the C++ standard -- you'll find a
> link in this thread to a draft).
>
> If the interface is defined ("documented" in a "standard") -- both in
> syntax and semantics --, the compiler writer can rely on this definition
> for any high-level optimizations.
>
> I have written what I understand in this context as "standard library"
> repeatedly in this thread; just search for it.
>
> > You seem to want intrinsic functions that you can also write yourself
> > if you want to.
>
> No. I've written this already. Conclusions based on this have nothing to
> do with what I wrote.

Yes Gerhard, as Olin says, you do seem to want intrinsic functions that you can also write yourself if you want to. This is not a bad thing, just incredibly difficult to achieve efficiently. By which I mean having a modern PC capable of generating an executable from such a compiler in an acceptable time.

By definition, intrinsic means that the compiler understands what these functions do and how they do it. The exact optimised code generated for such a function could be contained in a special internal (to the compiler) macro with lots of conditional assembly going on inside it. Or, as you have already suggested, it could be a function held as source (maybe C) which gets compiled everywhere it is needed (including conditional compilation). But this would still be intrinsic to the compiler because the compiler has a preconceived idea of what the function does, and the actual source code for the function would not change this idea.

The function calling protocol would be horrendous (very big and complicated and much slower to execute than that needed for the standard / normal function calling protocol) to allow for good optimisation, but I can see that this could be done.

Friendly Regards
Sergio Masci
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'

On Wed, 22 Jul 2009, Gerhard Fiedler wrote:

> Basically confirmed.

Interesting. Thinking.

Friendly Regards
Sergio Masci
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'

On Wed, 22 Jul 2009, Gerhard Fiedler wrote:

> sergio masci wrote:
>
> >> sergio masci wrote:
> >>
> >>> If I understand you correctly what you are saying is that if the
> >>> compiler writer is already aware of the /standard/ library before
> >>> he starts writing the compiler AND the /standard/ library
> >>> definition is set in concrete just as the language definition is
> >>> then the compiler writer is able to use intimate knowledge of the
> >>> /standard/ library functions within the compiler without
> >>> incorporating the code generation of the /standard/ library
> >>> functions within the compiler but instead leaving this implemented
> >>> external to the compiler proper (so that these functions can be
> >>> written by someone else and code generated at compile time).
> >>
> >> Correct.
> >>
> >> I thought this all is basically understood when talking about
> >> standard libraries (that is, libraries with an interface that is
> >> part of the language standard).
> >
> > Ok so I'm starting to get on the same page as you (I may not agree
> > but at least I now understand how you are seeing things :)
>
> I thought that there was some disconnect, and I hoped that we would get
> there eventually :)
>
> > Furthermore you seem to be saying that using a function call syntax
> > rather than a verbose statement syntax (e.g. SQL) should be equally
> > easy for the compiler to understand
> > e.g.
> > A compiler should be able to interpret the following three
> > statements as identical
> >
> > (1)... // function call syntax
> >         if( lt(x, 0), assign(y, 0), assign(y, x) )
> >
> > (2)... // C syntax
> >         if (x<0)
> >         {       y = 0;
> >         }
> >         else
> >         {       y = x;
> >         }
> >
> > (3)... // verbose syntax
> >         if x < 0 then
> >                 y = 0
> >         else
> >                 y = x
> >         endif
>
> I suspect that in many compilers (3) and (2) end up (as an intermediate
> representation) in something that's equivalent to (1). But even if not,
> it shouldn't be too difficult to make all three end up in the same,
> whatever the compiler's internal representation of this statement is.

Ok so I write a separate parser for all three and internally they all produce:

        if_statement
        .. expr
        .. .. lt
        .. .. .. x
        .. .. .. 0
        .. statement
        .. .. assign
        .. .. .. y
        .. .. .. 0
        .. statement
        .. .. assign
        .. .. .. y
        .. .. .. x

So what advantage does this give me?

If instead of the keyword 'if' I used a function 'xyz' thus:

        xyz( lt(x, 0), assign(y, 0), assign(y, x) )

my parsers would all now produce:

        expr
        .. func
        .. .. xyz
        .. .. expr
        .. .. .. lt
        .. .. .. .. x
        .. .. .. .. 0
        .. .. expr
        .. .. .. assign
        .. .. .. .. y
        .. .. .. .. 0
        .. .. expr
        .. .. .. assign
        .. .. .. .. y
        .. .. .. .. x

Internally the compiler has different sections that deal with generating code for 'if_statement' trees, 'statement' trees and 'expr' trees.

How would you propose that I treat 'xyz' differently while parsing, and how would you add a specific code generator for the now special 'xyz' (you need to describe all this somehow)?

e.g. the 'if_statement' code generator knows that it might have either one or two statements following the condition expression. It also knows that if the condition expression evaluates to a compile time constant, it can discard either the 'true' or 'false' statement. It also needs to interact with the code generator for the condition expression to allow that generator to produce efficient optimised code jumps to the 'true' or 'false' statements (think of early out logical expressions involving '&&' and '||'). Consider the difference between "if (cond)..." and "X=cond".

> It's probably a bit more difficult to write a parser for (2) and (3)
> than it is to write one for (1), but in the great scheme of a compiler,
> I don't think that this difference is crucial.
>
> So, yes, I think for a compiler these three could be identical. I don't
> see that a compiler could derive any information from any of them that
> it couldn't derive from the other two.

Actually it can. Consider a long complex program made up purely of functions as in (1). What happens with a misplaced comma or parenthesis?

The verbose syntax lets the compiler catch silly mistakes. Consider the difference between:

        if (...)
        {
                a = b;
        }

        while (...)
        {
                c = d;
        }

and

        if ... then
        endif

        while ... do
        done

Now edit the above:

        if (...)
        {
        while (...)
        {
                a = b;
        }

                c = d;
        }

and

        if ... then
        while ... do
                a = b;
        endif

                c = d;
        done

> Provided, of course, that the functions used in (1) are just as defined
> as the operators and statements used in (2) and (3).

Ok so I'll give you that, all the functions are defined in exactly the same way in (1) as they are in (2) and (3). But what are we going to do about the vast number of functions defined in the /standard/ library?

> > AND because the compiler should be able to "understand" functions as
> > easily as other language statements that it is a convenient way to
> > extend the language.
>
> Yes, extend or customize. That's approximately the C++ standard library
> way.

But the C++ way is horrible! You have CLASS upon CLASS upon CLASS. If you want to write a modest program you end up so deep in 'standard' classes and templates that it gets very hard to see the wood for the trees.

Then there is this nonsense that user classes should be written in such a way as to have special methods that the standard libraries expect (things like iterators) so that the items in a container can be accessed. The programmer shouldn't need to know about all this. He should be able to just say (e.g.)

        for all items in list FRED do

                *.x = $.x + 1
        done

> > Could you please confirm this.
>
> Basically confirmed.

Friendly Regards
Sergio Masci
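The point about the 'if_statement' code generator discarding a branch when the condition is a compile-time constant can be sketched with a toy tree. Everything below is hypothetical and far simpler than a real compiler: the `Node` layout, the kind strings and the `fold_if` pass are assumptions made only to mirror the tree dumps in the message. The pass recognises `if_statement` nodes by kind and leaves any other node (such as a `func` node for `xyz`) untouched, which is the special-casing the message asks about.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal tree node, loosely mirroring the dumps above.
struct Node {
    std::string kind;                         // e.g. "if_statement", "const", "assign", "func"
    int value = 0;                            // meaningful only when kind == "const"
    std::vector<std::shared_ptr<Node>> kids;  // children in source order
};

using NodePtr = std::shared_ptr<Node>;

inline NodePtr make(std::string kind, std::vector<NodePtr> kids = {}) {
    auto n = std::make_shared<Node>();
    n->kind = std::move(kind);
    n->kids = std::move(kids);
    return n;
}

inline NodePtr constant(int v) {
    auto n = make("const");
    n->value = v;
    return n;
}

// Toy pass: if the condition of an if_statement is a constant, keep only
// the branch that would be taken. Nodes of any other kind pass through.
inline NodePtr fold_if(const NodePtr& n) {
    if (n->kind != "if_statement") {
        return n;  // the pass has no special knowledge of e.g. "func"
    }
    // Children: condition, 'true' statement, optional 'false' statement.
    assert(n->kids.size() == 2 || n->kids.size() == 3);
    const NodePtr& cond = n->kids[0];
    if (cond->kind != "const") {
        return n;  // condition not known at compile time
    }
    if (cond->value != 0) {
        return n->kids[1];
    }
    return n->kids.size() == 3 ? n->kids[2] : make("empty");
}
```

Making the same pass fire for a library-defined `xyz` would require the compiler to be told, by the standard or by some annotation, that `xyz` has the semantics of 'if'; the sketch takes no position on which of those is preferable.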
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'sergio masci wrote:
>> I suspect that in many compilers (3) and (2) end up (as an >> intermediate representation) in something that's equivalent to (1). >> But even if not, it shouldn't be too difficult to make all three end >> up in the same, whatever the compiler's internal representation of >> this statement is. > > Ok so I write a seperate parser for all three and internally they all > produce: > > if_statement > .. expr > .. .. lt > .. .. .. x > .. .. .. 0 > .. statement > .. .. assign > .. .. .. y > .. .. .. 0 > .. statement > .. .. assign > .. .. .. y > .. .. .. x > > So what advantage does this give me? I don't know, but if you do it, you probably know :) Seriously, I don't understand the question in this context. > If instead of the keyword 'if' I used a function 'xyz' thus: > > xyz( lt(x, 0), assign(y, 0), assign(y, x) ) > > my parsers would all now produce: > > expr > .. func > .. .. xyz > .. .. expr > .. .. .. lt > .. .. .. .. x > .. .. .. .. 0 > .. .. expr > .. .. .. assign > .. .. .. .. y > .. .. .. .. 0 > .. .. expr > .. .. .. assign > .. .. .. .. y > .. .. .. .. x > > Internally the compiler has different sections that deal with > generating code for 'if_statment' trees, 'statement' trees and 'expr' > trees. > > How would you propose that I treat 'xyz' differently while parsing > and how would you add a specific code generator for the now special > 'xyz' (you need to describe all this somehow)? 'if' is a special statement/construct/function, defined in the language standard. 'xyz' is not. Therefore, the compiler can have (and generally has) special code to generate 'if' more efficiently than a function call. Compared to the original issue -- lists --, 'if' is much more simple, and the function call overhead here is important and more than the actual functionality typically would be. That's why it generally makes sense to implement such a construct directly, avoiding the function call overhead. 
I'm not a compiler specialist, but it could be that 'avoiding the function call overhead' is a premature optimization, and that a later, lower-level optimization could result in just this. In any case, since 'if' is defined in the language standard, there's nothing that would prevent a compiler writer to implement it in the compiler. > the 'if_statement' code generator knows that it might have either one > or two statements following the condition expression. It also knows > that if the condition expression evaluates to a compile time constant > that it can discard either the 'true' or 'false' statements. It also > needs to interact with the code generator for the condition > expression to allow that generator to produce efficient optimised > code jumps to the 'true' or 'false' statements (think of early out > logical expressions involving '&&' and '||'). Consider the difference > between "if (cond)..." and "X=cond" Yes. This all is pretty much C. I thought we weren't really talking about any specific languages, but about 'implemented in the compiler' versus 'implemented in a library'. The 'if' statement implementation is so short that going through a general-purpose function call convention would blow up the code tremendously. But: 1) Nobody says that the compiler writer /has/ to do this. The 'if' statement is defined in the language standard, and the compiler writer can choose to implement it directly in the compiler. 2) Even if the compiler writer chooses to implement it as function call, I think it is possible that a lower-level optimization detects the inefficiencies and successfully optimizes the function call away (remember that I considered that the function is available as source), reaching the same code as if implemented directly. >> So, yes, I think for a compiler these three could be identical. I >> don't see that a compiler could derive any information from any of >> them that it couldn't derive from the other two. > > Actually it can. 
> Consider a long complex program made up purely of
> functions as in (1). What happens with a misplaced comma or
> parenthesis?

That's a feature of that specific syntax, not a difference whether the 'if' function is implemented in the compiler or in a library. We didn't discuss the various merits of the different syntaxes (sp ?? :)

> The verbose syntax lets the compiler catch silly mistakes.

Of course. The more redundant (that is, verbose) the syntax is, the easier it is both for the programmer to get something wrong and for the compiler to catch when something is wrong. But we didn't discuss the merits of different syntaxes, we discussed merits of 'implemented in the compiler' versus 'implemented in a standard library'.

>> Provided, of course, that the functions used in (1) are just as
>> defined as the operators and statements used in (2) and (3).
>
> Ok so I'll give you that, all the functions are defined in exactly
> the same way in (1) as they are in (2) and (3). But what are we going
> to do about the vast number of functions defined in the /standard/
> library?

I don't understand the question. I didn't mean to suggest that all functions need to have an equivalent in the forms (2) or (3), but rather the other way round, that typically constructs of the forms (2) or (3) have an equivalent function call syntax that does the same (functionally, not necessarily in terms of typing or user-friendliness).

>>> AND because the compiler should be able to "understand" functions as
>>> easily as other language statements that it is a convenient way to
>>> extend the language.
>>
>> Yes, extend or customize. That's approximately the C++ standard
>> library way.
>
> But the C++ way is horrible! You have CLASS upon CLASS upon CLASS. If
> you want to write a modest program you end up so deep in 'standard'
> classes and templates that it gets very hard to see the wood for the
> trees.

This is not about a specific implementation of the principle, this is about the principle.
You always bring in C, despite (or because?) we already agreed that the C way is pretty much horrible. And we probably can agree that the BASIC way is horrible, too -- some exceptions notwithstanding :)

> This nonsense that user classes should be written in such a way as to
> have special methods that the standard libraries expect (things like
> iterators) so that the items in a container can be accessed. The
> programmer shouldn't need to know about all this. He should be able
> to just say (e.g.)
>
> for all items in list FRED do
>
> *.x = $.x + 1
> done

I don't really understand this. This is probably your syntax, and quite familiar to you, but I don't think a majority of list readers here would know what this does. In any case, I don't.

Anyway, for what it's worth, and independently of the issue we're discussing (compiler-built-in vs standard-library-implemented), my C++ code looks similar:

    BOOST_FOREACH( item& i, FRED ) {
        ++i.x;
    }

(If this is what your code does... since I don't know what it does, I can't really tell, but it probably is trivial to correct it if it's doing something different. And I don't generally use identifiers like FRED for lists in C++, but that's only a style question.)

But again, this is not a discussion of C++ style syntax versus BASIC style syntax (yet :) -- and I don't see anything particularly advantageous about the C++ style syntax. But all of C++'s list handling is implemented in libraries -- and that's the issue. (I used here an element from the Boost library. It's not a standard library, but it could be one. Whether or not a given library is a standard library is just a matter of definition, and not of principle.)

Gerhard

--
http://www.piclist.com PIC/SX FAQ & list archive
View/change your membership options at
http://mailman.mit.edu/mailman/listinfo/piclist
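For illustration only, the point about an 'if_statement' code generator discarding a branch when the condition is a compile-time constant could be sketched in C roughly as follows. This is a toy under stated assumptions, not how any particular compiler does it: the node layout and the names `Node`, `NodeKind` and `fold_if` are hypothetical and do not come from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified tree node kinds. */
typedef enum { NODE_CONST, NODE_VAR, NODE_IF } NodeKind;

typedef struct Node {
    NodeKind kind;
    int value;                       /* NODE_CONST: the constant value  */
    const struct Node *cond;         /* NODE_IF: condition expression   */
    const struct Node *then_branch;  /* NODE_IF: 'true' statement       */
    const struct Node *else_branch;  /* NODE_IF: 'false' statement/NULL */
} Node;

/*
 * Returns the subtree that code still has to be generated for.
 * An 'if' whose condition is a constant is replaced by the branch
 * that will be taken (possibly NULL when there is no 'else');
 * every other node is returned unchanged.
 */
const Node *fold_if(const Node *node)
{
    if (node == NULL || node->kind != NODE_IF)
        return node;
    assert(node->cond != NULL);      /* an 'if' must have a condition */
    if (node->cond->kind != NODE_CONST)
        return node;                 /* not known at compile time */
    return node->cond->value ? node->then_branch : node->else_branch;
}
```

A generic function-call node such as 'xyz' carries no such knowledge, so a pass like this would only apply to it if the compiler were told, or could work out from the source, that 'xyz' behaves like 'if'.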
Re: [TECH] language - was [PIC] using BREAK in 'C'

>The C18 compiler's cryptic "syntax error on line xxxx"
>didn't exactly help either.

C30 is like this as well. It knows there is a syntax error in that it has reached something without finding a closing bracket, so why does it need to use a catch-all error message???
Re: [TECH] language - was [PIC] using BREAK in 'C'

On Sat, 25 Jul 2009, Alan B. Pearce wrote:

> >The C18 compiler's cryptic "syntax error on line xxxx"
> >didn't exactly help either.
>
> C30 is like this as well. It knows there is a syntax error in that it has
> reached something without finding a closing bracket, why does it need to use
> a catch-all error message???
>

Because at that level of parsing it may just be looking for any one of a set of tokens. The "catch-all error message" just means it found something it wasn't expecting (not in the list, maybe a keyword, brace or semicolon). At that level the function doesn't know what it's parsing so it can't tell you what the real problem is.

Friendly Regards
Sergio Masci
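As a rough sketch of the situation described above, assuming a hand-written parser: the low-level helper only receives a set of acceptable tokens, so "syntax error" is all it can say, while a caller that knows which construct is being parsed can be more specific. The token names and both function names are hypothetical and are not taken from C18 or C30.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token kinds for a toy parser. */
typedef enum { TOK_IDENT, TOK_COMMA, TOK_RBRACE, TOK_RPAREN, TOK_SEMI } Token;

/*
 * Low-level helper: it is handed a set of acceptable tokens and nothing
 * else, so it cannot know which construct is being parsed. Returns NULL
 * when the token is acceptable, otherwise a catch-all message.
 */
const char *check_token(Token tok, const Token *expected, size_t count)
{
    size_t i;
    assert(expected != NULL);
    for (i = 0; i < count; i++) {
        if (expected[i] == tok)
            return NULL;
    }
    return "syntax error";
}

/*
 * Higher-level routine: it knows it is inside an enum list, so it can
 * turn the helper's generic failure into a more useful message.
 */
const char *check_enum_list_token(Token tok)
{
    static const Token expected[] = { TOK_COMMA, TOK_RBRACE };
    if (check_token(tok, expected, 2) == NULL)
        return NULL;
    return "expected ',' or '}' in enum list";
}
```

Whether a compiler reports at the higher level is an implementation choice; the information is only available there.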
Re: [TECH] language - was [PIC] using BREAK in 'C'

Olin Lathrop wrote:

> Gerhard Fiedler wrote:
>> Of course. The more redundant (that is, verbose) the syntax is, the
>> easier it is both for the programmer to get something wrong and for
>> the compiler to catch when something is wrong.
>
> Not necessarily, and this is one of the problems with C.

What "not necessarily"? Sometimes, it seems you don't really try to understand and go against, just because... :)

> It's not always as obvious to a human when the wrong special character
> is used, for example, than a more verbose keyword.

This is what I wrote: the more verbose, the easier to catch ("more obvious") when something is wrong.

> It is easier to get "{" or "}" wrong than the more verbose "begin" or
> "end", for example.

I think here you're wrong. The probability to type a wrong letter is about the same per letter, so the probability to type "begin" wrong is higher than the probability to type "{" wrong.

> The latter are bigger patterns to match against and more obvious when
> they are wrong, especially to a casual observer.

Exactly... for the reviewer or for the compiler, it's much easier to catch the error with "begin" than with "{". This is what I wrote. So again: what "not necessarily"?

> I had exactly this case last week. I added one more symbol to a C
> ENUM, and apparently typed ")" instead of "}" to close the list.
> These two look fairly similar, so I didn't notice. That, and the
> fact that I don't do C that often caused several wasted minutes
> trying to figure out why the code wouldn't compile. The C18
> compiler's cryptic "syntax error on line xxxx" didn't exactly help
> either.

It seems that the compiler didn't have a problem detecting that there was something wrong. That the compiler didn't provide you with a helpful error message is another issue -- that's not a problem with language syntax but with the specific compiler implementation.

> I'm pretty sure that if I had been required to type END or something
> more verbose than "}", the mistake would never have happened or I
> would have noticed it much quicker.

You can define preprocessor macros and use Begin and End instead of { and }.

Gerhard
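A minimal sketch of the Begin/End suggestion, assuming nothing beyond the standard C preprocessor. The function `sum_to` is a made-up example. Since the macros are plain text substitution, the compiler still sees braces and would still report errors in terms of them.

```c
#include <assert.h>

/* Plain text substitution: after preprocessing these are just braces. */
#define Begin {
#define End   }

/* Sums the integers 1..n, written with Begin/End instead of { and }. */
int sum_to(int n)
Begin
    int total = 0;
    int i;
    assert(n >= 0);              /* negative counts are not meaningful here */
    for (i = 1; i <= n; i++)
    Begin
        total += i;
    End
    return total;
End
```

Typing ")" where End was meant would still be a syntax error, but a misspelled End stands out more to a reader than a wrong bracket does.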
Re: [TECH] language - was [PIC] using BREAK in 'C'

sergio masci wrote:

>>> You seem to want intrinsic functions that you can also write
>>> yourself if you want to.
>>
>> No. I've written this already. Conclusions based on this have
>> nothing to do with what I wrote.
>
> Yes Gerhard as Olin says you do seem to want intrinsic functions that
> you can also write yourself if you want to.
>
> This is not a bad thing, just incredibly difficult to achieve
> efficiently. By which I mean have a modern PC capable of generating
> an executable from such a compiler in an acceptable time.
>
> By definition, intrinsic means that the compiler understands what
> these functions do and how they do it.

I already wrote a few times that what I'm thinking of are functions for which the compiler writer knows what they do, but not necessarily how they do it (that is, they may be implemented in a library, not necessarily by the compiler).

> But this would still be intrinsic to the compiler because the compiler
> has a preconceived idea of what the function does and the actual source
> code for the function would not change this idea.

Calling a function with a standard interface ("preconceived idea of what it does") and an implementation in a library for which source code is available "intrinsic" is against every common use of this word. Including against your own... you already said that you don't consider the C++ function std::list::insert an intrinsic function. But it's this type of function that I'm talking about.

Gerhard
Re: [TECH] language - was [PIC] using BREAK in 'C'

> Gerhard
>>> Of course. The more redundant (that is, verbose) the syntax is, the
>>> easier it is both for the programmer to get something wrong and for
>>> the compiler to catch when something is wrong.

Olin:
>> Not necessarily, and this is one of the problems with C.

Gerhard:
> What "not necessarily"? Sometimes, it seems you don't really try to
> understand and go against, just because... :)

FWIW, and this is entirely about understanding each other's language, and not about understanding 'the machine's' language, I read what Olin wrote, understood it and largely agreed with it and it SEEMS to me that he understood you but that you misunderstand him. He seems to have an entirely valid point and to be addressing a real issue. ("But", as Sagan was wont to say, "I may be wrong" :-) ). This is NOT a criticism - just an observation. And it may be wrong in fact.

>> It's not always as obvious to a human when the wrong special character
>> is used, for example, than a more verbose keyword.

> This is what I wrote: the more verbose, the easier to catch ("more
> obvious") when something is wrong.

To my ear/brain/eye [ebe], that is NOT what you wrote - my ebe says that you are saying the opposite of what Olin is saying.

>> It is easier to get "{" or "}" wrong than the more verbose "begin" or
>> "end", for example.

i.e. Olin is saying NOT that by random typing error you are more liable to get a single character errot (which, as you correctly note, is not the case, on a purely statistical basis). He is saying that if one substitutes any one of { or } or ) or ( incorrectly for some other member of the set, then one is more likely to miss the eeeor than if one puts begin when one should have put end.

Whereas { for } is a typo,

    begin
    blah blah blah
    begin

is a 'braino' which one would not make (except on bank holidays at new moon in a fish market).

You said (my ebe says):

    People are liable to make more mistakes and machines
    less with multi-character symbols compared to single
    character symbols.

> I think here you're wrong. The probability to type a wrong letter is
> about the same per letter, so the probability to type "begin" wrong is
> higher than the probability to type "{" wrong.

As above - Olin was talking about mental parsing, not about statistical typing errors.

>> The latter are bigger patterns to match against and more obvious when
>> they are wrong, especially to a casual observer.

> Exactly... for the reviewer or for the compiler, it's much easier to
> catch the error with "begin" than with "{". This is what I wrote. So
> again: what "not necessarily"?

No. You actually said the opposite FOR A PERSON (my ebe says).

>> I had exactly this case last week. I added one more symbol to a C
>> ENUM, and apparently typed ")" instead of "}" to close the list.
>> These two look fairly similar, so I didn't notice. That, and the
>> fact that I don't do C that often caused several wasted minutes
>> trying to figure out why the code wouldn't compile. The C18
>> compiler's cryptic "syntax error on line xxxx" didn't exactly help
>> either.

> It seems that the compiler didn't have a problem detecting that there
> was something wrong.

Yes. Which is his point. Olin had problems with the single character symbol. The compiler didn't. That is the point he was making and the one that you SEEMED to be opposing.

> That the compiler didn't provide you with a helpful
> error message is another issue -- that's not a problem with language
> syntax but with the specific compiler implementation.

Yes.

>> I'm pretty sure that if I had been required to type END or something
>> more verbose than "}", the mistake would never have happened or I
>> would have noticed it much quicker.

> You can define preprocessor macros and use Begin and End instead of { and }.

Yes.

The main point (possibly :-) ) is that the double ended sharp bladed light sabre has no guards on either end as that's the way the original expert liked it, but when playing with it it's easy to lose a finger. One can, of course, use a preprocessor to add guards and a scabbard. Some of the 'expert' features are harder to "fix" with simple text substitution pre-processing, e.g. the (if I recall the argument correctly) fall through CASE treatment, which the original expert probably didn't mind but which allows people to make mincemeat of their programs if they do not understand that it differs from more protective languages.

Russell
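For readers who have not met the fall-through behaviour mentioned above, here is a small C sketch. Both function names are made up for this illustration. Without a break, execution continues into the following case labels; with a break after each case, only the matching case runs.

```c
/* No break statements: control falls through into the following cases. */
int count_fallthrough(int n)
{
    int hits = 0;
    switch (n) {
    case 1: hits++;  /* falls through */
    case 2: hits++;  /* falls through */
    case 3: hits++;
    }
    return hits;
}

/* A break after each case: only the matching case runs. */
int count_with_break(int n)
{
    int hits = 0;
    switch (n) {
    case 1: hits++; break;
    case 2: hits++; break;
    case 3: hits++; break;
    }
    return hits;
}
```

With an argument of 1 the first function runs all three increments and the second runs one. Text substitution alone cannot add the missing break statements, which is why this feature is harder to guard than brace spelling.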
Re: [TECH] language - was [PIC] using BREAK in 'C'

Russell McMahon wrote:

>> Gerhard
>>>> Of course. The more redundant (that is, verbose) the syntax is, the
>>>> easier it is both for the programmer to get something wrong and
>>>> for the compiler to catch when something is wrong.
>
> Olin:
>>> Not necessarily, and this is one of the problems with C.
>
> Gerhard:
>> What "not necessarily"? Sometimes, it seems you don't really try to
>> understand and go against, just because... :)

> FWIW, and this is entirely about understanding each other's language,
> and not about understanding 'the machine's' language, I read what
> Olin wrote, understood it and largely agreed with it ...

So did I (even before he wrote :), in case I didn't get this across properly.

> ... and it SEEMS to me that he understood you but that you
> misunderstand him.

Or both of you misunderstood what I wrote first? Not impossible either... for a variety of reasons :)

>>> It's not always as obvious to a human when the wrong special
>>> character is used, for example, than a more verbose keyword.
>
>> This is what I wrote: the more verbose, the easier to catch ("more
>> obvious") when something is wrong.
>
> To my ear/brain/eye [ebe], that is NOT what you wrote - My ebe says
> that you are saying the opposite of what Olin is saying.

Ok... this seems to become a language (English, not C) or a thought process issue. In my understanding, part of what I wrote is: "The more redundant (that is, verbose) the syntax is, the easier it is ... for the compiler to catch when something is wrong." I wrote "for the compiler", but the important issue is not the compiler, but that it's easier to catch an error (or even correct it) when the information is redundant. See your typo a few sentences below... easy to correct, because the language provides sufficient redundancy.

>> It is easier to get "{" or "}" wrong than the more verbose "begin" or
>> "end", for example.
>
> ie Olin is saying NOT that by random typing error you are more liable
> to get a single character errot (which, as you correctly note, is not
> the case, on a purely statistical basis).

Agreed. So we seem to agree that it's more likely to type "begon" (or any of the other possible, wrong variants) instead of "begin" than "}" (or any of the other possible, wrong variants) instead of "{". And that it is more likely for the compiler to catch the "begon" typo than the "}" typo (even though it's quite likely that both won't go uncaught, if for nothing else than because the balance is off).

> He is saying that if one substitutes any one of { or } or ) or (
> incorrectly for some other member of the set, then one is more likely
> to miss the eeeor than if one puts begin when one should have put
> end.

Agreed. But that's after the fact, when reviewing, not when typing. And, FWIW, it's most likely that the compiler catches either with equal (and high) probability -- which is what's probably most important.

I think it's been decades since I had a bug in a C program (a bug, not a "doesn't compile" error) that was due to an incorrectly typed parenthesis. (And this doesn't only include my own programs.) If I ever saw one. A decent language-aware editor already tells you that your parentheses/brackets/begin-end/whatever pairs are unbalanced before you even compile.

> Whereas { for } is a typo,
>
> begin
> blah blah blah
> begin
>
> is a 'braino' which one would not make (except on bank holidays at new moon
> in a fish market).

Agreed. (But IMO this is /not/ what Olin said... this is what you /thought/ reading his comments :)

But I don't think there is a big difference in the probability that either will be caught by the compiler, and the probability is quite high. If you wanted to craft an example, you'd have to quite carefully do so -- create two such typos that cancel each other out and where the resulting code still is correct syntax.

> You said (my ebe says)
> People are liable to make more mistakes and machines
> less with multi-character symbols compared to single
> character symbols.

No (IMO... but you know that my English is sometimes ... how do I say ... odd, at least for a native speaker, and I still, sometimes against better judgment, use them as the measuring stick :)

What I wrote we know (see above). What I meant, in the terms of this re-write, could be:

    People are liable to make more typos when they have to type more.
    But when they have to type more to pass the same information, the
    implicit redundancy allows the compiler (and possibly a reviewer)
    to more easily catch the typos due to the inconsistencies they
    create.

(Doesn't apply to all verbosity, but applies to the one we're talking about.)

>>> I had exactly this case last week. I added one more symbol to a C
>>> ENUM, and apparently typed ")" instead of "}" to close the list.
>>> These two look fairly similar, so I didn't notice. That, and the
>>> fact that I don't do C that often caused several wasted minutes
>>> trying to figure out why the code wouldn't compile. The C18
>>> compiler's cryptic "syntax error on line xxxx" didn't exactly help
>>> either.
>>
>> It seems that the compiler didn't have a problem detecting that
>> there was something wrong.
>
> Yes. Which is his point. Olin had problems with the single character
> symbol. The compiler didn't. That is the point he was making and the
> one that you SEEMED to be opposing.

I didn't quite oppose what Olin said. With the exception of a rather unimportant side-issue, I thought that what I wrote originally was pretty much the same as what Olin wrote, even though looked at from a slightly different angle. The only thing I really opposed was Olin's opposition (as expressed in the "not necessarily").

> The main point (possibly :-) ) is that the double ended sharp bladed
> light sabre has no guards on either end as that's the way the
> original expert liked it, but when playing with it it's easy to lose
> a finger. One can, of course, use a preprocessor to add guards and a
> scabbard. Some of the 'expert' features are harder to "fix" with
> simple text substitution pre-processing. eg the (if I recall the
> argument correctly) fall through CASE treatment which the original
> expert probably didn't mind but which allows people to make mincemeat
> of their programs if they do not understand that it differs from more
> protective languages.

Yes, but we've been there exhaustively, and I think there's no real disagreement about many of the shortcomings of C (at least not between the three of us).

Maybe let me add a last clarification... I don't like C and C++ that much, but I'm quite proficient in both and spend a lot of time with them. Bugs due to typos with the language symbols are /very/ rare IME (which doesn't include only my code; it also includes code reviewed by me and written by all kinds of programmers). When such typos happen, they are with a quite high probability caught by the compiler, and with some experience (and a decent compiler) quickly fixed (Olin's mishap notwithstanding).

The bugs due to typos that I see are of the kind where you have maybe a member variable m_balloons and change your logic for a certain function to work with the function argument a_balloons and change all instances of m_balloons but one. The probability of such typos is pretty much language-independent, and depends largely on other factors.

FWIW, Olin's Pascal-to-C compiler is a sort of elaborate pre-processor that even takes out the fall-through in the switch :) (Well, I haven't seen it, but I'm sure it does.)

The brackets-versus-begin/end debate is quite old, and I think there's not much under the sun that could be said that hasn't already been said. It's a matter of taste, not much more. The arguments of the fighters on either side are many, and some on each side do have merit -- but who's to say how exactly the balance is? For everybody?

Gerhard
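The m_balloons/a_balloons kind of slip described above can be sketched in C as follows. This is a hypothetical reconstruction: only the two variable names come from the post, and a file-scope variable stands in for the member variable since C has no classes. The buggy version compiles without complaint, which is the point: no bracket or keyword syntax helps here.

```c
/* Stands in for the member variable mentioned in the post. */
static int m_balloons = 10;

/*
 * Meant to work only with the argument a_balloons after a refactoring,
 * but one use of m_balloons was left behind. Valid C, wrong result.
 */
int remaining_buggy(int a_balloons, int popped)
{
    if (popped > a_balloons)
        popped = a_balloons;
    return m_balloons - popped;   /* leftover: should be a_balloons */
}

/* The intended version, using the argument throughout. */
int remaining_fixed(int a_balloons, int popped)
{
    if (popped > a_balloons)
        popped = a_balloons;
    return a_balloons - popped;
}
```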