Re: [TECH] language - was [PIC] using BREAK in 'C'Tamas Rudnai wrote:
>> Not quite. Firstly, I was talking about /standard/ libraries, where >> the compiler also "knows" what the functions are supposed to do >> (because it's defined in the same place where is defined what the >> compiler itself is supposed to do). > > Now I completely lost. Are you saying that a C compiler recognizes > specific function calls like printf and when you write printf("hello > %s", "world"); it realizes that a puts("hello world"); would be much > cheaper as part of the optimization? I'm not talking about C at all; this is a discussion of the merits of features built into the compiler versus features in a standard library. I talk about C when discussing Sergio's examples in C, and I talk about other languages to bring in examples. This discussion is in principle not specific to any language (other than maybe XSCB, given Sergio's experience with it :). But of course a C compiler may recognize a function "printf" and possibly do such an optimization. IIRC I've heard about C "preprocessors" that do exactly this: go through your code, analyze all printf statements and replace them with their idea of what's an optimal implementation. > Or you are talking about intrinsic functions within the library ... We seem to need a definition of "intrinsic function". It seems that already what Sergio, Olin and I think of when we use this term doesn't match, and you now added to the mix intrinsic functions in the library :) > ... where it still uses the printf function but makes all the code > optimizations as the library function was written in the place where > it was called from? If by "in the place" you mean by the same vendor, then yes, this is a possibility. > That's because of the operator '+' can be overloaded in a string type > object. In laguages like Pascal the string has a different structure > than in the ANSI C, so you have the actual length of the string at > the very beginning of the string buffer (it is like a minimalistic > buffer header). 
That makes it possible to implement string > manipulations faster and easier -- therefore a string concatenation > is an easy task by language definition. Here you're comparing two different string implementations. That's a different issue; this here is all about built-in vs library. The question whether Pascal-type strings are more efficient than C-type strings has nothing to do with this. >> Now what if I need to handle Unicode strings? Wait for a compiler >> upgrade? And what if that compiler upgrade doesn't handle the >> Unicode encoding I need? > > I agree with you that in C++ they put these things in a way that it > can be extended easily -- especially if the string was handled by > STL. In the other hand on an x86 PC on the compiler side they only > needed to change minor things, like replacing "rep movsb" to "rep > movsw" and problem solved -- while in C++ these things are function > calls to the overloaded functions from the string class. Haven't done much Unicode coding lately? :) It's not so easy. Unicode is /not/ just 16-bit wide characters. The arguably most common Unicode encoding is UTF-8, which has its characters in 8, 16 and 32 bits, depending on the character. Then there are other flavors, most with varying character widths. So just treating each character as a 16-bit entity doesn't work. > Also I am not sure with HiTech but in many embedded C compiler you can > tell which type of printf do you want to use within your application > (there are some minimalistic version, with some restricted version > like no float types and the full version). Hang on a minute, are you > saying the HiTech automatically chooses the smallest/fastest but > still functional one when you are compiling your application? Yes. 
Independently of what exactly HiTech does here (I don't have any of their PRO compilers), one can go rather far with that and actually, in a fine-grained way, automatically customize the printf source to really only include the parts that are actually needed. (As long as you're thinking of a C library, this could be done by preprocessor macros that the compiler gathers while analyzing the application and the passes to the printf source when it compiles it.) Gerhard -- http://www.piclist.com PIC/SX FAQ & list archive View/change your membership options at http://mailman.mit.edu/mailman/listinfo/piclist |
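Gerhard's remark that Unicode is not just 16-bit wide characters can be made concrete with a minimal C++ sketch. In UTF-8 a code point occupies one to four bytes, so the number of characters in a string is not its byte count, and not half of it either. The helper name count_code_points is hypothetical, and the sketch assumes well-formed UTF-8 input.

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-8 encoded byte string.
// Assumption: the input is well-formed UTF-8; nothing is validated here.
// count_code_points is a hypothetical helper name, not from the thread.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        // Continuation bytes have the bit pattern 10xxxxxx. Every other
        // byte is the first byte of a new code point.
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

A "rep movsb" to "rep movsw" style change cannot express this, because the width of each character is only known after looking at its first byte.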
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'

>> Are you saying that a C compiler could understand that "sin()" is
>> standard and generate straight math processor instructions (intel FP
>> has had a FSIN instruction since forever) rather than a library call?

We are the Borg of Pentium. Division is futile. You will be approximated!

Russell McMahon
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'

On Wed, 15 Jul 2009, Gerhard Fiedler wrote:

> I'm not talking about C at all; this is a discussion of the merits of
> features built into the compiler versus features in a standard library.
> I talk about C when discussing Sergio's examples in C, and I talk about
> other languages to bring in examples. This discussion is in principle
> not specific to any language (other than maybe XSCB, given Sergio's
> experience with it :).

XCSB XCASM XCPROD :-)

Friendly Regards
Sergio Masci
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'sergio masci wrote:
>> I resisted the temptation to quote all the stuff again... :) > > I felt it was necessary because several days have gone by between > your post and my response. Yes... no critique meant with my comment. It just was /a lot/ :) >>> You seem to be saying that provided a set of libraries are well >>> written and available in source form to the compiler that it can >>> compile these together with the user program and (given that the >>> compiler is implemented well enough by the compiler writers) that >>> the compiler should be able to extract enough information from all >>> the combined source code to generate a resulting executable that is >>> as good as one that would be generated if the language had more >>> built-in features such as STRING, LIST, DYNAMIC ARRAYS etc. >>> Furthermore that the compiler should be able to catch the same kind >>> of bugs in both cases. >> >> Not quite. Firstly, I was talking about /standard/ libraries, where >> the compiler also "knows" what the functions are supposed to do >> (because it's defined in the same place where is defined what the >> compiler itself is supposed to do). > > It is now clear to me that you are talking about intrinsic functions. > Yes the function is defined in a /standard/ library but the compiler > also knows about the function independently of the library. Would you call the C++ standard library functions (like std::list::insert) "intrinsic functions"? At least the meaning that this term seems to have in the Microsoft VC++ and gcc compiler documentation is not what I'm talking about. >>> if strings were built into the language we might instead write: >>> >>> str = "hello world" + string(j) >> >> See, in C++ for example, strings are /not/ built into the language, >> and you can write pretty much exactly this. 
(Not with the >> std::string, but if you extend it a bit, you can, so in the case of >> C++ it's not really a question whether or not it can be done with >> strings in a library but whether the library definition is >> sufficient.) > > But the C++ compiler understands what is going on here even less. We > now end up adding even more run time overheads just to make the > source code look better. Not sure to what degree a compiler "understands", and I don't want to drift off in a discussion about the arbitrary shortcomings of C or C++. But when a compiler "knows" the intent of the line (because all operations that happen are defined in the language standard) and knows the implementation (because it of course "sees" the implementation of the operators that are implemented in the compiler, but it also sees the implementation of the library functions by making their sources available) -- what's the difference that's left between built-in and library? >> I agree that the lack of a built-in decent string type in C can be a >> pain, especially in terms of syntax. OTOH, I bet your strings are >> 8-bit strings. Now what if I need to handle Unicode strings? Wait >> for a compiler upgrade? And what if that compiler upgrade doesn't >> handle the Unicode encoding I need? > > Yes I understand your point of view, but 8-bit strings are still very > useful even if you need to use Unicode in the same program. Just like > integers are very useful even though you might need to use floating > point. Right. My point was that if 8-bit strings are built-in and Unicode strings are in a library, and you are claiming (elsewhere) that the built-in syntax can be different from library syntax, then I need to make the Unicode string syntax completely different -- structurally different -- from 8-bit string syntax. Can you imagine that? What a pain. >> (Snipped prelude to substr argument.) 
>> >>> But this comes with it's won hazards, mainly that the user could VERY >>> easily write: >>> >>> str2 = substr(str1, pos1, len); >>> >>> where he actually needed: >>> >>> str2 = substr_lengeth(str1, pos1, len); >>> >>> Realistically how is a conventional compiler (one that is not mega >>> complex and running on an infinately fast build machine with an >>> infinate amount of RAM) going to spot this type of mistake without >>> adding a ton of attributes to the function prototype? >>> >>> If strings were built in we could simply say something like >>> >>> str2 = substr str1 from pos1 to pos2 >>> >>> or >>> >>> str2 = substr str1 from pos1 to end >>> >>> or >>> >>> str2 = substr str1 from pos1 length len >> >> This is a simple matter of syntax. You don't do much more here than >> comparing C-style syntax with BASIC-style syntax. A matter of >> taste... The BASIC-style syntax uses "substr ... from ... to" and >> "substr ... from ... length". A similar C-style syntax could use >> substr_from_to and substr_from_len -- or any number of similar >> variants. Then there's the LISP syntax, and a few others. I don't >> see what this has to do with built-in vs library. > > It makes a difference if you consider that each statement helps the > compiler understand what the perpose of a variable is. In the above > example 'length len' within the 'substr' statement allows the > compiler to understand that 'len' is being used to manipulate strings > in this fragment so it WOULD be able to help me catch a simple error > such as: > > for (j=0; j<len; j++) > { > len2 = strlen(arr[j]); > > arr2[j] = substr(arr[j], 0, len-2); > } Be that as it may, but this is a difference between a specific C-style syntax and a specific BASIC-style syntax. 
There's nothing that would prevent a language to allow BASIC-style syntax for libraries (functions with several indentifiers that separate the arguments and together form one library call), so I don't really see the point WRT our discussion of built-in vs library. FWIW, my point was about specific function names. You didn't use my function names, so I don't know what you meant here. If you want to, can you rewrite this using substr_from_to and substr_from_len, then explain what is the difference to the BASIC variant? >>> I wonder if what you are really saying is that the compiler can do >>> more error checking and optimisation because it has all the source >>> rather than pre-compiled libraries? >> >> What I'm saying is that if it has the intent /and/ the source (the >> implementation), it can apply both for (usually different, and >> complementary) optimizations. Not that different from what it can do >> for built-in constructs. > > Ok, but you really are talking about intrinsics as I understand them > with the addition of a standard library function for each intrinsic. We probably need a common definition of "intrinsic function". I was talking about standard libraries that contain functions (and other language elements) that are defined in the language standard, in the same way as the language elements that the compiler implements. And additionally, to give the compiler access to the specific implementation (not only to the intent), the libraries are available to the compiler in source code. The intrinsic functions that are used e.g. in gcc and VC++ are not of this type. Most (if not all) are not even standard. >>> I talk about the brick wall between the compiler and the libraries >>> and you respond with "make the source of the libraries available". >> >> No. I say consider that the library is a /standard/ library. Adding >> the source code is in addition, so that the compiler not only >> "knows" the intent, but also sees the implementation. > > got you. 
intrinsic + library I'm not so sure... :) "Intrinsic" seems to imply (possibly among other things) that the library is written by the compiler vendor. I don't mean to imply this. >> Look at the C++ standard library definition for std::list::insert, >> for example. It contains a definition that allows each C++ compiler >> to "understand" what a call to std::list::insert is supposed to do. > > I will look at this. Can you point me at a specific doc and library > so that I can be sure to look at exactly what you are looking at. Take any standardized language with a standardized library. For example, the C++ standard (the real one costs money, but here is a not quite up to date one <http://www.open-std.org/jtc1/sc22/open/n2356/>). And of course, anything that's missing /could/ be there -- it just isn't. Or C# (ECMA-336) and the .NET CLI (ECMA-335), even though that's possibly a bit different. Gerhard -- http://www.piclist.com PIC/SX FAQ & list archive View/change your membership options at http://mailman.mit.edu/mailman/listinfo/piclist |
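The substr_from_to and substr_from_len names suggested above can be sketched as ordinary C++ library functions. This is only an illustration under stated assumptions: both are thin wrappers over std::string::substr, "to" is taken to be an exclusive end position, and an inverted range yields an empty string. None of these choices come from the thread.

```cpp
#include <cstddef>
#include <string>

// Substring given a start position and an exclusive end position.
// Assumption: 'to' is one past the last character wanted.
// Throws std::out_of_range (from substr) if 'from' is past the end of s.
std::string substr_from_to(const std::string& s, std::size_t from,
                           std::size_t to) {
    if (to < from) {
        return std::string();  // inverted range: return an empty string
    }
    return s.substr(from, to - from);
}

// Substring given a start position and a length.
// substr clamps 'len' to the characters actually available.
std::string substr_from_len(const std::string& s, std::size_t from,
                            std::size_t len) {
    return s.substr(from, len);
}
```

With the two meanings spelled out in the names, a call such as substr_from_to(str1, pos1, len) at least looks wrong to a human reader, although the compiler still sees only three integers of compatible type.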
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'

sergio masci wrote:

>> sergio masci wrote:
>>
>>> Having a language and compiler that would generate efficient
>>> multitasking code on both a 628 and Windows XP in a multicore
>>> processor would be a feat. But the hardest part (that of getting it
>>> to work on a 628) is done! Surely you can see that doing the same
>>> for a system with MUCH better resources is trivial in comparison.
>>
>> No, I can't see that. Portable, safe and efficient multitasking on a
>> modern multicore system is in no way trivial. It is IMO more complex
>> than multitasking on a 628, by a few magnitudes.
>
> No Gerhard the fundamental principle is still identical - you switch
> task contexts. Whether this means that you need to completely save
> one CPU context and load another OR you arrange for the task
> contexts to co-exist and simply switch between them is up to the
> implementor.

This is in the context of the compiler implementing multitasking. Multitasking is not only about starting threads and maintaining their state, it's also about synchronization between threads, and doing so efficiently. And many aspects of this depend on the exact situation.

One difference between the 628 and Windows XP or Linux is that the compiler in the first case does part of what the OS does in the other case, so the implementations are different. The other is that I have a somewhat more sophisticated expectation of how multitasking works on a Windows or Linux PC when compared to a 628.

Also, on a single-core system, an entity that can be manipulated atomically can be used to synchronize between threads. We do this all the time on PICs with bits and bytes. It doesn't work this simple way on a multi-core system; you need additional provisions to synchronize the corresponding processor cache lines.

Then there's the performance issue. On a single-core system, normal locks for accessing containers and making sure they are always in a consistent state are good enough; as long as the code is inside the container, it's doing useful work, so serializing even read accesses isn't going to cost me much if anything. However, on a multi-core system, reader/writer locks may make a lot of sense and speed things up quite dramatically, as now several readers can read the container literally /at the same time/, and the lock is only used to make sure a writer is always alone in the container.

So there are quite substantial differences in multitasking between a small bare-metal system and a bigger system with a complete OS on one hand, and between single-core systems and multi-core systems on the other hand. If a compiler is to implement multitasking, it needs to take all this into account. I don't know whether it would make sense to use the same functionality for multitasking keyboard input, display output and measurement on a 628 and for a web server on Linux, but I suspect it wouldn't.

Gerhard
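The reader/writer lock idea described above can be sketched in C++ roughly as follows. This assumes a C++17 toolchain with std::shared_mutex, and the SharedTable class with its lookup and store members is a hypothetical example, not anything from the thread. Readers take a shared lock and may run concurrently on different cores; a writer takes an exclusive lock and is alone in the container.

```cpp
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical container guarded by a reader/writer lock (C++17).
class SharedTable {
public:
    // Readers: several threads may hold the shared lock at the same time.
    bool lookup(const std::string& key, int& out) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        auto it = data_.find(key);
        if (it == data_.end()) {
            return false;
        }
        out = it->second;
        return true;
    }

    // Writer: the exclusive lock keeps all readers and other writers out.
    void store(const std::string& key, int value) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        data_[key] = value;
    }

private:
    mutable std::shared_mutex mutex_;  // mutable so const readers can lock
    std::map<std::string, int> data_;
};
```

On a single-core target a plain mutex would likely serve just as well, which is the asymmetry the message points out.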
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'

William "Chops" Westfield wrote:

>> If I had meant intrinsics, I'd have written about intrinsics.
>
>> I said that if a library is a /standard/ library (and I have
>> mentioned this word "standard" a few times; you didn't seem to pick
>> up on this),
>
> What exactly do you see as the difference between an "intrinsic"
> function and a Standard library function? I mean sin() is a standard
> library function in C, but an intrinsic function in fortran/pascal/
> etc, right?

My point in this whole thing is exactly that there isn't a big difference, and I'm trying to get to the bottom of what Sergio thinks is the difference. So this would be a question for Sergio :)

> Are you saying that a C compiler could understand that "sin()" is
> standard and generate straight math processor instructions (intel FP
> has had a FSIN instruction since forever) rather than a library call?
> (assuming that it "knows" that the instructions and the library are
> supposed to generate the same results...)

Sergio's example a while ago was that since the compiler knows what remove(), insert() and replace() do (because they are part of the language rather than a library), it is possible for the compiler to replace a sequence of remove() and insert() with an appropriate replace(), making the code more efficient.

My counter-argument was that if remove(), insert() and replace() are /standard/ functions, the compiler doesn't have to actually implement them to know that it can replace a remove() call followed by an insert() call with a replace() call -- they may just as well be in a (standard!) library.

(All obviously used in a way that this is actually correct and makes sense. We're not talking about compiler mistakes or library bugs or unusable definitions... :)

Gerhard
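The remove()/insert()/replace() argument can be illustrated with std::string from the C++ standard library, whose erase, insert and replace members have standardized behaviour. Mapping the thread's remove() onto erase() is an assumption made for this sketch, and the two function names below are hypothetical. The sketch only shows that the two sequences are equivalent, which is the fact a compiler would need to rely on; it does not show any particular compiler performing the rewrite.

```cpp
#include <cstddef>
#include <string>

// Two-step version: remove a range, then insert new text at the same place.
std::string erase_then_insert(std::string s, std::size_t pos, std::size_t len,
                              const std::string& text) {
    s.erase(pos, len);
    s.insert(pos, text);
    return s;
}

// One-step version: the library's replace() does both at once.
std::string replace_directly(std::string s, std::size_t pos, std::size_t len,
                             const std::string& text) {
    s.replace(pos, len, text);
    return s;
}
```

Because the standard defines what all three members do, the equivalence holds for any conforming library, whether or not the compiler vendor wrote it.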
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'

sergio masci wrote:

> From what you are saying now, it is clear to me that yes you are
> talking about intrinsic functions.

What follows doesn't seem to match my understanding of "intrinsic function". As long as we use this term, can we define it, so that we have a common understanding?

> Yes the function is defined in a library but the compiler also knows
> about the intent of the function.

Correct, but maybe better "the source code of the function is defined in a library". The intent of the function is defined in the language standard. (Being defined in a library doesn't seem to match what's generally meant with "intrinsic".)

> The intent is separate from the definition and is included by
> the compiler writer directly into the compiler.

Correct.

> The definition of the function is provided by someone else (library
> implementor?) ...

Correct. Not necessary, but may be.

> ... and may even be at odds with what the compiler understands it to
> be.

It is possible, of course, that they are at odds, but I'd classify this as a bug (in the compiler, in the library or in the standard).

It seems we're getting somewhere :) See also my response to Bill.

Gerhard
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'

Gerhard Fiedler wrote:

> Sergio's example a while ago was that since the compiler knows what
> remove(), insert() and replace() do (because they are part of the
> language rather than a library), it is possible for the compiler to
> replace a sequence of remove() and insert() with an appropriate
> replace(), making the code more efficient.

Just my .0002c :) into this interesting topic: won't this make the programmer "lazier", while OTOH C lets you use your brain better?
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'

> Note that you're no worse off if it doesn't support the character
> representation you like. The reverse can also be a pain, like Java where
> everything is one of those unicode thingies. Maybe that's nice if you want
> to appease some illiterate in a distance jungle somewhere, but it's a
> hassle if you want to send a stream of 8 bit characters to a
> microcontroller. And now even 65K glyphs are apparently not enough. Is
> this nonsense ever going to end?

Only peripherally related to the main point of this thread but:

I know (hope?) that that comment was included for its provocative value rather than being intended as a racist / jest plain iggerant comment.

To characterise those whose languages require, for whatever reason, more than 8 binary bits to represent them as

- illiterate
- far away (and thus by implication unimportant)
- jungle (savage / unworthy ...)

is rather closer to ad hominem attack than useful comment.

I'm not overly into PC stuff (nothing to do with IBM)(& despite impressions formed to the contrary by some :-)) BUT you risk rejecting an awful lot of people and knowledge and usefulness and more if you genuinely suggest sticking to 8 bits because the mother tongue jest happens to manage OK with that many, or even less. (If you can't express it in Baudot it's not worth saying? :-) )(Or Ogham).

Sure, needing to use extra bits to accommodate material that you have no interest in is a pain, but largely not a major problem with modern systems * except in systems so small that they can probably afford to not deal with such codes.

(* Nobody should ever need more than 640 kB of memory**).

Russell

** Claims to the contrary notwithstanding, Gates denies ever having said it.

2009/7/15 Olin Lathrop <olin_piclist@...>

> Gerhard Fiedler wrote:
> > I resisted the temptation to quote all the stuff again... :)
>
> It really has been confusing trying to figure out what exactly your point
> is.
>
> > Not quite.
Firstly, I was talking about /standard/ libraries, where the > > compiler also "knows" what the functions are supposed to do (because > > it's defined in the same place where is defined what the compiler itself > > is supposed to do). > > > But that's exactly what intrinsic functions are, which you strongly claimed > you weren't talking about when I suggested that's what you might mean. Now > I am (and I think Sergio too) really confused. How is what you mean not > intrinsic functions? > > > I agree that the lack of a built-in decent string type in C can be a > > pain, especially in terms of syntax. OTOH, I bet your strings are 8-bit > > strings. Now what if I need to handle Unicode strings? > > > Tell your customers to use ASCII like civilized people ;-) > > Wait for a > > compiler upgrade? And what if that compiler upgrade doesn't handle the > > Unicode encoding I need? > > > Note that you're no worse off if it doesn't support the character representation > you like. The reverse can also be a pain, like Java where > everything is one of those unicode thingies. Maybe that's nice if you want > to appease some illiterate in a distance jungle somewhere, but it's a > hassle > if you want to send a stream of 8 bit characters to a microcontroller. And > now even 65K glyphs are apparently not enough. Is this nonsense ever going > to end? > > >> And we even got rid of the huge printf library as a side effect. > > > > If that's a concern, HiTech for example parses all printf strings in the > > whole application and based on that decides what to include into printf. > > > There are usually several ways around a single problem, but I'm not sure what > exactly your point is here. > > > Since printf is a /standard/ library function, they can do this -- and > I > > still can override theirs with mine, or make theirs work with my own > > putch function. 
I don't know how exactly they do it, but it doesn't > > really matter whether they have a bunch of configuration parameters for > > a printf library function and the compiler sets the configuration > > parameters accordingly, or whether their printf function is implemented > > in the compiler itself. It doesn't matter because it's a standard > > library, and as long as their compiler behaves accordingly (and lets me > > override their function with my own if I want), it's all fine. > > > So it sounds like you want a bunch of intrinsic functions (which represent > a large amount of compiler work), but have any optimization defeated and > your > own function called if you define one? > > > It's the number of symbols and their structure that makes a feature set > > "over inflated". IMO it doesn't make a difference whether you have 5000 > > symbols in a library and people think it's too much or whether you have > > 5000 symbols in a compiler and people think it's too much. > > > I think part of the point is that certain features, like the string > handling Sergio described, require a lot less special syntax when built in > than they > would require special functions if "built in" that way. > > > ******************************************************************** > Embed Inc, Littleton Massachusetts, http://www.embedinc.com/products > (978) 742-9014. Gold level PIC consultants since 2000. > -- > http://www.piclist.com PIC/SX FAQ & list archive > View/change your membership options at > http://mailman.mit.edu/mailman/listinfo/piclist > > http://www.piclist.com PIC/SX FAQ & list archive View/change your membership options at http://mailman.mit.edu/mailman/listinfo/piclist |
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'

Sorry ... meant to trim.
Not used to using GMail for lists. Hides the quotes where you can't see them (easily).

Russell
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'On Wed, 15 Jul 2009, Gerhard Fiedler wrote: > sergio masci wrote: > > > It is now clear to me that you are talking about intrinsic functions. > > Yes the function is defined in a /standard/ library but the compiler > > also knows about the function independently of the library. > > Would you call the C++ standard library functions (like > std::list::insert) "intrinsic functions"? At least the meaning that this > term seems to have in the Microsoft VC++ and gcc compiler documentation > is not what I'm talking about. No. > > > >>> if strings were built into the language we might instead write: > >>> > >>> str = "hello world" + string(j) > >> > >> See, in C++ for example, strings are /not/ built into the language, > >> and you can write pretty much exactly this. (Not with the > >> std::string, but if you extend it a bit, you can, so in the case of > >> C++ it's not really a question whether or not it can be done with > >> strings in a library but whether the library definition is > >> sufficient.) > > > > But the C++ compiler understands what is going on here even less. We > > now end up adding even more run time overheads just to make the > > source code look better. > > Not sure to what degree a compiler "understands", and I don't want to > drift off in a discussion about the arbitrary shortcomings of C or C++. > But when a compiler "knows" the intent of the line (because all > operations that happen are defined in the language standard) and knows > the implementation (because it of course "sees" the implementation of > the operators that are implemented in the compiler, but it also sees the > implementation of the library functions by making their sources > available) -- what's the difference that's left between built-in and > library? home in on "'knows' the intent of the line" I am not talking about parsing a line and understanding it its meaning, I'm talking about understaning several lines as a unit. 
What is the programmer trying to say in these several lines?

If I write:

    for (j=0; j<strlen(str); j++)
    {
        if (str[j] >= 'a' && str[j] <= 'z')
        {
            str[j] = str[j] ^ ('a' ^ 'A');
        }
    }

would you expect the compiler to flag "j<strlen(str)" or maybe replace it with the much more optimised:

    xlen = strlen(str);
    for (j=0; j<xlen; j++)
    {
        if (str[j] >= 'a' && str[j] <= 'z')
        {
            str[j] = str[j] ^ ('a' ^ 'A');
        }
    }

maybe optimise it further as:

    xlen = strlen(str);
    for (j=0; j<xlen; j++)
    {
        ptr = &str[j];
        if (*ptr >= 'a' && *ptr <= 'z')
        {
            *ptr = *ptr ^ ('a' ^ 'A');
        }
    }

So here the compiler looked at several statements as a single unit and was able to optimise it (this is quite within the capabilities of modern C compilers).

Now going a little further you could say that it was the intent of the programmer to "change all lower case alpha characters to uppercase". So the intent of the programmer spans several statements.

Put this another way, if I write in assembler:

    movf    X+0, w
    addwf   Y+0, f
    btfsc   STATUS, C
    incf    Y+1, f
    movf    X+1, w
    addwf   Y+1, f

This code adds one 16 bit variable to another 16 bit variable. Here the programmer's intent is to add 16 bit variable X to 16 bit variable Y. It is far easier for you, me and the compiler to understand the programmer's intent if instead I write:

    Y = Y + X;

> >> I agree that the lack of a built-in decent string type in C can be a
> >> pain, especially in terms of syntax. OTOH, I bet your strings are
> >> 8-bit strings. Now what if I need to handle Unicode strings? Wait
> >> for a compiler upgrade? And what if that compiler upgrade doesn't
> >> handle the Unicode encoding I need?
> >
> > Yes I understand your point of view, but 8-bit strings are still very
> > useful even if you need to use Unicode in the same program. Just like
> > integers are very useful even though you might need to use floating
> > point.
>
> Right.
My point was that if 8-bit strings are built-in and Unicode > strings are in a library, and you are claiming (elsewhere) that the > built-in syntax can be different from library syntax, then I need to > make the Unicode string syntax completely different -- structurally > different -- from 8-bit string syntax. Can you imagine that? What a > pain. > Yes you are right it would be a pain. But it is a move in the right direction. > > >> (Snipped prelude to substr argument.) > >> > >>> But this comes with it's won hazards, mainly that the user could VERY > >>> easily write: > >>> > >>> str2 = substr(str1, pos1, len); > >>> > >>> where he actually needed: > >>> > >>> str2 = substr_lengeth(str1, pos1, len); > >>> > >>> Realistically how is a conventional compiler (one that is not mega > >>> complex and running on an infinately fast build machine with an > >>> infinate amount of RAM) going to spot this type of mistake without > >>> adding a ton of attributes to the function prototype? > >>> > >>> If strings were built in we could simply say something like > >>> > >>> str2 = substr str1 from pos1 to pos2 > >>> > >>> or > >>> > >>> str2 = substr str1 from pos1 to end > >>> > >>> or > >>> > >>> str2 = substr str1 from pos1 length len > >> > >> This is a simple matter of syntax. You don't do much more here than > >> comparing C-style syntax with BASIC-style syntax. A matter of > >> taste... The BASIC-style syntax uses "substr ... from ... to" and > >> "substr ... from ... length". A similar C-style syntax could use > >> substr_from_to and substr_from_len -- or any number of similar > >> variants. Then there's the LISP syntax, and a few others. I don't > >> see what this has to do with built-in vs library. > > > > It makes a difference if you consider that each statement helps the > > compiler understand what the perpose of a variable is. 
In the above > > example 'length len' within the 'substr' statement allows the > > compiler to understand that 'len' is being used to manipulate strings > > in this fragment so it WOULD be able to help me catch a simple error > > such as: > > > > for (j=0; j<len; j++) > > { > > len2 = strlen(arr[j]); > > > > arr2[j] = substr(arr[j], 0, len-2); > > } > > Be that as it may, but this is a difference between a specific C-style > syntax and a specific BASIC-style syntax. There's nothing that would > prevent a language to allow BASIC-style syntax for libraries (functions > with several indentifiers that separate the arguments and together form > one library call), so I don't really see the point WRT our discussion of > built-in vs library. I think I understand what you mean. But just to clarify: You are suggesting that instead of simply defining a function as: substr(char *, int); substr(char *, int, int); That I could instead write: substr(char * 'FROM' int); substr(char * 'FROM' int 'TO' int); substr(char * 'FROM' int 'LENGTH' int); Interesting. But you'd still need to be able to attach a boat load of attributes to the function to give the compiler the same capabilities it would have if these functions were actually built-in language constructs and the compile times would be horrendous. I think it would be do-able but you'd still need some kind of meta language to describe how these functions would interact with each other, their parameters and local variables that are used with them (from outside the call and not just as a parameter). > > FWIW, my point was about specific function names. You didn't use my > function names, so I don't know what you meant here. If you want to, can > you rewrite this using substr_from_to and substr_from_len, then explain > what is the difference to the BASIC variant? > for (j=0; j<len; j++) { len2 = strlen(arr[j]); arr2[j] = substr_from_len(arr[j], 0, len-2); } can you see that "len-2" should actually "len2"? 
> > >>> I wonder if what you are really saying is that the compiler can do > >>> more error checking and optimisation because it has all the source > >>> rather than pre-compiled libraries? > >> > >> What I'm saying is that if it has the intent /and/ the source (the > >> implementation), it can apply both for (usually different, and > >> complementary) optimizations. Not that different from what it can do > >> for built-in constructs. > > > > Ok, but you really are talking about intrinsics as I understand them > > with the addition of a standard library function for each intrinsic. > > We probably need a common definition of "intrinsic function". I was > talking about standard libraries that contain functions (and other > language elements) that are defined in the language standard, in the > same way as the language elements that the compiler implements. And > additionally, to give the compiler access to the specific implementation > (not only to the intent), the libraries are available to the compiler in > source code. > > The intrinsic functions that are used e.g. in gcc and VC++ are not of > this type. Most (if not all) are not even standard. An intrinsic function is one that the compiler understands intimately. Not just what parameters it takes and what result it returns. A good example of an intrinsic function is the add operator ('+'). It seems to be used differently to a user defined function because you use it in expressions as an infix operator e.g. A = B + C funcX(B + C) But this is just for convenience. If you were able to define a symbol in C, say 'ADD', with all the attributes that the '+' symbol has then '+' and 'ADD' would both behave the same way and the compiler would generate the same optimised code for both. e.g. A = B ADD C funcX(B ADD C) In C++ you can actually write your own add ('+') operator function. 
> > > >>> I talk about the brick wall between the compiler and the libraries > >>> and you respond with "make the source of the libraries available". > >> > >> No. I say consider that the library is a /standard/ library. Adding > >> the source code is in addition, so that the compiler not only > >> "knows" the intent, but also sees the implementation. > > > > got you. intrinsic + library > > I'm not so sure... :) "Intrinsic" seems to imply (possibly among other > things) that the library is written by the compiler vendor. I don't mean > to imply this. No not necessarily. If the compiler vendor were explicit as to how an intrinsic function's counterpart in the library should be written then a third party could do the work. But it does seem like a lot of trouble to go to on the part of the compiler writer. > > > >> Look at the C++ standard library definition for std::list::insert, > >> for example. It contains a definition that allows each C++ compiler > >> to "understand" what a call to std::list::insert is supposed to do. > > > > I will look at this. Can you point me at a specific doc and library > > so that I can be sure to look at exactly what you are looking at. > > Take any standardized language with a standardized library. For example, > the C++ standard (the real one costs money, but here is a not quite up > to date one <http://www.open-std.org/jtc1/sc22/open/n2356/>). And of > course, anything that's missing /could/ be there -- it just isn't. > > Or C# (ECMA-336) and the .NET CLI (ECMA-335), even though that's > possibly a bit different. Ok I've had a look on the net and it's just template stuff. As I've said before, the compiler isn't really understanding intent here it's just mechanically churning out code and reducing it as much as possible. 
It gives the illusion that it understands what is going on because so much source code is being condensed into a small executable but the reallity is that all that source code is hand holding the compiler and telling it exactly what to generate. Friendly Regards Sergio Masci -- http://www.piclist.com PIC/SX FAQ & list archive View/change your membership options at http://mailman.mit.edu/mailman/listinfo/piclist |
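For readers who do not read PIC assembler, the 16-bit addition discussed in the post above can be sketched in C. This is only an illustrative sketch: the `u16_bytes` type and both function names are made up here and do not appear in the thread. The first function mimics the byte-by-byte add with a manual carry; the second states the same intent in a single expression, which is the form a reader or a compiler can most easily recognise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: a 16-bit value stored as two bytes, low byte
   first, mirroring the X+0 / X+1 layout used in the assembler. */
typedef struct {
    uint8_t lo;
    uint8_t hi;
} u16_bytes;

/* Byte-wise y += x with a manual carry from the low byte. */
void add16_bytewise(u16_bytes *y, const u16_bytes *x)
{
    assert(y != NULL && x != NULL);

    uint8_t old_lo = y->lo;
    y->lo = (uint8_t)(y->lo + x->lo);
    if (y->lo < old_lo) {
        /* Unsigned wrap-around in the low byte means a carry. */
        y->hi = (uint8_t)(y->hi + 1);
    }
    y->hi = (uint8_t)(y->hi + x->hi);
}

/* The same intent stated directly on a 16-bit type. */
uint16_t add16_direct(uint16_t y, uint16_t x)
{
    return (uint16_t)(y + x);
}
```

Both functions wrap around on overflow of the 16-bit result, as the assembler does.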
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'
On Wed, 15 Jul 2009, Dario Greggio wrote:

> Gerhard Fiedler wrote:
>
> > Sergio's example a while ago was that since the compiler knows what
> > remove(), insert() and replace() do (because they are part of the
> > language rather than a library), it is possible for the compiler to
> > replace a sequence of remove() and insert() with an appropriate
> > replace(), making the code more efficient.
>
> Just my .0002c :) into this interesting topic:
>
> won't this make the programmer "lazier", while OTOH C lets you use your
> brain better?

You could say the same about moving from C to assembler.

But as I keep saying, the main point isn't about optimisation; it's
about understanding the intent of the programmer. The better you
understand the intent the better you can catch mistakes. It just so
happens that the better you understand the intent the better you can
ALSO optimise the generated code.

Friendly Regards
Sergio Masci
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'
sergio masci wrote:

>> (other than maybe XSCB, given Sergio's experience with it :).
>
> XCSB
>
> XCASM
>
> XCPROD

I had already remembered that it starts with X and contains the letters
B, C, S and X. Then I could remember that it ends with B. Now I
hopefully will remember that it starts with XC -- and will never again
spell it wrong :)

Greetings to the left side of your desk :)

Gerhard
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'
sergio masci wrote:

>>> It is now clear to me that you are talking about intrinsic
>>> functions. Yes the function is defined in a /standard/ library but
>>> the compiler also knows about the function independently of the
>>> library.
>>
>> Would you call the C++ standard library functions (like
>> std::list::insert) "intrinsic functions"? At least the meaning that
>> this term seems to have in the Microsoft VC++ and gcc compiler
>> documentation is not what I'm talking about.
>
> No. But this is what I'm talking about.

So I was right with my suspicion that I wasn't talking about what you
called "intrinsics".

> home in on "'knows' the intent of the line"

This is what I'm trying to do. This is the reason why I want to
understand what it is that makes a function that is defined by the
standard (and is implemented in a library) different from a construct
that is defined in the same standard and implemented inside the
compiler. I haven't yet seen an example by you that I could understand
-- and that wasn't about something else.

[Snipped string loop optimization example.]

> So here the compiler looked at several statements as a single unit and
> was able to optimise it (this is quite within the capabilities of
> modern C compilers).

Or other languages... something like this seems to be standard in good
compilers.

> It is far easier for you, me and the compiler to understand the
> programmer's intent if instead I write:
>
>     Y = Y + X;

Agreed.

>> Right. My point was that if 8-bit strings are built-in and Unicode
>> strings are in a library, and you are claiming (elsewhere) that the
>> built-in syntax can be different from library syntax, then I need to
>> make the Unicode string syntax completely different -- structurally
>> different -- from 8-bit string syntax. Can you imagine that? What a
>> pain.
>
> Yes you are right it would be a pain. But it is a move in the right
> direction.

I'm not sure. I think I wouldn't like it. For example, it seems that
for some things Pascal-style strings are more efficient than C-style
strings. There's nothing that prevents me from working with
Pascal-style strings efficiently in C++ (and no matter whether ASCII,
8-bit with different codepages, different encodings of Unicode) -- in
the same idiom that is used for standard strings in C++. This is
because strings are /not/ built into the language (among other things).

Being able to work with similar types in a similar way is important for
code quality. If you need to use a different idiom for different string
encodings, code quality goes down -- and code quality is important.

> I think I understand what you mean. But just to clarify: You are
> suggesting that instead of simply defining a function as:
>
>     substr(char *, int);
>     substr(char *, int, int);
>
> That I could instead write:
>
>     substr(char * 'FROM' int);
>     substr(char * 'FROM' int 'TO' int);
>     substr(char * 'FROM' int 'LENGTH' int);
>
> Interesting.

That's one possible form, but it's not quite what I meant. What I meant
is that you have a way to declare the function and syntax so that you
actually can write

    SUBSTR str1 FROM pos1 TO end

and the compiler knows what library to call in which way. This doesn't
look too complicated, and it's not actually that different from

    SUBSTR_FROM_TO str1, pos1, end

Just a slightly different syntax.

> But you'd still need to be able to attach a boatload of attributes to
> the function to give the compiler the same capabilities it would have
> if these functions were actually built-in language constructs, and the
> compile times would be horrendous.

I don't understand why. This doesn't seem to be much more complicated
to parse than a normal function call. It's a starting token SUBSTR,
followed by five tokens that have to be three expressions interspersed
with FROM and TO.

> I think it would be do-able but you'd still need some kind of meta
> language to describe how these functions would interact with each
> other, their parameters and local variables that are used with them
> (from outside the call and not just as a parameter).

I'm not sure what you mean by "interaction". The interaction of
functions is defined by the language standard that defines what they
do. Their arguments would be described just as they are in any
procedural language.

You'd need a bit of a meta language to define such a construct, but not
much I think. I don't think this goes much further than a normal
function declaration; just add to the "<typename> <identifier>" pairs a
possible pair "KEYWORD <identifier>".

> An intrinsic function is one that the compiler understands intimately.
> Not just what parameters it takes and what result it returns.

Next step is to define what "understands intimately" means. Does a
specification (like in a language standard) qualify?

> In C++ you can actually write your own add ('+') operator function.

Which would be different from the standard operator+() functions, in
that the compiler doesn't "know" what they are supposed to be doing and
has to treat them as normal function calls (in the case of C++ with the
special rules that the language standard defines for operator+(), of
course).

>> Take any standardized language with a standardized library. For
>> example, the C++ standard (the real one costs money, but here is a
>> not quite up to date one <http://www.open-std.org/jtc1/sc22/open/n2356/>).
>> And of course, anything that's missing /could/ be there -- it just
>> isn't.
>>
>> Or C# (ECMA-334) and the .NET CLI (ECMA-335), even though that's
>> possibly a bit different.
>
> Ok, I've had a look on the net and it's just template stuff. As I've
> said before, the compiler isn't really understanding intent here; it's
> just mechanically churning out code and reducing it as much as
> possible. It gives the illusion that it understands what is going on
> because so much source code is being condensed into a small
> executable, but the reality is that all that source code is hand
> holding the compiler and telling it exactly what to generate.

Not sure whether the standard can tell you what kind of optimizations
actual compilers implement. Much of this is probably a trade secret.
Then I'm not convinced a casual look at this is enough to find out what
/could/ be implemented. They don't talk about optimizations in the
standard; they just say what has to be the result.

Going back to one of your original arguments, the one that prompted my
question about what the difference is between list operations that are
implemented by the compiler and list operations that are implemented in
a standard library: the substitution of a delete from a list followed
by an insert into a list at the same location by a replace. If you want
to implement such an optimization in your compiler, what's the
difference between having these functions as part of the compiler, or
having them defined as standard library functions in the standard that
the compiler is based upon? In both cases, the compiler "knows" that
this substitution is possible, and the exact implementation of the
three functions does not need to be known for this optimization to be
possible.

Gerhard
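The delete/insert/replace argument can be made concrete with a small C sketch. Everything in it is hypothetical: `int_list`, `list_remove`, `list_insert` and `list_replace` are invented for illustration and are not taken from any language standard or from XCSB. It only shows the equivalence that both sides of the thread rely on, namely that a remove followed by an insert at the same position leaves the list in the same state as a single replace, while the replace avoids shifting the tail twice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LIST_CAP 16

/* Hypothetical fixed-capacity list of ints. */
typedef struct {
    int    items[LIST_CAP];
    size_t count;
} int_list;

/* Remove the element at pos, shifting the tail down by one. */
void list_remove(int_list *l, size_t pos)
{
    assert(pos < l->count);
    memmove(&l->items[pos], &l->items[pos + 1],
            (l->count - pos - 1) * sizeof l->items[0]);
    l->count--;
}

/* Insert value at pos, shifting the tail up by one. */
void list_insert(int_list *l, size_t pos, int value)
{
    assert(pos <= l->count && l->count < LIST_CAP);
    memmove(&l->items[pos + 1], &l->items[pos],
            (l->count - pos) * sizeof l->items[0]);
    l->items[pos] = value;
    l->count++;
}

/* Overwrite the element at pos in place: no shifting at all. */
void list_replace(int_list *l, size_t pos, int value)
{
    assert(pos < l->count);
    l->items[pos] = value;
}
```

A tool that knows the documented meaning of these three calls could rewrite `list_remove(&l, i); list_insert(&l, i, v);` as `list_replace(&l, i, v);` without looking at their bodies. Whether that knowledge comes from the compiler itself or from a specification of the library is exactly the point under debate.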
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'
Olin Lathrop wrote:

> It really has been confusing trying to figure out what exactly your
> point is.

Sergio's point is that there is a really important difference WRT
optimizations between constructs that are implemented in the compiler
and constructs that are implemented in a standard library (that is
defined in the same language standard). My point is that I don't see a
difference.

His example with respect to this was that in the case that the compiler
implements list handling, it is able to "know" that a 'delete' followed
by an 'insert' into the same location can be replaced by a 'replace' as
optimization. He talks about 'intent', and that the list handling being
implemented by the compiler allows it to better grasp the programmer's
intent.

My point is that if the list handling functions 'delete', 'insert' and
'replace' are defined in the language standard that the compiler
implements, it is able to detect that intent in the same way and make
this optimization, no matter whether the functions themselves are
implemented in the compiler or in a (standard) library. The important
thing here is that they are defined by the language standard.

Most of the rest of the thread goes back to this; this is fundamental
to keep in mind as the issue.

>> Not quite. Firstly, I was talking about /standard/ libraries, where
>> the compiler also "knows" what the functions are supposed to do
>> (because it's defined in the same place where is defined what the
>> compiler itself is supposed to do).
>
> But that's exactly what intrinsic functions are, which you strongly
> claimed you weren't talking about when I suggested that's what you
> might mean. Now I am (and I think Sergio too) really confused. How
> is what you mean not intrinsic functions?

For example, in C++ there are list handling functions defined in the
standard (e.g. std::list::insert). This is what I mean, this is what I
wrote. I have never heard anybody refer to this as "intrinsic
function". The intrinsic functions that are called so by Microsoft (in
VC++) and the writers of gcc are typically /not/ part of the language
standard, and therefore something completely different from what I'm
talking about.

> Note that you're no worse off if it doesn't support the character
> representation you like.

Yes, I am -- if the string library doesn't use the same syntax that the
built-in strings use. Consistency is important for code quality, and
having to use different code constructs and functions for doing the
same things with different types of strings lowers consistency and
therefore code quality.

For example, you may be able to use "string1 + string2" for
concatenating ASCII strings, but have to use concat(string1,string2)
for Unicode strings or 8-bit strings that handle codepage translations.
I have to repeat myself: what a pain.

> The reverse can also be a pain, like Java where everything is one of
> those unicode thingies.

Exactly. For flexibility, it is important to be able to choose -- which
speaks against being built-in. (Not necessarily, but as a tendency.)

> Maybe that's nice if you want to appease some illiterate in a distant
> jungle somewhere,

Now this is a completely unnecessary and stupid remark. What has using
a character set that can't be represented in ASCII to do with
illiteracy or distant jungles?

> So it sounds like you want a bunch of intrinsic functions (which
> represent a large amount of compiler work), but have any optimization
> defeated and your own function called if you define one?

I don't "want" any of this. You need to understand the context of this
argument, and place what I write into this context. No affirmation here
stands alone.

But yes, this would be (and is) typically possible with standard
functions that are implemented in a library. No amount of optimization
in the compiler is guaranteed to address all cases, so being able to
use a custom implementation is possibly one of the reasons for the
popularity of C in the small micro market.

>> It's the number of symbols and their structure that makes a feature
>> set "over inflated". IMO it doesn't make a difference whether you
>> have 5000 symbols in a library and people think it's too much or
>> whether you have 5000 symbols in a compiler and people think it's
>> too much.
>
> I think part of the point is that certain features, like the string
> handling Sergio described, require a lot less special syntax when
> built in than they would require special functions if "built in" that
> way.

Can you make up an example here? I think it's the other way round: if
it's built into the language, there tends to be a special syntax,
whereas if it's part of a standard library, it tends to use the
standard syntax that's already there.

Gerhard
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'
Olin Lathrop wrote:

>> Right. My point was that if 8-bit strings are built-in and Unicode
>> strings are in a library, and you are claiming (elsewhere) that the
>> built-in syntax can be different from library syntax, then I need to
>> make the Unicode string syntax completely different -- structurally
>> different -- from 8-bit string syntax. Can you imagine that? What a
>> pain.
>
> Maybe not. If each type of string is a different data type, then the
> syntax can be simple and the compiler could even convert from one
> representation to another if this can be done unambiguously.

Correct. But this was in the context of the various claims that there
is an essential difference between built into the compiler and in a
library. Both you and Sergio said at one point that the syntax could be
different for the constructs built into the compiler than what's
possible for features implemented in a (standard) library.

Here you're saying that the syntax can be the same. I'm with you on
this... (FWIW, what you describe is how strings are implemented in
C++.) But then you need to make sure that the syntax for the constructs
built into the compiler is not different from the syntax that's
possible for constructs in a (user) library. Can't have it both ways.

Gerhard
|
|
Re: [TECH] language - was [PIC] using BREAK in 'C'
Olin Lathrop wrote:

>>> I think part of the point is that certain features, like the string
>>> handling Sergio described, require a lot less special syntax when
>>> built in than they would require special functions if "built in"
>>> that way.
>>
>> Can you make up an example here? I think it's the other way round:
>> if it's built into the language, there tends to be a special syntax,
>> whereas if it's part of a standard library, it tends to use the
>> standard syntax that's already there.
>
> I did provide some examples that you snipped. Using, for example,
> the "+" operator for string concatenation is no additional syntax at
> all. It can also handle various cases of strings of different types
> (ASCII, unicode, whatever) or whether they are characters or strings
> if the language makes such a distinction. Unless you allow for
> function overloading, you'd have a large mess of function names for
> all the possible combinations. Even without overloading you probably
> have a few for different special cases for efficiency.
>
> Using the operator in this example, it is easier for the compiler to
> "see" what is going on and deal with the combinations of string types
> and special cases itself. You could in theory define a generalized
> string concatenation function that the compiler understands as a
> special case, but that's really the same thing with a different
> syntax, more work for the compiler writer, and less readable code.
>
> I'd rather write
>
>     strb <-- stra + stru + " more stuff"
>
> than
>
>     strb <-- StringConcatenate(stra, stru, " more stuff");
>
> and I think Sergio is saying that from a compiler writer's point of
> view he'd prefer the first also.

So do I, but I fail to understand why this requires that the string
handling is built into the compiler. Your preferred syntax looks (in
principle) quite similar to the C++ std::string syntax, where normal
string handling is implemented in a standard library.

Gerhard
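As a rough illustration that a uniform call syntax across string types does not have to be built into the compiler, here is a C11 sketch using `_Generic`. The names `concat`, `concat_narrow` and `concat_wide` are assumptions made for this example only; they are not from the thread or from any standard library. C has no operator overloading, so this cannot give the `stra + stru` spelling that C++ `std::string` allows, but one `concat(a, b)` spelling serves both narrow and wide strings, and the dispatch is done entirely by library-level code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Concatenate two narrow strings into a freshly allocated buffer.
   Returns NULL on allocation failure; the caller frees the result. */
char *concat_narrow(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    size_t la = strlen(a), lb = strlen(b);
    char *r = malloc(la + lb + 1);
    if (r != NULL) {
        memcpy(r, a, la);
        memcpy(r + la, b, lb + 1);    /* also copies the terminator */
    }
    return r;
}

/* Same operation for wide strings. */
wchar_t *concat_wide(const wchar_t *a, const wchar_t *b)
{
    assert(a != NULL && b != NULL);
    size_t la = wcslen(a), lb = wcslen(b);
    wchar_t *r = malloc((la + lb + 1) * sizeof *r);
    if (r != NULL) {
        wmemcpy(r, a, la);
        wmemcpy(r + la, b, lb + 1);   /* also copies the terminator */
    }
    return r;
}

/* One spelling for both string types, selected on the type of the
   first argument. Mixing narrow and wide arguments is not handled. */
#define concat(a, b) _Generic((a),          \
        char *:          concat_narrow,     \
        const char *:    concat_narrow,     \
        wchar_t *:       concat_wide,       \
        const wchar_t *: concat_wide)((a), (b))
```

With pointer variables `const char *s1, *s2` and `const wchar_t *w1, *w2`, both `concat(s1, s2)` and `concat(w1, w2)` compile and return a newly allocated string of the matching type.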