Function Address fixup missing?

Dear Sirs,

I have a problem that, after much study of the disassembly and the map files, I cannot explain other than as a compiler or linker bug. The problem appears in a big program. I tried to make a smaller version that exposes it, without success: every smaller version seems to work. So I am sorry that I cannot post the full program here, but I will try to give all the needed information, hoping for some help. I can send more information, or all the code, if necessary. Thank you in advance for any help or workaround.

I have a program for the AtMega2561 that is larger than 128K and that makes use of function pointers. The compiler is WinAvr as of March 13, 2009.

As I understand it, when a function body is in the upper 128K and the address of this function is taken, the compiler (or linker) generates a small stub in the "trampoline" area in lower memory that contains a jump to the function body. The "address of" operator then returns the address of this stub, rather than the address of the function itself. This way an indirect jump through the pointer (only 16 bits) ends up, since the EIND register is always zero, at the stub, which in turn makes a full jump to the function.

If this is correct (and I have verified that this is the case with smaller programs), then I don't understand the following results.

I have a function in the upper 128K:

    int ButtamiViaSubito(void) {
        return 3;
    }

In main (in lower memory) I take the address and then call the function:

    extern int ButtamiViaSubito(void);
    typedef int (*PuntaAButtami)(void);
    PuntaAButtami puntatore;

    int main(void)
    {
        ...
        puntatore = ButtamiViaSubito;   // (1)
        ...
        int butta = puntatore();        // (2)
        ...
    }

In the disassembly of the full program (from the ELF file) I find, in the trampoline area:

    00003e50 <__trampolines_start>:
        3e50: 0d 94 45 07   jmp 0x20e8a ; 0x20e8a <test+0x9e>

which looks correct: the function is at 20e8a.

Instruction (1) is:

    puntatore = ButtamiViaSubito;
        3efc: 80 e0         ldi r24, 0x00 ; 0
        3efe: 90 e0         ldi r25, 0x00 ; 0
        3f00: 90 93 82 0b   sts 0x0B82, r25
        3f04: 80 93 81 0b   sts 0x0B81, r24

which looks wrong. The first two ldi should load 1F28, which is the word address of the trampoline (byte address 3E50 divided by 2). It looks as if this fixup is not filled in by the linker, leaving the value at zero. The pointer is in RAM at B81.

Instruction (2), by contrast, looks correct:

    int butta = puntatore();
        3ffe: e0 91 81 0b   lds r30, 0x0B81
        4002: f0 91 82 0b   lds r31, 0x0B82
        4006: 19 95         eicall

The same seems to happen within the library around the fputc function, which I assume uses function pointers.

Thanks again for any help.
Regards.
Mau.

_______________________________________________
AVR-GCC-list mailing list
AVR-GCC-list@...
http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
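The value that the two ldi instructions ought to carry can be checked on a host machine. The following is only a minimal C sketch, not part of the original post: it assumes that avr-objdump prints byte addresses while the AVR program counter (and therefore a function pointer) holds word addresses. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a flash byte address (as printed by avr-objdump) into the
   word address that the AVR program counter and function pointers use. */
static uint32_t byte_to_word_addr(uint32_t byte_addr)
{
    assert((byte_addr & 1u) == 0);  /* instructions are word aligned */
    return byte_addr >> 1;
}

/* Low byte of the word address: the operand expected in "ldi r24, ..." */
static uint8_t ldi_low(uint32_t byte_addr)
{
    return (uint8_t)(byte_to_word_addr(byte_addr) & 0xFFu);
}

/* High byte of the word address: the operand expected in "ldi r25, ..." */
static uint8_t ldi_high(uint32_t byte_addr)
{
    return (uint8_t)((byte_to_word_addr(byte_addr) >> 8) & 0xFFu);
}
```

For the trampoline at byte address 0x3e50, byte_to_word_addr gives 0x1f28, so a correctly fixed-up pair would read ldi r24, 0x28 and ldi r25, 0x1f rather than two zeros.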
RE: Function Address fixup missing?

> I have a program for the AtMega2561 that is larger than 128K
> and that makes use of function pointers. The compiler is
> WinAvr as of March 13, 2009.
>
> As I understand it, when a function body is in the upper 128K
> and the address of this function is taken, the compiler (or
> linker) generates a small stub in the "trampoline" area in
> lower memory that contains a jump to the function body.
> The "address of" operator then returns the address of this
> stub, rather than the address of the function itself.
> This way an indirect jump through the pointer (only 16 bits)
> ends up, since the EIND register is always zero, at the stub,
> which in turn makes a full jump to the function.

Trampolines work only for statically linked functions, not function pointers. This is a known bug and cannot easily be fixed.

The root of the problem is that GCC's architecture does not lend itself to 24-bit entities. As you noted, function pointers are 16 bits, which limits the range to 128K of flash (since AVR instructions are a multiple of 2 bytes wide, the program counter addresses 2-byte words).

GCC could be modified to run with 32-bit (4-byte) pointers, but the ATTiny folks would yell like mashed cats if that were done universally. I suppose that a switch could be added telling GCC whether to use 16-bit or 32-bit pointers, but then the avr-libc library would need to be compiled both ways, leading to the problem of making sure that the correct library is linked with the correct compiled source.

All of the above could be done. However, the GCC team are notoriously low on volunteers. Interested in taking this project on? I thought not.

For the moment, the only solution is to place the targets of all function pointers in the bottom part of the ATMega2560/1's flash. If you check my post on FreeRTOS for the ATmega2560/1 on AVR Freaks (http://www.avrfreaks.net/index.php?name=PNphpBB2&file=viewtopic&t=70387) you will find that I describe how to place function pointer targets in low flash.

I should point out that there *is* another fix for the problem: I understand that IAR and CodeVision have compilers capable of handling the larger ATMega series. Oh wait, those cost *money*. Well, Cheap Fast Good - choose two.

Best regards,
Stu Bell
DataPlay (DPHI, Inc.)
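The 128K figure follows directly from a 16-bit pointer that counts 2-byte words. As a small host-side C sketch (the macro and function names are made up for illustration, and the addresses are the byte addresses quoted earlier in the thread):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WORD_BYTES        UINT32_C(2)      /* the program counter counts 16-bit words */
#define PTR16_WORDS       UINT32_C(65536)  /* values a 16-bit pointer can hold */
#define PTR16_REACH_BYTES (PTR16_WORDS * WORD_BYTES)

/* 65536 words * 2 bytes = 131072 bytes = 128 KiB */
static_assert(PTR16_REACH_BYTES == UINT32_C(131072), "16-bit word pointer reaches 128 KiB");

/* True if a function at this flash byte address can be named by a
   16-bit word pointer without help from EIND or a trampoline. */
static bool reachable_with_16bit_pointer(uint32_t byte_addr)
{
    return byte_addr < PTR16_REACH_BYTES;
}
```

With these definitions the trampoline at 0x3e50 is reachable, while the function body at 0x20e8a is not, which is why its address cannot be stored directly in a 16-bit pointer.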
Re: Function Address fixup missing?

Stu Bell wrote:
> Trampolines work only for statically linked functions, not function
> pointers.

Sorry, but I don't understand this. I believe that there is no problem with statically linked functions. GCC actually generates a call with the full address, and no 16-bit pointers are involved, as in the following disassembly:

    butta = ButtamiViaSubito();
        4014: 0f 94 f5 06   call 0x20dea ; 0x20dea

Here I am calling the same function, but directly, and GCC generates the correct code for the AtMega2561.

Actually, as I said in my previous mail, I tried a small program where it seems to work. Here is the main:

    typedef void (*PPro)(void);
    PPro ppro1;
    extern void pro1(void) __attribute__ ((section ("spro1")));

    int main(void)
    {
        ppro1 = pro1;
        pro1();
        ppro1();
        while(1)
            ;
        return(0);
    }

Here is the function:

    __attribute__((section ("spro1"))) void pro1(void)
    {
        return;
    }

In the linker flags I added:

    -section-start=spro1=0x21000

to place the routine in the upper 128K, and the relevant disassembly is:

    ...
    000000cc <__trampolines_start>:
          cc: 0d 94 00 08   jmp 0x21000 ; 0x21000 <pro1>
    ...
    ppro1 = pro1;
         116: 86 e6         ldi r24, 0x66 ; 102
         118: 90 e0         ldi r25, 0x00 ; 0
         11a: 90 93 01 02   sts 0x0201, r25
         11e: 80 93 00 02   sts 0x0200, r24
    ...
    ppro1();
         126: e0 91 00 02   lds r30, 0x0200
         12a: f0 91 01 02   lds r31, 0x0201
         12e: 19 95         eicall
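The jmp and call lines in these listings can be cross-checked by decoding the four opcode bytes by hand. The sketch below is a host-side C helper written for this purpose, not something from the toolchain; it assumes the standard AVR encoding of the 32-bit JMP and CALL instructions (1001 010k kkkk 11xk followed by 16 more address bits, with k a word address), and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Decode the target byte address of a 32-bit AVR JMP or CALL.
   b[0..3] are the opcode bytes in the order avr-objdump prints them
   (two little-endian 16-bit words). */
static uint32_t avr_long_jump_target(const uint8_t b[4])
{
    uint16_t w1 = (uint16_t)(b[0] | (b[1] << 8));
    uint16_t w2 = (uint16_t)(b[2] | (b[3] << 8));

    /* Fixed bits shared by JMP (...110k) and CALL (...111k) */
    assert((w1 & 0xFE0Cu) == 0x940Cu);

    uint32_t k = ((uint32_t)((w1 >> 4) & 0x1Fu) << 17)  /* k[21:17] */
               | ((uint32_t)(w1 & 1u) << 16)            /* k[16]    */
               | w2;                                    /* k[15:0]  */
    return k << 1;  /* word address -> byte address */
}
```

Feeding it the bytes 0d 94 00 08 from the small program yields 0x21000, and 0f 94 f5 06 from the direct call yields 0x20dea, matching the disassembler's output in both cases.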
RE: Function Address fixup missing?

> Stu Bell wrote:
> > Trampolines work only for statically linked functions, not function
> > pointers.
> Sorry, but I don't understand this.
> I believe that there is no problem with statically linked functions.

And that's what I said. There is not a problem with statically linked functions.

In your first email, you wrote:

> I have a function in upper 128K:
> int ButtamiViaSubito(void) {
>     return 3;
> }
>
> In the main (in lower memory) i take the address and then I call the
> function:
>
> extern int ButtamiViaSubito(void);
> typedef int (*PuntaAButtami)(void);
> PuntaAButtami puntatore;
>
> int main(void)
> {
>     ...
>     puntatore = ButtamiViaSubito; // (1)
>     ...
>     int butta = puntatore(); // (2)
>     ...
> }

At this point, "puntatore" is a function pointer. Unless the GCC gods disagree with me (in which case I have been doing a lot of work for nothing for the last 2 years), GCC only understands puntatore as a 16-bit entity. Further, it will *not* use the trampoline for this call.

What sayeth thee, oh GCC gods? :-)

In fact, given that the trampoline is not used for static calls to upper flash, I am also confused as to why it is not used for function pointers. I suspect the problem is that if a function pointer is used in a routine in upper flash, the 16-bit call would go to the wrong place. So, an EICALL must be generated to go to the trampoline, and EIND must be set correctly for the call to work.

On the other hand, if the compiler is generating a call to the trampoline, which it *knows* is in lower flash, EIND can always be forced to 0 before the call. But then the code needs to be smart enough to reset EIND to what it was before the function pointer call. That means that *every* function pointer call would need to generate instructions to save EIND before the call and more to restore its state after the call. This would need to be done because the compiler (which generates the instructions) has no idea where the eventual location of the code will be, so it must plan for the worst.

Generation of these instructions would need to be architecture-dependent, since the owners of ATTinys would be pissed if the compiler added instructions for a different architecture that are completely worthless to them.

Sounds like this is a job for a volunteer. Would you like the job, Mau?

Again, as I said in my first post, you *must* place *all* targets of function pointers (in this case, the function ButtamiViaSubito()) in lower flash. There is currently no other solution in GCC. Trust me, I've looked for one.

Again, as I said in my first post, if you look in http://www.avrfreaks.net/index.php?name=PNphpBB2&file=viewtopic&t=70387 you will find that I describe exactly how to place all of these function pointer targets in lower flash. It isn't hard and if you choose you can steal (err, leverage, yeah, that's it, leverage!) my work directly.

I will add one more gotcha here -- I've noticed that ISRs also work "better" when in lower flash. Theoretically this is not needed, but I suspect that the compiler makes some assumptions about upper versus lower flash register states (specifically, EIND) that do not hold when an ISR is in upper flash.

Sorry about the long reply, but I've spent time fighting this issue and the results are, well, complicated.

Best regards,
Stu Bell
DataPlay (DPHI, Inc.)
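The save, force-to-zero, and restore sequence for EIND described in this message can be pictured with a tiny host-side model. This is purely illustrative C, not code that avr-gcc emits; the Cpu struct and both function names are invented for the sketch, and the only hardware behaviour assumed is that EICALL jumps to the word address formed by EIND (upper bits) and the Z register (lower 16 bits).

```c
#include <assert.h>
#include <stdint.h>

/* Just enough CPU state to talk about EICALL */
typedef struct {
    uint8_t  eind;  /* extended indirect register (upper address bits) */
    uint16_t z;     /* Z pointer, r31:r30 */
} Cpu;

/* Word address that an EICALL would jump to: EIND:Z */
static uint32_t eicall_target_word(const Cpu *cpu)
{
    assert(cpu);
    return ((uint32_t)cpu->eind << 16) | cpu->z;
}

/* Call through a 16-bit pointer to a trampoline known to sit in low
   flash: save EIND, force it to 0 for the call, then restore it.
   Returns the word address reached by the EICALL. */
static uint32_t call_via_low_trampoline(Cpu *cpu, uint16_t ptr)
{
    assert(cpu);
    uint8_t saved = cpu->eind;  /* save */
    cpu->eind = 0;              /* trampolines live below 128K */
    cpu->z = ptr;
    uint32_t target = eicall_target_word(cpu);
    cpu->eind = saved;          /* restore for the caller */
    return target;
}
```

In this model, a bare EICALL with EIND left at 1 and Z = 0x1f28 would land at word 0x11f28 instead of the trampoline at 0x1f28, whereas the wrapped call reaches 0x1f28 and leaves EIND as it found it.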
Re: Function Address fixup missing?

Stu Bell wrote:
>> Stu Bell wrote:
>>> Trampolines work only for statically linked functions, not function
>>> pointers.
>> Sorry, but I don't understand this.
>> I believe that there is no problem with statically linked functions.
>
> And that's what I said. There is not a problem with statically linked
> functions.

Sorry, I meant that the trampolines have nothing to do with direct calls. As I understand it, the AtMega core has assembly instructions for direct call and jump with the full 24-bit address within the instruction code. So neither a trampoline nor an extension register like EIND is needed. I also believe that GCC is able to generate correct code for direct calls and jumps anywhere.

> At this point, "puntatore" is a function pointer. Unless the GCC gods
> disagree with me (in which case I have been doing a lot of work for
> nothing for the last 2 years), GCC only understands puntatore as a
> 16-bit entity. Further, it will *not* use the trampoline for this call.

I don't agree with this, but maybe some GCC guru can clarify further. I'll try to explain, following the code I showed in my last mail.

I believe that GCC assumes two things:

- Register EIND is always zero, and nobody ever touches it
- Trampolines are always in lower flash (<128K)

With these assumptions it is possible to use 16-bit pointers to functions (the default pointers in AVR-GCC) to reach functions anywhere. I'll try to explain what I have understood, and please tell me if and where I am wrong.

When the code asks for the address of a function which is above 128K, this is what I believe happens:

1) A trampoline in lower memory is generated. This trampoline contains a single jump to the full 24-bit address of the function.
2) The address of the trampoline is returned instead. This is important so I repeat: "The address of the trampoline is returned instead". This address is still a full 24-bit address, but since the trampoline is in lower memory, the upper 8 bits are zero. So this pointer is "safely" stored into a 16-bit variable, without losing information.

When, in the code, there is a call to the function through its pointer (stored in a 16-bit variable), this is what I think happens:

1) GCC generates the load of the pointer contents, which is the low part of the trampoline address, and also generates an eicall instruction.
2) When this code executes, during the eicall processing, the EIND register is used to extend the 16-bit address, but EIND is zero, so the 24-bit address resulting from this concatenation is the 24-bit address of the trampoline.
3) The eicall places a 24-bit return address on the stack, and the PC is loaded with the address of the trampoline, so execution continues there.
4) The trampoline contains the full address of the function, so the PC is again loaded, this time with the correct function address (and the EIND register is not involved anymore).
5) When the function returns, it finds the full return address on the stack, so it returns to the point after the eicall.

This magic works, in my understanding, without any penalty for code below 128K, and with only a small penalty in time and space, and only for functions accessed through a pointer. And all this using only 16-bit pointers!

Or maybe I am completely wrong, but in my support there is the small program that I sent to the list in my previous mail, which works exactly as I described.

My original question was about a possible implementation bug, while you were trying to explain to me that the whole trampoline mechanism does not work because of a design problem (16-bit pointers and so on). Can you comment on this?

As a last comment, I can place all my functions that need a pointer in lower memory, and indeed this was my first attempt. Unfortunately the library code is always placed last, so above 128K, and functions like fputc do use function pointers (I assume to manage open file descriptors), so I need either the trampoline mechanism to work or some other workaround.

Thanks all.
Mau.
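The five call steps listed above can be traced with a toy model on a host machine. The C below is only a sketch of that description, not a simulator of the real core: the Core and Trampoline types and all function names are invented, the return stack is simplified to an array of word addresses, and the numbers in the usage note are taken from the small test program earlier in the thread (trampoline at byte 0xcc, eicall at byte 0x12e, pro1 at byte 0x21000).

```c
#include <assert.h>
#include <stdint.h>

#define STACK_DEPTH 8

typedef struct {
    uint8_t  eind;                /* assumed to stay 0 at all times */
    uint32_t pc;                  /* program counter, in words */
    uint32_t stack[STACK_DEPTH];  /* simplified return-address stack */
    int      sp;
} Core;

typedef struct {
    uint32_t at_word;      /* where the stub sits (below 128K) */
    uint32_t jmp_to_word;  /* full address of the real function */
} Trampoline;

/* Steps 2 and 3: push the return address, then PC = EIND:Z */
static void eicall(Core *c, uint16_t z)
{
    assert(c->sp < STACK_DEPTH);
    c->stack[c->sp++] = c->pc + 1;  /* eicall is a one-word instruction */
    c->pc = ((uint32_t)c->eind << 16) | z;
}

/* Step 4: the stub's jmp carries the full address, EIND is not used */
static void run_trampoline(Core *c, const Trampoline *t)
{
    if (c->pc == t->at_word)
        c->pc = t->jmp_to_word;
}

/* Step 5: return to the instruction after the eicall */
static void ret(Core *c)
{
    assert(c->sp > 0);
    c->pc = c->stack[--c->sp];
}
```

Starting with pc = 0x97 (byte 0x12e) and a pointer value of 0x66, eicall moves pc to 0x66, run_trampoline moves it to 0x10800 (byte 0x21000), and ret brings it back to 0x98, the word after the eicall.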
Re: Function Address fixup missing?

Stu Bell wrote:
> The root of the problem is that GCC's architecture does not lend itself
> to 24-bit entities. As you noted, function pointers are 16 bits, which
> limits the range to 128K of flash (since all AVR instructions are 2
> bytes wide, the program counter addresses 2 byte words).
>
> GCC could be modified to run with 32-bit (4 byte) pointers, but the
> ATTiny folks would yell like mashed cats if that were done universally.
> I suppose that a switch could be placed telling GCC whether to use 16
> bit or 32 bit pointers, but then the avr-libc library would need to be
> compiled both ways leading to problems of making sure that the correct
> library is linked with the correct compiled source.

The multilib would be no problem, and neither would introducing a new compiler option.

The hard part would be reworking the backend. At the moment, Pmode is HImode. You would have to set Pmode to SImode, but note that gcc knows just /one/ pointer mode. So both data and code pointer handling would use 32-bit arithmetic, loads and stores. Believe me, this would certainly trigger a flood of "optimization regression" bugs from the same folks who requested such a feature...

The E-registers are sometimes changed by the hardware, and sometimes not. The E-regs are not GPRs, yet they would need handling as if they were R32. Any pointer arithmetic would involve clobber registers to hold and modify the E-part of the address.

You really want that?

Georg-Johann