Watchdog Debugging

I was thinking the other day that it would be useful to be able to determine which thread or interrupt caused a watchdog reset. We have done this in our offices before by toggling some output lines after every Yield, Sleep, EventWait, EventPost, fflush, and other OS functions that yield. I have even recompiled the OS to do the same thing so that I could trace it down well enough.

One idea that crossed my mind would be very simple to implement across the OS and user code. If you could assign a specific piece of memory (say 4 bytes of high heap memory) to keep thread flags, then upon reboot your program could detect a watchdog reboot and report the 4 bytes back to the user. The only thing that really keeps this from being super simple to implement is that it requires making sure the heap manager (NutHeapAlloc / malloc) NEVER uses this area of memory. I'm sure we could rewrite the DEBUG version of the OS to do that!

Any thoughts or improvements out there in the ether?

Another thought from looking through the OS: the _putf function in crt/putf.c should probably have a pointer to a structure that includes all of the variables that it declares. It could dynamically allocate and free the memory needed for this structure. This is the part of fprintf that "explodes" your memory usage and causes oh so many stack overflows.

Thanks,
Tim DeBaillie
_______________________________________________
http://lists.egnite.de/mailman/listinfo/en-nut-discussion
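A minimal sketch of the 4-byte idea, written as plain C so it can be tried on a host. Every name here (wd_trace_t, wd_trace_set, wd_trace_last, the magic value) is hypothetical and not part of Nut/OS, and the was_watchdog_reset argument stands in for whatever reset-cause check the target offers. The complement byte is an added assumption: it makes it less likely that random power-up RAM is mistaken for a real record.

```c
#include <stdint.h>

/* Hypothetical names throughout; none of this is Nut/OS API. */
#define WD_TRACE_MAGIC 0xA5u  /* marks the record as written by this code */
#define WD_CTX_ISR     0x80u  /* ORed into ctx while an interrupt handler runs */

/* The 4 bytes from the post. On a target they would have to live at an
 * address that neither the startup code nor the heap manager writes to. */
typedef struct {
    uint8_t magic;  /* WD_TRACE_MAGIC when the record is valid */
    uint8_t ctx;    /* thread id (0..127), optionally ORed with WD_CTX_ISR */
    uint8_t check;  /* bitwise complement of ctx */
    uint8_t spare;  /* unused */
} wd_trace_t;

/* Record the context that is about to run; call at every switch. */
static void wd_trace_set(volatile wd_trace_t *t, uint8_t ctx)
{
    t->magic = WD_TRACE_MAGIC;
    t->ctx = ctx;
    t->check = (uint8_t)~ctx;
}

/* After a reset: return the last recorded context, or -1 if the reset
 * was not a watchdog reset or the record does not look plausible. */
static int wd_trace_last(const volatile wd_trace_t *t, int was_watchdog_reset)
{
    uint8_t ctx = t->ctx;

    if (!was_watchdog_reset)
        return -1;
    if (t->magic != WD_TRACE_MAGIC || t->check != (uint8_t)~ctx)
        return -1;
    return ctx;
}
```

On a real board the scheduler would call wd_trace_set() on each context switch, and early startup code would call wd_trace_last() before anything else touches that memory.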
Re: Watchdog Debugging

I don't see why you would consider this memory area to be on the heap in the first place. I tried something similar by chopping down the size of RAM that the OS was configured to use and using the extra for keeping a thread history (last 10 threads), but something funky kept happening to that RAM: either a bug in my code or something the OS did that I didn't know about.

I also did something similar to your first suggestion by giving each thread structure a bit pattern to put on the IO lines while that thread was running, but resources (available IO) limited my ability to go beyond a pattern for each thread.

Nathan
Re: Watchdog Debugging

Timothy M. De Baillie wrote:
> Any thoughts or improvements out there in the ether?

What about:

- With a configuration option, remove usage of the hardware watchdog and provide a software one, based on an available timer/counter. When the counter reaches its max value, an interrupt is fired. One must choose a max value that gives the timer/counter a period equal to that of the real watchdog (or the closest possible).

- The function that resets the hardware watchdog then just resets the timer/counter.

- The interrupt fired when the counter reaches its max value can write information to EEPROM, banked RAM, the serial port, etc.

I use a similar scheme on targets without any OS (for debugging), or if the maximum time allowed by the hardware watchdog is too short for some processing. In that case I have to reset the hardware watchdog at different points in the code, but I keep the software watchdog running with a longer period and reset it only in the main application cycle (again on targets without an OS; in that case the system is kept even when debugging is not needed, just to ensure there is some kind of watchdog).

Regards,
Bernard
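The three points above can be sketched as a small host-testable model. All names (swd_t, swd_init, swd_kick, swd_tick, swd_note_expiry) are hypothetical, the timer interrupt is represented by a plain call to swd_tick(), and the expiry handler only counts where real code would write to EEPROM or the serial port.

```c
#include <stdint.h>

/* Hypothetical software watchdog; illustrative names, not Nut/OS API. */
typedef void (*swd_expire_fn)(void *arg);

typedef struct {
    uint16_t count;           /* ticks since the last kick */
    uint16_t limit;           /* ticks allowed before expiry */
    uint8_t expired;          /* set once the handler has run */
    swd_expire_fn on_expire;  /* called once when the limit is reached */
    void *arg;                /* passed through to on_expire */
} swd_t;

static void swd_init(swd_t *w, uint16_t limit, swd_expire_fn fn, void *arg)
{
    w->count = 0;
    w->limit = limit;
    w->expired = 0;
    w->on_expire = fn;
    w->arg = arg;
}

/* Stands in for the call that would reset the hardware watchdog. */
static void swd_kick(swd_t *w)
{
    w->count = 0;
}

/* Call from the periodic timer interrupt. */
static void swd_tick(swd_t *w)
{
    if (w->expired)
        return;
    if (++w->count >= w->limit) {
        w->expired = 1;
        if (w->on_expire)
            w->on_expire(w->arg);
    }
}

/* Example handler: real code would store debug data somewhere
 * non-volatile here; this one only counts how often it ran. */
static void swd_note_expiry(void *arg)
{
    ++*(unsigned *)arg;
}
```

On a target, count is shared between the interrupt and the main code, so it would need to be volatile and updated with interrupts masked; that is left out to keep the sketch short.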
Re: Watchdog Debugging

Nathan Moore wrote:
> I don't see why you would consider this memory area to be on the heap in
> the first place.
> I tried something similar by chopping down the size of RAM that the Os was
> configured to use, and using that extra for keeping a thread history
> (last 10 threads), but something funky kept happening to that RAM --
> either a bug in my code or something the Os did that I didn't know about.

Without reading the details of the original post, I may be able to help here. Depending on the platform, the runtime initialization may make use of the stack before/while entering NutInit. To reserve some RAM at the top, it may be required to inform the linker as well.

GNU AVR: linker option --defsym,__stack=0x10FF
GNU ARM: linker script section MEMORY (reducing the len parameter of the ram)

Harald
Re: Watchdog Debugging

On Wed, Sep 24, 2008 at 12:31 PM, Harald Kipp <harald.kipp@...> wrote:
> Nathan Moore wrote:
> > I don't see why you would consider this memory area to be on the heap in
> > the first place.
> > I tried something similar by chopping down the size of RAM that the Os
> > was configured to use, and using that extra for keeping a thread history
> > (last 10 threads), but something funky kept happening to that RAM --
> > either a bug in my code or something the Os did that I didn't know about.
>
> Without reading the details of the original post, I may be able to help
> here. Depending on the platform, the runtime initialization may make use
> of the stack before/while entering NutInit. To reserve some RAM on the
> top, it may be required to inform the linker as well.
> GNU AVR linker option --defsym,__stack=0x10FF
> GNU ARM linker script section MEMORY (reducing the len parameter of the
> ram)

Wouldn't changing the upper limit in the Configurator result in this being done?
Re: Watchdog Debugging

On Wed, Sep 24, 2008 at 12:15 PM, Bernard Fouché <bernard.fouche@...> wrote:
> Timothy M. De Baillie wrote:
> > Any thoughts or improvements out there in the ether?
>
> What about:
>
> - with a configuration option, remove usage of the hardware watchdog and
> provide a software one, based on a available timer/counter. When the
> counter reaches its max value, an interrupt is fired. One must choose a
> max value that makes the timer/counter with a timing equals to the real
> watchdog (or the closest possible).
>
> - the function to reset the hardware watchdog then just reset the
> timer/counter.
>
> - the interrupt fired when the counter reaches its max value can write
> information to EEPROMs, banked RAM, on the serial port, etc.

Just keep in mind that if a hang-up happens within an ISR or critical section, this method won't catch it:

    NutEnterCritical();
    /* (i = 4) is an assignment and always true, so this loop never exits */
    for (i = 0; (i = 4); i++) {
        f(i);
    }
    NutExitCritical();

Nathan
Re: Watchdog Debugging

Nathan Moore wrote:
> On Wed, Sep 24, 2008 at 12:31 PM, Harald Kipp <harald.kipp@...> wrote:
>> GNU AVR linker option --defsym,__stack=0x10FF
>> GNU ARM linker script section MEMORY (reducing the len parameter of the
>> ram)
>
> Wouldn't changing the upper limit in the configurator result in this being
> done?

So far the Configurator modifies only a few compiler/linker options, mainly the compiler option -D.

On GNU AVR the standard linker scripts are used, which set the stack pointer to the end of internal RAM by default. ImageCraft uses the end of external RAM to initialize the early stack pointer. For the ARM, the linker script is used.

Harald
Re: Watchdog Debugging

On Wed, Sep 24, 2008 at 1:29 PM, Harald Kipp <harald.kipp@...> wrote:
> Nathan Moore wrote:
> > On Wed, Sep 24, 2008 at 12:31 PM, Harald Kipp <harald.kipp@...> wrote:
> >> GNU AVR linker option --defsym,__stack=0x10FF
> >> GNU ARM linker script section MEMORY (reducing the len parameter of the
> >> ram)
> >
> > Wouldn't changing the upper limit in the configurator result in this
> > being done?
>
> Until today the Configurator modifies a few compiler/linker options
> only, mainly compiler option -D.
>
> On GNU AVR the standard linker scripts are used, which will set the
> stack pointer to the end of internal RAM by default.

OK, we were using the end of external RAM on AVR with GCC. It did turn out that the problem I was looking for also had its way with RAM before killing the board, so it's likely that this is what caused that area of RAM to be overwritten.

Nathan
Re: Watchdog Debugging

Timothy M. De Baillie wrote:
> One idea that crossed my mind would be very simple to implement across
> the OS and user code. If you could assign a specific piece of memory
> (say 4 bytes of high heap memory) to keep thread flags, upon reboot,
> your program could detect a watchdog reboot and then report the 4 bytes
> back to the user.

If you search in the Configurator for NUTMEM_RESERVED, you'll find a default of 64, which creates an array in arch/avr/os/nutinit.c:

    uint8_t nutmem_onchip[NUTMEM_RESERVED];

If I remember correctly, this was used to reserve some memory in internal AVR RAM that can be used while manipulating the address lines. One application is to access the hidden external RAM that overlaps the internal addresses.

The implementation is so awful that I would like to delete it immediately. However, the right way may be to put nutmem_onchip in a different segment. I remember that avr-libc offers an uninitialized data segment, which won't be touched by the runtime initialization. If we manage to force it into internal RAM, this array could provide both features.

In any case I agree that your suggestions would be most helpful for debugging.

Harald
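A rough sketch of the "different segment" idea, assuming avr-gcc with avr-libc, whose linker scripts provide a .noinit section that the startup code does not clear. The array and function names, the signature bytes, and the 8/56 byte split are all made up for illustration. Whether .noinit ends up in internal RAM depends on the linker setup, which is the open point mentioned above. On a non-AVR host the attribute is compiled out so the logic can still be exercised.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__AVR__)
/* .noinit is not zeroed or initialised by the avr-libc startup code. */
#define KEEP_ACROSS_RESET __attribute__((section(".noinit")))
#else
#define KEEP_ACROSS_RESET /* host build: ordinary zero-initialised storage */
#endif

#define RESERVED_BYTES 64u /* the NUTMEM_RESERVED default quoted above */
#define DEBUG_BYTES     8u /* hypothetical share kept for post-mortem data */

static const uint8_t debug_signature[4] = { 'W', 'D', 'B', 'G' };

/* Hypothetical stand-in for nutmem_onchip. */
static uint8_t reserved_ram[RESERVED_BYTES] KEEP_ACROSS_RESET;

/* Returns 1 if the debug area still carries the signature from before
 * the reset. Otherwise stamps the signature, clears the payload and
 * returns 0 (first power-up, or the area was overwritten). */
static int debug_area_open(void)
{
    if (memcmp(reserved_ram, debug_signature, sizeof debug_signature) == 0)
        return 1;
    memcpy(reserved_ram, debug_signature, sizeof debug_signature);
    memset(reserved_ram + sizeof debug_signature, 0,
           DEBUG_BYTES - sizeof debug_signature);
    return 0;
}

/* Bytes available for debug data, directly after the signature. */
static uint8_t *debug_payload(void)
{
    return reserved_ram + sizeof debug_signature;
}

/* The remainder stays available as scratch memory, as before. */
static uint8_t *scratch_area(size_t *len)
{
    *len = RESERVED_BYTES - DEBUG_BYTES;
    return reserved_ram + DEBUG_BYTES;
}
```

Startup code would call debug_area_open() once, report the payload if it returns 1, and then hand the rest out through scratch_area().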
Re: Watchdog Debugging

Nathan Moore wrote:
> On Wed, Sep 24, 2008 at 12:15 PM, Bernard Fouché <bernard.fouche@...> wrote:
>> Timothy M. De Baillie wrote:
>>> Any thoughts or improvements out there in the ether?
>>
>> What about:
>>
>> - with a configuration option, remove usage of the hardware watchdog and
>> provide a software one, based on a available timer/counter. When the
>> counter reaches its max value, an interrupt is fired. One must choose a
>> max value that makes the timer/counter with a timing equals to the real
>> watchdog (or the closest possible).
>>
>> - the function to reset the hardware watchdog then just reset the
>> timer/counter.
>>
>> - the interrupt fired when the counter reaches its max value can write
>> information to EEPROMs, banked RAM, on the serial port, etc.
>
> Just keep in mind that if a hang-up happens within an ISR or critical
> section this method won't catch it.
>
> NutEnterCritical();
> for(i = 0; (i = 4); i++) {
>     f(i);
> }
> NutExitCritical();
>
> Nathan

    void MyInterrupt(void *arg)
    {
        // save the interrupted thread's information
        unsigned int thread_information = _my_memory_location;

        // set the memory location to the interrupt's flag
        _my_memory_location = MY_INTERRUPT_FLAG;

        for (;;); // causes lockup

        // under normal circumstances, you would then set the interrupted
        // thread back again before returning from the interrupt
        _my_memory_location = thread_information;
    }

Now I know that's not 100% reliable, but it should get the job done. Criticals don't change threads (or shouldn't), so you don't have to do anything special.

Tim
Re: Watchdog Debugging

Bernard Fouché wrote:
> Timothy M. De Baillie wrote:
>> Any thoughts or improvements out there in the ether?
>
> What about:
>
> - with a configuration option, remove usage of the hardware watchdog and
> provide a software one, based on a available timer/counter. When the
> counter reaches its max value, an interrupt is fired. One must choose a
> max value that makes the timer/counter with a timing equals to the real
> watchdog (or the closest possible).
>
> - the function to reset the hardware watchdog then just reset the
> timer/counter.
>
> - the interrupt fired when the counter reaches its max value can write
> information to EEPROMs, banked RAM, on the serial port, etc.
>
> I use a similar scheme on targets without any OS (for debugging), or if
> I the maximum allowed time by the hardware watchdog is too short for
> some processing: in that case I have to reset the hardware watchdog at
> different points in the code but I keep the software watchdog running
> with a longer period and reset it only in the main application cycle
> (again on targets without OS and in that case this system is kept even
> when debugging is not needed but only to ensure to have some kind of
> watchdog).
>
> Regards,
>
> Bernard

Tim
Re: Watchdog Debugging

Tim,

My remark that it wouldn't work was referring to the inability of a software-based watchdog's timer ISR to interrupt critical sections or other ISRs. A hardware watchdog will, and when debugging on my desk with a debugger, lock-up asserts are great.

Nathan
Re: Watchdog Debugging

Nathan Moore wrote:
> Just keep in mind that if a hang-up happens within an ISR or critical
> section this method won't catch it.

Sure, this is not a perfect solution; it's just one tool among others that can help in some situations.

I've also tried other ways, for instance storing in RAM not just a few bytes but a circular buffer with some debug info, so as to be able to retrieve a history of the events that made the whole system fail. Sometimes it is not a particular function that is broken; the flaw is in the way a chain of events is handled.

Also, one has to keep in mind that what first appears to be a hardware watchdog reset may be related to brown-out detection (or some other hardware-related cause), and before spending time digging into the software, one must be sure that the hardware is totally clean. (Some MCUs allow one to read the cause of the previous reset.)

There are also situations where the hardware watchdog fires because some processing takes just slightly too long, or because processing depends on events external to the MCU, and changing the application timing with debug features makes the problem disappear or happen earlier. (This is sometimes a very difficult case to fix.)

Finally, using a version control system like CVS or SVN may help a lot: if you are serious about your commit/comment policy, you have a big advantage when bizarre things start to show up in a project that was previously working correctly. (In my experience this is one of the most efficient debugging tools, whatever kind of bug is being chased.)

So I think there is no miracle solution that handles all possible cases, only a set of tools, habits and experience to choose from.

Bernard
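The circular buffer mentioned above might look roughly like the following. This is only an illustration: the entry layout, the slot count and all names (evlog_t, evlog_put, evlog_dump) are assumptions, and on a target the log itself would have to sit in memory that survives the reset.

```c
#include <stdint.h>

#define EVLOG_SLOTS 8u /* power of two, so the slot index can be masked */

typedef struct {
    uint16_t tick; /* timestamp, e.g. a system tick counter */
    uint8_t code;  /* application-defined event code */
    uint8_t arg;   /* one byte of extra detail */
} evlog_entry_t;

typedef struct {
    uint32_t total;                  /* number of events ever logged */
    evlog_entry_t slot[EVLOG_SLOTS]; /* the most recent events */
} evlog_t;

/* Append one event, overwriting the oldest once the buffer is full. */
static void evlog_put(evlog_t *log, uint16_t tick, uint8_t code, uint8_t arg)
{
    evlog_entry_t *e = &log->slot[log->total & (EVLOG_SLOTS - 1u)];

    e->tick = tick;
    e->code = code;
    e->arg = arg;
    log->total++;
}

/* Copy the retained history, oldest first, into out[] (which must have
 * room for EVLOG_SLOTS entries). Returns the number of entries copied. */
static unsigned evlog_dump(const evlog_t *log, evlog_entry_t *out)
{
    unsigned n = log->total < EVLOG_SLOTS ? (unsigned)log->total : EVLOG_SLOTS;
    uint32_t first = log->total - n;
    unsigned i;

    for (i = 0; i < n; i++)
        out[i] = log->slot[(first + i) & (EVLOG_SLOTS - 1u)];
    return n;
}
```

After a reset, dumping the buffer gives the last few events in order, which helps when the fault lies in a sequence of events and not in one function.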