Take a look at http://www.nongnu.org/avr-libc/user-manual/FAQ.html#faq_flashstrings
It seems to me that the flukso uC is suffering he same trouble. I don't know wiring/arduino development. But if I just run
- make
- avr-size bin/main.elf
I get
- text data bss dec hex filename
- 2922 216 241 3379 d33 bin/main.elf
I get a RAM usage of nearly half ram (we only have 512 byte of ram in a ATMEGA48)
In an older version of the code there where far longer strings, so size was close to 512 and that leaves us no space for the stack would cause real trouble!
Second point is defining i as a file-scoped variable, that's just nuts.
Third point, don't call function from interrupts routines. It increase size like hell and interrupts are blocked during execution making measurements corrupt.
Doing an 32 bit multiply in an interrupt routine (it's an 8 bit machine remember)? Every time, while we need the result only a after 1000000000/ADC_CONST? And then sending the value over the UART while in the ADC interrupts, we certainly mis measurements that way.
Ok enough complained, time to act. I am willing to rewrite the firmware so it's more compact, use ways less (ram) memory and maybe contain a bootloader so we can update firmware if we need to. (Just need to figure out a way to create a compact-single-purpose version of avrdude.) But I need to know what avr-libc version are the dev's using? What version of avr-gcc? WinAVR compatible perhaps?
Sorry for complaining, but my fingers itch to get this right.
HTH
Wutje
Hi Wutje,
I'm aware of ordinary strings being duplicated into RAM. The plan is to substitute the Arduino's serial library (which besides the Makefile is about the only reference left to the original Arduino code) for Peter Fleury's one here: http://beaststwo.org/avr-uart/index.shtml
This UART library contains a Tx buffer which should solve any blocking on Tx, and support for program space string utilities, thus saving precious RAM:
Re the 32 bit multiplication: With a clock of 1Mhz and ADC sampling at 488Hz, the AVR has about 2000 cycles to perform one 32-bit MAC. The debugging message should indeed be commented out or be relegated to the main loop.
Re the bootloader: The ATmega48 doesn't have support for a bootloader. Besides, it would take 1-2k out of the 4k flash memory. There is however a 6-pin ICSP footprint available for flashing. Just be careful not to flash the EEPROM as well. So comment this action out in the Makefile.
We're using:
avr-libc 1.4.7
avr-gcc 4.2.2
Thanks for the feedback! Let me know if need more information.
Cheers,
Bart.
UART: Peter Fleury. Yes using Peter well know drivers is a very good idea. I having been doing so myself for a long time.
True, a 32 bit multiply is not that heavy, but we are using other interrupts as well. We might miss an interrupt because we are staying in the ADC interrupt for so long.
If we are sending the 26-27 bytes then we are in the ADC interrupt for: 4800baud/10 (10 bits = 1 start, 1 stop, 8 databits) = 480 bytes/s. That is roughly the number of ADC measurements we make. This means that for every character we send in the ADC interrupt routine we miss an ADC conversion.
This immediately explains why we need a bootloader. Now the firmware in the AVR can not be updated.
It is true the ATMEGA48 does not have a bootvector, but we can still implement a bootloader, we only need the capability to write the flash, which we have. We can write our own upload tool, it's not that hard.
Using the ICSP6 is very hard. You first need to desolder the AVR pcb, I have access to some pretty (de)soldering tools so it's ok for me (already did that). But for others it's hard, besides needing the AVR programming tools.
The avr-libc version is a bit old. My shiny new ubuntu 9.10 comes with 1.6.2 and 4.3.3. I will try to manage.
HTH,
Wutje
I've just committed rev 67. The ADC0 debugging info has been moved to the main loop. So this is what the ISR now looks like:
In fact, we're multiplying a 16-bit constant with a 16-bit ADC value, then casting it to a 32-bit unsigned integer and adding it to the nWh counter. The 16-bit multiplication should consist in 4 MUL operations (2 clocks), 3 shifts (1 clock) and 3 ADDs (1 clock) or 14 clocks. This is only my rough calculation, but it gives us a ballpark idea of how much it's costing us.
Re bootloader: You're right, such an upload tool would be very nice. I'm also considering 2-way comms for configuring the meter ID's and METERCONST via the Fluksometer's UART.
Re ICSP6: Just plug in a male 6-pin header into the programmer's female connector. Then slide it into the board's ICSP, keep it tilted so that all 6 pins can make contact and program. It's not going to win an electronics beauty contest, but it works.
Re Ubuntu: I'm still on 8.04 LTS. Since it's always a small hassle to get everything working properly after an upgrade, I'm waiting for 10.04 to make the jump.
Cheers,
Bart.
I saw the improvement in SVN, good work.
However your assumption about the cast is not correct. Your are now upcasting the constant to 32 bits and then GCC starts doing a 32 bit multiply (yes AVR-GCC is not optimal here, a 32*16 would be sufficient). There is no need to cast to a larger variable, it only causes gcc to be confused. You probably wanted to write this:
But this is sufficient
That results in 3 mul, some movw and adds so that close to the 14 you had in mind.
And another thing I noticed with your latest SVN version, you now read ADC multiple times from different contexts, that might disturb the measurement (although very unlikely). It is better to save the ADC output in RAM and use that in the loop.
And while I am thinking about it, you have severe race conditions on your measurements arrays, the ISR's might be updating it while your are sending it over the UART. Copy it to a (local) buffer within a cli()/sei() block to guarantee integrity.
Yes doing a ICSP programming the way you describe is ok for once, but deving using that is a no-go, but for emergency yes that would probably be sufficient.
HTH,
Wouter
I've fixed the possible race condition on measurement->value and added a dedicated debug field to the aux struct for storing the ADC value in rev 68.
avr-gcc doesn't support 16*16 to 32 bit multiplication out of the box but restricts itself to 16*16 to 16. For an nice explanation, see [1]. That's why we needed to cast the constant to 32 bits. With a METERCONST of 7091, a pure 16 bit multiplication would overflow on all but the lowest ADC values.
Some further googling returned application note AVR201 [2] and a very thorough article on the performance of arithmetic functions in low-power micro-controllers [3]. Based on these, I've implemented an ASM macro MacU16X16to32(uint_32Acc, uint_16In1, uint_16In2) which multiplies the last two operands and adds them to the first one. According to [3] the 16x16 -> 32 unsigned MAC should execute in 37 cycles with operands and results in memory. Committed in rev 69.
Cheers,
Bart.
[1]: http://mekonik.wordpress.com/2009/03/18/arduino-avr-gcc-multiplication
[2]: http://www.atmel.com/dyn/resources/prod_documents/DOC1631.PDF
[3]: http://www2.ife.ee.ethz.ch/~roggend/publications/wear/DSPMic_v1.1.pdf
Yes you're absolutely right. I figured out I was wrong as well. But you fixed it even before I could mention it. Good work!