CHDK Wiki
(asm example, some typos)
(clean up a bit)

Revision as of 23:34, 14 December 2008

THIS IS A WORK IN PROGRESS

Introduction

This page documents best practices and common pitfalls specific to CHDK code and the ARM platform. The aim is to provide developers who are new to CHDK with the information they need to write safe, correct code.

Avoiding physical damage to the camera

If your code directly manipulates hardware or certain propcases, there is a very real danger that bad code could result in permanent physical damage to the camera, for example by commanding lens hardware while the lens is retracted. If your code interacts with hardware, think very carefully about when the operation should be allowed, and abort or panic if the requirements are not met. High-level functions in the Canon firmware generally perform sanity checking, but you should not assume that all functions do.
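
A minimal sketch of the "check first, then act" pattern described above. The names lens_is_extended(), move_focus_hw() and safe_move_focus() are hypothetical; they are not CHDK or Canon functions, and the stubs exist only so that the sketch compiles off-camera.

```c
/* Hypothetical stand-ins, NOT real CHDK or Canon functions. */
static int lens_extended = 0;        /* simulated hardware state */
static int focus_commands_sent = 0;  /* counts calls that reached "hardware" */

static int lens_is_extended(void) { return lens_extended; }
static void move_focus_hw(int steps) { (void)steps; focus_commands_sent++; }

/* Returns 0 on success, -1 if the precondition is not met.
   The hardware is never touched unless the check passes. */
int safe_move_focus(int steps)
{
    if (!lens_is_extended()) {
        return -1;                   /* abort: lens is retracted */
    }
    move_focus_hw(steps);
    return 0;
}
```

The point of the wrapper is that callers cannot reach the hardware call without passing through the check.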

Memory allocation

Available Memory

Most cameras have a total of between 300KB and 1MB available to malloc()/umalloc() after CHDK is loaded. If you need to manipulate large amounts of data, consider working in chunks or hijacking some other part of memory. Keep in mind that the Canon firmware needs to be able to allocate memory too.
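
As an illustration of working in chunks, the sketch below sums the bytes of a file using one small buffer rather than loading the whole file. The function name sum_file_bytes() and the 4KB chunk size are assumptions made for this example, not CHDK conventions.

```c
#include <stdio.h>
#include <stdlib.h>

#define CHUNK_SIZE 4096  /* assumed value; tune to the memory actually free */

/* Sum all bytes of a file using a single CHUNK_SIZE buffer.
   Returns 0 on success, -1 on error. */
int sum_file_bytes(const char *path, unsigned *sum_out)
{
    FILE *f = fopen(path, "rb");
    if (!f) return -1;

    unsigned char *buf = malloc(CHUNK_SIZE);
    if (!buf) { fclose(f); return -1; }

    unsigned sum = 0;
    size_t n;
    while ((n = fread(buf, 1, CHUNK_SIZE, f)) > 0) {
        for (size_t i = 0; i < n; i++) sum += buf[i];
    }

    free(buf);
    fclose(f);
    *sum_out = sum;
    return 0;
}
```

Peak heap use stays at CHUNK_SIZE plus whatever the file API itself allocates, regardless of the file size.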

Memory APIs and CPU caching

malloc() / free() operate just as they do in standard C. umalloc() / ufree() work on addresses for which the CPU cache and write buffer are not enabled. Memory that is directly read or written by hardware should usually be uncached. Other memory should generally be cached.

Notes

  • Uncached memory should not be used in processing intensive operations, since this will be much slower.
  • Both cached and uncached memory come from the same pool; they are simply referred to with different addresses. In other words, either one counts against the same 300KB to 1MB mentioned above.
  • Always pair the allocators: release umalloc() memory with ufree() and malloc() memory with free(), never the other way around.
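
One way to keep the allocators paired is to allocate and release related buffers in a single pair of functions. This is only a sketch: buffers_t, buffers_alloc() and buffers_free() are hypothetical names, and umalloc()/ufree() are replaced here by host stand-ins (with a counter) so the example compiles off-camera. On a camera, use the versions CHDK provides.

```c
#include <stdlib.h>

/* Host stand-ins for CHDK's umalloc()/ufree(); do not define these on a camera. */
static int umalloc_live = 0;  /* outstanding uncached allocations */
static void *umalloc(size_t n) { void *p = malloc(n); if (p) umalloc_live++; return p; }
static void ufree(void *p) { umalloc_live--; free(p); }

typedef struct {
    unsigned char *hw_buf;    /* shared with hardware: uncached */
    unsigned char *work_buf;  /* CPU-only scratch space: cached */
} buffers_t;

/* Release each buffer with the function that matches its allocator. */
void buffers_free(buffers_t *b)
{
    if (b->hw_buf)   ufree(b->hw_buf);
    if (b->work_buf) free(b->work_buf);
    b->hw_buf = b->work_buf = NULL;
}

/* Returns 0 on success, -1 if either allocation fails. */
int buffers_alloc(buffers_t *b, size_t n)
{
    b->hw_buf   = umalloc(n);
    b->work_buf = malloc(n);
    if (!b->hw_buf || !b->work_buf) {
        buffers_free(b);
        return -1;
    }
    return 0;
}
```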

TODO MANIPULATING ADDRESSES TO REFER TO CACHED/UNCACHED MEMORY. ALSO ENABLE/DISABLE/FLUSH

Disk IO

Several file IO APIs are available in CHDK. Each has specific requirements and limitations. Failing to respect these can result in invalid data being read or written.

Low level IO with open(), read() and write()

These functions correspond roughly to the Unix syscalls of the same name. They operate on file handles, which are numbered starting from zero. Invalid file handles are negative. In CHDK, read() and write() must be used only with uncached memory. This is typically obtained using umalloc().

Notes

  • The uncached requirement means you should not use read() and write() with normal C variables. e.g. int x; read(fd,&x,sizeof(x)); is unsafe.
  • Not all cameras behave the same way. Just because it appears to work on yours doesn't mean it's OK.
  • Because uncached memory access is slow, if you are loading data that will be used frequently or for extensive calculations, you should copy it to normal memory. Alternately, use the stdio functions described below.
  • Internally (in the sig and stubs files) the functions that make up this API are referred to with a capital letter, i.e. _Open, _Read, _Write. The wrappers cause these to be used when you call the lower case variants in CHDK. The low level lower case variants (_open etc) should not be used.
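
The sketch below shows the pattern implied by these notes: read into an uncached buffer, then copy the value into a normal variable. read_int_uncached() is a hypothetical helper, the open() flags and mode are assumptions, and umalloc()/ufree() are host stand-ins so the example compiles off-camera.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Host stand-ins for CHDK's umalloc()/ufree(); do not define these on a camera. */
static void *umalloc(size_t n) { return malloc(n); }
static void ufree(void *p) { free(p); }

/* Read one int from the start of a file. Returns 0 on success, -1 on error. */
int read_int_uncached(const char *path, int *value_out)
{
    int fd = open(path, O_RDONLY, 0777);
    if (fd < 0) return -1;             /* invalid handles are negative */

    int rc = -1;
    int *ubuf = umalloc(sizeof(int));  /* read() must target uncached memory */
    if (ubuf) {
        if (read(fd, ubuf, sizeof(int)) == (int)sizeof(int)) {
            *value_out = *ubuf;        /* copy out to a normal (cached) variable */
            rc = 0;
        }
        ufree(ubuf);
    }
    close(fd);
    return rc;
}
```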

(almost) ANSI stdio with fopen() et al.

The Fut API provides higher level, buffered IO which corresponds roughly to C stdio. CHDK wraps this to provide the normal stdio functions in a way that is mostly compatible with a normal ANSI libc. This API uses FILE * and does not require the user to provide uncached memory.

Notes

  • Each open file handle allocates a little over 32KB of memory.
  • The FILE * used by Fut functions is not compatible with other functions that might take a FILE *.
  • Some fields of the FILE structure have been reverse engineered, but you should avoid using them directly unless there is no way to get equivalent functionality from the camera APIs.
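
For comparison with read(), a short sketch of loading a value straight into an ordinary variable through the stdio wrappers. load_int() is a hypothetical helper written for this example.

```c
#include <stdio.h>

/* Read one int from a file directly into a normal (cached) variable.
   Returns 0 on success, -1 on failure. */
int load_int(const char *path, int *value_out)
{
    FILE *f = fopen(path, "rb");
    if (!f) return -1;

    size_t got = fread(value_out, sizeof(*value_out), 1, f);

    fclose(f);  /* close promptly: each open file costs a little over 32KB */
    return got == 1 ? 0 : -1;
}
```

Because every open file holds its buffer until it is closed, keeping the open/close span short matters more here than it would on a desktop system.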

Raw disk IO

TODO

Other functions

There are several other IO APIs in the firmware (for example, the original VxWorks stdio and the lower case open/read/write). However, these generally do not work reliably and should not be used.

Calling functions from firmware assembly

It is frequently necessary to call a C (or self-written asm) function from disassembled firmware code. It is extremely important that you understand the ARM calling convention and preserve any required registers.

Example

    MOV     R0, R4 ; this will be the first argument to the following BL call

    BL      sub_FFD483A4 ; call some function in Canon ROM
    BL      capt_seq_hook_raw_here ; call our function here

    MOV     R2, R4 ; load up the registers for the next call
    MOV     R1, #1
    BL      sub_FFD43578 ; call another function in Canon ROM
; ...

In this code, the call to the C function capt_seq_hook_raw_here is added to assembly code obtained by disassembling the Canon firmware.

In the ARM calling convention, a C function gets its first four arguments in R0-R3, and is not required to preserve them. It is required to preserve R4-R11; R12 and LR may also be overwritten by a call. Because of this, GCC may generate code for capt_seq_hook_raw_here which leaves R0-R3 in an undetermined state after the call returns.

In the above case, because the call to the new function is added immediately after an existing call, we know that the firmware already expects R1-R3 to be undefined, so the fact that our function might clobber them is unimportant.

We do have to worry about R0, because it is also used for the return value. Given that the firmware loads R2 and R1, but not R0 before calling the final function (sub_FFD43578), we can assume that the return value of the preceding function (sub_FFD483A4) is expected to be the first argument to this call. If our call to capt_seq_hook_raw_here changes R0, problems could result. One easy way to avoid this is to define your function to take one argument and return it unchanged, like this:

int capt_seq_hook_raw_here(int save_R0) {
// ... your code goes here
    return save_R0;
}

This ensures that GCC saves the value of R0, since it will take whatever value is already in R0 as its first argument, and return it using R0.

Now consider if we put our call a few lines later:

    MOV     R0, R4 ; this will be the first argument to the following BL call

    BL      sub_FFD483A4 ; call some function in Canon ROM

    MOV     R2, R4 ; load up the registers for the next call
    MOV     R1, #1
    BL      capt_seq_hook_raw_here ; call our function here
    BL      sub_FFD43578 ; call another function in Canon ROM
; ...

This is clearly wrong, because R2 and R1 were set up for the final BL, but now our call is going to (potentially) stomp on them.

The best practice is to insert your call immediately after an existing call, preserving R0. If you want to insert calls into arbitrary assembly code, you have to preserve all registers and possibly the CPU status word.

You can carefully analyze the firmware code to decide exactly what needs to be saved, but keep in mind others might modify your code without noticing these details.

Integral data types

On CHDK platforms, long and int are both 32-bit integers. Pointers are also 32 bits. short is a 16-bit integer, and char is 8 bits.

  • In general, use int or unsigned for variables and function parameters. On the ARM architecture, using shorts or chars is less efficient than using an int.
  • long is identical to int. Using the long keyword only serves to add confusion. CHDK is not expected to run on systems where int or long are anything other than 32 bits.
  • For arrays, use the smallest type that has the requisite number of bits.
  • For structures, alignment considerations limit the value of using chars and shorts.
TODO WHAT IS THE DEFAULT STRUCTURE PACKING ?
  • Avoid using floats or doubles, since there is no floating point hardware.
TODO CAN YOU USE FLOATS AT ALL ?
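
To illustrate the alignment point for structures, the sketch below declares the same four members in two orders. The struct names are made up for this example, and the exact sizes depend on the compiler and ABI (the default packing for CHDK builds is still an open TODO above), so check with sizeof() on your own toolchain rather than relying on any particular number.

```c
#include <stddef.h>

/* Members in arbitrary order: the compiler typically inserts padding
   so that the int and the short are aligned. */
struct loose {
    char  flag;
    int   value;
    char  mode;
    short count;
};

/* Same members, largest first: usually less padding is needed. */
struct tight {
    int   value;
    short count;
    char  flag;
    char  mode;
};
```

Replacing value with a short would not shrink struct loose on most ABIs unless the members were also reordered, which is why small types in structures often save less than expected.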

Multi tasking and intertask communication

TODO

Optimization

Don't optimize prematurely

Consider what overall impact the execution of your code will have. There's no point in saving a few instructions if you are going to spend several seconds reading/writing flash memory. If your code will only be executed when the user is browsing through a menu, a few instructions more or less will make no difference.

Optimize for size unless you really need speed

CHDK operates in very limited memory. This means that limiting the compiled code and static data sizes is frequently more important than execution speed.

General recommendations for memory use

  • Use auto (local) variables for things that are a few hundred bytes or smaller.
Notes
  • The total stack size in most CHDK code is 8KB. If your function uses a large amount of stack for autos or parameters, consider how deep the call chain above and below it is likely to be.
  • The structure of the GUI code is such that it is frequently impossible to use stack memory. Local variables in a menu or dialog callback will not be valid after the callback returns.
  • Use static and global variables only where required.
  • Use dynamic memory (malloc()/free()) for large data structures that do not have to exist for the life of the program.
Notes
  • malloc() memory does have overhead, so allocating lots of very small data structures is inefficient.
  • Memory fragmentation is a concern.
  • Use function calls rather than copy/pasting similar code, avoid large macros or inlines unless you really need them. Don't unroll loops unnecessarily.
  • Consider loading large amounts of constant data from disk, rather than hard coding it.

Optimizing for speed

  • The first four function parameters are passed in registers, so functions with four or fewer parameters are very cheap.
  • Interleave memory access and calculation. This is only relevant in assembler, since GCC will re-order things as it sees fit.
TODO EXAMPLE
  • Use the multiple register forms of load and store (LDM/STM). GCC isn't smart about this, so re-writing memory intensive code in assembler can result in significant gains.
  • Avoid using division or modulo, since they are done in software.
  • Operations on small constants or values that can be generated by shifting a small constant do not require an additional memory access. GCC tries to do this, but can do a poor job in many circumstances.
  • Multiplications that can be expressed as a short sequence of shifts and adds are cheap.
  • Shifts are extremely cheap.
TODO MORE DETAIL
  • Optimize for cache use. All CHDK supported cameras tested so far have independent 4KB caches for data and instructions.
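
The shift-based points above can be sketched as follows. The function names are made up for this example. A compiler may already make these substitutions when the operand is a constant, so treat this as an illustration of the idea rather than a required rewrite, and note that the division and modulo forms shown are only valid for unsigned values and power-of-two divisors.

```c
/* x * 10 as two shifts and an add, since 10 = 8 + 2. */
static unsigned mul10(unsigned x) { return (x << 3) + (x << 1); }

/* Unsigned division by a power of two is a right shift. */
static unsigned div8(unsigned x) { return x >> 3; }

/* Unsigned modulo by a power of two is a mask. */
static unsigned mod8(unsigned x) { return x & 7u; }
```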

TODO HOW GOOD IS GCC ABOUT DOING THE THINGS MENTIONED ABOVE FOR YOU

TODO ADD SOME LINKS TO GENERAL ARM OPTIMIZATION INFORMATION

TODO MERGE OLD OPTIMIZATION PAGE