Friday, July 11, 2008
Alignment Matters
When do these two snippets behave differently?
// Snippet 1
void* p = something();
int i = *(int*)p;

// Snippet 2
int i;
void* p = something();
memcpy(&i, p, sizeof(int));
Answer: when 'something' returns an address that's not a multiple of "sizeof(int)" (more precisely, of the alignment that "int" requires).
Some processor architectures only allow loading memory into N-byte registers from memory addresses that are a multiple of N (e.g. a multiple of 4 for 32-bit registers). Using misaligned addresses can have various interesting consequences, depending on the CPU; I've heard of at least these:
- It works fine
- It works, but slowly (e.g. because the misalignment is detected and the value is loaded in smaller pieces instead - I've heard that x86 tolerates misaligned loads this way, at some cost in speed)
- The program may crash due to an "invalid address" trap
- The CPU may load data from the wrong address (e.g. if only multiples of 4 are valid addresses, the instruction might ignore the bottom 2 bits of the address)
Last week, I got lucky: I discovered that the MIPS32 CPU on the phone I was programming falls in the "crash" category. (Why lucky? Because an "address load exception" message is much easier to debug than corrupted data.)
Usually, we don't have to worry about such alignment issues; the compiler and runtime make sure that all objects they allocate go at addresses that have the right alignment for their type (e.g. malloc must return memory "suitably aligned" for all possible types).
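As a rough illustration of "the right alignment for their type", here is a minimal sketch that checks whether an address is a multiple of a given alignment. The helper name isAligned is made up for this example, alignof needs C++11, and treating a pointer as a plain integer is an assumption that holds on mainstream platforms rather than something the standard promises.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper (not from the original post): returns true when the
// address p is a multiple of `alignment`.
// Assumption: converting a pointer to std::uintptr_t yields its numeric
// address, which is the case on mainstream platforms.
inline bool isAligned(const void* p, std::size_t alignment)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

With this helper, a pointer fresh from malloc should pass isAligned(p, alignof(int)), while an address one byte past an int-aligned buffer should fail it on any platform where alignof(int) is greater than 1.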
Trouble comes when we lie to the compiler, such as telling it by a cast that "p" points to an "int" when such is not the case. This is what happened to me: I was parsing a file, and the 4-byte-value-that-should-be-put-in-an-int followed an arbitrary-length string. It ended up on an odd address, and boom. (Here's another way I was lucky: it COULD have been a nice multiple of 4 in all my tests, only to come out odd on a client's desk)
Functions like memcpy, of course, are required to work with all addresses (as expressed by taking void* parameters, which require no cast).
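Here is a minimal sketch of the memcpy approach applied to a parser like the one described above. The name loadU32 is hypothetical, and the sketch assumes the value in the buffer was written in the host's byte order; a real file format may need a byte swap on top of this.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the original post): reads a 32-bit value
// from any address, aligned or not. No cast to std::uint32_t* is involved,
// so the compiler is never told that p is suitably aligned.
// Assumption: the bytes at p are in the host's byte order.
inline std::uint32_t loadU32(const void* p)
{
    std::uint32_t value;
    std::memcpy(&value, p, sizeof value);
    return value;
}
```

A call such as loadU32(buffer + 3) is fine even though buffer + 3 is an odd address, because memcpy makes no alignment assumptions about its arguments.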
Lesson of the day? Don't lie to your compiler!
(alternate lesson: "casts: evil AND chaotic"?)
Labels: c++, compiler, optimization
Wednesday, May 30, 2007
Pay it Forward: Part 1
When writing C++ code, do your compiler a favor and use forward declarations whenever possible: it will pay you back later... and more.
Forward declarations will...
- ... decouple classes that depend on each other.
- ... improve incremental build times considerably: when the header that defines the forward-declared class changes, files that only saw the forward declaration don't need to be rebuilt.
- ... increase the portability of your code.
AddressBook.h
//#include "Contact.h" // NO! Do this in the .cpp file instead, unless you
//                     // really need the definition of the class.

// Forward type declaration
class Contact; // YES!

class AddressBook
{
public:
    // ...

    /** Ways you can use the declaration without having the definition **/

    // Member function input parameters.
    void addContact(unsigned int id, Contact const& newContact);
    void addContact(unsigned int id, Contact const* newContact);

    // Member function output parameters.
    void getContact(unsigned int id, Contact& contact);
    void getContact(unsigned int id, Contact* contact);

    // These cases are a bit more obscure than the others, but they are legal
    // and still work. If the user of this class doesn't use the member
    // functions that require the undefined type, he will not be required to
    // include the header that defines it. The user will only need to do this
    // if he uses those member functions.
    void addContact(unsigned int id, Contact newContact);
    Contact getContact(unsigned int id);

private:
    // Member variables
    Contact* m_contactPointer;
    Contact& m_contactRef;

    /** These are some cases where the compiler needs the definition. **/
    Contact::Address m_address;
    static const size_t ms_contactSize = sizeof(Contact);
    Contact m_contact;
    Contact m_contactArray[10];

    /** Here are some cases that often work with different STL implementations
        and compilers, but that the standard does not allow (instantiating a
        standard library template with an incomplete type is undefined unless
        explicitly permitted). **/
    std::vector<Contact> m_contacts;
    std::auto_ptr<Contact> m_contactAutoPtr;
};
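To show the split between header and source file in one place, here is a minimal sketch squeezed into a single translation unit. It is a simplified stand-in for the listing above, not the same class: the members setContact, hasContact and name() are made up for the example, and it uses C++11 (nullptr, "= delete").

```cpp
#include <string>

// --- What AddressBook.h would contain ---
class Contact;  // forward declaration only, no #include "Contact.h"

class AddressBook
{
public:
    AddressBook();
    ~AddressBook();
    AddressBook(AddressBook const&) = delete;             // keep the sketch simple
    AddressBook& operator=(AddressBook const&) = delete;

    void setContact(Contact const& contact);  // reference: declaration suffices
    Contact getContact() const;               // by value: legal in a declaration
    bool hasContact() const;

private:
    Contact* m_contact;  // pointer member: its size is known without Contact
};

// --- What Contact.h would contain ---
class Contact
{
public:
    explicit Contact(std::string const& name) : m_name(name) {}
    std::string const& name() const { return m_name; }

private:
    std::string m_name;
};

// --- What AddressBook.cpp would contain, after #include "Contact.h" ---
AddressBook::AddressBook() : m_contact(nullptr) {}

AddressBook::~AddressBook() { delete m_contact; }

void AddressBook::setContact(Contact const& contact)
{
    Contact* copy = new Contact(contact);  // copying needs the full definition
    delete m_contact;
    m_contact = copy;
}

// Precondition: hasContact() is true.
Contact AddressBook::getContact() const { return *m_contact; }

bool AddressBook::hasContact() const { return m_contact != nullptr; }
```

Only the last section needs the full definition of Contact; a file that includes the "header" part alone and merely passes AddressBook objects around never sees Contact.h.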
Here are some rules of thumb that will help you determine whether a full definition is necessary for a certain type:
- Enum and typedef types cannot be forward-declared (since C++11, an enum with a fixed underlying type, such as an "enum class", is the exception).
- You will be able to use your forward-declared type as a template parameter if and only if the templated type doesn't need the full definition.
- You will need the full definition to get the size of a type, use any constructors or destructors, or access members (including types scoped at class level).
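The template and sizeof rules above can be sketched as follows. PointerHolder, ValueHolder and the g_ variables are hypothetical names invented for this example, and the Contact defined here is a one-member placeholder rather than the class from the listing; the brace initialization with nullptr needs C++11.

```cpp
#include <cstddef>

class Contact;  // forward declaration only; Contact is incomplete here

// This template stores only a pointer, so it never needs sizeof(T)...
template <typename T>
struct PointerHolder
{
    T* ptr;
};

// ...and can therefore be instantiated with an incomplete type.
PointerHolder<Contact> g_holder = { nullptr };

// This template stores a T by value, so instantiating it requires the
// full definition of T.
template <typename T>
struct ValueHolder
{
    T value;
};

// Both of these would fail to compile at this point:
// ValueHolder<Contact> g_tooEarly;
// const std::size_t g_sizeTooEarly = sizeof(Contact);

// Placeholder definition, just enough for the sketch.
class Contact
{
public:
    int id;
};

// Contact is complete from here on, so both uses are fine.
ValueHolder<Contact> g_ok = { { 42 } };
const std::size_t g_contactSize = sizeof(Contact);
```

The design point is that the requirement comes from what the template does with its parameter, not from the mere fact that it is a template.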
Pay it forward... and wait to be paid back.
Labels: c++, compiler, optimization, pay it forward