Discussing the nuts and bolts of software development

Friday, September 18, 2009

 

Maven Compiler Tips and Tricks

Apache’s Maven is a great tool for managing a build environment: it keeps track of all project dependencies and provides a number of configurable build phases which can add depth to the build process. Building a project from the ground-up with Maven is a sure-fire way to keep it well organized and easy to maintain – but what about adding Maven on to an existing project, or worse, merging a non-Maven project into a project that already relies on Maven?

What follows is a look at some of the lessons I’ve learned from tweaking compiler plug-ins and digging through search results to debug various Maven-related issues. Hopefully it will be useful to the next developer who happens to hit similar issues, and if you have tips of your own be sure to leave a comment.

The maven-compiler-plugin <include> property

When overriding the default maven-compiler-plugin configuration, the <include> tag may be used to force the compiler to include extra files in the build. There are a couple of interesting points to note here:

It is a filter. Many things in Maven expect a path to a directory, but not the <include> tag. If you have some extra Java classes in src/main/java and you pass that directory to <include>, it will fail silently; what you actually want is a pattern such as src/main/java/**/*.java.

It is for the compile-phase only. By default, in addition to using the maven-compiler-plugin during the compile phase, Maven will also use it during the test-compile phase. Most properties will apply to both, but <include> is not one of them; the test-compile phase requires a separate property, <testIncludes>, for any includables it requires[1].

The generate-sources and generate-resources phases

These phases are great for adding source (.java) and resource (.class) files to the compiler before the compile phase occurs. The snippet below shows how to use the Codehaus Mojo build-helper-maven-plugin to add some obscure .class files to the classpath:

<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>build-helper-maven-plugin</artifactId>
  <executions>
    <execution>
      <phase>generate-resources</phase>
      <goals>
        <goal>add-resource</goal>
      </goals>
      <configuration>
        <resources>
          <resource>
            <directory>../obscure/classes</directory>
          </resource>
        </resources>
      </configuration>
    </execution>
  </executions>
</plugin>

Now any .class files in ../obscure/classes (or any subdirectories) will be added to the classpath.

The same plug-in is used for adding .java files: change the generate-resources and add-resource values to generate-sources and add-source, and the <resources> and <resource> properties to <sources> and <source>.

The maven-compiler-plugin <compilerArgument> property

The <compilerArgument> tag may be used to pass command-line arguments directly to the Java compiler. Two examples:

<classpath> seems like it would let you add resources to the classpath, but near as I can tell, these aren’t actually used. Maven seems to prefer managing its own classpath, though we can still add/remove entries by overriding the default generate-resources phase (as explained above).

<sourcepath> is a great way to specify multiple source directories for compilation. It takes a list of top-level directories containing Java files to compile, delimited by semicolons on Windows (colons on Unix-like systems). Alternatively, an override of the generate-sources phase may be used here as well.

One gotcha with regard to sourcepath: to get this to work, I had to manually set <fork> to true on the maven-compiler-plugin. In fact, this gets even worse: when <fork> is true, Maven will use the %PATH% environment variable to determine which JRE to use, and this will fail with a totally nondescript error if the path to your JRE contains any spaces, which is very annoying to track down. This is actually a bug in Maven, logged here (http://jira.codehaus.org/browse/MCOMPILER-30).

The <sourceDirectory> property

A minor but important tag when juggling various sources and resources, the <sourceDirectory> tag may be used to set the base directory for including Java source files. It defaults to src/main/java, but I found that with a lot of sources spread around various directories, it was easiest to set it to the current directory as follows:

<build>
<sourceDirectory>.</sourceDirectory>
{...}
</build>

Other useful debugging hints

When running into problems, it’s always good to have a few debug flags around to get a little more information out of Maven, which is generally not great at telling you what might be wrong.

Specifying the -e flag while running Maven will print out any exceptions Maven encounters, with the corresponding stack trace.

The <verbose> tag may be added inside any plug-in’s <configuration> property and when toggled to true (default is false) it will print some extra information, including the sourcepath and classpath being used by Maven’s compiler.

Specifying the -X flag while running Maven will document all kinds of intermediate steps Maven takes during the build – much more than -e and <verbose>.



[1] This makes perfect sense, of course: it’s unlikely that you’ll want the same includables for both the normal compiler and the testing compiler. It’s just counter-intuitive compared to the rest of the <configuration> properties.



Friday, July 11, 2008

 

Alignment Matters

When do these two snippets behave differently?


void* p = something();
int i = *(int*)p;


int i;
void* p = something();
memcpy(&i, p, sizeof(int));


Answer: when 'something' returns an address that is not suitably aligned for an int (typically, not a multiple of "sizeof(int)").

Some processor architectures only allow loading memory into N-byte registers from memory addresses that are a multiple of N (e.g. multiple of 4 for 32-bit registers). Using misaligned addresses can have various interesting consequences, depending on the CPU; I've heard of at least these:

Last week, I got lucky: I discovered that the MIPS32 CPU on the phone I was programming falls in the "crash" category. (Why lucky? Because an "address load exception" message is much easier to debug than some corrupted data.)

Usually, we don't have to worry about such alignment issues; the compiler and runtime make sure that all objects they allocate go at addresses that have the right alignment for their type (e.g. malloc must return memory "suitably aligned" for all possible types).

Trouble comes when we lie to the compiler, such as telling it by a cast that "p" points to an "int" when such is not the case. This is what happened to me: I was parsing a file, and the 4-byte-value-that-should-be-put-in-an-int followed an arbitrary-length string. It ended up on an odd address, and boom. (Here's another way I was lucky: it COULD have been a nice multiple of 4 in all my tests, only to come out odd on a client's desk)

Functions like memcpy, of course, are required to work with all addresses (as expressed by taking void* parameters, which require no cast).
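Here is a minimal sketch of the memcpy approach applied to the file-parsing situation described above. The names readIntAt and demoUnalignedRead are hypothetical, invented for this illustration, and the 3-character string is just an assumed stand-in for the arbitrary-length string in the real file.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper (not from the original code): reads an int from any
// address, aligned or not, by copying its bytes into a properly aligned local.
int readIntAt(const void* p)
{
    assert(p != nullptr);
    int value;
    std::memcpy(&value, p, sizeof value);
    return value;
}

// Mimics the layout from the post: a 3-character string followed directly
// by the bytes of an int, so the int starts at offset 3 in the buffer.
bool demoUnalignedRead()
{
    unsigned char buffer[3 + sizeof(int)] = { 'a', 'b', 'c' };
    const int expected = 123456;
    std::memcpy(buffer + 3, &expected, sizeof expected);

    // *(int*)(buffer + 3) would be the "lie" to the compiler; readIntAt
    // never dereferences a possibly misaligned int pointer.
    return readIntAt(buffer + 3) == expected;
}
```

The helper costs nothing in readability at the call site, and compilers are generally free to turn the small fixed-size memcpy into whatever load sequence is safe on the target CPU.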

Lesson of the day? Don't lie to your compiler!
(alternate lesson: "casts: evil AND chaotic"?)



Thursday, August 23, 2007

 

Pay it Forward: Part 2

When writing C++ classes and interfaces (e.g. abstract classes), do your compiler a favor and declare each class in its own header file whenever you can. Trust me, it'll pay you back later.

Rule of thumb:
If you ever need to use the would-be nested class without using the nesting class, then your class should not be nested.

To see how nested classes should be used, look at std::string::iterator. It's a nested type that's useless without its nesting class, std::string. (Okay, okay, it could be a nested typedef that aliases a non-nested class. From outside std::string, it looks like a nested class, and that's all that matters.)

Now, here's an example of how not to use your nested types:

AddressBook.h

// Forward type declaration
class Contact; // Remember this?

class AddressBook
{
public:
    // Interface to be implemented by objects wanting AddressBook notifications.
    class IContactEventSink
    {
    public:
        virtual void onContactChanged(Contact const& oldContact, Contact const& newContact) = 0;
    };
    ...
};

With the above implementation, whenever someone wants to receive notifications from the AddressBook, they need to include AddressBook.h itself and derive from AddressBook::IContactEventSink. Doing this has the following downsides:

To fix all of this and to give your compiler a break, declare the IContactEventSink class in its own header and include this header whenever you need the full definition.

IContactEventSink.h

// Forward type declaration
class Contact; // Yup, even here.

class IContactEventSink
{
public:
    virtual void onContactChanged(Contact const& oldContact, Contact const& newContact) = 0;
};

Then, hit "Build", sit back and wait to...

... actually no. You can get on with your life. Your incremental build is now done because you've paid it forward!
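To make the payoff concrete, here is a rough sketch of a listener that depends only on the standalone interface. Everything except IContactEventSink and onContactChanged is hypothetical: the Contact definition is a minimal stand-in, RenameCounter and demoSink are invented for illustration, and the virtual destructor is an addition that is not in the header shown above. The whole thing sits in one file here purely so that it compiles on its own.

```cpp
#include <cassert>
#include <string>

// --- Contact.h (hypothetical minimal definition for this sketch) ---
class Contact
{
public:
    explicit Contact(std::string const& name) : m_name(name) {}
    std::string const& name() const { return m_name; }
private:
    std::string m_name;
};

// --- IContactEventSink.h (as above, plus an assumed virtual destructor) ---
class IContactEventSink
{
public:
    virtual ~IContactEventSink() {}
    virtual void onContactChanged(Contact const& oldContact, Contact const& newContact) = 0;
};

// --- RenameCounter.h (hypothetical listener) ---
// Needs IContactEventSink.h and Contact.h, but never AddressBook.h, so
// edits to AddressBook.h do not force this class to be recompiled.
class RenameCounter : public IContactEventSink
{
public:
    RenameCounter() : m_renames(0) {}

    virtual void onContactChanged(Contact const& oldContact, Contact const& newContact)
    {
        if (oldContact.name() != newContact.name())
            ++m_renames;
    }

    unsigned int renames() const { return m_renames; }

private:
    unsigned int m_renames;
};

// Drives the listener through the interface, the way an AddressBook would.
bool demoSink()
{
    RenameCounter counter;
    assert(counter.renames() == 0);
    IContactEventSink& sink = counter;
    sink.onContactChanged(Contact("Bob"), Contact("Robert")); // a rename
    sink.onContactChanged(Contact("Ann"), Contact("Ann"));    // no rename
    return counter.renames() == 1;
}
```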



Wednesday, May 30, 2007

 

Pay it Forward: Part 1

When writing C++ code, do your compiler a favor and use forward declarations whenever possible: it will pay you back later... and more.

Forward declarations will...

AddressBook.h

//#include "Contact.h" // NO! Do this in the .cpp file instead. Unless you
                       // really need the definition of the class.

// Forward type declaration
class Contact; // YES!

class AddressBook
{
public:
    // ...

    /** Ways you can use the declaration without having the definition **/

    // Member functions input parameters.
    void addContact(unsigned int id, Contact const& newContact);
    void addContact(unsigned int id, Contact const* newContact);

    // Member functions output parameters.
    void getContact(unsigned int id, Contact& contact);
    void getContact(unsigned int id, Contact* contact);

    // These cases are a bit more obscure than the others, but they are legal
    // and still work.
    void addContact(unsigned int id, Contact newContact);
    Contact getContact(unsigned int id);

private:

    // Member variables
    Contact* m_contactPointer;
    Contact& m_contactRef;

    /** These are some cases where the compiler needs the definition. **/

    Contact::Address m_address;
    static const size_t ms_contactSize = sizeof(Contact);
    Contact m_contact;
    Contact m_contactArray[10];

    /** Here are some cases that often work with different STL implementations and
        compilers, but are illegal. (Since C++17, std::vector<Contact> is permitted
        here as long as Contact is complete before any vector member is used.) **/

    std::vector<Contact> m_contacts;
    std::auto_ptr<Contact> m_contactAutoPtr;
};

If the user of this class doesn't use the member functions that require the undefined type, they are not required to include the header that defines it; they only need to do so when they call those member functions.
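As a rough, compilable sketch of that split, the single file below is laid out in the order the three files would be seen. The Contact definition, the lastAdded and count members, and demoForwardDeclaration are all hypothetical and exist only for this illustration; the point is that the AddressBook class definition compiles while Contact is still an incomplete type.

```cpp
#include <cassert>
#include <string>

// --- AddressBook.h: only the name "Contact" is known here ---
class Contact;

class AddressBook
{
public:
    AddressBook() : m_lastAdded(nullptr), m_count(0) {}

    void addContact(unsigned int id, Contact const& newContact); // reference parameter
    Contact lastAdded() const;  // by-value return: legal to declare
    unsigned int count() const { return m_count; }

private:
    Contact const* m_lastAdded; // pointer member: size known without the definition
    unsigned int m_count;
};

// --- Contact.h (hypothetical minimal definition for this sketch) ---
class Contact
{
public:
    explicit Contact(std::string const& name) : m_name(name) {}
    std::string const& name() const { return m_name; }
private:
    std::string m_name;
};

// --- AddressBook.cpp: includes Contact.h, so the type is complete here ---
void AddressBook::addContact(unsigned int /*id*/, Contact const& newContact)
{
    m_lastAdded = &newContact;  // stores the address only; the caller keeps the object alive
    ++m_count;
}

Contact AddressBook::lastAdded() const
{
    assert(m_lastAdded != nullptr); // precondition: at least one contact was added
    return *m_lastAdded;            // copying requires the full definition
}

// A caller that uses lastAdded() needs Contact.h; one that only calls
// count() would not.
bool demoForwardDeclaration()
{
    Contact alice("Alice");
    AddressBook book;
    book.addContact(1, alice);
    return book.count() == 1 && book.lastAdded().name() == "Alice";
}
```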

Here are some rules of thumb that will help you determine whether a full definition is necessary for a certain type:
One small caveat: this technique sacrifices some convenience when the user will almost certainly call the member functions that require the full definition of the type. In that case, whether to use a forward declaration is a judgment call. Otherwise...

Pay it forward... and wait to be paid back.


