The Macadamian Files: May 2008

Friday, May 23, 2008

UMPCs And The Race To The Bottom - Or how much more does a Flash SSD weigh when it's full of data?

Our last Macadamian Barcamp session focused on Ultra Mobiles and Netbook PCs. The whole race to the bottom seems to have started when the One Laptop Per Child project took the initiative of creating an affordable ultra-mobile PC aimed at schoolchildren in developing countries. Asus followed shortly after, and its Eee PC grabbed most of our attention with its slim size and crisp screen.

On top of portability, we found that most UMPCs offer the following advantages:

Bigger screens than smartphones, plus Wi-Fi and now WiMAX support (some Phillips devices already support WiMAX, and Nokia tablets will have WiMAX this summer).
Most can be used as a car GPS: most have GPS built in, and the others can use a Bluetooth or USB GPS receiver.
They're good for note taking, Web browsing and email writing. They're great for common tasks, and while most don't handle Vista very well, they have enough horsepower to run most applications written for desktop PCs.

There's always the power versus battery life trade-off, and battery life is generally slim compared to smartphones. The One Laptop Per Child PC's battery lasts longer, but its intentional Fisher-Price looks and ruggedness make it valuable only for elementary-aged children or front-line soldiers.

The Nokia Internet Tablets look very promising since they seem to have a good balance between battery life, available applications and horsepower.

Now that most computer vendors have embraced the idea of making Asus Eee-like PCs, we can expect UMPCs to follow the same fast-paced evolution as cellphones, and competition should keep their costs low. Both the Eee PC and the OLPC can now run Windows XP. This doesn't make them gain an extra gram, and they might even cost less, but personally, I feel using them without their native OSes just ain't Kosher.

Labels: Barcamp, Netbooks, UMPC

# posted by Jean-Claude Batista @ 12:36 p.m. 13 comments

Thursday, May 22, 2008

Software Metrics : What are they, and what can they do for you

What are software metrics?

This might be a term you have heard before. Simply put, they are a way to quantify aspects of source code.

These aspects vary, and each has a different significance. It will make more sense if you see some examples:

LOC - Lines of code. This metric is simply the line count per file, per project, per function, per class, etc. Its value is that you can determine when code is growing beyond control. The refactoring remedy is to split up the file/class/method into smaller, more manageable parts.
NSM - Number of static methods/fields. Excessive use of the static modifier is an indication that something is amiss. The refactoring remedy may include using a Singleton design pattern, or reworking the class design.
LCC - Lines of comments. This is the total number of lines of comments. When taken in ratio with the lines of code, you get a vague idea of how well documented your code base is. It is a fairly common practice to generate API documentation from source, so this metric can be especially valuable if the project is a library.
VG - Cyclomatic Complexity. This metric is not as complex as the name sounds. It is simply the count of the possible linearly independent paths the code may take. Every IF, FOR, WHILE, DO, CASE and ternary operator increments the count. This can help identify poorly written code (i.e. huge if-else statements), as well as potential slow spots.
LCOM - Lack of Cohesion of Methods. LCOM is a formula for detecting the cohesiveness of a class. It can indicate when it may be time to split a class into two or more specialized classes.
DIT - Depth of inheritance tree. This metric is the number of levels a class is sub-classed. On its own, the value doesn't necessarily indicate a need for refactoring, unless the base class wasn't designed for such a task.
NORM - Number of overridden methods. NORM can be useful for detecting classes that have evolved beyond their original design. It might indicate better abstraction is needed at a low level.
NOM - Number of methods. The number of methods can be useful for detecting when a class has become overgrown.
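To make a few of these metrics concrete, here is a minimal C++ sketch that computes LOC, LCC and an approximation of VG for a C-like snippet. The names SourceMetrics, countWord and measure are hypothetical, and the approach is deliberately naive: it only recognizes // comments, treats the whole snippet as one function, and does not skip keywords inside strings or block comments, which a real metrics tool would handle with a proper parser.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical result type: rough metrics for one source snippet.
struct SourceMetrics {
    int loc;  // LOC: non-blank lines that are not pure "//" comments
    int lcc;  // LCC: lines that start with "//"
    int vg;   // VG: 1 + number of decision points found
};

static bool isIdentChar(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// Count occurrences of `word` that are not part of a longer identifier.
static int countWord(const std::string &line, const std::string &word) {
    int count = 0;
    std::size_t pos = 0;
    while ((pos = line.find(word, pos)) != std::string::npos) {
        std::size_t end = pos + word.size();
        bool startOk = (pos == 0) || !isIdentChar(line[pos - 1]);
        bool endOk = (end == line.size()) || !isIdentChar(line[end]);
        if (startOk && endOk) ++count;
        pos = end;
    }
    return count;
}

SourceMetrics measure(const std::string &source) {
    SourceMetrics m = { 0, 0, 1 };
    // "do" is left out on purpose: the "while" that closes a do-loop
    // is already counted, so each loop adds exactly one decision point.
    static const char *const kDecisions[] = { "if", "for", "while", "case" };
    std::istringstream in(source);
    std::string line;
    while (std::getline(in, line)) {
        std::size_t first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos) continue;        // blank line
        if (line.compare(first, 2, "//") == 0) {         // comment-only line
            ++m.lcc;
            continue;
        }
        ++m.loc;
        for (std::size_t i = 0; i < 4; ++i)
            m.vg += countWord(line, kDecisions[i]);
        // Each '?' is taken to be a ternary operator.
        m.vg += static_cast<int>(std::count(line.begin(), line.end(), '?'));
    }
    return m;
}
```

Dividing lcc by loc gives the comment ratio mentioned under LCC, and the same loop could be extended with counters of your own, such as the number of "#if" lines.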

This is only a partial list of some software metrics that can identify weak spots in your code base. It is also certainly possible to dream up software metrics of your own, such as the number of preprocessor "#if" statements used.

How to harness software metrics

All these numbers are quite meaningless without context. What is needed are tools. An ideal tool should run the software metric calculator on every file that is checked into a software repository. The metrics could then be stored and compared per file, and when a particular metric threshold is crossed, an appropriate action could be triggered, such as sending an email to the project manager or creating a JIRA task to refactor the file. Based on the type of metric, a list of possible refactoring remedies could be supplied to steer the developer onto the right track.

Another desirable feature would be instant red/yellow/green status on projects and individual files, along with a trend showing when refactoring will be needed as metric counts increase. This would be the real power of such a system: being able to see instantly when a project is becoming "design sloppy", and being able to forecast early and schedule time for the issues to be refactored.
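As a rough illustration of the threshold and trend ideas above, the sketch below maps a metric value to a red/yellow/green status and makes a naive linear forecast of how many check-ins remain before the failure threshold is crossed. Threshold, classify and checkinsUntilFail are hypothetical names, and any limits you pass in are your own choice, not recommendations.

```cpp
#include <cstddef>
#include <vector>

enum Status { Green, Yellow, Red };

// Hypothetical per-metric limits: warn at `warn`, fail at `fail`.
struct Threshold {
    int warn;
    int fail;
};

Status classify(int value, const Threshold &t) {
    if (value >= t.fail) return Red;
    if (value >= t.warn) return Yellow;
    return Green;
}

// Naive linear forecast from a metric's history (one value per check-in).
// Returns 0 if the metric is already at or past the fail threshold, and
// -1 if it is not growing or there is too little history to tell.
int checkinsUntilFail(const std::vector<int> &history, const Threshold &t) {
    if (history.empty()) return -1;
    int last = history.back();
    if (last >= t.fail) return 0;
    if (history.size() < 2) return -1;
    int steps = static_cast<int>(history.size()) - 1;
    int growth = last - history.front();   // total growth over `steps` check-ins
    if (growth <= 0) return -1;
    int needed = t.fail - last;
    return (needed * steps + growth - 1) / growth;   // ceiling division
}
```

A check-in hook could call classify for the status light and checkinsUntilFail to decide how soon a refactoring task needs to be scheduled.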

Most managers don't like refactoring. It's unpredictable, and has no immediate beneficial output. Experienced managers know that refactoring early is key to keeping a large code base manageable, and will save them time down the road. Using software metrics, you can take such abstract and intangible notions, and turn them into hard numbers. Those hard numbers can then be used for prediction, and prediction is the key to getting software estimates right. This is the most vital step to get right for a software outsourcing company.

Some useful tools for generating software metrics:

# posted by Mark Kotyk @ 9:35 a.m. 11 comments

Wednesday, May 21, 2008

Transitioning from ASP.Net to JSP/J2EE

Note: This is my first blog post ever - and I call myself a web developer (I'm ashamed of myself... really I am).

There is a subtitle to this post, which I like to call, a little tongue-in-cheek:
half-whip-soy-mocha-latte dev frameworks vs single dev frameworks

Having a very strong ASP.Net and MS development background as well as a significant amount of PHP and Perl programming under my belt, I thought transitioning to JSP/J2EE would be very simple, or at least not much of a challenge. I mean web development and development in general all subscribe to the same metaphors and design patterns, the only difference is the implementation details of the language and environment.

Well, I couldn’t have been more wrong. This transition has been tough. I write this post as a small warning for people who are looking to make the transition from .Net to Java (and not necessarily the other way around) and who might be a little overzealous or overconfident about how easy it will be.

I truly believe that JSP + insert web framework here + J2EE is a good platform for web development, and if you are looking for an argument in favor of one framework vs. the other, look elsewhere. The major challenge is exactly the move from a predominantly single web framework environment (ASP.Net) to a multiple web framework environment (JSP, Struts, JSF, etc.; see a lengthy list of web frameworks for JSP). There is no one rule book when you move to Java. There are many, and no single rule book will tell you how to choose your ‘flavour of the day’ half-whip-soy-mocha-latte framework. Some people will want JSP/Struts/Spring/Hibernate, some will want JSP/Struts, others JSP/Spring, and others JSP/Spring/Struts/JSF, trying to lean on the benefits of each framework. The challenge for an ASP.Net developer here is that these frameworks don’t just work together. You need to spend time figuring out how each of them will communicate and cohabitate in a fashion that will work, and only then can you start developing. This is a huge paradigm shift, as suddenly you need to do some significant setup and tweaking just to get a web dev environment working.

Furthermore, suddenly you have to start thinking about application servers, not just web servers. In the IIS + ASP.Net world, IIS will handle just about everything for you, and application servers are about architecture and scaling up or out using remoting or web services. In the JSP/J2EE world, your application server is NOT your web server. It’s something completely different: a built-in, load-balanced remote application server that can be separated from your web server with moderate ease.

These two major paradigm shifts, once fully integrated into your knowledge, help you take the small shifts that will come along, such as giving up the event-driven model of ASP.Net for whatever MVC model you go with for Java (some will be similar), or having to update a number of XML configuration files to get Struts 2 up and running (and again for any addition you make). Whatever your implementation troubles might be, keep in mind that these two software languages/frameworks/philosophies are very similar in concept but worlds apart in implementation.

Hopefully this quick post will help another ASP.Net developer stuck on a Java project make the transition just a little bit quicker, or keep an ASP.Net developer who is looking at JSP from jumping in too quickly and drowning in the details.

# posted by Tony @ 11:41 a.m. 8 comments

Wednesday, May 14, 2008

Java programmer living in a C++ world

Being a long-time Java programmer, I've become familiar with and fond of many of the features provided by the language. Reflection and runtime type identification add an extra level of power to any programming language. When I returned to C++ development, I found it necessary to fill the void left by the features I had grown accustomed to in the Java world.

Inversion of control in C++

Inversion of control is a powerful design pattern (and/or philosophy) that allows for loosely coupled and highly reusable objects. The main principle is that an owner object is responsible for supplying all the needed information and resources to any object contained within. This includes configuration, initialization, logging, thread pools, databases and any other conceivable resource.

When dealing with inversion of control, it's often desirable to test if a particular object implements a specific interface. An object may implement the Configurable interface, but not the Loggable interface. This is easy enough to do in Java using the "instanceof" keyword or via reflection. In C++, you must do things a bit differently:


class Configuration;

class Configurable
{
public:
virtual void configure(const Configuration &configuration) = 0;
template<class T>
static bool tryConfigure(T *obj,
                       const Configuration &configuration)
{
  // dynamic_cast needs a pointer target type and a polymorphic T
  Configurable *configurable = dynamic_cast<Configurable *>(obj);
  if (configurable)
  {
      configurable->configure(configuration);
      return true;
  }
  return false;
}
};

In this case, an owner object can attempt to configure a child object simply by calling the static method tryConfigure. If the supplied object is of the wrong type, it will simply return false.


Configuration config;
Configurable::tryConfigure( &ownedObjectA, config );

The same can be applied for initializing and de-initializing objects. Note that the static testing method is templated so the object type is not lost when passed in.


class Initializable
{
public:
virtual bool initialize(void) = 0;
virtual bool deinitialize(void) = 0;

template<class T>
static bool tryInit(T *obj)
{
  Initializable *init = dynamic_cast<Initializable *>(obj);
  if (init)
  {
      return init->initialize();
  }
  return false;
}

template<class T>
static bool tryDeinit(T *obj)
{
  Initializable *init = dynamic_cast<Initializable *>(obj);
  if (init)
  {
      return init->deinitialize();
  }
  return false;
}
};
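As a self-contained illustration of the interface test described above, here is a minimal sketch of an owner that offers a configuration to every child and lets dynamic_cast decide which ones accept it. Component, Cache, Clock and configureAll are hypothetical names, and Configuration is reduced to a stand-in struct so the sketch compiles on its own. Note that the cast only works when the static type passed in is polymorphic (has at least one virtual function).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the real Configuration class.
struct Configuration {
    std::string name;
};

// Common polymorphic base so that dynamic_cast can inspect the object.
class Component {
public:
    virtual ~Component() {}
};

class Configurable {
public:
    virtual ~Configurable() {}
    virtual void configure(const Configuration &configuration) = 0;

    // Returns true only if obj also implements Configurable.
    template<class T>
    static bool tryConfigure(T *obj, const Configuration &configuration) {
        Configurable *target = dynamic_cast<Configurable *>(obj);
        if (!target) return false;
        target->configure(configuration);
        return true;
    }
};

// Implements the interface.
class Cache : public Component, public Configurable {
public:
    std::string configuredAs;
    virtual void configure(const Configuration &configuration) {
        configuredAs = configuration.name;
    }
};

// Does not implement the interface.
class Clock : public Component {};

// The owner: offer the configuration to every child and count the takers.
int configureAll(const std::vector<Component *> &children,
                 const Configuration &configuration) {
    int configured = 0;
    for (std::size_t i = 0; i < children.size(); ++i)
        if (Configurable::tryConfigure(children[i], configuration))
            ++configured;
    return configured;
}
```

The owner never needs to know the concrete types it holds, which is what keeps the objects loosely coupled.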

Singletons in C++

Singleton is the most basic of all patterns, but extremely valuable. Care must be taken not to overuse singletons, but in the right place, they are a design gem. In C++ there is a simple template way to reduce the coding overhead of returning a static instance, and also unify the method names to get the instance.

template <class Target> class Singleton
{
public:
  // Virtual destructor
  virtual ~Singleton() {}

  // Get instance as pointer
  inline static Target *ptr(void) { return &(ref()); }

  // Get instance as reference
  inline static Target &ref(void) {
          static Target theInstance;
          return theInstance;
  }
protected:
  Singleton(void) {}                              // Default constructor
};

To inherit from the singleton template, one extra step must be taken so that the derived class's constructor can be called even when it is not public: add the templated version of Singleton as a friend of the class.


class MySingletonClass :  public Singleton<MySingletonClass>
{
friend class Singleton<MySingletonClass>;
};

NOTE: Some compiler optimizations of a static inline method might cause undesirable effects, such as multiple instances of a singleton. You may need to twiddle optimization flags.
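For reference, here is a compact, self-contained sketch of the pattern in use. Counter is a hypothetical class added for illustration. Its constructor is private, so the friend declaration is what allows Singleton<Counter>::ref() to build the one instance.

```cpp
// Same shape as the template above, repeated so the sketch stands alone.
template <class Target> class Singleton
{
public:
    virtual ~Singleton() {}

    // Get instance as pointer
    static Target *ptr() { return &ref(); }

    // Get instance as reference; the instance is created on first use
    static Target &ref() {
        static Target theInstance;
        return theInstance;
    }
protected:
    Singleton() {}
};

// Hypothetical singleton that hands out increasing numbers.
class Counter : public Singleton<Counter>
{
    friend class Singleton<Counter>;   // lets ref() call the private constructor
public:
    int next() { return ++m_value; }
private:
    Counter() : m_value(0) {}
    int m_value;
};
```

Both accessors reach the same object, so Counter::ref().next() followed by Counter::ptr()->next() yields 1 and then 2. Since C++11, the language also guarantees that a function-local static like theInstance is initialized exactly once, even when several threads call ref() at the same time.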

Factories in C++

Factories are responsible for creating objects of a specific interface type. For example, you could have a LoggerFactory that is able to create a PlainTextLogger, an XMLLogger and a SocketLogger. Unlike Java, in C++ it is difficult to dynamically create an instance of a class based on its name alone. By combining a Prototype pattern and a C macro, it is possible to register a class by name, and instantiate it later by name. This is somewhat analogous to querying an interface in Microsoft COM.


#include <string>
#include <map>

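class Prototype
{
public:
// Virtual destructor so the factory can delete through a Prototype pointer
virtual ~Prototype() {}
virtual bool createInstance(void **instance) = 0;
};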

template<class T>
class PrototypeTemplate : public Prototype
{
public:
virtual bool createInstance(void **instance) {
  if (instance)
  {
      *instance =  new T();
      return true;
  }
  return false;
}
};

#define xstr(s) #s
#define PROTOTYPE(x) xstr(x), new PrototypeTemplate<x>()


template<class T>
class Factory
{
public:

bool queryPrototype(const std::string &name, T **instance)
{
  if (name.empty() || (!instance))
      return false;

  if (m_registeredPrototypes.find(name) != m_registeredPrototypes.end())
  {
      return m_registeredPrototypes[name]->createInstance(
          (void **) instance);
  }
  return false;
}

virtual ~Factory(void)
{
  // Release the prototypes
  for (std::map<std::string, Prototype *>::iterator it =
      m_registeredPrototypes.begin();
      it != m_registeredPrototypes.end(); it++)
      delete it->second;
}
protected:

/* Protected by default.  Can re-expose as public if you want
 outside code able to register new prototypes */
void registerPrototype(const std::string &name, Prototype *prototype)
{
  m_registeredPrototypes[name] = prototype;
}

std::map<std::string, Prototype *>   m_registeredPrototypes;
};

A typical implementation may look like this:


#include "Factory.h"
#include "Singleton.h"

#include "Logger.h"
#include "PlainTextLogger.h"
#include "XMLLogger.h"
#include "SocketLogger.h"

namespace Protected {
// This namespace is really just an attempt to hide the base type, since it must
// first be templated as a factory, and then as a singleton.

class LoggerFactory : public Factory<Logger>
{
public:
LoggerFactory(void)
{
  registerPrototype( PROTOTYPE( PlainTextLogger ) );
  registerPrototype( PROTOTYPE( XMLLogger  ) );
  registerPrototype( PROTOTYPE( SocketLogger ) );
}
};
}


class LoggerFactory : public Protected::LoggerFactory,
               public Singleton<LoggerFactory>
{
friend class Singleton<LoggerFactory>;
};

We can now create a logger dynamically by specifying the name of the class as a string. In this manner, the logger could be determined at runtime from a configuration item. It would also be possible to register prototypes at runtime via DLL or shared objects.


Configuration config;
Logger *myLogger = NULL;
std::string loggerTypeName;

if( !config.getValue("logger", loggerTypeName ) ||
loggerTypeName.empty() ||
!LoggerFactory::ref().queryPrototype( loggerTypeName, &myLogger ) )
{
// Not found, default to known existing logger
loggerTypeName = "PlainTextLogger";
LoggerFactory::ref().queryPrototype( loggerTypeName, &myLogger );
}
Configurable::tryConfigure( myLogger, config.getSubConfiguration( loggerTypeName ) );
Initializable::tryInit( myLogger );
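The listings above depend on headers that are not shown, so here is a compact, self-contained sketch of the same create-by-name idea. It is a variation rather than the factory above: it stores plain creator functions instead of Prototype objects, which avoids the void ** cast. NamedFactory, createInstance, makeLogger and the kind() method are hypothetical names, and the logger classes are minimal stand-ins.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical logger hierarchy, standing in for Logger.h and friends.
class Logger {
public:
    virtual ~Logger() {}
    virtual std::string kind() const = 0;
};

class PlainTextLogger : public Logger {
public:
    virtual std::string kind() const { return "text"; }
};

class XMLLogger : public Logger {
public:
    virtual std::string kind() const { return "xml"; }
};

// Creator function: builds a Derived and returns it as a typed Base pointer.
template <class Base, class Derived>
Base *createInstance() { return new Derived(); }

template <class Base>
class NamedFactory {
public:
    typedef Base *(*Creator)();

    // Register Derived under the given name.
    template <class Derived>
    void add(const std::string &name) {
        m_creators[name] = &createInstance<Base, Derived>;
    }

    // Returns a new object owned by the caller, or NULL for an unknown name.
    Base *make(const std::string &name) const {
        typename std::map<std::string, Creator>::const_iterator it =
            m_creators.find(name);
        if (it == m_creators.end())
            return NULL;
        return (it->second)();
    }
private:
    std::map<std::string, Creator> m_creators;
};

// Fall back to a known logger when the requested name is not registered.
Logger *makeLogger(const NamedFactory<Logger> &factory,
                   const std::string &requested) {
    Logger *logger = factory.make(requested);
    return logger ? logger : factory.make("PlainTextLogger");
}
```

After registering with factory.add<PlainTextLogger>("PlainTextLogger") and factory.add<XMLLogger>("XMLLogger"), a name read from configuration selects the logger at runtime, and an unknown name falls back to the plain text logger.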

# posted by Mark Kotyk @ 4:14 p.m. 205 comments

Chmod-me Win32 - A quick look at NTFS file system permissions

Even though NTFS is the de facto standard file system on Windows machines today, the NTFS security model uses a set of concepts that are unfamiliar to most of us, or that seem familiar until we actually use them programmatically.

A while ago, I installed a C++ application I wrote using a local administrator account on a WinXP machine. Later, I tested the application using a restricted user account and got into a situation where a "config.ini" file, copied to the Windows shared application data folder the first time the application was launched, couldn't be modified. I quickly figured that fixing the problem would just be a matter of setting the proper "config.ini" file permissions, since the file, being initially copied in an Administrator security context, wouldn't have the proper permissions to be modified by a restricted user.

In the Unix world, a shell command called 'chmod' sets file access permissions for users and groups. It's simple, easy and effective, and in a C++ program, a single system call is all you need.
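For comparison, here is a minimal sketch of that single call on a POSIX system, using chmod() and stat() from <sys/stat.h>. The helper names setMode and getMode are hypothetical wrappers added for illustration.

```cpp
#include <sys/stat.h>   // chmod(), stat() and mode_t (POSIX)

// Set the permission bits of `path`, e.g. 0640 for rw-r-----.
// Returns true on success; errno holds the reason on failure.
bool setMode(const char *path, mode_t mode) {
    return chmod(path, mode) == 0;
}

// Read back the permission bits of `path`, or -1 if stat() fails.
int getMode(const char *path) {
    struct stat info;
    if (stat(path, &info) != 0)
        return -1;
    return static_cast<int>(info.st_mode & 0777);
}
```

Calling setMode("config.ini", 0640) lets the owner read and write the file, lets the group read it, and gives everyone else no access.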

Now in Windows, the NTFS security model has a much finer grain, so there are several things to consider:

1) In NTFS, permissions are chained in a list. For file permissions, that list is a Discretionary Access Control List (DACL).

2) An Access Control List contains one or more Access Control Entries (ACEs), each of which grants or denies specific permissions. Since a file can inherit permissions from its parent folders (provided the parent folder allows its permissions to be inherited), a file permission can be either granted or denied.

3) An ACE uses Security Identifiers (SIDs) to identify a user or group.
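To make the relationship between these three concepts concrete, here is a toy C++ model of how a DACL is evaluated. It is a simplified sketch, not the Win32 API: SIDs are plain strings, the rights are made-up bit flags, and Ace, Dacl and accessCheck are hypothetical names. It mirrors the documented rule that ACEs are examined in order, that a matching deny entry stops the check, and that rights nobody granted are implicitly denied.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Made-up rights for this model (not the real Win32 access mask values).
const unsigned kRead = 1;
const unsigned kWrite = 2;

enum AceType { AccessAllowed, AccessDenied };

// One Access Control Entry: grants or denies `mask` to the trustee `sid`.
struct Ace {
    AceType type;
    std::string sid;
    unsigned mask;
};

// A DACL is an ordered list of ACEs.
typedef std::vector<Ace> Dacl;

// tokenSids: the SIDs of the user and of every group the user belongs to.
bool accessCheck(const Dacl &dacl,
                 const std::vector<std::string> &tokenSids,
                 unsigned desired) {
    unsigned remaining = desired;   // rights not yet granted
    for (std::size_t i = 0; i < dacl.size() && remaining != 0; ++i) {
        const Ace &ace = dacl[i];
        bool applies = std::find(tokenSids.begin(), tokenSids.end(),
                                 ace.sid) != tokenSids.end();
        if (!applies)
            continue;
        if (ace.type == AccessDenied) {
            if (ace.mask & remaining)
                return false;        // explicit deny of a requested right
        } else {
            remaining &= ~ace.mask;  // these rights are now granted
        }
    }
    return remaining == 0;           // anything left over is implicitly denied
}
```

With a deny-write entry for one group placed ahead of an allow entry for Everyone (S-1-1-0), a member of that group can still read the file but cannot write to it, while an empty DACL grants nothing to anyone.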

To change the permissions of a single file in a Win32 C++ program, you may end up coding something like this:

Notes:
1) For the sake of clarity, only the *Unicode* character set is used in this example.
2) The header files "AclApi.h" and "Sddl.h" are required.
3) "_WIN32_WINNT=0x0500" needs to be added to your project "Preprocessor Definitions" settings.

/*******************************************************************************
*
* FUNCTION      SetFilePermissions
*
* DESCRIPTION   Sets file permissions for a specific file
*
* PARAMETERS    LPCWSTR filename: full pathname of the file to change permissions
*               LPCWSTR username: name (or string SID) of a user or a group
*               int permissions: one or more of the following, combined with |:
*               GENERIC_READ, GENERIC_WRITE, GENERIC_EXECUTE, GENERIC_ALL
*
*               If a permission is omitted and is currently associated
*               with the specified file, it will be removed,
*               unless that permission is inherited.
*
* RETURNS       true if it succeeds, false if it fails.
*               Use GetLastError() to get extended error information.
*
******************************************************************************/
bool SetFilePermissions(LPCWSTR filename, LPCWSTR username, int permissions)
{
SID_IDENTIFIER_AUTHORITY sia = SECURITY_NT_AUTHORITY;
EXPLICIT_ACCESS eAcc;

PSID pSid = NULL;
PACL dacl = NULL;
int lRes = ERROR_SUCCESS;

eAcc.grfAccessMode = GRANT_ACCESS;
eAcc.grfAccessPermissions = permissions;
eAcc.grfInheritance = OBJECT_INHERIT_ACE|CONTAINER_INHERIT_ACE;
eAcc.Trustee.MultipleTrusteeOperation = NO_MULTIPLE_TRUSTEE;
eAcc.Trustee.pMultipleTrustee = NULL;
eAcc.Trustee.TrusteeType = TRUSTEE_IS_WELL_KNOWN_GROUP;

// NOTE: In some cases, you will want to use a "well-known security identifiers"
//       (http://support.microsoft.com/kb/243330) instead of a username or group
//       since SIDs remain the same from one operating system language to another.
if( ConvertStringSidToSid(username, &pSid) )
{
    eAcc.Trustee.TrusteeForm = TRUSTEE_IS_SID;
    eAcc.Trustee.ptstrName = static_cast<LPWSTR>(pSid);
}
else
{
    // Reset the last error since ConvertStringSidToSid() is also used
    // to determine if a username is a SID or not.
    SetLastError(0);
    eAcc.Trustee.TrusteeForm = TRUSTEE_IS_NAME;
    eAcc.Trustee.ptstrName = const_cast<LPWSTR>(username);
}

// Create a DACL
lRes = SetEntriesInAcl(1, &eAcc, NULL, &dacl);
if (lRes == ERROR_SUCCESS)
{
    // Set DACL
    lRes = SetNamedSecurityInfo( const_cast<LPWSTR>(filename), SE_FILE_OBJECT,
        DACL_SECURITY_INFORMATION, NULL, NULL, dacl, NULL);
}

if (pSid != NULL)
    LocalFree((HLOCAL)pSid);

if (dacl != NULL)
    LocalFree((HLOCAL)dacl);

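// SetEntriesInAcl() and SetNamedSecurityInfo() return their error code
// instead of setting it, so publish it for callers that use GetLastError().
if (lRes != ERROR_SUCCESS)
    SetLastError(lRes);

return lRes == ERROR_SUCCESS;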
}

int wmain(int argc, WCHAR* argv[])
{
// As an example, let's allow "Read" and "Write" permissions to the group
// "Everyone" for the file "myconfig.ini". Since the actual name "Everyone"
// depends on the actual operating system language, we'll use its
// matching SID string representation (S-1-1-0) instead.
bool success = SetFilePermissions(
    L"C:/Documents and Settings/All Users/Application Data/myapp/myconfig.ini",
    L"S-1-1-0",  GENERIC_READ | GENERIC_WRITE);

 // For security purposes, it might make more sense to allow only
 // authenticated users ( SID: S-1-5-11 ) instead of the group "Everyone".

return !success; // zero means the program ran successfully
}

The Windows security model isn't trivial, but fortunately some good articles have been published on the subject. The following gives a good overview of permissions precedence and this one provides more details about ACL inheritance. For more in-depth API coverage, please refer to the MSDN documentation.

If you are in a hurry, you can always get away with using the real Microsoft chmod command line equivalent, CACLS.EXE, in a script.

Happy Chmoding - Thanks Mikhail for your input.

Labels: DACL, file access permissions, NTFS, Win32

# posted by Jean-Claude Batista @ 2:17 p.m. 5 comments

Saturday, May 10, 2008

Experience of applying model checking on industrial software

The very exciting ICSE 2008 (International Conference on Software Engineering) is about to begin in beautiful Leipzig, Germany. Luckily, my paper was accepted at the conference and scheduled in the Telecom Experience Track.

The paper is an abstract of my thesis work done at Queen's University, which deals with applying model checking techniques to real industry-scale software.

Model checking has been claimed to provide comprehensive coverage of the system it verifies. However, there have been few critical studies of the application of model checking to industrial-scale software systems by people other than the model checker's own authors. In the paper, I report my experience in applying the SPIN model checker to the validation of the failover protocols of a commercial telecommunication system, WebArrow.

In my experiment, I used the SPIN model checker as the tool, as it is the most heavily optimized model checker in the field. The methodology I used was as follows:
1. Summarize the failover protocol with UML activity diagrams
2. Build Promela (the input language of SPIN model checker) models
3. Check for deadlock
4. Express properties in LTL (Linear Temporal Logic)
5. Verify LTL properties in SPIN
6. Optimize and simplify the model
7. Debug model using counter-examples
8. Record results

The key result of the study is that I found a significant gap between the promise of model checking and the reality. Rather than enjoying fully automated checking of properties against declarative system models, I battled problems of traceability in the model checker. In addition, the time needed to create and check the models was incompatible with the realities of time-to-market demands. Furthermore, the creation of the model itself took a big chunk of effort.

From the results, I can see that there are still areas where model checker authors can improve their tools.

Labels: QA

# posted by Barry Long @ 2:11 p.m. 10 comments

Friday, May 02, 2008

Design Guidelines with Style

I'm not going to go into detail about the benefits of using CSS, which have been covered so many times before. In short, it provides a convenient way to separate presentation logic from actual web markup, it reduces duplication of style-related code, and best of all it's incredibly flexible! Ah, but that flexibility is a bit of a double-edged sword: allowing many ways to do the same thing means similar things will likely be done differently in different places, gradually leading to a mess of files which is a nightmare to maintain.

If you don't believe me, install a tool like FireBug and pull up a favourite large-scale web site. Set a small goal for yourself, like changing the colour of an entire page's background, or changing the bottom margin of all headings across all pages, something that should ideally only be a few lines to fix. Is it really as easy as it sounds? The answer is likely no, and there are many factors that can contribute to this:

Multiple developers working on separate areas of the site, each using their own set of rules for how their CSS should be written.
Years of modifications by different (or even the same!) developers, done with little consistency across the site.
The original developers saw CSS as an afterthought; something you work on after the site "works" that can be done in any number of ad-hoc ways.
The CSS was generated using a WYSIWYG-style editor (such tools are notoriously bad for producing maintainable code).
The original CSS is out of touch. The web changes quickly, and even sites that are only a few years old may be using syntax that has been updated or replaced.

Even well-meaning developers will make mistakes while modifying files that were started by others. If the existing file already does the same thing in several ways, which way should one use? What if the particular file only defines one way of doing it, but other files use a different method? What if the existing way is consistent, but far from optimal? Refactoring one CSS file (let alone several) is a remarkably grueling and time-consuming task. Is this unavoidable? No! It just requires a little planning and a little discipline.

The trick is to set up (and subsequently follow!) some design guidelines to make sure everyone starts, and most importantly stays, on the same page. This is best done at the beginning of a project, but it's never too late to start doing something right. For example, here are some things you may want to set guidelines for:

How specific should selectors be? Is it all right to define a general rule for all instances of <div>, or should each <div> require a class? Should selectors be classes-only, or are ids allowed as well?
Should traversing the DOM, as in "div.classA span.classB a", be allowed? Encouraged? Mandatory whenever possible?
What sort of things are forbidden? This is important to consider for things like wildcards (*), inline-styles and <style> tags, etc.
How should CSS be broken into multiple files? Does each section of the site get its own file (login.css, register.css, menu.css...) or are files broken up by functionality (layout.css, form.css, graphics.css...)?
What level of CSS quality are you looking for? Does all CSS have to validate? What about browser-specific rules that are likely non-standard? (IE's conditional comments can work wonders here)

There is no right or wrong answer to any of those questions. What's important is to give each option some thought, and decide which is best for your particular needs. These are design decisions that involve real trade-offs in maintainability, functionality and resource needs, and just like any other design decisions, they will have a real impact on the outcome of the site. Deciding upon and following through on just a few well-defined guidelines makes these factors clear, and provides an optimal environment for developers to write maintainable code.

Labels: CSS, web development

# posted by Dan M @ 6:27 p.m. 18 comments