Wednesday, January 25, 2006

Fore-warned is fore-armed

Many of the networking or related applications have to marshal/unmarshal packets to/from structures. And its not uncommon to have more than one field within a byte of a packet. So, what's the best way to construct a packet before sending it over the wire? Say, a SCSI CDB (command descriptor block - similar to an SNMP packet) has to be sent to a device target. Lets simplify the whole thing and say that just 1 byte consisting of 8 1-bit flags is to be sent over the wire. What's the best way to represent such a byte? The first solution that comes to anyone's mind would be bitfields. So, let's go ahead with that.

typedef struct CDB_Byte {
  unsigned char a:1;
  unsigned char b:1;
  unsigned char c:1;
  unsigned char d:1;
  unsigned char e:1;
  unsigned char f:1;
  unsigned char g:1;
  unsigned char h:1;
} CDB_Byte;

Now I want to set only the LSB(least significant bit) and the one next to it. Okay, that's simple.

CDB_Byte byte;
byte.h = 1;
byte.g = 1;

But, I have a question. Is this the correct way to do it? The answer is NO. Let me illustrate why this is'nt the correct way with a simple example.

unsigned char c = 0x01; // LSB set
CDB_Byte byte;
std::memcpy(&byte, &c, 1); //copy the bits
// to change the value of 'c' to 0x03
byte.g = 1; // set the bit next to the LSB
std::memcpy(&c, &byte, 1); // copy the bits back
std::cout << (unsigned)c;

Will this print '3'? The answer is 'depends'. The language standard does not say anything about the allocation of bitfields. The compilers are free to implement them as they deem fit. On linux, the gcc compiler allocates the bitfields starting from the LSB. This can be verified - the above piece of code prints '65'.

hgfedcba

However on HP-UX, the aCC compiler allocates bitfields starting from the MSB. The above piece of code prints '3' (as one might have expected in the first place).

abcdefgh

Moral of the story is this: Never use bitfields to construct structures containing sub-byte fields. Always use a void * chunk of memory, extract a byte, and use masks and bitwise operators to set/extract sub-byte fields. At least, if you expect your application to be portable, don't use bitfields! I would say that bitfields should never be used anyway, 'cos there's no guarantee that the same application will run on the same platform if the compiler vendor rolls out a newer version.

Fore-warned is fore-armed!

1 comment:

Arvind said...

Good one... I tried it on HP-UX..it prints correctly... Need to try it on Linux yet.