technovelty

weblog of Ian Wienand

Compare and Swap with PIC

Our Dear Leader Sam Hocevar has previously blogged about PIC and inline ASM. Today I came across a sort of extension to this problem.

Consider the following code, which implements a double word compare and swap using the x86 cmpxchg8b instruction (as a bonus, the lock prefix makes it atomic).

#include <stdio.h>

typedef struct double_word_t {
	int a;
	int b;
} double_word;

/* atomically compare old and mem, if they are the same then copy new
   back to mem */
int compare_and_swap(double_word *mem,
		     double_word old,
		     double_word new) {

	char result;
	__asm__ __volatile__("lock; cmpxchg8b %0; setz %1;"
			     : "=m"(*mem), "=q"(result)
			     : "m"(*mem), "d" (old.b), "a" (old.a),
			       "c" (new.b), "b" (new.a)
			     : "memory");
	return (int)result;
}

int main(void)
{

	double_word w = {.a = 0, .b = 0};
	double_word old = {.a = 17, .b = 42};
	double_word new = {.a = 12, .b = 13};

	/* old != w, therefore nothing happens */
	compare_and_swap(&w, old, new);
	printf("Should fail -> (%d,%d)\n", w.a, w.b);

	/* old == w, therefore w = new */
	old.a = 0; old.b = 0;
	compare_and_swap(&w, old, new);
	printf("Should work  -> (%d,%d)\n", w.a, w.b);

	return 0;
}

This type of CAS can be used to implement lock-free algorithms (I've previously blogged about that sort of thing).

The problem with compare_and_swap() is that cmpxchg8b expects the low half of the new value in the ebx register; in pseudo code the instruction does:

if(EDX:EAX == Destination) {
    ZF = 1;
    Destination = ECX:EBX;
}
else {
    ZF = 0;
    EDX:EAX = Destination;
}

On 32-bit x86, PIC code reserves ebx to hold the GOT pointer, so if you try to compile that with -fPIC, gcc gives an error about not being able to allocate ebx.

A first attempt to create a PIC-friendly version would simply save and restore ebx and not tell gcc anything about it, something like:

  __asm__ __volatile__("pushl %%ebx;"   /* save ebx, used for the PIC GOT pointer */
		       "movl %6,%%ebx;" /* move new.a (operand %6) into ebx */
		       "lock; cmpxchg8b %0; setz %1;"
		       "popl %%ebx;"    /* restore ebx */
		       : "=m"(*mem), "=q"(result)
		       : "m"(*mem), "d" (old.b), "a" (old.a),
		         "c" (new.b), "m" (new.a) : "memory");

Unfortunately, this isn't a generic solution. It works fine in the PIC case, because gcc will not allocate ebx for anything else. But in the non-PIC case, gcc is free to use ebx to hold the address of mem for the %0 operand. The movl would then overwrite that address with new.a before cmpxchg8b dereferences it, which would make for a fairly tricky bug to track down!

The solution is to test the __PIC__ macro with an #if directive: either tell gcc that you're clobbering ebx in the non-PIC case, or just keep two versions around, one that saves and restores ebx for PIC and one that doesn't.

posted at: Fri, 01 Feb 2008 21:03 | in /code/arch

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.