Hacker News
What is the safest way to set all bits of a variable to true in C? (stackoverflow.com)
33 points by niyazpk on Oct 11, 2010 | hide | past | favorite | 28 comments


Personally I'd use: unsigned int a = 0xFFFF, not unsigned int a = -1.

This is a classic case of a readability issue. 0xFFFF is, in my mind, much clearer on your intention than -1. The only problem is that you're assuming a specific int size, but really, if you're working with bits, chances are good that you're working on a platform where you know the architecture size (at least on embedded platforms).


The problem is that you can't be sure how many bytes an int is on any given platform. So your way is more readable, but not portable.


Crap. This is why you don't respond without fully reading a comment. Where is my delete option? :-)


I actually work with bits a lot without knowing what the type is at all: enter templates, exit simplicity. My answer is to use C++'s ~static_cast<Type_>(0) or C's ~((Type_) 0).


The title said safest: use -1 and leave a very short comment.


The <limits.h> header (ISO C99) defines UINT_MAX as the maximum value for a variable of type unsigned int, which is 4294967295 (0xffffffff).


Sigh:

"The contents of the header <limits.h> are given below, in alphabetical order. The minimum magnitudes shown shall be replaced by implementation-defined magnitudes with the same sign."

  #define UINT_MAX 65535

Key here is implementation-defined, with a minimum of 65535.


Platforms differ, and limits.h reflects this; that's the point. We should use the abstractions (the #defines) of the standard library, i.e. refer to these values by their names — that's the way to write portable software.


I always thought that was set by the vendor correctly. Oh well...

I always liked the definition of true in Forth (all bits set to 1). It really made it a lot easier.


The standard doesn't require that an int use all of the bits of storage it takes. A hypothetical 33-bit machine may present a C environment where ints are 32-bits, with the extra bit unused (and unset).


How many one's complement machines exist?


The most well-known are 30-40 years old by now and predate ANSI C (=c89), some predate even K&R C. They also have 18- or 36-bit words and other oddities.

Compare that to non-ASCII systems (e.g. AS/400), which are still much in use now and probably have a sizable bit of C/C++ programs running on them (besides COBOL and Java).

If you're programming for one of the more common platforms (i.e., x86, x64, ARM, PowerPC, 68k, MIPS, SPARC, VAX, 8080/Z80, 6502), you'll be safe to assume that ((unsigned)-1), ((unsigned)~0) and ~((unsigned)0) are all the same.


It's been quite a while since I heard VAX called one of the "more common" platforms. I'm too young. The only one I ever saw was in an ASU computer lab, where we wrote our assembly language assignments on it (68HC11 assembly via some external test board... I don't even know what was in the VAX).


In C99, integer types can have padding bits that may not be writable, and writing all ones can be a trap representation (except for the case of unsigned char). So, I would guess that the portable way to do this requires taking the address of the variable, casting to (unsigned char *), and writing sizeof(var) all-ones unsigned char patterns (however that has to be done). I am not a C expert, though, so feel free to correct me.


But but but, this doesn't answer the question!? It explicitly acknowledges that -1 will not always set all bits to one, yet it recommends it!

That makes me very surprised by (1) the number of up-votes, and (2) the green "check" mark of approval.


But but but, you apparently didn't understand the answer. It doesn't matter what the representation of -1 is. The C standard defines the conversion of a negative number to unsigned int as the number modulo (UINT_MAX + 1). By definition,

  unsigned int foo = -1;
will set foo to 0xFFFF..., automatically setting all bits to 1 regardless of the number of bits in int types and without respect to the representation of negative numbers.


EDIT : OK, just got it: I got the logic backwards: first, -1 is converted to uint. Second, -1 uint means UINT_MAX. Third, the binary representation of UINT_MAX is all 1s. The way I previously understood it, the -1 would be a signed integer which has some binary representation, and that binary representation would become the uint.

Weird bit of arcana. Below is my mistaken comment. (Notice that I pretended that UINT_MAX is not all 1s, which is silly. I suppose I made that mistake because I "couldn't be wrong" or something.)

As far as I know, your definition can't be inferred from the C standard. The answer itself acknowledges that -1 doesn't yield 0xFFFF… on every platform. The only guarantee is that it will yield UINT_MAX, which is not what was asked.

Otherwise, that would mean that C basically mandates a two's complement representation. Does it?


The next question: Why?

(To clarify: I mean, "why do you care what the bits are set to".)


I'm not sure what you mean.

If you meant "why worry about bits, you should be dealing with values", then there are plenty of cases where that isn't true. Lots of programs (e.g. embedded programs) have to deal with actual bits, not with the values themselves. Just as an example, flags.

(I don't know if this is what you meant, so apologies if I misunderstood your question.)


So why not have a type that's "bitfield64" or something, and not worry how the machine represents your integer.

This seems like too much abstraction for a C programmer, I know, but there is already precedent. int and int * are not the same type; if you use one as the other the compiler will tell you not to, even though they are the exact same bits in memory.


That's a great idea, actually. In the projects I worked on, there were usually typedefs of various sizes, for example byte -> unsigned char, word -> unsigned int, dword -> unsigned long, etc.

We made sure that each one of these had the correct bit amounts in the mapping, and that way, you always knew exactly how many bits you were working with, which is important in embedded systems (where memory is important, and where you usually have structures which directly map to e.g. ip headers, so you need exact sizes).

By the way, even the words byte/word/dword might cause confusion, because none of them is well defined either. Some architectures assign a word 16 bits, some 32 bits. And believe it or not, some architectures even assign bytes a number of bits other than 8! The "officially correct" term, I believe, is octet, which is defined as 8 bits. Of course, we just decided internally what we meant by byte, word and dword, and that worked fine.


Also bitmaps. There are a lot of bitmaps at the system level denoting, for example, used and unused i-numbers and blocks. (Though it is easier — and more common — to initialize bitmaps to 0 than to initialize bitmaps to 1.)



When converting a signed value to an unsigned value (and if the value cannot be represented with the unsigned type), the standard says: Add UINT_MAX + 1 until you get a valid unsigned value. Adding UINT_MAX + 1 to -1 gives you UINT_MAX.


Because of two's complement and the way implicit conversions work.


It is independent of the representation of negative numbers on the machine, because of the conversion rules.


Because of two's complement, "-1" as a signed int/long is internally represented with all bits set to 1.

Conversion rules are pretty basic, i.e. there is no conversion done on the actual value, so that's why it works: converting "-1" to an unsigned int yields UINT_MAX.


What I mean is that the conversion:

  unsigned int flags = -1;
always works, regardless of whether negative numbers are represented in two's complement, one's complement, or sign/magnitude on the underlying hardware.



