An Introduction to Bit Operations, Masking and Filtering
Bit operations are daunting on newcomers to programming. In reality, they are quite simple, all it takes is a basic level understanding on how bits work and what the operators are and what they do. I will try to keep this article in not language specific, but any example code will be written in Java. However the concepts apply to all languages.
Operations
Shift Operations
Shifting in bit operations involves moving over all the bits in a value to different positions. This may sound confusing, but think of it like multiplying or dividing by 10 in base 10.
Left Shift
So lets work in the system we are more used to, base ten, for now. Like I said above, think of shifting as simply moving the existing bits over a certain amount of places. A left shift is like multiplying by 10n in where n is the number of places you want to shift. Let's look at an example. Here, the operator for shifting is going to be <<
, but in different calculators or programming languages you might see it a little bit differently.
Let's say you wanted to shift the number 5 by 1 in base 10, remember that the values here are all pretending that all these operations are done in base 10, I will get to the binary examples next. To do this with multiplication, you would simply multiply 5 * 10
:
05 * 10 = 50
You can see that multiplying by 10 caused the 5 to move from the first decimal position to the second. This is what shifting does. Let's see this using the shifting operator.
5 << 1 = 50
So this is a very basic example, let's say we want to shift 00123 over to 12300, this would involve shifting over two positions. In multiplication, the process would look like this:
00123 * 10^2 = 12300
Notice how all the digits were shifted over two positions, the left shift operator does the same thing. This is how that same operation would look like using the operator:
00123 << 2 = 12300
Note that it is very uncommon to actually see shifting done in base 10, I am simply showing the concept of how it works. Shifting in calculators and programming languages is going to be done in binary.
Speaking of binary, lets move on to it! So like in base 10, shifting in binary is like multiplying the value by 2n where n is the amount of places you want to shift. So let's say we wanted to shift the number 3 over 4 spaces, using our multiplication method, the process would look like this:
3 * 2^4 = 48
And using the operator, the process looks like this:
3 << 4 = 48
Now lets look at the binary representation for 3 and 48.
3 = 0000 0011
48 = 0011 0000
As you can see, the bits were shifted over 4 spaces to the left.
This is all fine and well, but let's consider one more thing before we move on. In computing, there aren't an unlimited amount of bits for a given value. Be mindful of what data type you are using, if you shift over too far, you might get a very negative number or just an error.
Right Shift
Right shifting is the same as left shifting, but in the other direction. Instead of multiplying, it's dividing. Let's consider base 10 first. A right shift would be result of dividing by 10n where n is the number of places we shift. Keep in mind that these operations are done with whole numbers, so the remainder will just be ignored. Let's suppose we we did a right shift on the number 123 by one space. By my definition, and ignoring remainders, that would look like this:
0123 / 10 = 0012
Now using the right shift operator, >>
, let's look at the same process:
0123 >> 1 = 0012
So all the digits were shifted to the right by one space, and the digit that was in the first position was knocked off. Any digits that would go past the right most digit will be knocked off during a right shift. Let's take 123 and shift it by 2 to illustrate this
0123 >> 2 = 0123 / 100 = 1
That's the basics, let's take a look at the binary implementation. So like left shifting is multiplying 2n, right shifting is dividing by 2n, where n is the number of places to shift to the right. Let's use division to shift 6 over by 1, 2, and 3 places. Remember that because we are only allowing for whole number results, the remainders of the divisions are simply removed.
6 / 2^1 = 3
6 / 2^2 = 1
6 / 2^3 = 0
And using the right shift operator, the above operations look like this:
6 >> 1 = 3
6 >> 2 = 1
6 >> 3 = 0
Let's look at the binary values of these numbers.
6 = 0110
3 = 0011
1 = 0001
0 = 0000
You can see that in the first operation, the bits were shifted by one, and no one's were lost. In the second operation, the bits were shifted by two, and one bit was lost. In the last operation, all the bits were lost because they were shifted by 3. The behavior is much more constant than left shifting because this is not affected by the size of the number value.
That is shifting in a nutshell, obviously, there's more too it, but this is more than enough when dealing with masking and filtering. Note that left shift is going to be the operation that is used in those processes.
And Operation
For this operation, and the other two below, the best way to demonstrate them is with examples. The and operation takes two values, and creates a new value that only consists of the bits that the two values had in common. If the bits had no common values, then the operation would return 0. The operator for this operation here will be &
, however the operator changes from language to language as well. So let's just look an example to see how it works. We aren't going to consider the actual decimal values of the numbers because it does not matter, but feel free to convert them to decimal if you'd like.
1100 1000 & 1011 1000 =
1100 1000
& 1011 1000
= 1000 1000
Looking at the columns, you can see that the result only had a one in the places were the two operands both had a one. So if only one, or neither of the operands have a 1 in a given spot, the result will have a zero in that spot. If two values have no common bits, the result will be zero. Here's an example:
1100 1000
& 0010 0110
= 0000 0000
This concept is very important when using bit operations for masking and filtering.
Or Operation
The or operation is slightly different from the and operation. It takes two values and returns a value where all the bits that were 1 in either of the two values remains a 1. Think of it as combining the two values (not addition). Here, the operator will be |
. Let's jump right into an example to visualize this:
1100 1000 | 1011 1000 =
1100 1000
| 1011 1000
= 1111 1000
Again, looking at the columns, you can see that the result only had a zero in the places where both the operands had a zero. If at least one of the operands have a 1 in a given spot, the result will have a one in that spot. If the two values have no uncommon bits, the result will be equal to the larger operand, here's an example:
1011 1001
| 0001 1000
= 1011 1001
Here, that the result and the first operand are equal.
The or operation is also a crucial part of both masking and filtering.
P.S. There is also an xor, or exclusive or, operation that is very similar to or, but it isn't used in masking or filtering so I will not show it here.
Not Operation
The last operation that we will be looking at is the not operation. Not is an interesting operator because it only has one operand. Conceptually, it simply inverts a value. This means that all the 0's will be 1's in the result, and all the 1's will be 0's. The not operator changes all the time, more so than the others, but here we will use ~
. Here's an example of the operation:
~ 1100 1000
= 0011 0111
Looking at the columns, you can see that all the spots that were 0 in the operand are 1 in the result, and all the spots that were 1 in the operand were 0 in the result. Unlike the and operation and the or operation, you need to worry about the size of your value, because leading zero's also become a 1. For example, if we are working with a byte, which has 8 bits and do something like ~3
, the result would be this:
~ 0000 0011
= 1111 1100
You even have to know whether your value is signed or not, because if it is, the value will be negative. However, the decimal value of the result, again, is not important when dealing with masking and filtering.
Let's have some fun and combine not with and to remove some bits from a number.
1011 1001 & ~(0001 1000) = ?
~ 0001 1000
= 1110 0111
1011 1001
& 1110 0111
= 1010 0001
Note the result only removed the bits that we wanted to remove, and kept the others. This process is where we will use the not operator in filtering.
Now that you understand those operations, let's finally move on the masking and filtering!
Masking
So masking is basically a way you can use bits to determine whatever using only an int value. That was a bad explanation, so let's use an example. Let's suppose that you are making a method that checks if a value is in an interval, and you want to make it configurable of whether or not that interval is inclusive on the two endpoints. You could use two booleans in the parameters, but that can sometimes be hard to follow, and will get difficult to track when you have methods that have a lot of these flags to check. So first, let's define some constants. As I said above, the code here will be in java but the concepts will carry to any language. Anyways, here are the constants
public static final byte EXCLUSIVE = 0; public static final byte INCLUDE_LEFT_ENDPOINT = 1 << 0; public static final byte INCLUDE_RIGHT_ENDPOINT = 1 << 1;
These are what are called single bit masks, or flags. They are called that because they only have one bit that is on. Their decimal values will be powers of two, so they could be defined as 1 and 2, but as you continue to add more masks, you will fill your code with large powers of two that just become difficult to visualize. Notice how the EXCLUSIVE
constant is set to 0. This is the "neither" case, it is useful to be set 0 because it contains no bit that is on. One last note is that the variables are bytes because we don't need that many masks. With a byte as our type, we can only have 8 single bit masks, and 9 unique masks including the "nether" mask. You can have some helper masks that combine two single bit ones, but that is not necessary. Anyways, let's put these masks to work. First, let's look at how to check if a given mask contains a flag. The process is simple, the basic structure will look like this:
byte flag = 1 << n;
byte mask = someInt; // The actual values are irrelevant
if((mask & flag) != 0) {
// The mask has the flag
} else {
// The mask does not have the flag
}
Because remember, if two values have no bits in common, the and operation will result in 0. So let's write our method. Here it is:
public static boolean isOnInterval(int n, int min, int max, int options) { boolean greaterThanMin; if((INCLUDE_LEFT_ENDPOINT & options) != 0) { greaterThanMin = n >= min; } else { greaterThanMin = n > min; } boolean lessThanMax; if((INCLUDE_RIGHT_ENDPOINT & options) != 0) { lessThanMax = n <= max; } else { lessThanMax = n < max; } return greaterThanMin && lessThanMax; }
You may be wondering why EXCLUSIVE
is not actually used, this is because it will always be equal to zero when you use the and operator with it. It is simply a helper constant to make the code a bit more readable. Note that a constant could have been made that would have been public static byte INCLUSIVE = INCLUDE_LEFT_ENDPOINT | INCLUDE_RIGHT_ENDPOINT
since it is a common case, but this does not illustrate how flexible masking can be when calling the method, so I opted not to make that. And finally, let's make a helper that tests these.
public static void testInterval(int n, int min, int max) { System.out.printf("%d on interval (%d, %d): %b\n", n, min, max, isOnInterval(n, min, max, EXCLUSIVE)); System.out.printf("%d on interval [%d, %d): %b\n", n, min, max, isOnInterval(n, min, max, INCLUDE_LEFT_ENDPOINT)); System.out.printf("%d on interval (%d, %d]: %b\n", n, min, max, isOnInterval(n, min, max, INCLUDE_RIGHT_ENDPOINT)); System.out.printf("%d on interval [%d, %d]: %b\n", n, min, max, isOnInterval(n, min, max, INCLUDE_LEFT_ENDPOINT | INCLUDE_RIGHT_ENDPOINT)); System.out.println(); }
See how the flags are used in the options mask. The flags that we want activated are joined by the or operator, meaning we want the method to include both. In a more complicated case, there can be a series of tasks that has some steps that are optional, and all those steps will be flags, and in the method call, the desired steps will be combined with or.
Now finally let's look at some test cases. We'll use the following test cases:
- n is greater than min, and less than max
- n is equal to min
- n is equal to max
- n is less than min
- n is greater than max
Here are those cases in code:
testInterval(3, 2, 5); testInterval(2, 2, 5); testInterval(5, 2, 5); testInterval(1, 2, 5); testInterval(6, 2, 5);
That code will have the following output.
3 on interval (2, 5): true 3 on interval [2, 5): true 3 on interval (2, 5]: true 3 on interval [2, 5]: true 2 on interval (2, 5): false 2 on interval [2, 5): true 2 on interval (2, 5]: false 2 on interval [2, 5]: true 5 on interval (2, 5): false 5 on interval [2, 5): false 5 on interval (2, 5]: true 5 on interval [2, 5]: true 1 on interval (2, 5): false 1 on interval [2, 5): false 1 on interval (2, 5]: false 1 on interval [2, 5]: false 6 on interval (2, 5): false 6 on interval [2, 5): false 6 on interval (2, 5]: false 6 on interval [2, 5]: false
Looks like it works! The source for this example is on the gist for this article here: https://gist.github.com/JoseRivas1998/679f46003d1bc05d9023e6af86724c3f
Filtering
Suppose you have a system where objects can interact with each other in some way. However, you also want to categorize objects and make certain objects only able to interact with objects of certain categories. This is where filtering comes in. Filtering requires an object to have two properties, a category mask (which should be a flag) and a filter mask. The easiest way to do this is to have a separate Filter class that an object has a reference to. Let's make that class now.
class Filter { public int category = 1 << 0; public int maskBits = ~0; }
Let's take a look at those values before moving on. Category is the flag that defines which category a filter is a part of, and maskBits is the mask of categories that an object using the filter can interact with. Let's look at the default values. So it's good to set the default category of a filtering system to 1, it's just standard and simple. The default of the maskBits variable is ~0
, (which evaluates to -1
in Java). The reason for this is because that means that the mask consists of all 1's, which means that any nonzero mask will match with the default filter. I used an int in this example, allowing for 31 other categories other than the default. More categories can be allowed by using a long
instead, and can also be limited by using a smaller data type like short
or byte
. Next is to create an object that is interact-able. That processes is also rather simple, here is that class:
class InteractingObject { public static final int TYPE_A = 1 << 1; public static final int TYPE_B = 1 << 2; public static final int TYPE_C = 1 << 3; public static final int TYPE_D = 1 << 4; public static final int TYPE_E = 1 << 5; public Filter filter; public char data; public InteractingObject(char data) { this.data = data; this.filter = new Filter(); } }
So the object has a filter, and data that is used to represent the object. The constants are the available categories, note that they are all unique flags. So let's make our method that makes two of these interact, here it is
public void interact(InteractingObject o1, InteractingObject o2) { Filter f1 = o1.filter; Filter f2 = o2.filter; if(((f1.category & f2.maskBits) != 0) && ((f2.category & f1.maskBits) != 0)) { System.out.printf("%s and %s interacted.\n", o1.data, o2.data); } else { System.out.printf("%s and %s did not interact.\n", o1.data, o2.data); } }
So the two objects must be able to interact with each other for this to work, however this can be modified by replacing the &&
with ||
in the if expression. Let's look at some test cases and make them.
a => type: A, interacts with: B
b => type: B, interacts with: A
c => type: C, interacts with: A, B, C
d => type: D, interacts with: D
e => type: e, interacts with: NONE
f => default
Here is the full tester:
public class BitFilteringTester { InteractingObject a = new InteractingObject('a'); InteractingObject b = new InteractingObject('b'); InteractingObject c = new InteractingObject('c'); InteractingObject d = new InteractingObject('d'); InteractingObject e = new InteractingObject('e'); InteractingObject f = new InteractingObject('f'); public BitFilteringTester() { a.filter.category = InteractingObject.TYPE_A; a.filter.maskBits = InteractingObject.TYPE_B; b.filter.category = InteractingObject.TYPE_B; b.filter.maskBits = InteractingObject.TYPE_A; c.filter.category = InteractingObject.TYPE_C; c.filter.maskBits = InteractingObject.TYPE_A | InteractingObject.TYPE_B | InteractingObject.TYPE_C; d.filter.category = InteractingObject.TYPE_D; d.filter.maskBits = InteractingObject.TYPE_D; e.filter.category = InteractingObject.TYPE_E; e.filter.maskBits = 0; } public void runTest() { interactAll(a); interactAll(b); interactAll(c); interactAll(d); interactAll(e); interactAll(f); } public void interactAll(InteractingObject o) { interact(o, a); interact(o, b); interact(o, c); interact(o, d); interact(o, e); interact(o, f); System.out.println(); } public void interact(InteractingObject o1, InteractingObject o2) { Filter f1 = o1.filter; Filter f2 = o2.filter; if(((f1.category & f2.maskBits) != 0) && ((f2.category & f1.maskBits) != 0)) { System.out.printf("%s and %s interacted.\n", o1.data, o2.data); } else { System.out.printf("%s and %s did not interact.\n", o1.data, o2.data); } } public static void main(String[] args) { BitFilteringTester tester = new BitFilteringTester(); tester.runTest(); }
}
This is the output:
a and a did not interact. a and b interacted. a and c did not interact. a and d did not interact. a and e did not interact. a and f did not interact. b and a interacted. b and b did not interact. b and c did not interact. b and d did not interact. b and e did not interact. b and f did not interact. c and a did not interact. c and b did not interact. c and c interacted. c and d did not interact. c and e did not interact. c and f did not interact. d and a did not interact. d and b did not interact. d and c did not interact. d and d interacted. d and e did not interact. d and f did not interact. e and a did not interact. e and b did not interact. e and c did not interact. e and d did not interact. e and e did not interact. e and f did not interact. f and a did not interact. f and b did not interact. f and c did not interact. f and d did not interact. f and e did not interact. f and f interacted.
This pretty much worked as expected. The source for this can be found here: https://gist.github.com/JoseRivas1998/679f46003d1bc05d9023e6af86724c3f
Note that filtering uses masking. You may be wondering, why is filtering useful in the first place? Well I'll give you an example. In game development, you may not want all physics bodies to interact with each other. Suppose you have a player, enemy, pick up, and ground category. The player can interact with enemies, pick ups and the ground, but the enemies can only interact with the player and the ground. That's a basic example on how useful this can be. In fact, the reason I wrote this was because I was learning Box2D for libGDX. Hope you found this helpful!