jeudi 2 mars 2017

Switch with tons of cases vs single if

Let's say we need to go through a 128-element array of uint8 and compare neighbour elements and put the result to another array. The code below is the most simple and readable way to solve this problem.

for (i = 1; i < 128; i++)
  if (arr[i] < arr[i-1] + 64) //don't care about overflow
    arr2[i] = 1;

It looks like this code 1) will not use branch table. And as far as I know, a cpu doesn't read just 1 byte, it actually reads 8 bytes (assuming a 64bit machine), and that 2) makes cpu do some extra work.

So here comes another approach. Read 2 (or 4 or 8) bytes at a time and create an extremely huge switch (2^16, 2^32 or 2^64 cases respectively), which has every possible combination of bytes in our array. Does this make any sense?

For this discussion let's assume the following: 1) Our main priority is speed 2) Next is RAM consumption. We don't care about the size of the executable (unless they somehow affect speed or RAM)

Aucun commentaire:

Enregistrer un commentaire