vendredi 22 janvier 2021

Which is faster in a tight loop ? [swich-cast, if-else, goto-label]

I have a function: f(param), which does a single calculation based on input (param). This function supposed to be called something around 1 million (maximum) in a tight loop:

for (std::uint_fast64_t i = 0; i < 1'000'000; ++i) {
  f (param);
}

First Question:

My (first) question is what is the most effective way to write the condition section (to what to do based on param) of the f() function. I tried some options that I know:

if-else if:

static int f_3(int const param) {
  if (A::_1 == param) {
    return A::_1 * param;
  } else if (A::_2 == param) {
    return A::_2 * param;
  } // ... 
}

switch-case:

static int f(int const param) {
  switch (param) {
    case A::_1:
      return param * A::_1;
    // ...
  }
}

goto-label:

static int f_2(int const param) {
  void constexpr* const  _table[] = {
    && L1, // ... 
  };

  goto* _table[param];

L1:
  return A::_1 * param;
// ...
}

And I benchmarked it:

  • The complete code.
  • Compiler: g++ (GCC) 10.2.0
  • Compiler options: -O3 -std=c++17 -lboost_timer
  • OS: ArchLinux 5.10.8-arch1-1
int main(int argc, char* argv[]) {
  using ufast_t = std::uint_fast64_t;

  ufast_t constexpr n = 1'000'000'000ULL;
  ufast_t sum = 0;

  {
    std::cout << "switch-case :";
    boost::timer::auto_cpu_timer _f;
    for (ufast_t i = 0; i < n; ++i) {
      sum = + f(argc);
    }
  }
  std::cout << "---------------------------" << std::endl;
  {
    std::cout << "goto        :";
    boost::timer::auto_cpu_timer _f2;
    for (ufast_t i = 0; i < n; ++i) {
      sum = + f_2(argc);
    }
  }
  std::cout << "---------------------------" << std::endl;
  {
    std::cout << "if-elseif   :";
    boost::timer::auto_cpu_timer _f3;
    for (ufast_t i = 0; i < n; ++i) {
      sum = + f_3(argc);
    }
  }
  std::cout << std::endl;

  return sum;
}

But it has (i think) a strange output:

 > ./a.out 1 2 3 4 5 6 7
switch-case : 0.000002s wall, 0.000000s user + 0.000000s system = 0.000000s CPU (n/a%)
---------------------------
goto        : 0.000001s wall, 0.000000s user + 0.000000s system = 0.000000s CPU (n/a%)
---------------------------
if-elseif   : 1.030610s wall, 1.020000s user + 0.000000s system = 1.020000s CPU (99.0%)

 > echo $?
49

So is there any (effective) way to do that? or I just have those options in my hand and I have to select one of them based on some benchmarks?

Second Question:

Why switch-case and goto in the above benchmark doesn't have any overhead? It's Undefined behavior code? or somehow the compiler removes their loops in the optimization stage?

Aucun commentaire:

Enregistrer un commentaire