I have a function: f(param), which does a single calculation based on input (param). This function supposed to be called something around 1 million (maximum) in a tight loop:
for (std::uint_fast64_t i = 0; i < 1'000'000; ++i) {
f (param);
}
First Question:
My (first) question is what is the most effective way to write the condition section (to what to do based on param) of the f() function. I tried some options that I know:
if-else if:
static int f_3(int const param) {
if (A::_1 == param) {
return A::_1 * param;
} else if (A::_2 == param) {
return A::_2 * param;
} // ...
}
switch-case:
static int f(int const param) {
switch (param) {
case A::_1:
return param * A::_1;
// ...
}
}
goto-label:
static int f_2(int const param) {
void constexpr* const _table[] = {
&& L1, // ...
};
goto* _table[param];
L1:
return A::_1 * param;
// ...
}
And I benchmarked it:
- The complete code.
- Compiler: g++ (GCC) 10.2.0
- Compiler options: -O3 -std=c++17 -lboost_timer
- OS: ArchLinux 5.10.8-arch1-1
int main(int argc, char* argv[]) {
using ufast_t = std::uint_fast64_t;
ufast_t constexpr n = 1'000'000'000ULL;
ufast_t sum = 0;
{
std::cout << "switch-case :";
boost::timer::auto_cpu_timer _f;
for (ufast_t i = 0; i < n; ++i) {
sum = + f(argc);
}
}
std::cout << "---------------------------" << std::endl;
{
std::cout << "goto :";
boost::timer::auto_cpu_timer _f2;
for (ufast_t i = 0; i < n; ++i) {
sum = + f_2(argc);
}
}
std::cout << "---------------------------" << std::endl;
{
std::cout << "if-elseif :";
boost::timer::auto_cpu_timer _f3;
for (ufast_t i = 0; i < n; ++i) {
sum = + f_3(argc);
}
}
std::cout << std::endl;
return sum;
}
But it has (i think) a strange output:
> ./a.out 1 2 3 4 5 6 7
switch-case : 0.000002s wall, 0.000000s user + 0.000000s system = 0.000000s CPU (n/a%)
---------------------------
goto : 0.000001s wall, 0.000000s user + 0.000000s system = 0.000000s CPU (n/a%)
---------------------------
if-elseif : 1.030610s wall, 1.020000s user + 0.000000s system = 1.020000s CPU (99.0%)
> echo $?
49
So is there any (effective) way to do that? or I just have those options in my hand and I have to select one of them based on some benchmarks?
Second Question:
Why switch-case and goto in the above benchmark doesn't have any overhead? It's Undefined behavior code? or somehow the compiler removes their loops in the optimization stage?
Aucun commentaire:
Enregistrer un commentaire