Before replacing a lot of my "old" for loops with range-based for loops, I ran some tests with Visual Studio 2013:
    std::vector<int> numbers;
    for (int i = 0; i < 50; ++i)
        numbers.push_back(i);

    int sum = 0;

    //vectorization
    for (auto number = numbers.begin(); number != numbers.end(); ++number)
        sum += *number;

    //vectorization
    for (auto number = numbers.begin(); number != numbers.end(); ++number) {
        auto && ref = *number;
        sum += ref;
    }

    //definition of range based for loops from http://en.cppreference.com/w/cpp/language/range-for
    //vectorization
    for (auto __begin = numbers.begin(), __end = numbers.end(); __begin != __end; ++__begin) {
        auto && ref = *__begin;
        sum += ref;
    }

    //no vectorization :(
    for (auto number : numbers)
        sum += number;

    //no vectorization :(
    for (auto& number : numbers)
        sum += number;

    //no vectorization :(
    for (const auto& number : numbers)
        sum += number;

    //no vectorization :(
    for (auto&& number : numbers)
        sum += number;

    // sum is an int, so print it with %d (not %f)
    printf("%d\n", sum);
Looking at the disassembly, the standard for loops are all vectorized:
    00BFE9B0  vpaddd      xmm1,xmm1,xmmword ptr [eax]
    00BFE9B4  add         ecx,4
    00BFE9B7  add         eax,10h
    00BFE9BA  cmp         ecx,edx
    00BFE9BC  jne         main+140h (0BFE9B0h)
But the range-based for loops are not:
    00BFEAC6  add         esi,dword ptr [eax]
    00BFEAC8  lea         eax,[eax+4]
    00BFEACB  inc         ecx
    00BFEACC  cmp         ecx,edi
    00BFEACE  jne         main+256h (0BFEAC6h)
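Is there any reason why the compiler cannot vectorize these loops?
I would really like to use the new syntax, but losing vectorization is too high a price.
I just saw this question, so I tried the /Qvec-report:2 flag, which gives another reason: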
loop not vectorized due to reason '1200'
Which means:
Loop contains loop-carried data dependences that prevent vectorization. Different iterations of
the loop interfere with each other such that vectorizing the loop would produce wrong answers, and
the auto-vectorizer cannot prove to itself that there are no such data dependences.
Is this the same bug? (I also tried the latest VC++ compiler, the "November 2013 CTP".)
Should I report it on MS Connect?
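To match each vectorizer message to one specific loop, it may help to put each loop form in its own function. The following is a minimal sketch under stated assumptions: the names `make_numbers`, `sum_iterator_loop` and `sum_range_for` are hypothetical and not part of the original test, and the vector is passed by const reference, so the iterators are `const_iterator` rather than `iterator` as in the test above.

```cpp
#include <vector>

// Hypothetical helper: builds the same 0..49 test vector as the test above.
std::vector<int> make_numbers()
{
    std::vector<int> numbers;
    for (int i = 0; i < 50; ++i)
        numbers.push_back(i);
    return numbers;
}

// Classic iterator loop (reported as vectorized in the test above).
int sum_iterator_loop(const std::vector<int>& numbers)
{
    int sum = 0;
    for (auto it = numbers.begin(); it != numbers.end(); ++it)
        sum += *it;
    return sum;
}

// Range-based for loop (reported as NOT vectorized, reason 1200).
int sum_range_for(const std::vector<int>& numbers)
{
    int sum = 0;
    for (auto number : numbers)
        sum += number;
    return sum;
}
```

Both functions return the same sum for the same input, so any difference in the `/Qvec-report:2` output (for example when compiling with `cl /O2 /EHsc /Qvec-report:2`) should come from the loop form alone. Whether the isolated functions behave the same as the loops inlined in `main` is something to check rather than assume.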
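Edit
Following the comments, I ran the same test with a raw int array instead of a vector, so no iterator class is involved, just raw pointers.
Now all the loops are vectorized except the two "emulated range-based" loops.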
The compiler says this is due to reason '501':
Induction variable is not local; or upper bound is not loop-invariant.
I don't understand what is going on here.
    const size_t size = 50;
    int numbers[size];
    for (size_t i = 0; i < size; ++i)
        numbers[i] = i;

    int sum = 0;

    //vectorization
    for (auto number = &numbers[0]; number != &numbers[0] + size; ++number)
        sum += *number;

    //vectorization
    for (auto number = &numbers[0]; number != &numbers[0] + size; ++number) {
        auto && ref = *number;
        sum += ref;
    }

    //definition of range based for loops from http://en.cppreference.com/w/cpp/language/range-for
    //NO vectorization ?!
    for (auto __begin = &numbers[0], __end = &numbers[0] + size; __begin != __end; ++__begin) {
        auto && ref = *__begin;
        sum += ref;
    }

    //NO vectorization ?!
    for (auto __begin = &numbers[0], __end = &numbers[0] + size; __begin != __end; ++__begin) {
        auto && ref = *__begin;
        sum += ref;
    }

    //vectorization ?!
    for (auto number : numbers)
        sum += number;

    //vectorization ?!
    for (auto& number : numbers)
        sum += number;

    //vectorization ?!
    for (const auto& number : numbers)
        sum += number;

    //vectorization ?!
    for (auto&& number : numbers)
        sum += number;

    // sum is an int, so print it with %d (not %f)
    printf("%d\n", sum);
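Reason 501 mentions the induction variable and the loop's upper bound, so one experiment is to compare two more ways of writing the loop. One hoists the end pointer into a `const` local declared before the loop, and the other uses a plain index. This is only a sketch: the function names `sum_hoisted_end` and `sum_indexed` are hypothetical, and whether either form changes the vectorizer's decision is an assumption to verify with `/Qvec-report:2`, not a confirmed fix.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical variant: the end pointer is a const local declared before
// the loop, instead of a second variable in the for-init statement.
int sum_hoisted_end(const int* data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    const int* const end = data + size; // upper bound, never modified
    int sum = 0;
    for (const int* p = data; p != end; ++p)
        sum += *p;
    return sum;
}

// Hypothetical variant: plain index-based loop over the same data.
int sum_indexed(const int* data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    int sum = 0;
    for (std::size_t i = 0; i < size; ++i)
        sum += data[i];
    return sum;
}
```

Calling both with the `numbers` array and `size` from the test above should give the same sum as the original loops, which leaves the vectorization report as the only thing that differs.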