I'm getting a "double free or corruption" runtime error in my C++ program, which calls a reliable library, ANN, and uses OpenMP to parallelize a for loop.
*** glibc detected *** /home/tim/test/debug/test: double free or corruption (!prev): 0x0000000002527260 ***
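Does this mean that the memory at address 0x0000000002527260 is freed more than once?
The error occurs at "_search_struct->annkSearch(queryPt, k_max, nnIdx, dists, _eps);" inside the function classify_various_k(), which is called from an OpenMP for loop inside the function tune_complexity().
Note that the error occurs when OpenMP runs more than one thread and does not occur in the single-threaded case. I don't know why.
Here is my code. If it isn't enough to diagnose the problem, please let me know. Thanks for your help!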
void KNNClassifier::train(int nb_examples, int dim, double **features, int *labels) {
  _nPts = nb_examples;
  _labels = labels;
  _dataPts = features;
  setting_ANN(_dist_type, 1);
  delete _search_struct;
  if (strcmp(_search_neighbors, "brutal") == 0) {
    _search_struct = new ANNbruteForce(_dataPts, _nPts, dim);
  } else if (strcmp(_search_neighbors, "kdtree") == 0) {
    _search_struct = new ANNkd_tree(_dataPts, _nPts, dim);
  }
}

void KNNClassifier::classify_various_k(int dim, double *feature, int label, int *ks, double *errors, int nb_ks, int k_max) {
  ANNpoint queryPt = 0;
  ANNidxArray nnIdx = 0;
  ANNdistArray dists = 0;

  queryPt = feature;
  nnIdx = new ANNidx[k_max];
  dists = new ANNdist[k_max];

  if (strcmp(_search_neighbors, "brutal") == 0) {
    _search_struct->annkSearch(queryPt, k_max, nnIdx, dists, _eps);
  } else if (strcmp(_search_neighbors, "kdtree") == 0) {
    _search_struct->annkSearch(queryPt, k_max, nnIdx, dists, _eps); // where error occurs
  }

  for (int j = 0; j < nb_ks; j++) {
    scalar_t result = 0.0;
    for (int i = 0; i < ks[j]; i++) {
      result += _labels[nnIdx[i]];
    }
    if (result * label < 0) errors[j]++;
  }

  delete [] nnIdx;
  delete [] dists;
}

void KNNClassifier::tune_complexity(int nb_examples, int dim, double **features, int *labels, int fold, char *method, int nb_examples_test, double **features_test, int *labels_test) {
  int nb_try = (_k_max - _k_min) / scalar_t(_k_step);
  scalar_t *error_validation = new scalar_t[nb_try];
  int *ks = new int[nb_try];

  for (int i = 0; i < nb_try; i++) {
    ks[i] = _k_min + _k_step * i;
  }

  if (strcmp(method, "ct") == 0) {
    train(nb_examples, dim, features, labels); // train once for all neighbor counts in ks

    for (int i = 0; i < nb_try; i++) {
      if (ks[i] > nb_examples) { nb_try = i; break; }
      error_validation[i] = 0;
    }

    int i = 0;
    #pragma omp parallel shared(nb_examples_test, error_validation, features_test, labels_test, nb_try, ks) private(i)
    {
      #pragma omp for schedule(dynamic) nowait
      for (i = 0; i < nb_examples_test; i++) {
        classify_various_k(dim, features_test[i], labels_test[i], ks, error_validation, nb_try, ks[nb_try - 1]); // where error occurs
      }
    }

    for (i = 0; i < nb_try; i++) {
      error_validation[i] /= nb_examples_test;
    }
  }
  ......
}
Update:
Thanks! I am now trying to fix the conflicting writes to the same memory in classify_various_k() by using "#pragma omp critical":
void KNNClassifier::classify_various_k(int dim, double *feature, int label, int *ks, double *errors, int nb_ks, int k_max) {
  ANNpoint queryPt = 0;
  ANNidxArray nnIdx = 0;
  ANNdistArray dists = 0;

  queryPt = feature;
  //for (int i = 0; i < Vignette::size; i++){ queryPt[i] = vignette->content[i];}
  nnIdx = new ANNidx[k_max];
  dists = new ANNdist[k_max];

  if (strcmp(_search_neighbors, "brutal") == 0) { // search
    _search_struct->annkSearch(queryPt, k_max, nnIdx, dists, _eps);
  } else if (strcmp(_search_neighbors, "kdtree") == 0) {
    _search_struct->annkSearch(queryPt, k_max, nnIdx, dists, _eps);
  }

  for (int j = 0; j < nb_ks; j++) {
    scalar_t result = 0.0;
    for (int i = 0; i < ks[j]; i++) {
      result += _labels[nnIdx[i]]; // Program received signal SIGSEGV, Segmentation fault
    }
    if (result * label < 0) {
      #pragma omp critical
      {
        errors[j]++;
      }
    }
  }

  delete [] nnIdx;
  delete [] dists;
}
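Solution
Okay, since you've stated that it works correctly in the single-threaded case, the "normal" debugging methods won't help. You need to do the following: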
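- Find all variables that are accessed in parallel.
- Look especially at the ones that are modified.
- Don't call delete on a shared resource.
- Look at all library functions that operate on shared resources and check whether they allocate or free memory.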
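To illustrate the last point in the list above, here is a minimal sketch of how a library call that is suspected of not being thread-safe can be serialized inside an OpenMP loop. `legacy_search`, `g_scratch`, and `search_all` are hypothetical names invented for this sketch; `legacy_search` merely stands in for a routine that reuses internal state between calls. Whether ANN's `annkSearch` actually behaves this way is an assumption you would need to check against the ANN documentation or source.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a library routine that is NOT thread-safe:
// every call reuses (and may reallocate) one internal scratch buffer.
static std::vector<int> g_scratch;

int legacy_search(int query) {
    g_scratch.assign(8, query);        // overwrites shared internal state
    int sum = 0;
    for (int v : g_scratch) sum += v;
    return sum;                        // always 8 * query
}

// Calls the routine from a parallel loop, letting one thread in at a time.
std::vector<int> search_all(int n) {
    assert(n >= 0);
    std::vector<int> out(n);
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < n; i++) {
        int r = 0;
        #pragma omp critical(legacy_search_lock)
        {
            r = legacy_search(i);      // serialized section
        }
        out[i] = r;                    // each iteration owns its own slot
    }
    return out;
}
```

Serializing the call gives up the parallel speedup for that section, so this is mainly useful as a diagnostic: if the crash disappears once the call is guarded, the library call is the likely culprit.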
Here is the list of candidates for a double delete:
shared(nb_examples_test, error_validation, features_test, labels_test, nb_try, ks)
Also, this code might not be thread-safe:
for (int i = 0; i < ks[j]; i++) {
  result += _labels[nnIdx[i]];
}
if (result * label < 0) errors[j]++;
because two or more threads may try to write to the errors array at the same time.
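One way to make that kind of shared counter update safe is `#pragma omp atomic`. The following is only a sketch under assumed names: `count_errors`, `samples`, and `thresholds` are made up for illustration and are not part of the code in the question. It counts, for each threshold, how many samples fall below it, with every thread incrementing the same shared array.

```cpp
#include <cassert>
#include <vector>

// For each threshold index j, counts the samples that are below thresholds[j].
// The counters are shared by all threads, so each increment is made atomic.
std::vector<double> count_errors(const std::vector<int>& samples,
                                 const std::vector<int>& thresholds) {
    std::vector<double> errors(thresholds.size(), 0.0);
    double* err = errors.data();       // plain pointer for the atomic update
    const int n = static_cast<int>(samples.size());
    const int m = static_cast<int>(thresholds.size());
    assert(n >= 0 && m >= 0);

    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < m; j++) {
            if (samples[i] < thresholds[j]) {
                #pragma omp atomic
                err[j] += 1.0;         // safe concurrent increment
            }
        }
    }
    return errors;
}
```

An atomic update is usually cheaper than a critical section for a single increment like this. If contention on the counters becomes a bottleneck, another option is to give each thread its own local counter array and add them together after the loop.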