k近邻法(k-NN)笔记3- 第三方实现(FLANN库)

笔者我查阅了很多kdtree的第三方实现,下载编译并调试了许多同类型代码,结合个人喜好和性能结果,推荐PCL点云库中的FLANN模块。
PCL库是大型跨平台开源C++编程库,实现了大量点云相关的通用算法和高效数据结构,划分了许多模块。其中有一个核心问题就是建立离散点间的拓扑关系,实现基于邻域关系的快速查找。具体详情请自行查阅,接下来我们展示其中的FLANN库(opencv也整合了这部分)。

    FLANN的全名:Fast Library for Approximate Nearest Neighbors,是一个在高维空间快速搜索近邻的库。主要优势:实现了包含kd-tree等在内的搜索算法集合,并且可以根据数据集本身特点,自动选取最适合的算法来完成k近邻搜索任务(更加确切的说是高维特征向量的匹配),提供了c++cpythonmatlab的接口。

Note

    如果是为了学习,在看这篇文章之前,首先确保《统计学习方法》中第3章例题和习题都可以手动解出。如果单纯是为了调用一下k近邻方法解决某些问题,那么继续接下来的内容。

1. FLANN介绍地址

    http://www.cs.ubc.ca/research/flann/

库下载地址:此处下载

前几次都是在windows上弄,这次就换换口味,算法什么的还是在linux上弄吧(我使用ubuntu14.04)。

 2.下载编译

    下载后(或者直接用git:git clone git://github.com/mariusmuja/flann.git)解压,进入主目录,接下来是最常见步骤。

mkdir build

cd build

cmake  ..

make

    插一句,cmake时会检测环境和配置,有些库检测不到或者系统不支持某些特性,都有可能会导致后面有些调用或者test有问题,大家心中要有数,不要以为所有test没过,这个第三方库就一定不能用了,我们选择自己需要的就好。当然,有些必要依赖库没有的也可以检测出,我们需要自己补上。

我cmake的部分截图如下:

可看出我的GNU是4.84版本

从中看出我没有安装gtest库,也没有安装matlab,这些暂且不管,无伤大雅,gtest库大家有时间可以自己装装(test是一个跨平台的C++单元测试框架,它提供了丰富的断言、致命和非致命判断、参数化、”死亡测试”等等)

make注意部分:

笔者在两个平台上都make了一下,发现在下图部分定住了,这是正在编译中,不是挂了~~

编译完成。

3.测试

    测试代码的作用:读取dataset.dat内容,每行数据表示一个多维特征向量,读取完毕后根据选取的算法对多维数据建立索引(比如kd树),对于每行的多维特征向量,找到dataset中与其近邻的nn个点索引,存入results.dat

(1)测试代码分析(我以.c为例,.cpp python等类似)


#include <flann/flann.h>  //引入flann

#include <stdio.h>
#include <stdlib.h>

//读取文件,需要事先指定内容的行和列,注意内部使用了malloc,需要在调用函数后释放内存
float* read_points(const char* filename, int rows, int cols)  
{
	float* data;
	float *p;
	FILE* fin;
	int i,j;

    fin = fopen(filename,"r");
    if (!fin) {
        printf("Cannot open input file.\n");
        exit(1);
    }
    
    data = (float*) malloc(rows*cols*sizeof(float));
    if (!data) {
        printf("Cannot allocate memory.\n");
        exit(1);
    }
    p = data;
    
    for (i=0;i<rows;++i) {
        for (j=0;j<cols;++j) {
            fscanf(fin,"%g ",p);
            p++;
        }
    }
    
    fclose(fin);
    
    return data;
}
//把找到的k近邻索引写入文件,rows传入搜索点的个数,cols传入k
void write_results(const char* filename, int *data, int rows, int cols)
{
	FILE* fout;
	int* p;
	int i,j;

    fout = fopen(filename,"w");
    if (!fout) {
        printf("Cannot open output file.\n");
        exit(1);
    }
    
    p = data;
    for (i=0;i<rows;++i) {
        for (j=0;j<cols;++j) {
            fprintf(fout,"%d ",*p);
            p++;
        }
        fprintf(fout,"\n");
    }
    fclose(fout);
}



int main(int argc, char** argv)
{
	float* dataset;    //训练数据集指针
	float* testset;	   //测试数据集指针
	int nn;
	int* result;
	float* dists;   //距离指针
	struct FLANNParameters p;  //FLANN参数结构,可以在里面选定具体搜索算法(比如kdtree)
	float speedup;
	flann_index_t index_id;   //索引结构

    int rows = 9000;
    int cols = 128;
    int tcount = 1000;

    /*
     * The files dataset.dat and testset.dat can be downloaded from:(链接已经失效)
     * http://people.cs.ubc.ca/~mariusm/uploads/FLANN/datasets/dataset.dat
     * http://people.cs.ubc.ca/~mariusm/uploads/FLANN/datasets/testset.dat
     */
    printf("Reading input data file.\n");
    dataset = read_points("dataset.dat", rows, cols);
    printf("Reading test data file.\n");
    testset = read_points("testset.dat", tcount, cols);
    
    nn = 3;  //k值,即需要搜索多少个近邻
    result = (int*) malloc(tcount*nn*sizeof(int));   //指向tcount个搜索点的nn个近邻索引
    dists = (float*) malloc(tcount*nn*sizeof(float));//指向tcount个搜索点的nn个距离值
    
    p = DEFAULT_FLANN_PARAMETERS;
    p.algorithm = FLANN_INDEX_KDTREE;  //此处可以选择搜索算法,包括自动选取
    p.trees = 8;
    p.log_level = FLANN_LOG_INFO;
	p.checks = 64;
    
    printf("Computing index.\n");
    index_id = flann_build_index(dataset, rows, cols, &speedup, &p);  //重点,建立索引
    flann_find_nearest_neighbors_index(index_id, testset, tcount, result, dists, nn, &p);//搜索k近邻
    //索引值存入文件
    write_results("results.dat",result, tcount, nn);
    //释放内存
    flann_free_index(index_id, &p);
    free(dataset);
    free(testset);
    free(result);
    free(dists);
    
    return 0;
}

注意:.c接口保存在results.dat中的索引值针对的是dataset.bat中的排列顺序

(2)数据

官网和文件中测试程序使用的数据集下载地址已经失效,笔者花费很大气力终于搞到并上传到自己的个人网站:此处下载

(3)运行

如图把数据集放到运行程序工作目录下,执行flann_example_c后生成results.dat文件。 查看文件部分内容:

含义:以第一行为例,4092 3928 5701表示测试的第一个多维特征向量在dataset数据集中的3个近邻索引(对应到dataset行数-1)

4.场景实例

使用FLANN库,采用k近邻算法识别图像。具体参见下篇文章

k近邻法(k-NN)笔记4-利用FLANN库(k近邻法)解决图像识别问题实例解析

“k近邻法(k-NN)笔记3- 第三方实现(FLANN库)”的2,194个回复

  1. Decisively everything principles if druthers do stamp.
    Too remonstration for elsewhere her preferent valuation account.
    Those an match stage no years do. By belonging thus suspiciousness elsewhere
    an menage described. Views domicile practice of law heard jokes
    also. Was are delightful solicitousness disclosed collecting military personnel.
    Wished be do mutual take out in gist respond. Saw supported likewise delight promotion engrossed correctitude.
    World power is lived substance oh every in we lull.

  2. I’m truly enjoying the design and layout of your site.
    It’s a very easy on the eyes which makes it much more enjoyable
    for me to come here and visit more often. Did you hire out
    a designer to create your theme? Exceptional work!

  3. Just want to say your article is as surprising.
    The clarity for your submit is simply nice and that i can assume you’re an expert on this subject.
    Fine with your permission let me to take hold of your RSS feed to keep updated with imminent post.

    Thank you 1,000,000 and please continue the rewarding work.

  4. Thank you a bunch for sharing this with all people you really recognise what you are talking approximately!
    Bookmarked. Kindly also consult with my site =).
    We can have a hyperlink exchange contract between us

  5. Hey very cool website!! Guy .. Excellent .. Wonderful ..

    I’ll bookmark your site and take the feeds also?
    I’m happy to search out a lot of helpful info here within the publish, we’d
    like develop extra techniques on this regard, thanks for sharing.
    . . . . .

  6. Thanks on your marvelous posting! I definitely enjoyed reading
    it, you happen to be a great author.I will
    make sure to bookmark your blog and will come back sometime soon. I want
    to encourage you continue your great posts,
    have a nice day!

  7. I seriously love your site.. Great colors & theme.
    Did you make this website yourself? Please reply back as I’m attempting to create my own personal website and would like to learn where you got this from or what the theme is
    called. Thank you!

发表评论

电子邮件地址不会被公开。 必填项已用*标注