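Blog post

[Paper Reading] 3D Human Pose Estimation from a Single Image Based on Sparse Coding

3201 views | 2013-5-28 09:34 | Personal category: Paper reading | System category: Paper exchange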

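Sharif University of Technology, a well-known university in Iran, has published a paper titled <3D Human pose estimation from Image using Couple Sparse Coding>. I hope to exchange ideas with colleagues who are reading this paper or are interested in it, so that we can learn together.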

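Abstract

Recent studies have shown that high-level semantics can be captured with sparse representation. In this paper, we propose a method for 3D human pose estimation from a static image based on sparse coding. Given a visual input silhouette, the goal is to estimate the 3D human pose using the spatial information of the silhouette and the geometrical information of the pose space, under the assumption that each data point and its neighbors are likely to lie on a locally linear patch. Our method requires neither a human body model nor prior information about human body parts. Instead, it learns new sparse representations of the input space and the pose space. The 3D human pose of the input is estimated as a linear combination of the pose dictionary. In this work, the poses in the training data are transformed into a new space and the angles are converted to integer form. Extensive experimental results on different human activities show that the proposed method considerably improves the accuracy of 3D human pose estimation, by about 9% on average.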

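Introduction

Estimating human body pose from a monocular image is of great interest in computer vision, with applications including gesture recognition, visual surveillance, activity recognition, motion capture, and human-computer interaction. Although human pose estimation has been studied intensively for many years, it remains a very difficult and unsolved problem. First, because depth information is missing, estimating human pose from a 2D image is inherently ambiguous. Second, shape and visual appearance vary greatly across images. In addition, factors such as occlusion, the high dimensionality of the pose, changes in lighting conditions, and changes in the background scene make human pose estimation even harder. To the best of our knowledge, no single method solves all of the aforementioned problems. Generally speaking, there are three main lines of approach. The first is model-based approaches, which use a known parametric body model built on prior knowledge and estimate the human pose by inverting the kinematics or by numerical optimization. These computationally expensive methods need a good initialization and a proper model. Moreover, because their objective functions are not always convex, these methods may only find sub-optimal solutions.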

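To avoid the need for accurate initialization and accurate 3D body modelling, learning-based models learn a direct mapping between the input space and the output space. These models do not assume an explicit human body model, and they are especially appealing because a range of learning techniques can be used for fast, real-time pose estimation. Although these methods have advantages over model-based methods, the main weakness of a learning-based method is that the accuracy of the inferred pose depends on the amount of training data.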

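Example-based approaches store a set of training data together with the corresponding 3D pose descriptors, and estimate the pose in an unseen image by searching the training data for the most similar examples. These methods require a computationally expensive query. Shakhnarovich et al. developed an efficient search algorithm, called parameter-sensitive hashing, which allows fast queries against the database. The main problem for example-based methods is creating and incorporating enough examples to densely cover the space of human poses, which has many degrees of freedom. In addition, these methods are sensitive to changes in lighting conditions and background scene.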
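To make the example-based idea concrete, here is a minimal sketch of a brute-force nearest-neighbor pose lookup. It is only an illustration under my own assumptions: the function name `nearest_example_pose`, the descriptor and pose dimensions, and the random toy data are all hypothetical, and the linear scan stands in for faster schemes such as parameter-sensitive hashing, which is not implemented here.

```python
import numpy as np

def nearest_example_pose(query_descriptor, train_descriptors, train_poses, k=1):
    """Return the mean pose of the k stored examples closest to the query.

    Hypothetical helper: brute-force Euclidean search over all examples.
    """
    # Distance from the query to every stored image descriptor.
    distances = np.linalg.norm(train_descriptors - query_descriptor, axis=1)
    # Indices of the k most similar training examples.
    nearest = np.argsort(distances)[:k]
    # Average their poses (for k=1 this is just the closest example's pose).
    return train_poses[nearest].mean(axis=0)

# Toy data (assumed shapes): 200 examples, 16-D descriptors, 6-D poses.
rng = np.random.default_rng(0)
train_descriptors = rng.normal(size=(200, 16))
train_poses = rng.normal(size=(200, 6))

# A query that is a slightly perturbed copy of stored example 42.
query = train_descriptors[42] + 0.01 * rng.normal(size=16)
estimated_pose = nearest_example_pose(query, train_descriptors, train_poses)
```

The cost of each query grows linearly with the number of stored examples, which is the expense the paragraph above refers to, and the estimate can only be as good as the coverage of the stored poses.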

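Recently, sparse representation-based methods have attracted a lot of attention in pattern recognition and machine learning. Sparsity means that a data point can be adequately expressed using a small number of active dictionary bases. It has proven to be a powerful tool for acquiring and representing samples. Sparse representation has many advantages, such as interpreting data points in a more elegant way and fast retrieval. In addition, a sparse representation can be overcomplete, which gives great flexibility in data representation and compression. Because of these advantages, sparse representation has been widely studied in many pattern recognition tasks, including face recognition, object classification, and human pose estimation. Using sparsity as prior knowledge can yield state-of-the-art performance. Chen et al. proposed an example-based method to recover 3D human pose. They use sparse representation to retrieve pose candidates, and then select the candidate with the smallest difference from the neighboring frames. Huang proposed a method that uses sparse representation to handle occlusion. Each input frame is expressed as a compact linear combination of dictionary bases.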
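As an illustration of what "expressed using a small number of active dictionary bases" means, the sketch below computes a sparse code with orthogonal matching pursuit (OMP) over an overcomplete dictionary. This is my own choice of solver for illustration; the text does not say which sparse coding algorithm the paper or the cited works use, and the function name `omp`, the dictionary size, and the toy signal are assumptions.

```python
import numpy as np

def omp(dictionary, signal, n_nonzero):
    """Greedy OMP: find a code with at most n_nonzero active atoms
    such that dictionary @ code approximates signal."""
    code = np.zeros(dictionary.shape[1])
    residual = signal.copy()
    support = []            # indices of the atoms selected so far
    coeffs = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the unused atom most correlated with the current residual.
        correlations = np.abs(dictionary.T @ residual)
        if support:
            correlations[support] = 0.0
        support.append(int(np.argmax(correlations)))
        # Refit all selected atoms jointly by least squares.
        coeffs, *_ = np.linalg.lstsq(dictionary[:, support], signal, rcond=None)
        residual = signal - dictionary[:, support] @ coeffs
    code[support] = coeffs
    return code

# Toy overcomplete dictionary (assumed sizes): 60 unit-norm atoms in 30-D.
rng = np.random.default_rng(0)
dictionary = rng.normal(size=(30, 60))
dictionary /= np.linalg.norm(dictionary, axis=0)

# A signal built from only three atoms.
true_code = np.zeros(60)
true_code[[5, 20, 41]] = [1.0, -0.7, 0.5]
signal = dictionary @ true_code

code = omp(dictionary, signal, n_nonzero=3)
reconstruction_error = np.linalg.norm(signal - dictionary @ code)
```

The dictionary has more atoms than dimensions, so many different codes reproduce the same signal; the sparsity constraint is what singles out a compact one.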

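In this paper, we propose an algorithm that uses sparse representation to deal with this problem. We map from the 2D human silhouette to the 3D human body pose configuration in two steps, taking the local manifold structure into account. The proposed method is very similar to example-based approaches, but our use of sparse representation proves to be robust to noise, and the local geometric structure is also considered. A common problem in pose estimation is that the loss of depth and of human body part information makes the 3D pose ambiguous. In this paper, we use sparse representation to provide robustness to noise and flexibility under changes in illumination. We also exploit geometrical information, because it is beneficial to many applications and to human pose estimation in particular.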

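In the first step of the algorithm, we express the given input as a linear combination of dictionary bases; at the same time, using a constructed graph, we enforce that the obtained sparse coefficients are consistent with the geometrical information. Then, once the sparse solution has been obtained, in the second step the corresponding pose is estimated from the dictionary of training poses and the computed sparse representation. Because each test sample is a linear combination of basis vectors, we achieve pose-dependent feature selection without any approximation.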
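The two steps can be sketched roughly as follows. This is not the paper's formulation, which is not given in this post. As an assumption, step one minimizes 0.5*||x - D_sil s||^2 + lam*||s||_1 + (beta/2) * sum_j w_j ||s - s_j||^2 with ISTA (proximal gradient), where the s_j are sparse codes of graph neighbors and the w_j are edge weights; step two multiplies the pose dictionary by the resulting code. All names (`graph_regularized_code`, `estimate_pose`, `D_sil`, `D_pose`), the parameters `lam` and `beta`, and the toy data are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def graph_regularized_code(x, D, neighbor_codes, weights,
                           lam=0.05, beta=0.1, n_iter=300):
    """Step 1 (assumed objective): sparse code of x over D that also
    stays close to the codes of its graph neighbors."""
    total_w = weights.sum()
    weighted_neighbors = neighbor_codes.T @ weights   # sum_j w_j * s_j
    # Lipschitz constant of the gradient of the smooth part.
    lipschitz = np.linalg.norm(D, 2) ** 2 + beta * total_w
    s = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ s - x) + beta * (total_w * s - weighted_neighbors)
        s = soft_threshold(s - grad / lipschitz, lam / lipschitz)
    return s

def estimate_pose(x, D_sil, D_pose, neighbor_codes, weights, **kwargs):
    """Step 2: apply the code found over the silhouette dictionary
    to the coupled pose dictionary."""
    s = graph_regularized_code(x, D_sil, neighbor_codes, weights, **kwargs)
    return D_pose @ s, s

# Toy data (assumed sizes): 50 coupled atoms, 20-D silhouettes, 6-D poses.
rng = np.random.default_rng(0)
D_sil = rng.normal(size=(20, 50))
D_sil /= np.linalg.norm(D_sil, axis=0)
D_pose = rng.normal(size=(6, 50))

true_code = np.zeros(50)
true_code[[3, 17, 31]] = [1.0, -0.5, 0.8]
x = D_sil @ true_code

# Three neighbors whose codes are small perturbations of the true code.
neighbor_codes = true_code + 0.05 * rng.normal(size=(3, 50))
weights = np.full(3, 1.0 / 3.0)

lam, beta = 0.05, 0.1
pose, code = estimate_pose(x, D_sil, D_pose, neighbor_codes, weights,
                           lam=lam, beta=beta)
```

The point of the coupling is that column i of `D_sil` and column i of `D_pose` describe the same training example, so a code computed in silhouette space can be reused directly in pose space.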

To repost this article, please contact the original author for authorization, and state that it comes from 刘小邦's ScienceNet blog.
Link: https://blog.sciencenet.cn/blog-942948-694213.html
