What Is Deep Learning?


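In practical applications such as object classification (where the objects may be documents, images, audio, and so on), one unavoidable question is how to represent an object as data. The data meant here are not the raw pixels or characters. They carry higher-level meaning than the raw input, and they are usually called the object's features. For example, documents and web pages are commonly represented as a set of words, and that word set is mapped into a vector space of terms and phrases (the vector space model, VSM). Only then can a classifier suited to the chosen learning method be designed to classify the target objects.

In image processing, a set of pixel intensities is the most basic representation of an image, and it is also the image as we see it. For various reasons, however, researchers have proposed features with higher-level semantics: geometric features, typified by SIFT; texture features, typified by LBP; and statistical features, typified by eigenfaces. Features such as SIFT have shown clear advantages in many image-processing applications, so the quality of the chosen features strongly affects practical results. Deciding which features to use to represent an object is therefore critical to solving a real problem.

Selecting features by hand, however, is expensive in both time and labor, and heuristic algorithms tend to give unstable results whose quality often depends on experience and luck. It was therefore natural to consider learning the features automatically. Deep Learning arose from this task. It is also referred to as Unsupervised Feature Learning, a name that makes clear that the features are selected without human involvement.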
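The vector space model mentioned above can be illustrated with a small sketch. It assumes the simplest possible variant: whitespace tokenization and raw term counts, with cosine similarity as the comparison measure. The function names and the three sample documents are made up for illustration and do not come from the original article.

```python
from collections import Counter
import math


def build_vocabulary(documents):
    """Collect the sorted set of distinct lowercase words across all documents."""
    return sorted({word for doc in documents for word in doc.lower().split()})


def to_vector(document, vocabulary):
    """Map one document to a term-count vector over the shared vocabulary."""
    counts = Counter(document.lower().split())
    return [counts[word] for word in vocabulary]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical sample documents, chosen only to show the mechanics.
docs = [
    "deep learning learns features",
    "features describe images",
    "cats chase mice",
]
vocab = build_vocabulary(docs)
vectors = [to_vector(d, vocab) for d in docs]

# Documents that share a word end up closer than documents that share none.
print(cosine_similarity(vectors[0], vectors[1]))
print(cosine_similarity(vectors[0], vectors[2]))
```

Every choice in this sketch (how to tokenize, whether to count or weight terms, which words to keep) is made by hand, which is exactly the manual feature design the paragraph above describes as costly.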

 

Origins

 


 

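The concept of deep learning was put forward around 2006 by Geoffrey Hinton and colleagues in the article "Reducing the Dimensionality of Data with Neural Networks", published in Science [26]. It uses neural networks (NN) to imitate the learning process of the human brain, with the aim of borrowing the brain's multi-layer abstraction mechanism to build abstract representations of real-world objects or data (images, speech, text, and so on). Feature extraction and the classifier are integrated into a single learning framework, and human intervention in feature extraction is kept to a minimum.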

 

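A deep learning model is built from a large number of simple neurons. The neurons in each layer receive input from the neurons in the layer below, and through the nonlinear relationship between input and output, low-level features are combined into more abstract high-level representations that reveal distributed features of the observed data. Bottom-up learning produces multiple layers of abstract representation, and this multi-level feature learning is an automatic process that needs no manual intervention. Using the learned network structure, the system maps an input sample to features at every level, and then applies a classifier or matching algorithm to the top-level output units for classification and recognition.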
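The layer-by-layer mapping described above can be sketched as a plain forward pass. This is a minimal illustration under stated assumptions: the nonlinearity is `tanh`, the network has two layers, and the weights are hand-picked hypothetical values rather than learned ones. In an actual deep network the weights would come from training, which this sketch does not cover.

```python
import math


def layer_forward(inputs, weights, biases):
    """One layer: each unit takes a weighted sum of the layer below, then tanh."""
    return [
        math.tanh(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Return the activations of every layer, ordered from bottom to top."""
    activations = []
    current = inputs
    for weights, biases in layers:
        current = layer_forward(current, weights, biases)
        activations.append(current)
    return activations


# Hypothetical parameters: 4 inputs -> 3 low-level units -> 2 top-level units.
layers = [
    ([[0.5, -0.5, 0.0, 0.0],
      [0.0, 0.5, -0.5, 0.0],
      [0.0, 0.0, 0.5, -0.5]], [0.0, 0.0, 0.0]),
    ([[1.0, 1.0, 1.0],
      [1.0, -1.0, 1.0]], [0.0, 0.0]),
]

features = forward([1.0, 0.0, 1.0, 0.0], layers)
low_level, top_level = features
print(low_level)
print(top_level)
```

Here `low_level` responds to differences between neighbouring inputs, and `top_level` combines those responses. A classifier or matching algorithm would then operate on `top_level`, as the paragraph above describes.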

 

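Studies have shown that the mammalian cerebral cortex processes incoming information with a layered mechanism. Information enters from the sensory organs and passes through multiple layers of neurons. At each layer, the neurons extract the features that capture the essence of the object and pass them on to the next layer, and every subsequent layer processes and relays information in a similar way until it finally reaches the brain [28]. Deep artificial neural networks were inspired in large part by this finding: the idea is to build a neural network with multiple layers of nodes in which information is processed and abstracted layer by layer. In other words, the mammalian brain is organized in depth [29]. Each layer in such a deep structure processes or abstractly represents its input at a different level, which is why hierarchical methods are often used in practice to represent abstract semantic concepts.

Like that of other mammals, the human brain transmits and represents information layer by layer. The primary visual system first uses certain neurons to detect object edges and elementary shapes, and then uses other groups of neurons to build up progressively more complex visual shapes [29]. The brain recognizes an object as follows: the object in the outside world is first projected onto the retina, the visual cortex then decomposes and processes the projection gathered on the retina, and finally the processed information is used to recognize the object.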

 

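The function of the visual cortex is therefore not limited to reproducing the retinal image; it extracts and computes on the perceptual signal [30]. As visual input moves up the hierarchy of the human perceptual system, its dimensionality is reduced and information unrelated to the object's individual characteristics is discarded. For rich data with complex latent structure (such as images, video, and speech), deep learning should likewise be able to capture the essential features of an object accurately, as the human visual system does. The idea behind deep learning is to borrow the brain's layered organization: through bottom-up feature learning that abstracts layer by layer from the simple to the complex, researchers hope that deep network structures can solve hard pattern-recognition problems by imitating the brain. A deep artificial neural network is thus a human-designed multi-layer neural network intended to imitate the way the brain is organized.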

 

References

[1] BENGIO Y. Learning deep architectures for AI [J]. Foundations and Trends in Machine Learning, 2009, 2(1): 1-127.

[2] BENGIO Y, DELALLEAU O. On the expressive power of deep architectures [C]. Algorithmic Learning Theory, Berlin Heidelberg, 2011: 18-36.

[3] BENGIO Y, LECUN Y. Scaling learning algorithms towards AI [J]. Large-Scale Kernel Machines, 2007: 1-34.

[4] PARKE F I. Computer generated animation of faces [C]. Proceedings of the ACM Annual Conference, Boston, 1972(1): 451-457.

[5] YANG J. Theory and algorithms of linear projection analysis and their application to feature extraction [D]. Nanjing: Nanjing University of Science and Technology, 2002. (in Chinese)

[6] CHAN H, BLEDSOE W. A man-machine facial recognition system: some preliminary results [R]. Tech. Rep. Panoramic Research Inc, Palo Alto, 1965.

[7] KC G S, KARGER P A. Cryptology ePrint Archive [R]. Report 2005.

[8] HEITMEYER R. Biometric identification promises fast and secure processing of airline passengers [J]. ICAO Journal, 2000, 55(9): 10-11.

[9] SUN D M, QIU Z D. A survey of biometric recognition technology [J]. Acta Electronica Sinica, 2001, 29(12): 1744-1748. (in Chinese)

[10] HUANG T, XIONG Z, ZHANG Z. Face recognition applications [M]. Handbook of Face Recognition. London: Springer, 2011.

[11] ZHAO W, CHELLAPPA R, PHILLIPS P J, et al. Face recognition: A literature survey [J]. ACM Computing Surveys (CSUR), 2003, 35(4): 399-458.

[12] PRINCE S J, WARRELL J, ELDER J, et al. Tied factor analysis for face recognition across large pose differences [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008, 30(6): 970-984.

[13] BLANZ V, VETTER T. Face recognition based on fitting a 3D morphable model [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003, 25(9): 1063-1074.

[14] GROSS R, MATTHEWS I, BAKER S. Appearance-based face recognition and light-fields [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004, 26(4): 449-465.

[15] GEORGHIADES A S, BELHUMEUR P N, KRIEGMAN D J. From few to many: Illumination cone models for face recognition under variable lighting and pose [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001, 23(6): 643-660.

[16] LAWRENCE S, GILES C L, TSOI A C, et al. Face recognition: A convolutional neural-network approach [J]. IEEE Transactions on Neural Networks, 1997, 8(1): 98-113.

[17] ZHANG X, GAO Y, LEUNG M. Recognizing rotated faces from frontal and side views: An approach toward effective use of mugshot databases [J]. IEEE Transactions on Information Forensics and Security, 2008, 3(4): 684-697.

[18] TAN X, CHEN S, ZHOU Z H, et al. Face recognition from a single image per person: A survey [J]. Pattern Recognition, 2006, 39(9): 1725-1745.

[19] PENTLAND A, MOGHADDAM B, STARNER T. View-based and modular eigenspaces for face recognition [C]. Computer Vision and Pattern Recognition, Seattle, WA, USA, 1994: 84-91.

[20] TURK M A, PENTLAND A P. Face recognition using eigenfaces [C]. Computer Vision and Pattern Recognition, San Diego, 1991: 22-28.

[21] ZHAO W, KRISHNASWAMY A, CHELLAPPA R, et al. Discriminant analysis of principal components for face recognition [M]. Face Recognition. Berlin Heidelberg: Springer, 1998.

[22] GAO Y, LEUNG M K H. Face recognition using line edge map [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(6): 764-779.

[23] WISKOTT L, FELLOUS J-M, KRUGER N, et al. Face recognition by elastic bunch graph matching [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997, 19(7): 775-779.

[24] SHIN H-C, PARK J H, KIM S-D. Combination of warping robust elastic graph matching and kernel-based projection discriminant analysis for face recognition [J]. IEEE Transactions on Multimedia, 2007, 9(6): 1125-1136.

[25] OJALA T, PIETIKAINEN M, MAENPAA T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(7): 971-987.

[26] HINTON G E, SALAKHUTDINOV R R. Reducing the dimensionality of data with neural networks [J]. Science, 2006, 313(5786): 504-507.

[27] LEE T S, MUMFORD D. Hierarchical Bayesian inference in the visual cortex [J]. JOSA A, 2003, 20(7): 1434-1448.

[28] LEE T S, MUMFORD D, ROMERO R, et al. The role of the primary visual cortex in higher level vision [J]. Vision Research, 1998, 38(15): 2429-2454.

[29] SERRE T, KREIMAN G, KOUH M, et al. A quantitative theory of immediate visual recognition [J]. Progress in Brain Research, 2007, 165: 33-56.

[30] ROSSI A F, DESIMONE R, UNGERLEIDER L G. Contextual modulation in primary visual cortex of macaques [J]. The Journal of Neuroscience, 2001, 21(5): 1698-1709.

[31] BRAVERMAN M. Poly-logarithmic independence fools bounded-depth boolean circuits [J]. Communications of the ACM, 2011, 54(4): 108-115.

[32] LE ROUX N, BENGIO Y. Representational power of restricted Boltzmann machines and deep belief networks [J]. Neural Computation, 2008, 20(6): 1631-1649.

[33] WELLING M, ROSEN-ZVI M, HINTON G. Exponential family harmoniums with an application to information retrieval [J]. Advances in Neural Information Processing Systems, 2005, 17: 1481-1488.

[34] HINTON G. A practical guide to training restricted Boltzmann machines [R]. Report of Momentum, 2010, 9(1): 1-20.

[35] LIU J S. Monte Carlo strategies in scientific computing [M]. Berlin Heidelberg: Springer Verlag, 2008.

[36] HINTON G E. Training products of experts by minimizing contrastive divergence [J]. Neural Computation, 2002, 14(8): 1771-1800.

[37] HINTON G E. Distributed representations [R]. Tech. Report, University of Toronto, 1984.

[38] BENGIO Y, LAMBLIN P, POPOVICI D, et al. Greedy layer-wise training of deep networks [J]. Advances in Neural Information Processing Systems, 2007, 19: 153.

[39] SALAKHUTDINOV R, HINTON G. Learning a nonlinear embedding by preserving class neighborhood structure [J]. International Journal of Computer Mathematics, 2007, 84(7): 1265-1276.

[40] MANJUNATH B, CHELLAPPA R, VON DER MALSBURG C. A feature based approach to face recognition [C]. Computer Vision and Pattern Recognition, Champaign, 1992: 663-671.

[41] LADES M, VORBRUGGEN J C, BUHMANN J, et al. Distortion invariant object recognition in the dynamic link architecture [J]. IEEE Transactions on Computers, 1993, 42(3): 300-311.

[42] BEYMER D, POGGIO T. Face recognition from one example view [C]. Computer Vision, Massachusetts, 1995: 500-507.

[43] VETTER T, POGGIO T. Image synthesis from a single example image [C]. ECCV, Cambridge, England, 1996: 652-659.

[44] VETTER T. Synthesis of novel views from a single face image [J]. International Journal of Computer Vision, 1998, 28(2): 103-116.

[45] CHAI X, SHAN S, CHEN X, et al. Locally linear regression for pose-invariant face recognition [J]. IEEE Transactions on Image Processing, 2007, 16(7): 1716-1725.

[46] ROHBAN M H, RABIEE H R, VAHDAT A. Face virtual pose generation using aligned locally linear regression for face recognition [C]. IEEE International Conference on Image Processing (ICIP), Cairo Egypt, 2009: 4121-4124.

[47] VAN OUWERKERK J. Image super-resolution survey [J]. Image and Vision Computing, 2006, 24(10): 1039-1052.

[48] PARK S C, PARK M K, KANG M G. Super-resolution image reconstruction: a technical overview [J]. Signal Processing Magazine, 2003, 20(3): 21-36.

[49] BAKER S, KANADE T. Hallucinating faces [C]. Automatic Face and Gesture Recognition, Boston, 2000: 83-88.

[50] LIN F, FOOKES C, CHANDRAN V, et al. Super-resolved faces for improved face recognition from surveillance video [M]//Advances in Biometrics. Springer, 2007: 1-10.

[51] WANG X, TANG X. Hallucinating face by eigen-transformation [J]. IEEE Transactions on Systems, Man, and Cybernetics, 2005, 35(3): 425-434.

[52] WHEELER F W, LIU X, TU P H. Multi-frame super-resolution for face recognition [C]. IEEE International Conference on Biometrics: Theory, Applications and Systems, Washington, 2007: 1-6.

[53] SINHA P, BALAS B, OSTROVSKY Y, et al. Face recognition by humans: Nineteen results all computer vision researchers should know about [J]. Proceedings of the IEEE, 2006, 94(11): 1948-1962.

[54] ZHANG H, ZHANG B, HUANG W, et al. Gabor wavelet associative memory for face recognition [J]. IEEE Transactions on Neural Networks, 2005, 16(1): 275-278.

[55] SEZER O G, ALTUNBASAK Y, ERCIL A. Face recognition with independent component-based super-resolution [C]. Electronic Imaging in International Society for Optics and Photonics, San Diego, 2006: 607705-607715.

[56] GUNTURK B K, BATUR A U, ALTUNBASAK Y, et al. Eigenface-domain super-resolution for face recognition [J]. IEEE Transactions on Image Processing, 2003, 12(5): 597-606.

[57] LEE S-W, PARK J, LEE S-W. Low resolution face recognition based on support vector data description [J]. Pattern Recognition, 2006, 39(9): 1809-1812.

[58] ARANDJELOVIC O, CIPOLLA R. A manifold approach to face recognition from low quality video across illumination and pose using implicit super-resolution [C]. ICCV, Rio de Janeiro, Brazil, 2007, 1.

[59] BURTON A M, WILSON S, COWAN M, et al. Face recognition in poor-quality video: Evidence from security surveillance [J]. Psychological Science, 1999, 10(3): 243-248.

[60] ZHUANG Y, ZHANG J, WU F. Hallucinating faces: LPH super-resolution and neighbor reconstruction for residue compensation [J]. Pattern Recognition, 2007, 40(11): 3178-3194.

[61] LI B, CHANG H, SHAN S, et al. Low-resolution face recognition via coupled locality preserving mappings [J]. Signal Processing Letters, 2010, 17(1): 20-23.

[62] HENNINGS-YEOMANS P H, BAKER S, KUMAR B V. Simultaneous super-resolution and feature extraction for recognition of low-resolution faces [C]. Computer Vision and Pattern Recognition, Anchorage Alaska, 2008: 1467-1475.

[63] LI Y, LIN X. Face hallucination with pose variation [C]. Proceedings of the Automatic Face and Gesture Recognition, Seoul, Korea, 2004: 723-728.

[64] JIA K, GONG S. Multi-modal tensor face for simultaneous super-resolution and recognition [C]. ICCV, Beijing China, 2005: 1683-1690.

[65] JIA K, GONG S. Multi-modal face image super-resolutions in tensor space [C]. Advanced Video and Signal Based Surveillance, Como Italy, 2005: 121-128.

[66] JIA K, GONG S. Generalized face super-resolution [J]. IEEE Transactions on Image Processing, 2008, 17(6): 873-886.

[67] MA X, HUANG H, WANG S, et al. A simple approach to multiview face hallucination [J]. Signal Processing Letters, 2010, 17(6): 579-582.

[68] MA X, ZHANG J, QI C. Hallucinating face by position-patch [J]. Pattern Recognition, 2010, 43(6): 2224-2236.

[69] LIN F, COOK J, CHANDRAN V, et al. Face recognition from super-resolved images [C]. ISSPA, Nanjing China, 2005: 2217-2229.

[70] AL-AZZEH M, ELEYAN A, DEMIREL H. PCA-based face recognition from video using super resolution [C]. ICSIS, Istanbul, 2008: 1-4.

[71] ZHOU S, KRUEGER V, CHELLAPPA R. Probabilistic recognition of human faces from video [J]. Computer Vision and Image Understanding, 2003, 91(1): 214-245.

[72] HUANG H, HE H. Super-resolution method for face recognition using nonlinear mappings on coherent features [J]. IEEE Transactions on Neural Networks, 2010, 99: 1-10.

[73] HUANG H, ZENG X. Super-resolution method for multi-view face recognition from a single image per person using nonlinear mappings on coherent features [J]. Signal Processing Letters, 2012, 19(4): 195-198.

Source of the reference list: "Research on Face Recognition Based on Deep Learning", master's thesis, Dalian University of Technology.

 

 

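Copyright notice: This is an original article by Iknowyou, released under the CC 4.0 BY-SA license. Please include the link to the original source and this notice when reposting.
Article link: https://www.cnblogs.com/Iknowyou/archive/2014/03/02/3577402.html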