Statistical learning machines from ATR to DNA micro arrays: design, assessment, and advice for practitioners
The International Conference on Electrical Engineering
Article 187, Volume 6, 6th International Conference on Electrical Engineering ICEENG 2008, May 2008, Page 1-29
Document Type: Original Article
DOI: 10.21608/iceeng.2008.34641
Author
Waleed A. Yousef
Faculty of Computers and Information, Helwan University.
Abstract
Statistical Learning is the process of estimating an unknown probabilistic input-output relationship of a system from a limited number of observations, and a statistical learning machine (SLM) is the machine that has learned such a relationship. While their roots lie deep in Probability Theory, SLMs are ubiquitous in the modern world. Automatic Target Recognition (ATR) in military applications, Computer Aided Diagnosis (CAD) in medical imaging, DNA microarrays in Genomics, Optical Character Recognition (OCR), Speech Recognition (SR), spam email filtering, and stock market prediction are a few examples of SLM applications: diverse fields, but one theory. The field of Statistical Learning can be decomposed into two basic subfields, Design and Assessment. By Design we mean choosing the appropriate method that learns from the data to construct an SLM that achieves good performance. By Assessment we mean attributing performance measures to the designed SLM so that it can be assessed objectively. To achieve these two objectives, the field draws on several other fields: Probability, Statistics, and Matrix Theory; Optimization, Algorithms, and Programming, among others. Three main groups of specialists (statisticians, engineers, and computer scientists, ordered ascendingly by programming capabilities and descendingly by mathematical rigor) work in this field, and each takes its own large bite of it. The exaggerated rigor of statisticians sometimes keeps them from considering new ML techniques and methods that do not yet have a "complete" mathematical theory. On the other hand, the immoderate ad hoc simulations of computer scientists sometimes drive them toward unjustified and immature results. A prudent approach is needed, one flexible enough to use simulation and trial and error without sacrificing rigor. If this prudent attitude is necessary for this field, it is equally necessary in other fields of Engineering.
In the spirit of this prelude, this article is intended to be a pilot view of the field that sheds light on SLM applications, the Design and Assessment stages, the necessary mathematical and analytical tools, and some state-of-the-art references and research.
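The split between Design and Assessment described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the synthetic one-dimensional Gaussian data, the hypothetical helper names `make_data`, `design`, and `auc`, the midpoint-of-class-means scoring rule, and the choice of the empirical area under the ROC curve (AUC) as the performance measure.

```python
import random


def make_data(n, rng):
    """Draw n labelled observations (assumed model, for illustration only):
    class 0 ~ N(0, 1), class 1 ~ N(1.5, 1)."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = rng.gauss(1.5 * y, 1.0)
        data.append((x, y))
    return data


def design(train):
    """Design stage: learn a scoring rule from a limited training set.
    The score is the signed distance from the midpoint of the class means."""
    xs0 = [x for x, y in train if y == 0]
    xs1 = [x for x, y in train if y == 1]
    m0 = sum(xs0) / len(xs0)
    m1 = sum(xs1) / len(xs1)
    mid = (m0 + m1) / 2.0
    sign = 1.0 if m1 >= m0 else -1.0
    return lambda x: sign * (x - mid)


def auc(score, test):
    """Assessment stage: empirical AUC, i.e. the fraction of
    (positive, negative) pairs ranked correctly, ties counted as 1/2."""
    pos = [score(x) for x, y in test if y == 1]
    neg = [score(x) for x, y in test if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)          # fixed seed so the run is repeatable
train = make_data(200, rng)     # observations used for Design
test = make_data(200, rng)      # independent observations used for Assessment
slm = design(train)
estimated_auc = auc(slm, test)
print(f"estimated AUC on held-out data: {estimated_auc:.3f}")
```

Keeping the training and test sets separate is one simple way to make the assessment objective in the sense used above: the performance estimate is not computed on the same observations that shaped the learned rule.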
Keywords
Statistical Learning; Machine Learning; Pattern Recognition; Pattern Classification; Automatic Target Recognition; Computer Aided Diagnosis; Classifier Assessment; Receiver Operating Characteristics