Fast Fourier transform spectroscopy has proved to be a powerful method for studying the secondary structure of proteins,
since peak positions and their relative amplitudes are affected by the number of hydrogen bridges that sustain this
secondary structure. However, to the best of our knowledge, the method has not yet been used to identify proteins
within a complex matrix such as a blood sample. The principal reason is the apparent similarity of protein infrared spectra,
whose actual differences are usually masked by the solvent contribution and other interactions. In this paper, we propose a
novel machine-learning-based method that uses protein spectra to classify and identify such proteins
within a given sample. The proposed method uses principal component analysis (PCA) to identify the most important linear
combinations of the original spectral components and then applies a support vector machine (SVM) classification model
to these combinations to assign each protein to one of the given groups. Our experiments were
performed on a set of four different proteins: Bovine Serum Albumin, Leptin, Insulin-like Growth Factor 2,
and Osteopontin. The proposed combination of principal component analysis and support vector machines
exhibits excellent classification accuracy when identifying proteins from their infrared spectra.
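
The PCA-then-SVM approach described above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the spectra are synthetic (one Gaussian band per class plus noise), and the band centers, band width, noise level, number of principal components, linear kernel, and the helper names `make_synthetic_spectra` and `build_classifier` are all assumptions made for the example. None of these values come from the paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def make_synthetic_spectra(n_per_class=40, n_points=200, seed=0):
    """Generate toy 'spectra' for four classes (hypothetical data, not measurements)."""
    rng = np.random.default_rng(seed)
    wavenumbers = np.linspace(1500.0, 1700.0, n_points)
    # Arbitrary band centers chosen only so the four classes differ slightly.
    centers = [1630.0, 1645.0, 1655.0, 1670.0]
    spectra, labels = [], []
    for label, center in enumerate(centers):
        band = np.exp(-0.5 * ((wavenumbers - center) / 12.0) ** 2)
        noise = rng.normal(0.0, 0.05, size=(n_per_class, n_points))
        spectra.append(band + noise)  # broadcast one band over all samples
        labels.append(np.full(n_per_class, label))
    return np.vstack(spectra), np.concatenate(labels)


def build_classifier(n_components=5):
    """PCA for dimensionality reduction, followed by an SVM on the PCA scores."""
    # The kernel and C value are assumptions; the abstract does not specify them.
    return make_pipeline(PCA(n_components=n_components), SVC(kernel="linear", C=1.0))


X, y = make_synthetic_spectra()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = build_classifier()
model.fit(X_train, y_train)  # PCA is fitted on the training spectra only
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy on synthetic spectra: {accuracy:.2f}")
```

Wrapping both steps in one pipeline keeps the PCA projection from seeing the held-out spectra, so the reported accuracy reflects only data the model was not fitted on. Real spectra would likely need preprocessing (for example baseline or solvent correction) that this sketch omits.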