Intelligent Computing Lab.
Bioinformatics in NCTU, Taiwan.

SCMHBP: Prediction and analysis of heme binding proteins using propensity scores of dipeptides

Release 1.10, Last update: Aug 5 2014


Heme binding proteins (HBPs) are metalloproteins containing a heme ligand (an iron-porphyrin complex) as a prosthetic group. Hemes exist in different forms, of which the b- and c-types are the most common. The b-type heme binds to proteins non-covalently, whereas the vinyl groups of the c-type heme form covalent bonds with the two cysteine residues of the Cys-Xaa-Xaa-Cys-His motif. Other important hemes include the a-, o-, and d-types found in bacteria and eukaryotes. Hemes bind to HBPs specifically according to their type. Understanding the structure-function relationships of HBPs is useful for rational HBP engineering.
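The Cys-Xaa-Xaa-Cys-His motif mentioned above can be located in a sequence with a simple pattern search. The following is a minimal sketch and is not part of SCMHBP; the function name `find_cxxch` and the example sequence are made up for illustration, and a motif match alone does not establish that a protein binds c-type heme.

```python
import re

# Cys-Xaa-Xaa-Cys-His: two cysteines separated by any two residues,
# followed directly by a histidine. The lookahead makes the search
# report overlapping occurrences as well.
CXXCH = re.compile(r"(?=(C[A-Z]{2}CH))")

def find_cxxch(sequence):
    """Return (0-based start, matched motif) for every CXXCH occurrence.

    Hypothetical helper for illustration only.
    """
    return [(m.start(), m.group(1)) for m in CXXCH.finditer(sequence.upper())]

# Made-up sequence, not taken from the SCMHBP datasets.
example = "MKACAACHGTTCQSCHK"
print(find_cxxch(example))
```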

This study proposes a method based on the scoring card method (SCM), named SCMHBP, for predicting and analyzing HBPs from sequences. First, a balanced dataset of 747 HBPs (selected using the Gene Ontology term GO:0020037) and 747 non-HBPs (selected from 92,309 putative non-HBPs) at 25% sequence identity was established. Next, SCM estimates a set of HBP propensity scores for amino acids and dipeptides by maximizing the prediction accuracy of SCMHBP. Finally, we use the estimated propensity scores to identify physicochemical properties that are informative for categorizing HBPs. The training accuracy of SCMHBP is 85.90%, and its mean test accuracy on three independent test datasets is 83.76%.
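To illustrate how dipeptide propensity scores can turn a sequence into a prediction, here is a minimal sketch. It assumes that a sequence is scored as the sum of its dipeptide composition weighted by the propensity scores and is called an HBP when that score exceeds a threshold; this page does not spell out the formula, so treat the rule as an assumption. The names `scm_score` and `predict_hbp`, the score values, and the threshold are all hypothetical and are not the estimated SCMHBP scores.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs of the 20 standard amino acids.
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]

def dipeptide_composition(sequence):
    """Fraction of each dipeptide among the adjacent residue pairs."""
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return {dp: 0.0 for dp in DIPEPTIDES}
    return {dp: c / total for dp, c in counts.items()}

def scm_score(sequence, propensity):
    """Assumed scoring rule: composition-weighted sum of propensity scores."""
    comp = dipeptide_composition(sequence)
    return sum(comp[dp] * propensity.get(dp, 0.0) for dp in DIPEPTIDES)

def predict_hbp(sequence, propensity, threshold):
    """Assumed decision rule: predict HBP when the score exceeds the threshold."""
    return scm_score(sequence, propensity) > threshold

# Placeholder scores for illustration; real scores would come from the
# estimation step described above.
toy_propensity = {dp: 500.0 for dp in DIPEPTIDES}
toy_propensity["CH"] = 900.0

print(scm_score("ACHCH", toy_propensity))
print(predict_hbp("ACHCH", toy_propensity, threshold=600.0))
```

Under this assumed rule, each propensity score can be read directly as the contribution of one dipeptide to the final score, which is what lets the scores be analyzed rather than only used for prediction.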

SCMHBP aims to discover knowledge that furthers the understanding of HBPs, rather than only pursuing high prediction accuracy. The datasets and source code of SCMHBP are available here.

Flowchart of the system design for predicting and characterizing heme binding proteins (HBPs).

Heat map of the heme protein propensity scores of dipeptides

Contact:
Hui-Ling Huang

Related publications of SCM

Huang HL, Charoenkwan P, Kao TF, Lee HC, Chang FL, Huang WL, Ho SJ, Shu LS, Chen WL, Ho SY: Prediction and analysis of protein solubility using a novel scoring card method with dipeptide composition. BMC Bioinformatics 2012, 13.

Charoenkwan P, Shoombuatong W, Lee HC, Chaijaruwanich J, Huang HL, Ho SY: SCMCRYS: Predicting Protein Crystallization Using an Ensemble Scoring Card Method with Estimating Propensity Scores of P-Collocated Amino Acid Pairs. PLoS One 2013, 8(9).