BioMed Research International

Research Article | Open Access

Volume 2017 | Article ID 7039245 | 14 pages | https://doi.org/10.1155/2017/7039245

Mining of Microbial Genomes for the Novel Sources of Nitrilases

Academic Editor: Jiangke Yang
Received 25 Oct 2016
Revised 14 Feb 2017
Accepted 07 Mar 2017
Published 12 Apr 2017

Abstract

Next-generation DNA sequencing (NGS) has made it feasible to sequence a large number of microbial genomes, and advancements in computational biology have opened enormous opportunities to mine genome sequence data for novel genes and enzymes or their sources. In the present communication, in silico mining of microbial genomes has been carried out to find novel sources of nitrilases. The sequences selected were analyzed for homology and considered for designing motifs. The manually designed motifs based on amino acid sequences of nitrilases were used to screen 2000 microbial genomes (translated to proteomes). This resulted in the identification of 138 putative/hypothetical sequences which could potentially code for nitrilase activity. In vitro validation of nine predicted sources of nitrilases was done for nitrile/cyanide hydrolyzing activity. Of the nine predicted sources, Gluconacetobacter diazotrophicus, Sphingopyxis alaskensis, Saccharomonospora viridis, and Shimwellia blattae were specific for aliphatic nitriles, whereas nitrilases from Geodermatophilus obscurus, Nocardiopsis dassonvillei, Runella slithyformis, and Streptomyces albus possessed activity for aromatic nitriles. Flavobacterium indicum was specific towards potassium cyanide (KCN), with no activity for aliphatic, aromatic, or aryl nitriles, which revealed the presence of a nitrilase homolog, that is, cyanide dihydratase. The present study reports novel sources of nitrilases and cyanide dihydratase which have not been reported hitherto by in silico or in vitro studies.

1. Introduction

Advancements in DNA sequencing technologies have led to the sequencing of a large number of genomes, and the enormous sequence data are available in the public domain. Fourth-generation DNA sequencing has made it possible to sequence a bacterial genome within a few hours at a reasonably low cost [1–4]. As of today 5293 prokaryotic and 22 eukaryotic genomes have been completely sequenced and the sequence data are easily accessible in databases such as NCBI, GOLD, and IMG/ER. It is evident from previous studies that not all the gene/protein sequences in the databases are functionally characterized, which makes these repositories a rich source for the discovery of novel genes and proteins [5, 6]. Genome mining has emerged as an alternative approach to find novel sources of desired genes/proteins, as the conventional screening methods, which involve isolation of microbes and their screening for desired products, are time consuming, tedious, and cost intensive [7, 8].

Microbial nitrilases are considered to be the most important enzymes in the nitrilase superfamily and find application in the synthesis of fine chemicals, the production of some important acids and drug intermediates, and green chemistry [9–13]. Despite their wide applications, nitrilases have certain limitations, for example, their inactivation or inhibition by the acidic product, extremes of pH, temperature, and organic solvents [14, 15]. These limitations are being addressed either by the isolation of microorganisms from extreme habitats or by enrichment techniques for specific substrates using conventional microbiological procedures [6], which suffer from the limitations mentioned above. The present communication focuses on in silico screening of publicly available bacterial genomes for nitrilase genes and in vitro validation of the predicted novel sources of nitrilases.

2. Material and Methods

2.1. Genome Screening Using Homology and Motif Based Approach

Primary screening of microbial genomes (listed in the Supplementary Material available online at https://doi.org/10.1155/2017/7039245) was done using a homology based approach. Tblastn and blastp were used to screen the sequenced genomes with the query sequence to identify the presence and position of similar genes in each genome. Computationally predicted proteins from the bacterial genomes with the keyword “nitrilase/cyanide dihydratase” were also downloaded using the advanced search options in the IMG/ER database. Sequences with low (30%) and high (80%) similarity were discarded. Contigs showing the presence of nitrilase homologs were downloaded from IMG/ER. The GeneMarkS tool was used to predict the ORFs in each contig, and the output was downloaded with protein sequence selected as the output option. Sequences shorter than 100 amino acids were considered false positives (FP) and were discarded. A small amino acid sequence database was created and subjected to local BLAST to confirm the presence of a nitrilase homolog in the contigs of each genome.
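The two filters described above (an identity window and a minimum protein length) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the hits are available in the standard 12-column tab-separated BLAST layout, that the 30% and 80% figures act as exclusive lower and upper bounds, and that a hypothetical `protein_lengths` lookup supplies the length of each predicted protein.

```python
from typing import Dict, Iterable, List, Tuple

# Thresholds taken from the text; treating them as exclusive bounds is an assumption.
MIN_IDENTITY = 30.0   # hits at or below this percent identity are discarded
MAX_IDENTITY = 80.0   # hits at or above this percent identity are discarded
MIN_LENGTH_AA = 100   # shorter predicted proteins are treated as false positives


def filter_hits(
    blast_lines: Iterable[str],
    protein_lengths: Dict[str, int],
) -> List[Tuple[str, float]]:
    """Return (subject_id, percent_identity) pairs that pass both filters.

    blast_lines: tab-separated lines whose first three columns are
    query id, subject id, and percent identity.
    protein_lengths: hypothetical lookup of subject id -> length in amino acids.
    """
    kept = []
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject_id, pident = fields[1], float(fields[2])
        if not (MIN_IDENTITY < pident < MAX_IDENTITY):
            continue  # outside the identity window
        if protein_lengths.get(subject_id, 0) < MIN_LENGTH_AA:
            continue  # too short, or length unknown
        kept.append((subject_id, pident))
    return kept
```

A subject whose length is missing from the lookup is dropped here; a real workflow might prefer to flag such cases for manual review.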

In addition, protein based manually designed motifs (MDMs) were used to screen the bacterial genomes for the presence of conserved motifs using MAST (Motif Alignment and Search Tool) in the MEME (Multiple EM for Motif Elicitation) suite. The motifs used are described in our previous communication [12]. Motifs identified in sequences shorter than 100 amino acids were rejected as false positives (FP). Sequences longer than 100 amino acids were taken as true positives (TP).
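To make the motif notation concrete, the sketch below converts a manually designed motif into a regular expression and scans a protein sequence with it. It is a simplification and not how MAST works (MAST scores sequences against position-specific matrices). The handling of `X`, alone or inside a bracketed class, as "any residue" is an assumption about the notation, and the function names are hypothetical.

```python
import re
from typing import List


def motif_to_regex(motif: str) -> re.Pattern:
    """Convert a hyphen-separated motif such as 'C-W-E-H-[FLX]-[NQ]-[PT]-L'
    into a compiled regular expression."""
    parts = []
    for element in motif.replace(" ", "").split("-"):
        if not element:
            continue  # tolerate a trailing hyphen
        if "X" in element:
            parts.append(".")      # X (alone or in a class) matches any residue
        else:
            parts.append(element)  # single letter or bracketed class, used as-is
    return re.compile("".join(parts))


def find_motif(motif: str, sequence: str) -> List[int]:
    """Return 0-based start positions of non-overlapping motif matches."""
    return [m.start() for m in motif_to_regex(motif).finditer(sequence.upper())]
```

For example, `find_motif("A-X-E-G-R-C-[FW]-V-[LIV]", protein)` returns the positions at which the fourth aromatic motif occurs in `protein`; an empty list means the motif is absent under exact matching.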

2.2. Study of Physiochemical Properties and Phylogenetic Analysis of Predicted Nitrilases

Physiochemical data of the in silico predicted nitrilases were generated with the ProtParam tool on the ExPASy server and compared to the values deduced in the previous nitrilase study [16]. The properties calculated were the number of amino acids, molecular weight (Da), theoretical isoelectric point (pI), atomic composition, instability index, aliphatic index, and grand average of hydropathicity (GRAVY). A comparative chart was drawn between previously characterized and predicted nitrilases.
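Three of these properties are simple functions of amino acid composition and can be reproduced without the web server. The sketch below uses the Kyte-Doolittle hydropathy scale for GRAVY and the standard aliphatic index formula (mole percent Ala + 2.9 × Val + 3.9 × (Ile + Leu)); it is an illustration of the definitions rather than a replacement for ProtParam, and it assumes the sequence contains only the 20 standard residues.

```python
from typing import Tuple

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index from the mole percent of Ala, Val, Ile, and Leu."""
    n = len(sequence)

    def mole_percent(aa: str) -> float:
        return 100.0 * sequence.count(aa) / n

    return (mole_percent("A") + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def charged_residue_counts(sequence: str) -> Tuple[int, int]:
    """Return (negatively charged Asp+Glu, positively charged Arg+Lys)."""
    negative = sequence.count("D") + sequence.count("E")
    positive = sequence.count("R") + sequence.count("K")
    return negative, positive
```

A negative GRAVY value indicates a protein that is hydrophilic on average, which is how the signed values in the comparison of predicted and characterized nitrilases should be read.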

Previously characterized and predicted nitrilase sequences were aligned using Clustal W, and the resulting multiple sequence alignment was used to generate a Neighbor Joining (NJ) tree in MEGA version 6. The phylogenetic tree was generated in order to classify the predicted sequences as aliphatic or aromatic by their grouping with previously characterized nitrilases.

2.3. Nitrilase Activity Assay

Cultures of some of the bacteria predicted to have a nitrilase gene (Shimwellia blattae, Runella slithyformis, Geodermatophilus obscurus, Nocardiopsis dassonvillei, Streptomyces albus, Flavobacterium indicum, Saccharomonospora viridis, Sphingopyxis alaskensis, and Gluconacetobacter diazotrophicus) were procured from the Microbial Type Culture Collection (MTCC), Chandigarh. Escherichia coli BL21 (DE3) from Invitrogen was used as the negative control, as this organism does not have a nitrilase gene. These cultures were grown in the laboratory using different media (Table 1) for the production of nitrilase activity following the procedures described earlier [17–19]. Nitrilase activity was assayed in a 1.0 mL reaction mixture containing nitrile as substrate (1–10 mM) and 0.1 mL resting cells. After 30 min of incubation at 30°C, the reaction was quenched with 0.1 M HCl and the amount of ammonia released was estimated using the modified phenate-hypochlorite method described by Dennett and Blamey [20]. One unit of nitrilase activity was defined as the amount of enzyme required to release 1 μmole of ammonia per min under the assay conditions.
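The unit definition above translates directly into a small calculation. The following sketch is illustrative only: the function names are hypothetical, and normalizing by milligrams of dry cell weight (dcw) follows the units in which the activities are reported later in the article.

```python
def nitrilase_units(ammonia_umol: float, minutes: float) -> float:
    """Units of activity: micromoles of ammonia released per minute."""
    if minutes <= 0:
        raise ValueError("incubation time must be positive")
    return ammonia_umol / minutes


def specific_activity(ammonia_umol: float, minutes: float,
                      dry_cell_weight_mg: float) -> float:
    """Activity per mg dry cell weight (umol ammonia/min/mg dcw)."""
    if dry_cell_weight_mg <= 0:
        raise ValueError("dry cell weight must be positive")
    return nitrilase_units(ammonia_umol, minutes) / dry_cell_weight_mg
```

As a made-up example, 0.9 μmol of ammonia released in the 30 min incubation by 10 mg dcw of resting cells corresponds to 0.03 units, or 0.003 units per mg dcw.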


| Name of the organism | MTCC number | Composition | pH | Growth temperature |
|---|---|---|---|---|
| Shimwellia blattae ATCC 29907 | 4155 | Beef extract: 1.0 g; Yeast extract: 2.0 g; Peptone: 5.0 g; NaCl: 5.0 g; Agar: 15.0 g | 7.0–7.5 | 37°C |
| Runella slithyformis ATCC 29530 | 9504 | Glucose: 1.0 g; Peptone: 1.0 g; Yeast extract: 1.0 g; Agar: 15.0 g; Glucose: 4.0 g | 7.0–7.5 | 26°C |
| Geodermatophilus obscurus DSM 43160 | 4040 | Yeast extract: 4.0 g; Malt extract: 10.0 g; CaCO3: 2.0 g; Agar: 12.0 g | 7.2–7.5 | 28°C |
| Nocardiopsis dassonvillei DSM 43111 | 1411 | Yeast extract: 4.0 g; Malt extract: 1.0; Glucose: 4.0 g; Agar: 20.0 g | 7.2–7.4 | 28°C |
| Streptomyces albus J1074 | 1138 | Yeast extract: 4.0 g; Malt extract: 1.0 g; Glucose: 4.0 g; Agar: 20.0 g | 7.2–7.4 | 25°C |
| Flavobacterium indicum DSM 17447 | 6936 | Tryptic soy broth with agar (TSBA-100) | 7.3–7.5 | 30°C |
| Saccharomonospora viridis ATCC 15386 | 320 | Yeast extract: 4.0 g; Malt extract: 1.0 g; Glucose: 4.0 g; Agar: 20.0 g | 7.2–7.4 | 45°C |
| Sphingopyxis alaskensis DSM 13593 | 7504 | Beef extract: 1.0 g; Yeast extract: 2.0 g; Peptone: 5.0 g; NaCl: 5.0 g; Agar: 15.0 g | 7.0–7.5 | 30°C |
| Gluconacetobacter diazotrophicus ATCC 49037 | 1224 | Yeast extract: 5.0 g; Peptone: 3.0 g; Mannitol: 25.0 g; Agar: 15.0 g | 7.0–7.3 | 28°C |
| Escherichia coli BL21 (DE3)* | | Yeast extract: 5.0 g; NaCl: 10.0 g; Casein enzymatic hydrolysate: 10.0 g | 7.0–7.5 | 37°C |

*Negative control.

3. Results

3.1. Genome Screening Using Conserved Motifs and Homology Search

As many as 138 candidate sequences were identified at IMG/ER at both the gene and protein level using the homology based approach (tblastn and blastp). To identify newer sources of nitrilases, candidate sequences bearing unassigned functions (hypothetical, uncharacterized, or putative) were selected from the translated genomes (Table 2). The identified sequences shared 30–50% sequence identity with the biochemically characterized Rhodococcus rhodochrous J1 nitrilase, which was taken as the query sequence. Catalytic residues were found to be conserved in all the predicted proteins. Nine predicted and translated sequences were further chosen for in silico and in vitro validation based on the manually designed motifs (MDMs) (Tables 3 and 4) identified in the previous study [12].


| Name of organism | Scaffold or genome length (bp) with accession number | Total number of ORFs predicted in scaffold of complete genome | Predicted coding region for nitrilase | Number of base pairs |
|---|---|---|---|---|
| Acaryochloris marina MBIC11017 | NC_009925 (6503724 bp) | 152 | 200001–200999 | 999 |
| Acetobacter pasteurianus IFO 3283-32 | AP011157 (191443 bp) | 120 | 174107–173133 | 974 |
| Achromobacter xylosoxidans A8 | NC_014640 (7013095 bp) | 406 | 200001–200960 | 960 |
| Acidovorax avenae avenae ATCC 19860 | NC_015138 (5482170 bp) | 188 | 201035–200001 | 1035 |
| Acidothermus cellulolyticus 11B | NC_008578 (2443540 bp) | 403 | 200001–201131 | 1131 |
| Acidaminococcus fermentans VR4 | NC_013740 (2329769 bp) | 293 | 200924–200001 | 924 |
| Alcanivorax dieselolei B5 | CP003466 (4928223 bp) | 343 | 200001–200981 | 981 |
| Arthrobacter aurescens TC1 | NC_008711 (4597686 bp) | 385 | 200001–200930 | 930 |
| Azorhizobium caulinodans ORS 571 | NC_009937 (5369772 bp) | 262 | 89665–88580 | 1083 |
| Azospirillum sp. B510 | NC_013854 (3311395 bp) | 402 | 200001–200921 | 921 |
| Bacillus pumilus SAFR-032 | NC_009848 (3704465 bp) | 73 | 201026–200001 | 1026 |
| Bradyrhizobium japonicum USDA 110 | NC_004463 (9105828 bp) | 387 | 200001–200966 | 966 |
| Bradyrhizobium sp. BTAi1 | NC_009485 (8264687 bp) | 392 | 201146–200001 | 1146 |
| Bradyrhizobium sp. ORS278 | NC_009445 (7456587 bp) | 395 | 201041–200001 | 1041 |
| Brevibacillus brevis NBRC 100599 | NC_012491 (6296436 bp) | 182 | 200001–200960 | 960 |
| Flavobacterium indicum GPTSA100-9 | HE774682 (2993089 bp) | 317 | 200001–200981 | 981 |
| Saccharomonospora viridis P101 | NC_013159 (4308349 bp) | 315 | 200001–200996 | 996 |
| Sphingopyxis alaskensis DSM13593 | NC_008048 (3345170 bp) | 387 | 200001–201017 | 1017 |
| Burkholderia cenocepacia J2315 | NC_011000 (3870082 bp) | 393 | 199944–201050 | 1050 |
| Burkholderia glumae BGR1 | NC_012720 (141067 bp) | 154 | 47491–48477 | 1017 |
| Burkholderia gladioli BSR3 | NC_015376 (3700833 bp) | 338 | 200001–201014 | 1014 |
| Burkholderia phymatum | NC_010623 (2697374 bp) | 375 | 199971–201023 | 1023 |
| Burkholderia phytofirmans | NC_010681 (4467537 bp) | 357 | 200001–201035 | 1035 |
| Burkholderia sp. CCGE1002 | NC_014119 (1282816 bp) | 280 | 72013–73041 | 1020 |
| Burkholderia sp. CCGE1003 | NC_014540 (2966498 bp) | 344 | 200019–201041 | 1022 |
| Burkholderia vietnamiensis G4 | NC_009254 (1241007 bp) | 436 | 199986–201023 | 1037 |
| Burkholderia xenovorans LB400 | NC_007951 (4895836 bp) | 396 | 200001–200996 | 996 |
| Caulobacter sp. K31 | NC_010335 (233649 bp) | 219 | 180936–181871 | 935 |
| Chlorobium phaeobacteroides BS1 | NC_010831 (2736403 bp) | 382 | 200001–200936 | 936 |
| Clostridium difficile 630 | NC_009089 (4290252 bp) | 364 | 200001–200927 | 927 |
| Clostridium difficile CD196 | NC_013315 (4110554 bp) | 308 | 200001–200927 | 927 |
| Clostridium difficile R20291 | NC_013316 (4191339 bp) | 329 | 200001–200927 | 927 |
| Clostridium kluyveri NBRC 12016 | NC_011837 (3896121 bp) | 442 | 200001–200930 | 957 |
| Clostridium kluyveri ATCC 8527 | NC_009706 (3964618 bp) | 491 | 200001–200930 | 930 |
| Conexibacter woesei DSM 14684 | NC_013739 (6359369 bp) | 388 | 200001–200942 | 942 |
| Cupriavidus necator ATCC 17699 | NC_008313 (4052032 bp) | 318 | 200001–201017 | 1017 |
| Cupriavidus necator ATCC 43291 | NC_015726 (3872936 bp) | 318 | 200001–201017 | 1017 |
| Cyanobium gracile ATCC 27147 | Cyagr_Contig81 (3342364 bp) | 405 | 200001–200999 | 999 |
| Deinococcus deserti (strain VCD115) | NC_012529 (314317 bp) | 269 | 200001–200951 | 951 |
| Deinococcus peraridilitoris DSM 19664 | Deipe_Contig72.1 (3881839 bp) | 412 | 200001–200951 | 951 |
| Desulfomonile tiedjei ATCC 49306 | Desti_Contig107.1 (6500104 bp) | 379 | 200001–201029 | 1029 |
| Dickeya zeae Ech1591 | NC_012912 (4813854 bp) | 194 | 200001–200927 | 927 |
| Erwinia billingiae Eb661 | NC_014305 (169778 bp) | 194 | 87964–88965 | 1001 |
| Erythrobacter litoralis HTCC2594 | NC_007722 (3052398 bp) | 411 | 200001–200969 | 969 |
| Flavobacterium indicum DSM 17447 | HE774682 (2993089 bp) | 317 | 200001–200981 | 981 |
| Frateuria aurantia ATCC 33424 | Fraau_Contig24.1 (3603458 bp) | 366 | 200001–200924 | 924 |
| Geobacillus sp. Y4.1MC1 | NC_014650 (3840330 bp) | 434 | 200001–200966 | 966 |
| Geobacillus thermoglucosidasius C56-YS93 | NC_015660 (3893306 bp) | 446 | 200001–200966 | 966 |
| Geodermatophilus obscurus DSM 43160 | NC_013757 (5322497 bp) | 244 | 54102–54884 | 783 |
| Gluconacetobacter diazotrophicus ATCC 49037 | NC_010125 (3944163 bp) | 333 | 200001–200960 | 960 |
| Haliangium ochraceum DSM 14365 | CP002175 (2309262 bp) | 377 | 200001–200957 | 957 |
| Halanaerobium praevalens ATCC 33744 | NC_013440 (9446314 bp) | 262 | 200001–200999 | 999 |
| Hyphomicrobium sp. MC1 | NC_015717 (4757528 bp) | 392 | 200001–200984 | 984 |
| Janthinobacterium sp. Marseille | NC_009659 (4110251 bp) | 398 | 200001–201068 | 1068 |
| Jannaschia sp. CCS1 | NC_007802 (4317977 bp) | 382 | 200001–201026 | 1026 |
| Maricaulis maris MCS10 | NC_008347 (3368780 bp) | 392 | 200001–200933 | 933 |
| Methylobacterium extorquens CM4 | NC_011758 (380207 bp) | 211 | 7191–8267 | 1077 |
| Methylobacterium extorquens ATCC 14718 | NC_012811 (1261460 bp) | 436 | 200001–201077 | 1077 |
| Methylobacterium extorquens DM4 | NC_012988 (5943768 bp) | 378 | 200001–200918 | 918 |
| Methylobacterium extorquens PA1 | NC_010172 (5471154 bp) | 354 | 200001–201110 | 1110 |
| Methylomonas methanica MC09 | Contig38 (5051681 bp) | 402 | 200001–200996 | 996 |
| Methylobacterium nodulans ORS2060 | NC_011892 (487734 bp) | 425 | 200001–201116 | 1116 |
| Methylobacterium populi ATCC BAA-705 | NC_010725 (5800441 bp) | 193 | 61617–62693 | 1077 |
| Methylibium petroleiphilum PM1 | NC_008825 (4044195 bp) | 364 | 200001–201074 | 1074 |
| Methylobacterium radiotolerans ATCC 27329 | NC_010505 (6077833 bp) | 377 | 200001–201077 | 1077 |
| Methylocella silvestris BL2 | NC_011666 (4305430 bp) | 439 | 199971–201029 | 1029 |
| Mycobacterium intracellulare ATCC 13950 | CP003322 (5402402 bp) | 383 | 199938–200897 | 897 |
| Mycobacterium liflandii 128FXT | CP003899 (6208955 bp) | 405 | 200001–201059 | 1059 |
| Mycobacterium rhodesiae NBB3 | MycrhN_Contig54.1 (6415739 bp) | 267 | 200001–200957 | 957 |
| Mycobacterium smegmatis ATCC 700084 | CP001663 (6988208 bp) | 377 | 200001–200978 | 978 |
| Natranaerobius thermophilus ATCC BAA-1301 | NC_010718 (3165557 bp) | 387 | 200001–200930 | 930 |
| Nocardia farcinica IFM 10152 | NC_006361 (6021225 bp) | 390 | 198993–199811 | 818 |
| Nocardiopsis dassonvillei DSM 43111 | NC_014211 (775354 bp) | 353 | 201134–200001 | 843 |
| Oligotropha carboxidovorans ATCC 49405 | CP002826 (3595748 bp) | 372 | 200001–201065 | 1065 |
| Pantoea sp. At-9b | NC_014839 (394054 bp) | 349 | 114577–115581 | 1005 |
| Peptoniphilus duerdenii ATCC BAA-1640 | NZ_AEEH01000050 (96694 bp) | 80 | 52942–53863 | 921 |
| Photorhabdus asymbiotica ATCC 43949 | NC_012962 (5064808 bp) | 338 | 200001–201050 | 1050 |
| Pirellula staleyi ATCC 27377 | NC_013720 (6196199 bp) | 338 | 200001–200909 | 909 |
| Polaromonas naphthalenivorans CJ2 | NC_008781 (4410291 bp) | 389 | 200001–201041 | 1062 |
| Polaromonas sp. JS666 | NC_007948 (5200264 bp) | 398 | 200001–200942 | 942 |
| Pseudomonas syringae pv. lachrymans M302278PT | Lac106_115287.20 (115287 bp) | 107 | 47704–48747 | 1043 |
| Pseudoalteromonas atlantica ATCC BAA-1087 | NC_008228 (5187005 bp) | 397 | 200001–200921 | 921 |
| Pseudomonas aeruginosa P7-L633/96 | Ga0060317_132 (369634 bp) | 270 | 91986–92801 | 816 |
| Pseudomonas brassicacearum NFM421 | NC_015379 (6843248 bp) | 377 | 200001–201026 | 1026 |
| Pseudomonas sp. TJI-51 | AEWE01000051 (6502 bp) | 05 | 1482–2498 | 1017 |
| Pseudomonas fluorescens Pf-5 | NC_007492 (6438405 bp) | 349 | 200001–200924 | 924 |
| Pseudomonas fluorescens SBW25 | NC_012660 (6722539 bp) | 376 | 200043–200930 | 888 |
| Pseudomonas mendocina NK-01 | NC_015410 (5434353 bp) | 376 | 200001–200883 | 883 |
| Pseudomonas syringae pv. tomato DC 3000 | PSPTOimg_DC3000 (6397126 bp) | 377 | 200001–201011 | 1011 |
| Pseudomonas syringae pv. syringae B728a | NC_007005 (6093698 bp) | 196 | 8233–9231 | 999 |
| Pseudoxanthomonas suwonensis 11-1 | NC_014924 (3419049 bp) | 362 | 200001–200885 | 885 |
| Pseudonocardia dioxanivorans ATCC 55486 | CP002593 (7096571 bp) | 386 | 200001–201008 | 1008 |
| Ralstonia solanacearum GMI1000 | NC_003295 (3716413 bp) | 343 | 200001–201032 | 1032 |
| Rhizobium hainanense CCBAU 57015 | Ga0061100_113 (148344 bp) | 146 | 61240–62280 | 1040 |
| Rhizobium leguminosarum bv. viciae 3841 | NC_008380 (5057142 bp) | 397 | 200001–201047 | 1047 |
| Rhizobium leguminosarum bv. trifolii WSM1325 | NC_012850 (4767043 bp) | 210 | 18450–19442 | 993 |
| Rhodopseudomonas palustris TIE-1 | NC_011004 (5744041 bp) | 387 | 199980–201050 | 1070 |
| Rhodopseudomonas palustris DX-1 | NC_014834 (5404117 bp) | 390 | 200001–200954 | 954 |
| Rubrobacter xylanophilus DSM 9941 | NC_008148 (3225748 bp) | 385 | 200001–201080 | 1080 |
| Ruegeria pomeroyi ATCC 700808 | NC_006569 (491611 bp) | 308 | 118859–119893 | 1035 |
| Runella slithyformis ATCC 29530 | Unknown (6568739 bp) | 362 | 200001–200933 | 933 |
| Saccharothrix espanaensis ATCC 51144 | HE804045 (9360653 bp) | 347 | 200001–201020 | 1020 |
| Saccharomonospora viridis ATCC 15386 | NC_013159 (4308349 bp) | 315 | 200001–200996 | 996 |
| Shewanella halifaxensis HAW-EB4 | NC_010334 (5226917 bp) | 337 | 200001–200945 | 945 |
| Shewanella pealeana ATCC 700345 | NC_009901 (5174581 bp) | 333 | 200001–200945 | 945 |
| Shewanella sediminis HAW-EB3 | NC_009831 (5517674 bp) | 337 | 200001–200954 | 954 |
| Shewanella violacea JCM 1017 | NC_014012 (4962103 bp) | 307 | 200001–200936 | 936 |
| Shewanella woodyi ATCC 51908 | NC_010506 (5935403 bp) | 327 | 200001–201005 | 1005 |
| Shimwellia blattae ATCC 29907 | EBLc (4158725 bp) | 376 | 200001–201029 | 1029 |
| Singulisphaera acidiphila ATCC 1392 | Sinac_Contig49.1 (9629675 bp) | 337 | 200001–201014 | 1014 |
| Sorangium cellulosum Soce56 | NC_010162 (13033779 bp) | 329 | 200001–201029 | 1029 |
| Sphingopyxis alaskensis DSM 13593 | NC_008048 (3345170 bp) | 387 | 200001–201017 | 1016 |
| Sphaerobacter thermophilus DSM 20745 | NC_013524 (1252731 bp) | 335 | 200097–201092 | 995 |
| Sphingomonas wittichii RW1 | NC_009511 (5382261 bp) | 354 | 200001–201026 | 1026 |
| Spirosoma linguale ATCC 33905 | NC_013730 (8078757 bp) | 339 | 200001–200906 | 906 |
| Starkeya novella ATCC 8093 | NC_007604 (2695903 bp) | 402 | 200001–201005 | 1005 |
| Streptomyces albus J1074 | CP004370 (6841649 bp) | 252 | 1635309–1636256 | 948 |
| Synechococcus elongatus PCC 7942 | NC_007604 (2695903 bp) | 402 | 200001–201005 | 1005 |
| Syntrophobacter fumaroxidans DSM 10017 | NC_008554 (4990251 bp) | 337 | 200001–200987 | 987 |
| Synechococcus sp. ATCC 27264 | NC_010475 (3008047 bp) | 431 | 200001–201008 | 1008 |
| Synechococcus elongatus PCC 6301 | NC_006576 (2696255 bp) | 402 | 200001–201005 | 1005 |
| Synechococcus sp. PCC 7002 | NC_010475 (3008047 bp) | 431 | 200001–201008 | 1008 |
| Synechococcus sp. WH8102 | NC_005070 (2434428 bp) | 537 | 200001–201017 | 1017 |
| Synechocystis sp. | CP003265 (3569561 bp) | 371 | 200001–201026 | 1026 |
| Synechocystis sp. PCC 6803 | NC_017052 (3570103 bp) | 374 | 200001–201026 | 1026 |
| Terriglobus roseus KBS 63 | Terro_Contig51.1 (5227858 bp) | 354 | 200001–200873 | 873 |
| Tistrella mobilis KA081020-065 | CP003239 (1126962 bp) | 379 | 200001–201077 | 1077 |
| Variovorax paradoxus (strain EPS) | NC_014931 (6550056 bp) | 360 | 200001–201035 | 1035 |
| Variovorax paradoxus S110 | NC_012791 (5626353 bp) | 420 | 200001–201053 | 1053 |
| Verminephrobacter eiseniae EF01-2 | NC_008786 (5566749 bp) | 337 | 200001–200987 | 1020 |
| Zobellia galactanivorans DSM 12802 | FG20DRAFT (5340688 bp) | 331 | 200001–200951 | 951 |
| Zymomonas mobilis subsp. mobilis ATCC 10988 | NZ_ACQU01000006 (113352 bp) | 113 | 82520–83509 | 990 |
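The relationship between a predicted coding region and its length can be checked with a few lines of code. The sketch below assumes the coordinates are inclusive, that a start coordinate larger than the end coordinate indicates a gene on the reverse strand, and that the coding region includes the stop codon; none of these conventions is stated in the table, and some rows report a base-pair count that differs from this calculation.

```python
from typing import Tuple


def orf_span(start: int, end: int) -> Tuple[int, str]:
    """Return (length in bp, strand) for inclusive coordinates."""
    length_bp = abs(end - start) + 1
    strand = "+" if end >= start else "-"
    return length_bp, strand


def protein_length_aa(length_bp: int) -> int:
    """Number of amino acids encoded, assuming the ORF includes a stop codon."""
    if length_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length_bp // 3 - 1
```

For the Shimwellia blattae entry (200001–201029), this gives 1029 bp on the forward strand and 342 amino acids, which agrees with the length reported for that nitrilase in the physiochemical analysis.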


| Nitrilases | Manually designed motif |
|---|---|
| Aliphatic | [FL]-[ILV]-[AV]-F-P-E-[VT]-[FW]-[IL]-P-[GY]-Y-P-[WY] |
| Aliphatic | R-R-K-[LI]-[KRI]-[PA]-T-[HY]-[VAH]-E-R |
| Aliphatic | C-W-E-H-[FLX]-[NQ]-[PT]-L |
| Aliphatic | [VA]-A-X-[AV]-Q-[AI]-X-P-[VA]-X-[LF]-[SD] |
| Aromatic | [ALV]-[LV]-[FLM]-P-E-[AS]-[FLV]-[LV]-[AGP]-[AG]-Y-P |
| Aromatic | [AGN]-[KR]-H-R-K-L-[MK]-P-T-[AGN]-X-E-R |
| Aromatic | C-W-E-N-[HY]-M-P-[LM]-[AL]-R-X-X-[ML]-Y |
| Aromatic | A-X-E-G-R-C-[FW]-V-[LIV] |


| Nitrilases | Manually designed motif | Sequences identified in predicted nitrilases (1–9) |
|---|---|---|
| Aliphatic | [FL]-[ILV]-[AV]-F-P-E-[VT]-[FW]-[IL]-P-[GY]-Y-P-[WY] | A-F-P-E-V-F-V-P-A-Y-P-Y; F-P-E-L-W-L-P-G-Y-P-I-F; F-P-E-V-F-I-S-G-Y-P-Y-W-N-W; F-P-E-V-F-I-A-G-Y; F-P-E-T-F-V-P-Y-Y-P-Y |
| Aliphatic | R-R-K-[LI]-[KRI]-[PA]-T-[HY]-[VAH]-E-R | L-R-R-K-L-V-P-T-W; R-R-K-L-K-P-T-H-V-E-R; R-K-L-V-P-T-W-A-E-K-L-T; R-H-R-K-L-V-P-T-W-A-E-R; R-R-K-I-T-P-T-Y-H-E-R |
| Aliphatic | C-W-E-H-[FLX]-[NQ]-[PT]-L | C-G-E-N-T-N-T-L-A; C-A-E-N-M-Q-P-L; C-G-E-N-T-N-T-L-A; C-G-E-N-T-N-T-L-A-R-F-S; C-W-E-H-Y-N-P-L |
| Aliphatic | [VA]-A-X-[AV]-Q-[AI]-X-P-[VA]-X-[LF]-[SD] | V-A-A-V-Q-A-A-P-V-F-L-D-P; V-A-S-V-Q-A-E; V-Q-T-A-P-V-F-L-N-V-E; A-A-V-Q-A-A-P-V-F-L; A-A-V-Q-I-S-P-V-L |
| Aromatic | [ALV]-[LV]-[FLM]-P-E-[AS]-[FLV]-[LV]-[AGP]-[AG]-Y-P | F-Q-E-V-F-N-A; P-E-S-F-I-P-C-Y-P-R-G; F-P-E-A-F-L-G-T-Y-P; S-E-T-F-S-T-G |
| Aromatic | [AGN]-[KR]-H-R-K-L-[MK]-P-T-[AGN]-X-E-R | R-K-H-H-I-P-Q-V; H-R-K-L-K-P-T-G-L-E-R; H-R-K-V-M-P-T-G-A-E-R; R-K-L-H-P-F-T |
| Aromatic | C-W-E-N-[HY]-M-P-[LM]-[AL]-R-X-X-[ML]-Y | C-Y-D-R-H; C-W-E-N-Y-M-P-L-A-R-M; C-W-E-N-Y-M-P-L-L-R-A; C-Y-D-L-R-F-A |
| Aromatic | A-X-E-G-R-C-[FW]-V-[LIV] | A-H-L-W-K-L-E; A-L-E-G-R-C-F-V-L-A; A-L-E-G-R-C-W-V; A-I-E-N-Q-A-Y-V |

3.2. Physiochemical Parameters and Phylogenetic Analysis

The in silico identified nitrilases were analyzed for their physiochemical properties using ProtParam, an online tool at the ExPASy proteomic server. Average values deduced for aliphatic and aromatic nitrilases from earlier characterized proteins were taken as the standard for comparison with the predicted set of nitrilases. The values for the selected candidates were found to be very similar to the earlier published data of Sharma and Bhalla [16] (Table 5). The total number of amino acids ranged from 260 (Nocardiopsis dassonvillei) to 342 (Shimwellia blattae), with correspondingly different molecular weights. The isoelectric point ranged between 4.8 and 5.8, which is close to the consensus value, that is, the average value from previously characterized aliphatic or aromatic nitrilases.


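| Parameters | Average value for aliphatic | Average value for aromatic | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Number of amino acids | 352.2 | 309.8 | 338.0 | 331.0 | 326.0 | 260.0 | 280.0 | 310.0 | 315.0 | 342.0 | 319.0 |
| Molecular weight (Da) | 38274.0 | 33693.5 | 36154.9 | 36491.2 | 36364.7 | 27903.3 | 31464.1 | 34938.1 | 33821.5 | 37472.7 | 34678.7 |
| Theoretical pI | 5.5 | 5.5 | 5.0 | 4.9 | 6.2 | 5.2 | 5.6 | 5.4 | 4.8 | 5.4 | 5.8 |
| NCR | 41.7 | 35.8 | 41.0 | 44.0 | 40.0 | 32.0 | 36.0 | 43.0 | 43.0 | 41.0 | 39.0 |
| PCR | 30.3 | 29.2 | 26.0 | 25.0 | 37.0 | 21.0 | 27.0 | 34.0 | 29.0 | 30.0 | 32.0 |
| Extinction coefficients (M−1 cm−1) at 280 nm | 50213.3 | 43975.0 | 45295.0 | 33015.0 | 43890.0 | 35200.0 | 62465.0 | 53400.0 | 47900.0 | 38305.0 | 31775.0 |
| Instability index | 41.2 | 38.5 | 30.1 | 52.5 | 27.0 | 27.7 | 28.6 | 39.6 | 46.6 | 36.6 | 38.5 |
| Aliphatic index | 89.40 | 89.90 | 94.1 | 87.9 | 93.6 | 81.1 | 76.0 | 90.9 | 86.2 | 92.8 | 89.3 |
| Grand average of hydropathicity (GRAVY) | 00.100 | 0.01 | 0.027 | −0.17 | −0.14 | −0.051 | −0.283 | −0.109 | 0.045 | −0.052 | −0.002 |

NCR: negatively charged residues; PCR: positively charged residues.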

The Neighbor Joining (NJ) tree generated with MEGA 6 shows the phylogenetic relationship of the in silico predicted sequences from completely sequenced microbial genomes with previously characterized nitrilase sequences. The predicted sequences were distinguished as either aliphatic or aromatic according to their position in the phylogenetic tree (Figure 1).

3.3. In Vitro Validation of Some In Silico Predicted Nitrilases

To validate the nitrile transforming activity of the nine predicted novel sources of nitrilases, they were tested against common aliphatic, aromatic, and aryl nitriles and potassium cyanide (KCN). Gluconacetobacter diazotrophicus, Sphingopyxis alaskensis, Saccharomonospora viridis, and Shimwellia blattae were found to be more specific for aliphatic nitriles. On the other hand, Geodermatophilus obscurus, Nocardiopsis dassonvillei, Runella slithyformis, and Streptomyces albus exhibited nitrilase activity for aromatic nitriles. Flavobacterium indicum was the only organism which showed no activity for aliphatic, aromatic, or aryl nitriles but was specific towards the degradation of potassium cyanide (KCN) (Table 6). The negative control, that is, Escherichia coli BL21 (DE3), showed no activity for any of the substrates tested.


| Organisms | Valeronitrile | Benzonitrile | Mandelonitrile | Isobutyronitrile | Adiponitrile | 2-Cyanopyridine | Propionitrile | Acrylonitrile | KCN |
|---|---|---|---|---|---|---|---|---|---|
| Streptomyces albus J1074 | 0.0015 | 0.0027 | ND | 0.0014 | ND | ND | ND | ND | ND |
| Nocardiopsis dassonvillei DSM 43111 | ND | 0.0040 | 0.0024 | ND | ND | ND | ND | ND | ND |
| Geodermatophilus obscurus DSM 43160 | ND | 0.0043 | 0.0021 | ND | ND | ND | ND | ND | ND |
| Shimwellia blattae ATCC 29907 | 0.0028 | ND | 0.0016 | 0.0019 | ND | ND | ND | ND | ND |
| Runella slithyformis ATCC 29530 | ND | 0.0297 | 0.0152 | 0.0095 | ND | ND | 0.0169 | ND | ND |
| Gluconacetobacter diazotrophicus ATCC 49037 | ND | 0.0016 | 0.0020 | 0.0051 | 0.0048 | ND | ND | ND | ND |
| Sphingopyxis alaskensis DSM 13593 | ND | 0.00073 | ND | 0.0024 | 0.00075 | ND | ND | ND | ND |
| Saccharomonospora viridis ATCC 15386 | ND | ND | ND | 0.0030 | ND | ND | ND | ND | ND |
| Flavobacterium indicum DSM 17447 | ND | ND | ND | ND | ND | ND | ND | ND | 0.25 |
| Escherichia coli BL21 (DE3)* | ND | ND | ND | ND | ND | ND | ND | ND | ND |

Activity on each substrate is expressed as µmole of ammonia released/min/mg dcw under the assay conditions; ND = not detected; *negative control.
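The activity data can also be summarized programmatically, for instance to report the substrate on which each organism was most active. The sketch below transcribes only the detected activities from the table above; the assignment of each value to a substrate column is a reading of the extracted table and should be checked against the published version. The names `ACTIVITY` and `preferred_substrate` are hypothetical.

```python
from typing import Dict, Optional

# Detected activities (umol ammonia/min/mg dcw); substrates not listed were ND.
ACTIVITY: Dict[str, Dict[str, float]] = {
    "Streptomyces albus": {"Valeronitrile": 0.0015, "Benzonitrile": 0.0027,
                           "Isobutyronitrile": 0.0014},
    "Nocardiopsis dassonvillei": {"Benzonitrile": 0.0040, "Mandelonitrile": 0.0024},
    "Geodermatophilus obscurus": {"Benzonitrile": 0.0043, "Mandelonitrile": 0.0021},
    "Shimwellia blattae": {"Valeronitrile": 0.0028, "Mandelonitrile": 0.0016,
                           "Isobutyronitrile": 0.0019},
    "Runella slithyformis": {"Benzonitrile": 0.0297, "Mandelonitrile": 0.0152,
                             "Isobutyronitrile": 0.0095, "Propionitrile": 0.0169},
    "Gluconacetobacter diazotrophicus": {"Benzonitrile": 0.0016,
                                         "Mandelonitrile": 0.0020,
                                         "Isobutyronitrile": 0.0051,
                                         "Adiponitrile": 0.0048},
    "Sphingopyxis alaskensis": {"Benzonitrile": 0.00073, "Isobutyronitrile": 0.0024,
                                "Adiponitrile": 0.00075},
    "Saccharomonospora viridis": {"Isobutyronitrile": 0.0030},
    "Flavobacterium indicum": {"KCN": 0.25},
    "Escherichia coli BL21 (DE3)": {},  # negative control, nothing detected
}


def preferred_substrate(organism: str) -> Optional[str]:
    """Return the substrate with the highest detected activity, or None."""
    detected = ACTIVITY[organism]
    if not detected:
        return None
    return max(detected, key=detected.get)
```

Ranking by the single highest activity is only one way to summarize substrate preference; it ignores differences in substrate concentration and solubility that a fuller kinetic comparison would need to address.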

4. Discussion

Annotation of sequenced genomes to identify new genes has become an integral part of research in bioinformatics [21–24]. The present investigation has revealed some novel sources of nitrilases. The homology and conserved motif approaches identified proteins predicted as nitrilase, cyanide dihydratase, or carbon-nitrogen hydrolase in 138 bacterial genomes. The manually designed motifs (MDMs) also differentiated the in silico predicted nitrilases as aliphatic or aromatic [12], as the designed motifs are class specific. All four motifs identified were uniformly conserved throughout the two sets of aliphatic and aromatic nitrilases, as shown in Table 4.

The sequences belonged to the nitrilase superfamily, with the catalytic triad Glu (E), Lys (K), and Cys (C) conserved throughout. Phylogenetic analysis of the aliphatic and aromatic sets of protein sequences using MEGA version 6.0 revealed two major clusters. In the Neighbor Joining (NJ) tree, the in silico predicted proteins (this study) grouped with the previously identified aliphatic and aromatic nitrilases [16] in their respective clusters (Figure 1).

Aliphaticity and aromaticity of in silico predicted and characterized nitrilases were differentiated based on their physiochemical properties. The physicochemical properties of the predicted set of nitrilase were deduced using the ProtParam subroutine of Expert Protein Analysis System (ExPASy) from the proteomic server of the Swiss Institute of Bioinformatics (SIB), in order to predict aromaticity or aliphaticity. Several of the parameters (number of amino acids, molecular weight, number of negatively charged residues, extinction coefficients, and grand average of hydropathicity) listed in Table 5 are closer to the consensus values reported for aromatic and aliphatic nitrilases, supporting that the predicted set of nitrilase has aromatic or aliphatic substrate specificity (Table 5).

In silico predictions were verified by in vitro validation of the predicted proteins. Common nitriles (aliphatic, aromatic, and aryl nitriles) and potassium cyanide (KCN) were tested to check for the nitrile/cyanide transforming ability of the predicted proteins. Out of nine predicted proteins eight were found active for different nitriles, whereas Flavobacterium indicum was found to hydrolyze toxic cyanide (KCN) into nontoxic form (Table 6). The present approach contributed to finding novel sources of desired nitrilase from microbial genome database.

5. Conclusion

Genome mining for novel sources of nitrilases has predicted 138 sources for nitrilases. In vitro validation of the selected nine predicted sources of nitrilases for nitrile/cyanide hydrolyzing activity has furthered the scope of genome mining approaches for the discovery of novel sources of enzymes.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors are thankful to the Department of Biotechnology (DBT), New Delhi, for the continuous support to the Bioinformatics Centre, Himachal Pradesh University, Summer Hill, Shimla, India.

Supplementary Materials

List of organisms with completely sequenced genomes available at NCBI and IMG/ER.

References

  1. J. L. Seffernick, S. K. Samanta, T. M. Louie, L. P. Wackett, and M. Subramanian, “Investigative mining of sequence data for novel enzymes: a case study with nitrilases,” Journal of Biotechnology, vol. 143, no. 1, pp. 17–26, 2009.
  2. M. Land, L. Hauser, S. Jun et al., “Insights from 20 years of bacterial genome sequencing,” Functional & Integrative Genomics, vol. 15, no. 2, pp. 141–161, 2015.
  3. D. Yadav, “Relevance of bioinformatics in the era of omics driven research,” Journal of Next Generation Sequencing & Applications, vol. 2, no. 1, article e102, 2015.
  4. Y. Feng, Y. Zhang, C. Ying, D. Wang, and C. Du, “Nanopore-based fourth-generation DNA sequencing technology,” Genomics, Proteomics & Bioinformatics, vol. 13, no. 1, pp. 4–16, 2015.
  5. S. G. Van Lanen and B. Shen, “Microbial genomics for the improvement of natural product discovery,” Current Opinion in Microbiology, vol. 9, no. 3, pp. 252–260, 2006.
  6. X. J. Luo, H. L. Yu, and J. H. Xu, “Genomic data mining: an efficient way to find new and better enzymes,” Enzyme Engineering, vol. 1, article 104, 2012.
  7. P. De Sousa-Pereira, F. Amado, J. Abrantes, R. Ferreira, P. J. Esteves, and R. Vitorino, “An evolutionary perspective of mammal salivary peptide families: cystatins, histatins, statherin and PRPs,” Archives of Oral Biology, vol. 58, no. 5, pp. 451–458, 2013.
  8. J. L. Adrio and A. L. Demain, “Microbial enzymes: tools for biotechnological processes,” Biomolecules, vol. 4, no. 1, pp. 117–139, 2014.
  9. J. Raj, N. Singh, S. Prasad, A. Seth, and T. C. Bhalla, “Bioconversion of benzonitrile to benzoic acid using free and agar entrapped cells of Nocardia globerula NHB-2,” Acta Microbiologica et Immunologica Hungarica, vol. 54, no. 1, pp. 79–88, 2007.
  10. M. A. Rao, R. Scelza, R. Scotti, and L. Gianfreda, “Role of enzymes in the remediation of polluted environments,” Journal of Soil Science and Plant Nutrition, vol. 10, no. 3, pp. 333–353, 2010.
  11. P. Kaul and Y. Asano, “Strategies for discovery and improvement of enzyme function: state of the art and opportunities,” Microbial Biotechnology, vol. 5, no. 1, pp. 18–33, 2012.
  12. N. N. Sharma, M. Sharma, and T. C. Bhalla, “Nocardia globerula NHB-2 nitrilase catalysed biotransformation of 4-cyanopyridine to isonicotinic acid,” AMB Express, vol. 2, article 25, 2012.
  13. S. K. Bhatia, P. K. Mehta, R. K. Bhatia, and T. C. Bhalla, “Optimization of arylacetonitrilase production from Alcaligenes sp. MTCC 10675 and its application in mandelic acid synthesis,” Applied Microbiology and Biotechnology, vol. 98, no. 1, pp. 83–94, 2014.
  14. J.-S. Gong, Z.-M. Lu, H. Li, J.-S. Shi, Z.-M. Zhou, and Z.-H. Xu, “Nitrilases in nitrile biocatalysis: recent progress and forthcoming research,” Microbial Cell Factories, vol. 11, article 142, 2012.
  15. P. W. Ramteke, N. G. Maurice, B. Joseph, and B. J. Wadher, “Nitrile-converting enzymes: an eco-friendly tool for industrial biocatalysis,” Biotechnology and Applied Biochemistry, vol. 60, no. 5, pp. 459–481, 2013.
  16. N. Sharma and T. C. Bhalla, “Motif design for nitrilases,” Journal of Data Mining in Genomics & Proteomics, vol. 3, article 119, 2012.
  17. N. Sharma, R. Kushwaha, J. S. Sodhi, and T. C. Bhalla, “In silico analysis of amino acid sequences in relation to specificity and physiochemical properties of some microbial nitrilases,” Journal of Proteomics and Bioinformatics, vol. 2, no. 4, pp. 185–192, 2009.
  18. T. C. Bhalla, A. Miura, A. Wakamoto, Y. Ohba, and K. Furuhashi, “Asymmetric hydrolysis of α-aminonitriles to optically active amino acids by a nitrilase of Rhodococcus rhodochrous PA-34,” Applied Microbiology and Biotechnology, vol. 37, no. 2, pp. 184–190, 1992.
  19. V. Vejvoda, D. Kubáč, A. Davidová et al., “Purification and characterization of nitrilase from Fusarium solani IMI196840,” Process Biochemistry, vol. 45, no. 7, pp. 1115–1120, 2010.
  20. G. V. Dennett and J. M. Blamey, “A new thermophilic nitrilase from an antarctic hyperthermophilic microorganism,” Frontiers in Bioengineering and Biotechnology, vol. 4, article 5, 2016.
  21. J. K. Fawcett and J. E. Scott, “A rapid and precise method for the determination of urea,” Journal of Clinical Pathology, vol. 13, pp. 156–159, 1960.
  22. I. Friedberg, “Automated protein function prediction—the genomic challenge,” Briefings in Bioinformatics, vol. 7, no. 3, pp. 225–242, 2006.
  23. J. Armengaud, “A perfect genome annotation is within reach with the proteomics and genomics alliance,” Current Opinion in Microbiology, vol. 12, no. 3, pp. 292–300, 2009.
  24. M. S. Poptsova and J. P. Gogarten, “Using comparative genome analysis to identify problems in annotated microbial genomes,” Microbiology, vol. 156, no. 7, pp. 1909–1917, 2010.

Copyright © 2017 Nikhil Sharma et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
