NC State
BioResources
Li, X., Geng, X., Gao, L., Wu, Y., Wang, Y., Geng, A., Sun, J., and Jiang, J. (2019). "Optimized expression of a hyperthermostable endoglucanase from Pyrococcus horikoshii in Arabidopsis thaliana," BioRes. 14(2), 2812-2826.

Abstract

Manufacturing microbial cellulases in plants is an attractive strategy for the cost-effective production of cellulosic ethanol, particularly when the cellulase is thermostable and therefore has no negative effects on plant growth and development. The β-1,4-endoglucanase from Pyrococcus horikoshii (EGPh) is considered one of the most promising glycosyl hydrolases for the biofuel and textile industries because of its hyperthermostability and its ability to hydrolyze crystalline cellulose, and it has been researched extensively in recent years. In this study, the coding sequence of EGPh was codon-optimized, fitted with a eukaryotic Kozak sequence, and expressed in Arabidopsis thaliana under the control of the CaMV35S promoter. The expression of EGPh caused no deleterious effects on the growth and development of transgenic A. thaliana. The heterologous EGPh showed relatively high activities, up to 111.69 and 13.35 U mg⁻¹ total soluble protein against the soluble substrate carboxymethyl cellulose (CMC) and the insoluble microcrystalline cellulose Avicel, respectively. Subcellular localization analysis showed that the EGPh protein was targeted to the plasma membrane and cell wall. Based on these results, it is proposed that EGPh is an ideal candidate for the commercial production of hyperthermostable endoglucanase using plants as biofactories.


Full Article

Optimized Expression of a Hyperthermostable Endoglucanase from Pyrococcus horikoshii in Arabidopsis thaliana

Xia Li, Xiaoyan Geng, Lu Gao, Yanfang Wu, Yongli Wang, Alei Geng, Jianzhong Sun,* and Jianxiong Jiang *

Keywords: Heterologous expression; Hyperthermostable endoglucanase; Arabidopsis thaliana; Subcellular localization

Contact information: Biofuels Institute, School of the Environment and Safety Engineering, Jiangsu University, 301 Xuefu Road, Zhenjiang 212013, Jiangsu, China;

* Corresponding authors: jxjiang2002@ujs.edu.cn; jzsun1002@ujs.edu.cn

INTRODUCTION

Lignocellulose is the most abundant material on earth. The annual yield of lignocellulose is estimated to be 150 to 170 × 10⁹ tons, accounting for 70% of the global biomass production (Duchesne and Larson 1989; Poorter and Villar 1997; Pauly and Keegstra 2008). Therefore, producing renewable liquid biofuels, such as ethanol, butanol, or other fermentative products, from lignocellulose has the advantages of an abundant raw material and of not competing for land and food supply, as first-generation biofuel feedstocks have done. According to a report by the U.S. Department of Energy (DOE) and the U.S. Department of Agriculture (USDA), the production of lignocellulosic ethanol will reach 30% of liquid fuel by 2050 (Chen and Peng 2014). However, with current technologies, the cost of bioconversion of lignocellulose to ethanol remains high. The major barriers are the high cost of transporting feedstocks, the thermo-chemical pretreatment needed to make the cellulose more accessible to cellulolytic enzymes, and the large quantity of microbial-derived cellulases required during the hydrolysis of cellulose (Devaiah et al. 2013; Singh et al. 2015). In such processes, cellulases account for 20% of the total cost of cellulosic ethanol (Phitsuwan et al. 2012). Therefore, cost-effective methods for producing cellulolytic enzymes must be explored.

Plants have been proposed as excellent bioreactors for manufacturing large amounts of cellulases at a low cost. It was reported that the cost of enzymes produced in plants was 3- to 70-fold lower than that of enzymes from other production systems (Menkhaus et al. 2004). Moreover, plant biofactories offer several other advantages, including eukaryotic post-translational modification, an easily controlled scale of production, and easy collection and storage (Twyman et al. 2003; Sharma and Sharma 2009). Expressing cellulases in the lignocellulosic feedstock itself has become especially attractive because the feedstock can then play a dual role as both the biomass substrate and the enzyme provider. In recent years, considerable progress has been made in this field. The three main classes of enzymes for lignocellulose degradation (cellulases, hemicellulases, and lignin-degrading enzymes) have been successfully expressed in maize (Devaiah et al. 2013), Arabidopsis (Ziegler et al. 2000), rice (Chou et al. 2011), and tobacco (Gray et al. 2008). However, the expression of mesophilic cellulases has deleterious effects on plants via cell-wall degradation at normal temperatures, resulting in reduced or stunted growth or reduced fertility (Gray et al. 2011; Klose et al. 2013). One strategy to prevent these harmful effects is to express thermostable cellulases with an optimal temperature above 60 °C, which are not active during plant growth (Jiang and Li 2009) and can then be activated at high temperature during post-harvest treatment. Moreover, thermostable cellulases would benefit the industrial process of biomass degradation by eliminating bacterial contamination and by increasing the reaction rate and substrate solubility when enzymatic hydrolysis is performed at high temperatures (Haki and Rakshit 2003; Kishishita et al. 2015). Thermostable cellulases from Acidothermus cellulolyticus and Thermomonospora fusca have been expressed in various plant species with no harmful effects, and they simplified processing and reduced exogenous enzyme loading in cellulosic ethanol production (Ziegler et al. 2000; Ransom et al. 2007; Chou et al. 2011).

The hyperthermophilic β-1,4-endoglucanase (EC 3.2.1.4) (EGPh; glycosyl hydrolase family 5) was identified from Pyrococcus horikoshii and is the first hyperthermostable endoglucanase for which celluloses are the best substrates, including Avicel, carboxymethyl cellulose (CMC), and β-glucose oligomers (Ando et al. 2002). Because of its strong hydrolytic activity toward crystalline cellulose, its optimum reaction temperature of 95 °C, and its ability to hydrolyze cellulose completely to glucose at high temperature in combination with the hyperthermophilic β-glucosidase (EC 3.2.1.21) from Pyrococcus furiosus, this enzyme is considered an ideal candidate for the industrial hydrolysis of cellulose (Kashima et al. 2005; Kim and Ishikawa 2010a). Therefore, it has been extensively researched in recent years. Its crystal structure was determined in a previous study (Kim and Ishikawa 2010b), and the relationship between its function and crystal structure was subsequently studied (Yang et al. 2012; Kim and Ishikawa 2013). This endoglucanase was successfully produced at over 100 mg/L by the fungus Talaromyces cellulolyticus, which was the first step toward the industrial-scale production of EGPh (Biswas et al. 2006).

The objective of this research was to test the effects of expressing EGPh in biomass crops on reducing cellulase loading during the pretreatment process to reduce the bioconversion cost of lignocellulose to ethanol. To achieve a high expression level, the codon optimization was conducted, and the Kozak sequence was added immediately preceding the AUG codon. The enzyme activity, the subcellular localization of the recombinant EGPh, and the phenotype of the transgenic plants were analyzed to evaluate the application prospect of heterologous EGPh in industry.

EXPERIMENTAL

Materials

Arabidopsis thaliana wild-type Columbia (Col-0) and Agrobacterium tumefaciens EHA105 were preserved in the authors’ lab (Zhenjiang, China). The plant expression vector pBI121 was provided by Nanjing Forestry University (Nanjing, China). The Taq DNA polymerase, T4 DNA ligase, and DNA extraction kit were purchased from Takara Biotechnology Co., Ltd. (Dalian, China).

Methods

Codon optimization and gene synthesis

The coding sequence of the hyperthermostable β-1,4-endoglucanase EGPh gene (Gene ID: PH1171) of P. horikoshii was optimized based on Sorghum bicolor codon usage using the OptimumGene™ codon optimization algorithm (GenScript Co., Ltd., Nanjing, China). The Kozak sequence ACCACC was added immediately preceding the initiator codon ATG of the optimized sequence. The XbaI and SmaI restriction sites were added at the 5’ and 3’ ends, respectively. The whole sequence was synthesized by GenScript Co., Ltd. (Nanjing, China) and cloned into the pUC57 plasmid.
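
The layout of the synthesized fragment described above (XbaI site, Kozak motif, ATG-initiated coding sequence, SmaI site) can be sketched as a simple string assembly. This is only an illustration: TCTAGA and CCCGGG are the standard recognition sequences of XbaI and SmaI, whereas the function name `build_insert` and the five-codon placeholder sequence are hypothetical and are not taken from the study.

```python
XBAI_SITE = "TCTAGA"  # XbaI recognition sequence (5' cloning site)
SMAI_SITE = "CCCGGG"  # SmaI recognition sequence (3' cloning site)
KOZAK = "ACCACC"      # motif placed immediately upstream of ATG (from the text)


def build_insert(cds: str) -> str:
    """Return 5'-XbaI-Kozak-CDS-SmaI-3' for a coding sequence starting with ATG."""
    cds = cds.upper()
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must start with ATG")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Each cloning site should be unique, so neither may occur inside the CDS.
    for name, site in (("XbaI", XBAI_SITE), ("SmaI", SMAI_SITE)):
        if site in cds:
            raise ValueError(f"internal {name} site found in coding sequence")
    return XBAI_SITE + KOZAK + cds + SMAI_SITE


# Hypothetical five-codon placeholder, NOT the EGPh sequence.
toy_cds = "ATGGAGGGCAACTGA"
insert = build_insert(toy_cds)
print(insert)
```

Checking for internal restriction sites before assembly reflects the usual requirement that the two cloning sites cut only at the fragment ends.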

Construction of expression vectors and transformation into A. thaliana

After verification by sequencing, the plasmids pUC57-EGPh were digested with XbaI and SmaI. Then, the EGPh coding sequence was cloned into pBI121-GFP binary vectors under the control of the cauliflower mosaic virus 35S promoter (CaMV35S). Subsequently, the pPBI121-EGPh-GFP plasmid was transferred into the competent cells of A. tumefaciens EHA105 using the freeze-thaw method (Hoekema et al. 1983). Then, the transformation of A. thaliana was performed by the floral-dip method (Bechtold and Pelletier 1998).

Isolation and phenotype analysis of transgenic A. thaliana

Transgenic T1 plants were selected on half-strength Murashige and Skoog medium with 50 mg/L kanamycin. The kanamycin-resistant plants were transferred into soil, and their morphology was observed throughout development. The presence of pPBI121-EGPh-GFP in transgenic A. thaliana was confirmed by polymerase chain reaction (PCR). The total genomic DNA was isolated from the leaves of the transgenic plants using a Takara DNAiso reagent kit (Code No: 9770Q, TaKaRa, Dalian, China).

Cellulase activity assay

The total soluble proteins (TSP) were extracted from the leaf tissues of transgenic and wild-type A. thaliana using a modified method (Thomas et al. 2001; Mei et al. 2009). Briefly, 600 mg of fresh leaf tissue was ground into powder in liquid nitrogen. Then, 1.8 mL of grinding buffer (50 mmol L−1 sodium acetate and 10 mmol L−1 ethylenediaminetetraacetic acid, pH 5.0) was added and mixed thoroughly, and the mixture was centrifuged at 20,000 g at 4 °C for 20 min. The supernatant was precipitated using 70% saturated ammonium sulfate and centrifuged at 20,000 g at 4 °C for 10 min. The resulting pellet was resuspended in 30 μL of grinding buffer. The extracts were quantified by the Bradford method using a standard curve generated from bovine serum albumin. The activity of heterologous EGPh in converting cellulose into glucose was assessed by reacting the TSP extracted from the leaves of transgenic and wild-type A. thaliana with soluble sodium carboxymethyl cellulose (CMC) (Sigma) or insoluble microcrystalline cellulose Avicel (Analtech) as the substrate. Briefly, 2 μL of TSP and 100 μL of 1% (w/v) CMC or 1% Avicel were added to 98 μL of 100 mM acetate buffer (pH 5.6). The mixture was incubated with agitation at 80 °C for 10 min and cooled in ice water (Hiromi et al. 1963). The total reducing sugar was determined using the modified Somogyi-Nelson method (Lever et al. 1973). The reaction was terminated by adding 200 μL of 0.5 M NaOH. After the addition of 800 μL of 4-hydroxybenzoic acid hydrazide (PAHBAH), the mixture was boiled for 10 min and cooled in ice water, and the released reducing sugar was quantified spectrophotometrically at 420 nm against glucose standard curves. One unit of cellulase activity was defined as the amount of enzyme that catalyzed the release of 1 μmol of reducing sugar per minute.
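
The unit definition above can be turned into a small worked calculation. The sketch below assumes a linear glucose standard curve and a wild-type (or no-enzyme) blank that is subtracted from the sample; the function names and all absorbance, protein, and standard values are hypothetical illustrations, not data from the study.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for a linear standard curve."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def specific_activity(a_sample, a_blank, slope, intercept, minutes, protein_mg):
    """Units (umol reducing sugar per min) per mg of total soluble protein."""
    # Invert the standard curve: absorbance -> umol glucose equivalents.
    umol_sample = (a_sample - intercept) / slope
    umol_blank = (a_blank - intercept) / slope
    released = umol_sample - umol_blank
    return released / minutes / protein_mg


# Hypothetical glucose standards (umol per reaction) and absorbances at 420 nm.
std_umol = [0.0, 0.1, 0.2, 0.4]
std_abs = [0.05, 0.25, 0.45, 0.85]
slope, intercept = fit_line(std_umol, std_abs)

# Hypothetical readings: 10 min reaction containing 0.002 mg TSP.
activity = specific_activity(
    a_sample=0.65, a_blank=0.05,
    slope=slope, intercept=intercept,
    minutes=10, protein_mg=0.002,
)
print(f"{activity:.2f} U per mg TSP")
```

Dividing by the protein mass in the reaction, itself obtained from the Bradford standard curve, is what makes activities comparable between extracts of different concentration.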

Subcellular localization analysis

The subcellular localization of EGPh was predicted based on the identification of signal peptide sequences using ProtComp v.9.0 (Softberry, Inc., NY, USA) and WoLF PSORT (Computational Biology Research Center, Tokyo, Japan). To determine the subcellular localization of the recombinant EGPh, the transient expression of EGPh-GFP in onion epidermal cells was analyzed. The constructs pPBI121-EGPh-GFP and pPBI121-GFP were transformed into onion (Allium cepa) epidermal cells via A. tumefaciens EHA105 as described by Sun et al. (2007). Transformed cells were placed in 10% sucrose to induce plasmolysis. Green fluorescent protein was visualized using an inverted epifluorescence microscope (AxioVert.A1; Carl-Zeiss, Oberkochen, Germany). The images were captured with an AxioCam IC camera (Carl-Zeiss, Oberkochen, Germany) using ZEN lite 2012 software (Carl-Zeiss, Oberkochen, Germany).

RESULTS AND DISCUSSION

Codon optimization, gene synthesis, and the construction of plant expression vector

The codons of the EGPh gene were optimized with the OptimumGene™ algorithm (GenScript, Nanjing, China) according to the codon bias of S. bicolor. Several parameters critical to the efficiency of gene expression were optimized: the codon adaptation index (CAI) was raised from 0.71 to 0.84, the guanine and cytosine (GC) content was increased from 39.85% to 47.65% to prolong the half-life of the mRNA, and the percentage of high-frequency codons (< 90%) increased to 95% after optimization. The optimized sequence was deposited in GenBank under accession number MH830298 and was chemically synthesized by GenScript Co., Ltd. (Nanjing, China) (Fig. 1). The construction of pPBI121-EGPh-GFP was confirmed by double digestion (Fig. 2a). The transformation of A. tumefaciens was confirmed by PCR with primers EGPh F-1 and EGPh R (Table 1), which amplified the predicted 400 bp fragment (Fig. 2b).
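
The two sequence metrics reported above can be illustrated with a minimal sketch. It assumes the usual definition of the CAI as the geometric mean of per-codon relative adaptiveness values; the weight table and the two six-base sequences are hypothetical toy values, not S. bicolor codon usage data and not the EGPh sequence, and the commercial algorithm used in the study may compute its score differently.

```python
import math


def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def cai(seq: str, weights: dict) -> float:
    """Geometric mean of relative adaptiveness values over the codons of seq.

    Codons missing from the weight table are skipped.
    """
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    return math.exp(sum(logs) / len(logs))


# Hypothetical relative adaptiveness values (1.0 = most frequent synonymous codon).
toy_weights = {"GCC": 1.0, "GCT": 0.5, "CTG": 1.0, "TTA": 0.25}

original = "GCTTTA"   # Ala-Leu encoded with less-preferred codons
optimized = "GCCCTG"  # same amino acids, preferred codons

for name, seq in (("original", original), ("optimized", optimized)):
    print(name, round(cai(seq, toy_weights), 3), round(gc_content(seq), 2))
```

In this toy case, swapping synonymous codons raises both the CAI and the GC content while leaving the encoded amino acids unchanged, which is the same direction of change reported for the optimized EGPh sequence.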

Though plants are well suited to the production of industrial enzymes for biomass treatment, the most important factor is to ensure competitive production cost (Xue et al. 2003; Tremblay et al. 2010). The best way to achieve this is to boost expression (Nandi et al. 2005; Streatfield 2007).


Fig. 1. The alignment of the original EGPh and codon optimized EGPh sequences based on Sorghum bicolor codon bias. The letters in red indicate the replaced codons; Optimized EGPh: codon optimized sequence of EGPh gene and Original EGPh: original sequence of EGPh gene. The predicted signal peptide-like sequence for membrane-binding is underlined.

Fig. 2. The confirmation of pPBI121-EGPh, transgenic A. tumefaciens EHA105, and transgenic A. thaliana; M: DL2000 DNA marker; (a) The confirmation of the pPBI121-EGPh vector; 1: pPBI121-EGPh; 2 to 3: Double digestion of pPBI121-EGPh by XbaI/SmaI; (b) The confirmation of pPBI121-EGPh in A. tumefaciens EHA105; 1 to 6: Clones of transformed A. tumefaciens EHA105; (c) 1: Wild-type A. thaliana (ecotype Columbia); 2: pPBI121-EGPh vector; 3 and 4: Transgenic A. thaliana

There are many strategies available to boost the expression of heterologous enzymes in plants, including the use of strong promoters, enhancers, codon optimization, 5’ or 3’ untranslated regions, and targeting to subcellular sites (Streatfield 2007; Desai et al. 2010). Among them, codon bias has increasingly been recognized as having a profound impact on the expression level of heterologous proteins (Kane 1995). After codon optimization, the expression level of mammalian proteins increased 5- to 15-fold (Gustafsson et al. 2004). The optimized coding sequence of the human cystatin C gene increased the expression and secretion of its protein by approximately 3- to 5-fold in yeast (Li et al. 2014). The protein expression of a mycotoxin zearalenone (ZEN) detoxifying gene was improved in P. pastoris through codon optimization (Xiang et al. 2016). Thus, to improve the expression level of EGPh in plants, the codons of this gene were replaced with those favored by S. bicolor.

In addition, the Kozak sequence (CCA/GCCATGG), which extends from approximately position -6 to position +6 (the A in AUG is considered +1), was proposed as the most important context required for the efficient initiation of translation (Kozak 1987). Point mutations in the Kozak sequence can lead to leaky scanning of the initiator codon AUG and can reduce translation initiation over a 20-fold range (Kozak 1991, 1997). A 10-fold higher luciferase activity was detected in BmN4 cells transfected with the optimal consensus Kozak motif (Tatematsu et al. 2014). The ‘most preferred’ Kozak sequence in plants was reported to improve the translation of a chitinase protein 4-fold (Taylor et al. 1987). Thus, to improve the translation efficiency, the authors added an ACCACC Kozak consensus motif immediately preceding the ATG codon of the optimized sequence of EGPh. Moreover, the CaMV35S promoter was used in this study; it is commonly used in dicotyledonous plants and can enhance the transcription of heterologous genes more than specific promoters.
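
The sequence context around an initiator codon can be inspected with a short sketch such as the one below. It reads the consensus quoted above as CC(A/G)CCATGG and reports the bases at positions -3 and +4, which are the positions most often examined; the regular expression, the function name `kozak_context`, and both test sequences are illustrative assumptions rather than part of the study.

```python
import re

# One reading of the consensus quoted in the text: CC(A/G)CCATGG.
KOZAK_CONSENSUS = re.compile(r"CC[AG]CCATGG")


def kozak_context(seq: str, atg_index: int) -> dict:
    """Describe the context of the ATG that starts at atg_index (0-based)."""
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 6 or atg_index + 4 > len(seq):
        raise ValueError("need 6 bases upstream and 1 base downstream of ATG")
    # Positions -6..+4, where the A of ATG is +1 and there is no position 0.
    window = seq[atg_index - 6:atg_index + 4]
    return {
        "minus3": seq[atg_index - 3],
        "plus4": seq[atg_index + 3],
        "matches_consensus": bool(KOZAK_CONSENSUS.search(window)),
    }


# ACCACC placed directly upstream of ATG, followed by a hypothetical codon.
strong = kozak_context("ACCACCATGGAG", 6)
# Hypothetical poor context for comparison.
weak = kozak_context("TTTTTTATGTAG", 6)
print(strong)
print(weak)
```

Note that the match at position +4 depends on the first base of the codon following ATG, so whether a given construct fits the full consensus is determined by the coding sequence as well as by the added upstream motif.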

The phenotype of transgenic A. thaliana

The transgenic A. thaliana was confirmed by PCR with primers EGPh R and EGPh F-2 (Fig. 2c, Table 1), which amplified the predicted 1400 bp fragment. The 35S::EGPh transgenic A. thaliana were healthy and developed normally compared with the wild type, which indicated that the expression of the exogenous gene EGPh had no negative effect on the growth and development of A. thaliana (Fig. 3).

Table 1. Primers Used in This Research

Fig. 3. The phenotype of transgenic A. thaliana; Col-0: wild type A. thaliana (ecotype Columbia) 35S::EGPh: transgenic A. thaliana

Thus far, the hyperthermostable cellulases have been highly expressed in Arabidopsis, rice, tobacco, potato, barley, corn, and other plants, with no deleterious effects to the growth and development and no obvious change in plant phenotypes (Ziegler et al. 2000; Ziegelhoffer et al. 2001; Devaiah et al. 2013). In this research, the 35S::EGPh transgenic A. thaliana were healthy and developed normally compared with the wild type, which indicated that the expression of the exogenous gene EGPh had no negative effect on the growth and development of A. thaliana. This may have been due to the limited activity of thermophilic cellulase at room temperatures in plants or the lack of direct access of thermophilic cellulase to the cellulose in the plant wall, which is present as a compact mixture together with lignin and hemicellulose (Sticklen 2006). However, this result was inconsistent with the expression of EGPh gene in tobacco chloroplast, in which transgenic plants demonstrated pale-green color and a slower growth rate than the wild-type plants (Nakahira 2013). Therefore, changes in the components and construction of the cell wall of transgenic A. thaliana were analyzed next to illustrate the effects of heterologous EGPh on plant cell wall recalcitrance.

The activity of heterologous EGPh in A. thaliana

The TSP were extracted from the leaf tissues of transgenic and wild-type A. thaliana. With the three strategies applied (codon optimization, Kozak sequence, and CaMV35S promoter), the activities of EGPh in transgenic A. thaliana were up to 111.69 ± 6.53 U mg⁻¹ TSP and 13.35 ± 0.24 U mg⁻¹ TSP against CMC and Avicel, respectively (Fig. 4). These values are higher than that of EGPh expressed in tobacco chloroplasts (20.5 U mg⁻¹ TSP against CMC) (Nakahira 2013) and almost comparable with that of cellulase produced by microbial production systems (220 U mg⁻¹ TSP against CMC) (Bao et al. 2011; Ul Haq et al. 2015).

Fig. 4. The endocellulase activity of EGPh in transgenic A. thaliana; Col-0: wild-type A. thaliana (ecotype Columbia) 35S::EGPh: transgenic A. thaliana; CMC: The soluble sodium CMC was used as substrate; Avicel: The insoluble microcrystalline cellulose Avicel was used as substrate

Because of its hyperthermostability and its capability of hydrolyzing crystalline cellulose, EGPh has highly promising application prospects in both the biofuel and textile industries and has been extensively studied in recent years. The high activities of EGPh in transgenic A. thaliana imply that it is an ideal candidate for the economic production of cellulases in biomass crops. The results also demonstrate that codon optimization, a Kozak sequence, and the CaMV35S promoter help achieve active, high-level expression of P. horikoshii EGPh in A. thaliana and should be considered when expressing heterologous enzymes in plants. However, although the activities of recombinant EGPh in transgenic A. thaliana are relatively high, they are still remarkably lower than the amount required for complete biomass degradation. To further increase the accumulation of EGPh, other regulation strategies need to be applied. For instance, enhancers and untranslated regions play important roles in improving the accumulation of heterologous cellulase in plants (Ziegler et al. 2000). Expressing only the catalytic domains was reported to greatly increase the amounts of heterologous enzymes (Ziegelhoffer et al. 2001).

Compartmentalization of heterologous EGPh in plants

The authors predicted the potential subcellular localization of EGPh using ProtComp v.9.0 and WoLF PSORT. ProtComp v.9.0 predicted that EGPh may be an extracellular (secreted) protein, with a low score of 2.4. WoLF PSORT predicted that it might localize to the plasma membrane. The subcellular localization of the EGPh protein was then analyzed in transient expression assays in onion epidermal cells. The EGPh-GFP protein was detected at the plasma membrane and cell wall (Figs. 5g through 5o), while the GFP protein alone was observed in the cytosol (Figs. 5a through 5f).