Can IPTG Concentration and Induction Change Solubility of Recombinant Protein
Abstract
Protein solubility is usually a prerequisite for protein function. Machine learning approaches have recently emerged as trained alternatives to statistical models for empirical modeling and optimization. Here, soluble production of an anti-EpCAM extracellular domain (EpEx) single chain variable fragment (scFv) antibody was modeled and optimized as a function of four literature-based numerical factors (post-induction temperature, post-induction time, cell density at induction time, and inducer concentration) and one categorical variable using an artificial neural network (ANN) and response surface methodology (RSM). Models were established from the CCD experimental data derived from 232 separate experiments. The concentration of soluble scFv reached 112.4 mg/L at the optimum condition and strain (induction at cell density 0.6 with 0.4 mM IPTG for 24 h at 23 °C in Origami). The value predicted by the ANN for the response (106.1 mg/L) was closer to the experimental result than that obtained by RSM (97.9 mg/L), which again confirmed the higher accuracy of the ANN model. To the authors' knowledge, this is the first report comparing ANN and RSM in statistical optimization of fermentation conditions of E. coli for the soluble production of recombinant scFv.
Introduction
Owing to its numerous advantages, such as the availability of different genome engineering tools and strategies, established high cell density culture techniques, a high growth rate and low protease activity, E. coli has been widely used as one of the most favoured microbial hosts for the production of recombinant proteins. Process development and cell engineering are two strategies widely employed to enhance heterologous protein production in this host1. The development of production conditions is one of the most influential steps in process development. Optimizing variables with a "one-factor-at-a-time" approach is labour-intensive and cannot identify interactions between the various parameters involved. Statistical and artificial intelligence-based approaches can overcome these limitations of conventional single-parameter optimization methods2. Response Surface Methodology (RSM) is an efficient optimization method extensively used to establish the quantitative relationship between the independent process parameters and responses. Moreover, in RSM, the effects of the variables alone or in combination can be analyzed via regression analysis. Optimum levels of process parameters for preferable responses are also robustly predicted by this method3. RSM combined with central composite design has been widely employed in the optimization of culture conditions4. However, RSM is unable to accurately model a highly non-linear complex system, so only a limited range of input process parameters can be modeled accurately by RSM. Machine learning techniques such as the artificial neural network (ANN), which is popular for non-linear multivariate modeling, can overcome this limitation of RSM and can be a promising tool for modeling biological systems5.
However, owing to their structure, ANNs require processors with parallel processing power. Moreover, there are no specific rules for determining the structure of artificial neural networks; a proper network structure is achieved through trial and error6. The most popular ANN is organized in three layers: an input layer, an output layer and a hidden layer. Different numbers of hidden layers can be found within a feedforward network7. ANN models are trained by presenting sets of input/output data pairs to the network, and the weights learned during training are then used to predict outputs for new sets of input data. After being trained, the network can predict the outputs corresponding to inputs it has never seen before8. This approach has been successfully used as a data analysis tool in fermentation optimization, such as the production of L-asparaginase from Aspergillus terreus9. Several reports have shown that ANN models can perform better than RSM when the same DOE has been used. For instance, the results of Baş and Boyacı showed the superiority of ANN over RSM in enzyme kinetics10.
The optimal conditions for fermentative production of soluble anti-EpCAM extracellular domain (EpEx) single chain variable fragments (scFv) were evaluated in the current study. The scFv represents a class of antibody fragments comprising a heavy chain variable domain (VH) and a light chain variable domain (VL) of an antibody joined by a flexible peptide linker. Its molecular weight is considerably smaller than that of full-length antibodies. Owing to its small size and low immunogenicity, scFv has attracted much attention in biomedicine for theranostic purposes11. 4D5MOC-B scFv is a stable anti-EpCAM extracellular domain scFv (anti EpEX-scFv) with a very high affinity for its target. It was generated from the binding residues of the parental hybridoma MOC31, which were grafted onto the scFv 4D5 framework. EpCAM was one of the first target antigens considered for tumor immunotherapy because of its overexpression in epithelial-derived neoplasms12.
For the first time, this study adopted ANN and RSM to model the effects of post-induction temperature, post-induction time, cell density at induction time, and inducer concentration as numerical factors, along with different strains as a categorical factor, on soluble production of scFv. Here, the ANN was developed with a large number of experimental data points (232), which reduces problems with overfitting and allows more complex models to be used. Moreover, the optimum culture condition and strain recommended by the model were experimentally verified.
Materials and methods
Bacterial strains and plasmid
Four E. coli strains, namely SHuffle T7 (gifted by Dr. Nematollahi, Pasteur Institute of Iran, Tehran, Iran), BW25113 (rrnB3 ΔlacZ4787 hsdR514 Δ(araBAD)567 Δ(rhaBAD)568 rph-1 γ (DE3), gifted by Prof. Dr. Silke Leimkühler, University of Potsdam, Potsdam, Germany), Origami (DE3) (Pasteur Institute of Iran, Tehran, Iran), and BL21 (DE3) (gifted by Dr. Keramati, Pasteur Institute of Iran, Tehran, Iran), were used here as hosts for antiEpEX-scFv expression. The heat shock method was used to transform the pETDuet-1 plasmid (gifted by Dr. Bandehpour, Shahid Beheshti University of Medical Sciences, Tehran, Iran) containing the antiEpEX-scFv gene into chemically competent cells of each strain12.
Analytical methods
Protein expression
For initial determination of anti EpEX-scFv expression, several transformed clones from each strain were checked for their ability to express the protein under the same conditions (37 °C, OD 0.8, IPTG 0.8 mM, and 24 h) in 50 mL TY2x medium, and the results were confirmed by western blotting. To perform the optimization experiments, E. coli cells were first pre-cultured in liquid TY2x medium supplemented with 100 µg/mL ampicillin overnight at 37 °C. Then, 50 mL of medium was inoculated with 10% (v/v) of the pre-culture. This culture was used for all experiments designed by the RSM-CCD methodology.
Sample preparation
After centrifugation of the culture medium (10,000 × g for 10 min at 4 °C), the cell pellets were resuspended in 20 mL of lysis buffer containing 1 mg/mL lysozyme, 20 mM Tris pH 7.5, 50 mM NaCl and 50% glycerol, followed by incubation on ice for 40 min. The cells were then sonicated for 20 min (20 s on/3 s off) at 400 W and centrifuged at 4 °C (15,000 × g for 30 min). The obtained supernatants and pellets were collected as soluble and insoluble fractions, respectively.
SDS-PAGE and expression analysis
The expression level of the recombinant protein was analyzed using sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). The samples were resuspended in 4 × SDS sample buffer. After heating at 100 °C for 5 min, 10 μL of each sample was loaded onto a 15% SDS-PAGE gel and electrophoresis was carried out. The protein bands were detected by staining the gel with Coomassie Brilliant Blue G-250 staining solution. The signal intensities of the protein bands in all 120 gels were determined densitometrically using ImageJ software (NIH, MD).
Western blotting
After separation, protein bands were transferred from the SDS-PAGE gel onto a polyvinylidene difluoride (PVDF) membrane by electroblotting (wet Transblot, Bio-Rad, USA). After blocking in 5% non-fat milk in tris-buffered saline-tween (TBST) for 1 h, the membrane was washed three times with TBST and incubated overnight with anti-6 × His tag antibody (Sigma, UK). Then, the membrane was washed three times with TBST and incubated with anti-mouse horseradish peroxidase (HRP)-labelled secondary antibody for 2 h (Sigma, UK). 3,3-Diaminobenzidine (DAB) (Sigma, UK) was used for band detection.
Optimization methods and predictive modeling
Response surface methodology
After the initial expression of antiEpEX-scFv, we employed the RSM-CCD methodology to optimize soluble expression of antiEpEX-scFv, using the software package Design-Expert version 11 (Stat-Ease Inc., Minneapolis, USA). Based on our previously published data, the effects of independent variables, namely post-induction temperature, post-induction time, optical cell density at 600 nm before induction and inducer concentration as numerical factors, and different strains as a categorical factor, on the production of soluble antiEpEX-scFv fragment were examined in the current study. Each numerical variable was set to 5 levels with 2 replications: plus and minus 1 (factorial points), plus and minus alpha (axial points), and the central point (12 central points and 48 non-central points in total) (Table 1). After the categorical factor with 4 levels was added, a total of 232 separate experiments were carried out in 250 mL Erlenmeyer flasks containing 50 mL of TY2x medium (Supplementary Table 1). The estimated response obtained from the RSM model was further compared with the actual response in terms of the coefficient of determination (R2) and root mean square error (RMSE) using Eqs. (1) and (2).
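The coded layout of a four-factor central composite design can be sketched as follows. This is only an illustration: the axial distance `alpha = 2.0` (the rotatable value for four factors) and six centre points per replicate are assumptions chosen so that two replicates reproduce the 48 non-central and 12 central points mentioned above; the actual levels are those in Table 1, and the function name is hypothetical.

```python
from itertools import product


def central_composite_design(n_factors, alpha, n_center):
    """Return coded design points (factorial, axial, centre) for a CCD."""
    # Factorial points: every combination of -1 and +1.
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=n_factors)]
    # Axial points: one factor at -alpha or +alpha, all others at 0.
    axial = []
    for i in range(n_factors):
        for level in (-alpha, alpha):
            point = [0.0] * n_factors
            point[i] = level
            axial.append(point)
    # Centre points: all factors at their middle level.
    center = [[0.0] * n_factors for _ in range(n_center)]
    return factorial + axial + center


# Assumed settings (not taken from the paper): alpha = 2.0, 6 centre points.
design = central_composite_design(n_factors=4, alpha=2.0, n_center=6)
non_center = [p for p in design if any(v != 0.0 for v in p)]
n_center_points = len(design) - len(non_center)

# One replicate: 24 non-central and 6 central points; two replicates: 48 and 12.
print(2 * len(non_center), 2 * n_center_points)
```

Each factor takes exactly five coded levels (-alpha, -1, 0, +1, +alpha), which matches the five-level description in the text.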
$$R^{2}=1-\frac{\sum_{i=1}^{n}\left(y_{i}-y_{di}\right)^{2}}{\sum_{i=1}^{n}\left(y_{di}-y_{a}\right)^{2}}$$
(1)
$$RMSE=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_{i}-y_{di}\right)^{2}}$$
(2)
where n represents the number of experiments, y_i is the predicted value, y_di is the experimental value and y_a is the average of the experimental values.
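Eqs. (1) and (2) translate directly into code. The sketch below is a plain-Python illustration, not the routine used by Design-Expert or Neural Designer; the function names and the sample numbers are made up for demonstration.

```python
import math


def r_squared(predicted, experimental):
    """Coefficient of determination, Eq. (1)."""
    mean_exp = sum(experimental) / len(experimental)
    ss_res = sum((y - yd) ** 2 for y, yd in zip(predicted, experimental))
    ss_tot = sum((yd - mean_exp) ** 2 for yd in experimental)
    return 1.0 - ss_res / ss_tot


def rmse(predicted, experimental):
    """Root mean square error, Eq. (2)."""
    n = len(experimental)
    squared = sum((y - yd) ** 2 for y, yd in zip(predicted, experimental))
    return math.sqrt(squared / n)


# Made-up values for illustration only (not data from this study).
experimental = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

print(round(r_squared(predicted, experimental), 3))  # 0.964
print(round(rmse(predicted, experimental), 3))       # 2.121
```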
Artificial neural network
Alongside the RSM methodology, we used an ANN for optimization. In this study, a feed-forward backpropagation multi-layer perceptron (MLP) implemented in Neural Designer software version 4.2.0 (Artelnics) was employed with four numerical factors and one categorical factor (with four levels). A multi-layer neural architecture contains input, output and hidden layers. The input layer, consisting of eight neurons, represents the variables: post-induction temperature, post-induction time, optical cell density at 600 nm before induction, concentration of inducer and the four different strains (BW25113(DE3), Origami(DE3), SHuffle T7 and BL21(DE3)). The raw results of the densitometry analysis were used as input (Supplementary Table 1). The output layer, with one neuron, represents soluble expression of antiEpEX-scFv (Fig. 1). The number of neurons in the hidden layer was chosen based on R2. Finally, considering R2 and RMSE, the estimated response obtained from the ANN model was compared with the actual response using Eqs. (1) and (2).
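One way to arrive at eight input neurons from four numerical factors and a four-level categorical factor is one-hot encoding of the strain. The sketch below assumes that encoding; how Neural Designer represents the categorical input internally is not stated in the text, and `encode_run` is a hypothetical helper.

```python
STRAINS = ["BW25113(DE3)", "Origami(DE3)", "SHuffle T7", "BL21(DE3)"]


def encode_run(temperature_c, time_h, od600, iptg_mm, strain):
    """Build an 8-element input vector: 4 numerical values + 4 one-hot strain flags."""
    if strain not in STRAINS:
        raise ValueError(f"unknown strain: {strain}")
    # One-hot encoding (assumed): 1.0 for the matching strain, 0.0 otherwise.
    one_hot = [1.0 if strain == s else 0.0 for s in STRAINS]
    return [temperature_c, time_h, od600, iptg_mm] + one_hot


# Example using the optimum condition reported in the abstract.
x = encode_run(23.0, 24.0, 0.6, 0.4, "Origami(DE3)")
print(len(x), x)
```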
Comparison of predictive capabilities and validation of the RSM and ANN-based models
The prediction capabilities of the RSM and ANN models were compared using error parameters. A dataset of 145 data points was randomly selected from the full dataset. The actual response of protein solubility was compared with the estimated response achieved by the RSM and ANN models in the randomly selected dataset in terms of R2 and RMSE using Eqs. (1) and (2). Smaller RMSE values indicate better performance of the prediction models. Moreover, the experimental response of soluble protein production was plotted along with the corresponding predicted values of the RSM and ANN models. In addition, the validity of the models was evaluated by experimentally assessing the combination of tested variables leading to the maximum predicted level of protein solubility.
Results
Protein expression
The expression of the scFv protein was assessed in four E. coli strains before optimization using SDS-PAGE. Western blotting with an anti-His-tag monoclonal antibody confirmed the expression of His-tagged scFv in all strains studied here (Fig. 2).
Predictive modeling and optimization methods
Response surface methodology modeling
Based on the published data, four numerical factors (post-induction time, concentration of inducer, post-induction temperature, and optical cell density) and one categorical factor (different strains) were selected for statistical optimization. As presented in Table 1, the five-level CCD with a total of 232 runs was employed (Supplementary Table 1). The dependent response (soluble production of scFv) was correlated with the independent numerical factors (coded values) in the different strains using the following predicted equations:
$$ \begin{aligned} & \left( {\text{Y}} \right)^{0.5} : \\ & {\text{BW}}25113\left( {{\text{DE}}3} \right) \\ & {\text{Y}} = - 36.1443A + 137.399B - 6761.67C + 2608.58D \\ & \qquad - 0.501425AB + 17.572AC - 35.8366AD - 32.7466BC \\ & \qquad + 48.2716BD - 2588.53CD + 2.22388A^{2} - 2.01959B^{2} \\ & \qquad + 7078.58C^{2} - 1294.25D^{2} + 420.482 \\ \end{aligned} $$
$$ \begin{aligned} & {\text{Origami}}\left( {{\text{DE}}3} \right) \\ & {\text{Y}} = 209.604A - 40.9083B - 12140.1C - 91.7678D \\ & \qquad - 1.83904AB - 108.763AC - 16.7602AD + 167.013BC \\ & \qquad - 17.7474BD - 162.912CD - 1.43726A^{2} - 0.861316B^{2} \\ & \qquad + 5869.2C^{2} + 447.055D^{2} + 4861.88 \\ \end{aligned} $$
$$ \begin{aligned} & {\text{SHuffle T}}7 \\ & {\text{Y}} = 173.319A + 120.981B + 13795C + 1830.34D - 1.46395AB \\ & \qquad - 34.9288AC - 66.2975AD - 129.442BC - 7.38606BD \\ & \qquad - 4836.21CD - 2.36028A^{2} + 0.178132B^{2} - 4977.67C^{2} \\ & \qquad + 1802.92D^{2} - 7034.69 \\ \end{aligned} $$
$$ \begin{aligned} & {\text{BL}}21\left( {{\text{DE}}3} \right) \\ & {\text{Y}} = - 9.03435A - 2.64382B - 7902.75C - 687.916D + 1.13725AB \\ & \qquad - 134.612AC + 37.2579AD + 45.6254BC + 85.7176BD \\ & \qquad - 5129.02CD + 1.81334A^{2} - 1.73011B^{2} + 8359.54C^{2} \\ & \qquad + 853.879D^{2} + 4084.09 \\ \end{aligned} $$
In the above equations, Y denotes the response (soluble production of anti EpEX-scFv), and A, B, C, and D denote post-induction time, post-induction temperature, cell density before induction, and IPTG concentration, respectively.
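A second-order polynomial of this form can be evaluated mechanically once its coefficients are tabulated. The sketch below uses the Origami(DE3) coefficients as printed above; it is only an illustration of the arithmetic. Whether A to D should be supplied as coded or actual values, and whether the result is Y or its square root (the equations are headed (Y)^0.5), is not fully clear from the text, so the function simply returns the value of the polynomial. The name `quadratic_response` is hypothetical.

```python
def quadratic_response(coeffs, a, b, c, d):
    """Evaluate a full second-order polynomial in four factors A, B, C, D."""
    terms = {
        "A": a, "B": b, "C": c, "D": d,
        "AB": a * b, "AC": a * c, "AD": a * d,
        "BC": b * c, "BD": b * d, "CD": c * d,
        "A2": a * a, "B2": b * b, "C2": c * c, "D2": d * d,
        "1": 1.0,  # intercept
    }
    return sum(coeffs[name] * value for name, value in terms.items())


# Coefficients of the Origami(DE3) equation as printed in the text.
ORIGAMI = {
    "A": 209.604, "B": -40.9083, "C": -12140.1, "D": -91.7678,
    "AB": -1.83904, "AC": -108.763, "AD": -16.7602,
    "BC": 167.013, "BD": -17.7474, "CD": -162.912,
    "A2": -1.43726, "B2": -0.861316, "C2": 5869.2, "D2": 447.055,
    "1": 4861.88,
}

# With all factors at zero, only the intercept remains.
print(quadratic_response(ORIGAMI, 0.0, 0.0, 0.0, 0.0))
```

Keeping the coefficients in a dictionary keyed by term name makes it easy to swap in the equation for another strain without changing the evaluation code.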
According to the ANOVA results, the significant "F value" (15.78) together with the insignificant "Lack of Fit F value" indicates that the model is valid for predicting soluble production of scFv. The low p-value (Prob > F) (< 0.0001) of the model confirms its significance. An R2 (coefficient of determination) of 0.950 implies that 95.0% of the variability in the response can be described by the model. Furthermore, the difference of less than 0.2 between the predicted R2 (0.7487) and adjusted R2 (0.7906) values confirms that they are in reasonable agreement. The plot in Supplementary Fig. S1 confirms this agreement. The accuracy and predictability of the selected model were also validated by the normal probability plot of the studentized residuals (Supplementary Fig. S1). Based on the ANOVA results, the proposed model fits the experimental data well, so it can be used effectively to navigate the design space (Table 2).
As shown in Table 2, three linear terms (post-induction time (A), concentration of inducer (D) and different strains (E)) were found to be significant for soluble production of scFv, whereas post-induction temperature and optical cell density had no significant impact on the solubility of scFv. All interaction terms except temperature-optical cell density (BC) were found to be significant, as is evident from their p-values (less than 0.05). In addition, two quadratic terms (A2 and D2) were not significant according to Table 2. It can be concluded that post-induction time has the largest effect on soluble production of anti EpEX-scFv.
Using two-dimensional graphs, the interactive effects between pairs of significant independent variables (A and D (Fig. 3), A and B (Supplementary Fig. S2), A and C (Supplementary Fig. S3), B and D (Supplementary Fig. S4) and C and D (Supplementary Fig. S5)) were studied in the different strains while keeping the other two numerical factors constant at their centre levels. From Fig. 3 and Supplementary Figs. S2 and S3, it was evident that increasing the post-induction time increased solubility in three strains (BW25113(DE3), Origami(DE3) and BL21(DE3)) and decreased it in SHuffle T7. Moreover, increasing the inducer concentration significantly decreased solubility in Origami(DE3) and SHuffle T7, and the decrease was more substantial in SHuffle T7 than in Origami(DE3) at similar post-induction times (Fig. 3). In addition, increasing the temperature had a negative effect on scFv solubility in Origami(DE3) (Supplementary Fig. S2). As illustrated in Supplementary Fig. S3, more soluble protein was obtained in BW25113(DE3) when protein production was induced at higher OD600 nm, whereas the amount of soluble scFv obtained in Origami(DE3) and SHuffle T7 was negatively affected by increasing the OD600 nm before induction. A significant interaction between temperature and inducer concentration is also indicated by ANOVA (p-value of 0.0017) (Table 2). As depicted in Supplementary Fig. S4, when post-induction time (A) and optical cell density (C) were kept constant at their middle values (16 h and 0.7, respectively), raising the temperature increased solubility in BW25113(DE3) and SHuffle T7. In BW25113(DE3), although increasing the IPTG concentration at lower temperature decreased the soluble fraction, increasing the inducer concentration at higher temperature had a positive effect on protein solubility.
The dependence of scFv solubility on OD600 nm before induction (C) and IPTG concentration (D) when post-induction time (A) and temperature (B) are kept constant (16 h and 30 °C, respectively) is illustrated in Supplementary Fig. S5. According to this graph, increasing OD600 nm at the higher IPTG concentration (0.8 mM) decreased solubility in BL21(DE3) and SHuffle T7, whereas at the lower inducer concentration (0.4 mM), increasing OD600 nm enhanced protein solubility. Interestingly, Supplementary Fig. S5 also shows that increasing OD600 nm at both IPTG concentration levels increases solubility in BW25113(DE3) and decreases it in Origami(DE3). The interactive effects between each independent numerical variable and strain type were studied while keeping the other three numerical factors constant at their centre levels. As depicted in Fig. 4 and confirmed by the ANOVA results, post-induction time was the most influential factor for soluble production of scFv in the four strains studied here.
Artificial neural network modeling
Artificial neural network (ANN) models can predict the behaviour of nonlinear multivariate systems. A multilayer feed-forward neural network trained with the Quasi-Newton algorithm was the model considered in the present work. In this study, the same DoE used to build the RSM model was also employed to develop the ANN-based model. The experimental data were divided into three subsets: training, testing and validation (70%, 15% and 15% of the data, respectively) (Table 3). A small amount of noise was added to the data set and weight regularization was applied to prevent overfitting of the training data and to produce smoother responses. The network topology developed for an ANN determines the accuracy of the model prediction. To find the optimal ANN structure for prediction, the number of hidden layers (1–5) and the number of neurons (8–48) were varied. The input layer had 8 neurons and the scaling layer was set to automatic with eight neurons. For the perceptron layers, different architectures were investigated, and the best results were achieved with 15, 10 and 3 neurons in the first, second and last hidden layers, respectively. The activation function in all hidden layers was the hyperbolic tangent. The scaled outputs from the hidden layers were connected to an unscaling layer with one neuron to produce outputs in the original units. Moreover, model selection was carried out to obtain the network architecture with the best generalization. Finally, the performance of the developed network was examined based on the NRMSE and R2 of the testing data. The fitness of the model was confirmed by its overall R2, which was found to be 0.87. The NRMSE value (0.288) also indicates a good prediction of outputs.
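The reported topology (8 inputs, hidden layers of 15, 10 and 3 tanh neurons, 1 output) can be made concrete with a minimal forward pass. This is a sketch under stated assumptions: the weights are random and untrained, the scaling and unscaling layers and the Quasi-Newton training are omitted, and a linear output neuron is assumed. It does not reproduce the Neural Designer model.

```python
import math
import random

# Layer sizes from the text: input, three hidden layers, output.
LAYER_SIZES = [8, 15, 10, 3, 1]


def init_network(layer_sizes, seed=0):
    """Create random (untrained) weights and zero biases for each layer."""
    rng = random.Random(seed)
    network = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        network.append((weights, biases))
    return network


def forward(network, inputs):
    """Propagate one input vector through the network."""
    activations = inputs
    last = len(network) - 1
    for index, (weights, biases) in enumerate(network):
        z = [sum(w * a for w, a in zip(row, activations)) + b
             for row, b in zip(weights, biases)]
        # tanh in hidden layers; linear output neuron (assumed).
        activations = z if index == last else [math.tanh(v) for v in z]
    return activations


def count_parameters(layer_sizes):
    """Number of weights plus biases in a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))


network = init_network(LAYER_SIZES)
print(count_parameters(LAYER_SIZES))  # 332
print(len(forward(network, [0.1] * 8)))  # 1
```

The parameter count gives a sense of model capacity relative to the 232 experimental runs, which is one reason the noise injection and weight regularization described above matter.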
Comparison of predictive capabilities and validation of the RSM and ANN-based models
In the current study, the effectiveness of the empirical models was statistically evaluated by comparing estimated and actual responses based on R2 and error analyses. A dataset of 145 data points was randomly selected from the full dataset. The experimental response along with the predicted data obtained for soluble production of scFv are given in Supplementary Table 2. For the random dataset, the R2 values for the ANN and RSM models were 0.913 and 0.856, respectively, demonstrating the ability of these models to describe 91% and 85% of the variation in the actual values. The NRMSE is higher for the RSM model (0.264) than for the ANN model (0.154), which means that the predictive capacity of the ANN model is higher than that of the RSM model. According to the comparative plot of predicted and actual values, the ANN model fitted the experimental responses with excellent accuracy, whereas greater deviation is seen in the RSM-based prediction of soluble scFv yield (Fig. 5). For validation of the models, under the optimum conditions predicted by the RSM model (Table 4), densitometric analysis gave an experimental result of 112.4 mg/L for the soluble fraction, which was in good correlation with the predicted value of 97.9 mg/L. When the levels of the variables were entered into the ANN model, the maximum predicted response value was 106.1 mg/L, which was closer to the experimental result (112.4 mg/L) than the RSM prediction (97.9 mg/L). This reaffirms the higher accuracy of the ANN model.
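The gap between each predicted optimum and the experimental value can be expressed as a relative error. The short calculation below uses only the three values reported above; the helper name is hypothetical.

```python
def relative_error_percent(predicted, experimental):
    """Absolute deviation of a prediction as a percentage of the experimental value."""
    return abs(experimental - predicted) / experimental * 100.0


experimental = 112.4  # mg/L, measured at the predicted optimum
for name, predicted in (("RSM", 97.9), ("ANN", 106.1)):
    print(f"{name}: {relative_error_percent(predicted, experimental):.1f}%")
# RSM: 12.9%
# ANN: 5.6%
```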
Discussion
Owing to improper protein folding, most heterologous proteins expressed in E. coli aggregate in inclusion bodies, which are partly or completely devoid of biological activity. Solubilization of these aggregates requires high concentrations of denaturing agents, which cause loss of the protein's secondary structure. Moreover, after refolding, the obtained proteins might be unstable. Therefore, focusing on environmental modification, many investigations over the past three decades have tried to develop expression of well-folded, highly soluble proteins in E. coli13.
RSM and ANN models for optimized soluble production of scFv were studied here for the first time. The effects of four numerical factors, along with different strains as a categorical factor, on the response were also investigated here for the first time. Based on the developed quadratic model, temperature and induction OD were not significant, whereas the other three terms (post-induction time, concentration of inducer and different strains) were found to be significant for soluble production of scFv. In agreement with our study, many investigations have shown the influence of different E. coli strains on the solubility of various proteins. For example, the effect of different engineered hosts, including BL21(DE3) pLysS, BL21(DE3) and Rosetta, on soluble expression of recombinant TNF-α was assessed by Papaneophytou et al. Their results showed a lower yield of soluble TNF-α in Rosetta compared with the other two hosts14. The effect of engineered strains on solubility demonstrated here is also in line with the report of Zhang et al., which showed higher solubility of an IGF1-thioredoxin fusion in Rosetta-gami (DE3) than in Rosetta (DE3) and BL21 (DE3)15. Herein, the maximum amount of soluble scFv was achieved in E. coli Origami(DE3), a mutant strain with mutations in the thioredoxin reductase (trxB) and glutathione reductase (gor) genes. Its oxidative environment enhances disulfide bond formation in the cytoplasm, which leads to less aggregation of misfolded proteins and less formation of inactive inclusion bodies16. Moreover, the optimal culture conditions obtained here were an IPTG concentration of 0.4 mM, a cell density before induction (OD600 nm) of 0.6, a post-induction temperature of 23 °C and a post-induction time of 24 h. Consistent with this finding, Heo et al. achieved the highest amount of soluble anti-c-Met scFv at 0.5 mM IPTG.
They showed that in Origami (DE3), higher inclusion body formation was associated with higher concentrations of IPTG, and that lowering the IPTG concentration (from 1 to 0.5 mM) led to higher levels of functional anti-c-Met scFv expression. This is because a lower inducer concentration can lead to a lower transcription rate and higher efficiency of intracellular folding of the target protein16. Consistently, soluble production of recombinant scFv against HBV preS2 in Origami2 (λDE3) was shown to be promoted at low inducer concentration17. We also showed that the maximum soluble amount could be achieved at low temperature (23 °C). Our data were in agreement with prior studies in which several proteins, including human interferon α-2, ricin A chain, subtilisin E, Fab fragments, and β-lactamase, had higher solubility at low temperatures18,19. Similarly, Emamipour et al. achieved maximum solubility at 23 °C for DsbA-IGF1 protein using BBD methodology20. This may result from the additional time available for proper folding owing to the slower rate of cell processes such as transcription, translation, and cell division. Decreasing the temperature has also been shown to eliminate the heat-shock proteases that are induced during overexpression of heterologous proteins. In addition, it has been reported that at low temperature, the expression and activity of some chaperones are increased, which can facilitate correct folding of recombinant proteins21. The results of the current study also showed that a long incubation time was critical for the optimal expression of soluble scFv. This finding was consistent with the findings of Sina et al., which showed a significant increase in soluble expression of a humanized anti-TNF-α scFv-GST fusion protein in E. coli Origami (DE3) when it was produced in the presence of a low amount of inducer, at a low cultivation temperature, with a long incubation time22.
In the current study, RSM and ANN methodologies are compared for their efficiency in optimizing fermentation conditions. Although both methods were shown to be effective in determining the optimum conditions to improve the response, comparing the R2 achieved by the ANN model (0.913) with that obtained for RSM (0.856) showed the better ability of the former in modeling soluble production of scFv, consistent with its capacity for non-linear multivariate modeling. Consistently, for culture medium optimization, machine learning techniques have been shown to outperform statistically designed models in the few investigations presented in the literature. For example, to maximize the growth and lipid productivity of the marine microalga Tetraselmis sp., the composition of a culture medium was optimized by Mohamed et al. using both RSM and ANN models. They reported that ANN was a more appropriate method for increasing biomass concentration and lipid yield than the RSM-based optimization method23. Similarly, compared with RSM, a higher predictive capacity for ANN was reported by Rafigh et al. for optimizing the culture conditions for curdlan production by Paenibacillus polymyxa24.
Conclusion
In the present study, we optimized the fermentation conditions for soluble production of antiEpEX-scFv by optimizing four literature-based numerical factors and one categorical variable using ANN and RSM. Based on the RSM, three linear terms (post-induction time (A), concentration of inducer (D) and different strains (E)) significantly affected the solubility of scFv, whereas post-induction temperature and optical cell density had no significant impact on the response. Moreover, post-induction time was the most influential parameter. Analysis of error parameters and R2 for a dataset of 145 data points randomly selected from the full dataset revealed the superiority of the ANN model over RSM. Thus, it may be concluded that although RSM is usually the first choice for statistical modelling, machine learning models can also be used to optimize fermentation conditions. The best fermentation conditions estimated by RSM (induction at cell density 0.6 with 0.4 mM IPTG for 24 h at 23 °C in Origami(DE3)) gave a predicted maximum soluble production of 97.9 mg/L, which was in good correlation with the experimental value of 112.4 mg/L. However, the value predicted by the ANN model (106.1 mg/L) was closer to the experimental result (112.4 mg/L) than that predicted by RSM (97.9 mg/L). The encouraging results of this study show that machine learning approaches can be applied for efficient soluble production of scFv, which is highly applicable for diagnostic and therapeutic purposes.
References
1. Shariati, F. S., Keramati, M., Valizadeh, V., Cohan, R. A. & Norouzian, D. Comparison of E. coli based self-inducible expression systems containing different human heat shock proteins. Sci. Rep. 11, 1–10 (2021).
2. Desai, K. M., Survase, S. A., Saudagar, P. S., Lele, S. & Singhal, R. S. Comparison of artificial neural network (ANN) and response surface methodology (RSM) in fermentation media optimization: case study of fermentative production of scleroglucan. Biochem. Eng. J. 41, 266–273 (2008).
3. Dikshit, R. & Tallapragada, P. Screening and optimization of γ-aminobutyric acid production from Monascus sanguineus under solid-state fermentation. Front. Life Sci. 8, 172–181 (2015).
4. Shahzadi, I. et al. Scale-up fermentation of Escherichia coli for the production of recombinant endoglucanase from Clostridium thermocellum. Sci. Rep. 11, 1–10 (2021).
5. Youssefi, S., Emam-Djomeh, Z. & Mousavi, S. Comparison of artificial neural network (ANN) and response surface methodology (RSM) in the prediction of quality parameters of spray-dried pomegranate juice. Dry. Technol. 27, 910–917 (2009).
6. Dumitru, C. & Maria, V. Advantages and disadvantages of using neural networks for predictions. Ovidius Univ. Ann. Ser. Econ. Sci. 13, 444–449 (2013).
7. Sastry, A. et al. Machine learning in computational biology to accelerate high-throughput protein expression. Bioinformatics 33, 2487–2495 (2017).
8. Bhilare, K. D. et al. Machine learning modelling for the high-pressure homogenization-mediated disruption of recombinant E. coli. Process Biochem. 71, 182–190 (2018).
9. Gurunathan, B. & Sahadevan, R. Design of experiments and artificial neural network linked genetic algorithm for modeling and optimization of L-asparaginase production by Aspergillus terreus MTCC 1782. Biotechnol. Bioprocess Eng. 16, 50–58 (2011).
10. Baş, D. & Boyacı, İH. Modeling and optimization II: comparison of estimation capabilities of response surface methodology with artificial neural networks in a biochemical reaction. J. Food Eng. 78, 846–854 (2007).
11. Kim, H. I. et al. Biomolecular imaging of colorectal tumor lesions using a FITC-labeled scFv-Cκ fragment antibody. Sci. Rep. 11, 1–11 (2021).
12. Behravan, A. & Hashemi, A. Statistical optimization of culture conditions for expression of recombinant humanized anti-EpCAM single-chain antibody using response surface methodology. Res. Pharm. Sci. 16, 153 (2021).
13. Gutiérrez-González, M. et al. Optimization of culture conditions for the expression of three different insoluble proteins in Escherichia coli. Sci. Rep. 9, 1–11 (2019).
14. Papaneophytou, C. P. & Kontopidis, G. A. Optimization of TNF-α overexpression in Escherichia coli using response surface methodology: Purification of the protein and oligomerization studies. Prot. Expr. Purif. 86, 35–44 (2012).
15. Zhang, D. et al. High-level soluble expression of hIGF-1 fusion protein in recombinant Escherichia coli. Process Biochem. 45, 1401–1405 (2010).
16. Heo, M.-A. et al. Functional expression of single-chain variable fragment antibody against c-Met in the cytoplasm of Escherichia coli. Prot. Expr. Purif. 47, 203–209 (2006).
17. Jun, S.-A. et al. Functional expression of anti-hepatitis B virus (HBV) preS2 antigen scFv by cspA promoter system in Escherichia coli and application as a recognition molecule for single-walled carbon nanotube (SWNT) field effect transistor (FET). Biotechnol. Bioprocess Eng. 15, 810–816 (2010).
18. Sørensen, H. P. & Mortensen, K. K. Soluble expression of recombinant proteins in the cytoplasm of Escherichia coli. Microb. Cell Fact. 4, 1–8 (2005).
19. Vasina, J. A. & Baneyx, F. Expression of aggregation-prone recombinant proteins at low temperatures: a comparative study of the Escherichia coli cspA and tac promoter systems. Prot. Expr. Purif. 9, 211–218 (1997).
20. Emamipour, N., Vossoughi, M., Mahboudi, F., Golkar, M. & Fard-Esfahani, P. Soluble expression of IGF1 fused to DsbA in SHuffle T7 strain: optimization of expression and purification by Box-Behnken design. Appl. Microbiol. Biotechnol. 103, 3393–3406 (2019).
21. Soleymani, B. & Mostafaie, A. Analysis of methods to improve the solubility of recombinant bovine sex determining region Y protein. Rep. Biochem. Mol. Biol. 8, 227 (2019).
22. Sina, M., Farajzadeh, D. & Dastmalchi, S. Effects of environmental factors on soluble expression of a humanized anti-TNF-α scFv antibody in Escherichia coli. Adv. Pharm. Bull. 5, 455 (2015).
23. Mohamed, M. S., Tan, J. S., Mohamad, R., Mokhtar, M. N. & Ariff, A. B. Comparative analyses of response surface methodology and artificial neural network on medium optimization for Tetraselmis sp. FTC209 grown under mixotrophic condition. Sci. World J. 2013, 1 (2013).
24. Rafigh, S. M., Yazdi, A. V., Vossoughi, M., Safekordi, A. A. & Ardjmand, M. Optimization of culture medium and modeling of curdlan production from Paenibacillus polymyxa by RSM and ANN. Int. J. Biol. Macromol. 70, 463–473 (2014).
Acknowledgements
This work was supported by Shahid Beheshti University of Medical Sciences in Tehran, Iran (IR.SBMU.PHARMACY.REC.1399.08).
Author information
Contributions
A.H. made substantial contributions to the conception and design. M.B. and A.B. performed the experiments. A.H. and M.B. analyzed the data. A.H. wrote the main manuscript. A.H. supervised the project and provided the facilities and materials. All authors agreed to be responsible for the content of the work.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Hashemi, A., Basafa, M. & Behravan, A. Machine learning modeling for solubility prediction of recombinant antibody fragment in four different E. coli strains. Sci Rep 12, 5463 (2022). https://doi.org/10.1038/s41598-022-09500-6
DOI: https://doi.org/10.1038/s41598-022-09500-6
Source: https://www.nature.com/articles/s41598-022-09500-6
Posted by: andersonlifee1972.blogspot.com