The overall goal of the following experiment is to generate predictive models for transient expression of proteins in plant leaves. This is achieved by first setting up an experimental design to identify the most important parameters for transient expression and quantify their impact on protein accumulation. As a second step, genes are cloned and transferred into agrobacteria, which are then injected into leaves, resulting in transient protein expression.
Next, leaf samples are analyzed in order to determine protein expression levels. Results are obtained that show the pattern of transient protein expression levels in leaves of different ages and for different incubation conditions, based on a design of experiments model evaluation. The main advantage of this technique over existing methods like one-factor-at-a-time is that interactions between different parameters, like leaf age or leaf position, can be detected and quantified.
The one-factor-at-a-time approach that is often used to characterize the effect of certain factors on the outcome of an experiment is suboptimal because the individual runs during an experiment will be aligned like pearls on a string, thus achieving a low coverage of the design space. In contrast, with the design of experiments, or DOE, strategy, the variation of more than one factor at a time enhances the coverage and thus the precision of the resulting models. Furthermore, the biased design space coverage in one-factor-at-a-time experiments can also fail to identify optimal operating regions and predict suboptimal solutions, whereas DOE strategies are more likely to identify preferable conditions.
This flow chart illustrates the process for planning a DOE strategy. The first step is to identify relevant factors and responses for inclusion in the design.
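As a rough illustration of the coverage argument made for one-factor-at-a-time experiments, the sketch below compares such a plan with a full factorial grid for two hypothetical factors coded to three levels (-1, 0, 1). The coding, the center baseline, and the function names are assumptions made for this illustration and are not taken from the experiment.

```python
from itertools import product

LEVELS = (-1, 0, 1)  # coded factor levels, assumed for illustration


def ofat_runs(n_factors, levels=LEVELS, baseline=0):
    """Vary one factor at a time while all others stay at the baseline."""
    runs = set()
    for i in range(n_factors):
        for level in levels:
            run = [baseline] * n_factors
            run[i] = level
            runs.add(tuple(run))
    return runs


def full_factorial_runs(n_factors, levels=LEVELS):
    """Every combination of levels across all factors."""
    return set(product(levels, repeat=n_factors))


ofat = ofat_runs(2)
grid = full_factorial_runs(2)
missed = sorted(grid - ofat)  # combinations never visited by the OFAT plan
print(len(ofat), len(grid), missed)
```

The combinations missed by the one-factor-at-a-time plan are the corners of the design space, which are the runs needed to estimate an interaction between the two factors.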
In this demonstration, expression levels of a model anti-HIV monoclonal antibody, 2G12, and a fluorescent marker protein, DsRed, will be measured. Based on previous experiments, the minimum detectable difference regarded as relevant will be 10 micrograms per milliliter for 2G12 and 20 micrograms per milliliter for DsRed. In addition, approximate values for the estimated standard deviation of the system for 2G12 and DsRed will be four and eight micrograms per milliliter, respectively.
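The minimum relevant difference and the estimated standard deviation are often combined into a signal-to-noise ratio when judging whether a design is large enough. The sketch below only computes that ratio from the values stated above. The function name is an assumption, and how the ratio enters the remaining planning steps is described in the accompanying manuscript, not here.

```python
def signal_to_noise(min_difference, std_dev):
    """Ratio of the smallest relevant difference to the estimated standard deviation."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return min_difference / std_dev


# (minimum relevant difference, estimated standard deviation),
# both in micrograms per milliliter, as stated in the text.
responses = {"2G12": (10.0, 4.0), "DsRed": (20.0, 8.0)}
ratios = {name: signal_to_noise(d, s) for name, (d, s) in responses.items()}
print(ratios)
```

Both responses end up with the same ratio, so a design sized for one of them should serve the other about equally well under these estimates.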
The remaining steps for planning this DOE strategy will not be discussed here, but details can be found in the accompanying manuscript. 2G12 and DsRed will be transiently expressed in tobacco plants. To begin the procedure for plant cultivation, prepare 10 by 10 by 8 centimeter rockwool blocks by flushing them extensively with deionized water to remove residual chemicals.
After that, equilibrate the blocks with a freshly prepared solution of fertilizer. Seed tobacco plants by placing one to two tobacco seeds on each rockwool block, followed by a short flush with fertilizer. Be careful to avoid washing the seeds away. Germinate and cultivate the tobacco plants for 42 days in a greenhouse under appropriate conditions. Prepare the Agrobacterium tumefaciens by growing a culture to an OD600nm of 5.0.
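The culture is next diluted with water and twofold infiltration medium to the OD600nm required for injection. A minimal sketch of that volume calculation follows. It assumes that OD600nm scales linearly with dilution and that the twofold medium makes up half of the final volume. The function name, the target OD600nm of 1.0, and the 100 milliliter final volume are hypothetical.

```python
def dilution_volumes(stock_od, target_od, final_volume_ml):
    """Return (culture, twofold medium, water) volumes in mL for one dilution."""
    if not 0 < target_od <= stock_od / 2:
        raise ValueError("target OD must be positive and at most half the stock OD")
    culture = final_volume_ml * target_od / stock_od
    medium_2x = final_volume_ml / 2  # twofold medium ends up at onefold strength
    water = final_volume_ml - culture - medium_2x
    return culture, medium_2x, water


# Hypothetical example: stock at OD600nm 5.0, diluted to 1.0 in 100 mL.
culture_ml, medium_ml, water_ml = dilution_volumes(5.0, 1.0, 100.0)
print(culture_ml, medium_ml, water_ml)
```

The calculated value is only a starting point, since the OD600nm of the suspension is still confirmed by measurement before injection.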
Dilute the culture with water and twofold infiltration medium to match the OD600nm required for injection. Prior to injection, confirm the OD600nm of the Agrobacterium tumefaciens suspension. To prepare the leaves for injection, gently scratch the epidermis at the injection site with a pipette tip to facilitate the influx of A. tumefaciens solution.
Avoid rupturing the leaf blade while doing so. Hold the syringe containing the A. tumefaciens suspension perpendicular to the leaf blade. Touching the barrel against the intercostal field to be treated, gently push the outlet onto the bottom side of the leaf. At the same time, press the top side of the leaf gently to prevent the leaf blade from shifting or rupturing.
Gently push down the syringe piston. The A. tumefaciens solution will enter the intercellular spaces within the leaf blade, as indicated by the treated areas appearing darker green and moist. Make sure the syringe remains perpendicular to the leaf during the injection.
Otherwise, the bacterial suspension may spurt out under high pressure. Repeat this procedure at several positions until the whole intercostal field is infiltrated with A. tumefaciens. Then continue with the next intercostal field. After injection, incubate the plants under conditions determined by the DOE.
When the incubation period is complete, begin sampling. Stabilize the leaf with a handheld paper towel and use a cork borer to remove four to five leaf discs from the treated intercostal fields at the positions and times indicated by the DOE. Do not remove the whole leaf from the plant during sampling.
Determine the mass of each sample and place it in a 1.5 milliliter plastic reaction tube labeled with the sample name and mass. Store the samples at negative 20 degrees Celsius or negative 80 degrees Celsius before protein quantitation. The process may be paused at this stage for several months, depending on the sample stability and the storage temperature.
To extract proteins from the leaf disc samples, add three milliliters of extraction buffer per milligram of sample mass and grind the leaf discs in the reaction tube using an electric pestle until no large fragments remain. To avoid overheating the sample, place the tube on ice whenever the tube feels warm. After centrifugation to remove dispersed solids, transfer the supernatant to a clean 1.5 milliliter reaction tube.
Measure the DsRed fluorescence twice sequentially in a 96-well plate reader fitted with 530/25 nanometer excitation and 590/35 nanometer emission filters. For each sample, average the fluorescence over the two reads and the three technical replicates and subtract the value recorded for the blank control containing zero micrograms per milliliter DsRed.
Also subtract this value from the reads of the standard dilutions and use these blank-corrected values for a linear regression yielding a reference curve. The slope of the reference curve is then used to convert the fluorescence measured for the samples into DsRed concentrations. The procedure for determining the concentration of the 2G12 antibody will not be shown here, but is detailed in the accompanying manuscript.
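The blank correction and reference-curve conversion described for DsRed can be sketched as follows. All numbers are hypothetical and the function names are invented for this illustration. Fitting the reference curve through the origin after blank correction is an assumption, since the protocol only states that a linear regression is used and that its slope converts fluorescence into concentration.

```python
from statistics import fmean


def blank_corrected_mean(reads, blank):
    """Average all reads of one sample (replicates x reads) and subtract the blank."""
    flat = [value for replicate in reads for value in replicate]
    return fmean(flat) - blank


def reference_slope(std_conc, std_fluor, blank):
    """Least-squares slope through the origin for blank-corrected standards."""
    corrected = [f - blank for f in std_fluor]
    numerator = sum(c * f for c, f in zip(std_conc, corrected))
    denominator = sum(c * c for c in std_conc)
    return numerator / denominator


def dsred_concentration(reads, blank, slope):
    """Convert blank-corrected fluorescence into a concentration."""
    return blank_corrected_mean(reads, blank) / slope


# Hypothetical numbers, for illustration only.
std_conc = [0.0, 25.0, 50.0, 100.0]          # micrograms per milliliter
std_fluor = [100.0, 1100.0, 2100.0, 4100.0]  # arbitrary fluorescence units
blank = std_fluor[0]                          # the zero-concentration control
slope = reference_slope(std_conc, std_fluor, blank)
# Three technical replicates, each read twice.
sample_reads = [[1290.0, 1310.0], [1300.0, 1300.0], [1295.0, 1305.0]]
concentration = dsred_concentration(sample_reads, blank, slope)
print(slope, concentration)
```

Samples whose fluorescence falls outside the range covered by the standards would need dilution or a wider standard series before the slope can be trusted for them.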
Design Expert software is used for data analysis and evaluation. In the analysis node, choose the response to be analyzed and initially select none in the transformation tab. Continue to the fit summary tab, which provides general information about factors that are important for the system under investigation.
The software will suggest an initial model based on its significance. In the model tab, an initial model is preselected based on the fit summary results. Use the automated mode to edit this model.
In the ANOVA tab, investigate the suggested model and the included factors. If necessary, manually remove any factors with p-values above a predefined threshold, or those that are unlikely based on mechanistic considerations, by switching back to the model tab, changing selection to manual, and eliminating the appropriate factors from the model.
Continue to the diagnostics tab to confirm the quality of the model and detect potential outliers in the dataset that have a strong influence on the model by examining all the tabs in the diagnostics tool. Adjust the transformation type in the corresponding tab if suggested by the Box-Cox plot, and restart the analysis procedure.
In the model graphs tab, visualize the evaluated model. For a limited number of numeric factors, such as three, the response surface representation is useful to assess optimal characteristics manually. Response surfaces only illustrate the impact of two factors on the response under investigation. The effect of any additional factor on the response is revealed by changing its value or level in the factors tool window. Alternatively, factors can be assigned to the plot axis by right-clicking them in the factors tool window and selecting the desired independent variable axis.
Manipulate the factor levels and assign them to the graph coordinates using the factors tool. Export graphs using the Export Graph to File command in the File tab. Use the numerical sub-node in the optimization node to optimize the desired response numerically, depending on the model factors, to which in turn certain constraints can be applied via the criteria tab.
Calculate and examine numerical solutions in the solutions tab based on the input provided in the criteria tab. Export these solutions to other software, such as a spreadsheet, for further analysis, revealing the factor settings associated with high or low response values. This is helpful if more than three numeric factors are investigated and 3D representation is difficult.
In this representative study, the DOE strategy was used to examine the effects of different promoters and 5' UTRs on transient expression of DsRed. Factors included in the transient expression model and the investigated ranges are shown in this table. The factors in bold are unique to this experiment.
The factors in italics are for another experiment that will be described later. At least three levels were selected for all discrete numeric factors to allow the calculation of a quadratic base model. A D-optimal selection algorithm was chosen for the selection of the DOE runs to obtain the most accurate estimates for the coefficients of the regression model.
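D-optimal selection favors the set of runs that maximizes the determinant of the information matrix X'X, which corresponds to the tightest joint estimates of the model coefficients. The sketch below shows only that criterion for a straight-line model in one coded factor. It is not the selection algorithm used by Design Expert, and the two candidate designs and the function names are hypothetical.

```python
def d_criterion(x_values):
    """det(X'X) for the straight-line model y = b0 + b1*x."""
    n = len(x_values)
    sum_x = sum(x_values)
    sum_x2 = sum(x * x for x in x_values)
    # X'X = [[n, sum_x], [sum_x, sum_x2]], so the determinant is:
    return n * sum_x2 - sum_x ** 2


def best_design(candidates):
    """Pick the candidate design with the largest determinant."""
    return max(candidates, key=d_criterion)


# Two hypothetical four-run designs on a factor coded from -1 to 1.
spread = (-1.0, -1.0, 1.0, 1.0)
narrow = (-0.5, -0.5, 0.5, 0.5)
chosen = best_design([spread, narrow])
print(d_criterion(spread), d_criterion(narrow), chosen)
```

Runs placed at the edges of the factor range give the larger determinant here, which matches the intuition that widely spaced runs pin down a slope better than runs clustered near the center.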
The design initially suggested by Design Expert consisted of 90 runs, but the fraction of design space (FDS) was insufficient to achieve a 1% standard error of prediction. D-optimal augmentation of the design to a total of 210 runs resolved this issue and resulted in an FDS of 100%, with more uniform prediction accuracy across the design space, indicated by the flat curve. The DsRed concentrations were determined for all 210 runs and the data were log10 transformed. Model factors were chosen by automated backward selection from a cubic model with an alpha level of 0.100.
This resulted in a significant model with insignificant lack of fit and high values for the multiple correlation coefficients. The p-value of all model factors was less than 0.05, and thus no further manual manipulation of the model was required. The model contained three-factor interactions, highlighted in bold, that were not part of the initial quadratic base model.
Reevaluation of the FDS graph using all factors included in the final prediction model revealed that the FDS for the standard error of prediction had not significantly diminished by including the additional three-factor interactions. The model quality diagnostic tools in Design Expert indicated that the data transformation was useful and there were no missing factors in the model, because the normal plot of residuals showed linear behavior and no specific pattern was observed in the residuals versus predicted plot. There was also no trend throughout the course of the experiment to indicate a hidden time-dependent variable.
Instead, the model predictions were in very good agreement with the observed DsRed fluorescence, as all points lie close to the diagonal. It was therefore assumed that the selected model was useful to predict the transient expression of DsRed in non-cotyledon tobacco leaves driven by different promoter/5' UTR combinations during a post-infiltration incubation period lasting eight days. An artificial linear regression model without data transformation was also selected to illustrate the consequences of wrong factor selection and transformation. As clearly seen here, the normal plot of residuals deviates from the expected linear behavior and there is a V-shaped pattern in the residuals versus predicted plot instead of a random scatter.
Additionally, the residuals versus run plot highlights two extreme values, while predictions were poor for both small and high values, deviating from the diagonal. The optimal model response surfaces for transient DsRed expression in tobacco leaves are shown here. The model predicted that leaf age was a significant factor, with expression levels lower in old leaves, for example leaf two in plot A and plot B, compared to young leaves such as leaf six in plot C and plot D. The progression of DsRed accumulation in leaves was not linear or exponential, but followed a sigmoidal curve during the eight days of post-infiltration incubation.
The 5' UTR combinations with the CaMV 35SS promoter resulted in stronger DsRed expression than combinations with the nos promoter. Although the 5' UTR also had a significant impact on DsRed expression, as shown by the comparison between TL and CHS, the expression strength was dependent on the accompanying promoter. The predictive model also indicated that certain pairs of promoter/5' UTR combinations, such as nos/CHS and CaMV 35SS/CHS, resulted in balanced expression levels differing by less than 30% from a defined ratio across all leaves and incubation times of greater than two days.
Such balanced expression would be useful for the expression of multimeric proteins with a defined stoichiometry. The DOE approach was also used to optimize the incubation conditions and harvest schemes for the simultaneous production of 2G12 and DsRed in tobacco. The factors affecting transient expression that were included in this experiment are in italics: OD600nm and incubation time.
A predictive model was established for the expression of each protein in plants at different ages. Young leaves were harvested at 40 days after seeding, and old leaves were harvested at 47 days after seeding. These four models were then evaluated and a consensus model was established that included each factor found to be significant in the individual models.
It was subsequently confirmed that the consensus model was still a good representation of all initial data sets. The consensus model was then used to identify optimal incubation temperatures and bacterial OD600nm for both proteins, and to predict protein concentrations in all leaves and leaf positions in young and old plants. Integration of the concentration profiles with biomass data then resulted in absolute protein yield.
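The step from concentration profiles to absolute yield amounts to weighting each leaf's predicted concentration by its biomass and summing over the leaves. A minimal sketch follows, assuming the predicted concentrations have already been expressed per gram of leaf biomass. The per-leaf values and the function name are hypothetical and are not results from the study.

```python
def absolute_yield(concentrations, biomasses):
    """Sum concentration x biomass over leaves (micrograms per gram x grams)."""
    if len(concentrations) != len(biomasses):
        raise ValueError("one biomass value is needed per leaf")
    return sum(c * m for c, m in zip(concentrations, biomasses))


# Hypothetical per-leaf values, for illustration only.
conc_ug_per_g = [50.0, 80.0, 120.0]  # predicted protein per gram of leaf
mass_g = [4.0, 3.0, 2.0]             # biomass of each leaf
total_ug = absolute_yield(conc_ug_per_g, mass_g)
print(total_ug)
```

Keeping the per-leaf products before summing also shows how much each leaf contributes, which is the quantity a per-leaf cost comparison needs.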
Absolute protein amounts were then correlated with associated downstream costs, allowing a cost-benefit analysis for the processing of each leaf per plant age. The same amount of DsRed and approximately 65% of 2G12 was found in young plants compared to old ones, despite an approximately 50% lower average biomass, reflecting the higher specific protein expression in young plants. This revealed that young plants were advantageous for transient expression because the proteins reached higher concentrations during shorter growth periods despite the lower overall biomass compared to old plants.
Lastly, it was also found that processing all the leaves from old plants was more expensive than discarding leaves one through three and increasing the number of plants per batch instead. Hence, DOE-based models are suitable not only to mark the final step of an experiment, but also for combination with other data to facilitate more complex aspects of process analysis. After watching this video, you should have a good understanding of how to set up, conduct, and analyze a DOE to investigate transient protein expression in plants.