
GDS International - Next - Generation - Pharmaceutical - Manufacturing - Summit - Europe - 4


Application of Multivariate Analysis toward Biotech Processes: Case Study of a Cell-Culture Unit Operation

Alime Ozlem Kirdar, Jeremy S. Conner, Jeffrey Baclaski, and Anurag S. Rathore*

Process Development, Amgen Inc., Thousand Oaks, California

This paper examines the feasibility of using multivariate data analysis (MVDA) for supporting some of the key activities that are required for successful manufacturing of biopharmaceutical products. These activities include scale-up, process comparability, process characterization, and fault diagnosis. Multivariate data analysis and modeling were performed using representative data from small-scale (2 L) and large-scale (2000 L) batches of a cell-culture process. Several input parameters (pCO2, pO2, glucose, pH, lactate, ammonium ions) and output parameters (purity, viable cell density, viability, osmolality) were evaluated in this analysis. Score plots, loadings plots, and VIP plots were utilized for assessing scale-up and comparability of the cell-culture process. Batch control charts were found to be useful for fault diagnosis during routine manufacturing. Finally, observations made from reviewing VIP plots were found to be in agreement with conclusions from process characterization studies, demonstrating the effectiveness of MVDA as a tool for extracting process knowledge.

Introduction

The increasing use of multivariate data analysis (MVDA) in both basic research and applied scientific fields has created a way to examine variable interactions that were previously undefined. Data sets originating from manufacturing of biopharmaceuticals are complex, and univariate or bivariate analysis can often be inefficient and result in misleading conclusions (Kourti, 2004). Key information in such cases lies in the correlation structure between variables, and testing the variables independently can lead to spurious results. Multivariate data analysis by means of projection methods overcomes challenges associated with such applications, such as multidimensionality of the dataset, multicollinearity, missing data, and variation introduced by disturbing factors such as experimental error and noise (Martin et al., 2002). Principal component analysis (PCA), partial least-squares (PLS), and multiple regression are some of the commonly used projection methods. These methods can project process data on lower dimensional spaces for easy inspection (Kourti, 2004).

MVDA is a multistep technique where one can identify clusters, outliers, and trends evident in the process data, permitting subsequent identification of correlations among key variables. Gabrielsson et al. have published a comprehensive review of the various multivariate methods used in pharmaceutical applications (Gabrielsson et al., 2002). Due to the prevalence of batch-based automation systems in the biotech industry, most publications focus on batch-to-batch comparison and detection of abnormal batches (Kourti, 2005). PCA has been applied to near-infrared spectral information derived from scanning unprocessed culture fluid samples from a complex antibiotic production process (Vaidyanathan et al., 2001). It was shown that changes in the spectral information that correspond to variations in the bioprocess can be identified and that the loadings and score plots can assist with process diagnosis and rapid assessment of process quality. Shimizu et al. applied a nonlinear multivariate analysis method, the artificial autoassociative neural network (AANN), for bioprocess fault detection (Shimizu et al., 1998). For the case of a recombinant yeast process with a temperature-controllable expression system, they demonstrated successful use of AANN for detection of a faulty temperature sensor and plasmid instability of recombinant cells. Ündey and Çinar examined use of a multivariate statistical process monitoring framework for multistage, multiphase processes for a case study involving production of pharmaceutical granules (Ündey and Çinar, 2002). Their monitoring scheme allowed for identification of localized faults in process stages/phases and provided a means of decreasing the time required for the assessment of historical batch process data as well as new batches.

More recently, several studies have addressed the topic of performing multivariate analysis on data from fermentation and cell-culture operations. Multiway PCA has been applied to assess seed quality using routinely gathered data from the manufacturing plant (Cunha et al., 2002). The objective was to investigate the benefits of including seed quality information into data-based models for final productivity estimation in an industrial antibiotic fermentation process. Ündey et al. developed an integrated online multivariate statistical process monitoring, quality prediction, and fault diagnosis framework for batch processes and applied it on simulated fed-batch penicillin fermentation (Ündey et al., 2003). They found that unfolding the three-way data array by preserving the variable direction allowed online monitoring without requiring future value estimation, and predicting the values of the end-of-batch quality during the progress of the batch provided useful insight to anticipate the effects of excursions on the final quality.

This paper presents results from multivariate analysis of data from small-scale (2 L) and large-scale (2000 L) cell-culture batches. It is shown that MVDA can be extremely useful for supporting key activities for successful manufacturing of biopharmaceutical products including scale-up, process comparability, process characterization, and fault diagnosis.

* To whom correspondence should be addressed. Ph: 805-447-4491. Email: [email protected].

10.1021/bp060377u CCC: $37.00 © xxxx American Chemical Society and American Institute of Chemical Engineers. Published on Web 01/04/2007

Theory

Multivariate Data Analysis (MVDA). Table 1 presents the nomenclature commonly used in PLS analysis. As seen in Figure 1, batch processes yield a matrix, X, which can be illustrated as a three-dimensional data table composed of data collected for each process variable (K) over a defined time interval (J) for a number of batches (N). A number (M) of final results (such as titer and product quality) are designated in the data table Y.

The approach of batch statistical process control (BSPC) used in this paper is illustrated in Figure 2. This approach uses all available batch data and analyzes data on two levels, observation level and batch level. Data analysis using MVDA of batch data enables the correlation structure among measured variables to be investigated, while separating representative (good) batches from non-representative (bad) ones. By understanding the properties that dominate a process, comparing batch-to-batch variations enables real-time process monitoring and early fault detection.

When using partial least-squares (PLS) as the projection method, spaces with K and M dimensions are created for the X and Y matrices, respectively. Every observation in a dataset can be visualized as one point in the X-space and another point in the Y-space. Hyperplanes called principal components are then calculated to provide maximum correlation between points in the X- and Y-spaces. Each observation is projected onto this hyperplane and translated into latent variables known as scores (t for X-space projections, u for Y-space projections). On the basis of these score values, weights are assigned to express the correlation between points in the X- and Y-spaces, respectively. Weights (w for X-space projections, c for Y-space projections) are assigned on the basis of the variable's influence on the model at any given point in the batch evolution.
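The relationship between weights and scores described above can be illustrated with a minimal sketch of a single PLS component for one response column, in the style of the NIPALS algorithm. This is not the SIMCA-P+ implementation; the function name `pls1_component` and the toy data are assumptions made for illustration, and the inputs are assumed to be already centered.

```python
import math

def pls1_component(X, y):
    """Extract one PLS component for a single y column (NIPALS-style PLS1).

    X: list of observation rows, already centered (hypothetical input layout).
    y: list of centered response values, one per row of X.
    """
    n, k = len(X), len(X[0])
    # X-weights w: direction of maximum covariance with y, scaled to unit length
    w = [sum(X[i][j] * y[i] for i in range(n)) for j in range(k)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # X-scores t: projection of each observation onto w
    t = [sum(X[i][j] * w[j] for j in range(k)) for i in range(n)]
    tt = sum(v * v for v in t)
    # Y-weight c: regression coefficient of y on t
    c = sum(y[i] * t[i] for i in range(n)) / tt
    # Y-scores u: y expressed along the same latent direction
    u = [yi / c for yi in y]
    # X-loadings p: used to deflate X before extracting a further component
    p = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(k)]
    return w, t, c, u, p

# Toy, made-up data: two centered variables, three observations
X = [[-1.0, 1.0], [0.0, 0.0], [1.0, -1.0]]
y = [-1.0, 0.0, 1.0]
w, t, c, u, p = pls1_component(X, y)
```

Further components would be obtained by subtracting the outer product of t and p from X and repeating the same steps.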

Modeling Approach. The approach followed in this paper is that of observation level modeling and involves unfolding of the three-dimensional batch data table illustrated in Figure 1. The unfolding is performed while preserving variable direction to a two-way data table. This type of unfolding has previously been shown to be successful for detection of abnormal batches (Wold, 2001; Ündey et al., 2003). As mentioned above, modeling was performed by projecting observations on the hyperplanes and translating them into latent variables. PLS was used to relate the process data to a maturity-related Y-variable representing relative local batch time. This provides an appropriate maturity index model that can be used to explore how far a batch has progressed. Next, a model diagnostics step is performed to check that there are no outliers among the reference batches. Outlier batches are excluded if a clear abnormality is found, and the model building process is then repeated. Finally, the batch prediction control charts (scores giving the trajectory of a batch) are created to differentiate between representative and non-representative test set batches.
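The unfolding step can be sketched as follows, assuming the three-way table is held as nested lists indexed by batch, time point, and variable. The function name `unfold_variablewise` and the data layout are hypothetical; the sketch only shows how the variable direction is preserved and how local batch time becomes the response.

```python
def unfold_variablewise(batches):
    """Unfold an N x J x K batch table into an (N*J) x K two-way table.

    batches: list of N batches, each a list of J time points,
             each time point a list of K variable values.
    Returns X (one row per batch/time point, K columns), y (local batch
    time, used as the maturity response), and labels (batch, time) per row.
    """
    X, y, labels = [], [], []
    for b, batch in enumerate(batches):
        for j, row in enumerate(batch):
            X.append(list(row))      # the K variables stay as columns
            y.append(float(j))       # local batch time
            labels.append((b, j))
    return X, y, labels

# Made-up example: N = 2 batches, J = 3 time points, K = 2 variables
batches = [
    [[7.0, 1.0], [6.9, 2.0], [6.8, 4.0]],
    [[7.1, 1.1], [7.0, 2.2], [6.9, 4.1]],
]
X, y, labels = unfold_variablewise(batches)
```

Because every row is a single time point of a single batch, a new batch can be projected onto the model as its observations arrive, without estimating future values.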

Materials and Methods

Software. A commercially available MVDA software package, SIMCA-P+ 11 version 11.0.0.0 (Umetrics AB, Kinnelon, NJ), was used to perform the multivariate analysis. Prior to analysis in SIMCA, process data were assembled in Excel (Microsoft, Redmond, WA).

MVDA Application to Cell Culture Process Dataset. The flowchart illustrated in Figure 2 presents an overview of the observation level batch modeling when working with the SIMCA-P+ 11 software. Multivariate data analysis using the SIMCA-P+ software is a multistep process. First, the data must be imported into the program from a spreadsheet. Second, the data must be preprocessed to remove outliers (data points strongly deviating from normal process behavior) and account for missing values. The default tolerance limit in SIMCA-P+ software for variables/observations for fit and prediction is 50%. When the missing values of a variable/observation exceed this tolerance limit, the software prompts for including or excluding the variable or observation. This is largely a manual process and requires a great deal of familiarity with the source dataset and the process itself. Variables with 100% missing values are automatically excluded. For example, in the cell culture data used in this paper, a couple of outliers in the 2000 L dataset were discovered and true assignable root causes were investigated. Post-investigation, some of these data were excluded from the analysis. This step of preparing the data prior to fitting a model is one of the most important steps in the batch modeling process. Third, an observation level model is fit to the data. This step is largely automated through SIMCA-P+. The scientist/engineer can then review the model using several visualization tools and determine if it is appropriate. If necessary, individual observations can be eliminated, and a new model can be fit to the remaining data. The resulting model should be an appropriate fit to the dataset, and from this model, various statistics, control charts, and reports can be generated.
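The missing-value screening described above can be mimicked outside the software with a short sketch. The function name `screen_missing`, the use of `None` for a missing value, and the example variables are assumptions for illustration; only the 50% tolerance and the automatic exclusion at 100% come from the text.

```python
def screen_missing(table, tolerance=0.5):
    """Sort variables by their fraction of missing values.

    table: dict mapping variable name -> list of values (None = missing).
    Returns (kept, flagged, dropped):
      kept    - at or below the tolerance limit
      flagged - above the limit, left for a manual include/exclude decision
      dropped - 100% missing, excluded automatically
    """
    kept, flagged, dropped = [], [], []
    for name, values in table.items():
        frac = sum(v is None for v in values) / len(values)
        if frac == 1.0:
            dropped.append(name)
        elif frac > tolerance:
            flagged.append(name)
        else:
            kept.append(name)
    return kept, flagged, dropped

# Made-up daily measurements for one run
table = {
    "pH": [7.0, 6.9, None, 6.8],
    "lactate": [None, None, None, 1.2],
    "osmolality": [None, None, None, None],
}
kept, flagged, dropped = screen_missing(table)
```

The flagged list corresponds to the cases where the analyst's process knowledge is needed to decide whether a variable should stay in the model.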

A large dataset is available for each cell culture run, including continuous online measurements of environmental and operating parameters, as well as daily measurements of metabolic and cell growth parameters. Success in diagnosis of batch abnormalities at the 2 L scale for a similar process by PCA analysis using only the online environmental data has been reported previously (Gunther et al., 2006). In the case study presented here, daily offline metabolic and cell growth measurements from 14 center point runs (2 L scale) and 11 center point runs (2000 L scale batches) were analyzed separately by PLS. MVDA analysis and modeling was performed using representative small-scale (2 L) and large-scale (2000 L) batches to select key process variables. Several input parameters (pCO2, pO2, glucose, pH, lactate, ammonium ions) and output parameters (percent purity, viable cell density, percent viability, osmolality) were included in the analysis.

Summary of Fit: Model Overview. Each variable in the process data was first centered with respect to its mean and then scaled to unit variance. During the creation of the working data, local batch time was designated as the Y-variable in the PLS model. The resulting PLS scores are new variables that capture linear (t1), quadratic (t2), and cubic (t3) relationships to local batch time. While it is important to select all variables that are of significance to the unit operation, the noise in the model increases with the number of variables, and so process knowledge should be used to select the appropriate variables.
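Centering and scaling to unit variance can be written in a few lines. The sketch below assumes the sample standard deviation (n - 1 in the denominator) is used; the function name `autoscale` and the glucose values are made up for illustration.

```python
import statistics

def autoscale(column):
    """Center a variable on its mean and scale it to unit variance."""
    mean = statistics.mean(column)
    sd = statistics.stdev(column)  # sample standard deviation (n - 1)
    return [(v - mean) / sd for v in column]

# Made-up glucose readings
glucose = [2.0, 4.0, 6.0]
scaled = autoscale(glucose)
```

After this step every variable has the same variance, so a variable measured in large units (for example, osmolality) does not dominate the model merely because of its scale.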

Table 1. PLS Modeling Nomenclature (Umetrics, 2005)

| term | meaning |
|---|---|
| X-space | multidimensional space formed by X-variables |
| Y-space | multidimensional space formed by Y-responses |
| N | observations or no. of batches |
| J | sampling time point interval |
| K | multidimensional X-variables: "factors" or "predictors" |
| M | multidimensional Y-responses |
| t/u | scores for X- and Y-space, respectively |
| w/c | weights for X- and Y-space, respectively |
| R2 | explained variation (goodness of fit) |
| Q2 | predictive ability of the model (goodness of prediction) |


The number of PLS components to be used to explain the significant variation in X is determined by the software. Figure 3 illustrates the tradeoff between R2 and Q2 and demonstrates the optimal balance of the two parameters. The R2 parameter represents the explained variation or goodness of fit of the model. Q2 is the goodness of prediction and represents the predicted variation. SIMCA-P+ performs default cross-validation (internal validation) and continues to add principal components until Q2 does not improve, thereby automatically avoiding over-parametrization or modeling of the noise. In the observation level model, it is necessary to extract enough latent variables to explain sufficient variation in the X-block, while also ensuring adequate predictive power. R2 of X gives the percentage of the X-variation accounted for by the model and is important for confirming that the model captures sufficient variation. Q2 of Y provides the confirmation of the correlation between X and Y. R2 of Y is useful to confirm process dynamics.
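The two statistics can be sketched with their usual definitions: R2 compares fitted values with observations, while Q2 uses predictions made for observations that were left out during cross-validation. The stopping rule below is a deliberate simplification ("stop when Q2 no longer improves"); the exact rule applied by SIMCA-P+ is not given in the text, and the function names and Q2 values are hypothetical.

```python
def r2(observed, fitted):
    """Explained variation: 1 - residual sum of squares / total sum of squares."""
    mean = sum(observed) / len(observed)
    ss_tot = sum((o - mean) ** 2 for o in observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    return 1.0 - ss_res / ss_tot

def q2(observed, cv_predicted):
    """Predicted variation: same formula, but the residuals come from
    predictions for observations held out during cross-validation."""
    return r2(observed, cv_predicted)

def choose_components(q2_values):
    """Return the number of components at which Q2 stops improving.

    q2_values[a] is the cumulative Q2 of a model with a + 1 components.
    """
    best = 0
    for a in range(1, len(q2_values)):
        if q2_values[a] <= q2_values[a - 1]:
            break
        best = a
    return best + 1

# Made-up cumulative Q2 values for models with 1 to 4 components
n_components = choose_components([0.50, 0.70, 0.72, 0.71])
```

A model whose R2 keeps rising while Q2 falls is fitting noise, which is the situation the cross-validation step is meant to prevent.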

Prediction of New Batches. The final step in batch modeling is importing a secondary dataset with new batches and comparing the new batch performance against the PLS model. Next, the PLS model can be used to generate batch control charts for use in testing the fit of new batches. SIMCA-P contains a convenient menu item for computation of predictions.

Results and Discussion

Table 2 presents the various MVDA outputs that are available from the analysis. In the following, we will discuss these outputs for the cell-culture process under consideration.

Score Plot. A visual summary of the process behavior over time can be seen in the score plot, where the score vectors for the first two principal components, t[1] and t[2], are plotted against each other (Cunha et al., 2002; Martin and Morris, 2002; Vaidyanathan et al., 2001). It is important to note the percentage variability explained by these first two principal components when interpreting the significance of the plot. It is common to also plot an ellipse on this set of axes to represent the Hotelling T2 95% confidence interval. The observations situated outside the ellipse are outliers. Hence, when performing process monitoring, data that lie outside the ellipse can be considered out of control.
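The ellipse test can be sketched as below, using one common form of the Hotelling T2 limit for A components and N observations, A(N^2 - 1) / (N(N - A)) times the critical F value with (A, N - A) degrees of freedom. Whether SIMCA-P+ uses exactly this expression is an assumption here. The critical F value is passed in by the caller (for example, from a statistical table) so that the sketch needs only the standard library; all function names are hypothetical.

```python
def hotelling_t2(t1, t2, var1, var2):
    """T2 of one observation from its two (mean-centered) scores and
    the variances of the two score vectors."""
    return t1 * t1 / var1 + t2 * t2 / var2

def t2_limit(n_obs, n_comp, f_crit):
    """Critical T2 value; f_crit is the critical F value for
    (n_comp, n_obs - n_comp) degrees of freedom at the chosen level."""
    return n_comp * (n_obs ** 2 - 1) / (n_obs * (n_obs - n_comp)) * f_crit

def outside_ellipse(scores, var1, var2, limit):
    """Indices of observations whose T2 exceeds the limit."""
    return [i for i, (a, b) in enumerate(scores)
            if hotelling_t2(a, b, var1, var2) > limit]

# Assumed setting: 10 observations, 2 components, F(0.95; 2, 8) taken as 4.46
limit = t2_limit(10, 2, 4.46)
flagged = outside_ellipse([(0.5, -0.2), (4.0, 1.0)], 1.0, 1.0, limit)
```

An observation is drawn outside the ellipse exactly when its T2 value exceeds the limit, so the list returned by `outside_ellipse` corresponds to the points a reader would pick out by eye on the score plot.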

Figure 4 presents the score plot for the 2 L scale data. It is seen that almost all of the data points fall within the ellipse, indicating an acceptable fit for the model. Further, a trajectory for the process becomes evident from the time evolution of the data points (denoted in color). Since the batch time points should coincide, the score plot provides a fingerprint of the process. When sets of score vectors for several batches are plotted on the same set of axes, process abnormalities are often indicated by time points for faulty batches that are far from the cluster representing the normal position for the respective time point.

Figure 1. Illustration of batch process data.

Figure 2. Overview of batch statistical process control (BSPC) via observation level modeling.

Figure 3. Variation in R2 and Q2 with number of variables included in the analysis (Umetrics, 2005).

Figure 5 illustrates the score plot for the 2000 L scale. It is seen that the batch progression at the 2000 L scale is very similar to that at the 2 L scale. Location of the batch for a certain stage of the process is maintained across scales. The blue data points are in the first quadrant, the green in the second and third, the yellow in the third and fourth, and finally, the red in the fourth. This pattern can be useful for process monitoring as it can identify the stage of the process at which a batch starts to deviate from typical performance.

Loadings Plot. PLS loadings are computed from the correlation between each of the x variables and the principal components. Hence this plot between principal component 1 (p1) and principal component 2 (p2) displays how these variables correlate with t1 and t2. The variables with the largest absolute values of p1 and/or p2 are situated far away from the origin (on the positive or negative side) on the plot and dominate the projection. Thus, for input parameters, the further a parameter is from the center (0,0) in the loadings plot, the greater its impact on the performance of the cell culture. For output parameters, the further a parameter is from the center (0,0), the greater the impact the cell-culture process has on that output parameter. Also, this plot illustrates how the different parameters are correlated with respect to each other. Variables near each other (in the same quadrant) are positively correlated; variables opposite to each other (opposite quadrants) are negatively correlated (Martin and Morris, 2002).
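The correlation reading of the loadings plot can be illustrated with a plain Pearson correlation between each variable and a score vector. This is a sketch of the interpretation rather than the software's loading calculation, and the VCD, viability, and score values are made up.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Made-up trajectories over four sampling days
t1 = [-1.5, -0.5, 0.5, 1.5]            # first score vector
vcd = [1.0, 2.0, 4.0, 7.0]             # viable cell density rises
viability = [98.0, 97.0, 94.0, 90.0]   # viability falls

r_vcd = pearson(vcd, t1)
r_viability = pearson(viability, t1)
```

Variables whose correlations with the score have the same sign fall on the same side of the origin and are positively correlated with each other; opposite signs, as for VCD and viability here, place them on opposite sides.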

Figure 6 presents this plot for data from the 2 L scale. The cell culture process does have a significant impact on VCD, titer, and viability. Since titer and VCD are in the same quadrant, this suggests that at earlier stages of the culture when the VCD is lower, the titer is lower. However, as the culture progresses (later stages in the culture) the productivity (titer) will increase as the VCD increases not only because there are more cells but also because the cells are in the production stage. Also, because the viability and VCD are in the opposite quadrants, this implies that at earlier stages of the culture when the VCD is lower, the viability is higher and vice versa. The input parameters pCO2, pH, NH4+, glucose, and lactate levels have a significant impact on the performance of the cell culture process. Also, pH, glucose, and pCO2 levels have a similar effect on the performance of the cell-culture process, whereas lactate has the reverse effect.

Figure 4. Scores scatter plot (t1 vs t2) for 2 L batches along with the 95% tolerance ellipse.

Table 2. Overview of MVDA Diagnostic Plots (Umetrics, 2005)

| plot | example | theory |
|---|---|---|
| scores plots | t[1]/t[2] | windows in the X-space displaying the observations as situated on the hyperplane |
| | u[1]/u[2] | windows in the Y-space displaying the observations as situated on the hyperplane |
| | u[1]/t[1] | display observations in the projected X(T) and Y(U) space; shows how well the Y-space correlates with the X-space |
| loadings plots | w*1/w*2 | X* weights |
| | c1/c2 | Y weights |
| | w*c[1]/w*c[2] | shows both X- and Y-weights; determines how the X and Y variables combine in the projections and how the X variables relate to Y |
| contribution plots | "Point and Click" (or use contribution tool) on any other plot to see variable contribution | examines variable interactions and influence on the batch model at any given point during batch evolution |
| batch control charts | num/t[1] | most variation is modeled by the first few components (contribute most to R2 and Q2); no anomalies should be seen in the first two to three scores |
| variable control charts | VCD vs run time | ±3 SD and model average shown by default |

Figure 7 presents the loadings plot for the data from the 2000 L scale. For the most part, the loadings plot is similar to that for the 2 L scale. There are some changes in the loadings for osmolality and NH4+ levels suggesting changes in the cell-culture performance upon scale-up. It is well-known in the literature that gas transfer is less efficient at large scale, leading to buildup of CO2 in the cell culture vessel. This results in an increased use of base in order to maintain pH at the intended set point, and the increased base addition leads to higher osmolality (Gray et al., 1996; Mostafa and Gu, 2003; Zhu et al., 2005). This plot can be useful in providing a qualitative assessment of success of scale-up and process comparability across scales, equipment, or sites. For the case under consideration, it is seen that the scale-up is successful with respect to the most significant process parameters, i.e., product quality and titer.

An additional interpretation can be made by using the score and loadings plots together, since they complement each other. The position of objects in a given direction in a score plot is influenced by the variables lying in the same direction in the loadings plot. Thus we can use the loadings plot to interpret directions and differences between observations in the score plot. For instance, we can observe that pH, pCO2, and pO2 are at their highest levels on Day 1.

Figure 5. Scores scatter plot (t1 vs t2) for 2000 L batches along with the 95% tolerance ellipse.

Figure 6. PLS loadings plot for 2 L batches.

Figure 7. PLS loadings plot for 2000 L batches.

Figure 8. Variable importance for the projection plot for 2 L batches.

Figure 9. Variable importance for the projection plot for 2000 L batches.

Variable Importance for the Projection (VIP) Plot. This plot summarizes the observations made from the score and loading plots by showing the relative importance of each included variable in the analysis. The VIP values reflect the importance of the terms in the model both with respect to Y, i.e., its correlation to all the responses, and with respect to X (the projection).
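A commonly used definition of the VIP score weights each variable's squared PLS weight by the amount of Y variation that the corresponding component explains. The sketch below follows that definition; it is offered as an assumption about how such scores are typically computed, not as the exact SIMCA-P+ calculation, and the weights and sums of squares are made up.

```python
import math

def vip(weights, ssy):
    """VIP score per variable.

    weights[a][k]: unit-length X-weight of variable k in component a.
    ssy[a]: sum of squares of Y explained by component a.
    """
    n_comp, n_vars = len(weights), len(weights[0])
    total = sum(ssy)
    return [
        math.sqrt(n_vars * sum(ssy[a] * weights[a][k] ** 2
                               for a in range(n_comp)) / total)
        for k in range(n_vars)
    ]

# Made-up model: two variables, two components; the first component
# explains nine times as much Y variation as the second
weights = [[0.6, 0.8], [0.8, -0.6]]
ssy = [9.0, 1.0]
scores = vip(weights, ssy)
```

With unit-length weights the squared VIP values average to one, which is why a score above one is often read as "more important than the average variable".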

Figure 8 presents this plot for the 2 L scale dataset. Some interesting observations can be derived upon comparing the VIP plot to the loadings plot. The 2 L PLS model has extracted 5 principal components, where principal component 1 explains the greatest share of the variation in the model (86%) and the subsequent components explain decreasing amounts of variation. As mentioned above, the loadings plot is a plot between principal component 1 (p1) and principal component 2 (p2). It is seen that this cell-culture process has a strong influence on the titer, VCD, and viability of the broth at the end of the process. For input parameters, pH, lactate, and glucose levels are found to have the most impact on process performance. On the contrary, osmolality has one of the lowest loadings in both principal component directions. Further, pCO2 and NH4+ have higher loadings in the principal component 2 direction and result in a lower ranking on the VIP plot. It can be concluded that the findings from the VIP and the loadings plot are in agreement with each other.

Figure 9 presents the VIP plot for the 2000 L scale dataset. The 2000 L PLS model also extracted 5 principal components, where principal component 1 explains the greatest share of the variation in the model (84%) and the subsequent components explain decreasing amounts of variation. It is seen that the relative importance of the variables is unchanged for most of the variables. Most notably, the VIP score for the two most significant parameters for this process, namely, titer and VCD, is the same at the 2 and 2000 L scales. An exception is pO2 level, which as mentioned above has a more significant impact at large scale. The VIP plots can be very useful for a quantitative assessment of success of scale-up and process comparability across scales, equipment, or sites.

Figure 10. Batch control chart for 2 L batches.

Figure 11. Batch control chart for combined data from 2 and 2000 L batches.

Table 3. Summary of Utilization of MVDA Technique for Biopharmaceutical Manufacturing

| MVDA output | purpose | potential usage | unique attributes |
|---|---|---|---|
| score plot | illustrate overall batch evolution trends; determine groups, trends, outliers and similarity within the dataset; elliptical confidence interval based on Hotelling T2; no anomalies should be seen in most influential scores plots | scale-up, process comparability, fault diagnosis | can pinpoint the timing (w.r.t. batch evolution) of atypical behavior |
| loadings plot | determine variables exerting most influence on batch evolution; complements the respective scores plot | scale-up, process comparability, process characterization | qualitative assessment |
| VIP plot | indicate which variables are the greatest contributors to a given process shift; good starting point for fault analysis | scale-up, process comparability, process characterization | quantitative assessment |
| batch control chart | detect outlying observations with respect to the control limits | scale-up, process comparability, fault diagnosis | can pinpoint the timing (w.r.t. batch evolution) of atypical behavior |

Batch Control Chart. This chart is useful for identifying the time point at which a batch may deviate from normal process behavior. It consists of a set of upper and lower limits calculated at ±3 SD, and a trajectory that the process is expected to follow. A new batch can be overlaid, and if the control limits are violated, it can be reasonably expected that the batch performance is abnormal.
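The construction of such a chart can be sketched as follows, assuming the reference batches are aligned on the same time points and that the charted quantity is a single value per time point (for example, the t1 score). The function names and the numbers are hypothetical.

```python
import statistics

def control_limits(reference, n_sd=3.0):
    """Per-time-point (lower, mean, upper) limits from reference batches.

    reference: list of batches, each a list with one value per time point.
    """
    limits = []
    for values in zip(*reference):          # all batches at one time point
        mean = statistics.mean(values)
        sd = statistics.stdev(values)
        limits.append((mean - n_sd * sd, mean, mean + n_sd * sd))
    return limits

def violations(batch, limits):
    """Time points at which a new batch falls outside the control limits."""
    return [j for j, (v, (lo, _, hi)) in enumerate(zip(batch, limits))
            if v < lo or v > hi]

# Made-up score trajectories for three reference batches
reference = [[0.0, 1.0, 2.0], [0.2, 1.2, 2.2], [-0.2, 0.8, 1.8]]
limits = control_limits(reference)
late_fault = violations([0.1, 1.1, 3.5], limits)
```

The returned indices give the time points of the excursion, which is the information the chart provides about when a batch starts to deviate.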

Figure 10 presents the chart for the 14 small-scale batches that were analyzed. It is evident from the overlay of the batches that the process is a well-controlled process.

An MVDA model was created using the 14 representative small-scale batches (2 L) and then used to predict performance at the 2000 L scale. The results are presented in Figure 11 and show that the process at 2000 L is indeed very representative of the 2 L process. Of the 11 large-scale batches shown in the plot, only two deviate slightly from the control limits, between 12 and 13 time units. Further investigation indicated that one of these two lots had the highest VCD and the other the highest lactate levels of all the runs. Thus, these two batches were somewhat "atypical", illustrating the usefulness of batch control charts in fault diagnosis during manufacturing, including events such as equipment failures and raw material issues (Kourti, 2005). Like the score plots, the batch control chart can also indicate the stage of the process at which a batch starts to deviate from typical performance.
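The control-chart logic described above can be sketched in a few lines. This is a simplified, univariate illustration, not the model-based calculation performed by the MVDA software. It assumes every batch is sampled at the same time points and that the charted quantity (for example a model score or a single measured variable) is already available as a list per batch. The function names and the numbers are hypothetical and are not data from this study.

```python
from statistics import mean, stdev


def control_limits(reference_batches, n_sd=3.0):
    """Return (center, lower, upper), one value per time point.

    reference_batches: list of equal-length trajectories from
    batches regarded as typical (hypothetical input format).
    """
    center, lower, upper = [], [], []
    for values in zip(*reference_batches):  # all batches at one time point
        m = mean(values)
        s = stdev(values)  # sample standard deviation across batches
        center.append(m)
        lower.append(m - n_sd * s)
        upper.append(m + n_sd * s)
    return center, lower, upper


def violations(batch, lower, upper):
    """Return the time indices at which a batch leaves the limits."""
    return [
        t
        for t, (x, lo, hi) in enumerate(zip(batch, lower, upper))
        if x < lo or x > hi
    ]


# Illustrative numbers only (assumed, not taken from the paper).
reference = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],
    [0.9, 1.9, 3.1, 3.8],
]
center, lower, upper = control_limits(reference)

# Overlay a new batch and list where it violates the +/-3 SD limits.
new_batch = [1.0, 2.0, 3.9, 4.0]
out_of_limits = violations(new_batch, lower, upper)
```

The first index returned by `violations` corresponds to the stage of the process at which the overlaid batch starts to deviate, which is the timing information the chart is used for.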

Comparison of Results to Process Characterization. During the course of process characterization, design of experiments (DOE) studies were performed at small scale to investigate the effect of different input parameters on the performance of this cell-culture process (data not shown here). Product titer and viability were found to be the most significant output parameters for this process. Further, pH and pO2 levels were found to have the most impact on the performance of this process. These observations are well aligned with the earlier discussion of the VIP plots in Figures 8 and 9. Although MVDA cannot replace the rigor of planned DOE studies, it can be used as a tool that provides process knowledge for guiding a more efficient process characterization effort, i.e., specifying which input parameters should be examined during process characterization studies.

Conclusion

The purpose of this paper was to examine the usefulness of MVDA with respect to the various activities that are involved in biopharmaceutical manufacturing, including scale-up, process comparability, process characterization, and fault diagnosis. Table 3 summarizes the utility of the different MVDA outputs with respect to the above-mentioned aspects of manufacturing. The score plot illustrates the evolution of a batch over time and can be useful in identifying the stage of the process (vs time) when a batch deviates from typical performance. The loadings plot presents a qualitative comparison of the loadings of the different variables under consideration. The VIP plot provides a more quantitative assessment of the relative importance of the different variables under consideration. The batch control chart offers a tool for monitoring the evolution of a batch and comparing it to control limits representing typical performance.

At present, a lot of data are collected at small and large scale that do not undergo the rigorous data analysis presented here. This and future publications attempt to demonstrate the power of MVDA, which enables us to extract useful process information through analysis of the readily available data in order to maximize our understanding of the process.

References and Notes

Cunha, C. C. F.; Glassey, J.; Montague, G. A.; Albert, S.; Mohan, P. An assessment of seed quality and its influence on productivity estimation in an industrial antibiotic fermentation. Biotechnol. Bioeng. 2002, 78, 658-669.

Gabrielsson, J.; Lindberg, N.-O.; Lundstedt, T. Multivariate methods in pharmaceutical applications. J. Chemom. 2002, 16, 141-160.

Gray, D. R.; Chen, S.; Howarth, W.; Inlow, D.; Maiorella, B. L. CO2 in large-scale and high-density CHO cell perfusion culture. Cytotechnology 1996, 22, 65-78.

Gunther, J.; Seborg, D. E. Fault detection and diagnosis in industrial fed-batch cell culture. ADCHEM 2006, Gramado, Brazil.

Kourti, T. Process analytical technology and multivariate statistical control. Part 1: Process Anal. Technol. 2004, 1, 13-19. Part 2: Process Anal. Technol. 2005, 2, 24-28. Part 3: Process Anal. Technol. 2006, 3, 18-24.

Martin, E. B.; Morris, A. J. Enhanced bio-manufacturing through advanced multivariate statistical technologies. J. Biotechnol. 2002, 99, 223-235.

Mostafa, S. S.; Gu, X. Strategies for improved dCO2 removal in large-scale fed-batch cultures. Biotechnol. Prog. 2003, 19, 45-51.

Shimizu, H.; Yasuoka, K.; Uchiyama, K.; Shioya, S. Bioprocess fault detection by nonlinear multivariate analysis: application of an artificial autoassociative neural network and wavelet filter bank. Biotechnol. Prog. 1998, 14, 79-87.

SIMCA-P and SIMCA-P+ 11 User Guide and Tutorial, Version 11.0; Umetrics AB: Umea, Sweden, 2005.

Undey, C.; Ertunc, S.; Cinar, A. Online batch/fed-batch process performance monitoring, quality prediction, and variable-contribution analysis for diagnosis. Ind. Eng. Chem. Res. 2003, 42, 4645-4658.

Undey, C.; Cinar, A. Statistical monitoring of multistage, multiphase batch processes. Control Syst. Mag. IEEE 2002, 22, 40-52.

Vaidyanathan, S.; Arnold, S. A.; Matheson, L.; Mohan, P.; McNeil, B.; Harvey, L. M. Assessment of near-infrared spectral information for rapid monitoring of bioprocess quality. Biotechnol. Bioeng. 2001, 74, 376-388.

Wold, S.; Sjostrom, M.; Eriksson, L. PLS-regression: a basic tool of chemometrics. Chemom. Intell. Lab. Syst. 2001, 58, 109-130.

Zhu, M. M.; Goyal, A.; Rank, D. L.; Gupta, S. K.; Boom, T. V.; Lee, S. S. Effect of elevated pCO2 and osmolality on growth of CHO cells and production of antibody-fusion protein B1: a case study. Biotechnol. Prog. 2005, 21, 70-77.

Received December 8, 2006. Accepted December 12, 2006.

BP060377U
