Development of simple random mutagenesis protocol for the protein expression system in Pichia pastoris

We have developed a novel, rapid random mutagenesis strategy that enables the P. pastoris expression system to be used for directed evolution of eukaryotic proteins, despite
its low transformation efficiency. This method is composed of sequential plasmid amplification
by means of error-prone RCA and MDA of the error-prone RCA products, followed by protein
expression in P. pastoris (Fig. 1).

Error-prone rolling circle amplification

Phi29 DNA polymerase amplifies circular DNAs isothermally and yields linear DNAs composed
of tandem repeats of the circular DNA plasmids [28]. In error-prone RCA, the gene encoding a target protein is first cloned into an expression
vector, and then the whole vector is amplified in the presence of Mn²⁺ to introduce mutations into the target gene [23], [24]. In this study, various amounts of the circular pGAPZα vector containing the cel6A gene (50–500 pg) were amplified in the presence of 0–2 mM MnCl₂.

The products thus obtained appeared at positions corresponding to more than 10 kbp
in agarose gel electrophoresis. Therefore, the products were digested with restriction
enzyme (BlnI) to afford linear fragments of 4.4 kbp, which is identical to the sum of vector
and insert. The yield of the products decreased with increase of Mn²⁺ concentration and also with decrease of the initial amount of template, as shown in
Fig. 2a. The largest amount of amplified plasmids estimated from gel electrophoresis was ~1 µg
in a total volume of 10 µl. However, plasmids obtained in the presence of 2 mM Mn²⁺ were less than 10 ng/µl, and no band of the expected size appeared in the presence
of 4 mM Mn²⁺ (data not shown).

Fig. 2. Yield of the error-prone RCA and MDA products after digestion with restriction enzyme.
Various amounts of plasmid (pGAPZα/cel6A) were amplified with MnCl₂ (a), and the obtained error-prone RCA products were amplified without MnCl₂ (b). Amplified DNAs were digested with restriction enzyme BlnI, which cleaves a single
site in the vector. The DNA separated by agarose gel electrophoresis showed a single
plasmid-sized band (4.4 kb)

The amount of the error-prone RCA products shown in Fig. 2a is more than enough for transformation in the E. coli expression system, as reported by Fujii and coworkers [23], because E. coli needs only a ng-level amount of plasmid for transformation. However,
this amount is not enough for the Pichia expression system. The transformation efficiency of E. coli is generally 10⁸–10¹¹ transformants per µg of DNA, whereas electroporation of P. pastoris yields 10³–10⁴ transformants per µg of linearized DNA [6], [29]. The low transformation efficiency of P. pastoris is unavoidable because the yeast cell wall is thicker than that of bacteria, and
the plasmid must pass through the wall and be integrated into a specific location in
the chromosome. This chromosomal integration increases the
stability of expression strains but reduces the transformation efficiency.

Multiple displacement amplification of the error-prone RCA product

To obtain plasmids with random mutation on a larger (µg) scale, we used Phi29 DNA
polymerase again for amplification of the linear DNA by means of MDA. Twelve samples
of the error-prone RCA product (0, 1, 2 mM Mn²⁺ versus 50, 100, 250, 500 pg of template) were diluted 10-fold and directly used as
templates for MDA amplification. As shown in Fig. 2b, almost the same level of amplification was obtained for all samples, indicating
that the initial RCA products are long enough to serve as templates of MDA. The amount
of linear plasmids obtained after restriction enzyme treatment was about 5 µg in a
total volume of 50 µl, which is enough for P. pastoris transformation.
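
The DNA-mass argument can be made concrete with a rough estimate. The sketch below multiplies the yields quoted in the text (~1 µg from error-prone RCA, ~5 µg after MDA) by the efficiency ranges quoted for E. coli and P. pastoris. The function name `expected_transformants` is hypothetical, and the linear scaling of colony number with DNA mass is a simplifying assumption that ignores saturation and the amount of DNA actually used per electroporation.

```python
# Rough upper-bound estimate of library size from DNA yield.
# Assumption (not from the paper): transformants scale linearly with DNA mass.

# Transformation efficiencies quoted in the text (transformants per µg DNA).
EFFICIENCY = {
    "E. coli": (1e8, 1e11),
    "P. pastoris": (1e3, 1e4),
}

# Approximate DNA yields quoted in the text (µg).
YIELD_UG = {
    "error-prone RCA": 1.0,
    "error-prone RCA + MDA": 5.0,
}


def expected_transformants(dna_ug, efficiency_range):
    """Return (low, high) expected transformants for a DNA mass given in µg."""
    low, high = efficiency_range
    return dna_ug * low, dna_ug * high


if __name__ == "__main__":
    for step, mass in YIELD_UG.items():
        for host, eff in EFFICIENCY.items():
            low, high = expected_transformants(mass, eff)
            print(f"{step:>22} | {host:<11} | {low:.0e} to {high:.0e}")
```

Under these assumptions the P. pastoris estimate stays in the range of thousands to tens of thousands of transformants, several orders of magnitude below E. coli, which is why a µg-scale DNA preparation matters for this host.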

Analysis of mutation frequency using large-scale sequencing

To evaluate the mutation frequency after the two-step amplification, the MDA products
were sequenced with an Illumina sequencer. The number of sequenced bases for each
error-prone RCA condition was 0.6–4 Gb/sample for the 4.4 kbp vector (Additional file
1: Table S1). After removal of low-quality bases, we analyzed errors in the cel6A region. As we used large-scale sequencing, we could calculate the mutation frequency
at each base (cel6A 1320 bp, A:255 T:256 G:330 C:479) for detailed analysis.
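
A per-base mutation frequency of this kind can be sketched as the fraction of reads at each reference position that differ from the reference base. The code below is only an illustration: the function names, the pileup data structure, and the toy counts are hypothetical, and the exact filtering and counting rules used for the real data are not specified here. It also checks that the quoted base composition adds up to the quoted gene length.

```python
# Per-base mutation frequency from read counts at each reference position.
# Assumption: a "mutation" is any read base that differs from the reference.

# Base composition and length of cel6A as quoted in the text.
CEL6A_COMPOSITION = {"A": 255, "T": 256, "G": 330, "C": 479}
CEL6A_LENGTH = 1320


def per_base_mutation_frequency(reference, pileup):
    """Return the mismatch fraction at each position (None where depth is 0).

    reference: string of reference bases.
    pileup: list of dicts mapping base -> read count, one dict per position.
    """
    if len(reference) != len(pileup):
        raise ValueError("reference and pileup must have the same length")
    freqs = []
    for ref_base, counts in zip(reference, pileup):
        depth = sum(counts.values())
        if depth == 0:
            freqs.append(None)
            continue
        mismatches = depth - counts.get(ref_base, 0)
        freqs.append(mismatches / depth)
    return freqs


def mean_frequency_per_kb(freqs):
    """Average the per-base frequencies and express the result per 1000 bases."""
    covered = [f for f in freqs if f is not None]
    if not covered:
        raise ValueError("no covered positions")
    return 1000.0 * sum(covered) / len(covered)


# Toy data (hypothetical, for illustration only).
toy_reference = "ACGT"
toy_pileup = [
    {"A": 998, "C": 2},
    {"C": 1000},
    {"G": 999, "T": 1},
    {"T": 1000},
]
toy_freqs = per_base_mutation_frequency(toy_reference, toy_pileup)
toy_mean_per_kb = mean_frequency_per_kb(toy_freqs)
```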

The minimum and maximum numbers of total sequenced bases at each position of cel6A and the averaged mutation frequencies are shown in Table 1. The maximum mutation frequency (2.60 kb⁻¹) was obtained with 2 mM Mn²⁺ and 100 pg template. The mode and distribution of mutation frequency were also analyzed,
and the results are shown as a histogram in Fig. 3. Under all conditions, the per-base mutation frequencies of cel6A were log-normally distributed. The number of errors increased with increasing concentration
of Mn²⁺, in accordance with Fujii’s findings [23]. Surprisingly, at the highest template amount (500 pg), the averaged mutation frequency
was high (2.07 kb⁻¹) even in the absence of Mn²⁺ in the reaction mixture. Thus, higher concentrations of Mn²⁺ and larger amounts of template tended to result in higher error rates.
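
To put the kb⁻¹ values in terms of the 1320 bp gene, a frequency can be multiplied by the gene length in kb: 2.60 kb⁻¹ and 2.07 kb⁻¹ correspond to about 3.4 and 2.7 substitutions per cel6A copy, respectively, if the averaged frequency is read as substitutions per kb per molecule. The sketch below performs this conversion and, as a further idealization that the data here do not establish, uses a Poisson model to estimate the fraction of gene copies carrying no substitution. Both function names are hypothetical.

```python
import math

CEL6A_LENGTH_BP = 1320  # gene length quoted in the text


def expected_mutations_per_gene(freq_per_kb, gene_length_bp=CEL6A_LENGTH_BP):
    """Convert a mutation frequency in kb^-1 to mutations per gene copy."""
    return freq_per_kb * gene_length_bp / 1000.0


def poisson_fraction_unmutated(mean_mutations):
    """P(0 mutations) under a Poisson model with the given mean.

    Assumption: mutations are independent and Poisson-distributed among
    molecules; real libraries may deviate from this.
    """
    return math.exp(-mean_mutations)


if __name__ == "__main__":
    for freq in (2.60, 2.07):  # averaged frequencies quoted in the text
        mean = expected_mutations_per_gene(freq)
        p0 = poisson_fraction_unmutated(mean)
        print(f"{freq:.2f} per kb: {mean:.2f} per gene, P(unmutated) = {p0:.3f}")
```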

Table 1. Mutation frequency

Fig. 3. Histogram of per-base mutation frequency on a logarithmic scale. The distributions
of mutation frequency in each reference position of the PcCel6A gene were fitted to a log-normal distribution (solid line). Peak locations are shown with vertical bars
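
One simple way to fit a log-normal distribution to per-base frequencies is to take the mean and standard deviation of their logarithms; on a log10 axis the fitted curve is then a Gaussian that peaks at the mean of the log10 values. The sketch below uses this moment-based approach, which is an assumption and not necessarily the fitting procedure behind the figure. The function name is hypothetical, and positions with zero frequency are excluded because their logarithm is undefined.

```python
import math
import statistics


def fit_lognormal_log10(freqs):
    """Fit a log-normal distribution by moments of log10(frequency).

    Returns (mu, sigma, peak), where mu and sigma are the mean and sample
    standard deviation of log10(f), and peak = 10**mu is the location of
    the maximum of the fitted curve on a log10 axis.
    Zero or negative frequencies are skipped (assumption).
    """
    logs = [math.log10(f) for f in freqs if f > 0]
    if len(logs) < 2:
        raise ValueError("need at least two positive frequencies")
    mu = statistics.fmean(logs)
    sigma = statistics.stdev(logs)
    return mu, sigma, 10 ** mu


# Toy frequencies (hypothetical, for illustration only).
toy_mu, toy_sigma, toy_peak = fit_lognormal_log10([1e-4, 1e-3, 1e-2, 0.0])
```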

The types of substitution mutation varied, as shown in Additional file 1: Tables S2–S5. The mutation frequency of each substitution is shown on the left side
of each column to illustrate any inherent bias of this error-prone RCA-MDA method,
and the proportion of the total mutations in cel6A is shown in parentheses. A-to-C substitutions tended to show
high mutation frequency, reaching 1.44–1.65 kb⁻¹ under 9 conditions. The transition/transversion ratio was 0.4–0.6, except for the
condition with 1 mM Mn²⁺ and 50 pg template. These results differ from Fujii’s findings, in which mutations
were strongly biased in favour of C to T and G to A (66 %) and the transition/transversion
ratio was 2.7 [23]. The distribution of mutations was analysed in relation to GC content at
each base of cel6A (Additional file 1: Figure S1). There were some regions where mutations continuously appeared (positions
770–830, 1000–1050), indicating that errors were more frequently distributed in GC-rich
regions (Additional file 1: Figures S2–S3). The mutations were also found in the whole vector regions including
promoter regions and selection marker gene (Additional file 1: Figure S4), which potentially affect the transformation efficiency or protein productivity.
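
The transition/transversion ratio can be computed from a table of substitution counts by classifying each substitution as purine-to-purine or pyrimidine-to-pyrimidine (transition) or purine-to-pyrimidine and vice versa (transversion). The sketch below shows this classification; the function names and the input format are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for A<->G or C<->T substitutions, False for transversions."""
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(substitution_counts):
    """Return transitions / transversions.

    substitution_counts: dict mapping (ref, alt) -> observed count.
    """
    ts = sum(n for (r, a), n in substitution_counts.items() if is_transition(r, a))
    tv = sum(n for (r, a), n in substitution_counts.items() if not is_transition(r, a))
    if tv == 0:
        raise ValueError("no transversions observed")
    return ts / tv


# Reference case: all twelve substitution types equally frequent.
uniform_counts = {(r, a): 10 for r in "ACGT" for a in "ACGT" if r != a}
uniform_ratio = ts_tv_ratio(uniform_counts)
```

Because there are four transition types and eight transversion types, equal counts for all twelve substitutions give a ratio of 0.5. The 0.4–0.6 range reported here is close to that value, whereas a ratio of 2.7 indicates a strong transition bias.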

Transformation, enzyme production, and measurements of cellulase activities

Error-prone RCA-MDA products obtained under two conditions (2 mM manganese and 100,
250 pg template) were successfully transformed into P. pastoris after restriction enzyme digestion. The number of colonies was ~100 per plate, which
is slightly fewer than the number of wild-type colonies obtained by the usual plasmid
preparation protocol (100–300 colonies per plate).

All colonies from error-prone RCA-MDA plates, excluding small colonies, and 96 colonies
from the control plate were incubated in liquid culture for cellulase production.
Then the amorphous cellulose (PASC)-degrading activity of the crude enzymes was measured,
and the results obtained for the transformants of error-prone RCA-MDA are shown with
those of wild-type in Fig. 4. The activities of wild-type transformants were dispersed (Fig. 4c), indicating that the activities are influenced by differing amounts of enzyme
in the culture resulting from differences in enzyme productivity. In particular,
multiple gene integration events occur with detectable frequency and greatly enhance
the expression level of a target protein [30], [31]. The proportions of transformants whose activities were less than 10 % of that of wild-type
PcCel6A (the median activity of 96 control transformants was 0.37 mM glucose in a 2-hour
incubation) were 40 and 37 % under the conditions with 100 and 250 pg template, respectively
(Fig. 4a, b). In contrast, the corresponding proportion for wild-type PcCel6A was 2 % (2 of 96 colonies, Fig. 4c). We consider that the high proportions of transformants with markedly lowered activity
from the error-prone RCA-MDA plates are mainly due to the introduction of mutations.
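
The low-activity criterion can be written as a short calculation: take the median activity of the control transformants as the reference, set the cutoff at 10 % of that value, and count the transformants below it. The sketch below uses hypothetical activity values and a hypothetical function name; only the 0.37 mM median, the 10 % cutoff, and the 2-of-96 control count come from the text.

```python
import statistics


def fraction_below(activities, reference_activity, cutoff=0.10):
    """Fraction of activities strictly below cutoff * reference_activity."""
    if not activities:
        raise ValueError("activities must not be empty")
    threshold = cutoff * reference_activity
    below = sum(1 for a in activities if a < threshold)
    return below / len(activities)


# Toy data in mM glucose (hypothetical, for illustration only).
toy_control = [0.30, 0.37, 0.45]
toy_library = [0.01, 0.02, 0.40, 0.35]

reference = statistics.median(toy_control)      # 0.37, as in the text
toy_fraction = fraction_below(toy_library, reference)

# Control proportion quoted in the text: 2 of 96 colonies.
control_percent = 100 * 2 / 96
```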

Fig. 4. Histogram of amorphous cellulose (PASC)-degrading activities of transformants. a The activities of 87 transformants obtained by error-prone RCA-MDA under the conditions
of 2 mM Mn²⁺ and 100 pg template. b The activities of 100 transformants obtained by error-prone RCA-MDA under the conditions
of 2 mM Mn²⁺ and 250 pg template. c The activities of 96 transformants obtained by the usual plasmid preparation protocol
(wild-type control)

Degradation of amorphous and crystalline celluloses by mutants

Next, we compared the activities of the crude mutant enzymes to degrade amorphous
and crystalline celluloses. Figure 5a shows the amorphous cellulose (PASC)-degrading versus crystalline cellulose (cellulose III_I)-degrading activities of 87 randomly selected transformants from error-prone RCA-MDA
plates, which had activities of more than 10 % of that of the wild type in Fig. 4. The data points in blue in Fig. 5a show the transformants for which DNA sequencing revealed no mutation or no change
in amino acid sequence. Mutants that had at least one mutation in the cel6A gene are shown in red in Fig. 5a. The activities of these mutants were affected by relatively large differences in
protein expression levels, as the total protein concentrations of culture supernatants
varied from 0.01 to 0.2 mg/ml. The results of SDS-PAGE analysis of some mutants
are shown in Additional file 1: Figure S5.

Fig. 5. a Plot of amorphous cellulose (PASC)-degrading activity versus crystalline cellulose
(cellulose III_I)-degrading activity of PcCel6A mutants. Forty-two transformants were selected from error-prone RCA-MDA plates
under the conditions of 2 mM Mn²⁺ and 100 pg template (numbered 1–42), and 45 transformants from error-prone RCA-MDA
plates under the conditions of 2 mM Mn²⁺ and 250 pg template (numbered 43–87), and their activities were measured. The transformants
with no mutation or no change in amino acid sequence are indicated in blue. Mutants with at least one mutation in the cel6A gene are shown in red. Mutants discussed in the text are shown in black. b The mutations found in mutants #13, #15, and #23. The structures of PcCel6A catalytic domain and CBM were modeled and the locations of altered amino acids
are indicated with the mutant numbers. The active site loops of the catalytic domain
are colored in cyan and the direction of the incoming cellulose chain is indicated by an orange arrow. Two disulfide bridges in CBM are colored in yellow

Most of the mutants had an amorphous/crystalline cellulose-degrading activity ratio
similar to that of the wild type, though two mutants showed a clearly different
character. Mutants #13 and #15 had lowered degrading activity towards crystalline
cellulose III_I while retaining activity towards amorphous cellulose. The DNA sequencing of #15 revealed
a single mutation, W267C. This tryptophan residue is located at the entrance of the
active site tunnel of PcCel6A (Fig. 5b), and the importance of the residue has been demonstrated in Trichoderma reesei Cel6A: it is not necessary for hydrolysis, but is required for loading a cellulose
chain from the crystalline surface [32]. On the other hand, the DNA sequencing of #13 revealed several mutations: C25Y, A105D,
and G346D, as well as two mutations that would not cause any amino acid substitution.
The most influential mutation is probably C25Y, because C25 is expected to form a
disulfide bridge with C8 in the carbohydrate-binding module (CBM) of PcCel6A (Fig. 5b), and this mutation is expected to result in reduced affinity for and adsorption on
the crystalline surface [33]. However, the DNA sequencing traces of #13 showed overlaps with the wild-type sequence,
probably due to multiple-copy gene integration, although multiple integration events
occur in less than 10 % of transformed colonies [34]. Such integration might result in the production of different proteins in a single Pichia cell, so that the measured activity would be that of a mixture of variants. Interestingly, deletion of
17 amino acids at the C-terminus did not cause a significant change of activity, as
seen in the case of mutant #23, which has a stop codon at W423 and two silent mutations.
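
The size of the truncation in mutant #23 can be checked against the gene length with simple arithmetic. The sketch below assumes, without confirmation from the text, that the 1320 bp cel6A sequence includes the terminal stop codon and that residue numbering starts at the first codon of that sequence. Under these assumptions the protein has 439 residues, and a stop codon replacing W423 removes residues 423 to 439. The function name is hypothetical.

```python
CEL6A_LENGTH_BP = 1320  # gene length quoted in the text


def residues_removed_by_stop(stop_position, protein_length):
    """Number of residues lost when a stop codon replaces residue stop_position."""
    if not 1 <= stop_position <= protein_length:
        raise ValueError("stop position outside the protein")
    return protein_length - stop_position + 1


# Assumption: the 1320 bp include one terminal stop codon.
codons = CEL6A_LENGTH_BP // 3
protein_length = codons - 1

removed = residues_removed_by_stop(423, protein_length)
```

With these assumptions the result is 17 residues, which agrees with the deletion size stated for mutant #23.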

A major advantage of the P. pastoris expression system is that the enzyme is secreted extracellularly, so that the cellulase
activity of PcCel6A mutants can be easily measured by the direct use of culture filtrates. In the
present study, we found two mutants with altered crystalline cellulose degradation
by PcCel6A in relatively small libraries, supporting the idea that P. pastoris is a useful system for screening of secreted proteins. The use of P. pastoris could be especially advantageous for screening of enzymes with insoluble substrates
or substrates that are unable to diffuse through cell membranes. Moreover, by utilizing
β-glucosidase-producing P. pastoris, direct screening in terms of growth difference might be possible.