Analysis of predicted proteins

Fig. 4 shows the predicted amino acid sequences of AtFPG-1, -1a, -2, -3, and -4.

Exons 1, 2, 3, 5, 6, and 7 were entirely conserved in all the Arabidopsis cDNA clones. The polypeptide chains encoded by exons 1, 5, 6, and 7 represent the major regions conserved between the Arabidopsis and bacterial FPGs, showing between 29 and 54% identity between the Arabidopsis and E. coli amino acid sequences. The N-terminal sequence encoded by exon 1 and the lysine encoded by exon 5 (K155 of E. coli FPG) have been associated with the active site. The predicted amino acid sequence encoded by exon 1 shows a surprising similarity to a sequence from DNA photolyase, another DNA repair enzyme but one otherwise quite unrelated to FPG (Fig. 5). If this similarity relates to DNA binding, it might explain how AtFPG-2, which lacks the C-terminal DNA-binding region present in AtFPG-1 (or the zinc finger of E. coli FPG), might have the DNA cleavage activities measured by Gao and Murphy (Photochem. Photobiol., in press).
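The percent identity figures quoted above depend on how aligned positions are counted. As an illustration only, the following minimal Python sketch computes identity over a pairwise alignment, assuming that positions containing a gap in either sequence are excluded from the denominator. The text does not state which convention or alignment program was used, and the two aligned strings are invented toy sequences, not FPG sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment.

    Assumption: columns with a gap in either sequence are skipped.
    Other conventions (e.g. dividing by the shorter sequence length) exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment (not real FPG data): 9 ungapped columns, 7 matches.
toy_a = "ACDEFG-HIK"
toy_b = "ACDQFGWHLK"
print(round(percent_identity(toy_a, toy_b), 1))
```

Because the denominator changes with the gap convention, identity values for short exon-sized segments are best compared only when computed the same way.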

The optional exons are exon 4 (missing from AtFPG-3), exons 8-13 (present and missing in various combinations), and exon 14 (mostly missing from AtFPG-2). The dispensability of exon 4 and the predominantly hydrophilic nature of the amino acids it encodes suggest that this sequence forms a discrete loop on the surface of the enzyme rather than an essential part of the protein core. The clear difference in exon selection in the exon 8-13 region between AtFPG-1/1a and AtFPG-3/4 makes surprisingly little difference in the amino acid sequence encoded by this region. AtFPG-3/4 match the sequence of AtFPG-2 at the beginning of the region but, by staying in frame, recover identity with the AtFPG-1/1a sequence. We have not been able to locate any motifs in, or congeners of, the sequences encoded by exons 4 or 8-13 that would suggest a function for them. The polypeptide chain encoded by exon 14 has a DNA-binding motif described by Ohtsubo et al. (1998) as well as a bipartite nuclear localization signal, and both the motif and nuclear localization are expected to be shared by AtFPG-1, -1a, -3, and -4.
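The description of the exon 4 product as predominantly hydrophilic can be made quantitative with a mean hydropathy score. The sketch below assumes the Kyte-Doolittle hydropathy scale, which is not named in the text, and scores two invented peptides rather than the actual exon 4 sequence. Negative means indicate a hydrophilic segment.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not specified in the text).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide: str) -> float:
    """Average Kyte-Doolittle hydropathy of a peptide (one-letter codes)."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    return sum(KYTE_DOOLITTLE[res] for res in peptide.upper()) / len(peptide)


# Hypothetical peptides for illustration only.
print(mean_hydropathy("KDESKRNE"))  # charged/polar residues: negative mean
print(mean_hydropathy("ILVAFLIV"))  # aliphatic residues: positive mean
```

A strongly negative mean over an optional segment would be consistent with, though not proof of, a surface-exposed loop.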

The amino acid sequences of RTPCR-A, -B, and -C can be predicted by assuming that the mRNAs they represent had the same selection of N-terminal exons as the full-length clones (either with or without exon 4). Fig. 6 shows that, surprisingly, the lengths of the predicted amino acid sequences are inversely related to the lengths of the PCR fragments. RTPCR-A terminates at a stop codon near the beginning of exon 8; RTPCR-B (AtFPG-2), at the beginning of exon 10; and RTPCR-C, which starts at exon 10 but in a different reading frame from RTPCR-B, near the end of exon 12. None of these contains a nuclear localization signal.
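The dependence of the termination point on reading frame, as in the contrast between RTPCR-B and RTPCR-C in exon 10, can be illustrated with a short Python sketch. It translates a nucleotide sequence in a chosen frame, using the standard genetic code, until the first stop codon. The function name and the 18-nucleotide input are hypothetical and serve only to show the principle; they are not taken from the AtFPG clones.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_to_stop(cdna: str, frame: int = 0) -> str:
    """Translate from offset `frame` (0, 1, or 2) up to the first stop codon.

    A trailing incomplete codon is ignored. Only A, C, G, T/U are accepted.
    """
    cdna = cdna.upper().replace("U", "T")
    residues = []
    for pos in range(frame, len(cdna) - 2, 3):
        amino_acid = CODON_TABLE[cdna[pos:pos + 3]]
        if amino_acid == "*":  # stop codon ends the predicted protein
            break
        residues.append(amino_acid)
    return "".join(residues)


# Hypothetical toy sequence: the same nucleotides give products of
# different lengths depending on the frame in which they are read.
toy = "ATGGCCTGAAAGGTTTAA"
for frame in range(3):
    print(frame, translate_to_stop(toy, frame))
```

In the toy input, frame 0 meets a stop codon at the third codon, whereas the other two frames read through to the end of the sequence. The same effect allows a shorter PCR fragment to encode a longer predicted protein.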

Although the cDNA clones and RT-PCR results demonstrate that these variant FPG mRNAs exist in Arabidopsis, we have not yet demonstrated that they are translated. Nevertheless, it is interesting to speculate on why variant FPGs are potentially present in a eukaryotic plant but not in bacteria. The question is even more interesting because plants apparently have a gene for OGG, an enzyme with similar specificity that is present in yeast and human cells but not in bacteria, making plants unique in having both FPG and OGG. We suggest that the various enzymes function in different organelles, as has been suggested for OGG (Nishioka et al., 1999), uracil-DNA glycosylase (Nilsen et al., 1997), MTH1 (Oda et al., 1999), and MutY homologs (Ohtsubo et al., 2000) in human cells. AtFPG-2, RTPCR-A, and RTPCR-C, which lack the nuclear localization signal, are candidates for plastid or mitochondrial enzymes, but we have not been able to identify clear transit sequences in them. The various enzymes may also differ in their specificity for different base alterations. For instance, Gao and Murphy (Photochem. Photobiol., in press) have shown that AtFPG-1, but not AtFPG-2, cleaves oligonucleotides containing 8-oxo-G. Another possible difference might be in the binding of the FPG to proteins involved in subsequent steps of base excision repair, thus favoring small-patch or large-patch repair. Finally, differences in mRNAs that do not change the translation product, for instance between AtFPG-1 and -1a, may yet be significant if they affect the transport of the mRNA, especially between cells. Intercellular transport of RNA may be a major developmental process in plants (Jorgensen et al., 1998).