Jan 1, 2021
Abstract
‘HoneySweet’ plum (Prunus domestica) is resistant to Plum pox potyvirus through an RNAi-triggered mechanism. Determining the precise nature of the transgene insertion event has been complicated by the hexaploid genome of plum. DNA blots previously indicated an unintended hairpin arrangement of the Plum pox potyvirus coat protein gene as well as a multicopy insertion event. To confirm the transgene arrangement of the insertion event, ‘HoneySweet’ DNA was subjected to whole genome sequencing using Illumina short-read technology. Results indicated two different insertion events, one containing seven partial copies flanked by putative plum DNA sequence and a second with the predicted inverted repeat of the coat protein gene driven by a double 35S promoter on each side, flanked by plum DNA. To determine the locations of the two transgene insertions, a phased plum genome assembly was developed from the commercial plum ‘Improved French’. A subset of the scaffolds (2447) that were >10 kb in length and represented >95% of the genome was annotated and used for alignment against the ‘HoneySweet’ transgene reads. Four of eight matching scaffolds spanned both insertion sites, which were 157,704 to 654,883 bp apart; however, we were unable to identify which scaffold(s) represented the actual location of the insertion sites because of potential sequence differences between the two plum cultivars. Regardless, there was no evidence of any gene(s) being interrupted as a result of the insertions. Furthermore, RNA-seq data verified that the insertions created no new transcriptional units and no dramatic expression changes of neighboring genes.
Introduction
‘HoneySweet’ plum (Prunus domestica) is highly resistant to the devastating disease Sharka, for which the causal agent is Plum pox potyvirus (PPV) 1, 2. ‘HoneySweet’ is derived from a ‘Bluebyrd’ × unknown pollen parent seed that had been transformed with a PPV coat protein gene (CP), driven by the 35S promoter 3.
Standard molecular analyses including DNA blots, RNA blots, and protein blots, suggested that there were 3–4 copies of the CP gene inserted, yet very low levels of CP RNA and no detectable CP 3 . Further analyses deduced that ‘HoneySweet’ was resistant to PPV through an RNAi mediated response 4 . The introduced CP was methylated, transcription rates were high, but RNA was undetectable outside of the nucleus and PPV RNA was not detectable after challenge inoculations. Further DNA blotting experiments suggested that there was a rearrangement of the inserted DNA resulting in a predicted hairpin of two CP genes as well as a separate multicopy arrangement of the transgenes 4 . To verify this, a BAC library was constructed from ‘HoneySweet’ DNA. Two BAC clones with NPT sequence but not containing the CP were isolated along with a single clone that contained the CP gene. Upon sequencing the CP clone and confirmation by PCR, it was found that this fragment contained an inverted duplication resulting in a tail-to-tail arrangement of the CP gene as well as a double 35S promoter sequence driving each of the copies. An incomplete 3′UTR was situated between the two copies of the CP gene resulting in a short intervening region that was not duplicated. This CP hairpin fragment was transformed back into plum and plants were tested for resistance. Four independent lines (out of 8) showed high levels of resistance through three cycles of dormancy. These results confirmed that this unintended hairpin could be responsible for the resistance to PPV of ‘HoneySweet’ 5 . To more precisely define the insertion event(s) within ‘Honeysweet’ and their potential impacts, NextGen sequencing was used to sequence ‘HoneySweet’ 6 . In addition, the hexaploid Prunus domestica ‘Improved French’ was sequenced and phased. The phased genome was then used to identify the insertion sites within the ‘HoneySweet’ genome. 
Comprehensive RNA-seq analyses were performed to evaluate whether the insertion event(s) resulted in the production of new transcripts or altered the expression of neighboring genes.
Results
Plum genome assembly and annotation
‘Improved French’, from which a great majority of the commercial production of dried plums (prunes) is derived, was chosen to provide a phased genome sequence, such that all six copies of each chromosome would be represented. This was performed by NRGene using second-generation sequencing resulting in 210× coverage and third-generation sequencing resulting in 55× coverage. The data were assembled into 27,870 scaffolds representing a genome of 1,399,321,220 bases (Table 1). The conserved gene set Benchmarking Universal Single-Copy Orthologs (BUSCO) 7 was used to evaluate the completeness and accuracy of the genome assembly: 1385 (96.2%) of the genes were found to be complete, of which 1318 were duplicated. In addition, there were 9 fragmented genes (0.6%) and 46 missing genes (3.2%). Table 1 presents summaries of various aspects of the genome. The assembled genome was then annotated, resulting in 130,866 gene models, and is available on the Genome Database for Rosaceae (GDR), https://www.rosaceae.org/ 8.
Table 1 Phased plum genome description
The TPMs were recalculated for these potentially significant genes using only the unique reads, as these were the only transcripts that could definitively be mapped to each allele. On doing this, only three genes had an average TPM with a standard deviation that did not overlap with ‘Stanley’. All three had very low expression, from 0 to 1 TPM per library, which for most of the libraries meant fewer than five reads (Table 2). Heat maps were constructed from the TPM values for the unique reads (Tables S5 and S6). These show that the variation among the sampled trees is, for the most part, greater than the variation between cultivars, similar to that seen when sampling fruit composition from these trees 10.
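The TPM (transcripts per million) values discussed above follow a standard normalization: read counts are divided by gene length, then scaled so that all genes sum to one million. The analyses in this study were run in CLC Genomics Workbench, so the function below is only a minimal sketch of the usual TPM definition and not the authors' pipeline; the names `tpm`, `counts`, and `lengths_bp`, and the example numbers, are hypothetical.

```python
def tpm(counts, lengths_bp):
    """Return TPM values given per-gene read counts and gene lengths in bp.

    Hypothetical helper illustrating the standard TPM definition;
    it is not taken from the study's software.
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Reads per kilobase: normalize each count by gene length in kb.
    rpk = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    total = sum(rpk)
    if total == 0:
        # No reads at all: every gene has zero expression.
        return [0.0] * len(counts)
    # Scale so that the values sum to one million.
    return [r / total * 1e6 for r in rpk]


if __name__ == "__main__":
    # Made-up counts and lengths for three genes.
    print(tpm([10, 20, 30], [1000, 2000, 3000]))
```

Because TPM is a proportion of the library, a gene with fewer than five unique reads stays near 0 to 1 TPM in any deep library, which is why such genes were treated as too weakly expressed to interpret.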
To confirm that no new genes were formed by the insertion events, all the RNA-seq reads from the 16 ‘HoneySweet’ libraries were mapped to the two predicted insertion events, which included ~1000 bases of flanking plum sequence (Fig. S6). Only the expected reads for the coding sequences of NPTII, CP, UIDA, and partial transcripts from the interrupted BLA gene in MUA-10 sequence were detected. A few random transcripts were detected but were less than one read per library and hence not significant.
Discussion
‘HoneySweet’ was chosen as the PPV resistant line to carry through to commercialization because it was the only line, out of over 100 independent transformants, with stable resistance to PPV infections in a containment greenhouse. Unfortunately, or fortunately, it had a complex arrangement of transgenes as initially determined by DNA blotting 1, 3, 4. Unfortunate, since understanding the complex arrangement would be difficult, but fortunate, since the original transgene construct was intended to be an overexpression of the PPV-CP gene, but the complex arrangement resulted in an RNAi mode of resistance before we understood RNAi. Rearrangements of insertions are not unusual with Agrobacterium transformation; other insertional effects also occur, including but not limited to translocations, deletions, and incorporation of additional DNA 11. Normally a plant line with a complex arrangement of the introduced transgenes would be discarded at an early stage because of the difficulty of predicting the effects on flanking genes as well as potential segregation issues when used in a breeding program. But because of the value of a PPV resistant line of plum, ‘HoneySweet’ was kept and subjected to more extensive analyses of the insertion event. The CABI Invasive Species Compendium states that “Plum pox virus disease (Sharka) is one of the most destructive diseases of stone fruits.” (https://www.cabi.org/isc/datasheet/42203).
The purpose of understanding transgene insertion sites is to verify that no genes have been interrupted or influenced, potentially causing variations in expression that could lead to undesirable phenotypic changes in the transgenic plant. Secondly, that knowledge is used to verify that no new RNAs and potential proteins are generated. And lastly, to verify that no unexpected DNA insertions have happened from carrier DNA such as Agrobacterium or even E. coli used in the propagation of the vectors. Previous work had suggested that there were two insertions in ‘HoneySweet’ plum, one containing a multicopy, rearranged version of the introduced transgenes (insertion 1) and the second containing an inverted repeat of the CP gene from PPV 4 , 5 (insertion 2). The three flanking plum DNA borders from BAC sequence were compared to the peach genome assembly, and indicated that the insertions were in the same region of Pp08 and did not interrupt any genes. To verify this assumption, specific plum genome information was needed. We utilized newer technologies 12 , 13 , 14 , 15 , including next generation sequencing of ‘HoneySweet’, and the assembly of a phased genome for the major plum cultivar, ‘Improved French’. ‘Improved French’ was chosen for the genome assembly because it is the major cultivar for the dried plum Industry representing over 65% of the world crop for dried plums. ‘Bluebyrd’, the maternal parent of ‘HoneySweet’, would have provided half of the genome but is only a minor cultivar. The sequence of ‘Improved French’ will have a greater impact on future breeding efforts. Whole genome sequencing of ‘HoneySweet’ yielded a complete picture of the rearranged insertion event 1 (Fig. 1 ) that contained seven copies of the transgene insert with various deletions and inversions. These were determined using junction sequences that contained part of the transgene insert joining either plum DNA or an unexpected part of the transgene insert. 
This also yielded two border sequences, though because of the short reads, the new border sequence could only be extended by the length of a short read. Those reads that overlapped the border junction were partially extended with overlapping reads only if there were unique SNPs to verify that the overlapping read was distinct from all others, indicating it was part of the same chromosome. Because few of the overlapping reads had unique SNPs, a consensus sequence was used beyond that, which represented the sequence from a majority of the six different chromosomes. The other border sequence was compared to the BAC sequence, which represented only the chromosome that contained the insertion event. The second insertion event, the hairpin, had been completely covered by a BAC sequence, but was confirmed in the whole genome sequencing by three unique junction sequences as predicted. Reads were then matched to the BAC sequence to verify the plum flanking sequence. When it had been previously aligned in the peach genome, the flanking sequences were found in multiple places and the placement was based on the relative distance to a coding region of one gene found in the BAC flanking sequence, ppa007619, which is equivalent to Prupe.8G066600 in the peach genome assembly 2.0 (ref. 9). To specifically locate the insertion events in the plum genome, a phased plum genome was developed from ‘Improved French’ DNA that would ideally separate out the sequences for the six copies of each chromosome. To gauge the assembly, several factors were measured. The first was size (Table 1). The assembly size was 1,399,321,220 bases; because plum is hexaploid, the expected size is ~6× the peach genome size (2.274 × 10^8 bases haploid), or ~1.3644 × 10^9 bases. The second was the representation of genes expected to be present as single-copy genes, the BUSCO evaluation (Tables 1 and S7).
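The size comparison above amounts to a short calculation. As a minimal sketch (the function and variable names are illustrative and not from the study), the expected hexaploid size and its ratio to the reported assembly size can be checked as follows, using only the figures quoted in the text:

```python
# Figures quoted in the text.
PEACH_HAPLOID_BP = 2.274e8      # haploid peach genome size, in bases
PLOIDY = 6                      # plum is hexaploid
ASSEMBLY_BP = 1_399_321_220     # phased 'Improved French' assembly size


def expected_polyploid_size(haploid_bp, ploidy):
    """Expected total size if every chromosome copy is assembled separately."""
    return haploid_bp * ploidy


if __name__ == "__main__":
    expected = expected_polyploid_size(PEACH_HAPLOID_BP, PLOIDY)
    ratio = ASSEMBLY_BP / expected
    print(f"expected size: {expected:.4e} bases")
    print(f"assembly / expected: {ratio:.3f}")
```

A ratio close to 1 indicates that the assembly is about the size expected if all six chromosome copies were represented; it does not by itself show that each copy was assembled as a separate phase.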
The vast majority of the genes were found in the plum assembly (96%) and, interestingly, only 30 of the 1385 were found with six alleles; the majority were found with five alleles (636), with decreasing numbers in 4, 3, and 2 phases and only 76 with a single phase. This reflects not only the hexaploid nature but the high degree of heterozygosity in plum. The last measure of assembly was the number of genes annotated. Again, the expectation would be that it could be up to six times that of the haploid peach (26,873). The plum assembly yielded 130,866 gene models or 4.8× that of the peach. This is consistent with the BUSCO evaluation where only ~2% had six alleles. With a newly assembled phased plum genome, the sequences flanking the insertions of transgenes were used to determine which plum scaffolds could contain the insertions. Scaffolds 1234, 1429, 2675, and 1650 were identified that had sequence homology to the flanking regions of both insertion 1 and insertion 2. A fifth scaffold, Scaffold 1332, had homology to only insertion 1 flanking sites, and three additional scaffolds, 4101, 4359, and 6796, had homology to insertion 2 flanking sites. Seven of these scaffolds (all except Scaffold 1332) had significant synteny to a region of the peach Pp08. Based on the synteny and the mapping, we hypothesize that the eight scaffolds represent six syntenic regions of plum. Scaffolds 1234, 1429, and 1650 each represent one unique phase (one copy of the chromosome). Scaffolds 1332 and 4101 might represent the same phase, as one begins where the other ends. Scaffold 2675 and Scaffold 6796 may represent the fifth copy of the chromosome. That hypothesis is based on the decreased number of DMR6 sites in Scaffold 2675, which ends prior to the region of synteny from Prupe.8G066700.1, which is where Scaffold 6796 begins with multiple DMR6 sites. This leaves Scaffold 4359, which has only sequences related to insert 2.
This scaffold may represent a sixth phase that is homologous at insertion 1 flanking sites with one of the other phases but diverges afterwards. In the assembly of phased sequencing reads, regions of enough homology will not separate into different phases leading to Scaffolds that do not cover the homologous regions. The fact that the two insertions are in a region of the chromosome that contains two repetitive genes, makes it very difficult to identify the specific scaffold that represents the insertion events. The first insert is near a series of arabinogalactan-9 family genes, but because of the uniqueness of the genes flanking insertion 1, the scaffolds representing that region are quite clear. The second insertion is between repetitive motifs which turn out to be genes from the super family 2-oxoglutarate-dependent dioxygenase or DMR6 like. This has been found to be represented conservatively by more than 100 genes in Arabidopsis 16 , most of which map in clusters. In these plum scaffolds the genes are present multiple times with small variations in sequences. The variation in the scaffolds with homology to insertion 2 plum borders may be due to the variability in the number of members of this DMR6-like super family. The identification with certainty, of the specific scaffold that represents where the insertions are, was not possible. The closest homology with ‘HoneySweet’ is Scaffold 2675, which was very homologous near insertion 1, and Scaffold 4359, which had the most homology at insertion 2. The uncertainty may be related to intra-specific diversity between the two plum cultivars (‘Improved French’ vs. ‘HoneySweet’) and the nucleotidic/structural differences between homologous chromosomes. Without longer sequencing reads for ‘HoneySweet’, this uncertainty remains. Regardless of which scaffold represents the insertion events, none of them interrupt predicted genes. The insertions are in intergenic spaces. 
Even though the insertions are not in genes, they might have had an influence on the expression of flanking genes. When the RNA expression from leaves and fruit from eight ‘HoneySweet’ trees and ten ‘Stanley’ trees was compared, a number of flanking genes from the eight scaffolds had statistically significant differences. But when the comparison used only reads that were unique to the flanking genes, as opposed to other related or family members, only four genes showed real differences; for three of these the expression did not yield enough reads (fewer than 5) to be meaningful, and the fourth was also ~1 TPM, again quite low. So our conclusion is that very little change in expression could be attributed to the presence of the insertions rather than to differences in environment. Another aspect of understanding the insertion events of transgenic plants is to understand the breeding potential. One of the problems for plum genetics has been the polyploid aspect, where some traits appeared to segregate in a diploid manner, yet the genome is hexaploid. In the case of the ‘HoneySweet’ transgenes, UIDA and the PPV-CP both appear to segregate close to a 1:1 ratio 17, 18. The UIDA and PPV-CP also co-segregate, suggesting that the two insertions are on the same linkage group and close together. Most single-copy genes looked at in the BUSCO analysis of the plum genome were found in five different scaffolds, implying that there were five variations of those regions. The sixth should/could be a near duplicate of one of those five. In addition, studies looking at the origin of the hexaploid Prunus domestica concluded that it consisted of at least two different Prunus genomes 19, 20, 21, an interspecific hybrid of a diploid P. cerasifera and a tetraploid P. spinosa that itself may have been an interspecific hybrid of P. cerasifera and an unknown Eurasian plum species 21, 22, 23. It could be then that two copies come from one species and the other four from a second species.
When they assort in meiosis, the four randomly assort and the two segregate, so any allele on one of those two segregates as a diploid while the others segregate randomly as a tetraploid. Since it appears that the ‘HoneySweet’ transgenes segregate as a diploid, they should be located on one of the diploid chromosomes. Looking at the diversity at the region of the plum genome where the insertion events are, it could be easily understood that the chromosomes might not all randomly assort because the species divergence did not allow them to pair. In conclusion, there were two insertion events of the introduced transgenes in ‘HoneySweet’, both resulting in rearrangements and deletions, such that one insertion contained seven modified copies of the transgenes but with at least two complete copies of each gene, and the second insertion event resulted in a hairpin arrangement of the PPV-CP transgene driven by a double 35S promoter on each end. These insertion events were each associated with a small deletion of plum DNA, 33 or 39 bases, and were not in any predicted gene. Neither of these insertion events had a dramatic effect on flanking gene expression. Lastly, the insertion events are in a region of the plum genome that has a high level of diversity amongst the different ‘chromosomes’, but without further long-read sequencing of ‘HoneySweet’ in this region, the specific ‘chromosome’ could not be determined.
Materials and methods
‘Improved French’ DNA extraction, sequencing, and assembly
DNA from young leaves (at the end of bud burst) of ‘Improved French’ was extracted using a modified protocol of Kubisiak et al. 24. Briefly, nuclei were isolated using extraction buffer (0.35 M sorbitol, 10% polyethylene glycol 8000, 0.2% bovine serum albumin, 10 mM Tris-HCl (pH 8.0), 10 mM EDTA, 1 mM spermidine, 1 mM spermine, and 1% β-mercaptoethanol).
Pelleted nuclei were washed with organelle wash buffer containing 0.35 M sorbitol, 10 mM Tris-HCl (pH 8.0), 10 mM EDTA, 1 mM spermidine, 1 mM spermine, and 1% β-mercaptoethanol. The DNA from nuclei was extracted by treatment with proteinase K (5 µl per 400 µl) in lysis buffer containing 0.5% N-lauryl sarcosine, 1% CTAB, 0.7 M NaCl at 65 °C for 12 min, phase-separated by extraction with an equal volume of chloroform:isoamyl alcohol (24:1 vol/vol), and precipitated with isopropanol (1:1 vol/vol). The DNA was washed twice in 70% ethanol, dried at room temperature, and resuspended in 0.01 M Tris-HCl, pH 8.0. Then, the DNA was treated with 3 µl of the Ambion® RNase Cocktail™ (Thermo Fisher Scientific Inc., USA) for 30 min at 37 °C, followed by chloroform:isoamyl alcohol (24:1 vol/vol) extraction and precipitation with two volumes of ethanol. Finally, the DNA was resuspended in 100 µl of 0.01 M Tris-HCl, pH 8.0. The quality and integrity of the DNA were evaluated using a Qubit 2 Fluorometer (Thermo Fisher Scientific Inc., USA) and a NanoDrop ND-8000 (Thermo Fisher Scientific Inc., USA), followed by electrophoresis on 1% agarose gels. Sequencing and assembly were performed by NRGene (San Diego, CA) using a combination of Illumina™ technologies including paired-end reads, mate-pair reads of differing sizes, and sequencing and assembly of Chromium 10x libraries (Table S8). The sequencing data were processed and assembled using DeNovoMAGIC™ assembler application version 3.0. The integrity of the assembly was verified using several quality-assurance procedures including the independent BUSCO benchmark (http://busco.ezlab.org/) 7, which is used to specifically indicate the genic region integrity, ploidy, and zygosity characteristics of the assembled genome. The assembled scaffolds are available at GDR (www.rosaceae.org).
Genome annotation
A total of 2747 scaffolds (>10 kb in length) from the genome assembly were annotated using the genome annotation platform GenSAS (www.gensas.org) 25 and the programs listed below, which are integrated into GenSAS. Default settings were used unless otherwise noted. The genome sequence was masked using RepeatMasker (www.repeatmasker.org), the other dicots RepBase dataset, and RepeatModeler (www.repeatmasker.org/RepeatModeler). The gene models were predicted using BRAKER2 (https://github.com/Gaius-Augustus/BRAKER), which was trained with a BAM file that contained ‘HoneySweet’ RNA-seq reads (see paragraph below for information on reads) aligned to the genome assembly using HISAT2 (https://ccb.jhu.edu/software/hisat2/index.shtml). tRNA and rRNA were identified using tRNAscan-SE (http://lowelab.ucsc.edu/tRNAscan-SE) 26 and RNAmmer (http://www.cbs.dtu.dk/services/RNAmmer) 27, respectively. Functional annotation was performed using InterProScan (http://www.ebi.ac.uk/interpro/search/sequence-search) 28, Pfam, SignalP, TargetP, and protein alignments with BLAST (SwissProt protein database) and DIAMOND (NCBI, RefSeq, Plant, and P. persica proteins from GenBank). BUSCO was run on the predicted proteins, and the genome annotation contains 92.4% of the complete, conserved BUSCOs. The RNA-seq reads used included a set of ~10 billion raw RNA-seq reads (150 bp paired end) derived from plum vegetative bud and leaf tissues at various stages of development. This RNA-seq set was created using the translatome profiling technique, where epitope-tagged ribosomes are immunopurified to enrich for actively translating mRNAs. This technique enriches the mRNA fraction for fully spliced transcripts 29, 30. Homology of the Prunus domestica Genome v1.0 proteins was determined by pairwise sequence comparison using the blastp algorithm against various protein databases.
An expectation value cutoff less than 1e−9 was used for the NCBI nr (Release 2018-05) and 1e−6 for the Arabidopsis proteins (TAIR10), UniProtKB/SwissProt (Release 2018-04), and UniProtKB/TrEMBL (Release 2018-04) databases. The best hit reports are available for download in Excel format at GDR.
‘HoneySweet’ genome sequencing
DNA was extracted from ‘HoneySweet’ leaves utilizing a modified nuclei extraction procedure, where 1 g of leaf tissue was ground in liquid nitrogen and transferred to a cold 15 ml Dounce homogenizer containing 7 ml of ice-cold extraction buffer (0.5 M sucrose, 10 mM Trizma Base, 80 mM KCl, 10 mM EDTA, 1 mM spermidine, 1 mM spermine, final pH 9.4–9.5 adjusted with NaOH). The material was passaged 4–10 times. The slurry was filtered through two layers of cheesecloth and one layer of Miracloth (Calbiochem, San Diego, CA) into a cold centrifuge tube to enrich for nuclei. The nuclei were pelleted by centrifugation at 1800 g at 4 °C for 15 min. The pellet was washed with 5 ml extraction buffer with the addition of 0.15% beta-mercaptoethanol, filtered again through Miracloth, and re-pelleted. A repeat wash was done. The enriched nuclei were then the starting material for DNA extraction using the GNome® DNA kit (BIO101, Vista, CA), where the pellet was resuspended in 1 ml of Suspension Solution and the DNA extracted by the manufacturer’s protocol. The resulting DNA was resuspended in 200 µl of water and treated with 2 µl RNase A (1 mg/ml) for 30 min at 37 °C, followed by phenol/chloroform (1:1 vol/vol) and chloroform extraction 31. DNA was again precipitated and resuspended in 75 µl water at ~1 mg/ml. The DNA was then sent to DHMRI (David H. Murdock Research Institute, Kannapolis, NC) for sequencing. Initially seven lanes of 75-base paired-end reads and then four lanes of 100-base paired-end reads were processed.
Sequences were then trimmed by removing any adapter sequences, low-quality scores, and reads less than 60 bases from the 75-base runs and 90 bases from the 100-base runs. These were then analyzed using CLC Genomics Workbench (Qiagen, Germantown, MD) and MacVector (MacVector Inc, Apex, NC), and comparisons with peach sequences were done using the Prunus persica Genome v1.0 (ref. 32) assembly available at GDR, www.rosaceae.org 8.
Sequence of transformation vector
The sequence of the original transformation vector T-DNA was recreated through literature searches and available sequence. The sequence is presented in Supplementary Data Set 1. The schematic of the gene arrangement is presented in Fig. 1.
PCR confirmation of insert sequence
‘HoneySweet’ leaf DNA was extracted using a CTAB protocol 33, 34, starting with ~1 g of fresh leaf and 7.5 ml of extraction buffer. Primers were designed in MacVector to predict the best sites that flanked the junction sequences as well as other locations in the transgenes (Table S8). Primers were synthesized by IDT (Coralville, IA). PCR used ~50 ng of DNA per 20 µl reaction using Applied Biosystems™ AmpliTaq™ DNA Polymerase with Buffer II (ThermoFisher Scientific) according to the manufacturer’s directions. The standard cycle conditions were 5 min at 95 °C, then 30 s at 95 °C, 30 s at 55 °C, and 30 s at 72 °C for 25 cycles, followed by 7 min at 72 °C. Variations were added to obtain single fragments of expected sizes, especially for longer fragments, including raising the annealing temperature from 55 up to 70 °C, changing Mg concentration, adding DMSO, lengthening the extension time at 72 °C, increasing cycle numbers, and using Phusion polymerase (NEB, Ipswich, MA). The products were run on agarose gels and visualized with a Typhoon FLA scanner (GE Healthcare Life Sciences, Marlborough, MA).
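The length rule in the read-trimming step of the ‘HoneySweet’ genome sequencing methods above (discard reads shorter than 60 bases from the 75-base runs and shorter than 90 bases from the 100-base runs) can be sketched as follows. This is an illustration only: the trimming in the study was presumably done inside the analysis software, adapter and quality trimming are not shown, and the names `MIN_LENGTH_BY_RUN` and `filter_by_length` are hypothetical.

```python
# Minimum retained read length for each run type, as stated in the text.
MIN_LENGTH_BY_RUN = {75: 60, 100: 90}


def filter_by_length(reads, run_length):
    """Keep reads at least as long as the minimum for the given run type.

    `reads` is a list of already adapter- and quality-trimmed sequences;
    `run_length` is the nominal read length of the run (75 or 100).
    """
    if run_length not in MIN_LENGTH_BY_RUN:
        raise ValueError(f"no length rule defined for {run_length}-base runs")
    minimum = MIN_LENGTH_BY_RUN[run_length]
    # Reads shorter than the minimum are removed; the rest are kept in order.
    return [read for read in reads if len(read) >= minimum]


if __name__ == "__main__":
    trimmed = ["A" * 59, "A" * 60, "A" * 75]
    kept = filter_by_length(trimmed, 75)
    print([len(read) for read in kept])
```

Keeping only near-full-length reads matters here because the junction and border sequences were reconstructed from individual short reads, so heavily truncated reads would add little usable overlap.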
RNA-seq
Leaves and fruit from 8 ‘HoneySweet’ trees and 10 ‘Stanley’ trees located at various sites in Europe and the US were used to examine expression. RNAs were extracted from 20 mg of lyophilized leaf and fruit tissue using the Norgen Plant/Fungi Total RNA Purification Kit (Norgen Biotek Corp., ON) following the manufacturer’s protocol. The RNA was treated with DNase using the TURBO DNA-free kit (Thermo Fisher Scientific, MA) according to the manufacturer’s protocol. The RNA quality and purity were then assessed by electrophoresis on a 1.2% agarose gel visualized on a Typhoon FLA9500 scanner (GE Healthcare Life Sciences, IL), and spectrophotometrically on a NanoDrop® ND-1000 (Thermo Fisher Scientific, MA). Fruit and leaf RNA from each of the individual trees was sent to DHMRI for sequencing (36 libraries) as unidirectional reads of 100-base lengths.
Analyses of plum DNA sequence and RNA sequences
All analyses were done using CLC Genomics Workbench (versions 5.5–12.0). Map to reference was used to map ‘HoneySweet’ DNA sequences to the T-DNA sequence and to peach V1.0 (see analytic pipeline, Fig. S1), as well as RNA sequences to the predicted insertion events. RNA-seq was used to determine transcript rates for flanking genes. Synteny with peach V2.0 was examined using the Synteny tool present on the GDR website.
Data availability
RNA-sequencing data are available through Sequence Read Archive (SRA), accession number…. DNA-sequencing data are available through Sequence Read Archive (SRA), accession number…. Plum assembled and annotated genome is available through Genome Database for Rosaceae (GDR) https://www.rosaceae.org/.
References
1. Scorza, R. et al. ‘HoneySweet’ (C5), the first genetically engineered Plum pox virus–resistant plum (Prunus domestica L.) cultivar. HortScience 51, 601–603 (2016).
2. García, J. A., Glasa, M., Cambra, M. & Candresse, T. Plum pox virus and sharka: a model potyvirus and a major disease. Mol. Plant Pathol. 15, 226–241 (2014).
3.
Scorza, R. et al. Transgenic plums (Prunus domestica L.) express the plum pox virus coat protein gene. Plant Cell Rep. 14, 18–22 (1994).
4. Scorza, R. et al. Post-transcriptional gene silencing in plum pox virus resistant transgenic European plum containing the plum pox potyvirus coat protein gene. Transgenic Res. 10, 201–209 (2001).
5. Scorza, R. et al. Hairpin Plum pox virus coat protein (hpPPV-CP) structure in ‘HoneySweet’ C5 plum provides PPV resistance when genetically engineered into plum (Prunus domestica) seedlings. Julius-Kühn-Archiv 427, 141 (2010).
6. Guttikonda, S. K. et al. Molecular characterization of transgenic events using next generation sequencing approach. PLoS One 11, e0149515 (2016).
7. Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).
8. Jung, S. et al. 15 years of GDR: new data and functionality in the Genome Database for Rosaceae. Nucleic Acids Res. 47, D1137–D1145 (2018).
9. Verde, I. et al. The Peach v2.0 release: high-resolution linkage mapping and deep resequencing improve chromosome-scale assembly and contiguity. BMC Genomics 18, 225 (2017).
10. Callahan, A. M., Dardick, C. D. & Scorza, R. Multilocation comparison of fruit composition for ‘HoneySweet’, an RNAi based plum pox virus resistant plum. PLoS One 14, e0213993 (2019).
11. Schnell, J. et al. A comparative analysis of insertional effects in genetically engineered plants: considerations for pre-market assessments. Transgenic Res. 24, 1–17 (2015).
12. Li, R. et al. Molecular characterization of genetically-modified crops: challenges and strategies. Biotechnol. Adv. 35, 302–309 (2017).
13. Pauwels, K. et al. Next-generation sequencing as a tool for the molecular characterisation and risk assessment of genetically modified plants: Added value or not? Trends Food Sci. Tech. 45, 319–326 (2015).
14. Shirasawa, K. et al.
Phased genome sequence of an interspecific hybrid flowering cherry, Somei-Yoshino (Cerasus × yedoensis). DNA Res. 26, 379–389 (2019).
15. Siddique, K., Wei, J., Li, R., Zhang, D. & Shi, J. Identification of T-DNA insertion site and flanking sequence of a genetically modified maize event IE09S034 using next-generation sequencing technology. Mol. Biotechnol. 61, 694–702 (2019).
16. Kawai, Y., Ono, E. & Mizutani, M. Evolution and diversity of the 2-oxoglutarate-dependent dioxygenase superfamily in plants. Plant J. 78, 328–343 (2014).
17. Scorza, R., Callahan, A. M., Damsteegt, V., Levy, L. & Ravelonandro, M. Transferring potyvirus coat protein genes through hybridization of transgenic plants to produce plum pox virus resistant plums (Prunus domestica L.). Acta Hortic. 472, 421–428 (1997).
18. Ravelonandro, M., Briard, P., Monsion, M. & Scorza, R. Stable transfer of the plum pox virus (PPV) capsid transgene to seedlings of two French cultivars ‘Prunier D’ente 303’ and ‘Quetsche 2906’, and preliminary results of PPV challenge assays. Acta Hortic. 577, 91–96 (2001).
20. Rybin, W. A. Spontaneous and experimentally produced hybrids between blackthorn and cherry plum and the descent problem of the cultivated plum. Planta 25, 22–58 (1936).
21. Zhebentyayeva, T. et al. Genetic characterization of worldwide Prunus domestica (plum) germplasm using sequence-based genotyping. Hortic. Res. 6, 12 (2019).
22. Reales, A., Sargent, D. J., Tobutt, K. R. & Rivera, D. Phylogenetics of Eurasian plums, Prunus L. section Prunus (Rosaceae), according to coding and non-coding chloroplast DNA sequences. Tree Genet. Genomes 6, 37–45 (2010).
23. Eryomine, G. V. New data on origin of Prunus domestica L. Acta Hortic. 283, 27–30 (1990).
24. Kubisiak, T. L. et al. A transcriptome-based genetic map of Chinese chestnut (Castanea mollissima) and identification of regions of segmental homology with peach (Prunus persica). Tree Genet. Genomes 9, 557–571 (2013).
25. Humann, J. L., Lee, T., Ficklin, S.
& Main, D. Structural and functional annotation of eukaryotic genomes with GenSAS. In Gene Prediction: Methods and Protocols (ed. Kollmar, M.) (Humana Press, New York, 2019).
26. Lowe, T. M. & Chan, P. P. tRNAscan-SE On-line: integrating search and context for analysis of transfer RNA genes. Nucleic Acids Res. 44, W54–W57 (2016).
27. Lagesen, K. et al. RNAmmer: consistent annotation of rRNA genes in genomic sequences. Nucleic Acids Res. 35, 3100–3108 (2007).
28. Mitchell, A. L. et al. InterPro in 2019: improving coverage, classification and access to protein sequence annotations. Nucleic Acids Res. 47, D351–D360 (2018).
29. Collum, T. D., Lutton, E., Raines, C. D., Dardick, C. & Culver, J. N. Identification of phloem-associated translatome alterations during leaf development in Prunus domestica L. Hortic. Res. 6, 16 (2019).
30. Collum, T. D. et al. Translatome profiling of plum pox virus infected leaves in European plum reveals temporal and spatial coordination of defense responses in phloem tissues. Mol. Plant Microbe Interact. 33, 66–77 (2020).
31. Sambrook, J., Fritsch, E. F. & Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd edn (Cold Spring Harbor Laboratory Press, NY, USA, 1989).
32. Verde, I. et al. The high-quality draft genome of peach (Prunus persica) identifies unique patterns of genetic diversity, domestication and genome evolution. Nat. Genet. 45, 487 (2013).
33. Doyle, J. J. & Doyle, J. L. Isolation of plant DNA from fresh tissue. Focus 12, 39–40 (1990).
34. Callahan, A. M., Morgens, P. H., Wright, P. & Nichols, K. E. Comparison of pch313 (pTOM13 homolog) RNA accumulation during fruit softening and wounding of two phenotypically different peach cultivars. Plant Physiol. 100, 482–488 (1992).
Rights and permissions Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .