Project Description

Library construction

For this project, we will generate four new G. sulphuraria whole-genome shotgun libraries using standard protocols and commercially available library construction kits.

Plasmid library: The majority of the shotgun sequencing will be performed on a plasmid library constructed in pSMART-HC Kan (Lucigen Corp., Middleton, WI). This system uses transcriptional terminators to minimize unintended transcription out of DNA inserts. In addition, the pSMART vector does not use a promoter or an indicator gene. Transcription through the insert is therefore virtually eliminated, which facilitates the cloning of toxic gene products and unstable elements and improves the representation of the plasmid library.
Genomic DNA will be randomly fragmented by shearing with a HydroShear device and end-repaired to generate blunt ends. Fragments of about 2 kbp (isolated by preparative agarose gel electrophoresis) will be ligated into the pSMART vector. The project will require 110,000 plasmid transformants (see below). However, we will generate > 300,000 transformants to permit additional sequencing if needed.
Fosmid library: We will generate a Fosmid library containing 40 kbp inserts. Genomic DNA will be sheared, end-repaired, size-fractionated, ligated into the Fosmid vector pCC1FOS, and subsequently packaged into lambda phage particles. We will use the commercially available CopyControl Fosmid library production kit (Epicentre Technologies, Madison, WI). This system allows maintenance and propagation of single-copy Fosmid clones, thus increasing clone stability. However, the copy number can be increased to 10−50 copies per cell by induction of the trfA gene product, thus permitting purification of high-quality template DNA for direct sequencing of inserts. We will generate > 15,000 Fosmid clones.
BAC library: We will generate two BAC libraries (BamHI, EcoRI) containing 100 kbp inserts using the commercially available CopyControl BAC Cloning Kit (Epicentre Technologies, Madison, WI). High-molecular-weight genomic DNA embedded in LMP agarose plugs will be partially digested with EcoRI or BamHI, respectively. Fragments will be size-fractionated by pulsed-field gel electrophoresis and isolated by electro-elution. Fragments will be ligated into predigested and dephosphorylated pCCBAC vectors. pCCBAC vectors can be maintained as single-copy clones in E. coli cells carrying a mutant trfA gene such as E. coli EPI300. The trfA gene product is required for initiation of replication from oriV. EPI300 cells contain an additional copy of trfA under the control of an inducible promoter. Induction of trfA gene expression increases the copy number of pCCBAC from single copy to 10−20 copies per cell, thus permitting isolation of sufficient quantities of purified DNA for BAC end sequencing, subcloning, and fingerprinting. We will generate > 1,000 BAC clones each from EcoRI- and BamHI-digested DNA.
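As a rough illustration of the library sizes quoted above, the following sketch multiplies the minimum clone counts by the nominal insert sizes to estimate how much genomic DNA each library would hold. The clone counts and insert sizes are taken from the text. The genome size is not stated in this section, so `HYPOTHETICAL_GENOME_SIZE_BP` is a placeholder chosen only for illustration, and all function and variable names are hypothetical.

```python
# Minimum clone counts and nominal insert sizes (bp) as quoted in the text.
LIBRARIES = {
    "plasmid (pSMART-HC Kan)": (300_000, 2_000),
    "Fosmid (pCC1FOS)": (15_000, 40_000),
    "BAC, EcoRI (pCCBAC)": (1_000, 100_000),
    "BAC, BamHI (pCCBAC)": (1_000, 100_000),
}

# ASSUMPTION: placeholder value, not taken from the project description.
HYPOTHETICAL_GENOME_SIZE_BP = 15_000_000


def cloned_bases(clones: int, insert_bp: int) -> int:
    """Total genomic DNA held in a library, in base pairs."""
    return clones * insert_bp


def clone_coverage(clones: int, insert_bp: int, genome_size_bp: int) -> float:
    """Physical (clone) coverage: cloned DNA divided by genome size."""
    return cloned_bases(clones, insert_bp) / genome_size_bp


for name, (clones, insert_bp) in LIBRARIES.items():
    total = cloned_bases(clones, insert_bp)
    cov = clone_coverage(clones, insert_bp, HYPOTHETICAL_GENOME_SIZE_BP)
    print(f"{name}: {total / 1e6:.0f} Mbp cloned, "
          f"{cov:.1f}x clone coverage (hypothetical genome size)")
```

Under these numbers the plasmid and Fosmid libraries would each hold about 600 Mbp of cloned DNA, and each BAC library about 100 Mbp. The coverage figures depend entirely on the placeholder genome size.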

Template preparation and DNA sequencing

The GTSF Genomics Core will carry out template preparation and high-throughput DNA sequencing.
A GeneMachines Mantis colony picker will be used to pick bacterial colonies into 96-well plates. Two Qiagen 3000 robots will be used for liquid handling and plasmid purification, in addition to one GeneMachines RevPrep Orbit plasmid purification system. Eleven PE9700 thermocyclers are available to carry out sequencing reactions, and ABI 3730xl and ABI 3700 high-throughput capillary DNA sequencing systems will be used to acquire sequence information.

Computer assisted sequence assembly and bioinformatics

The GTSF Bioinformatics Core will carry out the sequence assembly. GTSF has a staff of four bioinformatics specialists who maintain the computational infrastructure and conduct bioinformatics analysis in support of the Genomics (sequencing) Core.
Approximately 250,000 sequence reads with an average Q20 Phred length of 500 nucleotides will be generated. These sequence reads, together with the pairing information, will be used as input for a sequence assembly program to generate an assembly of the paired-end sequence reads.
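The read numbers above translate into sequence yield and coverage as in the following sketch. The read count, the average Q20 length, and the 110,000 required plasmid transformants come from the text. The genome sizes are hypothetical placeholders, because this section does not state one. The fraction-covered estimate uses the idealized Lander-Waterman model, which is a standard approximation introduced here for illustration rather than a method named in the project description.

```python
import math

READS = 250_000                    # approximate read count, from the text
MEAN_Q20_LENGTH = 500              # average Q20 Phred length (nt), from the text
REQUIRED_PLASMID_CLONES = 110_000  # required transformants, from the text


def total_q20_bases(reads: int, mean_length: int) -> int:
    """Total high-quality bases expected from the sequencing effort."""
    return reads * mean_length


def paired_end_reads(clones: int) -> int:
    """Reads obtained if every clone is sequenced once from each end
    (ASSUMPTION: no failed reactions)."""
    return 2 * clones


def fold_coverage(total_bases: int, genome_size_bp: int) -> float:
    """Average sequence coverage for a given genome size."""
    return total_bases / genome_size_bp


def expected_fraction_covered(coverage: float) -> float:
    """Lander-Waterman estimate of the genome fraction covered by at least
    one read, assuming uniformly random read placement."""
    return 1.0 - math.exp(-coverage)


bases = total_q20_bases(READS, MEAN_Q20_LENGTH)
print(f"Total Q20 bases: {bases:,}")
print(f"Paired-end reads from required plasmid clones: "
      f"{paired_end_reads(REQUIRED_PLASMID_CLONES):,}")

# ASSUMPTION: hypothetical genome sizes, used only to show the dependence.
for genome_size_bp in (10_000_000, 15_000_000, 20_000_000):
    c = fold_coverage(bases, genome_size_bp)
    print(f"genome {genome_size_bp / 1e6:.0f} Mbp: {c:.2f}x coverage, "
          f"expected fraction covered {expected_fraction_covered(c):.6f}")
```

With 250,000 reads of 500 nucleotides, the effort yields 125 million Q20 bases. Sequencing both ends of 110,000 plasmid clones would account for 220,000 of those reads if every reaction succeeded.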
We will employ the Arachne program for assembling the Galdieria reads (alternative assembly programs are also available, for example Phusion). Arachne was developed at the Whitehead Institute/MIT Center for Genome Research and has been successfully used to assemble the sequence of the mouse genome. Ace files generated by Arachne will be used as input for Autofinish/Consed, and Autofinish will be used to automatically choose finishing reads.