C. elegans
online
Contents:
General resources
Comprehensive web site
Caenorhabditis elegans WWW
server
Links to almost everything on the web that concerns C. elegans,
including recent papers,
literature
search , meetings,
CGC,
labs,
researchers,
EM
center, announcements,
bionet.celegans,
software,
nematodes,
genome,
methods,
and cells, can be reached
from here. The site is well maintained and kept current (see the what's
new page).
Central database
WormBase
This provides worm genetic and genomic information through a useful web interface.
The project is a consortium of biologists and computer scientists,
and is run by the bioinformatics guru Lincoln Stein. It has many
features not available through the ACEDB format. It is the best place to
start looking for information on a worm gene. See also
Worm
PD.
ACEDB
"A C. elegans database"
-
WebACEDB
-- The Sanger Centre's web version - try the AltaVista
interface
This is the comprehensive worm genetics and genomics database.
"ACEDB" refers to both the software
program that accesses the data and the worm data itself. ACeDB can
show you the genetic map, physical map, genomic sequence, ESTs, information
on characterized genes, references, and pictures. It is particularly
useful for seeing the correspondence between loci on the genetic and physical
maps. You can download your own copy of ACeDB (4MB) by anonymous
ftp (the Macintosh version is called macace).
Other general information sources
Sequences
Genome sequence
Microarray data
The C. elegans "proteome"
-
All predicted proteins can be found through Genbank,
WormBase,
and Sanger's ACEDB.
-
Proteome, Inc maintains a well-annotated
database of predicted C. elegans proteins, Worm
PD, that includes such information as size, pI, expression patterns
(when known), and relevant references. This database has a variety
of search approaches, such as post-translational modifications and subcellular
localization. See a sample
output.
-
The C. elegans ORFeome
cloning project at Harvard -- a searchable/BLASTable database of the
worm ORFs put together for the purposes of expressing the sequences in
yeast and bacteria
-
Southeast Collaboratory
Structural Genomics -- a C. elegans protein expression and structural
biology project
Expressed sequence tags (ESTs)
BLAST servers
-
BLAST at NCBI
-- allows you to BLAST sequences from specific organisms, including C.
elegans
-
C.
elegans BLAST (Sanger)
-- all finished and unfinished genomic sequence,
ESTs, and predicted proteins (WormPep)
-- blastn, tblastn, blastx, tblastx, blastp
-- searches of genomic sequence (blastn, tblastn,
tblastx) return links to cosmid sequences and their WebACEDB entries
-- searches of protein sequence (blastp, blastx)
return predicted locus names and links to their WebACEDB entries
-
C. elegans
BLAST
(Wash U)
-- all finished and unfinished genomic sequence
-- blastn, tblastn
-- returns only cosmid names, no links
-
C.
elegans ESTs BLAST (Japan) -- all ESTs; blastn; complete links
-
C. briggsae BLAST
at WashU -- same genus, different species, about
20 to 50 million years diverged.
-- comparing genes between C. elegans
and C. briggsae can help identify important conserved sequences
C. briggsae assembled sequence BLAST at WashU
and at Sanger
-- preliminary assembly of 10x sequence data of the whole C. briggsae
genome.
Very useful for finding regulatory sequences, miRNAs, protein motifs, etc.
Advice: To find a worm homolog, use a protein sequence to search
WormPep by blastp. Then to get the DNA and protein sequences of the
homolog, go to Genbank and use the name of the hit to retrieve the annotated
cosmid sequence.
(What is the difference between the various BLAST programs, blastx and
tblastn, for example? See BLAST
search programs.)
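As a quick illustration of the question above, the five classic BLAST programs differ only in whether the query and the database are nucleotide or protein, and whether either side is translated before comparison. The sketch below is a minimal lookup written for this page; the function name `choose_blast` and its arguments are hypothetical and are not part of any BLAST server.

```python
def choose_blast(query, database, compare_as_protein=True):
    """Pick a BLAST program from the query and database types.

    query, database: "protein" or "nucleotide".
    compare_as_protein: only matters when both are nucleotide; True means
    translate both sides in all six frames (tblastx), False means compare
    the nucleotides directly (blastn).
    """
    valid = {"protein", "nucleotide"}
    if query not in valid or database not in valid:
        raise ValueError("query and database must be 'protein' or 'nucleotide'")

    if query == "protein" and database == "protein":
        return "blastp"    # protein query against protein database
    if query == "nucleotide" and database == "protein":
        return "blastx"    # translated nucleotide query against protein database
    if query == "protein" and database == "nucleotide":
        return "tblastn"   # protein query against translated nucleotide database
    # nucleotide query against nucleotide database
    return "tblastx" if compare_as_protein else "blastn"


# The advice above (protein query against WormPep) corresponds to blastp.
print(choose_blast("protein", "protein"))
```

Searching genomic sequence with a protein query, as offered by the Sanger and Wash U servers listed above, corresponds to `choose_blast("protein", "nucleotide")`.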
Other sequence databases
Obtaining DNAs and worm strains
ESTs (cDNA clones)
See Yuji Kohara's EST
pages -- you can order worm ESTs for free (they are long and sequenced
from both ends) from: ykohara@lab.nig.ac.jp
Genomic clones
-
Specific cosmids or YACs of genomic DNA can be requested from Alan Coulson
at the Sanger Centre (alan@sanger.ac.uk).
Expression vectors
-
worm
expression vectors from the Fire
lab -- Andy Fire has provided the worm community with an extraordinarily
complete and useful collection of gene expression vectors, from a variety
of promoters, to multiple beta-gal and GFP fusion vectors in all frames.
Worm strains
Gene knockouts
Want to name a C. elegans gene or get your own designation for
strains and alleles?
Abstracts, news, and people
Abstracts and literature
News and information
-
bionet.celegans newsgroup
Have your questions answered, seek advice, or solicit information.
Announcements concerning ACeDB updates, meetings, etc. always appear here.
You can also search
archived posts for topics previously discussed. This link is
to an HTML version. You can also monitor it using a newsreader or
by email: send the message "subscribe CELEGANS" to biosci-server@net.bio.net.
The newsgroup is moderated, meaning it is free of 'spam'.
-
What's happening at the Boston
Area Worm Meeting and the New
York Area Worm Meeting
Contacting worm labs
Genetic Map and Nomenclature
-
For all issues concerning new genetic data and gene names, contact Jonathan
Hodgkin jah@bioch.ox.ac.uk
Protocols for worm work
Protocol collections
Reverse genetics
RNAi
Things to know about the worm
C. elegans is a nematode
Caenorhabditis elegans (Caenorhabditis means "new rod-shaped
thing") is a nematode, or roundworm. The earthworm is an annelid,
or segmented worm, and is in a different phylum. Nematodes
are some of the most abundant animals on the planet. They are found in
almost every environment, and many are harmful parasites of animals
and plants. C. elegans however is not a parasite. It is a
free-living nematode that lives in the soil. In a teaspoon of soil from
a garden it is possible to find many nematodes, some of which may
be C. elegans or its relatives. Soil nematodes eat bacteria.
C. elegans is microscopic and grows fast
C. elegans is barely visible with the naked eye -- a fully grown
adult is approximately 1 millimeter long, or about the size of Lincoln's
nose on a penny. Its eggs are among the smallest in the animal kingdom.
C.
elegans grows in about three days and has hundreds of offspring.
An egg is laid after being fertilized inside the mother and takes
about 15 hours to develop. After hatching from the egg into a larva
which looks like a miniature version of the adult, the worms develop
through four larval stages (called L1, L2, L3, and L4). At
the end of each larval stage they synthesize a new cuticle (a layer of
protein and carbohydrates that covers the animal's hypodermis,
or skin) and shed the previous cuticle by molting. After the last molt
they are mature adults capable of reproducing. Development
to adulthood takes about 2 1/2 days at 25°C, and 6 days at 15°C.
The total life-span of a worm under the best growth conditions is about
12 to 18 days at 20°C. In the laboratory we grow C. elegans
in petri plates that contain an agar medium supplemented with cholesterol.
On the surface of the agar we put a lawn of E. coli.
(Sometimes we accidentally contaminate the petri dishes with mold, yeast
or other bacteria from our hands, but it usually doesn't bother
the worms much.) The worms eat all the bacteria in a few days, get
crowded and begin to starve, so we transfer them to fresh plates.
In response to crowding, C. elegans can arrest development at the
end of the second larval stage, and last in that dormant state for
months to years. When these arrested worms, called dauer ("enduring")
larvae, are moved to fresh plates with bacteria to eat, they resume development
where they left off.
C. elegans is useful for experimental genetics
C. elegans has two sexes, male and hermaphrodite. The
hermaphrodites produce both sperm and eggs and are self-fertilizing, or
automictic. Hermaphrodites typically lay about 300 fertilized eggs during
their lives. If they are fertilized by a male, they can produce hundreds
more. We can freeze all of our mutant worm strains in liquid
nitrogen (something one can't do with Drosophila). It's not
the dauers or eggs that survive the freezing best, but the young larvae.
Because the worms are very small, we use binocular dissecting microscopes
when we are examining them and moving them from plate to plate.
For other purposes, such as to see the anatomy of the worm in great
detail or to watch individual cells dividing, we use compound microscopes
with Nomarski (or Differential Interference Contrast, DIC) optics.
One of the most useful things about C. elegans to us is how easy
it is to find mutants that affect many different kinds of processes.
Using mutants, we can deduce how these processes work normally.
Mutants of C. elegans have been found that have altered development,
behavior, movement, ability to smell and taste, feeding, defecation,
rate of growth, aging and programmed cell death.
C. elegans is arguably the most thoroughly understood animal
C. elegans is the only animal for which the entire
cell lineage is known from zygote to adult. It is also the
only animal for which the entire wiring of its nervous system is known.
It is the first animal to have its entire genome sequenced.
The genome size of C. elegans is 100 megabases or 1/30 the size
of the human genome. It has six chromosomes, five autosomes
and a sex chromosome, all of similar size. The chromosomes are
holokinetic, that is, they don't have a single localized centromere. There is little
repeated DNA in the genome, which makes it good for sequencing.
There are about 18,000 protein coding genes, and 1000 RNA genes.
About half the genes have similarity to genes in other organisms,
and some are homologs
of disease genes in humans. Experiments using C. elegans are
done by thousands of scientists
around the world and even on the space
shuttle.
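The genome figures quoted above can be turned into a couple of back-of-the-envelope numbers. This is only a rough sketch using the rounded values from the text (100 megabases, a 1/30 ratio to the human genome, and about 18,000 protein-coding genes); the variable names are made up for illustration.

```python
# Rounded figures taken from the paragraph above.
worm_genome_bp = 100_000_000      # 100 megabases
protein_coding_genes = 18_000     # approximate count

# "1/30 the size of the human genome" implies this human genome size.
human_genome_bp = worm_genome_bp * 30

# Average spacing of protein-coding genes along the worm genome.
bp_per_gene = worm_genome_bp / protein_coding_genes

print(human_genome_bp)      # 3000000000, i.e. about 3,000 megabases
print(round(bp_per_gene))   # 5556, i.e. roughly one gene every 5.6 kb
```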
Some movies (Quicktime):
Things to know about C. elegans sequences
Genomic sequences are deposited in Genbank with annotations.
The genome sequencing project proceeds cosmid by cosmid (where there
are gaps between cosmids, YACs are sequenced). Once the sequence
of a cosmid is determined, its entire sequence (or at least that part that
does not overlap with that of neighboring clones) is deposited in Genbank
with the gene annotations. Therefore, usually there are not separate
Genbank entries for each gene predicted on a cosmid (however, for some
there are; go figure). To retrieve the annotated sequence of a cosmid,
use its name (F19A6, for example) to search Genbank.
The Genbank file will have each predicted protein sequence within it (F19A6.4,
for example). To make a file of the DNA sequence of your gene alone
you will have to cut and paste according to the numbers given for your
gene in the Genbank file. The sequence annotations begin with the
ATG and end with the termination codon.
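The cutting and pasting described above can also be scripted. The sketch below assumes you have already copied the cosmid sequence and the gene's location string out of the Genbank file by hand, and that the location uses the common `join(a..b,c..d)` or `complement(join(...))` forms with 1-based, inclusive coordinates. The function names and the toy sequence are invented for illustration; it does not download anything or handle every location syntax.

```python
import re

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_cds(cosmid_seq, location):
    """Join the exon segments named in a Genbank-style location string.

    Coordinates are 1-based and inclusive, as in the Genbank file.
    Only plain ranges, join(...) and complement(...) wrapped around the
    whole location are handled in this sketch.
    """
    segments = re.findall(r"(\d+)\.\.(\d+)", location)
    if not segments:
        raise ValueError("no start..end ranges found in location")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    spliced = "".join(cosmid_seq[int(start) - 1:int(end)]
                      for start, end in segments)
    if location.strip().startswith("complement("):
        spliced = reverse_complement(spliced)
    return spliced


# Toy "cosmid": two exons (ATGGC and TTGA) separated by an 11-base intron.
toy_cosmid = "CC" + "ATGGC" + "GTAAGTTTCAG" + "TTGA" + "CC"
print(extract_cds(toy_cosmid, "join(3..7,19..22)"))  # ATGGCTTGA
```

As the text notes, the result should begin with ATG and end with a termination codon; if it does not, the coordinates were probably copied from the wrong strand or the wrong gene.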
Most gene sequences are only predictions.
Be aware that most of the protein sequences are only predictions and
have not been verified experimentally. The DNA sequence itself is
very rarely in error, but the locations of splices, starts and stops are
sometimes predicted incorrectly by the Genefinder software. If there
are ESTs indicated that correspond to the gene (yk6b1.5, for example),
then there has been some confirmation of the gene structure given.
Careful re-analysis of the predicted sequence may be worth the effort.
In particular, be suspicious of 'bifunctional' genes -- see the note below
on trans-splicing.
Predicted genes are given cosmid names, confirmed genes are given different
names.
All predicted genes are named after the cosmids on which they are found:
for example, F19A6.4 is the fourth gene on the cosmid F19A6. Here
are examples of other types of DNA clones: Y69A9 is a YAC; yk6b1.5 and
yk6b1.3 are the 5' and 3' reads of the EST yk6b1. If you are searching
ACeDB with the name of a sequence that begins with "CE" or "CEL", you probably
aren't finding anything; search with the name without the "CE" or "CEL".
"Claimed" genes, including those identified by mutation, are given prefixes
of three lowercase italicized letters and a number (list
of gene prefixes and what they mean). Proteins are given the same names
as genes, but are in all caps, not italicized. For example, the protein
encoded by the gene lin-28 is LIN-28. For more information
on how genes, alleles, and genotypes are written, see the official genetic
nomenclature and list
of gene names. If you want to request a new gene name, write to Jonathan
Hodgkin jah@bioch.ox.ac.uk
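The naming conventions in this section are regular enough to check mechanically. The following is a naive sketch based only on the examples given here (F19A6.4, yk6b1.5, lin-28, and the "CE"/"CEL" prefixes); the function names and patterns are assumptions made for illustration, and real databases contain names that these patterns will not recognize.

```python
import re


def strip_ce_prefix(name):
    """Drop a leading "CEL" or "CE" before searching ACeDB."""
    for prefix in ("CEL", "CE"):  # test the longer prefix first
        if name.startswith(prefix):
            return name[len(prefix):]
    return name


def classify_name(name):
    """Guess what kind of object a worm sequence or gene name refers to."""
    # EST reads: clone name plus ".5" or ".3" for the 5' or 3' read.
    est = re.fullmatch(r"(yk\d+[a-z]\d+)\.([35])", name)
    if est:
        return {"kind": "EST read", "clone": est.group(1),
                "end": est.group(2) + "'"}
    # Predicted genes: cosmid (or YAC) name, a dot, then the gene number.
    predicted = re.fullmatch(r"([A-Z][A-Z0-9]*)\.(\d+)", name)
    if predicted:
        return {"kind": "predicted gene", "clone": predicted.group(1),
                "number": int(predicted.group(2))}
    # Named genes: three lowercase letters, a hyphen, and a number.
    if re.fullmatch(r"[a-z]{3}-\d+", name):
        # The protein has the same name in all caps.
        return {"kind": "named gene", "protein": name.upper()}
    return {"kind": "unknown"}


print(classify_name("F19A6.4"))
print(classify_name("yk6b1.5"))
print(classify_name("lin-28"))
print(strip_ce_prefix("CELF19A6"))  # F19A6
```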
Many worm genes are trans-spliced at their 5' ends.
Worm introns most often begin with GU and end with UUUCAG. Seventy
percent of worm mRNAs begin with a trans-spliced leader. The cis-signal
for trans-splicing is the same as the most common intron splice acceptor:
UUUCAG. The fact that cis and trans acceptors are the same means
Genefinder can predict a false 5' exon or fuse two closely spaced genes.
You can determine whether an EST goes all the way to the 5' end if you
see some of the sequence of a spliced leader beginning the ".5" sequence.
The sequences of the spliced leaders are below. Over 90% of trans-spliced
leaders are SL1. A few percent of worm genes are "bi-cistronic":
two products are produced from a single primary transcript by an "internal"
trans-splice, commonly using SL2.
| spliced leader | length | sequence |
| SL1 | 22 | cap-GGUUUAAUUACCCAAGUUUGAG- |
| SL2 | 22 | cap-GGUUUUAACCCAGUUACUCAAG- |
| SL3 | 22 | cap-GGUUUUAACCCAGUUAACCAAG- |
| SL4 | 22 | cap-GGUUUUAACCCAUAUAACCAAG- |
| SL5 | 23 | cap-GGUUUUAACCCAAGUUAACCAAG- |
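The check described above, looking for part of a spliced leader at the start of a ".5" EST read, can be sketched in a few lines. This example assumes the read is given 5' to 3' as DNA (T in place of U) and that at least 8 matching bases are enough to call a leader; that threshold and the function name are arbitrary choices for illustration. Because SL3, SL4 and SL5 share their last several bases, a short match cannot always tell them apart, and ties go to the leader listed first.

```python
# Spliced leader sequences from the table above, written as RNA without the cap.
SPLICED_LEADERS = {
    "SL1": "GGUUUAAUUACCCAAGUUUGAG",
    "SL2": "GGUUUUAACCCAGUUACUCAAG",
    "SL3": "GGUUUUAACCCAGUUAACCAAG",
    "SL4": "GGUUUUAACCCAUAUAACCAAG",
    "SL5": "GGUUUUAACCCAAGUUAACCAAG",
}


def find_spliced_leader(read, min_match=8):
    """Return (leader name, matched length) if the read starts with the
    3' end of a spliced leader, or None if no leader matches.

    An EST rarely contains the whole leader, so every suffix of each
    leader down to min_match bases is tried, longest first.
    """
    read = read.upper().replace("U", "T")
    best = None
    for name, rna in SPLICED_LEADERS.items():
        leader = rna.replace("U", "T")
        for k in range(len(leader), min_match - 1, -1):
            if read.startswith(leader[-k:]):
                if best is None or k > best[1]:
                    best = (name, k)
                break  # longest suffix for this leader already found
    return best


# Hypothetical 5' read: the last 10 bases of SL1 followed by a start codon.
print(find_spliced_leader("CAAGTTTGAGATGAAA"))  # ('SL1', 10)
```

A read that returns `None` may simply not reach the 5' end of the mRNA; it does not show that the gene lacks a trans-spliced leader.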