Document 210016

Contact
What is Bioinformatics?
Application examples
This course
Databases
End
How to find me
DD2396 Bioinformatics
Lars Arvestad
Address: Roslagstullsbacken 35
Albanova
Phone: 5537 8565
Email: arve@csc.kth.se

What is Bioinformatics?
• ”In silico biology”, computational biology
• Refinement of lab results
• Reduction of wet-lab work by computational predictions:
  • ”Which gene could cause this?”
  • ”What 3D structure does this protein have?”
• Structuring and handling of datasets in molecular biology
  • Structure: ”What is a gene? How do we store its information?”
  • Infrastructure: ”How can researchers access this data?”

What it is not
• Bioinformatics ≠ sequence analysis
There is:
• Automatic analysis of literature
• Analysis of gene expression
• Proteins as molecules, not sequences
• Interactions between genes and/or proteins
• Modeling the cell
My take:
• Extracting knowledge from biomolecular data

Central theme: Methods
Method = Algorithm + Data + Art

Course goals
• Understand possibilities and limitations of Bioinformatics
• Be able to use lab-supporting Bioinformatics in the industry.
• Work in a lab and use Bioinformatics for analyzing results.
• Prepare for PhD studies

What is not in the course
• Wet lab
• Biophysics
• Statistics (well, a little bit)
• No programming

Application examples: in general
• Is this a novel gene?
• I have a novel protein. What structure does it have?
• What genes are found in platypus?

Example: Wood genes and frogs
Goal: How do you synthesize cellulose?
• Sequence genes active during wood formation
• Compare with known genes: What’s new?
• Look for interesting properties
Henrik Aspeborg, KTH:
• Found new gene in hybrid aspen
• Very active during wood formation
• 60 aa strongly similar to a pattern first found in frog. Function: localizes to microtubules
• Contains typical phosphorylation sites

Example: Lab support for HPR
Basic idea for Swedish Human Proteome Resource:
• In what tissues are proteins/genes active? Where in the cell?
• Consider all known human genes, choose peptides
• Generate antibodies for selected peptides
• Stain tissues with antibodies
• Study, annotate, and store in database

Example: Lab support for HPR
Bioinformatics: How to choose peptides?
Constraints:
• What genes?
• Avoid membrane-bound proteins!
• Select unique peptides!

Evolution of HIV
Leitner and Albert, 1995

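The ”select unique peptides” constraint from the HPR example can be illustrated with a small sketch. It assumes that ”unique” means an exact peptide string that occurs in only one protein of the set; the slides do not say how the actual selection was done, and a real pipeline would also have to consider near matches. The function name, the window length k, and the toy sequences are all hypothetical.

```python
from collections import defaultdict


def unique_peptides(proteins, k):
    """Return, per protein, the length-k windows that occur in no other protein."""
    # Map every length-k window to the set of proteins containing it.
    owners = defaultdict(set)
    for name, sequence in proteins.items():
        for start in range(len(sequence) - k + 1):
            owners[sequence[start:start + k]].add(name)

    # Keep only windows owned by exactly one protein, in order of appearance.
    result = {name: [] for name in proteins}
    for name, sequence in proteins.items():
        seen = set()
        for start in range(len(sequence) - k + 1):
            peptide = sequence[start:start + k]
            if owners[peptide] == {name} and peptide not in seen:
                seen.add(peptide)
                result[name].append(peptide)
    return result


# Hypothetical toy sequences, far shorter than real proteins.
proteins = {"prot_1": "ACDEFG", "prot_2": "DEFGHI"}
candidates = unique_peptides(proteins, k=4)
for name, peptides in candidates.items():
    print(name, peptides)
```

The shared window DEFG is rejected for both proteins, which is the point of the constraint: an antibody raised against it could not tell the two proteins apart.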
This course
Course web: http://www.csc.kth.se/DD2396/bioinfh09
Schedule: KTH web.
Lecture contents on the course page!
Registration: Link to form on course web
Lecturer: Me!
Assistants: Hossein Farahani, Joel Sjöstrand

Requirements
• 3 computer labs
• Exam

Computer labs
• Four scheduled lab times
• Three main assignment sets
• Present your results in lab (no report)

Home assignments
• 4 home assignments distributed.
• Deadlines to be determined.
• Approved solutions give points towards first part of exam.

Exam
• Part 1: ”theory”
• Part 2: ”practice”
• About 15 points each
• 15 points for passing grade
• Bonus only applicable on part 1
• Part 2 not graded if part 1 has fewer than 10 points

Optional extra assignments
• 1 optional project: ”Lab 4”, requires written report. Bumps your grade one step!
• 1 optional essay. Bumps your grade one step!

Literature
Book: Zvelebil and Baum, Understanding Bioinformatics.
Lab notes: Buy at CSC student office, Osquars Backe 2.
Extras: See course home page

Bioinformatics databases
Hundreds, each with a different angle!
• Proteins
• Genes
• Genomes
• Certain gene families (GPCRs, TFs)
• Gene expression
• Interactions
• Pathways
• Evolutionary histories

Sequence databases
• Primary and secondary
• Annotated and automatic
• ”Hobby collections” vs industrial projects

In the beginning
Margaret Oakley Dayhoff: Atlas of Protein Sequence and Structure
1960s: A book on protein sequences
1972: Electronic distribution

Swiss-Prot
• Based on Atlas
• 1986: 3 900 sequences
• 2007: 252 616 sequences
• 2009: 405 506 sequences
• Curated database
• Part of UniProt consortium
Example: Hemoglobin A
Sequence data in Fasta format
Header line fields: marker (>), ”accession” (unique id), name, annotation (sometimes descriptive)
>Q3MIF5|Q3MIF5_HUMAN Hemoglobin, alpha 1 - Homo sapiens
MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLS
HGSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFK
LLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR
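
As a minimal sketch of how such a record can be read, the snippet below splits the header line into its fields and joins the sequence lines. It assumes the header layout of this particular example (accession, then "|", then name, then a space and free-text annotation); other databases format their headers differently, and the function name is hypothetical.

```python
def parse_fasta_record(text):
    """Split one FASTA record into header fields and the joined sequence."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    header, sequence_lines = lines[0], lines[1:]
    if not header.startswith(">"):
        raise ValueError("a FASTA record must start with the '>' marker")
    # Assumption: the header looks like '>ACCESSION|NAME free-text annotation'.
    identifiers, _, annotation = header[1:].partition(" ")
    accession, _, name = identifiers.partition("|")
    return {
        "accession": accession,
        "name": name,
        "annotation": annotation,
        "sequence": "".join(sequence_lines),
    }


record_text = """\
>Q3MIF5|Q3MIF5_HUMAN Hemoglobin, alpha 1 - Homo sapiens
MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLS
HGSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFK
LLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR
"""

record = parse_fasta_record(record_text)
print(record["accession"], record["name"], len(record["sequence"]))
```

Line breaks inside the sequence carry no meaning, which is why the sequence lines are simply concatenated.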
Common problems with sequence databases
• Redundancy:
  • Different sources (genome projects, directed sequencing)
  • ”Posttranslational editing”
  • Exactly the same protein, different species
• Actual manual errors:
  • bad annotation
  • bad sequencing
  • bad scientist!
• Automatic annotation: dangerous
  • ”Hemoglobin, alpha 1” in a plant?
  • Better? ”Similar to Q3MIF5_HUMAN Hemoglobin, alpha 1 - Homo sapiens”
  • ”Hypothetical protein”

UniProt: foremost protein DB
• www.uniprot.org
• International collaboration
• Stores
  • sequence data
  • ”non-redundant” data sets
  • cross references to other DBs
  • literature references
  • annotation about function etc.
• Powerful search methods
• User friendly

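The redundancy problem, where the same sequence arrives from several sources, can be illustrated with a short sketch that groups entries by exact sequence identity. This is only an assumption about the simplest possible case: it catches exact duplicates, while near-identical sequences would need real sequence comparison. The function name and the toy entries are hypothetical.

```python
from collections import defaultdict


def collapse_identical(records):
    """Group (identifier, sequence) pairs by exact sequence identity."""
    groups = defaultdict(list)
    for identifier, sequence in records:
        # Normalise case and surrounding whitespace so that trivial
        # formatting differences do not hide a duplicate.
        groups[sequence.strip().upper()].append(identifier)
    return dict(groups)


# Hypothetical toy entries; entry_A and entry_B differ only in letter case.
records = [
    ("entry_A", "MVLSPADKTNVKAAWG"),
    ("entry_B", "mvlspadktnvkaawg"),
    ("entry_C", "MVHLTPEEKSAVTALW"),
]

non_redundant = collapse_identical(records)
for sequence, identifiers in non_redundant.items():
    print(sequence, identifiers)
```

Each distinct sequence is kept once, together with every identifier it was submitted under, so no cross reference is lost when duplicates are merged.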

PDB: Protein Data Bank
• www.pdb.org
• Molecules with solved structure
• Today: 55 271 structures
• Structure details, sequence data, weight, known function, literature, chemical and biological knowledge
• Pretty pictures!

Nucleotide databases
GenBank, EMBL, DDBJ
• Journals require deposition
• Source for refined collections
Hutchison, Nucleic Acids Res 2007

NCBI
• Hub for data gathering
• Hosting
  • GenBank, ”nr”, ”nt”, etc
  • PubMed
  • genome projects
  • taxonomy knowledge
  • and more

Hemoglobin A

Genome analysis at Ensembl
• www.ensembl.org
• Stores, analyses, and presents chosen Eukaryotic genomes
• User friendly, yet advanced
• You find
  • gene and protein sets
  • species comparisons
  • data on genomic variation
• (not the best user interface)

Zoom from chromosome...
...down to gene
Gene for hemoglobin A

Summary
• Many different sources: Search actively for data!
• Some sources are vital: try them first
• Do not trust data!

Next time
• Comparing sequences
• Visualizing similarity
• Alignments
• Scoring systems