California State University, Chico
Department of Computer Science
Syllabus Spring 2015 – CSCI 598 Advanced Topics in Computer Science
Bioinformatics: Computational Methods for Next Generation Sequencing Data Analysis

Course Title: Advanced Topics in Computer Science. Bioinformatics: Computational Methods for Next Generation Sequencing Data Analysis

Course Description: An introduction to computational methods for Next Generation Sequencing data analysis. Topics include mapping sequenced reads back to a reference genome; approximate string matching; intro to biostatistics: probability distributions, hypothesis testing; identification of SNPs (single-nucleotide polymorphisms); analysis of RNA-Seq data: mapping RNA-Seq reads, identification of splice junctions, analyzing gene expression; analysis of methylation: mapping bisulfite-treated reads, estimation of methylation level, genome-wide associative analysis of methylation and gene expression.

Prerequisites: CSCI-311

Course Times and Location: TR 11:00am – 12:15pm in OCNL 254

Instructor: Dr. Elena Harris
E-mail: eyharris@csuchico.edu
Website: http://www.cs.ucr.edu/~elenah/
Office Phone: (530) 898-4304
Office: OCNL 221
Office Hours: MW 1:00pm-2:00pm and 3:00pm-3:45pm; TR 12:30pm-1:30pm; or by appointment

Recommended Textbook: An Introduction to Bioinformatics Algorithms (Computational Molecular Biology), First Edition, by Neil C. Jones and Pavel A. Pevzner, Copyright 2004 by MIT. ISBN-10: 0262101068

Course objectives: The goal of this course is to introduce students to the basic algorithms and techniques used to analyze Next Generation Sequencing (NGS) data. A major component of this course is collaboration between Computer Science students and Biology students.
After completion of this course, Computer Science students will be able:
- To program basic algorithms for NGS data, including but not limited to:
    - Pattern search (mapping sequenced reads back to a reference genome)
    - Local approximate string matching
- To analyze NGS data in order to identify single-nucleotide polymorphisms
- To use state-of-the-art tools for NGS data analysis:
    - BRAT-bw for methylation analysis
    - TopHat for gene expression analysis
- To apply biostatistics to carry out genome-wide associative analysis
- To work efficiently with biologists to:
    - Discuss biological problems
    - Choose the appropriate tools and/or techniques for solving problems
    - Convey the results and discuss the biological importance of the findings

Attendance and Deportment Policy: Lecture attendance is mandatory. A positive, professional attitude and interaction are mandatory. You are expected to participate actively during lectures and discussions: ask questions, answer questions when you are called on, and participate in group work. You are required to arrive to class on time.

Course Assignments: There will be programming or data analysis assignments on a weekly basis. Some of these will be joint assignments with students from the Biology Department. The course will culminate in a joint project between CSCI and BIOL students.

Policy on Turning in Homework Assignments: Due dates are firm. No late assignments will be accepted unless serious illness or other excused absences merit allowances in the judgment of the instructor.

Exams: There will be no exams.

Group work: Group work (including collaborative work with BIOL students) will be offered as needed to improve your understanding of the material presented and to develop your teamwork skills. There will be no make-up for missed group work (no exceptions).
Grade Evaluation Procedures: Students will be graded based on their performance in the following course components (grades will NOT be curved):

Programming or data analysis assignments (7-11)   50%
Group work (as needed)                            20%
Project (one)                                     30%

Final Grades: Final grades will be expressed as a percentage of the maximum possible score of all evaluated materials. I will round up decimal points to the nearest integer. Final grades will not be curved. Letter grades will be given according to the following:

Scale (inclusive)   Letter Grade   University Definition
93-100              A              Superior Work
90-92               A-             Superior Work
87-89               B+             Very Good Work
83-86               B              Very Good Work
80-82               B-             Very Good Work
77-79               C+             Adequate Work
73-76               C              Adequate Work
70-72               C-             Adequate Work
67-69               D+             Minimally Acceptable Work
60-66               D              Minimally Acceptable Work
0-59                F              Unacceptable

Plagiarism/Cheating Policy: Students who violate the University's policy on academic honesty and integrity will be reported to Campus Student Judicial Affairs. Such violations include copying other students' work.

University Policies and Campus Resources

Academic integrity
Students are expected to be familiar with the University's Academic Integrity Policy. Your own commitment to learning, as evidenced by your enrollment at California State University, Chico, and the University's Academic Integrity Policy require you to be honest in all your academic course work. Faculty members are required to report all infractions to the Office of Student Judicial Affairs. The policy on academic integrity and other resources related to student conduct can be found at: http://www.csuchico.edu/sjd/integrity.shtml

Campus Policy in Compliance with the Americans with Disabilities Act
If you need course adaptations or accommodations because of a disability, or if you need to make special arrangements in case the building must be evacuated, please make an appointment with me as soon as possible, or see me during office hours.
Students with disabilities requesting accommodations must register with the DSS Office (Disability Support Services) to establish a record of their disability. Special accommodations for exams require ample notice to the testing office and must be submitted to the instructor well in advance of the exam date.

Student Computing
Computer labs for student use (http://www.csuchico.edu/stcp) are located on the 1st floor of the Merriam Library, Rm 116 and 450; Tehama Hall, Rm 131; and the BMU, Rm 301.

Student Services
Student services are designed to assist students in the development of their full academic potential and to motivate them to become self-directed learners. Students can find support for services such as skills assessment, individual or group tutorials, subject advising, learning assistance, summer academic preparation, and basic skills development. Student services information can be found at: http://www.csuchico.edu/5.-studentservices.html.

Disability Services
If you feel you may need an accommodation based on the impact of a disability, please contact me privately to discuss your specific needs. Please also contact the Disability Support Services office, which coordinates reasonable accommodations for students with documented disabilities. Disability Support Services online: http://www.csuchico.edu/dss/studentServices/.

Student Learning Center
The mission of the Student Learning Center (SLC) is to provide services that help CSU, Chico students become independent learners. The SLC prepares and supports students in their college course work by offering a variety of programs and resources to meet student needs. The SLC facilitates the academic transition and retention of students from high schools and community colleges by providing study strategy information, content subject tutoring, and supplemental instruction. The SLC is online at http://www.csuchico.edu/slc/.
The University Writing Center has been combined with the Student Learning Center.

Tentative Schedule of Topics

Week#  Date                      Topics covered
1      Jan 20, Jan 22            Biology background: DNA, gene structure (exons and introns), ORFs, transcription and translation. Next Generation Sequencing technology.
2      Jan 27 (HW1), Jan 29      Pattern search problem (string matching problem). Hash tables for exact read mapping.
3      Feb 3 (HW2), Feb 5        Approximate string matching (mismatches). Approximate string matching (indels). Smith-Waterman algorithm for local sequence alignment.
4      Feb 10 (HW3), Feb 12      Biostatistics: Intro to probability. Probability distributions (binomial, hypergeometric, normal). Intro to data analysis. Hypothesis testing.
5      Feb 17 (HW4), Feb 19      Identification of SNPs. Associative analysis of indels and disease.
6      Feb 24 (HW5), Feb 26      RNA-Seq. Alignment of RNA-Seq reads to transcriptome. Updating hash table with sequence ID. Ambiguous reads.
7      March 3 (HW6), March 5    RNA-Seq. Identification of splice junctions. TopHat.
8      March 10 (HW7), March 12  Analyzing RNA-Seq data: identifying differential gene expression.
-      March 17, March 19        Spring Break
9      March 24, March 26        Short coding ORFs from RNA-Seq data.
10     March 31 (HW8), Apr 2     Epigenomics: Methylation. Mapping bisulfite-treated reads.
11     Apr 7 (HW9), Apr 9        Methylation level. Differentially methylated regions (DMRs). Hypomethylated regions (HMRs).
12     Apr 14 (HW10), Apr 16     Associative analysis of methylation level and gene expression.
13     Apr 21 (HW11), Apr 23     microRNA mapping. Associative analysis of miRNA and mRNA.
14     Apr 28, Apr 30            Project Presentations
15     May 5, May 7              Project Presentations
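
As a small illustration of the Week 2 topic, hash tables for exact read mapping, the sketch below indexes every length-k substring of a reference sequence and looks reads up in that index. It is not part of the course materials: the function names build_index and map_read, the toy reference, and the toy reads are all hypothetical, and the sketch assumes every read has the same length k as the index.

```python
from collections import defaultdict


def build_index(reference: str, k: int) -> dict[str, list[int]]:
    """Map every length-k substring of the reference to its 0-based start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read: str, index: dict[str, list[int]]) -> list[int]:
    """Return all reference positions where the read matches exactly (empty if none)."""
    # .get avoids inserting an empty list into the defaultdict for unmapped reads.
    return index.get(read, [])


reference = "ACGTACGTTAGC"        # toy reference sequence (made up for illustration)
reads = ["ACGT", "TAGC", "GGGG"]  # toy reads, all of length k = 4

index = build_index(reference, k=4)
for read in reads:
    print(read, map_read(read, index))
```

A read that maps to more than one position, such as "ACGT" here, is an example of the ambiguous reads listed under Week 6; a read with no entry in the index is unmapped under exact matching.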