


CSC 450 Algorithms in Bioinformatics

Time and place Fall 2017
Tuesday and Thursday, 3:30-4:45pm
Biological Sciences West 208
Dates
August 22   First class
November 23   Thanksgiving holiday
December 5   Last class
Instructor John Kececioglu
kece@cs.arizona.edu
Gould-Simpson 727
(520) 621-4526
Assistant Eric Welch
welche@email.arizona.edu
Gould-Simpson 934
Office hours
Monday, 3:30-4:30pm   John, GS 727
Wednesday, 10-11am   John, GS 727
Wednesday, 1:45-2:45pm   Eric, GS 934
and by appointment
Web
Course homepage   www.cs.arizona.edu/classes/cs450/fall17/
Instructor homepage   www.cs.arizona.edu/people/kece/
Course description This course introduces fundamental results in discrete algorithms for combinatorial problems in bioinformatics and computational biology. The emphasis is on realistic models of computational problems that arise in the analysis of biological data, and practical algorithms for their solution. The content has depth in the area of biological sequence analysis, and breadth in areas such as DNA sequence assembly, evolutionary tree reconstruction, and genome rearrangement analysis. Grades are based on homeworks, exams, programming projects, and a class presentation.
Prerequisites CSC 345 or permission of instructor
Learning outcomes Students successfully completing the course will learn fundamental algorithms that are widely used in bioinformatics, and will have experience in designing new algorithms for bioinformatics and computational biology applications. The impact of algorithm design on the performance of software in terms of running time and space usage will also be demonstrated through algorithm implementation and evaluation. Skill in algorithm design for bioinformatics will be acquired through problem solving in homeworks and exams; skill in algorithm implementation and evaluation will be gained through programming projects and computational experiments.
Teaching format The course is taught in a lecture-only format, with lectures given by the instructor. Homeworks, in-class exams, and programming projects all involve individual work, while research projects involve group work.
Attendance Attendance at lectures is required. Roll is taken at the start of class with a sign-in sheet; if your signature is not on the roll sheet, it counts as an absence. Students who miss class due to illness or emergency are required to bring documentation from their healthcare provider or other relevant, professional third parties. Failure to submit third-party documentation will result in unexcused absences.

The attendance policy is that you are allowed one unexcused absence in the first week, two unexcused absences in the first four weeks, and four unexcused absences in the first eight weeks. On exceeding these limits, you may be dropped from class.

The UA policy concerning class attendance, participation, and administrative drops is available here. Absences for any sincerely held religious belief, observance, or practice will be accommodated where reasonable; the UA policy is available here. Absences preapproved by the UA Dean of Students (or dean’s designee) will be honored (see the policy here).

Online communication Online communication outside the lectures will be via postings on the announcements section at the bottom of this page, and by email with the instructor at the address given above.
Required text Wing-Kin Sung,
Algorithms in Bioinformatics: A Practical Introduction,
Chapman and Hall, Boca Raton, Florida, 2010.
Optional texts Wing-Kin Sung,
Algorithms for Next Generation Sequencing,
Chapman and Hall, Boca Raton, Florida, 2017.

Veli Mäkinen, Djamal Belazzougui, Fabio Cunial, and Alexandru Tomescu,
Genome-Scale Algorithm Design: Biological Sequence Analysis in the Era of High-Throughput Sequencing,
Cambridge University Press, Cambridge, United Kingdom, 2015.
Related books The following books are also relevant.
Homeworks There are three homework assignments, which emphasize algorithm design and analysis.
Exams There are two exams: a midterm and a final. The final is comprehensive. (The UA final exam regulations are here, and the UA final exam schedule is here.)
Projects There are two course projects: a programming project that emphasizes algorithm implementation, and a research project, which is a paper that either surveys the literature on a topic outside the lectures or presents work toward developing a new result in bioinformatics.
Grading

Course grades use the regular scale of A, B, C, D, and E. Grades are determined from a weighted average of the percentage of points earned on homeworks, exams, and projects.

30%   Homeworks
20%   Midterm
20%   Final
15%   Programming project
15%   Research project

Attaining a weighted score of at least 90% guarantees an A in the course, at least 80% guarantees a B, at least 70% guarantees a C, and at least 60% guarantees a D. These thresholds may be lowered, but will not be raised.
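As an illustration of the weighted average and guaranteed thresholds described above, here is a minimal sketch in Python. The names `WEIGHTS`, `THRESHOLDS`, `weighted_score`, and `guaranteed_grade` are hypothetical, as are the sample percentages; the sketch only reports the grade guaranteed by the published thresholds, and since those thresholds may be lowered, the actual grade may be higher.

```python
# Weights from the grading table above, as integer percentages.
WEIGHTS = {
    "homeworks": 30,
    "midterm": 20,
    "final": 20,
    "programming_project": 15,
    "research_project": 15,
}

# Guaranteed cutoffs; the syllabus says these may be lowered but not raised.
THRESHOLDS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def weighted_score(percentages):
    """Return the weighted average of per-component percentages (0-100)."""
    return sum(WEIGHTS[name] * percentages[name] for name in WEIGHTS) / 100


def guaranteed_grade(score):
    """Return the letter grade guaranteed by the published cutoffs."""
    for cutoff, letter in THRESHOLDS:
        if score >= cutoff:
            return letter
    return "E"


if __name__ == "__main__":
    # Hypothetical student, for illustration only.
    sample = {
        "homeworks": 85,
        "midterm": 78,
        "final": 90,
        "programming_project": 95,
        "research_project": 88,
    }
    score = weighted_score(sample)
    print(score, guaranteed_grade(score))
```

Keeping the weights as integer percentages and dividing once at the end avoids rounding surprises for scores that sit exactly on a cutoff.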

The awarding of points on homework and exam questions is roughly according to the following scheme. Having the right solution idea and the right technical execution of this idea earns at least 90%. (Only a perfect write-up that cannot be improved earns 100%.) Having the right idea but with errors in its execution is at least 80%. Having the wrong idea and errors in its execution, but demonstrating comprehension of the material, is at least 70%. Having the wrong idea, errors in execution, and deficiencies in comprehension, is roughly 60%. Work that shows no understanding is roughly 50%.

On homework and exam questions, writing an answer that relates to the question guarantees at least 50% of the points for the question (which is a failing percentage). No points are awarded for writing nothing.

Homeworks may contain bonus problems. Points earned on bonus questions are not added to points for required questions. Bonus points are tallied separately at the end of the semester, and considered subjectively as a measure of effort when deciding whether to move up a student who is near a letter-grade boundary.

On homework, very-high-level ideas can be discussed with friends, but solutions must represent individual work and must be written up separately. Use of solutions from previous offerings of the course is not permitted. Any material from the Internet that is used in a solution must be cited by its URL; to not cite it is plagiarism, which is considered cheating. (For more information, see the Department of Computer Science Course Policy on Collaboration and the University Code of Academic Integrity.)

When turning in solutions to homeworks, write only on one side of the paper, start each problem on a separate page, put the problems in the correct order, and staple the pages together. Neatness, and especially conciseness, is required to earn the highest marks. If a problem eludes solution, state this up front and write down only what you know to be correct. Rambling at length about attempts that didn't succeed may result in more points being deducted.

Requests for regrading homeworks or exams must be submitted within one week of receiving the graded homework or exam.

Homework that is late (submitted after the due date and time) is not accepted.

Requests for an incomplete (I) or a withdrawal (W) must be made in accordance with university policies.

Honors credit Students who wish to take the course for honors credit should email the instructor to make an appointment to discuss the terms and to sign the honors course request form.
Course topics We cover the topics of the course in two parts. We begin with the classic area of sequence analysis, focusing on exact and inexact string matching. We then move on to higher-level analysis of genomic data, including sequence assembly, evolutionary tree reconstruction, and genome rearrangement.

I   Biological sequence analysis

  • String matching: Exact matching of sequences using suffix trees, suffix arrays, and the Burrows-Wheeler transform.
  • Sequence alignment: Inexact matching of sequences, local and global alignment, sequence database searching, affine gap costs, convex gap costs, multiple sequence alignment, inverse parametric alignment, and genome alignment.

II   Higher-level analysis

  • Sequence assembly: Reconstructing the DNA sequence of a genome from next-generation sequencing data; mapping sequencing reads to a reference genome.
  • Evolutionary tree reconstruction: Reconstructing evolutionary trees by compatibility-, parsimony-, maximum likelihood-, and distance-based methods, using sequence or gene-order data.
  • Genome rearrangement: Computing evolutionary distances between genomes in terms of inversion, translocation, and transposition events.
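
To give a flavor of the exact string matching topic in Part I, here is a minimal sketch in Python of pattern search with a suffix array. It is not taken from the course materials: the function names `build_suffix_array` and `find_occurrences` are hypothetical, and the construction simply sorts the suffixes, which is far slower than the linear-time construction listed in the weekly schedule.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text, in sorted order.

    Naive construction by sorting the suffixes directly; simple, but much
    slower than a linear-time construction.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Return the sorted start positions where pattern occurs in text."""
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        if text[start:start + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        if text[start:start + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # Suffixes in suffix_array[first:lo] all begin with pattern.
    return sorted(suffix_array[first:lo])


if __name__ == "__main__":
    genome = "ACGTACGA"
    sa = build_suffix_array(genome)
    print(find_occurrences(genome, sa, "ACG"))
```

Because the suffixes are sorted, all suffixes that begin with the pattern occupy one contiguous range of the array, so two binary searches locate every occurrence.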
Weekly schedule We cover these topics in the following weekly schedule.

August 22, August 24   Introduction and biology background
August 29, August 31   Suffix arrays and their use in string matching
September 5, September 7 (Homework 1 out)   Suffix array construction in linear time
September 12, September 14   Height array construction
September 19, September 21 (Homework 1 due, Homework 2 out)   Burrows-Wheeler transform and Ferragina-Manzini index
September 26, September 28 (Homework 1 back)   Applications of suffix arrays
October 3, October 5 (Homework 2 due)   Applications continued; homework solutions discussed
October 10 (Homework 2 back), October 12 (Midterm exam)   Exam review; in-class midterm exam
October 17, October 19 (Midterm exam back, Homework 3 out)   Sequence alignment, and equivalence with shortest paths
October 24, October 26   Linear space alignment
October 31, November 2   Multiple sequence alignment
November 7 (Homework 3 due, Programming Project out), November 9   Sequence assembly
November 14 (Research Project out, Homework 3 back), November 16   Evolutionary tree reconstruction
November 21, November 23 (Thanksgiving holiday)   Genome rearrangement
November 28, November 30 (Programming Project due)   Additional lectures
December 5 (Research Project due, Programming Project back)   Additional lecture
Code of conduct

The Department of Computer Science is committed to providing and maintaining a supportive educational environment for all. We strive to be welcoming and inclusive, respect privacy and confidentiality, behave respectfully and courteously, and practice intellectual honesty. Disruptive behaviors (such as physical or emotional harassment, dismissive attitudes, and abuse of department resources) will not be tolerated. The complete Code of Conduct is available on our Department website. We expect you to adhere to this code, as well as the UA Student Code of Conduct, while you are a member of this course.

Classroom behavior

To foster a positive learning environment, students and instructors have a shared responsibility. We want a safe, welcoming, and inclusive environment where all of us feel comfortable with each other and where we can challenge ourselves to succeed. To that end, our focus is on the tasks at hand and not on extraneous activities (such as texting, chatting, reading a newspaper, making phone calls, web surfing, and so on).

Inclusive Excellence is a fundamental part of the University of Arizona’s strategic plan and culture. As part of this initiative, the institution embraces and practices diversity and inclusiveness. These values are expected, respected, and welcomed in this course.

Students are asked to refrain from disruptive conversations with people sitting around them during lecture. Students observed engaging in disruptive activity will be asked to cease this behavior. Those who continue to disrupt the class will be asked to leave lecture or discussion and may be reported to the Dean of Students.

Some learning styles are best served by using personal electronics, such as laptops and iPads. These devices can be distracting to other learners. Therefore, students who prefer to use electronic devices for note-taking during lecture should use one side of the classroom.

Threatening behavior

The UA Threatening Behavior by Students Policy prohibits threats of physical harm to any member of the University community, including to oneself.

Preferred name and pronoun

It is already UA policy that class rosters are provided to instructors with a student’s preferred name. Students may share their preferred name and pronoun with members of the teaching staff and fellow students, as desired, and these gender identities and gender expressions will be honored in this course. As the course includes group work and in-class discussion, it is critical to create an educational environment of inclusion and mutual respect. In this class, to be inclusive of all gender identities and expressions, students will be referred to by their first or last names, the pronoun of their choice, or by default, the pronoun “they”.

Accessibility, accommodations

Our goal in this classroom is that learning experiences be as accessible as possible. If you anticipate or experience physical or academic barriers based on disability, please let the instructor know immediately to discuss options. You are also welcome to contact the Disability Resource Center (520-621-3268) to establish reasonable accommodations.

If you have reasonable accommodations, please plan to meet with the instructor by appointment or during office hours to discuss accommodations and how course requirements and activities may impact your ability to fully participate.

Please be aware that the accessible table and chairs in this room should remain available for students who find that standard classroom seating is not usable.

Academic integrity

Students are encouraged to share intellectual views and discuss freely the principles and applications of course materials. However, graded work and exercises must be the product of independent effort unless otherwise instructed. Students are expected to adhere to the UA Code of Academic Integrity as described in the UA General Catalog.

The University Libraries have excellent tips for avoiding plagiarism, available here.

Selling class notes or other course materials to other students or to a third party for resale is not permitted without the instructor’s express written consent. Violations of this and other course rules are subject to the Code of Academic Integrity and may result in course sanctions. Additionally, students who use D2L or UA e-mail to sell or buy these copyrighted materials are subject to Code of Conduct Violations for misuse of student e-mail addresses. This conduct may also constitute copyright infringement.

Nondiscrimination, anti-harassment

The University is committed to creating and maintaining an environment free of discrimination and harassment (see the policy here). Our classroom is a place where everyone is encouraged to express well-formed opinions and their reasons for those opinions. We also want to create a tolerant and open environment where such opinions can be expressed without resorting to bullying or discrimination against others.

Additional resources

UA academic policies and procedures may be found here. Student assistance and advocacy information is available here. Campus health information may be found here. OASIS sexual assault and trauma services are available here.

Confidentiality

Information on confidentiality of student records is here.

Subject to change

Information contained in the course syllabus, other than the grade and absence policy, may be subject to change with advance notice, as deemed appropriate by the instructor.

Distributions Grade distributions are available here.
Announcements

The grade distribution for the overall course score is posted above.




http://www.cs.arizona.edu/classes/cs450/fall17/
John Kececioglu (kece@cs.arizona.edu)
14 December 2017