Introduction to Computational Thinking
Vicky Chen
Fundamental Theorem of Informatics
Friedman CP. J Am Med Inform Assoc 2009;16:169-170
What Informatics Is Not
Friedman CP. J Am Med Inform Assoc 2009;16:169-170
Computational Thinking
Computational thinking is a way of solving problems, designing systems, and understanding human behavior that draws on concepts fundamental to computer science. To flourish in today's world, computational thinking has to be a fundamental part of the way people think and understand the world.
http://www.cs.cmu.edu/~CompThink/
Computational Thinking
• Analyzing and logically organizing data
• Data modeling, data abstractions, and simulations
• Formulating problems so computers may assist
• Identifying, testing, and implementing possible solutions
• Automating solutions via algorithmic thinking
• Generalizing and applying this process to other problems
Algorithm
• A finite list of instructions that describe all required steps to perform a computation, written in general language
Programming Steps
• Specification
– What the code should do
• Design
– Pseudocode
• Implement
– Programming
• Test
– Debugging
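The four steps can be walked through on a very small task. The sketch below is only an illustration: the task (fraction of mutated entries) and the function name `mutated_fraction` are assumptions made for this example, not part of the original slides.

```python
# Specification: given a list of 0/1 mutation flags,
# return the fraction of entries that are 1.

# Design (pseudocode):
#   if the list is empty, return 0.0
#   otherwise return (number of 1s) / (length of the list)

# Implement: hypothetical helper written from the pseudocode above.
def mutated_fraction(flags):
    if not flags:
        return 0.0
    return sum(flags) / len(flags)

# Test: check the code against cases whose answers are known in advance.
assert mutated_fraction([1, 0, 0, 1]) == 0.5
assert mutated_fraction([]) == 0.0
print("all checks passed")
```

If a check fails, debugging means going back to the specification and design to see which step the code does not follow.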
Data Type / Data Structure
• Integer
• Floating point
• Boolean
• Character
• String
• List
• Dictionary
• Hash Table
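As a rough illustration, the basic types listed above look like this in Python. The variable names and most values are made up for the example; Python has no separate character type, so a character is simply a string of length one.

```python
# Illustrative values only; the names are hypothetical.
sample_count = 779        # integer
mutation_rate = 0.05      # floating point (made-up number)
is_mutated = True         # Boolean
base = "A"                # character (a 1-character string in Python)
gene_symbol = "TP53"      # string

# Print each value next to the name of its type.
for value in (sample_count, mutation_rate, is_mutated, base, gene_symbol):
    print(value, type(value).__name__)
```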
Data Types
List
Dictionary / Hash Table
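A minimal sketch of the two container types, assuming Python: a list is ordered and indexed by position, while a dictionary (Python's built-in hash table) is looked up by key. The variable names are chosen for illustration.

```python
# A list keeps items in order and is indexed by position, starting at 0.
genes = ["TP53", "KRAS", "EGFR"]
genes.append("BRCA1")          # add an item to the end
first_gene = genes[0]

# A dictionary maps keys to values; lookups go by key, not by position.
chromosome_of = {"TP53": "17", "KRAS": "12"}
chromosome_of["EGFR"] = "7"    # add a new key/value pair

print(first_gene, len(genes))
print(chromosome_of["KRAS"])
print("BRCA1" in chromosome_of)  # membership test on the keys
```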
Exercise 1
We have a matrix with mutation information for different tumor samples.
How can this data be represented?
List of Lists
• Data is a sparse matrix
• Stores a lot of extra uninformative information
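A toy version of the list-of-lists representation shows why sparsity is a problem. The sample names, gene names, and values below are invented for the example; rows are assumed to be samples and columns genes, with 1 meaning "mutated".

```python
# Toy data, not taken from the real mutation matrix.
genes = ["TP53", "KRAS", "EGFR", "BRCA1"]
samples = ["sample_1", "sample_2", "sample_3"]

# One inner list per sample, one entry per gene.
matrix = [
    [1, 0, 0, 0],   # sample_1
    [0, 0, 0, 0],   # sample_2
    [0, 1, 0, 0],   # sample_3
]

# Most cells are 0, so most of the stored values carry no information.
total_cells = len(matrix) * len(genes)
informative = sum(sum(row) for row in matrix)
print(f"{informative} of {total_cells} cells record a mutation")

# A lookup needs two positions: the row of the sample and the column of the gene.
row = samples.index("sample_3")
col = genes.index("KRAS")
print(matrix[row][col])
```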
Dictionary
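One possible dictionary layout, sketched under the assumption that each sample name maps to the set of genes mutated in it. Only the mutations are stored, so the zeros of a sparse matrix take up no space. The data and the helper name `has_mutation` are hypothetical.

```python
# Toy data: sample name -> set of mutated genes.
mutations = {
    "sample_1": {"TP53"},
    "sample_3": {"KRAS"},
}

def has_mutation(sample, gene):
    """Return True if the gene is recorded as mutated in the sample."""
    # Samples with no recorded mutations are simply absent from the dictionary.
    return gene in mutations.get(sample, set())

print(has_mutation("sample_1", "TP53"))
print(has_mutation("sample_2", "TP53"))
```

Other layouts are equally reasonable, for example mapping each gene to the set of samples that carry it; which one is better depends on the question being asked most often.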
Opening Files
• Mutation matrix contains data on 2337 genes and 779 samples
• Inputting data by hand is not feasible
• Data usually read in and processed from files
Opening Files
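A minimal sketch of reading a file in Python. The file name, the tab-separated layout, and the contents are assumptions; the example first writes a tiny file to a temporary directory so that it can run anywhere.

```python
import os
import tempfile

# Create a small tab-separated file (assumed layout: a header row of gene
# names, then one row per sample) so the example is self-contained.
content = "sample\tTP53\tKRAS\nsample_1\t1\t0\nsample_2\t0\t1\n"
path = os.path.join(tempfile.mkdtemp(), "mutations.tsv")
with open(path, "w") as handle:
    handle.write(content)

# Open the file for reading and process it one line at a time.
rows = []
with open(path) as handle:
    for line in handle:
        rows.append(line.rstrip("\n").split("\t"))

header, data = rows[0], rows[1:]
print(header)
print(data)
```

The `with` statement closes the file automatically once the indented block finishes, even if an error occurs while reading.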
Input and print
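A small sketch of reading a value from the user and printing a result, assuming Python's built-in `input` and `print`. The function name `ask_for_gene` and its `read` parameter are hypothetical; the parameter exists only so that the example can run without someone typing at the keyboard.

```python
def ask_for_gene(read=input):
    """Prompt for a gene symbol and return it trimmed and in upper case."""
    answer = read("Gene of interest: ")
    return answer.strip().upper()

# In an interactive session, call ask_for_gene() with no arguments.
# Here a stand-in reader supplies the answer so the example runs unattended.
gene = ask_for_gene(read=lambda prompt: " tp53 ")
print("You asked about", gene)
```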
For Loops
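A for loop repeats a block of code once for each item in a collection. The sketch below assumes Python and uses made-up data.

```python
genes = ["TP53", "KRAS", "EGFR"]

# Visit each item of a list in order.
for gene in genes:
    print(gene)

# enumerate() also supplies a running position.
for position, gene in enumerate(genes, start=1):
    print(position, gene)

# Loop over the key/value pairs of a dictionary and accumulate a total.
mutation_counts = {"sample_1": 2, "sample_2": 0, "sample_3": 1}
total = 0
for sample, count in mutation_counts.items():
    total += count
print("total mutations:", total)
```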
While Loops
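A while loop repeats for as long as its condition stays true, which suits cases where the number of repetitions is not known in advance. A minimal sketch in Python with invented sample names:

```python
remaining = ["sample_1", "sample_2", "sample_3"]
processed = []

# An empty list counts as False, so the loop stops when nothing is left.
while remaining:
    sample = remaining.pop(0)   # take the first item off the list
    processed.append(sample)

print(processed)
print(remaining)
```

The body must eventually make the condition false (here, by shrinking the list); otherwise the loop never ends.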
Conditional Statements
• If, else if, else
• and
• or
• not
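In Python, "else if" is written `elif`. The sketch below combines the three branches with `and`, `or`, and `not`; the function name `describe` and its arguments are made up for the example.

```python
def describe(tp53_mutated, kras_mutated):
    """Summarise the mutation status of two genes in one sample."""
    if tp53_mutated and kras_mutated:
        return "both mutated"
    elif not (tp53_mutated or kras_mutated):
        return "neither mutated"
    else:
        return "exactly one mutated"

print(describe(True, True))
print(describe(False, False))
print(describe(True, False))
```

The branches are tested from top to bottom, and only the first one whose condition is true runs.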
Exercise 2
We have a dictionary that contains tumor sample mutation information.
We want to print out a list of tumor samples after receiving a mutated gene of interest from the user.
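One possible solution sketch, assuming the dictionary maps each sample name to the set of genes mutated in that sample. The data and the function name `samples_with_gene` are hypothetical, and the gene is fixed in the code so the example runs unattended.

```python
# Toy data: sample name -> set of mutated genes.
mutations = {
    "sample_1": {"TP53", "KRAS"},
    "sample_2": {"EGFR"},
    "sample_3": {"TP53"},
}

def samples_with_gene(mutations, gene):
    """Return the sorted names of all samples in which the gene is mutated."""
    return sorted(sample for sample, genes in mutations.items() if gene in genes)

# Interactively this would be: gene = input("Gene of interest: ").strip().upper()
gene = "TP53"
for sample in samples_with_gene(mutations, gene):
    print(sample)
```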
Opening Files Revisited
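One way to combine file reading with the dictionary representation from Exercise 1 is sketched below. The file layout (a tab-separated header of gene names followed by one row of 0/1 flags per sample), the file name, and the data are all assumptions made for this example.

```python
import os
import tempfile

# Write a toy matrix file so the example is self-contained.
content = (
    "sample\tTP53\tKRAS\tEGFR\n"
    "sample_1\t1\t1\t0\n"
    "sample_2\t0\t0\t0\n"
    "sample_3\t0\t0\t1\n"
)
path = os.path.join(tempfile.mkdtemp(), "matrix.tsv")
with open(path, "w") as handle:
    handle.write(content)

mutations = {}
with open(path) as handle:
    # The first line holds the gene names; skip the leading "sample" label.
    genes = next(handle).rstrip("\n").split("\t")[1:]
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        sample, flags = fields[0], fields[1:]
        mutated = {gene for gene, flag in zip(genes, flags) if flag == "1"}
        if mutated:                 # store only samples with a mutation
            mutations[sample] = mutated

print(mutations)
```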
Data Extraction from Files
• Many files will contain extra information
• Focus on extracting only pertinent data
• Applicable to many types of data
– Natural language documents (e.g. articles)
– Sequence data (e.g. FASTA files)
– Files from databases (e.g. NCBI Gene, TCGA)
– Etc.
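As a small illustration of keeping only the pertinent data, the sketch below pulls record identifiers and sequences out of FASTA-formatted text and ignores the free-text descriptions. The records are made up, and the text is held in a string rather than a file to keep the example short.

```python
# Invented FASTA text: header lines start with ">", sequence lines follow.
fasta_text = """>seq1 hypothetical example record
ACGTAC
GTTA
>seq2 another made-up record
GGGCCC
"""

sequences = {}
current_id = None
for line in fasta_text.splitlines():
    line = line.strip()
    if not line:
        continue                              # skip blank lines
    if line.startswith(">"):
        # Keep only the identifier (first word); drop the description.
        current_id = line[1:].split()[0]
        sequences[current_id] = ""
    elif current_id is not None:
        sequences[current_id] += line         # join multi-line sequences

print(sequences)
```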
Regular Expressions
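A regular expression describes a text pattern, which makes it useful for pulling specific items out of longer text. The sketch below uses Python's `re` module on an invented sentence; the pattern for gene-like tokens (a capital letter followed by capitals or digits) is a simplifying assumption and would not match every real gene symbol.

```python
import re

text = "Mutations were found in TP53 and KRAS but not in egfr."

# \b marks a word boundary; the pattern needs at least two characters,
# all of them capital letters or digits, starting with a capital letter.
gene_like = re.findall(r"\b[A-Z][A-Z0-9]+\b", text)
print(gene_like)

# Parentheses capture part of a match; group(1) returns the captured text.
match = re.search(r"found in (\w+)", text)
first_reported = match.group(1) if match else None
print(first_reported)
```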
Reusing Code
• Some code can be useful in multiple situations
• It is possible to just rewrite (or copy) the code each time
– Less efficient
– Multiple locations to fix when debugging
Functions
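A function packages code under a name so it can be reused, and fixed in one place, instead of being copied. The sketch below is an assumption-laden example: the function `parse_row` and its default tab delimiter are invented for illustration.

```python
def parse_row(line, delimiter="\t"):
    """Split one delimited line into a sample name and a list of integer flags."""
    fields = line.rstrip("\n").split(delimiter)
    return fields[0], [int(value) for value in fields[1:]]

# The same function handles different inputs without being rewritten.
name, flags = parse_row("sample_1\t1\t0\t1\n")
other_name, other_flags = parse_row("sample_2,0,0", delimiter=",")

print(name, flags)
print(other_name, other_flags)
```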
Exercise 3
We have a document containing human gene information downloaded from NCBI.
We want to extract and store the Ensembl ID of each gene with its corresponding gene symbol.
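One possible approach is sketched below. It assumes a tab-separated layout in which the third column holds the gene symbol and the sixth column holds cross-references separated by "|", with Ensembl entries written as `Ensembl:ENSG...`. The rows, gene symbols, and identifiers are made-up placeholders, and the real download should be checked against these assumptions before reuse.

```python
import re

# Placeholder rows in the assumed layout; none of the values are real records.
gene_info_text = (
    "#tax_id\tGeneID\tSymbol\tLocusTag\tSynonyms\tdbXrefs\n"
    "9606\t1\tGENE_A\t-\t-\tHGNC:HGNC:1|Ensembl:ENSG00000000001\n"
    "9606\t2\tGENE_B\t-\t-\tHGNC:HGNC:2\n"
    "9606\t3\tGENE_C\t-\t-\tEnsembl:ENSG00000000003|MIM:000000\n"
)

ENSEMBL_PATTERN = re.compile(r"Ensembl:(ENSG\d+)")

def ensembl_to_symbol(text):
    """Map each Ensembl gene ID found in the text to its gene symbol."""
    mapping = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue                          # skip blank and header lines
        fields = line.split("\t")
        symbol, xrefs = fields[2], fields[5]
        match = ENSEMBL_PATTERN.search(xrefs)
        if match:                             # some genes have no Ensembl ID
            mapping[match.group(1)] = symbol
    return mapping

print(ensembl_to_symbol(gene_info_text))
```

For a real file, the same function could be applied to the text returned by reading the file, or the loop could iterate over the open file handle directly.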