Perl Exercises
read_and_write
Read from an input file and write its content to an output file. Assume the two filenames are provided on the command line.
Hint: you need to know about filehandles before attempting this exercise.
$ cat input.txt
George Washington
John Adams
Thomas Jefferson
James Madison
James Monroe
$ perl read_and_write.pl input.txt output.txt
$ cat output.txt
George Washington
John Adams
Thomas Jefferson
James Madison
James Monroe
See a sample answer here
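One possible approach, given here only as a minimal sketch: open both files with the three-argument form of open and copy the input line by line. The helper name copy_file is hypothetical and not part of the exercise.

```perl
use strict;
use warnings;

# copy_file is a hypothetical helper name chosen for this sketch.
sub copy_file {
    my ($in_path, $out_path) = @_;
    open(my $in,  '<', $in_path)  or die "Cannot open $in_path: $!";
    open(my $out, '>', $out_path) or die "Cannot open $out_path: $!";
    while (my $line = <$in>) {
        print {$out} $line;          # each line still carries its newline
    }
    close($in);
    close($out) or die "Cannot close $out_path: $!";
    return;
}

# Run the copy only when two filenames are given on the command line.
copy_file(@ARGV) if @ARGV == 2;
```

Checking the return value of open and close is a design choice here: it makes a missing input file or an unwritable output path fail loudly instead of silently producing an empty result.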
count_first_name
Read an input file that contains several names (one name per line, with the first name and the last name separated by a space). Count the number of times each first name appears in the input. Print the results to STDOUT, sorted alphabetically by name.
Hint: you need to know about filehandles, arrays, hashes, and sorting before attempting this exercise.
$ cat input.txt
George Washington
John Adams
Thomas Jefferson
James Madison
James Monroe
$ perl count_first_name.pl input.txt
George appeared 1 times
James appeared 2 times
John appeared 1 times
Thomas appeared 1 times
See a sample answer here
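A minimal sketch of one way to do the counting, assuming the first whitespace-separated field of each line is the first name. The helper name count_first_names is hypothetical.

```perl
use strict;
use warnings;

# count_first_names is a hypothetical helper; it takes a list of input lines
# and returns a hash mapping each first name to its number of occurrences.
sub count_first_names {
    my @lines = @_;
    my %count;
    for my $line (@lines) {
        my ($first) = split ' ', $line;   # first whitespace-separated field
        next unless defined $first;       # skip blank lines
        $count{$first}++;
    }
    return %count;
}

# Run only when one filename is given on the command line.
if (@ARGV == 1) {
    open(my $in, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
    my %count = count_first_names(<$in>);
    close($in);
    for my $name (sort keys %count) {
        print "$name appeared $count{$name} times\n";
    }
}
```

The hash does the counting and sort keys gives the alphabetical order asked for in the exercise.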
unwrap_fasta
Read a fasta file, unwrap the sequences (i.e., remove all extra line breaks), and save the result to an output file.
Hint: learn about the $/ variable (input record separator) before attempting this exercise.
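As a rough sketch of how $/ can help: setting it to "\n>" makes each read return one whole FASTA record rather than one line. The helper name unwrap_fasta is hypothetical, and the sketch assumes Unix line endings and that every record starts with a '>' header line.

```perl
use strict;
use warnings;

# unwrap_fasta is a hypothetical helper: it reads FASTA records from the
# filehandle $in and writes each one to $out as a header line followed by
# a single sequence line.
sub unwrap_fasta {
    my ($in, $out) = @_;
    local $/ = "\n>";                  # a record now ends just before the next '>'
    while (my $record = <$in>) {
        chomp $record;                 # removes a trailing "\n>" if present
        $record =~ s/^>//;             # only the first record keeps its leading '>'
        my ($header, $seq) = split /\n/, $record, 2;
        next unless defined $header && length $header;
        $seq = '' unless defined $seq;
        $seq =~ s/\s+//g;              # drop line breaks inside the sequence
        print {$out} ">$header\n$seq\n";
    }
    return;
}

# Run only when an input and an output filename are given.
if (@ARGV == 2) {
    open(my $in,  '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
    open(my $out, '>', $ARGV[1]) or die "Cannot open $ARGV[1]: $!";
    unwrap_fasta($in, $out);
    close($in);
    close($out) or die "Cannot close $ARGV[1]: $!";
}
```

Using local keeps the change to $/ confined to the subroutine, so other reads in the same program still work line by line.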
count_EcoRI_site
Read a fasta file and count the number of EcoRI restriction sites in each sequence. Remember to check both strands.
Hint: learn about regular expressions before attempting this exercise.
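The counting step might be sketched as below, using a global, case-insensitive match for the EcoRI recognition sequence GAATTC and a reverse-complement helper for the other strand. The helper names revcomp and count_sites are hypothetical, and reading the FASTA file is left out to keep the sketch short.

```perl
use strict;
use warnings;

# revcomp is a hypothetical helper: reverse the sequence, then complement it.
sub revcomp {
    my ($seq) = @_;
    my $rc = reverse $seq;
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# count_sites is a hypothetical helper: number of non-overlapping,
# case-insensitive matches of $site in $seq.
sub count_sites {
    my ($seq, $site) = @_;
    my $n = () = $seq =~ /\Q$site\E/gi;   # list assignment in scalar context counts matches
    return $n;
}

my $seq = 'ccGAATTCaagaattcTT';           # made-up example sequence
printf "forward: %d, reverse: %d\n",
    count_sites($seq, 'GAATTC'),
    count_sites(revcomp($seq), 'GAATTC');
```

Note that GAATTC is its own reverse complement, so every site found on the forward strand is also found at the same position on the reverse strand. Whether to report the two counts separately or as one number of sites is a choice the exercise leaves open.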
display_codon
Read a fasta file that contains protein-coding sequences. Re-format the sequences to show codons (10 codons per line) in the output file.
Hint: learn about the substr function or more advanced regular expressions before attempting this exercise.
$ cat coding.fasta
>NP_414542
ATGAAACGCATTAGCACCACCATT
accaccaccatcaccattaccacaggta
ACGGTGCGGGCTGA
>NP_414617
ATGACTCACATCGTTCGCTTTA
TCGGTCTACTA
ctactaaacgcatcttctttgcgcggta
GACGAGTGAGCGGCATCCAGCATTAA
$ perl display_codon_1.pl coding.fasta codon.fasta
$ cat codon.fasta
>NP_414542
ATG AAA CGC ATT AGC ACC ACC ATT ACC ACC
ACC ATC ACC ATT ACC ACA GGT AAC GGT GCG
GGC TGA
>NP_414617
ATG ACT CAC ATC GTT CGC TTT ATC GGT CTA
CTA CTA CTA AAC GCA TCT TCT TTG CGC GGT
AGA CGA GTG AGC GGC ATC CAG CAT TAA
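The formatting step for a single sequence could look like the sketch below, which follows the substr hint. The helper name format_codons is hypothetical. Converting to upper case is an assumption based on the sample output above, and reading the FASTA records is left out.

```perl
use strict;
use warnings;

# format_codons is a hypothetical helper: it takes one (possibly wrapped)
# sequence and returns it as space-separated codons, $per_line per line.
sub format_codons {
    my ($seq, $per_line) = @_;
    $per_line = 10 unless defined $per_line;
    $seq =~ s/\s+//g;                      # unwrap: remove line breaks first
    $seq = uc $seq;                        # assumption: upper case, as in the sample output

    my @codons;
    for (my $i = 0; $i < length $seq; $i += 3) {
        push @codons, substr($seq, $i, 3); # a short final piece is kept as-is
    }

    my @lines;
    while (my @chunk = splice(@codons, 0, $per_line)) {
        push @lines, join(' ', @chunk);
    }
    return join("\n", @lines);
}

print format_codons("ATGAAACGCATT\nagcaccacc"), "\n";
```

Removing the line breaks before cutting matters because a codon can span two input lines, as the sample input shows.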