Programming and Algorithms II
Degree in Bioinformatics
Fall 2018
***************************************************
*** Lab 4: ***
*** More recursion ***
***************************************************
Part 1.
1. Give a recursive function flip(n) that given an integer n returns
another one that results from exchanging the even- and odd- position
digits of n. Assume 0 leading digit if necessary.
For example, for 1234 it should return 2143, and for 12345 it should return 103254.
You can't use floats, lists or other complex structures.
Only integer variables arithmetic and logical operations
such as +, *, //, <, ==, etc.
Hint: suppose that you can do this for integers up to n-1
digits. How do you do it then for integers with n digits?
2. (This is hard to do recursively from the start...)
Give an *iterative* algorithm that computes the reverse of
an integer in base 10. For example, the reverse of 1234
is 4321, and the reverse of 930 is 39.
Hint: observe that the reversal of a number with 2 digits
xy is 10*y + x. Now what if you have more digits.
***************************************************
Part 2.
Write a recursive function that says whether or not
a list contains an element that equals the sum of the elements before it
in the list.
For example, it has to return True on the lists [1,5,-2,4,2] (because 4 = 1+5-2)
and [0,4,2,9] (because 0 = sum of the empty list), but False
on the lists [1,2,4,8], [-1,2,3,-2], for example.
You will need to either add an extra parameter or return an extra result
to make it possible and efficient, and if you want to avoid list copying,
an integer parameter indicating the part of the list to look at.
***************************************************
Part 3.
In Python, the string type has a < operator that
tells whether a string is lexicographically less
than another. For example, "abracadabra" < "act"
and "bar" < "bat" < "car" < "cat" < "char".
Note that a longer string may be < than a shorter one.
So comparing the lengths doesn't help much.
Suppose, to complicate your life, that
we are only allowed to use < with single characters, not strings
For example, "a" < "z" and "b" < "c".
(This is how < for strings is implemented internally in python,
because computers basic instructions compare characters, not strings).
Give a recursive function that given two strings s1 and s2
says whether one is less than the other using only character comparisons.
For example, it returns 1 if s1 < s2, 0 if s1 == s2, and -1 if s1 > s2.
Hint: Suppose you can answer this for strings of length no more than n-1.
How do you solve the problem for strings of length n?
In other words, consider one character of s1 and one character of s2,
and decide based on these two characters and a recursive call with
the rest of the strings.
The characters can be the first ones, or the last ones. One choice
leads to a good recursive solution, the other does not. Try.
Additional: what is the running time or your solution?
Linear or quadratic in the length of s1 or s2? (look closely...)
If it is quadratic, how to you do it in linear time?
***************************************************
The rest are extra exercises for you to practice towards the exam.
Extra exercise 1:
Here is an implementation of the Karatsuba-Ofman algorithm
to multiply integers. The function KOprod calls the recursive
function KOprodrec, with a parameter n that is the number of
bits of x and y, whatever is larger.
def KOprodrec(x, y, n):
if (n == 1):
return x*y # this is really the product of 1-bit integers
else:
k = n//2
x1 = x >> k
x0 = x - (x1 << k)
y1 = y >> k
y0 = y - (y1 << k)
a = KOprodrec(x1,y1,k)
b = KOprodrec(x0,y0,k)
c = KOprodrec(x0 + x1, y0 + y1, k)
c = c - a - b
return (a << (2*k)) + (c << k) + b
def KOprod(x,y):
return KOprodrec(x,y,max(x.bit_length(),y.bit_length()))
Python instructions x<>k add k 0's and remove k lowest bits to x
in binary (equivalent to *2^k and //2^k, but in linear time)
1. study the algorithm and make sure you understand it (use class slides)
2. run it on large pairs of integers (x,y); measure the time it takes
as (x,y) grow. Do the times match what we said in class?
Generate x and y at random having n approximately bits, and try
n = 100, 200, 300, 400, ... o something like this, by controling
the range of generation of the random numbers.
3. An optimization is as follows: instead of going down to n==1 (1-bit numbers)
we could stop the recursion at larger n. For example, our CPU probably
multiplies any two 16-bit numbers by hardware without overflow. This should
make it faster because it avoids about 4 levels of recursion. Check it.
***************************************************
Extra exercise 2: Mergesort
Actually program it and run it.
If you are brave, reprogram it to sort the lines in a file.
You need a function split(file1,file2,file3)
that given file1, splits it into file2 and file3. File2 contains
the odd-numbered lines in file1, and file3 the even-numbered ones.
Then reprogram the merge function that reads lines from two sorted files
and created a file that contains the ordered merge of the two files.
***************************************************
Extra exercise 3: Quicksort
Consider the clever function
def partition(lst, begin, end):
pivot = begin
for i in range(begin+1,end+1):
if lst[i] <= lst[begin]:
pivot += 1
lst[i], lst[pivot] = lst[pivot], lst[i]
lst[pivot], lst[begin] = lst[begin], lst[pivot]
return pivot
It does the following: given a list l and two indices begin and end,
it rearranges the elements in lst and returns a value i (the "pivot")
such that
Every element in lst[begin..i] is <= every element in lst[i+1...end]
(but the elements within lst[begin..i] can be in any order,
and the elements in lst[i+1..end] can be in any order;
also, i can be anything between begin..end; it does not have to be
at all near (begin+end)/2
For example, with lst = [7.6, 3.2, 8.0, 1.3, 2.2, 5.9, 6.8, 7.2], begin=0, end=len(lst)-1 (all the list)
it returns pivot 6 and leaves lst = [7.2, 3.2, 1.3, 2.2, 5.9, 6.8, 7.6, 8.0]
which is ok because every element in
[7.2, 3.2, 1.3, 2.2, 5.9, 6.8] is smaller than every element in [7.6, 8.0].
With lst = [1.3, 7.6, 3.2, 8.0, 2.2, 5.9, 6.8, 7.2]
it returns pivot = 0 and lst = [1.3, 7.6, 3.2, 8.0, 2.2, 5.9, 6.8, 7.2],
which is ok because every element in [1.3] is smaller than every element in
[7.6, 3.2, 8.0, 2.2, 5.9, 6.8, 7.2].
Observe that partition runs in time O(end-begin).
Question 0. Understand why partition does what we claim.
Question 1. Give a function quicksort(lst,begin,end)
that sorts lst[begin..end], recursively. Think of a base case,
and how to split the problem into two subproblems.
Question 2. Think why we sort lst uses time O(n log n) with n=len(lst),
IF we are so lucky that at every recursive call we have pivot = (begin+end)//2.
(Hint: think mergesort).
Question 3: Using the second of the examples provided, prove
that the worst-case running time of quicksort is O(n^2).
FACT: However, on randomly sorted lists, Quicksort is also O(n log n).
You are not expected to see this right away; it is a complicated analysis.
Conclusion: In the worst case quicksort is quadratic, but
in average case it is O(n log n) - often with less comparisons
than mergesort.
Two good practices if you are going to use quicksort are:
- Randomly permute lst before calling quicksort, to be quite sure
you are in an "average" case.
- Stop doing recursion for lists that are sufficiently small,
and use insertion instead (for example, when len(lst) <= 5.