MAS212 Class Test 1 (2016)

Aim:

  • To help you learn some essentials of Python to enjoy the remainder of this course.

How it works:

  • Complete the notebook below, as follows. Click on a question, then from the menu select Insert -> Insert Cell Below. From the dropdown box (next to the "stop" symbol) choose whether you wish to insert 'Code' or 'Markdown' (i.e. text). Save regularly!
  • Press Shift-Enter to execute the code in a cell.
  • Submit via http://somas-uploads.shef.ac.uk/mas212 by 5pm on Mon 10th Oct. If you are off-campus, you will need to use a VPN to submit.
  • Your lecturer will mark each question as either 2 (good), 1 (needs revision), 0 (not attempted/wrong).
  • This is an open-book test, which means you may consult books, notes and internet resources. Do not discuss or share your test with anyone. Copy-and-pasting is not permitted. Please give answers in your own words.

Key Resources:

  • Lecture materials and links on course web page: http://sam-dolan.staff.shef.ac.uk/mas212/


A. The Python language

(Please answer the following questions in "raw text" or "markdown" cells)

1. Describe the three numeric data types in Python 3.

The three numeric types are int, float and complex. int is an integer, that is, a whole number: any finite member of the set $\mathbb{Z}$. float is a floating-point number: a representation of a member of the reals $\mathbb{R}$ with a finite precision for its mantissa and exponent. complex is a representation of a complex number in $\mathbb{C}$: it can be thought of as a pair of floats, representing its real and imaginary parts.

2. Give three examples of data types which are mutable. (N.B. Common data types include int, list, tuple, dict, set, string, float).

Data types which are mutable can have their value altered. list, dict, set are examples of mutable data types.

3. Describe two key differences between lists and arrays. (More specifically, between the list and ndarray types).

Some key differences between lists and arrays include the following:

  • Arrays are homogeneous, whereas lists are heterogeneous. In other words, all elements of an array must have the same data type, whereas elements of lists may have different data types.
  • Arrays are typically fixed in size, whereas lists can be lengthened or shortened.
  • Arrays are typically contiguous chunks of memory, whereas lists are not.
  • Arrays may be operated on with universal functions (ufuncs, see below), which operate on a whole array in an efficient manner. This enables vectorizaton. Universal functions are not available for lists.
  • Arrays can be multi-dimensional, whereas lists are one-dimensional (though lists-of-lists can be used to mimic multi-dimensional structures).

4. What is a Universal Function? (in the context of the numpy module)

A universal function, or ufunc, is a function which operates on a whole array, or arrays, in an element-by-element ("elementwise") fashion. numpy includes many ufuncs, which, under the hood, are typically implemented in C (a low-level compiled language) to improve their speed and efficiency. An example ufunc is np.log(A), where the argument A is any array.

5. What is "vectorization"?

In essence, "vectorization" is the replacement of "for" loops in code with whole-array operations implemented as ufuncs, in order to improve computational efficiency for batch operations.


B. Python code

(Please answer the following questions in "code" cells).

6. Python as a calculator: Calculate $3 \times 5$, $\ln(3)$, $\exp(i \pi / 4)$ and $\frac{1+\sqrt{3}i}{1-\sqrt{3}i}$, where $i$ is the unit imaginary number.

In [1]:
import math, cmath
[3*5, math.log(3), cmath.exp(math.pi*1j/4.0), (1 + 3**0.5*1j)/(1 - 3**0.5*1j)]
Out[1]:
[15,
 1.0986122886681098,
 (0.7071067811865476+0.7071067811865475j),
 (-0.5+0.8660254037844387j)]

7. Use numpy to find the following sum: $$\sum_{k=1}^{1000} \frac{1}{k^4} = 1 + \frac{1}{16} + \frac{1}{81} + \ldots + \frac{1}{10^{12}}$$

In [2]:
import numpy as np
print((1/np.arange(1,1001)**4.0).sum())
# Note that this is approximately equal to pi^4 / 90
1.08232323338

8. Use the decimal package to find the value of $\ln(2)$ to 30 decimal places.

In [3]:
from decimal import Decimal, getcontext
getcontext().prec = 30
Decimal(2).ln()
Out[3]:
Decimal('0.693147180559945309417232121458')

9. The Euler-Mascheroni constant $\gamma$ is defined by $$ \gamma = \lim_{n \rightarrow \infty} \left(-\ln(n) + \sum_{k=1}^n \frac{1}{k} \right) $$ Calculate $\gamma$ to an accuracy of 5 decimal places.

In [4]:
maxits = 100000
def eulermas(n):
    return (1/np.arange(1,n+1)).sum() - np.log(n)
print("%.5f" % eulermas(maxits)) 
0.57722

10. Write a function to find the product set $A \times B$, that is, the Cartesian product of two sets $A$ and $B$. Test your function by finding the product of the sets $A = \{1,2,3\}$ and $B = \{4, 5\}$

In [5]:
def productset(A, B):
    """Returns the product set of A and B."""
    return set([(a,b) for a in A for b in B])
A = {1,2,3}
B = {4,5}
productset(A,B)  
Out[5]:
{(1, 4), (1, 5), (2, 4), (2, 5), (3, 4), (3, 5)}


C. Continued fractions

Any positive real number $r$ can be written in the form of a continued fraction: $$ r = a_0+\cfrac{1}{a_1+\cfrac{1}{a_2+\cfrac{1}{a_3+\cdots}}} $$ where $a_k \in \mathbb{N}$.

If $r$ is rational then the sequence of coefficients $[a_0 ; a_1, a_2, \ldots]$ is finite. Conversely, if $r$ is irrational then the sequence is infinite.

To calculate the continued fraction representation $[a_0 ; a_1, a_2, \ldots]$ of $r$, use the following approach. First, take the 'floor' (the integer part) of $r$; this is $a_0$. Now subtract the integer part from $r$ to get a real number in the interval $[0, 1)$. If it is zero, stop. Otherwise, take its reciprocal to get a number in the range $(1, \infty)$. Take its floor; this is $a_1$. Now subtract the floor. If the remainder is zero, stop. Otherwise, take its reciprocal and continue in this fashion. More detail here.

11. Write a function which takes a real number $r$ and returns a list of the first few coefficients of the continued fraction $[a_0, a_1, \ldots a_n]$ up to $n=6$, say. Test your function by confirming that the representation of $\pi$ is $[3; 7, 15, 1, 292, 1, \ldots ]$

In [1]:
import math
def getctdfrac(r, n=7):
    l = [math.floor(r)]
    r1 = 1/(r - l[0])
    for k in range(n-1):
        i = math.floor(r1)
        if i == 0:
            break
        l.append(i)
        r1 = 1 / (r1 - i)
    return l
getctdfrac(math.pi)
Out[1]:
[3, 7, 15, 1, 292, 1, 1]

12. Compute the real number with the continued fraction sequence $[2;4,4,4,4,4,4\ldots]$ to an accuracy of at least 8 decimal places

In [7]:
def calcctdfrac(l):
    r = l[-1]
    for k in range(len(l)-2,-1,-1):
        r = l[k] + 1/r
    return r
print("%.8f" % calcctdfrac([2,4,4,4,4,4,4,4,4]))
print("%.8f" % math.sqrt(5))
print("Note: The continued fraction sequence [2; 4, 4, 4, 4, ...] is equivalent to the square root of 5.")
2.23606798
2.23606798
Note: The continued fraction sequence [2; 4, 4, 4, 4, ...] is equivalent to the square root of 5.


D. Strings

(Please answer in "code" cells).

In [4]:
s = "All happy families are alike; each unhappy family is unhappy in its own way"

13. Open a new file "xxx.txt", where xxx is your registration number. Write this famous opening line (the string s) into the file, putting just one word on each line, like this:
All
happy
families
...

In [5]:
f = open("123456789.txt", "w")
l = s.split(" ")
for w in l:
    f.write(w + "\n")
f.close()

14. How many words are in this opening sentence? How many distinct words?

In [6]:
s1 = s.lower().replace(",", "").replace(".","")  #  Remove punctuation
words = s1.split(" ")   # list of words
print("There are %d words in this sentence." % len(words))       # number of words
print("There are %d different words in this sentence." % len(set(words)))  # number of distinct words
There are 14 words in this sentence.
There are 13 different words in this sentence.

15. Write code to invert the meaning of the sentence, by swapping "happy" and "unhappy". Print out the new sentence.

In [7]:
swapkey = "$$$$$$"  # A temporary string which is not in 's'
s.replace("unhappy", swapkey).replace("happy", "unhappy").replace(swapkey, "happy")
Out[7]:
'All unhappy families are alike; each happy family is happy in its own way'


E. numpy and linear algebra

(Please answer in "code" cells).

In [10]:
import numpy as np

16. Make a 2D array of dimension $3 \times 4$ containing the integers 1, 2, ... 12

In [11]:
np.arange(1,13).reshape(3,4)
Out[11]:
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

17. Make an array of 1000 random numbers drawn from the standard uniform distribution on the interval [0,1) (see lecture 2).
Calculate the standard deviation $\sigma$ of the data, and verify that it is approximately equal to $\frac{1}{2 \sqrt{3}}$.

In [15]:
n = 1000
a = np.random.random(n)
var = (a**2).sum()/n - (a.sum()/n)**2
stddev = np.sqrt(var)
print("The standard deviation of the data set is %.6f" % stddev)
print("which is close to the std dev of the probability dist, %.6f" % (1/(2*3**0.5)))
The standard deviation of the data set is 0.289152
which is close to the std dev of the probability dist, 0.288675

18. A square matrix $N$ is said to be nilpotent if $N^k = 0$ for some $k > 1$. Demonstrate that the following matrix is nilpotent: $$ A = \begin{bmatrix} 2 & 2 & -2 \\ 5 & 1 & -3 \\ 1 & 5 & -3 \end{bmatrix} $$

In [19]:
A = np.matrix("2,2,-2 ; 5,1,-3 ; 1,5,-3")   # This is a quick-and-easy way of defining a matrix
k = 3  # A^2 is not zero, but A^3 is. 
print(A**k)
if np.all(A**k)==0:
    print("A is nilpotent as A^%d = 0." % k)
[[0 0 0]
 [0 0 0]
 [0 0 0]]
A is nilpotent as A^3 = 0.

19. Using np.linalg.eig(A), find the eigenvalues of the real symmetric matrix $$ A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 1 \\ 3 & 1 & -5 \end{bmatrix} . $$ and verify that the sum of the eigenvalues is zero.

In [21]:
A = np.matrix("1,2,3 ; 2,4,1 ; 3,1,-5")
eigs = np.linalg.eig(A)[0]
print("The eigenvalues are %.6f, %.6f and %.6f" % tuple(eigs))
print("The sum of eigenvalues is %.4e, which is zero to within machine precision." % sum(eigs))
The eigenvalues are -6.245242, 0.725182 and 5.520060
The sum of eigenvalues is 8.8818e-16, which is zero to within machine precision.


F. Cryptography

20. A musician has encrypted the notes of a famous melody using a Caesar shift. Write code to decrypt ciphertext using keytext, to recover the melody as a string of letters, A to G.
From what piece of music is this melody taken? (this part not assessed)

In [22]:
ciphertext = "SYFOMNHALYDMKVKJPUYUOBZDIJKSSGSVYRQHWNNERLPGPPWVYWDMHNJETVVZIM"
keytext = "MGASUSXDREASUITVPLIMRDEZUUULKWLIGLNXJRPZNUPXNOKJGJDUYRUYJIJEUQ"

def scrambleletter(l,c,opt=1):
    """Uses the letter 'c' to shift the letter 'l' around the 'letter wheel'"""
    i0 = ord("A")
    i1 = ord(l) - i0
    i2 = ord(c) - i0
    i3 = (i1 + opt*i2) % 26
    return chr(i0 + i3)

def scramblestring(plaintext, cipher, opt=1):
    """Scrambles a string using a One Time Pad. To unscramble, set 'opt' to -1."""
    if len(plaintext) != len(cipher):
        print("Warning: strings are different lengths!")
    return "".join([scrambleletter(plaintext[i], cipher[i], opt) for i in range(len(plaintext))])

print(scramblestring(ciphertext,keytext,1))
print("This is the theme of 'Ode to Joy', from Beethoven's 9th symphony.")
EEFGGFEDCCDEEDDEEFGGFEDCCDEDCCDDECDEFECDEFEDCDGEEFGGFEDCCDEDCC
This is the theme of 'Ode to Joy', from Beethoven's 9th symphony.

Note: The explanation of Q20 was flawed, because the message was actually encoded using a "One Time Pad", rather than a Caesar shift.
(NB. A Caesar cipher shifts all letters by the same amount, whereas a one time pad uses the letters of one string to shift each letter in another string, as above).
For this reason, Q20 will not be officially assessed.
Congratulations to the clever people who managed to decode the message despite the flaw in the question.