Numpy for Python

7/25/2019 Numpy for Python


1/11/2016 01-numpy


Numpy highlights

ndarray: fast and space-efficient multidimensional array with vectorized arithmetic and sophisticated broadcasting

standard vectorized math

reading / writing arrays to disk

memory-mapped file access

linear algebra, random number generation, Fourier transform

Integration with C, C++, and Fortran code

Creating an ndarray

The array function

The array function is the workhorse for creating numpy ndarrays on the fly from other Python sequence-like objects, primarily tuples and lists.

In [2]:

import numpy as np
import random

np.array(range(3))

Out[2]:

array([0, 1, 2])

In [2]:

np.array((1, 2, 3))

Out[2]:

array([1, 2, 3])

In [3]:

np.array([1, 2, 3])

Out[3]:

array([1, 2, 3])

In [4]:

np.array(list('hello'))

Out[4]:

array(['h', 'e', 'l', 'l', 'o'], dtype='<U1')




Nested lists are treated as multi-dimensional arrays.

In [5]:

random.seed(3.141)
dataList = [[random.uniform(0, 9) for x in range(3)] for y in range(4)]

dataList

Out[5]:

[[4.844700289907117, 3.285473931529339, 2.1797684393413155],[5.824634396536993, 0.8824946651389621, 2.76732952458187],[1.8329068547314877, 4.527186437261438, 3.5724501538885134],[2.144914100647332, 4.951733405544532, 4.325440230285053]]

In [6]:

data = np.array(dataList)
data

Out[6]:

array([[ 4.84470029, 3.28547393, 2.17976844], [ 5.8246344 , 0.88249467, 2.76732952], [ 1.83290685, 4.52718644, 3.57245015], [ 2.1449141 , 4.95173341, 4.32544023]])

In [7]:

# Important information about an ndarray
print(data.ndim)   # Number of dimensions
print(data.shape)  # Shape of the ndarray
print(data.dtype)  # Data type contained in the array

2
(4, 3)
float64

Other functions to create arrays

arange

This is equivalent to Python's range function, except that it returns a one-dimensional array instead of a range object:

In [8]:

np.arange(10)

Out[8]:

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])




ones, ones_like, zeros, zeros_like

To create arrays filled with ones or zeros, either with a given shape or with the same shape and data type as a given array:

In [9]:

np.ones(3)

Out[9]:

array([ 1., 1., 1.])

In [10]:

np.ones((3, 3))

Out[10]:

array([[ 1., 1., 1.], [ 1., 1., 1.], [ 1., 1., 1.]])

In [11]:

np.zeros((4))

Out[11]:

array([ 0., 0., 0., 0.])

In [12]:

np.zeros((4, 4, 4))

Out[12]:

array([[[ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.]],

[[ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.]],

[[ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.]],

[[ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.]]])




In [13]:

np.ones_like(data)

Out[13]:

array([[ 1., 1., 1.], [ 1., 1., 1.], [ 1., 1., 1.], [ 1., 1., 1.]])

In [14]:

np.zeros_like(data)

Out[14]:

array([[ 0., 0., 0.], [ 0., 0., 0.], [ 0., 0., 0.],

[ 0., 0., 0.]])

empty, empty_like

Just like ones and zeros, except that the array is left uninitialized: it contains whatever values happen to be in memory, which are not necessarily zeros.

In [15]:

np.empty((2, 3))

Out[15]:

array([[ 6.91635841e-310, 6.91636044e-310, 6.91635874e-310], [ 6.91635549e-310, 6.91635128e-310, 6.91635185e-310]])

In [16]:

np.empty_like(data)

Out[16]:

array([[ 0., 0., 0.], [ 0., 0., 0.], [ 0., 0., 0.], [ 0., 0., 0.]])

The ndarray constructor itself is a lower-level way to create numpy arrays that offers even more control. However, we do not explore that here.

Choosing a data type

Most array creation functions take a dtype argument which can be used to explicitly specify the data

type with which the array should be created.
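As a rough illustration of why the choice of data type matters, the same numbers stored with different dtypes occupy different amounts of memory. This is only a sketch and not part of the original notebook; the name values and its contents are made up for the example.

```python
import numpy as np

# Hypothetical sample values, chosen only for illustration.
values = [1, 2, 3, 4]

as_int8 = np.array(values, dtype=np.int8)        # 1 byte per element
as_float64 = np.array(values, dtype=np.float64)  # 8 bytes per element

# itemsize is the size of one element in bytes; nbytes is the total size.
print(as_int8.dtype, as_int8.itemsize, as_int8.nbytes)
print(as_float64.dtype, as_float64.itemsize, as_float64.nbytes)
```

The itemsize and nbytes attributes are a convenient way to check what a given dtype costs before creating a large array.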




In [17]:

np.array(dataList, dtype=int)  # Everything is truncated to integers (the np.int alias was removed in numpy 1.24)

Out[17]:

array([[4, 3, 2], [5, 0, 2], [1, 4, 3], [2, 4, 4]])

In [18]:

np.array(dataList, dtype=float)

Out[18]:

array([[ 4.84470029, 3.28547393, 2.17976844], [ 5.8246344 , 0.88249467, 2.76732952], [ 1.83290685, 4.52718644, 3.57245015],

[ 2.1449141 , 4.95173341, 4.32544023]])

In [19]:

np.array(dataList, dtype=bool)  # All nonzero numbers are True

Out[19]:

array([[ True, True, True], [ True, True, True], [ True, True, True],

[ True, True, True]], dtype=bool)

In [20]:

np.array(dataList, dtype=np.str_)  # Unicode strings (np.unicode_ was removed in numpy 2.0)

Out[20]:

array([['4.844700289907117', '3.285473931529339', '2.1797684393413155'], ['5.824634396536993', '0.8824946651389621', '2.76732952458187'],

['1.8329068547314877', '4.527186437261438', '3.5724501538885134'], ['2.144914100647332', '4.951733405544532', '4.325440230285053']],

dtype='




In [21]:

np.array(dataList, dtype=object)  # Arrays containing arbitrary objects

Out[21]:

array([[4.844700289907117, 3.285473931529339, 2.1797684393413155], [5.824634396536993, 0.8824946651389621, 2.76732952458187], [1.8329068547314877, 4.527186437261438, 3.5724501538885134], [2.144914100647332, 4.951733405544532, 4.325440230285053]], dtype=object)

In [3]:

np.array([3, 3.141, 'Pi'], dtype=object)

Out[3]:

array([3, 3.141, 'Pi'], dtype=object)

Numpy provides a richer set of types than this. One may choose integers and floats of various sizes based on requirements. Please refer to the book for details.

Casting to a data type

A numpy ndarray carries an astype method which may be used to cast the elements of the array to a new type. Note that, by default, this operation creates a copy of the original array even if the elements are being cast to the same data type. Let's look at a few examples.
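The copying behavior can be checked directly. The following is a small sketch with made-up names (original, same_type); it relies on np.shares_memory to test whether two arrays overlap in memory.

```python
import numpy as np

original = np.array([1.5, 2.5, 3.5])      # float64 by default
same_type = original.astype(np.float64)   # same dtype, but still a new array

same_type[0] = 100.0                      # modify the result only

print(np.shares_memory(original, same_type))  # False: no shared buffer
print(original)                               # the original is unchanged
```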

In [22]:

data

Out[22]:

array([[ 4.84470029, 3.28547393, 2.17976844], [ 5.8246344 , 0.88249467, 2.76732952],

[ 1.83290685, 4.52718644, 3.57245015], [ 2.1449141 , 4.95173341, 4.32544023]])

In [23]:

data.astype(np.int8)

Out[23]:

array([[4, 3, 2], [5, 0, 2],

[1, 4, 3], [2, 4, 4]], dtype=int8)




In [24]:

data.astype(np.uint32)

Out[24]:

array([[4, 3, 2], [5, 0, 2], [1, 4, 3], [2, 4, 4]], dtype=uint32)

In [25]:

data.astype(np.float128)  # Extended precision; not available on every platform (np.longdouble is the portable name)

Out[25]:

array([[ 4.8447003, 3.2854739, 2.1797684], [ 5.8246344, 0.88249467, 2.7673295], [ 1.8329069, 4.5271864, 3.5724502],

[ 2.1449141, 4.9517334, 4.3254402]], dtype=float128)

In [26]:

data.astype(np.complex64)

Out[26]:

array([[ 4.84470034+0.j, 3.28547382+0.j, 2.17976832+0.j], [ 5.82463455+0.j, 0.88249469+0.j, 2.76732945+0.j], [ 1.83290684+0.j, 4.52718639+0.j, 3.57245016+0.j],

[ 2.14491415+0.j, 4.95173359+0.j, 4.32544041+0.j]], dtype=complex64)

In [27]:

data.astype(np.bytes_)  # Byte strings (np.string_ was removed in numpy 2.0)

Out[27]:

array([[b'4.844700289907117', b'3.285473931529339', b'2.1797684393413155'],
       [b'5.824634396536993', b'0.8824946651389621', b'2.76732952458187'],
       [b'1.8329068547314877', b'4.527186437261438', b'3.5724501538885134'],
       [b'2.144914100647332', b'4.951733405544532', b'4.325440230285053']],
      dtype='|S32')

Vectorized math

Numpy's ndarrays allow concise mathematical expressions without the need for iteration using for loops, much like R.
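To make the contrast concrete, here is a minimal sketch that computes the same result with an explicit loop and with a vectorized expression. The names values, looped, and vectorized are illustrative only.

```python
import numpy as np

values = np.arange(5, dtype=np.float64)

# Explicit loop: build the result element by element.
looped = np.empty_like(values)
for i in range(len(values)):
    looped[i] = values[i] * 2 + 1

# Vectorized: one expression applied to every element at once.
vectorized = values * 2 + 1

print(np.array_equal(looped, vectorized))
```

Both versions give the same numbers; the vectorized form is shorter and pushes the loop into numpy's compiled code.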




In [28]:

data + data

Out[28]:

array([[ 9.68940058, 6.57094786, 4.35953688], [ 11.64926879, 1.76498933, 5.53465905], [ 3.66581371, 9.05437287, 7.14490031], [ 4.2898282 , 9.90346681, 8.65088046]])

In [29]:

data * 3

Out[29]:

array([[ 14.53410087, 9.85642179, 6.53930532], [ 17.47390319, 2.647484 , 8.30198857], [ 5.49872056, 13.58155931, 10.71735046],

[ 6.4347423 , 14.85520022, 12.97632069]])

In [30]:

1 / data

Out[30]:

array([[ 0.20641112, 0.30437009, 0.45876433], [ 0.1716846 , 1.13315133, 0.36135921], [ 0.54558146, 0.22088774, 0.27991993],

[ 0.46621914, 0.20194948, 0.23119034]])

In [31]:

2 ** data.astype(int)

Out[31]:

array([[16, 8, 4], [32, 1, 4], [ 2, 16, 8], [ 4, 16, 16]])

In [7]:

data * [1, 2, 3]  # Multiplies each column by the matching factor

Out[7]:

array([[ 4.84470029, 6.57094786, 6.53930532], [ 5.8246344 , 1.76498933, 8.30198857], [ 1.83290685, 9.05437287, 10.71735046], [ 2.1449141 , 9.90346681, 12.97632069]])




In [8]:

data * [1, 2, 3, 4]

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in ()
----> 1 data * [1, 2, 3, 4]

ValueError: operands could not be broadcast together with shapes (4,3) (4,)

In [12]:

x = np.array([[1], [2], [3], [4]])
print(x, "\n\n", x.shape)

[[1]
 [2]
 [3]
 [4]]

 (4, 1)

In [9]:

data * [[1], [2], [3], [4]]  # Multiplies each row by the matching factor

Out[9]:

array([[ 4.84470029, 3.28547393, 2.17976844], [ 11.64926879, 1.76498933, 5.53465905], [ 5.49872056, 13.58155931, 10.71735046], [ 8.5796564 , 19.80693362, 17.30176092]])
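The multiplications above follow numpy's broadcasting rule: shapes are compared starting from the trailing dimension, and two dimensions are compatible when they are equal or when one of them is 1. The sketch below restates this with a stand-in array; the names m and factors are assumptions made for the example, not part of the notebook.

```python
import numpy as np

m = np.ones((4, 3))                # stand-in for a (4, 3) array such as data
factors = np.array([1, 2, 3, 4])   # shape (4,): does not broadcast with (4, 3)

per_column = m * np.array([1, 2, 3])   # (4, 3) * (3,)   scales the columns
per_row = m * factors[:, np.newaxis]   # (4, 3) * (4, 1) scales the rows

print(per_column[0])   # first row
print(per_row[:, 0])   # first column
```

Adding a length-one axis with np.newaxis is a common way to turn a flat list of per-row factors into a (4, 1) column without typing nested lists.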

In [34]:

diag = np.diag([1, 2, 3])
diag

Out[34]:

array([[1, 0, 0], [0, 2, 0], [0, 0, 3]])




In [35]:

data * diag  # Will not do matrix multiplication

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in ()
----> 1 data * diag # Will not do matrix multiplication

ValueError: operands could not be broadcast together with shapes (4,3) (3,3)

In [36]:

np.dot(data, diag)

Out[36]:

array([[ 4.84470029, 6.57094786, 6.53930532], [ 5.8246344 , 1.76498933, 8.30198857], [ 1.83290685, 9.05437287, 10.71735046], [ 2.1449141 , 9.90346681, 12.97632069]])
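For two-dimensional arrays, the matrix product can also be written with the @ operator (Python 3.5 and later) or the .dot method. The following sketch uses small made-up matrices (a and scale) to show that the three spellings agree.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
scale = np.diag([10, 100])   # diagonal matrix, built the same way as diag above

# For two-dimensional arrays these three spellings give the same product.
via_dot = np.dot(a, scale)
via_method = a.dot(scale)
via_operator = a @ scale

print(via_operator)
```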

In [37]:

print(np.sqrt(data), "\n\n", np.log(data), "\n\n", np.exp(data))

[[ 2.20106799 1.81258763 1.47640389]
 [ 2.41342793 0.93941187 1.66352924]
 [ 1.3538489 2.1277186 1.89009263]
 [ 1.46455253 2.22524907 2.07976927]]

[[ 1.57788538 1.18951091 0.77921865]
 [ 1.76209623 -0.12500254 1.01788278]
 [ 0.60590315 1.51010065 1.27325168]
 [ 0.76309951 1.5997377 1.46451392]]

[[ 127.06519357 26.72164554 8.84425804]
 [ 338.53734003 2.4169216 15.91607372]
 [ 6.25203402 92.49794585 35.60372097]
 [ 8.54130751 141.4198896 75.59878641]]

In numpy parlance, functions that apply the same operation to each element of an array are called universal functions, or ufuncs. On the other hand, there are functions that return aggregations or perform other types of operations on ndarrays. Refer to McKinney (2012) for details.




In [38]:

print(np.sum(data), "\n\n", np.min(data), "\n\n", np.max(data), "\n\n", np.mean(data))

41.1390324294

0.882494665139

5.82463439654

3.42825270245
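Aggregations can also be restricted to one axis of the array instead of collapsing everything to a single number. The sketch below uses a small made-up array (table) to contrast a ufunc, which keeps the shape, with an aggregation over all elements, over columns, and over rows.

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

print(np.sqrt(table).shape)   # ufunc: same shape as the input
print(table.sum())            # aggregation over all elements
print(table.sum(axis=0))      # one sum per column
print(table.sum(axis=1))      # one sum per row
```

The axis argument names the dimension that is collapsed: axis=0 sums down the rows and leaves one value per column.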

Indexing ndarrays

Indexing by position

ndarrays are collections of data of a given shape, size, and type. Indexing is the way to access elements, or sub-collections of elements, in a given ndarray.

Numpy provides a rich set of indexing semantics that allow concise expression of various indexing

operations.

One-dimensional arrays

The simplest indexing operation into any one dimension of a numpy ndarray mimics the semantics of Python's indexing and slicing notation, [start:stop:step].

Let's look at this using a one-dimensional array.

In [39]:

oneD = np.arange(10)
oneD

Out[39]:

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])




In [40]:

# Standard Python indexing: similar to lists and tuples
print(oneD[0])      # 0-based indexing
print(oneD[2:4])    # Indexing a range: excludes right limit
print(oneD[:4])     # Indexing from the beginning implicitly
print(oneD[4:])     # Indexing to the end implicitly
print(oneD[1:5:2])  # Indexing with jumps
print(oneD[::-1])   # Reversing an array
print(oneD[:-2])    # Negative indices to index from the end

0
[2 3]
[0 1 2 3]
[4 5 6 7 8 9]
[1 3]
[9 8 7 6 5 4 3 2 1 0]
[0 1 2 3 4 5 6 7]

Indexing can be combined with the = operator to assign values in an ndarray.

In [41]:

oneD[3] = 13
oneD

Out[41]:

array([ 0, 1, 2, 13, 4, 5, 6, 7, 8, 9])

In [42]:

oneD[1:3] = [11, 12]
oneD

Out[42]:

array([ 0, 11, 12, 13, 4, 5, 6, 7, 8, 9])

In [43]:

oneD[4:7] = [1, 2]  # The shape of the replacement must match

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in ()
----> 1 oneD[4:7] = [1, 2] # The shape of the replacement must match

ValueError: cannot copy sequence with size 2 to array axis with dimension 3




In [44]:

oneD[7:9] = 20  # Scalars are replicated to fill space
oneD

Out[44]:

array([ 0, 11, 12, 13, 4, 5, 6, 20, 20, 9])

Aside: Indexing by position creates views

Numpy intends to be conservative with memory usage and is designed such that indexing / slicing into an ndarray does not copy elements unless explicitly asked to. Therefore, slices are views into the original array: the elements of a slice are the very same elements as those of the original.

In [45]:

oneDSlice = oneD[2:5]

oneDSlice

Out[45]:

array([12, 13, 4])

Since the slice is just a view into the original ndarray, changes to the slice are also reflected in the original. Therefore, code that works on slices must be careful not to introduce unwanted changes into the original ndarray.

In [46]:

oneDSlice[2] = 14
print(oneDSlice)
oneD

[12 13 14]

Out[46]:

array([ 0, 11, 12, 13, 14, 5, 6, 20, 20, 9])

To create an explicit copy of the view / slice, one may use the copy method available with an array.

For example:

In [47]:

oneDCopy = oneD[2:5].copy()
oneDCopy

Out[47]:

array([12, 13, 14])




In [48]:

oneDCopy[2] = 24
print(oneDCopy)
oneD

[12 13 24]

Out[48]:

array([ 0, 11, 12, 13, 14, 5, 6, 20, 20, 9])
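When it is unclear whether an array is a view or a copy, numpy can answer the question directly. This sketch uses made-up names (arr, view, copy) together with np.shares_memory and the .base attribute, which refers to the array that owns the data for a view.

```python
import numpy as np

arr = np.arange(10)
view = arr[2:5]           # basic slice: a view
copy = arr[2:5].copy()    # explicit copy

print(np.shares_memory(arr, view))   # True: the slice reuses arr's buffer
print(np.shares_memory(arr, copy))   # False: the copy has its own buffer
print(view.base is arr)              # True: the view points back at arr
```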

Indexing multi-dimensional arrays

Multi-dimensional arrays differ from one-dimensional arrays in that, whereas the elements of a one-dimensional array are scalars, the elements of a multi-dimensional array are themselves arrays.

For example, a two-dimensional array (or a matrix) can be considered as an array of row-arrays.

In [49]:

data

Out[49]:

array([[ 4.84470029, 3.28547393, 2.17976844], [ 5.8246344 , 0.88249467, 2.76732952], [ 1.83290685, 4.52718644, 3.57245015], [ 2.1449141 , 4.95173341, 4.32544023]])

In [50]:

row1 = data[0]
row1

Out[50]:

array([ 4.84470029, 3.28547393, 2.17976844])

However, a two-dimensional array can also be considered as an array of column-arrays. How do we

access that first column for example?

Numpy provides n indices into any arbitrary n-dimensional array. Any specific element in the

ndarray can be accessed by specifying the position of the element along the n dimensions.

In [51]:

data[1, 1]# Access the second diagonal element in 'data'

Out[51]:

0.88249466513896213




In [52]:

data[:, 1]# Access the second column where `:` stands for all rows

Out[52]:

array([ 3.28547393, 0.88249467, 4.52718644, 4.95173341])

In [53]:

data[2, :]  # Access the third row (index 2)

Out[53]:

array([ 1.83290685, 4.52718644, 3.57245015])

In [54]:

data[:2, :]# First and second row, all columns

Out[54]:

array([[ 4.84470029, 3.28547393, 2.17976844], [ 5.8246344 , 0.88249467, 2.76732952]])

In [55]:

data[:2, ::-1]# First and second row, all columns reversed

Out[55]:

array([[ 2.17976844, 3.28547393, 4.84470029],

[ 2.76732952, 0.88249467, 5.8246344 ]])

Fancy Indexing

All the position-based indexing that we have seen so far uses slice objects, and these slice objects create views of the ndarray instead of copying its elements.

Slice-based indexing, however, does not allow one to take values at an arbitrary set of positions out of the array. For example, consider an array of ten elements and the problem of taking the first, the fourth, and the ninth element out of it.

These are the situations where 'fancy indexing' is used instead. An important thing to note is that fancy indexing does not create a view into the existing array; instead, it creates a new ndarray into which the desired elements are copied. Therefore, fancy indexing should be avoided where slices can do the job.
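The ten-element problem described above can be sketched as follows. The array ten and the name picked are made up for the example; the point is that the result of fancy indexing is an independent copy.

```python
import numpy as np

ten = np.arange(10) * 10          # [0, 10, 20, ..., 90]
picked = ten[[0, 3, 8]]           # first, fourth, and ninth elements

picked[0] = -1                    # modify the result only

print(picked)
print(ten[0])                         # the original element is untouched
print(np.shares_memory(ten, picked))  # False: fancy indexing copied the data
```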




In [56]:

data

Out[56]:

array([[ 4.84470029, 3.28547393, 2.17976844], [ 5.8246344 , 0.88249467, 2.76732952], [ 1.83290685, 4.52718644, 3.57245015], [ 2.1449141 , 4.95173341, 4.32544023]])

In [57]:

data[[2, 0, 1]]

Out[57]:

array([[ 1.83290685, 4.52718644, 3.57245015], [ 4.84470029, 3.28547393, 2.17976844],

[ 5.8246344 , 0.88249467, 2.76732952]])

In [58]:

data[:, [-1, 2, 0]]

Out[58]:

array([[ 2.17976844, 2.17976844, 4.84470029], [ 2.76732952, 2.76732952, 5.8246344 ], [ 3.57245015, 3.57245015, 1.83290685], [ 4.32544023, 4.32544023, 2.1449141 ]])

In [59]:

data[[2, 0, 1], [-1, 2, 0]]

Out[59]:

array([ 3.57245015, 2.17976844, 5.8246344 ])

In [60]:

data[[2, 0, 1], [-1, 2]]

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
 in ()
----> 1 data[[2, 0, 1], [-1, 2]]

IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (3,) (2,)




In [61]:

data[np.ix_([2, 0, 1], [-1, 2])]

Out[61]:

array([[ 3.57245015, 3.57245015], [ 2.17976844, 2.17976844], [ 2.76732952, 2.76732952]])
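The difference between the last two cells is that two plain index lists are paired up element by element, while np.ix_ reshapes them so that every selected row is combined with every selected column. A small sketch, using a made-up array grid whose values make the positions easy to read:

```python
import numpy as np

grid = np.arange(12).reshape(4, 3)
rows, cols = [2, 0, 1], [2, 0]

# np.ix_ reshapes the index lists so that they broadcast against each other.
row_idx, col_idx = np.ix_(rows, cols)
print(row_idx.shape, col_idx.shape)   # (3, 1) (1, 2)

# Every selected row is combined with every selected column.
block = grid[row_idx, col_idx]
print(block)
```

The result has one row per entry in rows and one column per entry in cols, so the two lists no longer need to have the same length.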

Boolean indexing

The indexing semantics that we have seen so far are useful when the position of the subset to be extracted is known. However, there are often situations where one wants to select a subset of a collection not by position but based on some predicate.

For example, in the following data ndarray comprised of numbers between zero and nine, one may want to select elements that are less than 4. In such situations boolean indexing is useful.

In [62]:

print(data)  # Let's recap what is in data.

[[ 4.84470029 3.28547393 2.17976844]
 [ 5.8246344 0.88249467 2.76732952]
 [ 1.83290685 4.52718644 3.57245015]
 [ 2.1449141 4.95173341 4.32544023]]

In [63]:

data[np.array([True, False, True, False])]# Subset rows

Out[63]:

array([[ 4.84470029, 3.28547393, 2.17976844], [ 1.83290685, 4.52718644, 3.57245015]])

In [64]:

data[:, np.array([True, False, True])]# Subset columns

Out[64]:

array([[ 4.84470029, 2.17976844], [ 5.8246344 , 2.76732952], [ 1.83290685, 3.57245015], [ 2.1449141 , 4.32544023]])

In [65]:

data[np.logical_not(np.array([True, False, True, False]))]  # Invert bool array (unary minus on boolean arrays raises a TypeError in current numpy)

Out[65]:

array([[ 5.8246344 , 0.88249467, 2.76732952], [ 2.1449141 , 4.95173341, 4.32544023]])




In [66]:

data[~np.array([True, False, True, False])]# Invert bool array

Out[66]:

array([[ 5.8246344 , 0.88249467, 2.76732952], [ 2.1449141 , 4.95173341, 4.32544023]])

A boolean indexing array must have the same length as the axis it indexes. Older numpy versions treated the values missing from a shorter boolean array as False; current versions raise an IndexError instead. Also note that when a boolean array is supplied for each axis, the positions of the True values are paired up as in fancy indexing, so the result is one-dimensional. Here is an example. Best to avoid such indexing.

In [21]:

x = data[np.array([True, False, True, False]), np.array([True, False, True])]
print(x)
print(x.ndim)
print(x.shape)

[ 4.84470029 3.57245015]
1
(2,)

Aside: More on booleans

Conditional operations on other arrays generate boolean arrays as well. For example:

In [68]:

data < 4# Returns an array of booleans

Out[68]:

array([[False, True, True], [False, True, True], [ True, False, True], [ True, False, False]], dtype=bool)

These multidimensional boolean arrays can be used to index other arrays, but with surprising, yet correct, behavior: the selected elements come back as a flat one-dimensional array. If conditional arrays are desired for conditional assignments, the numpy.where function is handy.

In [69]:

data[data < 4]# Mangles the shape of the array

Out[69]:

array([ 3.28547393, 2.17976844, 0.88249467, 2.76732952,

1.83290685, 3.57245015, 2.1449141 ])




In [70]:

np.where(data < 4, data, -data)# Use `where` for conditional assignments

Out[70]:

array([[-4.84470029, 3.28547393, 2.17976844], [-5.8246344 , 0.88249467, 2.76732952], [ 1.83290685, -4.52718644, 3.57245015], [ 2.1449141 , -4.95173341, -4.32544023]])
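A boolean mask can also appear on the left-hand side of an assignment, which changes the selected elements in place, whereas numpy.where returns a new array and leaves its inputs alone. The sketch below uses made-up names (values, mask, flipped, clipped) to show both.

```python
import numpy as np

values = np.array([[5.0, 3.0], [1.0, 7.0]])
mask = values < 4

flipped = np.where(mask, values, -values)  # new array; values is untouched

clipped = values.copy()
clipped[~mask] = 4.0                       # in-place assignment through the mask

print(flipped)
print(clipped)
```

Both forms keep the shape of the array, unlike indexing with the mask on its own.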

Logical combinations on booleans

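Numpy provides the standard logical operations: and (&), or (|), and not (~), the last of which we have already seen in action. (Older numpy versions also accepted unary minus as not on boolean arrays; current versions raise a TypeError.) Besides the symbolic operators, these logical operations are also provided as functions in the numpy library.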

In [71]:

np.array([True, False, True, False]) | np.array([True] * 4)

Out[71]:

array([ True, True, True, True], dtype=bool)

In [72]:

np.array([True, False, True, False]) & np.array([True] * 4)

Out[72]:

array([ True, False, True, False], dtype=bool)

In [73]:

np.logical_or(np.array([True, False] * 2), np.array([True] * 4))

Out[73]:

array([ True, True, True, True], dtype=bool)

In [74]:

np.logical_and(np.array([True, False] * 2), np.array([True] * 4))

Out[74]:

array([ True, False, True, False], dtype=bool)

In [75]:

np.logical_not(np.array([True, False]))

Out[75]:

array([False, True], dtype=bool)




Logical aggregations

Any array of logicals can be reduced to a single logical value in two different ways: whether all values in the array are true, or whether any value in the array is true. Numpy provides .any() and .all() methods on arrays to do these aggregations. For example:

In [76]:

np.array([True, False, True, True]).all()

Out[76]:

False

In [77]:

np.array([True, False, True, True]).any()

Out[77]:

True

In [78]:

np.array([True, True, True, True]).all()

Out[78]:

True

In [79]:

np.array([False, False, False, False]).all()

Out[79]:

False
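Like other aggregations, .any() and .all() accept an axis argument, and because True counts as 1, .sum() counts the true values. A brief sketch with a made-up array flags:

```python
import numpy as np

flags = np.array([[True, False, True],
                  [True, True, True]])

print(flags.all(axis=1))   # per row: are all values True?
print(flags.any(axis=0))   # per column: is any value True?
print(flags.sum())         # True counts as 1, so this counts the True values
```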

Exercise: Create two functions:

1. none: When given a logical array, returns True if all elements in the array are False.

2. notall: When given a logical array, returns True if any element in the array is False.

Transposing arrays

There are two ways to transpose a numpy array. Each array has a .T attribute, which returns a view that is the transpose of the original array. Alternatively, one may use the numpy.transpose function, which also returns a transposed view rather than an independent copy.

The thing of particular note is that both of these methods provide a view, and one must use the .copy() method to obtain a true copy. Transposing arrays is only one of the many places where the programmer needs to exercise special caution to ensure that there is no action at a distance, which can be the cause of many subtle bugs.




In [80]:

data.T

Out[80]:

array([[ 4.84470029, 5.8246344 , 1.83290685, 2.1449141 ], [ 3.28547393, 0.88249467, 4.52718644, 4.95173341], [ 2.17976844, 2.76732952, 3.57245015, 4.32544023]])

In [81]:

np.transpose(data)

Out[81]:

array([[ 4.84470029, 5.8246344 , 1.83290685, 2.1449141 ], [ 3.28547393, 0.88249467, 4.52718644, 4.95173341], [ 2.17976844, 2.76732952, 3.57245015, 4.32544023]])

In [82]:

# NB: Remember that the transpose is only a view, not an independent copy
dataT = data.T
print(dataT)

[[ 4.84470029 5.8246344 1.83290685 2.1449141 ]
 [ 3.28547393 0.88249467 4.52718644 4.95173341]
 [ 2.17976844 2.76732952 3.57245015 4.32544023]]

In [83]:

dataT[1, 1] = 1
print(data)

[[ 4.84470029 3.28547393 2.17976844]
 [ 5.8246344 1. 2.76732952]
 [ 1.83290685 4.52718644 3.57245015]
 [ 2.1449141 4.95173341 4.32544023]]

Therefore, with numpy it is always better to use .copy() explicitly when copies are desired (or to read the documentation carefully). Let's look at an example:




In [84]:

dataT2 = data.T.copy()
dataT2[1, 1] = 2
print(dataT2, "\n\n", data)

[[ 4.84470029 5.8246344 1.83290685 2.1449141 ]
 [ 3.28547393 2. 4.52718644 4.95173341]
 [ 2.17976844 2.76732952 3.57245015 4.32544023]]

[[ 4.84470029 3.28547393 2.17976844]
 [ 5.8246344 1. 2.76732952]
 [ 1.83290685 4.52718644 3.57245015]
 [ 2.1449141 4.95173341 4.32544023]]
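The same point can be checked without modifying any data. In this sketch the array m is made up for the example, and np.shares_memory reports whether each transposed result still uses the original buffer.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)

print(np.shares_memory(m, m.T))              # True: .T is a view
print(np.shares_memory(m, np.transpose(m)))  # True: so is np.transpose
print(np.shares_memory(m, m.T.copy()))       # False: .copy() breaks the link
```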

Reading from and writing to text files

Numpy provides two simple functions to read and write delimited text-based datasets: numpy.loadtxt and numpy.savetxt. Let's look at a simple example.

In [85]:

!head -10 ../../../data/pythagorean-triples.txt

3,4,5
5,12,13
15,8,17
7,24,26
21,20,29
35,12,37
9,40,41
45,28,52
11,60,61
33,56,65




In [22]:

# Load the pythagorean triples into a two dimensional array
pyTrips = np.loadtxt("../../../data/pythagorean-triples.txt", dtype=np.uint, delimiter=",")
print(pyTrips[:10], "\n\n", pyTrips.shape)

[[ 3 4 5]
 [ 5 12 13]
 [15 8 17]
 [ 7 24 26]
 [21 20 29]
 [35 12 37]
 [ 9 40 41]
 [45 28 52]
 [11 60 61]
 [33 56 65]]

(101, 3)
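The cell above only reads a file. For the writing direction, here is a minimal round-trip sketch with numpy.savetxt. It writes to an in-memory buffer rather than to disk, and the array triples is a made-up two-row sample rather than the dataset used above.

```python
import io
import numpy as np

triples = np.array([[3, 4, 5], [5, 12, 13]], dtype=np.uint)

# An in-memory text buffer stands in for a file on disk.
buffer = io.StringIO()
np.savetxt(buffer, triples, fmt="%d", delimiter=",")
print(buffer.getvalue())

# Rewind and read the same text back with loadtxt.
buffer.seek(0)
restored = np.loadtxt(buffer, dtype=np.uint, delimiter=",")
print(np.array_equal(triples, restored))
```

Passing fmt="%d" keeps the integers free of the scientific notation that savetxt uses by default, so the written text has the same layout as the input file shown earlier.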




In [23]:

pyTrips




Out[23]:

array([[ 3, 4, 5], [ 5, 12, 13], [ 15, 8, 17], [ 7, 24, 26], [ 21, 20, 29], [ 35, 12, 37],

[ 9, 40, 41], [ 45, 28, 52], [ 11, 60, 61], [ 33, 56, 65], [ 63, 16, 65], [ 55, 48, 73], [ 13, 84, 85], [ 77, 36, 86], [ 39, 80, 89], [ 65, 72, 97], [ 99, 20, 101],

[ 91, 60, 109], [ 15, 112, 111], [117, 44, 125], [105, 88, 137], [ 17, 144, 145], [143, 24, 145], [ 51, 140, 147], [ 85, 132, 157], [119, 120, 169], [165, 52, 173], [ 19, 180, 181],

[ 57, 176, 185], [153, 104, 185], [ 95, 168, 193], [195, 28, 199], [133, 156, 205], [187, 84, 205], [ 21, 220, 221], [171, 140, 221], [221, 60, 229], [105, 208, 235], [209, 120, 241],

[255, 32, 257], [ 23, 264, 265], [247, 96, 265], [ 69, 260, 269], [115, 252, 271], [231, 160, 281], [161, 240, 289], [285, 68, 293], [207, 224, 305], [273, 136, 305], [ 25, 312, 313],

[ 75, 308, 317], [253, 204, 325], [323, 36, 325], [175, 288, 337], [299, 180, 349],




[225, 272, 353], [ 27, 364, 368], [357, 76, 365], [275, 252, 373], [135, 352, 377], [345, 152, 379], [189, 340, 389], [325, 228, 397],

[399, 40, 401], [391, 120, 409], [ 29, 420, 421], [ 87, 416, 425], [297, 304, 425], [145, 408, 433], [203, 396, 445], [437, 84, 445], [351, 280, 449], [425, 168, 457], [261, 380, 465],

[ 31, 480, 481], [319, 360, 481], [ 93, 476, 485], [483, 44, 485], [155, 468, 493], [475, 132, 493], [217, 456, 505], [377, 336, 505], [459, 220, 590], [279, 440, 521], [435, 308, 533],

[525, 92, 533], [341, 420, 541], [ 33, 544, 545], [513, 184, 545], [165, 532, 557], [403, 396, 565], [493, 276, 556], [231, 520, 569], [575, 48, 577], [465, 368, 593], [551, 240, 601],

[ 35, 612, 613], [105, 608, 617], [527, 336, 627], [429, 460, 629], [621, 100, 629]], dtype=uint64)

In [25]:

?np.where



Exercise

Using the array on Pythagorean triples loaded above, write numpy code to do the following:

1. Create three arrays, viz. baseSq, altitudeSq, and hypotenuseSq respectively by

squaring the first, second, and third columns of the pyTrips array.

2. Use numpy.where and the three arrays created above to find which of these triples are not really Pythagorean.

3. Count the number of non-Pythagorean triples.
