Numpy for Python

  • 7/25/2019 Numpy for Python

    1/27

    1/11/2016 01-numpy

    file:///home/fractaluser/Downloads/01-numpy.html 1/27

    Numpy highlights

    ndarray: fast and space-efficient multidimensional array with vectorized arithmetic and sophisticated broadcasting

    standard vectorized math

    reading / writing arrays to disk

    memory-mapped file access

    linear algebra, random number generation, Fourier transform

    integration of C, C++, and Fortran code

    Creating an ndarray

    The array function

    The array function is the workhorse for creating numpy ndarrays on the fly from other Python sequence-like objects, primarily tuples and lists.

    In [2]:

    import numpy as np
    import random

    np.array(range(3))

    Out[2]:

    array([0, 1, 2])

    In [2]:

    np.array((1, 2, 3))

    Out[2]:

    array([1, 2, 3])

    In [3]:

    np.array([1, 2, 3])

    Out[3]:

    array([1, 2, 3])

    In [4]:

    np.array(list('hello'))

    Out[4]:

    array(['h', 'e', 'l', 'l', 'o'],dtype='

    Nested lists are treated as multi-dimensional arrays.

    In [5]:

    random.seed(3.141)
    dataList = [[random.uniform(0, 9) for x in range(3)] for y in range(4)]

    dataList

    Out[5]:

    [[4.844700289907117, 3.285473931529339, 2.1797684393413155],[5.824634396536993, 0.8824946651389621, 2.76732952458187],[1.8329068547314877, 4.527186437261438, 3.5724501538885134],[2.144914100647332, 4.951733405544532, 4.325440230285053]]

    In [6]:

    data = np.array(dataList)
    data

    Out[6]:

    array([[ 4.84470029, 3.28547393, 2.17976844], [ 5.8246344 , 0.88249467, 2.76732952], [ 1.83290685, 4.52718644, 3.57245015], [ 2.1449141 , 4.95173341, 4.32544023]])

    In [7]:

    # Important information about an ndarray
    print(data.ndim)   # Number of dimensions
    print(data.shape)  # Shape of the ndarray
    print(data.dtype)  # Data type contained in the array

    2
    (4, 3)
    float64

    Other functions to create arrays

    arange

    arange is the array counterpart of Python's built-in range function: it returns a one-dimensional ndarray instead of a range object:

    In [8]:

    np.arange(10)

    Out[8]:

    array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
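Like range, arange also accepts start, stop, and step arguments, and unlike range the step may be fractional. The following is a minimal sketch; the variable names evens and halves are illustrative and not part of the original notebook.

```python
import numpy as np

# start, stop (exclusive), step: the same argument order as range
evens = np.arange(0, 10, 2)

# A fractional step is accepted by arange but not by range
halves = np.arange(0, 2, 0.5)

print(evens)         # the even numbers 0, 2, 4, 6, 8
print(halves)        # 0, 0.5, 1, 1.5
print(halves.dtype)  # float64: the dtype follows the arguments
```

For fractional steps, floating-point rounding can make the number of elements hard to predict, so integer steps are the safer default.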

    ones, ones_like, zeros, zeros_like

    These functions create arrays filled with ones or zeros, either with a given shape or with the same shape and data type as a given array:

    In [9]:

    np.ones(3)

    Out[9]:

    array([ 1., 1., 1.])

    In [10]:

    np.ones((3, 3))

    Out[10]:

    array([[ 1., 1., 1.], [ 1., 1., 1.], [ 1., 1., 1.]])

    In [11]:

    np.zeros((4))

    Out[11]:

    array([ 0., 0., 0., 0.])

    In [12]:

    np.zeros((4, 4, 4))

    Out[12]:

    array([[[ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.]],

    [[ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.]],

    [[ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.]],

    [[ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.], [ 0., 0., 0., 0.]]])

    In [13]:

    np.ones_like(data)

    Out[13]:

    array([[ 1., 1., 1.], [ 1., 1., 1.], [ 1., 1., 1.], [ 1., 1., 1.]])

    In [14]:

    np.zeros_like(data)

    Out[14]:

    array([[ 0., 0., 0.],
           [ 0., 0., 0.],
           [ 0., 0., 0.],
           [ 0., 0., 0.]])

    empty, empty_like

    Just like ones and zeros, except that the memory is left uninitialized, so the array holds arbitrary (garbage) values rather than zeros.

    In [15]:

    np.empty((2, 3))

    Out[15]:

    array([[ 6.91635841e-310, 6.91636044e-310, 6.91635874e-310], [ 6.91635549e-310, 6.91635128e-310, 6.91635185e-310]])

    In [16]:

    np.empty_like(data)

    Out[16]:

    array([[ 0., 0., 0.], [ 0., 0., 0.], [ 0., 0., 0.], [ 0., 0., 0.]])

    The ndarray constructor is a lower-level way to create numpy arrays that offers even more control. However, we do not explore it here.

    Choosing a data type

    Most array creation functions take a dtype argument which can be used to explicitly specify the data

    type with which the array should be created.

    In [17]:

    # The np.int alias was removed in NumPy 1.24; the builtin int is equivalent.
    np.array(dataList, dtype=int)  # Everything is truncated to integers

    Out[17]:

    array([[4, 3, 2], [5, 0, 2], [1, 4, 3], [2, 4, 4]])

    In [18]:

    np.array(dataList, dtype=float)  # Builtin float; the np.float alias was removed in NumPy 1.24

    Out[18]:

    array([[ 4.84470029, 3.28547393, 2.17976844],
           [ 5.8246344 , 0.88249467, 2.76732952],
           [ 1.83290685, 4.52718644, 3.57245015],
           [ 2.1449141 , 4.95173341, 4.32544023]])

    In [19]:

    np.array(dataList, dtype=bool)  # All nonzero numbers are True

    Out[19]:

    array([[ True, True, True],
           [ True, True, True],
           [ True, True, True],
           [ True, True, True]], dtype=bool)

    In [20]:

    np.array(dataList, dtype=np.str_)  # Unicode strings; np.unicode_ was an alias of np.str_ and was removed in NumPy 2.0

    Out[20]:

    array([['4.844700289907117', '3.285473931529339', '2.1797684393413155'], ['5.824634396536993', '0.8824946651389621', '2.76732952458187'],

    ['1.8329068547314877', '4.527186437261438', '3.5724501538885134'], ['2.144914100647332', '4.951733405544532', '4.325440230285053']],

    dtype='

    In [21]:

    np.array(dataList, dtype=object)  # Arrays containing arbitrary objects (np.object was removed in NumPy 1.24)

    Out[21]:

    array([[4.844700289907117, 3.285473931529339, 2.1797684393413155], [5.824634396536993, 0.8824946651389621, 2.76732952458187], [1.8329068547314877, 4.527186437261438, 3.5724501538885134], [2.144914100647332, 4.951733405544532, 4.325440230285053]], dtype=object)

    In [3]:

    np.array([3, 3.141, 'Pi'], dtype=object)

    Out[3]:

    array([3, 3.141, 'Pi'], dtype=object)

    Numpy provides a richer set of types than shown here: integers and floats come in several sizes, to be chosen based on requirements. Refer to the book for the details.
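As a rough illustration of what choosing a size means, the sketch below compares the memory footprint and value range of two integer types. The arrays small and large are made up for this example.

```python
import numpy as np

small = np.array([1, 2, 3], dtype=np.int8)    # 1 byte per element
large = np.array([1, 2, 3], dtype=np.int64)   # 8 bytes per element

print(small.itemsize, large.itemsize)   # bytes per element: 1 and 8
print(small.nbytes, large.nbytes)       # total bytes: 3 and 24

# iinfo reports the representable range of an integer type
int8_info = np.iinfo(np.int8)
print(int8_info.min, int8_info.max)     # -128 and 127
```

Smaller types save memory, at the price of a narrower range of representable values.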

    Casting to a data type

    A numpy ndarray carries an astype method which may be used to cast the elements of the array to a new type. Note that, by default, this operation always creates a copy of the original array, even if the elements are being cast to the same data type. Let's look at a few examples.

    In [22]:

    data

    Out[22]:

    array([[ 4.84470029, 3.28547393, 2.17976844], [ 5.8246344 , 0.88249467, 2.76732952],

    [ 1.83290685, 4.52718644, 3.57245015], [ 2.1449141 , 4.95173341, 4.32544023]])

    In [23]:

    data.astype(np.int8)

    Out[23]:

    array([[4, 3, 2], [5, 0, 2],

    [1, 4, 3], [2, 4, 4]], dtype=int8)

    In [24]:

    data.astype(np.uint32)

    Out[24]:

    array([[4, 3, 2], [5, 0, 2], [1, 4, 3], [2, 4, 4]], dtype=uint32)

    In [25]:

    data.astype(np.float128)  # Extended precision; not available on every platform (np.longdouble is the portable name)

    Out[25]:

    array([[ 4.8447003, 3.2854739, 2.1797684], [ 5.8246344, 0.88249467, 2.7673295], [ 1.8329069, 4.5271864, 3.5724502],

    [ 2.1449141, 4.9517334, 4.3254402]], dtype=float128)

    In [26]:

    data.astype(np.complex64)

    Out[26]:

    array([[ 4.84470034+0.j, 3.28547382+0.j, 2.17976832+0.j], [ 5.82463455+0.j, 0.88249469+0.j, 2.76732945+0.j], [ 1.83290684+0.j, 4.52718639+0.j, 3.57245016+0.j],

    [ 2.14491415+0.j, 4.95173359+0.j, 4.32544041+0.j]], dtype=complex64)

    In [27]:

    data.astype(np.bytes_)  # Byte strings; the older alias np.string_ was removed in NumPy 2.0

    Out[27]:

    array([[b'4.844700289907117', b'3.285473931529339', b'2.1797684393413155'],
           [b'5.824634396536993', b'0.8824946651389621', b'2.76732952458187'],
           [b'1.8329068547314877', b'4.527186437261438', b'3.5724501538885134'],
           [b'2.144914100647332', b'4.951733405544532', b'4.325440230285053']],
          dtype='|S32')
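The copying behaviour of astype can be checked directly. The sketch below uses a small made-up array, original, and the numpy function shares_memory to show that a cast to the same type still yields an independent array, and that casting floats to integers truncates toward zero.

```python
import numpy as np

original = np.array([1.5, -2.5, 3.5])

# Same dtype as the source, yet astype returns a new array by default
same_type = original.astype(np.float64)
same_type[0] = 99.0                      # modifies the copy only

print(original[0])                            # still 1.5
print(np.shares_memory(original, same_type))  # False

# Float to integer casts drop the fractional part (truncation toward zero)
truncated = original.astype(np.int64)
print(truncated)                              # 1, -2, 3
```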

    Vectorized math

    Numpy's ndarrays allow concise mathematical expressions without the need for iteration using for loops, much like R.

    In [28]:

    data + data

    Out[28]:

    array([[ 9.68940058, 6.57094786, 4.35953688], [ 11.64926879, 1.76498933, 5.53465905], [ 3.66581371, 9.05437287, 7.14490031], [ 4.2898282 , 9.90346681, 8.65088046]])

    In [29]:

    data * 3

    Out[29]:

    array([[ 14.53410087, 9.85642179, 6.53930532], [ 17.47390319, 2.647484 , 8.30198857], [ 5.49872056, 13.58155931, 10.71735046],

    [ 6.4347423 , 14.85520022, 12.97632069]])

    In [30]:

    1 / data

    Out[30]:

    array([[ 0.20641112, 0.30437009, 0.45876433], [ 0.1716846 , 1.13315133, 0.36135921], [ 0.54558146, 0.22088774, 0.27991993],

    [ 0.46621914, 0.20194948, 0.23119034]])

    In [31]:

    2 ** data.astype(int)

    Out[31]:

    array([[16, 8, 4], [32, 1, 4], [ 2, 16, 8], [ 4, 16, 16]])

    In [7]:

    data * [1, 2, 3]  # Elementwise multiplication of columns

    Out[7]:

    array([[ 4.84470029, 6.57094786, 6.53930532], [ 5.8246344 , 1.76498933, 8.30198857], [ 1.83290685, 9.05437287, 10.71735046], [ 2.1449141 , 9.90346681, 12.97632069]])

    In [8]:

    data * [1, 2, 3, 4]

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    ----> 1 data * [1, 2, 3, 4]

    ValueError: operands could not be broadcast together with shapes (4,3) (4,)

    In [12]:

    x = np.array([[1], [2], [3], [4]])
    print(x, "\n\n", x.shape)

    [[1]
     [2]
     [3]
     [4]]

    (4, 1)

    In [9]:

    data * [[1], [2], [3], [4]]  # Elementwise multiplication of rows

    Out[9]:

    array([[ 4.84470029, 3.28547393, 2.17976844],
           [ 11.64926879, 1.76498933, 5.53465905],
           [ 5.49872056, 13.58155931, 10.71735046],
           [ 8.5796564 , 19.80693362, 17.30176092]])
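The column-wise and row-wise multiplications above both rely on broadcasting: shapes are compared starting from the trailing dimension, and two dimensions are compatible when they are equal or when one of them is 1. The sketch below replays the idea on a small integer array; the names m, col_weights, and row_weights are illustrative only.

```python
import numpy as np

m = np.arange(12).reshape(4, 3)                 # shape (4, 3)
col_weights = np.array([1, 2, 3])               # shape (3,): matches the trailing dimension
row_weights = np.array([[1], [2], [3], [4]])    # shape (4, 1): stretched across the columns

by_column = m * col_weights    # each column is scaled by its weight
by_row = m * row_weights       # each row is scaled by its weight

print(by_column[1])            # row [3, 4, 5] becomes 3, 8, 15
print(by_row[3])               # row [9, 10, 11] becomes 36, 40, 44

# np.broadcast reports the shape that the operands are stretched to
result_shape = np.broadcast(m, row_weights).shape
print(result_shape)            # (4, 3)
```

A length-4 list fails against a (4, 3) array because its trailing dimension, 4, is neither 3 nor 1; reshaping it to (4, 1) makes it compatible.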

    In [34]:

    diag = np.diag([1, 2, 3])
    diag

    Out[34]:

    array([[1, 0, 0], [0, 2, 0], [0, 0, 3]])

    In [35]:

    data * diag  # Will not do matrix multiplication

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    ----> 1 data * diag # Will not do matrix multiplication

    ValueError: operands could not be broadcast together with shapes (4,3) (3,3)

    In [36]:

    np.dot(data, diag)

    Out[36]:

    array([[ 4.84470029, 6.57094786, 6.53930532], [ 5.8246344 , 1.76498933, 8.30198857], [ 1.83290685, 9.05437287, 10.71735046], [ 2.1449141 , 9.90346681, 12.97632069]])
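For two-dimensional arrays, np.dot performs matrix multiplication, and the @ operator (Python 3.5 and later) is an equivalent spelling. Right-multiplying by a diagonal matrix scales the columns, which is why the result above matches data * [1, 2, 3]. The sketch below checks this on a small made-up array m.

```python
import numpy as np

m = np.arange(12, dtype=float).reshape(4, 3)
d = np.diag([1, 2, 3])

via_dot = np.dot(m, d)           # matrix product
via_operator = m @ d             # same product written with the @ operator
via_broadcast = m * [1, 2, 3]    # column scaling through broadcasting

print(np.allclose(via_dot, via_operator))    # True
print(np.allclose(via_dot, via_broadcast))   # True
```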

    In [37]:

    print(np.sqrt(data), "\n\n", np.log(data), "\n\n", np.exp(data))

    [[ 2.20106799 1.81258763 1.47640389]
     [ 2.41342793 0.93941187 1.66352924]
     [ 1.3538489 2.1277186 1.89009263]
     [ 1.46455253 2.22524907 2.07976927]]

    [[ 1.57788538 1.18951091 0.77921865]
     [ 1.76209623 -0.12500254 1.01788278]
     [ 0.60590315 1.51010065 1.27325168]
     [ 0.76309951 1.5997377 1.46451392]]

    [[ 127.06519357 26.72164554 8.84425804]
     [ 338.53734003 2.4169216 15.91607372]
     [ 6.25203402 92.49794585 35.60372097]
     [ 8.54130751 141.4198896 75.59878641]]

    In numpy parlance, functions that apply the same operation to each element of an array are called universal functions or ufuncs. On the other hand, there are functions that return aggregations or other types of operations on ndarrays. Refer to McKinney (2012) for details.

    In [38]:

    print(np.sum(data), "\n\n", np.min(data), "\n\n", np.max(data), "\n\n", np.mean(data))

    41.1390324294

    0.882494665139

    5.82463439654

    3.42825270245
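The aggregations above collapse the whole array to a single number. They also accept an axis argument to aggregate along one dimension only. The following sketch uses a small made-up array m to show the difference.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

total = np.sum(m)           # aggregate over every element
col_sums = m.sum(axis=0)    # collapse the rows: one value per column
row_sums = m.sum(axis=1)    # collapse the columns: one value per row
col_means = m.mean(axis=0)

print(total)       # 21
print(col_sums)    # 5, 7, 9
print(row_sums)    # 6, 15
print(col_means)   # 2.5, 3.5, 4.5
```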

    Indexing ndarrays

    Indexing by position

    ndarrays are collections of data of a given shape, size, and type. Indexing is the way to access elements, or sub-collections of elements, of a given ndarray.

    Numpy provides a rich set of indexing semantics that allow concise expression of various indexing

    operations.

    One-dimensional arrays

    The simplest indexing operation into any one dimension of a numpy ndarray mimics the semantics of Python's own indexing and slicing notation: square brackets with an optional start:stop:step.

    Let's look at this using a one-dimensional array.

    In [39]:

    oneD = np.arange(10)
    oneD

    Out[39]:

    array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

    In [40]:

    # Standard Python indexing: similar to lists and tuples
    print(oneD[0])      # 0-based indexing
    print(oneD[2:4])    # Indexing a range: excludes right limit
    print(oneD[:4])     # Indexing from the beginning implicitly
    print(oneD[4:])     # Indexing to the end implicitly
    print(oneD[1:5:2])  # Indexing with jumps
    print(oneD[::-1])   # Reversing an array
    print(oneD[:-2])    # Negative indices to index from the end

    0
    [2 3]
    [0 1 2 3]
    [4 5 6 7 8 9]
    [1 3]
    [9 8 7 6 5 4 3 2 1 0]
    [0 1 2 3 4 5 6 7]

    Indexing can be combined with the = operator to assign values in an ndarray.

    In [41]:

    oneD[3] = 13
    oneD

    Out[41]:

    array([ 0, 1, 2, 13, 4, 5, 6, 7, 8, 9])

    In [42]:

    oneD[1:3] = [11, 12]
    oneD

    Out[42]:

    array([ 0, 11, 12, 13, 4, 5, 6, 7, 8, 9])

    In [43]:

    oneD[4:7] = [1, 2]  # The shape of the replacement must match

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    ----> 1 oneD[4:7] = [1, 2] # The shape of the replacement must match

    ValueError: cannot copy sequence with size 2 to array axis with dimension 3

    In [44]:

    oneD[7:9] = 20  # Scalars are replicated to fill space
    oneD

    Out[44]:

    array([ 0, 11, 12, 13, 4, 5, 6, 20, 20, 9])

    Aside: Indexing by position creates views

    Numpy intends to be conservative with memory usage and is designed such that indexing / slicing into an ndarray does not copy elements unless explicitly asked to. Therefore, slices are views into the original array: the elements of a slice are the very same elements as those of the original.

    In [45]:

    oneDSlice = oneD[2:5]

    oneDSlice

    Out[45]:

    array([12, 13, 4])

    Since the slice is just a view into the original ndarray, changes to the slice are also reflected in the original. Therefore, code that works on slices must be careful not to introduce unwanted changes into the original ndarray.

    In [46]:

    oneDSlice[2] = 14
    print(oneDSlice)
    oneD

    [12 13 14]

    Out[46]:

    array([ 0, 11, 12, 13, 14, 5, 6, 20, 20, 9])

    To create an explicit copy of the view / slice, one may use the copy method available with an array.

    For example:

    In [47]:

    oneDCopy = oneD[2:5].copy()
    oneDCopy

    Out[47]:

    array([12, 13, 14])

    In [48]:

    oneDCopy[2] = 24
    print(oneDCopy)
    oneD

    [12 13 24]

    Out[48]:

    array([ 0, 11, 12, 13, 14, 5, 6, 20, 20, 9])
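Whether two arrays share data can also be checked without modifying them. The sketch below uses the numpy function shares_memory and the base attribute of an array; the names base, view, and copy are illustrative.

```python
import numpy as np

base = np.arange(10)
view = base[2:5]            # slicing: no data is copied
copy = base[2:5].copy()     # explicit, independent copy

print(np.shares_memory(base, view))   # True
print(np.shares_memory(base, copy))   # False
print(view.base is base)              # True: the view remembers its parent array

view[0] = -1                # writes through to the parent
print(base[2])              # -1
print(copy[0])              # still 2
```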

    Indexing multi-dimensional arrays

    Multi-dimensional arrays differ from one-dimensional arrays: where the elements of a one-dimensional array are scalars, the elements of a multi-dimensional array are themselves arrays.

    For example, a two-dimensional array (or a matrix) can be considered as an array of row-arrays.

    In [49]:

    data

    Out[49]:

    array([[ 4.84470029, 3.28547393, 2.17976844], [ 5.8246344 , 0.88249467, 2.76732952], [ 1.83290685, 4.52718644, 3.57245015], [ 2.1449141 , 4.95173341, 4.32544023]])

    In [50]:

    row1 = data[0]
    row1

    Out[50]:

    array([ 4.84470029, 3.28547393, 2.17976844])

    However, a two-dimensional array can also be considered as an array of column-arrays. How do we

    access that first column for example?

    Numpy provides n indices into any arbitrary n-dimensional array. Any specific element in the

    ndarray can be accessed by specifying the position of the element along the n dimensions.

    In [51]:

    data[1, 1]  # Access the second diagonal element in 'data'

    Out[51]:

    0.88249466513896213

    In [52]:

    data[:, 1]  # Access the second column where `:` stands for all rows

    Out[52]:

    array([ 3.28547393, 0.88249467, 4.52718644, 4.95173341])

    In [53]:

    data[2, :]  # Access the third row (indices are 0-based)

    Out[53]:

    array([ 1.83290685, 4.52718644, 3.57245015])

    In [54]:

    data[:2, :]  # First and second row, all columns

    Out[54]:

    array([[ 4.84470029, 3.28547393, 2.17976844], [ 5.8246344 , 0.88249467, 2.76732952]])

    In [55]:

    data[:2, ::-1]  # First and second row, all columns reversed

    Out[55]:

    array([[ 2.17976844, 3.28547393, 4.84470029],

    [ 2.76732952, 0.88249467, 5.8246344 ]])

    Fancy Indexing

    All the position-based indexing that we have seen till now uses slice objects, which create views into the ndarray instead of copying its elements.

    The slice-based indexing notation, however, does not allow one to take values at an arbitrary set of positions out of the array. For example, consider an array of ten elements and the problem of taking the first, the fourth, and the ninth element out of it.

    These are the situations where 'fancy indexing' is used instead: the index is a list or array of positions. An important thing to note is that fancy indexing does not create a view into the existing array; instead it creates a new ndarray into which the desired elements are copied. Therefore, fancy indexing should be avoided if one can use slices to index with.

    In [56]:

    data

    Out[56]:

    array([[ 4.84470029, 3.28547393, 2.17976844], [ 5.8246344 , 0.88249467, 2.76732952], [ 1.83290685, 4.52718644, 3.57245015], [ 2.1449141 , 4.95173341, 4.32544023]])

    In [57]:

    data[[2, 0, 1]]

    Out[57]:

    array([[ 1.83290685, 4.52718644, 3.57245015], [ 4.84470029, 3.28547393, 2.17976844],

    [ 5.8246344 , 0.88249467, 2.76732952]])

    In [58]:

    data[:, [-1, 2, 0]]

    Out[58]:

    array([[ 2.17976844, 2.17976844, 4.84470029], [ 2.76732952, 2.76732952, 5.8246344 ], [ 3.57245015, 3.57245015, 1.83290685], [ 4.32544023, 4.32544023, 2.1449141 ]])

    In [59]:

    data[[2, 0, 1], [-1, 2, 0]]

    Out[59]:

    array([ 3.57245015, 2.17976844, 5.8246344 ])

    In [60]:

    data[[2, 0, 1], [-1, 2]]

    ---------------------------------------------------------------------------
    IndexError                                Traceback (most recent call last)
    ----> 1 data[[2, 0, 1], [-1, 2]]

    IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (3,) (2,)

    In [61]:

    data[np.ix_([2, 0, 1], [-1, 2])]

    Out[61]:

    array([[ 3.57245015, 3.57245015], [ 2.17976844, 2.17976844], [ 2.76732952, 2.76732952]])
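The examples above show two distinct behaviours: two index lists are paired element by element, whereas np.ix_ crosses them to select a rectangular block. The sketch below repeats both on a small made-up array m and also checks that the result of fancy indexing is a copy.

```python
import numpy as np

m = np.arange(12).reshape(4, 3)

picked = m[[2, 0]]                  # rows 2 and 0, copied into a new array
picked[0, 0] = -1                   # does not touch m
print(m[2, 0])                      # still 6
print(np.shares_memory(m, picked))  # False

pairs = m[[2, 0], [1, 2]]           # elements at (2, 1) and (0, 2)
block = m[np.ix_([2, 0], [1, 2])]   # rows 2, 0 crossed with columns 1, 2

print(pairs)                        # 7, 2
print(block)                        # rows [7, 8] and [1, 2]
```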

    Boolean indexing

    The indexing semantics that we have seen till now are useful when the position of the subset to be extracted is known. However, there are often situations where one wants to select a subset of a collection not by position but based on some predicate.

    For example, in an ndarray of numbers between zero and nine, such as data below, one may want to select the elements that are less than 4. In such situations boolean indexing comes in useful.

    In [62]:

    print(data)  # Let's recap what is in data.

    [[ 4.84470029 3.28547393 2.17976844]
     [ 5.8246344 0.88249467 2.76732952]
     [ 1.83290685 4.52718644 3.57245015]
     [ 2.1449141 4.95173341 4.32544023]]

    In [63]:

    data[np.array([True, False, True, False])]  # Subset rows

    Out[63]:

    array([[ 4.84470029, 3.28547393, 2.17976844], [ 1.83290685, 4.52718644, 3.57245015]])

    In [64]:

    data[:, np.array([True, False, True])]  # Subset columns

    Out[64]:

    array([[ 4.84470029, 2.17976844], [ 5.8246344 , 2.76732952], [ 1.83290685, 3.57245015], [ 2.1449141 , 4.32544023]])

    In [65]:

    data[np.logical_not(np.array([True, False, True, False]))]  # Invert bool array; older numpy also accepted unary -, which now raises a TypeError

    Out[65]:

    array([[ 5.8246344 , 0.88249467, 2.76732952], [ 2.1449141 , 4.95173341, 4.32544023]])

    In [66]:

    data[~np.array([True, False, True, False])]  # Invert bool array

    Out[66]:

    array([[ 5.8246344 , 0.88249467, 2.76732952], [ 2.1449141 , 4.95173341, 4.32544023]])

    A boolean indexing array must match the length of the axis that it indexes. Older numpy versions treated values left out of a too-short indexing array as False; current versions raise an IndexError instead, so the row mask below spells out its trailing False. Note also that two boolean arrays used together pair up their True positions, here (0, 0) and (2, 2), rather than selecting a sub-matrix. Best to avoid such indexing.

    In [21]:

    x = data[np.array([True, False, True, False]), np.array([True, False, True])]
    print(x)
    print(x.ndim)
    print(x.shape)

    [ 4.84470029 3.57245015]
    1
    (2,)

    Aside: More on booleans

    Conditional operations on other arrays generate boolean arrays as well. For example:

    In [68]:

    data < 4  # Returns an array of booleans

    Out[68]:

    array([[False, True, True], [False, True, True], [ True, False, True], [ True, False, False]], dtype=bool)

    These multidimensional boolean arrays can be used to index other arrays, but with surprising, yet correct, behavior: the selected elements are returned as a flat one-dimensional array. If conditional arrays are desired for conditional assignments, the numpy.where function is handy.

    In [69]:

    data[data < 4]  # Mangles the shape of the array

    Out[69]:

    array([ 3.28547393, 2.17976844, 0.88249467, 2.76732952,

    1.83290685, 3.57245015, 2.1449141 ])

    In [70]:

    np.where(data < 4, data, -data)  # Use `where` for conditional assignments

    Out[70]:

    array([[-4.84470029, 3.28547393, 2.17976844], [-5.8246344 , 0.88249467, 2.76732952], [ 1.83290685, -4.52718644, 3.57245015], [ 2.1449141 , -4.95173341, -4.32544023]])
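The contrast between indexing with a boolean mask and using numpy.where can be seen on a small made-up array. In this sketch the names values, mask, clipped, and capped are illustrative; it also shows in-place assignment through a mask, which keeps the shape intact.

```python
import numpy as np

values = np.array([[5, 1, 7],
                   [2, 8, 3]])

mask = values < 4
selected = values[mask]                  # flat result: 1, 2, 3
print(selected)

clipped = np.where(mask, values, 4)      # keep small values, replace the rest with 4
print(clipped)                           # rows [4, 1, 4] and [2, 4, 3]

capped = values.copy()
capped[capped > 4] = 4                   # assignment through a boolean mask
print(np.array_equal(clipped, capped))   # True
```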

    Logical combinations on booleans

    Numpy provides the standard logical operations and (&), or (|), and not (~), which we have already seen in action. Older numpy versions also accepted unary - for not on boolean arrays; current versions raise a TypeError. Besides the symbolic operators, these logical operations are also provided as functions in the numpy library.

    In [71]:

    np.array([True, False, True, False]) | np.array([True] * 4)

    Out[71]:

    array([ True, True, True, True], dtype=bool)

    In [72]:

    np.array([True, False, True, False]) & np.array([True] * 4)

    Out[72]:

    array([ True, False, True, False], dtype=bool)

    In [73]:

    np.logical_or(np.array([True, False] * 2), np.array([True] * 4))

    Out[73]:

    array([ True, True, True, True], dtype=bool)

    In [74]:

    np.logical_and(np.array([True, False] * 2), np.array([True] * 4))

    Out[74]:

    array([ True, False, True, False], dtype=bool)

    In [75]:

    np.logical_not(np.array([True, False]))

    Out[75]:

    array([False, True], dtype=bool)

    Logical aggregations

    Any array of logicals can be reduced to a single logical value in two different ways: whether all values in the array are true, or whether any value in the array is true. Numpy provides .any() and .all() methods on arrays to do these aggregations. For example:

    In [76]:

    np.array([True, False, True, True]).all()

    Out[76]:

    False

    In [77]:

    np.array([True, False, True, True]).any()

    Out[77]:

    True

    In [78]:

    np.array([True, True, True, True]).all()

    Out[78]:

    True

    In [79]:

    np.array([False, False, False, False]).all()

    Out[79]:

    False
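On multi-dimensional arrays, any and all also accept an axis argument, in the same way as the numerical aggregations. A short sketch on a made-up array named flags:

```python
import numpy as np

flags = np.array([[True, False, True],
                  [True, True, True]])

print(flags.all())             # False: not every element is True
print(flags.any())             # True: at least one element is True

all_per_column = flags.all(axis=0)   # True, False, True
all_per_row = flags.all(axis=1)      # False, True
true_count = np.count_nonzero(flags) # number of True values: 5

print(all_per_column, all_per_row, true_count)
```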

    Exercise: Create two functions:

    1. none: When given a logical array, returns True if all elements in the array are False.

    2. notall: When given a logical array, returns True if any element in the array is False.

    Transposing arrays

    There are two ways to transpose a numpy array. Each array has a .T attribute which returns a view that is the transpose of the original array. Alternatively, one may use the numpy.transpose function, which also returns a transposed view (a shallow copy) of the array.

    The thing of particular note is that each of these methods provides a view, and one must use the .copy() method to obtain a true copy. Transposing arrays is only one of the many places where the programmer needs to exercise special caution to ensure that there is no action at a distance, which can be the cause of many subtle bugs.

    In [80]:

    data.T

    Out[80]:

    array([[ 4.84470029, 5.8246344 , 1.83290685, 2.1449141 ], [ 3.28547393, 0.88249467, 4.52718644, 4.95173341], [ 2.17976844, 2.76732952, 3.57245015, 4.32544023]])

    In [81]:

    np.transpose(data)

    Out[81]:

    array([[ 4.84470029, 5.8246344 , 1.83290685, 2.1449141 ], [ 3.28547393, 0.88249467, 4.52718644, 4.95173341], [ 2.17976844, 2.76732952, 3.57245015, 4.32544023]])

    In [82]:

    # NB: Remember that the transpose is only a shallow copy:
    dataT = data.T
    print(dataT)

    [[ 4.84470029 5.8246344 1.83290685 2.1449141 ]
     [ 3.28547393 0.88249467 4.52718644 4.95173341]
     [ 2.17976844 2.76732952 3.57245015 4.32544023]]

    In [83]:

    dataT[1, 1] = 1
    print(data)

    [[ 4.84470029 3.28547393 2.17976844]
     [ 5.8246344 1. 2.76732952]
     [ 1.83290685 4.52718644 3.57245015]
     [ 2.1449141 4.95173341 4.32544023]]

    Therefore, with numpy it is always better to use .copy() explicitly when copies are desired (or take a careful read of the documentation). Let's look at an example:

    In [84]:

    dataT2 = data.T.copy()
    dataT2[1, 1] = 2
    print(dataT2, "\n\n", data)

    [[ 4.84470029 5.8246344 1.83290685 2.1449141 ]
     [ 3.28547393 2. 4.52718644 4.95173341]
     [ 2.17976844 2.76732952 3.57245015 4.32544023]]

    [[ 4.84470029 3.28547393 2.17976844]
     [ 5.8246344 1. 2.76732952]
     [ 1.83290685 4.52718644 3.57245015]
     [ 2.1449141 4.95173341 4.32544023]]
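The view-versus-copy distinction for transposes can be verified without modifying any data. The sketch below applies the numpy function shares_memory to a small made-up array m.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)

t_attr = m.T               # transposed view through the attribute
t_func = np.transpose(m)   # transposed view through the function
t_copy = m.T.copy()        # independent transposed copy

print(t_attr.shape)                   # (3, 2)
print(np.shares_memory(m, t_attr))    # True
print(np.shares_memory(m, t_func))    # True
print(np.shares_memory(m, t_copy))    # False
```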

    Reading from and writing to text files

    Numpy reads and writes delimited text datasets using two simple functions: numpy.loadtxt and numpy.savetxt. Let's look at a simple example.

    In [85]:

    !head -10 ../../../data/pythagorean-triples.txt

    3,4,5
    5,12,13
    15,8,17
    7,24,26
    21,20,29
    35,12,37
    9,40,41
    45,28,52
    11,60,61
    33,56,65

    In [22]:

    # Load the pythagorean triples into a two dimensional array
    pyTrips = np.loadtxt("../../../data/pythagorean-triples.txt", dtype=np.uint, delimiter=",")
    print(pyTrips[:10], "\n\n", pyTrips.shape)

    [[ 3 4 5]
     [ 5 12 13]
     [15 8 17]
     [ 7 24 26]
     [21 20 29]
     [35 12 37]
     [ 9 40 41]
     [45 28 52]
     [11 60 61]
     [33 56 65]]

    (101, 3)
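The writing counterpart, numpy.savetxt, is not demonstrated above. The sketch below round-trips a small made-up array through an in-memory text buffer, which stands in for a file on disk; the names triples, buffer, and restored are illustrative, and a reasonably recent numpy that can write to text-mode handles is assumed.

```python
import io
import numpy as np

triples = np.array([[3, 4, 5],
                    [5, 12, 13],
                    [15, 8, 17]])

buffer = io.StringIO()                        # stands in for an open text file
np.savetxt(buffer, triples, fmt="%d", delimiter=",")
written = buffer.getvalue()
print(written)                                # one comma-separated row per line

buffer.seek(0)                                # rewind before reading back
restored = np.loadtxt(buffer, dtype=int, delimiter=",")
print(np.array_equal(triples, restored))      # True
```

Passing a file name instead of the buffer writes to disk; the fmt argument controls how each number is rendered.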

    In [23]:

    pyTrips

    Out[23]:

    array([[ 3, 4, 5], [ 5, 12, 13], [ 15, 8, 17], [ 7, 24, 26], [ 21, 20, 29], [ 35, 12, 37],

    [ 9, 40, 41], [ 45, 28, 52], [ 11, 60, 61], [ 33, 56, 65], [ 63, 16, 65], [ 55, 48, 73], [ 13, 84, 85], [ 77, 36, 86], [ 39, 80, 89], [ 65, 72, 97], [ 99, 20, 101],

    [ 91, 60, 109], [ 15, 112, 111], [117, 44, 125], [105, 88, 137], [ 17, 144, 145], [143, 24, 145], [ 51, 140, 147], [ 85, 132, 157], [119, 120, 169], [165, 52, 173], [ 19, 180, 181],

    [ 57, 176, 185], [153, 104, 185], [ 95, 168, 193], [195, 28, 199], [133, 156, 205], [187, 84, 205], [ 21, 220, 221], [171, 140, 221], [221, 60, 229], [105, 208, 235], [209, 120, 241],

    [255, 32, 257], [ 23, 264, 265], [247, 96, 265], [ 69, 260, 269], [115, 252, 271], [231, 160, 281], [161, 240, 289], [285, 68, 293], [207, 224, 305], [273, 136, 305], [ 25, 312, 313],

    [ 75, 308, 317], [253, 204, 325], [323, 36, 325], [175, 288, 337], [299, 180, 349],

    [225, 272, 353], [ 27, 364, 368], [357, 76, 365], [275, 252, 373], [135, 352, 377], [345, 152, 379], [189, 340, 389], [325, 228, 397],

    [399, 40, 401], [391, 120, 409], [ 29, 420, 421], [ 87, 416, 425], [297, 304, 425], [145, 408, 433], [203, 396, 445], [437, 84, 445], [351, 280, 449], [425, 168, 457], [261, 380, 465],

    [ 31, 480, 481], [319, 360, 481], [ 93, 476, 485], [483, 44, 485], [155, 468, 493], [475, 132, 493], [217, 456, 505], [377, 336, 505], [459, 220, 590], [279, 440, 521], [435, 308, 533],

    [525, 92, 533], [341, 420, 541], [ 33, 544, 545], [513, 184, 545], [165, 532, 557], [403, 396, 565], [493, 276, 556], [231, 520, 569], [575, 48, 577], [465, 368, 593], [551, 240, 601],

    [ 35, 612, 613], [105, 608, 617], [527, 336, 627], [429, 460, 629], [621, 100, 629]], dtype=uint64)

    In [25]:

    ?np.where

    Exercise

    Using the array on Pythagorean triples loaded above, write numpy code to do the following:

    1. Create three arrays, viz. baseSq, altitudeSq, and hypotenuseSq respectively by

    squaring the first, second, and third columns of the pyTrips array.

    2. Use numpy.where and the three arrays created above to find which of these triples are not really Pythagorean.

    3. Count the number of non-Pythagorean triples.