Tag: python

  • Join two or more arrays using `numpy.concatenate`

    np.concatenate joins NumPy arrays along an existing axis.
    Here we cover some of what this function can do.
    It takes a sequence (for example, a list) of arrays as its first argument, plus some optional keyword arguments, one of which, axis, we cover here.

    The simplest usage of np.concatenate is as follows:

    Python
    import numpy as np
    
    # make two arrays
    a = np.array([1,2,3])
    b = np.array([4,5,6])
    
    # concatenate
    c = np.concatenate([a,b], axis=0)
    
    # print all the arrays and their shapes
    print('a:\n', a)
    print('\nb:\n', b)
    print('\nc:\n', c)
    
    print('\nshape of a = ', a.shape)
    print('\nshape of b = ', b.shape)
    print('\nshape of c = ', c.shape)
    a:
     [1 2 3]
    
    b:
     [4 5 6]
    
    c:
     [1 2 3 4 5 6]
    
    shape of a =  (3,)
    
    shape of b =  (3,)
    
    shape of c =  (6,)
    

    Both inputs in the code above are one-dimensional, so they have just one axis. np.concatenate has a keyword argument, axis, whose default value is 0. In the code above, we have written it explicitly for clarity.
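
    As a small sketch of the point about the axis argument (the variable names c_default, c_explicit and axis_error are only illustrative), the following checks that omitting axis gives the same result as axis=0, and that asking for axis=1 on one-dimensional arrays fails:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# axis defaults to 0, so these two calls give the same result
c_default = np.concatenate([a, b])
c_explicit = np.concatenate([a, b], axis=0)
print(np.array_equal(c_default, c_explicit))

# a one-dimensional array has no axis 1, so this call fails
try:
    np.concatenate([a, b], axis=1)
    axis_error = None
except ValueError as err:  # NumPy's AxisError is a subclass of ValueError
    axis_error = err
    print('error:', err)
```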

    Let’s create similar arrays with two dimensions. To do this, we pass a list of numbers nested inside another list to the np.array function.

    Python
    a = np.array([[1,2,3]])
    b = np.array([[4,5,6]])
    
    print('\n\nshape of a = ', a.shape)
    shape of a =  (1, 3)

    The array a has shape (1, 3), and so does b,
    meaning each has 1 row and 3 columns.

    The rows correspond to the 0 (zeroth) axis and the columns correspond
    to the 1 (first) axis.

    Let’s now say we want to stack the two arrays on top of each other. The resulting array should have two rows and three columns.

    We use np.concatenate as follows:

    Python
    c = np.concatenate([a,b], axis=0)
    
    print('\n\na:\n', a)
    print('\n\nb:\n', b)
    print('\n\nc:\n', c)
    
    print('\n\nshape of c = ', c.shape)
    a:
     [[1 2 3]]
    
    b:
     [[4 5 6]]
    
    c:
     [[1 2 3]
     [4 5 6]]
    
    shape of c =  (2, 3)

    To join the two arrays side by side and get an array of shape (1, 6), set
    the keyword argument axis to 1.

    Python
    c = np.concatenate([a,b], axis=1)
    
    print('\n\na:\n', a)
    print('\n\nb:\n', b)
    print('\n\nc:\n', c)
    
    print('\n\nshape of c = ', c.shape)
    a:
     [[1 2 3]]
    
    b:
     [[4 5 6]]
    
    c:
     [[1 2 3 4 5 6]]
    
    shape of c =  (1, 6)

    When axis=0, all the arrays need to have the same number of columns.
    Likewise, when axis=1, all the arrays need to have the same number of rows.
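
    To illustrate the shape rule, here is a minimal sketch with a hypothetical second array d that has one more column than a. Joining along axis=1 works because the row counts match, while joining along axis=0 raises an error because the column counts differ:

```python
import numpy as np

a = np.ones((1, 3))   # 1 row, 3 columns
d = np.ones((1, 4))   # 1 row, 4 columns (hypothetical example array)

# axis=1: row counts match (1 and 1), so this works
side_by_side = np.concatenate([a, d], axis=1)
print(side_by_side.shape)

# axis=0: column counts differ (3 vs 4), so NumPy raises ValueError
try:
    np.concatenate([a, d], axis=0)
    stack_error = None
except ValueError as err:
    stack_error = err
    print('error:', err)
```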

    You may concatenate any number of arrays in one statement, provided their shapes are compatible for concatenation.

    Python
    c = np.concatenate([a,b,a,a,b,a,b], axis=0)
    
    print('\n\na:\n', a)
    print('\n\nb:\n', b)
    print('\n\nc:\n', c)
    
    print('\n\nshape of c = ', c.shape)
    a:
     [[1 2 3]]
    
    b:
     [[4 5 6]]
    
    c:
     [[1 2 3]
     [4 5 6]
     [1 2 3]
     [1 2 3]
     [4 5 6]
     [1 2 3]
     [4 5 6]]
    
    shape of c =  (7, 3)
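
    Because np.concatenate accepts a list of any length, one possible pattern is to collect arrays in a Python list and join them with a single call at the end. The sketch below assumes each piece has shape (1, 3); the names rows and stacked are only illustrative:

```python
import numpy as np

# collect the pieces in a plain Python list first
rows = []
for i in range(4):
    rows.append(np.full((1, 3), i))   # each piece has shape (1, 3)

# join everything with a single call
stacked = np.concatenate(rows, axis=0)
print(stacked.shape)
```

    Each call to np.concatenate builds a new array, so a single call on the full list usually avoids repeated copying compared with concatenating inside the loop.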
  • Convert class labels to categories using keras

    Class labels can be converted to a one-hot encoded array using keras.utils.to_categorical.
    The resulting array has as many rows as there are samples and as many columns as there are classes (by default, the largest label plus one).

    Let’s take an example of an array containing labels.
    First we import numpy and then define the labels array.

    import numpy as np
    labels = np.array([0,0,1,2,1,3,2,1])

    The labels contain four categories.

    np.unique(labels)
    array([0, 1, 2, 3])

    To convert the labels to a one-hot encoded array, execute the following:

    import tensorflow as tf
    labels_encoded = tf.keras.utils.to_categorical(labels)
    print(labels_encoded)
    [[1. 0. 0. 0.]
     [1. 0. 0. 0.]
     [0. 1. 0. 0.]
     [0. 0. 1. 0.]
     [0. 1. 0. 0.]
     [0. 0. 0. 1.]
     [0. 0. 1. 0.]
     [0. 1. 0. 0.]]

    This encoded array can be used for training a multiclass classification model.
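
    To see what the encoding amounts to, here is a NumPy-only sketch of the same idea. It is not how Keras implements to_categorical; it assumes the labels are integers running from 0 to the number of classes minus one, and the names num_classes and labels_decoded are only illustrative:

```python
import numpy as np

labels = np.array([0, 0, 1, 2, 1, 3, 2, 1])

# assumption: labels are integers 0..num_classes-1
num_classes = labels.max() + 1

# row i of the identity matrix is the one-hot vector for class i
labels_encoded = np.eye(num_classes)[labels]
print(labels_encoded.shape)

# argmax along the class axis recovers the original labels
labels_decoded = labels_encoded.argmax(axis=1)
print(labels_decoded)
```

    Taking the argmax along axis 1 is also a common way to turn a model’s per-class outputs back into class labels.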

  • Change elements of an array based on a condition using np.where

    Let’s say we want to convert categorical variables with multiple categories into binary variables by labelling one category as “0” and the rest as “1”.

    Or we want to change the values of an array based on a condition, as in the ReLU function, where all negative values are set to zero and the rest stay the same.

    We can do this using the np.where function.

    Let’s take an array of letters from “A” to “E”. We want the letter “C” to be labelled 0 and the rest of the letters to be labelled 1. Here is how we do it:

    import numpy as np
    # create array
    a = np.array(['A', 'B', 'C', 'D', 'E'])
    
    # convert to binary labels
    b = np.where(a == 'C', 0, 1)
    
    print(f'a = {a}')
    print(f'b = {b}')
    a = ['A' 'B' 'C' 'D' 'E']
    b = [1 1 0 1 1]

    Now, let’s say we have a three-dimensional array of numbers of shape (20, 4, 4) and we want to emulate the ReLU function, where all negative numbers in the array are set to 0 and the rest remain the same.

    We will generate an array of random numbers from standard normal distribution for our example and apply the np.where function to do the transformation.

    a = np.random.normal(size=(20, 4, 4))
    b = np.where(a < 0, 0, a)
    
    # print part of the arrays for understanding
    print(a[0])
    print(b[0])
    [[ 1.45872533 -0.24965688 -1.11663205 -0.65852554]
     [-1.13076242 -0.49868332 -0.46350182 -0.02889719]
     [-0.99350298  0.88240974  0.87975654 -0.28836425]
     [-0.10684949 -0.88570172  1.70835701 -0.16105656]]
     
     [[1.45872533 0.         0.         0.        ]
     [0.         0.         0.         0.        ]
     [0.         0.88240974 0.87975654 0.        ]
     [0.         0.         1.70835701 0.        ]]

    It is clear that the negative numbers in the array have been converted to 0.
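
    As a quick cross-check, the same transformation can be written with np.maximum, which takes the element-wise maximum of the array and 0. The sketch below uses a seeded generator only so that the run is repeatable; the names relu_where and relu_max are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded only to make the run repeatable
a = rng.normal(size=(20, 4, 4))

relu_where = np.where(a < 0, 0, a)   # the approach used above
relu_max = np.maximum(a, 0)          # element-wise maximum with 0

print(np.array_equal(relu_where, relu_max))
print((relu_where >= 0).all())
```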

    We have used the equal to (==) and less than (<) conditions in the examples above, but we may use any comparison operator as needed. For example, we can take the log2 value of those numbers in an array which are greater than or equal to a certain number.

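    a = np.random.normal(size=(20, 4, 4)) * 10
    a = a * 10
    # np.log2(a) is evaluated for every element before np.where selects,
    # so silence the warnings raised for the negative values
    with np.errstate(invalid='ignore'):
        b = np.where(a >= 5, np.log2(a), a)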
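
    Note that np.where receives np.log2(a) already computed for the whole array, including the negative entries, whose results are NaN and are then discarded. One possible alternative, sketched here with an illustrative mask variable, is to pass the condition to np.log2 itself through its where and out arguments, so the logarithm is only computed where the condition holds:

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded only to make the run repeatable
a = rng.normal(size=(20, 4, 4)) * 10

mask = a >= 5

# start from a copy of a, then overwrite only the masked positions with log2
b = np.log2(a, out=a.copy(), where=mask)

print(np.allclose(b[mask], np.log2(a[mask])))
print(np.array_equal(b[~mask], a[~mask]))
```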

  • How to save python objects using joblib

    Often you will want to save Python objects for later use. For example, a dataset you constructed that could be used for several projects, a transformer object with specific parameters that you want to apply to different data, or even a machine learning model you trained.

    This is how to do it. First, we will create some dummy data using numpy.

    import numpy as np
    a = np.random.chisquare(45, size=(10000, 50))

    Let’s say you want to transform it using the quantile transformer and save the transformed data for later use. Following is how you would transform it.

    from sklearn.preprocessing import QuantileTransformer
    qt = QuantileTransformer(n_quantiles=1000, random_state=10)
    a_qt = qt.fit_transform(a)

    Save python objects

    To save a_qt, use the joblib.dump function. The out directory must already exist.

    import joblib
    joblib.dump(a_qt, 'out/a_qt.pckl')

    You can even save the transformer object qt.

    joblib.dump(qt, 'out/qt.pckl')

    Load python objects

    To load the objects, we use the joblib.load function.

    # load array
    b_qt = joblib.load('out/a_qt.pckl')
    
    # load transformer
    qt2 = joblib.load('out/qt.pckl')

    You can check that the saved and loaded objects match by printing the shapes of the arrays and the classes of the objects.

    print('Shape of a_qt: ', a_qt.shape)
    print('Shape of b_qt: ', b_qt.shape)
    
    print('Class of qt', type(qt))
    print('Class of qt2', type(qt2))
    Shape of a_qt:  (10000, 50)
    Shape of b_qt:  (10000, 50)
    Class of qt <class 'sklearn.preprocessing._data.QuantileTransformer'>
    Class of qt2 <class 'sklearn.preprocessing._data.QuantileTransformer'>
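
    Matching shapes and classes do not by themselves show that the contents survived the round trip. A stricter check, sketched below, compares the array values directly. It uses a smaller array and a hypothetical out folder inside a temporary directory, created explicitly because joblib.dump does not create missing folders:

```python
import os
import tempfile

import joblib
import numpy as np

a = np.random.chisquare(45, size=(100, 5))

# hypothetical output folder inside a temporary directory
out_dir = os.path.join(tempfile.mkdtemp(), 'out')
os.makedirs(out_dir, exist_ok=True)   # joblib.dump does not create folders

path = os.path.join(out_dir, 'a.pckl')
joblib.dump(a, path)
a_loaded = joblib.load(path)

# compare the contents, not just the shapes
print(np.array_equal(a, a_loaded))
```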
