Tag: python

Join two or more arrays using `numpy.concatenate`
np.concatenate joins NumPy arrays along an existing axis.
This post covers its basic usage.
It takes a sequence of two or more arrays as its first argument, along with a few optional keyword arguments, one of which (axis) we will cover here.

The simplest usage of np.concatenate is as follows:
```
import numpy as np

# make two arrays
a = np.array([1,2,3])
b = np.array([4,5,6])

# concatenate
c = np.concatenate([a,b], axis=0)

# print all the arrays and their shapes
print('\na:\n', a)
print('\nb:\n', b)
print('\nc:\n', c)

print('\nshape of a = ', a.shape)
print('\nshape of b = ', b.shape)
print('\nshape of c = ', c.shape)
```
```
a:
 [1 2 3]

b:
 [4 5 6]

c:
 [1 2 3 4 5 6]

shape of a =  (3,)

shape of b =  (3,)

shape of c =  (6,)
```
Both inputs in the code above are one-dimensional, so they have just one axis. np.concatenate has a keyword argument, axis, whose default value is 0. In the code above, we pass it explicitly for clarity.
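As a quick check of the point about the default axis, the sketch below compares the call with and without axis=0, and also shows what happens when axis=1 is requested for one-dimensional inputs. The variable names default_axis, explicit_axis, same_result and axis_one_failed are only illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Leaving out axis gives the same result as axis=0
default_axis = np.concatenate([a, b])
explicit_axis = np.concatenate([a, b], axis=0)
same_result = np.array_equal(default_axis, explicit_axis)
print('same result:', same_result)

# A one-dimensional array has no axis 1, so this call raises an error
try:
    np.concatenate([a, b], axis=1)
    axis_one_failed = False
except ValueError as err:  # NumPy's AxisError is a subclass of ValueError
    axis_one_failed = True
    print('error:', err)
```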

Let’s generate similar arrays with 2 dimensions. To do this, we wrap the list of numbers in another list before passing it to np.array.
```
a = np.array([[1,2,3]])
b = np.array([[4,5,6]])

print('\n\nshape of a = ', a.shape)
```
```
shape of a =  (1, 3)
```
We can now see that the shape of the arrays a and b is (1, 3),
meaning they have 1 row and 3 columns.

The rows correspond to the 0 (zeroth) axis and the columns correspond
to the 1 (first) axis.

Now suppose we want to stack the two arrays on top of each other. The resulting array should have two rows and three columns.

We use np.concatenate as follows:
```
c = np.concatenate([a,b], axis=0)

print('\n\na:\n', a)
print('\n\nb:\n', b)
print('\n\nc:\n', c)

print('\n\nshape of c = ', c.shape)
```
```
a:
 [[1 2 3]]

b:
 [[4 5 6]]

c:
 [[1 2 3]
 [4 5 6]]

shape of c =  (2, 3)
```
To stack the two arrays sideways and get an array of shape (1, 6), set
the keyword argument axis to 1.
```
c = np.concatenate([a,b], axis=1)

print('\n\na:\n', a)
print('\n\nb:\n', b)
print('\n\nc:\n', c)

print('\n\nshape of c = ', c.shape)
```
```
a:
 [[1 2 3]]

b:
 [[4 5 6]]

c:
 [[1 2 3 4 5 6]]

shape of c =  (1, 6)
```
When axis=0, all of the arrays must have the same number of columns.
Likewise, when axis=1, all of the arrays must have the same number of rows.
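To illustrate the shape rule, here is a minimal sketch with a hypothetical third array d of shape (1, 2), which is not part of the examples above. It has the same number of rows as a but a different number of columns, so it can be joined along axis 1 but not along axis 0.

```python
import numpy as np

a = np.array([[1, 2, 3]])  # shape (1, 3)
d = np.array([[7, 8]])     # shape (1, 2), hypothetical array for this sketch

# axis=1 works: both arrays have one row
wide = np.concatenate([a, d], axis=1)
print(wide.shape)

# axis=0 fails: a has 3 columns but d has 2
try:
    np.concatenate([a, d], axis=0)
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print('error:', err)
```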

You may concatenate any number of arrays in one statement, provided their shapes are compatible.
```
c = np.concatenate([a,b,a,a,b,a,b], axis=0)

print('\n\na:\n', a)
print('\n\nb:\n', b)
print('\n\nc:\n', c)

print('\n\nshape of c = ', c.shape)
```
```
a:
 [[1 2 3]]

b:
 [[4 5 6]]

c:
 [[1 2 3]
 [4 5 6]
 [1 2 3]
 [1 2 3]
 [4 5 6]
 [1 2 3]
 [4 5 6]]

shape of c =  (7, 3)
```
Convert class labels to categories using keras
Class labels can be converted to a one-hot encoded array using keras.utils.to_categorical.
The resulting array has as many rows as there are samples and as many columns as there are classes.

Let’s take an example of an array containing labels.
First, we import numpy and define the labels array.
```
import numpy as np
labels = np.array([0,0,1,2,1,3,2,1])
```
The labels contain four categories.
```
np.unique(labels)
```
```
array([0, 1, 2, 3])
```
To convert the labels to a one-hot encoded array, execute the following:
```
import tensorflow as tf
labels_encoded = tf.keras.utils.to_categorical(labels)
print(labels_encoded)
```
```
[[1. 0. 0. 0.]
 [1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 1. 0. 0.]
 [0. 0. 0. 1.]
 [0. 0. 1. 0.]
 [0. 1. 0. 0.]]
```
This encoded array can be used for training a multiclass classification model.
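To see what the encoding does, the same layout can be sketched in plain NumPy by indexing an identity matrix with the labels. This is not how Keras implements to_categorical, only an equivalent illustration for integer labels starting at 0; the names num_classes, one_hot and decoded are assumptions made for this sketch.

```python
import numpy as np

labels = np.array([0, 0, 1, 2, 1, 3, 2, 1])

# Number of classes, assuming labels run from 0 to max(labels)
num_classes = int(labels.max()) + 1

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes, dtype='float32')[labels]
print(one_hot.shape)

# argmax along the columns recovers the original integer labels
decoded = one_hot.argmax(axis=1)
print(decoded)
```

Going back from the encoded array to integer labels with argmax is handy when reading the predictions of a multiclass model.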
Change elements of an array based on a condition using np.where
Let’s say we want to convert a variable with several categories into a binary variable by labelling one category as “0” and the rest as “1”.

Or we want to change the values of an array based on a condition, as in the ReLU function, where all negative values are set to zero and the rest stay the same.

We can do this using the np.where function.

Let’s take an array of letters from “A” to “E”. We want the letter “C” to be labelled 0 and the rest of the letters to be labelled 1. Here is how we do it:
```
import numpy as np
# create array
a = np.array(['A', 'B', 'C', 'D', 'E'])

# convert to binary labels
b = np.where(a == 'C', 0, 1)

print(f'a = {a}')
print(f'b = {b}')
```
```
a = ['A' 'B' 'C' 'D' 'E']
b = [1 1 0 1 1]
```
Now, let’s say we have a three-dimensional array of numbers of shape (20, 4, 4) and we want to emulate the ReLU function, where all negative numbers in the array are set to 0 and the rest remain the same.

We will generate an array of random numbers from the standard normal distribution for our example and apply np.where to do the transformation.
```
a = np.random.normal(size=(20, 4, 4))
b = np.where(a < 0, 0, a)

# print part of the arrays for understanding
print(a[0])
print(b[0])
```
```
[[ 1.45872533 -0.24965688 -1.11663205 -0.65852554]
 [-1.13076242 -0.49868332 -0.46350182 -0.02889719]
 [-0.99350298  0.88240974  0.87975654 -0.28836425]
 [-0.10684949 -0.88570172  1.70835701 -0.16105656]]
 
 [[1.45872533 0.         0.         0.        ]
 [0.         0.         0.         0.        ]
 [0.         0.88240974 0.87975654 0.        ]
 [0.         0.         1.70835701 0.        ]]
```
It is clear that the negative numbers in the array have been converted to 0.
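As a cross-check, the np.where form of ReLU can be compared with np.maximum, which takes the element-wise maximum of the array and 0. The sketch below uses a seeded random generator so that it is reproducible; the seed and the names relu_where, relu_max and same are arbitrary choices for this illustration.

```python
import numpy as np

# Seeded generator so the example is reproducible (seed chosen arbitrarily)
rng = np.random.default_rng(0)
a = rng.normal(size=(20, 4, 4))

# ReLU written with np.where, as in the text above
relu_where = np.where(a < 0, 0, a)

# The same transformation written as an element-wise maximum with 0
relu_max = np.maximum(a, 0)

same = np.array_equal(relu_where, relu_max)
print('identical results:', same)
```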

The examples above use the == and < comparisons, but any comparison operator can be used as needed. For example, we can take the log2 of those numbers in an array that are greater than or equal to a certain value.
```
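a = np.random.normal(size=(20, 4, 4)) * 10
a = a * 10
# np.where evaluates np.log2 on every element first, so suppress the
# warnings raised for the non-positive values that are discarded anyway
with np.errstate(invalid='ignore', divide='ignore'):
    b = np.where(a >= 5, np.log2(a), a)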
```
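Because the random array makes the result hard to check by eye, here is a small sketch of the same idea with hand-picked values (chosen only for this illustration). The values 8 and 16 meet the condition and are replaced by their log2, which is 3 and 4, while the other values are left unchanged.

```python
import numpy as np

# Hand-picked positive values so the result is easy to verify
a = np.array([1.0, 4.0, 8.0, 16.0, 3.0])

# Take log2 only where the value is at least 5, keep the rest as they are
b = np.where(a >= 5, np.log2(a), a)
print(b)
```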
How to save python objects using joblib
Often you will want to save Python objects for later use. For example, a dataset you constructed that could be used in several projects, a transformer object with specific parameters that you want to apply to different data, or even a machine learning model you trained.

This is how to do it. First, we will create some dummy data using numpy.
```
import numpy as np
a = np.random.chisquare(45, size=(10000, 50))
```
Let’s say you want to transform it using the quantile transformer and save the transformed data for later use. Here is how you would transform it.
```
from sklearn.preprocessing import QuantileTransformer
qt = QuantileTransformer(n_quantiles=1000, random_state=10)
a_qt = qt.fit_transform(a)
```
Save python objects

To save a_qt using joblib, use the dump function.
```
import os
import joblib
os.makedirs('out', exist_ok=True)  # joblib.dump does not create the output directory
joblib.dump(a_qt, 'out/a_qt.pckl')
```
You can even save the transformer object qt.
```
joblib.dump(qt, 'out/qt.pckl')
```
Load python objects

To load the objects, we use the joblib function load.
```
# load array
b_qt = joblib.load('out/a_qt.pckl')

# load transformer
qt2 = joblib.load('out/qt.pckl')
```
You can check that the saved and loaded objects match by printing the shapes of the arrays and the classes of the objects.
```
print('Shape of a_qt: ', a_qt.shape)
print('Shape of b_qt: ', b_qt.shape)

print('Class of qt', type(qt))
print('Class of qt2', type(qt2))
```
```
Shape of a_qt:  (10000, 50)
Shape of b_qt:  (10000, 50)
Class of qt <class 'sklearn.preprocessing._data.QuantileTransformer'>
Class of qt2 <class 'sklearn.preprocessing._data.QuantileTransformer'>
```
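Matching shapes and classes are a good sign, but they do not show that the contents are identical. A stricter check, sketched below under the assumption that joblib is installed, saves a small array to a temporary directory, loads it back and compares the two element by element. The file name data.pckl and the smaller array size are arbitrary choices for this sketch.

```python
import os
import tempfile

import joblib
import numpy as np

# Small reproducible dummy array (seed and size chosen arbitrarily)
rng = np.random.default_rng(0)
data = rng.chisquare(45, size=(100, 5))

# Save to a temporary directory and load the array back
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'data.pckl')
    joblib.dump(data, path)
    restored = joblib.load(path)

# Element-by-element comparison of the original and the loaded array
arrays_match = np.array_equal(data, restored)
print('arrays match:', arrays_match)
```

A similar comparison could be made for a loaded transformer by applying both the original and the loaded object to the same data and comparing the outputs.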
