You will often want to save Python objects for later use: for example, a dataset you built that serves several projects, a transformer object with specific parameters that you want to apply to different data, or a machine learning model you trained.
Here is how to do it. First, we create some dummy data using numpy.
import numpy as np
a = np.random.chisquare(45, size=(10000, 50))
Let’s say you want to transform it using the quantile transformer and save the transformed data for later use. You would transform it as follows.
from sklearn.preprocessing import QuantileTransformer
qt = QuantileTransformer(n_quantiles=1000, random_state=10)
a_qt = qt.fit_transform(a)
Save python objects
To save a_qt using joblib, use the dump function.
import os
import joblib
os.makedirs('out', exist_ok=True)  # joblib.dump does not create missing directories
joblib.dump(a_qt, 'out/a_qt.pckl')
You can even save the transformer object qt.
joblib.dump(qt, 'out/qt.pckl')
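As a small aside on what dump hands back: it returns a list of the file paths it wrote, which can be handy for logging or cleanup. The sketch below is only an illustration under a few assumptions: it writes to a temporary directory rather than out/, and the array `data` and the file name `data.pckl` are hypothetical stand-ins for a_qt and its file.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np

# Hypothetical stand-in for a_qt; any picklable Python object works.
data = np.arange(12, dtype=float).reshape(3, 4)

with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp) / "out"
    # The target directory has to exist before dump is called.
    out_dir.mkdir(parents=True, exist_ok=True)

    # dump returns a list of the file names it wrote.
    written = joblib.dump(data, str(out_dir / "data.pckl"))
    saved_ok = Path(written[0]).exists()
    print(written)
```

The temporary directory is removed when the with block ends, so this sketch leaves nothing behind on disk.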
Load python objects
To load the objects, we use the joblib function load.
# load array
b_qt = joblib.load('out/a_qt.pckl')
# load transformer
qt2 = joblib.load('out/qt.pckl')
You can check that the saved and loaded objects match by printing the shapes of the arrays and the classes of the objects.
print('Shape of a_qt: ', a_qt.shape)
print('Shape of b_qt: ', b_qt.shape)
print('Class of qt', type(qt))
print('Class of qt2', type(qt2))
Shape of a_qt: (10000, 50)
Shape of b_qt: (10000, 50)
Class of qt <class 'sklearn.preprocessing._data.QuantileTransformer'>
Class of qt2 <class 'sklearn.preprocessing._data.QuantileTransformer'>
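Matching shapes and classes are a quick sanity check, but they do not show that the contents survived the round trip. A stricter check, sketched here as one possible approach, compares the array values directly and confirms that the reloaded transformer produces the same output as the original. The smaller array size, the reduced n_quantiles and the temporary directory are assumptions made to keep the example fast and self-contained.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.preprocessing import QuantileTransformer

# Smaller than the array used above, purely to keep the example quick.
rng = np.random.default_rng(0)
a = rng.chisquare(45, size=(1000, 5))

qt = QuantileTransformer(n_quantiles=100, random_state=10)
a_qt = qt.fit_transform(a)

with tempfile.TemporaryDirectory() as tmp:
    arr_path = str(Path(tmp) / "a_qt.pckl")
    qt_path = str(Path(tmp) / "qt.pckl")

    # Save both objects, then load them back.
    joblib.dump(a_qt, arr_path)
    joblib.dump(qt, qt_path)
    b_qt = joblib.load(arr_path)
    qt2 = joblib.load(qt_path)

# The loaded array should hold exactly the same values.
arrays_match = np.array_equal(a_qt, b_qt)
# The loaded transformer should behave like the original one.
transforms_match = np.allclose(qt.transform(a), qt2.transform(a))
print(arrays_match, transforms_match)
```

If either flag comes back False, the saved file and the in-memory object have diverged, for instance because the object was modified after it was saved.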