How to deploy your Machine Learning model on the web?

Aug 12, 2021

When we first start learning Data Science and Machine Learning on toy datasets, we may assume that getting good predictions is the end of the job, but there's a whole other sea to cross after that. The model has to be put on the web so that customers or end users get a chance to lay hands on your product. In this post, let's understand what deployment is and how it works.

Prerequisites (apart from ML concepts)

Fundamentals of web applications
Fundamentals of HTML

Topics covered

Creating a model
Creating a Flask Web App
Creating HTML forms
Deploying it live on the web using Heroku

Deploying a model is basically an exchange of data: we accept input data in some form (JSON, XML, etc.), process it through our model, and give back the result, which is also data. The following are the steps to create, build, and deploy a model.
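
As a minimal sketch of that exchange, assume the input arrives as a JSON string and let a made-up function called fake_model stand in for a trained model (both the function and its threshold are hypothetical, only there to make the round trip visible):

```python
import json

def fake_model(features):
    # Hypothetical stand-in for a trained model: any callable that maps
    # a list of numbers to a label would fit here.
    return "setosa" if features[2] < 2.5 else "not setosa"

def handle_request(request_body):
    # 1. Accept input data (a JSON string in this sketch).
    payload = json.loads(request_body)
    features = [
        payload["sepal_length"],
        payload["sepal_width"],
        payload["petal_length"],
        payload["petal_width"],
    ]
    # 2. Process it through the model.
    label = fake_model(features)
    # 3. Give the result back as data.
    return json.dumps({"prediction": label})

response = handle_request(
    '{"sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2}'
)
print(response)  # {"prediction": "setosa"}
```

The rest of the post follows the same three stages, with a real model in place of the stand-in and an HTML form in place of the raw JSON string.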

STEP 1: Create a basic model

To build a simple model, I'm using the (very common) Iris dataset. It is a three-class classification dataset with 4 features and 1 target variable. Remember, this post focuses on model deployment, not on the model itself. Here's the Kaggle link to the data set

#Import basic libraries
import numpy as np
import pandas as pd

#Load the data
df = pd.read_csv('Iris.csv')

#Separate the features from labels
X = df.drop(['Id','Species'],axis=1)
y = df['Species']

#Since the model cannot take in text data, we encode the y variable
from sklearn.preprocessing import LabelBinarizer
encoder = LabelBinarizer()
y = encoder.fit_transform(y)

#Split the dataset
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.25,random_state=42)

#Scale the dataset (Remember to only fit it on the train set so that we don't acquire any prior knowledge about the test set)
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
sc_X_train = scaler.fit_transform(X_train)
sc_X_test = scaler.transform(X_test)

#Create the model (Here a deep learning model is created. Any ML model will work just fine)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(4,activation='relu', input_shape=[4,]))
model.add(Dense(3,activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy',metrics=['accuracy'])

#Train and fit the model
from tensorflow.keras.callbacks import EarlyStopping
es = EarlyStopping(monitor='val_loss', patience=5)
model.fit(sc_X_train, y_train, epochs=400,validation_data=(sc_X_test,y_test), callbacks =[es])
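
To see what the encoding step hands to the network, here is a small sketch on four hand-written labels. The label strings are an assumption about how the species are spelled in the CSV; the point is only the shape of the output and the order of the classes.

```python
from sklearn.preprocessing import LabelBinarizer

# Hand-written labels for illustration (assumed spelling, not read from Iris.csv).
labels = ["Iris-virginica", "Iris-setosa", "Iris-versicolor", "Iris-setosa"]

encoder = LabelBinarizer()
one_hot = encoder.fit_transform(labels)

print(encoder.classes_)  # classes are stored in sorted order
print(one_hot)           # one row per label, one column per class
```

Each label becomes a row with a single 1, which is why the output layer has 3 units and the loss is categorical cross-entropy. The sorted class order (setosa, versicolor, virginica) is also the order in which class names have to be listed when predictions are turned back into names.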

STEP 2: Assess the model

Check whether the model and its metrics are satisfactory. If they aren't, revisit the previous steps. If they are, finalize the model for deployment. In production, there is no train/test split; we used it only to determine the best working model. When the model is ready for production, it is best to retrain on ALL THE DATA.

metrics = pd.DataFrame(model.history.history) #Per-epoch metrics of the Step 1 run (assumed definition, it is not shown in the original post)
epochs = len(metrics) #Number of epochs the Step 1 run lasted before early stopping
sc_X = scaler.fit_transform(X) #Scaling the entire data

model = Sequential()
model.add(Dense(4,activation='relu', input_shape=[4,]))
model.add(Dense(3,activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy',metrics=['accuracy'])
model.fit(sc_X, y, epochs=epochs) #Now the model is fit on the entire dataset, so no validation set or early stopping is used
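
The epoch count for the final fit is carried over from the earlier training run. As a sketch with made-up loss values (not results from this model), the helper below, whose name is hypothetical, shows two numbers that can be read off a list of per-epoch validation losses: how many epochs ran before early stopping, and which epoch had the lowest loss.

```python
def epochs_from_history(val_losses):
    """Return (epochs_run, best_epoch) for a list of per-epoch validation losses."""
    epochs_run = len(val_losses)
    best_epoch = val_losses.index(min(val_losses)) + 1  # 1-based epoch number
    return epochs_run, best_epoch

# Made-up losses: the best value is at epoch 4, then 5 epochs without
# improvement, which is where early stopping with patience=5 would stop.
val_losses = [0.90, 0.61, 0.45, 0.40, 0.41, 0.43, 0.42, 0.44, 0.46]
print(epochs_from_history(val_losses))  # (9, 4)
```

Using the number of epochs that ran, as the post does, trains slightly past the best epoch; using the best epoch instead is a reasonable alternative.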

STEP 3: Save the model and the scaler

Apart from saving the model, it is necessary to save the scaler as well. In the production environment we no longer have the training data, so the fitted scaler is the only way to scale new incoming data the same way the training data was scaled.

model.save('FinalModel.h5')

import joblib
joblib.dump(scaler, 'iris_scaler.pkl') #Saving the scaler object
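
To make the reason concrete, here is a simplified sketch of the same idea using only the standard library. It is not how MinMaxScaler or joblib work internally; the function names and the JSON file are made up for illustration, and only one feature is scaled.

```python
import json
import os
import tempfile

def fit_min_max(values):
    # Learn the scaling parameters from the training data only.
    return {"min": min(values), "max": max(values)}

def scale(params, values):
    # Apply the stored parameters to any data, old or new.
    span = params["max"] - params["min"]
    return [(v - params["min"]) / span for v in values]

# Training time: fit on the training values and save the parameters.
params = fit_min_max([4.0, 5.0, 6.0, 8.0])
path = os.path.join(tempfile.mkdtemp(), "scaler_params.json")
with open(path, "w") as f:
    json.dump(params, f)

# Production time: the training data is gone, only the saved file is left.
with open(path) as f:
    loaded = json.load(f)

print(scale(loaded, [5.0, 7.0]))  # [0.25, 0.75]
```

Without the saved minimum and maximum there would be nothing to scale a single new flower against, which is the role iris_scaler.pkl plays in the deployed app.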

STEP 4: Define a function to predict

This is the function that uses the model, the scaler and the inputs to predict the final output. It is best practice to encapsulate the prediction logic in a function.

#Creating a function to predict the output. This takes in the original model, the scaler and the json data as input parameters
def return_prediction(model,scaler,sample_json):
    s_len = sample_json["sepal_length"]
    s_wid = sample_json["sepal_width"]
    p_len = sample_json["petal_length"]
    p_wid = sample_json["petal_width"]
    new_flower = [[s_len,s_wid,p_len,p_wid]]
    new_flower = scaler.transform(new_flower) #Keep the 2D shape (1 row, 4 features) that the model expects
    #The model returns one probability per class, so we take the index of the largest one (0, 1 or 2) and look up the actual class name. (model.predict_classes was removed in TensorFlow 2.6.)
    class_index = np.argmax(model.predict(new_flower), axis=1)
    classes = np.array(['setosa','versicolor','virginica'])
    return classes[class_index][0]
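
The last lines of the function turn the model's numeric output into a class name. A small sketch with made-up softmax outputs (not real predictions) shows that lookup on its own:

```python
import numpy as np

classes = np.array(["setosa", "versicolor", "virginica"])

# Made-up softmax outputs for two flowers (each row sums to 1).
probabilities = np.array([
    [0.05, 0.15, 0.80],
    [0.90, 0.07, 0.03],
])

class_index = np.argmax(probabilities, axis=1)  # position of the largest value per row
print(class_index)           # [2 0]
print(classes[class_index])  # ['virginica' 'setosa']
```

The lookup only gives the right names if the order of the classes array matches the column order produced by the label encoder during training.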

STEP 5: Deploy using a Flask Web App and connect it to a front-end HTML form

Flask is a Python-based web application framework. It uses Python to handle the back end of a web app (our model), and we can connect it to front-end components (HTML, JS, etc.). This step combines the Flask app and the model with a front-end HTML form, so that non-technical users can simply go to the website and fill out the form to see the results. Below is the plan of action:

Create a Flask Web App
Create necessary HTML files
Use Flask to create an HTML form and inject it into home.html (accepts user input and sends it to the Flask app)
Use Flask to accept the submitted form data
Use Flask to return the prediction on prediction.html (returns the prediction once the prediction function has finished running)

NOTE:

To run a Flask application, we need to write a Python script (.py). I'm not running the script directly in the Jupyter notebook because the notebook itself runs in a web browser and may conflict with Flask. I recommend writing the script separately in a text editor.
All the HTML files should be located in a folder called 'templates' (all lowercase). Flask automatically looks for template files only in that folder.

Here’s the flask web app script

#modelflask.py
from flask import Flask, render_template, session, redirect, url_for
import numpy as np
from flask_wtf import FlaskForm
from wtforms import StringField, SubmitField #TextField was removed in WTForms 3.0; StringField replaces it
from tensorflow.keras.models import load_model
import joblib

#Define the prediction function (copy paste from step 4)
def return_prediction(model,scaler,sample_json):
    s_len = sample_json["sepal_length"]
    s_wid = sample_json["sepal_width"]
    p_len = sample_json["petal_length"]
    p_wid = sample_json["petal_width"]
    new_flower = [[s_len,s_wid,p_len,p_wid]]
    new_flower = scaler.transform(new_flower) #Keep the 2D shape (1 row, 4 features)
    class_index = np.argmax(model.predict(new_flower), axis=1) #Index of the most probable class
    classes = np.array(['setosa','versicolor','virginica'])
    return classes[class_index][0]

#Initialize the flask app
app = Flask(__name__)
app.config['SECRET_KEY'] = 'mysecretkey'

#Creating a form class and inheriting the FlaskForm which is an inbuilt class
class MyForm(FlaskForm):
    sep_len = StringField("Sepal Length")
    sep_wid = StringField("Sepal Width")
    pet_len = StringField("Petal Length")
    pet_wid = StringField("Petal Width")

    submit = SubmitField("Predict Result")


#Create a basic route for a home page
@app.route("/", methods=['GET','POST'])
def index():
    form = MyForm()
    if form.validate_on_submit():
        session['sep_len'] = form.sep_len.data
        session['sep_wid'] = form.sep_wid.data
        session['pet_len'] = form.pet_len.data
        session['pet_wid'] = form.pet_wid.data

        return redirect(url_for('prediction')) #Upon validation, redirect it to the prediction function ('/prediction')
    return render_template('home.html',form=form) #Returning an html form

#Load the model and the scaler (named flower_model so that it does not overwrite the imported load_model function)
flower_model = load_model('FinalModel.h5')
flower_scaler = joblib.load('iris_scaler.pkl')

#Create a routing view for prediction
@app.route("/prediction")
def prediction():
    contents = {}
    contents['sepal_length'] = float(session['sep_len'])
    contents['sepal_width'] = float(session['sep_wid'])
    contents['petal_length'] = float(session['pet_len'])
    contents['petal_width'] = float(session['pet_wid'])

    results = return_prediction(flower_model, flower_scaler, contents) #Store the results from the return_prediction function

    return render_template('prediction.html',results=results) #Rendering the results page


if __name__ == '__main__':
    app.run()
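
The form fields are plain text, so the float(...) calls in the prediction view raise a ValueError if someone types something that is not a number. One possible guard is sketched below; parse_measurement is a hypothetical helper that is not part of the script above, and the rule that measurements must be positive is an assumption.

```python
import math

def parse_measurement(raw, field_name):
    """Convert a form string to a positive, finite float or raise ValueError."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        raise ValueError(f"{field_name} must be a number, got {raw!r}") from None
    if not math.isfinite(value) or value <= 0:
        raise ValueError(f"{field_name} must be a positive, finite number, got {raw!r}")
    return value

print(parse_measurement("5.1", "Sepal Length"))  # 5.1
try:
    parse_measurement("abc", "Sepal Width")
except ValueError as err:
    print(err)  # Sepal Width must be a number, got 'abc'
```

In the view, each float(session[...]) call could be swapped for this helper, with the error message shown back on the form instead of letting the request fail.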

Here’s the HTML code for both the files

<!--home.html-->
<h2>Enter the flower measurements below</h2>
<form method='POST'>
    {# This hidden_tag is a CSRF security feature. #}
    {{ form.hidden_tag() }}
    {{form.sep_len.label }} {{form.sep_len}}
    <br>
    {{form.sep_wid.label }} {{form.sep_wid}}
    <br>
    {{form.pet_len.label }} {{form.pet_len}}
    <br>
    {{form.pet_wid.label }} {{form.pet_wid}}
    <br>
    {{form.submit()}}
</form>

<!--prediction.html-->
<h1>Based on the information given, here's the result</h1>
<ul>
    <li>Sepal Length: {{session['sep_len']}}</li>
    <li>Sepal Width: {{session['sep_wid']}}</li>
    <li>Petal Length: {{session['pet_len']}}</li>
    <li>Petal Width: {{session['pet_wid']}}</li>
</ul>
<h2>The predicted flower class is: {{results}}</h2>

STEP 6: Deployment to the web

Now we have come to the last step. We have tested the app and confirmed that things work fine locally, so it's time to take it to the web. There are lots of different services to deploy and host web apps. Cloud providers like MS Azure, AWS and GCP offer free services up to a limit, but this time we're using Heroku, which at the time of writing has a fully free tier (heavier apps may not fit within it). Here's the plan of action:

Create a new folder on your desktop and copy the templates folder, Flask script, saved model and scaler files into it. Remember to rename the Flask script to app.py
Go to signup.heroku.com and sign up
Install the Heroku CLI from devcenter.heroku.com/categories/command-line. The Heroku CLI requires Git to be installed on the system, so make sure you have it.
Open the Anaconda prompt and navigate to the newly created folder. Create an environment specifically for this deployment

conda create --name envname python

Activate the environment

conda activate envname

Install necessary libraries (flask, Flask-WTF, tensorflow, scikit-learn, gunicorn)
Write these libraries to a requirements file so that the server can install them

pip freeze > requirements.txt

Create a process file in any text editor and save it as Procfile (no file extension). Content of the file:

web: gunicorn app:app

Go back to Heroku and click on Create New App
Give it a unique app name
Choose Heroku Git as the deployment method
Follow the command line instructions provided, as shown below

  $ heroku login (browser opens. Login using Heroku creds)
  $ git init
  $ heroku git:remote -a nameoftheapp
  $ git add .
  $ git commit -am "Enter commit message"
  $ git push heroku master
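
Since the app needs every file mentioned in the steps above, a small check like the following sketch can be run from the project folder before pushing. The helper name missing_files is made up for this illustration; the file names are the ones used in this post.

```python
from pathlib import Path

# Files this post's app expects to find in the deployment folder.
EXPECTED_FILES = [
    "app.py",
    "Procfile",
    "requirements.txt",
    "FinalModel.h5",
    "iris_scaler.pkl",
    "templates/home.html",
    "templates/prediction.html",
]

def missing_files(folder):
    """Return the expected files that are not present under folder."""
    root = Path(folder)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]

if __name__ == "__main__":
    missing = missing_files(".")
    print("Missing:", missing if missing else "nothing, ready to push")
```

An empty list means the folder has everything the steps above call for; anything listed should be added and committed before the push.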

Your app now has a dedicated URL that can be opened in the browser and voila, your very own web app, powered by your Machine Learning model, is running on the web!

CLOSING NOTES:

Consider this post a template for building and deploying your own model and web app. Going beyond predictions and deploying the model completes the full circle. Though this method requires fundamental knowledge of HTML and web apps, it is a great weapon to have in the arsenal. I have a GitHub link that shows a full version of the same, which I recommend you have a look at. Thanks!

Follow me on LinkedIn at https://www.linkedin.com/in/bharathwajmurali/