Reading Text from Invoice Images with Python


In this article, you will see how to read text from invoice images using the Python programming language. Invoices contain a variety of information, such as product names, product prices, vendor or customer names, VAT and other tax information, and the date of the transaction. The process of reading text from images is called Optical Character Recognition (OCR). Both traditional computer vision and deep learning approaches exist for OCR, and each has its own pros and cons. The article briefly reviews both approaches and discusses some of the challenges associated with invoice image recognition. Finally, you will see how to read text from a very simple invoice image using the Tesseract library.


  1. It is assumed that the readers have intermediate knowledge of the Python programming language since the code samples are in Python.
  2. Basic understanding of image processing will also help but is not required.

Approaches for OCR

As discussed earlier, approaches for optical character recognition can be divided into two categories: classic computer vision based approaches and deep learning based approaches.

Classic Computer Vision Approach

The process adopted by classic computer vision based approaches for OCR can be divided into the following steps:

  1. Different types of filters are applied that normally blur the background, making the text more prominent and easier to identify.
  2. Text characters are recognized one by one in a sequence.
  3. Classification approaches are applied to classify the characters.
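
The first step above can be illustrated with a very small sketch. It assumes the image is already available as a grid of grayscale values (0 is black, 255 is white) and applies a single global threshold to separate dark text from a light background. The binarize function, the threshold of 128, and the sample patch are all made up for illustration; real pipelines usually rely on an image library and more robust filters.

```python
def binarize(gray, threshold=128):
    """Return a mask where dark pixels (likely text) become 1 and light pixels become 0."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


# A hypothetical 3x5 grayscale patch: dark strokes on a light, slightly noisy background.
patch = [
    [250, 30, 245, 20, 240],
    [235, 25, 40, 35, 251],
    [244, 15, 238, 28, 249],
]

mask = binarize(patch)

# Show the mask with '#' for text pixels and '.' for background pixels.
for row in mask:
    print("".join("#" if bit else "." for bit in row))
```

After thresholding, the background noise disappears and only the strokes remain, which is what makes the later character segmentation and classification steps easier.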

To study classic computer vision approaches further, check out these links.

Deep Learning Approaches

Deep learning based approaches use image recognition techniques along with neural networks to identify text in images. For instance, convolutional neural networks perform well at image recognition, so they can be used to recognize text in images. Recurrent neural networks (RNNs) learn from sequences of data; therefore, they can be used to predict the next character, which can help correct mistakes. RNNs can also help with part-of-speech tagging and named entity recognition, which can in turn help identify whether a piece of text is a product, a currency, a date, etc.

Challenges Associated with Reading Invoice Images

As with any other OCR task, various challenges are associated with reading text from invoice images. Some of these challenges are listed below:

  1. Varying Text Density: The text density differs from image to image. Invoices with high text density are easier to read than those with low text density.
  2. Invoice Orientation: The orientation of invoices in the images is not always the same. Tilted invoices are difficult to read.
  3. Lack of Uniformity: Invoices differ a lot with respect to their contents. For instance, some invoices contain information about a customer’s bank statement, while others contain information about a particular transaction at a supermarket. The lack of uniformity among the fields in different invoices makes it extremely difficult to develop a universal invoice reader.
  4. Fonts and Character Differences: Differences in fonts and characters also add to the difficulty of reading text from invoice images.
  5. Text Structure: Different invoices have text at different locations. For instance, in one invoice the total amount may be at the top, with the prices of individual items broken down below it, whereas another invoice may show the total amount at the bottom.

In short, invoices come in all shapes and sizes; therefore, it is extremely challenging to develop a universal invoice image reader.

In the next section, we will show a very crude approach to reading invoice images using Python.

Reading Invoice Images with Python

The task of reading text from invoice images can be broadly categorized into two steps:

  1. Reading text from images
  2. Annotating text with correct labels.

Step 1: Reading Text

The task of reading text from images is not limited to invoices. For instance, applications exist that convert hard copies of textbooks into PDF and Word formats. Several Python libraries exist for reading text from images. We will use Tesseract, one of the most commonly used OCR engines, together with its Python wrapper pytesseract.

The installation steps for the Tesseract library are as follows:

  1. Download the installation files for Tesseract from this link:
  2. Install the files for your operating system and remember the installation path.
  3. Execute the following command to install the Python wrapper for Tesseract:

$ pip install pytesseract

  4. The final step is to set the installation path in your Python script before you make any call to a Tesseract library function. For instance, on Windows, if your installation path is C:\Program Files\Tesseract-OCR\, add the following line to your Python script before calling Tesseract functions:

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

Let’s now see how we can read text from Image Invoices using Tesseract. We will try to read text from the following invoice image:

Look at the following script:

# import the necessary packages
from PIL import Image
import pytesseract

# point pytesseract at the Tesseract executable (Windows example path)
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

# load the invoice image
img = Image.open(r'E:\invoice2.jpeg')

# run OCR on the image and print the recognized text
text = pytesseract.image_to_string(img)
print(text)


In the script above, we first load the image using the Image.open() function of the PIL (Python Imaging Library) module. The img object is then passed to the image_to_string() function of the pytesseract module, which returns the text contents of the image. The text is then printed on the console:



61 11th Street, Dodgeville
Tel no. 061 333 9999

Tax invoice VAT No. 4423338888109

Milk tart R17.99
Apple crumble R29.99
Carrier bag 24L R 0.40
Carrier bag 24L R 0.40
Marshmallow 60G R 9.99
Dairy custard 17.99
Hot dog rolls 65
Lemon biscuits 99
ENT. bacon/egg LF

0,458 kg @ R49.99/kg R22.90
Sunflower oil 250ml R14.99
Popcorn 300g R 7.99
Chicken-mayo sandwich R23.99
Brown bread seed R10.99
Brown bread loaf R 6.99
Sauce Peri Peri R13.99


Balance due R193.24

EFT credit card R193.24

Zero VAT R32.97 RO.OO *
VAT R169.51 R23.73
Total tax R23.73


You can see that the text has been read successfully, since the text in the invoice was pretty clear and the invoice was not distorted. The formatting of the invoice further made it easier to read. Now that we have read the text, the next step is to see how we can annotate it.

Step 2: Annotating Text

Annotating the data from an invoice is not as straightforward as it may seem. Domain knowledge plays a very important role here. We need to know what type of data we want to annotate and how that data is presented in the invoice. For instance, in the invoice that we just read, we know that each line contains a product name on the left and the price on the right.

Again, it all depends upon what you need to extract from the invoice. Suppose we are only interested in the product names and their prices, along with the total balance. Depending upon the format of the invoice, there are several ways to parse such information. We will see a very simple approach where we iterate through the text line by line, tokenize each line into words, and then read the product names and their corresponding prices.

We know that the price occurs at the end of the line; therefore, the last element of the tokenized line will be added to the Price column of a pandas dataframe, and the remaining words will be concatenated and inserted in the Product_Name column. Let’s first create a pandas dataframe with these two columns:

import pandas as pd
dataset = pd.DataFrame(columns=['Product_Name', 'Price'])

The script that iterates through the invoice text and then inserts the records into the dataframe is as follows:

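# word_tokenize requires NLTK's punkt tokenizer data to be downloaded
from nltk.tokenize import word_tokenize

rows = []
for i, line in enumerate(text.splitlines()):

    # skip the header lines that precede the first product
    if i < 7:
        continue

    word_list = word_tokenize(line)

    # keep only lines whose last token contains a digit, i.e. a price
    if len(word_list) > 1 and any(ch.isdigit() for ch in word_list[-1]):

        item_words = word_list[:-1]
        # drop a stand-alone currency symbol such as the 'R' in 'R 0.40'
        if item_words[-1] == 'R':
            item_words = item_words[:-1]
        item = ' '.join(item_words)

        price = word_list[-1]
        if price[0] != 'R':
            price = 'R' + price

        rows.append([item, price])

# DataFrame.append was removed in pandas 2.0, so build the dataframe in one step
dataset = pd.DataFrame(rows, columns=dataset.columns)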

In the script above, we start reading the text from the 7th line, since the record for the first product, i.e. Milk tart, occurs at the 7th line. We then tokenize each line via the word_tokenize method of the nltk.tokenize module. If the tokenized list does not contain more than one token, or the last token (the price) doesn’t contain any digit, the line is skipped. Otherwise, if the second-to-last token is a stand-alone ‘R’, it is removed, and the remaining tokens are concatenated and stored in the item variable. The last token is stored in the price variable. If the value of the price variable doesn’t start with an ‘R’, one is prepended. The item and price values are then added to the dataset dataframe.

The dataset dataframe now contains products and their corresponding prices as shown below:

There are a few problems with the output. For the product ENT. bacon/egg LF, only the second line is stored, since the first line doesn’t contain any digits. However, such issues can be solved with regular expressions. The idea of this post is to demonstrate one possible approach for reading invoices via Python.
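
As a rough illustration of the regular-expression idea, the sketch below parses individual invoice lines with a single pattern. The pattern, the parse_line helper, and the sample lines are assumptions made for this example; they are not part of the script above, and they only cover lines that end in a price with two decimals.

```python
import re

# Hypothetical pattern: item text, an optional 'R' currency marker, then a price with two decimals.
LINE_PATTERN = re.compile(r"^(?P<item>.+?)\s+R?\s*(?P<price>\d+[.,]\d{2})$")


def parse_line(line):
    """Return (item, price) for a line that ends in a price, otherwise None."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    # Normalize the price to the 'R<amount>' form with a dot as decimal separator.
    return match.group("item"), "R" + match.group("price").replace(",", ".")


sample_lines = [
    "Milk tart R17.99",
    "Carrier bag 24L R 0.40",
    "Dairy custard 17.99",
    "ENT. bacon/egg LF",
    "Tel no. 061 333 9999",
]

parsed = [parse_line(line) for line in sample_lines]
for line, result in zip(sample_lines, parsed):
    print(line, "->", result)
```

Lines without a price, such as the product name ENT. bacon/egg LF or the telephone number, yield None. A fuller solution could remember such a line and join it with the priced line that follows it.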


Extracting information from invoice images can be very useful for data mining in scenarios where digital invoices are not available. This article briefly explained how to extract text data from invoice images using the Tesseract library for Python. It also discussed several approaches for OCR and the challenges in this domain.

Introduction of IoT in The Medical Field: Current Status and Future Prospects

It is difficult to imagine modern medicine without the latest technologies, which are becoming one of the main tools of today’s healthcare system. The digital world has brought doctors innovations that can radically change the system of medical care and disease prevention.

This area was one of the first to use the capabilities of the Internet of Things (IoT), and all kinds of “smart” devices have already become an integral part of the functioning of many clinics and hospitals.


Keep reading to learn how the introduction of advanced IoT technologies is helping modern medical science.

Benefits of medical IoT

The introduction of IoT in the medical industry has the following advantages.

Test values are collected in digital form, so they can easily be graphed and analyzed. In addition, big data that gathers many such values can be used for research into new treatments.

Medical devices and other devices using Internet of Medical Things (IoMT) technology can be divided into the following types:

Diagnostic: tonometers, urine analyzers, ultrasound machines, glucometers, thermometers, uroflowmeters, and many others. This is the class of devices most familiar to doctors: they have a long history of clinical use and, thanks to the Internet of Things, are becoming widely available to users.

Preventive (or for maintaining a healthy lifestyle): fitness trackers, scales that estimate body fat composition, devices for determining the calorie content of foods and the harmful substances in them, etc.


Such solutions are usually bought and used by people who monitor their own health without the active participation (prescription) of doctors or medical personnel. However, these devices can have a very large impact on the collection of “big data” directly related to health and preventive medicine.

Medical: insulin pumps, smart “pillboxes” that control the administration of drugs, etc. For such devices, the safety requirements are as high as possible, since they directly participate in the treatment process. Any technical error or malfunction can significantly affect the patient’s health.


Rehabilitation: devices that accelerate the patient’s recovery, help carry out a rehabilitation program in everyday life, and improve the quality of life after serious illnesses.

The same device or solution can, of course, fall into several of these categories simultaneously.

Issues of introducing IoT in medical care

There are several challenges to the introduction of IoT in medicine.


1. Security measures are required

The first point to consider is security. Medical data, including test values, is highly confidential personal information. It is often directly related to the lives of individual patients and is highly valuable.


In addition, hacking may cause information leaks and alteration of test values, which may lead to medical accidents.


2. Sharing issues

One advantage of digitizing data is that it becomes easier to share. However, against the backdrop of strong interest in protecting personal information, some people may be psychologically resistant to data exchange between multiple medical institutions and doctors. Data must be handled with prior explanation and consent.

3. Some areas where introduction is difficult

In the medical field, remotely operated telemedicine robots have appeared and are spreading. However, remote surgery is currently difficult to introduce due to a lack of training for doctors and a lack of legislation.


On the other hand, there are few barriers to introducing IoT in order to enhance care at home and in medical institutions. It is still difficult, however, when detailed medical reports closely related to a patient need to be examined. In this area, further introduction and research are expected in the future.

Examples of IoT use in medicine

1. Home health monitoring

Smart bracelets not only increase the effectiveness of physical activity but also allow doctors and relatives to monitor the patient’s health remotely. Thanks to recorded indicators and telemedicine, diagnostics can be carried out remotely, saving time and effort.


Today, many wearable gadgets are being created to monitor important health indicators. For example, Cambridge Consultants released the Flow Health Hub device, which measures blood pressure, cholesterol, and blood sugar. The device communicates directly with the patient’s attending physician, reporting these measurements.


2. Real-time physical condition management with ICT devices

Nursing care beds used at nursing care sites are equipped with functions that notify staff stations when the clinical time has passed and when patients get up from the bed.

In the medical field, the “Smart Bed System” is used, which can measure not only the pulse rate and breathing of sleeping patients but also sleep and awakening. The data is shared with nurse centers and electronic medical records.


Wrist-worn wearable devices can detect changes in body condition and position using sensors. They can measure body temperature, pulse, and acceleration, and can respond to changes in condition.


3. The fusion of AI (artificial intelligence) and medical care using big data

Scientists have developed a system that assists doctors in diagnosis by collaborating with private companies to create a database of past cases with a definitive diagnosis.

The system incorporates the search and reference of images and data and uses AI to analyze them. It saves doctors time and helps improve the speed and accuracy of diagnosis.



It has become possible to provide remote medical care by using the latest IoT technologies. In addition to surgical support by specialists, post-surgery care support and medical care in depopulated areas are becoming possible. Remote medical care is being provided in a wide range of medical fields, such as home care and maternal health checkups.

Medical treatment, remote nursing, and monitoring services for senior citizens are also being implemented.


What else will need to be done in this area

The advantages of using new technologies are obvious, so it is not surprising that the IoT in the healthcare sector is gaining momentum every year and will continue to do so. However, the biggest risk, which has not yet been fully resolved, is the security of patients’ personal data.

Many hospitals use network monitoring to collect information from all their equipment and encrypt data. Creating different levels of access to information limits the circle of people who can get it.


Also, in real-time, you can see and control what actions are performed from each device. However, the developers still have a lot of work to do to ensure the complete security of the data transmitted and stored by IoT devices.


The rapid growth of the Internet of things in healthcare is impressive, but the industry is only at the stage of understanding what opportunities are opening up before it.



The main barrier to the development of the Internet of things in medicine that each of us must overcome is not technological limitations, but the conservatism and skepticism of patients and medical workers. But this obstacle will be overcome faster when the state provides active support, including at the legislative level, and there will be more success stories using IoT technologies.


According to experts, by 2020, the market for medical IoT devices will amount to $47.4 billion. The demand for gadgets for smart healthcare continues to grow, which means that in the near future we will see the rise of digital medicine.


Evaluating a Car’s Condition with Machine Learning


Machine learning techniques, owing to their accuracy and precision, are being increasingly employed for decision making in a variety of scenarios. From stock purchases to marketing campaigns, organizations base their decisions on the insights obtained from data via machine learning techniques. This article demonstrates how machine learning automates the decision-making process of evaluating a car’s condition.  You will see how to develop a machine learning model which predicts if a car is in an unacceptable, acceptable, good or very good condition, based on different characteristics of the car.  


  1. It is assumed that the readers have intermediate knowledge of the Python programming language since the code samples are in Python.
  2. The code has been tested with Google Colaboratory.  However, you can run it on your local machines as well, provided you have installed Python 3.6.

Step 1: Problem Statement & Dataset

The task is to predict whether a car is in an unacceptable, acceptable, good or very good condition based on car characteristics such as the price of the car, maintenance cost, safety features, luggage space, seating capacity,  and the number of doors.

The dataset we will be using to build our model can be freely downloaded from this kaggle link.  The dataset is in CSV format. In case you are using a cloud platform to develop the model, you will need to upload the CSV file to the corresponding cloud platform.

Step 2: Importing Libraries and Loading the Dataset

The following script imports the libraries required to execute the scripts in this article:

import pandas as pd
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

import matplotlib.pyplot as plt
%matplotlib inline

import seaborn as sns

If you open the CSV file for the dataset, you will see that it doesn’t contain headers for the data columns. The details of the headers are available at this link. If you look at the “Attribute Information” heading at the link, you can see the headers that correspond to the different attributes of the cars.

There are six attributes and one class label:

  1. buying:  which corresponds to the price of the car. There are four possible values for this attribute:  vhigh, high, med and low.
  2. maint: which stands for the maintenance cost. It can also have the same four possible values as for the buying attribute.
  3. doors: corresponds to the number of doors of a car. The possible values are 2, 3, 4, 5more.
  4. persons: refers to the seating capacity of a car. The possible values are 2, 4, and more.
  5. lug_boot: contains information about the luggage compartment, and can have small, med, and big as the possible values.
  6. safety: corresponds to the safety rating of the car. The possible values are low, med, and high.
  7. class: refers to the manual evaluation of the car’s condition. A car can be in an unacceptable condition, acceptable condition, good condition, and very good condition. The shorthand notations for the values are unacc, acc, good, and vgood. The class attribute is renamed as condition for the sake of readability.

The task is to predict the value of the seventh column, condition, given the values of the first six attributes.

The following script loads the dataset:

colnames = ['buying', 'maint', 'doors', 'persons', 'lug_boot', 'safety', 'condition']
car_data = pd.read_csv(r'E:\Datasets\car_prediction.csv', names=colnames, header=None)

Let’s first see what our dataset looks like.




You can see the seven attributes in the dataset.  Let’s now print the unique values for all the columns in our dataset.

for col in car_data:
   print (car_data[col].unique())


['vhigh' 'high' 'med' 'low']
['vhigh' 'high' 'med' 'low']
['2' '3' '4' '5more']
['2' '4' 'more']
['small' 'med' 'big']
['low' 'med' 'high']
['unacc' 'acc' 'vgood' 'good']

Step 3: Exploratory Data Analysis

Before training the model, it is always a good idea to perform some exploratory data analysis on the data.

Let’s visualize the relationship between the output class i.e. condition and some of the other attributes in the dataset.

Firstly, we increase the default plot size with the following script:

plot_size = plt.rcParams["figure.figsize"]
plot_size[0] = 8
plot_size[1] = 6
plt.rcParams["figure.figsize"] = plot_size

The following script generates a pie plot that shows the class distribution for the condition column.  

car_data.condition.value_counts().plot(kind='pie', autopct='%1.0f%%', colors=['skyblue', 'orange', 'pink', 'lightgreen'], explode=(0.05, 0.05, 0.05, 0.05))


You can see that 70% of the cars are in unacceptable condition, while 22% are in acceptable condition. The share of cars in good and very good condition is very low.

Let’s now see the relationship between the condition and buying attributes:

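# count cars per condition and price level, then plot; the receiver and the calls after groupby() are an assumed completion of the truncated original line
car_data.groupby(['condition', 'buying']).size().unstack().plot(kind='bar')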


The output shows that cars with acceptable and unacceptable conditions belong to all price ranges. However, cars with good and very good conditions are either medium or low price cars. The possible reason for this distribution is that expensive cars might have better conditions than less expensive cars, however, the conditions of the expensive cars may not justify their price tags. Hence, with respect to their price, the condition has been rated as either acceptable or not acceptable, and not good or very good.

Next, let’s plot the relationship between the condition and doors attributes.

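# count cars per condition and number of doors, then plot; the receiver and the calls after groupby() are an assumed completion of the truncated original line
car_data.groupby(['condition', 'doors']).size().unstack().plot(kind='bar')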


You can see that the distribution of cars with respect to the number of doors across various car condition types is approximately the same. Therefore, doors is not a very good attribute for decision making regarding a car’s condition.

Finally, we will plot the relationship between condition and safety attributes.

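# count cars per condition and safety rating, then plot; the receiver and the calls after groupby() are an assumed completion of the truncated original line
car_data.groupby(['condition', 'safety']).size().unstack().plot(kind='bar')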


The output shows that none of the cars with low safety have been rated as being in acceptable, good or very good condition. All the cars with low safety features have been rated as having unacceptable conditions. Therefore, it can be assumed that safety is a very good feature for decision making regarding a car’s condition.  In the same way, we can study the relationships between the remaining attributes in the dataset.

Step 4: Data Preprocessing

The basic purpose of exploratory data analysis is to see which are the most important features for decision making since a large number of features can really slow down the model training and can also negatively affect the performance of the algorithms. Exploratory data analysis and feature selection is a mandatory step in case you have large datasets.  However, since we have only 6 features to base our decision on, we will retain all of the features in the dataset and will apply pre-processing to all of them.

All the features in our dataset are categorical, i.e. they contain categorical values. Features that mix numerical and categorical values, for instance doors and persons, are also treated as categorical features. However, most machine learning algorithms work only with numbers. Therefore, we need to convert the categorical features in our dataset to their numerical counterparts. To do so, one-hot encoding can be used. In one-hot encoding, a new column is created for each unique value in a categorical column. For each row, the integer 1 is placed in the new column that corresponds to the row’s actual value, and 0 in the others.
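
To make the idea concrete, here is a minimal pure-Python sketch of one-hot encoding, independent of pandas. The one_hot helper and the four sample values are hypothetical and exist only for illustration; the categories are sorted alphabetically so that the column order is predictable.

```python
def one_hot(values, prefix):
    """Return (column_names, rows); each row holds a single 1 marking that row's category."""
    categories = sorted(set(values))
    columns = [f"{prefix}_{category}" for category in categories]
    rows = [[1 if value == category else 0 for category in categories] for value in values]
    return columns, rows


# Four hypothetical safety ratings taken from the attribute's possible values.
safety_values = ["low", "med", "high", "med"]
columns, rows = one_hot(safety_values, prefix="safety")

print(columns)
for value, row in zip(safety_values, rows):
    print(value, row)
```

Each original value is replaced by a row containing exactly one 1, so no artificial ordering is imposed on the categories.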

Let’s first remove all the features or attributes from the car_data dataset, except `condition` since the condition is what we have to predict.  

temp_data = car_data.drop(['buying', 'maint', 'doors', 'persons', 'lug_boot', 'safety'], axis=1)

Next, we need to create one-hot encoded columns for the attributes that we dropped.

buying = pd.get_dummies(car_data.buying, prefix='buying')
maint = pd.get_dummies(car_data.maint, prefix='maint')

doors = pd.get_dummies(car_data.doors, prefix='doors')
persons = pd.get_dummies(car_data.persons, prefix='persons')

lug_boot = pd.get_dummies(car_data.lug_boot, prefix='lug_boot')
safety = pd.get_dummies(car_data.safety, prefix='safety')

Now, if you print the `buying` dataframe, you should see the following output:



You can see that for each value in the original `buying` column, a new column has been generated.

Next, we need to concatenate the one-hot encoded columns for all the attributes to create the final dataset.

car_data = pd.concat([buying, maint, doors, persons, lug_boot, safety, temp_data] , axis=1)

Step 5: Training the Model

In this step, we will train our machine learning model on the data.  We will use the Random Forest algorithm to train our model. However, before that, we need to divide the dataset into training and test set. The model is trained on the training data and model performance is evaluated on the test data.

The following script divides the data into training and test sets:

X = car_data.loc[:, car_data.columns != 'condition'].values
# use a 1-D array of labels, which is the shape scikit-learn classifiers expect
y = car_data['condition'].values

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Finally, the following script trains the model on the training set and makes predictions on the test set:

from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier(n_estimators=20, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

Step 6: Evaluating Model Performance

Confusion matrix, accuracy, precision, and recall are the performance metrics used to evaluate the performance of a classification model such as the one we developed in this article. Look at the following script:

print(classification_report(y_test, y_pred))

print("\nConfusion Matrix:")
print(confusion_matrix(y_test, y_pred))

print(accuracy_score(y_test, y_pred))


              precision    recall  f1-score   support

         acc       0.95      0.88      0.91        83
        good       0.62      0.73      0.67        11
       unacc       0.98      1.00      0.99       235
       vgood       0.88      0.82      0.85        17

   micro avg       0.95      0.95      0.95       346
   macro avg       0.85      0.86      0.85       346
weighted avg       0.96      0.95      0.95       346

Confusion Matrix:
[[ 73   5   5   0]
 [  1   8   0   2]
 [  0   0 235   0]
 [  3   0   0  14]]


Our model achieves an accuracy of 95.37% which is pretty impressive.  
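
To see where these numbers come from, the sketch below recomputes accuracy, precision, and recall directly from the confusion matrix printed above. It assumes scikit-learn’s convention that rows are true classes and columns are predicted classes, with the labels in alphabetical order; the helper function names are made up for this example.

```python
labels = ["acc", "good", "unacc", "vgood"]

# Confusion matrix from the output above: rows are true classes, columns are predicted classes.
matrix = [
    [73, 5, 5, 0],
    [1, 8, 0, 2],
    [0, 0, 235, 0],
    [3, 0, 0, 14],
]

total = sum(sum(row) for row in matrix)
correct = sum(matrix[i][i] for i in range(len(labels)))
accuracy = correct / total


def precision(i):
    """Share of the cars predicted as class i that really belong to class i."""
    predicted_as_i = sum(row[i] for row in matrix)
    return matrix[i][i] / predicted_as_i


def recall(i):
    """Share of the cars that really belong to class i and were predicted as class i."""
    return matrix[i][i] / sum(matrix[i])


for i, label in enumerate(labels):
    print(f"{label:>6}  precision={precision(i):.2f}  recall={recall(i):.2f}")
print(f"accuracy={accuracy:.4f}")
```

The diagonal of the matrix holds the correctly classified cars, 330 out of 346, which is where the accuracy of roughly 95% comes from.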


Car condition evaluation is one of the many decision-making problems that can be solved via machine learning techniques. This article explains how to automate the process of predicting the condition of any car based on several attributes such as price, safety, maintenance cost, etc. The article also explains how to perform exploratory data analysis for studying feature importance.

Did you know that blockchain technology is the ‘trend of the future?’

“Well, I have heard of blockchain technology and believe that it is related to bitcoin, but I would like to know what it is exactly because I really don’t know what it is about!”  Well, you should know that blockchain technology is everywhere in the business world – from marketing to finance.  “Wonderful, but I would like to know more about blockchain technology and how it is used!”  You are in luck because this blog will answer these basic questions regarding blockchain technology.

What is blockchain technology?

If you read any blog or literature online discussing blockchain technology, it will provide you with the following definition:  ‘Blockchain technology consists of facts and information which are grouped together in data blocks.  These data blocks are identifiable by a unique timestamp and linked together using continuous linkages of cryptographic validation and principles.  The blocks are stored on various computers and other digital devices in “the network.”’  In other words, blockchain technology is a group of related information that is assigned a unique timestamp at the time of creation, is interlinked by cryptographic principles, and is stored on digital devices.

“Wow, I never knew that data could be sorted out and grouped together into groups based on certain common characteristics, protected and identified by a unique time stamp, and then stored on digital devices!  That is amazing!”  If you thought that was amazing, what you will read next will really impress you!  It is this linking of related information which allows users to remain anonymous and securely share related data.  It is useful in this instance – you have come into a large inheritance and need to prove that you are the legitimate heir.  If you use the information contained in blockchain technology, you can completely eliminate the ‘middleman’ who in this case would be the lawyer.  Blockchain technology’s main purpose is to eliminate the need for a third party in any digitally related transaction.
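
For readers who like to see the idea in code, the definition above can be sketched in a few lines of Python. This is only a toy illustration, assuming SHA-256 hashes as the cryptographic link between blocks; the function names and the sample data are invented for this example, and a real blockchain adds a distributed network, digital signatures, and a consensus mechanism on top.

```python
import hashlib
import json
import time


def block_hash(block):
    """Hash the block's contents (everything except its own stored hash) with SHA-256."""
    payload = {key: block[key] for key in ("index", "timestamp", "data", "previous_hash")}
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode("utf-8")).hexdigest()


def make_block(index, data, previous_hash):
    """Create a timestamped block that points at the hash of the previous block."""
    block = {"index": index, "timestamp": time.time(), "data": data, "previous_hash": previous_hash}
    block["hash"] = block_hash(block)
    return block


def is_valid(chain):
    """A chain is valid if every stored hash is correct and every block points at its predecessor."""
    for position, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        if position > 0 and block["previous_hash"] != chain[position - 1]["hash"]:
            return False
    return True


chain = [make_block(0, "genesis", previous_hash="0" * 64)]
chain.append(make_block(1, "Alice pays Bob 5", chain[-1]["hash"]))
chain.append(make_block(2, "Bob pays Carol 2", chain[-1]["hash"]))

print(is_valid(chain))  # check the untouched chain
chain[1]["data"] = "Alice pays Bob 500"
print(is_valid(chain))  # check again after tampering with block 1
```

Because each block stores the hash of the one before it, changing the data in one block makes its stored hash wrong, and repairing that hash would in turn break the link from the following block.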

How does blockchain technology keep information secure?

“You mentioned that blockchain technology keeps information secure, but I want to know how it does this!”  The answer is simple.  The first thing to know about blockchain technology is that its data is stored on a blockchain platform.  This makes it very easy to encrypt and hence keep the information secure from unwanted outsiders.

Blockchain Technology’s many uses

“Now that I have a basic understanding of blockchain technology, I would love to know how it is used in the professional world!”  Well, here is your answer.  Blockchain technology is completely secure because each block is identifiable by a unique timestamp, which makes it easy to encrypt and therefore protect the information contained in it on various digital devices across a network.  Information on this network is completely secure because would-be hackers would have to break through the entire network of digital devices (including computers) in order to access the information, and this would be impossible to do.  Each block of verified information has a unique record with a unique history.  The hacker would need to know all of the passwords for each digital device storing different blocks of information and would need to falsify all of the records on this vast network.  Obviously, this would be impossible to do.  This type of technology has many uses, including:

  • Smart contracts
  • Digital voting
  • Distributed storage
  • Unmatched IoT security
  • Supply-chain communications
  • Bitcoin

Each is explained in more detail below:

Smart contracts

“Hey wait, what are smart contracts, what are they used for, and how does blockchain technology help keep their information secure?”  The answer is: smart contracts are used when assets (like money) are transferred from one business entity to another on a daily basis.  But they have far more diverse uses than this.  Smart contracts can be used when money is being exchanged from the insuree to the insurer when insurance premiums are being processed.  However, the uses for blockchain technology do not end there.  They can also be used to exchange money, information, and other valuable assets when legal processes are being executed.  A good example would be the transfer of funds from the buyer to the seller when a real estate deal is ‘closed and goes through!’  In this instance, the real estate agent or broker (the middleman) would be completely absent from the scene.  The real estate agent or broker gets a 6% commission on the total selling price of a house when it is sold.  These contracts are useful when transferring money and other assets in crowd-funding agreements.  Smart contracts are expected to replace third-party lawyers who currently take a huge ‘cut’ in legal and financial processes.

Digital voting

Blockchain technology promises to change the way information is stored about political candidates during various elections.  It promises to make digital voting an easy and convenient process by storing sensitive and confidential information regarding political candidates on digital devices and making it completely confidential.  This will allow people to vote for a candidate from the convenience of their own home instead of going to a voting center and casting their electronic votes there.  “Great, I can vote for my presidential candidate from the convenience of my iPhone while watching The Big Bang Theory in the comfort of my own home!  I am beginning to like blockchain technology because it seems to make life easier!”

Aside from making the voter’s life easier, blockchain technology addresses a prime concern of political information, which is keeping it secure and confidential.  Without such protection, hackers could easily access this information, which would give them great opportunities to tamper with it and hence skew the election results.  Blockchain technology keeps voter information regarding preferences for political candidates secure and confidential on the various digital devices used.  It offers tremendous scope for countries that have traditionally had low voter turnouts because they can use technology to encourage greater voter participation.

Distributed storage

Did you know that blockchain technology can keep information stored in your Dropbox secure and confidential?  “Wow, really?  I never knew this, please tell me more!”  Blockchain technology is perfect for storage services which store large quantities of unique and different but related and interrelated information.  These storage services include Dropbox and Gdrive.  Dropbox and Gdrive store this type of information in the form of information ‘files.’  Though these services are great for storing vast amounts of information, this information is often sensitive and must be kept confidential.  However, many governments do not understand this and often pressure the services into disclosing personal information which is sensitive (like credit card and social security numbers) to them (governments).

Dropbox, Gdrive and other services are increasingly and heavily relying on blockchain technology to group related data in files together and store it on many highly encrypted and extremely secure computers and other digital devices on their vast network.  This will make it impossible for either governments or hackers to access this information.

Unmatched IoT security

Have you heard of unmatched IoT security?  “No, I have not.  What is it and how can blockchain technology help keep it secure?”  Hackers have had an easy time accessing the information held on IoT devices.  However, when blockchain technology is used, the information is protected and kept confidential because of the multiple layers of distribution channels that exist over all of the digital devices in its vast network.  Since the information is assigned a unique timestamp, it can easily be tracked, validated, and verified.  Any would-be hacker would need to access all of the digital devices on the network.  Access would only come by knowledge of all of the passwords, which no hacker would know!

Supply-chain communications

The introduction of manufacturing powered by electricity in the early 20th century completely revolutionized manufacturing processes because it allowed the same machine to produce different items.  Additionally, many firms manufacture many of the components found in items currently bought, like cars and iPhones.  Many raw material and logistics suppliers must be involved in the process, and this leaves a large scope for the resulting information to be accessed by hackers.  If a hacker were to access information in even a small section of the manufacturing chain, it could lead to the unraveling of the entire manufacturing chain.

Blockchain technology guards against this by converting supplier and logistics-related information into secure and auditable digital information.  This information is invaluable because it tells management and higher leadership the exact stage a particular product lies in the entire value-added manufacturing process and chain!


Bitcoin

“I have heard of Bitcoin, but I am confused by it.  I was wondering if someone could explain to me how blockchain technology is related to Bitcoin!”  Bitcoin is a digital currency that is created and kept secure over the Internet using cryptography.  This is the reason why Bitcoin is often referred to as a cryptocurrency.  It relies on blockchain because blockchain protects information contained on different digital devices in a network.  These devices are often referred to as ‘digital ledgers.’  Another term for blockchain technology is ‘distributed ledger technology.’

Bitcoin was created after payments made in various national currencies were rejected by many banks around the world.  The result was the easy transfer of payments in any national currency between multiple parties around the world.  Blockchain technology is valuable in this instance because it protects international digital currencies and the information associated with them.

Blockchain technology is on your side

“What does the statement ‘blockchain technology is on your side’ mean?”  The statement refers to the fact that this type of technology keeps sensitive information on digital devices that are far-flung onto a global network secure and confidential.  This reassures you that only authorized parties can access your personal and sensitive information.  Knowing this, why don’t you invest in blockchain technology today?


Front end web Development with Go-gin (Golang)

Choosing a front-end framework such as Vue.js, React or Angular is great; however, if you are an indie dev like me and need a rapid prototyping methodology with just HTML, CSS & JavaScript, then this article is for you.

To demonstrate step by step, we are going to use Golang’s Go-Gin. So let’s build the base structure of the project with an MVC pattern.

Folder structure

  • Create a root folder called hypi-app in your workspace

workspace ❯ mkdir -p hypi-app

  • Enter the new app directory & create 3 folders for holding the models, views & controllers for MVC

workspace ❯ cd hypi-app && mkdir {models,views,controllers}

  • Now create folders to hold the routes & router

hypi-app ❯ mkdir {routes,router}

  • Let us create a templates folder for HTML files & a public folder to serve CSS, images & JavaScript

hypi-app ❯ mkdir -p {templates/layouts,templates/partials,public/images,public/css,public/js}

  • Finally, let us create a main.go, essentially the entry point for the app, and a server folder

hypi-app ❯ touch main.go && mkdir server

Now we all should have our project structure like this:

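hypi-app ❯ tree
.
|____controllers
|____main.go
|____models
|____public
| |____css
| |____images
| |____js
|____router
|____routes
|____server
|____templates
| |____layouts
| |____partials
|____views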

Web Server’s main entry

We are going to define the server’s main entry in the main.go file as follows:

package main

func main() {}

We will work on this file later.

Putting the server together

Inside server directory create a server.go file and add the following code:

package server

func Init() {}

Now create a router.go in the router folder and add the following code in it

package router

import "github.com/gin-gonic/gin"

// Router builds the gin engine used by the server.
func Router() *gin.Engine {

  router := gin.New()

  return router
}

Now most importantly let us create system default routes which will help us in handling unknown routes & failures.

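Create a file called routes.go in the routes folder and add the following code in it

package routes

import "github.com/gin-gonic/gin"

// NotFound registers a JSON 404 response for unknown routes.
func NotFound(route *gin.Engine) {
  route.NoRoute(func(c *gin.Context) {
     c.AbortWithStatusJSON(404, "Not Found")
  })
}

// NoMethods registers a JSON 405 response for disallowed methods.
// Note: gin only calls this handler when HandleMethodNotAllowed is true on the engine.
func NoMethods(route *gin.Engine) {
  route.NoMethod(func(c *gin.Context) {
     c.AbortWithStatusJSON(405, "not allowed")
  })
}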

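Now in the router file let us import the default routes & run “go get -u github.com/gin-gonic/gin” in your terminal

package router

import (
  "hypi-app/routes"

  "github.com/gin-gonic/gin"
)

func Router() *gin.Engine {

  router := gin.New()

  // System routes
  routes.NotFound(router)
  routes.NoMethods(router)

  return router
}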

We now have everything we need to get our server up, let us add the router to server initialization

package server

import "hypi-app/router"

// Init builds the router and starts listening on port 3000.
func Init() {
  r := router.Router()
  r.Run(":3000")
}

Now we call the server initialization from our main entry point

package main

import "hypi-app/server"

func main() {
  server.Init()
}

If you have followed along as described, you should now be able to run the server & test it. From your hypi-app directory, run go run main.go

hypi-app ❯ go run main.go
[GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.
 - using env:   export GIN_MODE=release
 - using code:  gin.SetMode(gin.ReleaseMode)

[GIN-debug] Listening and serving HTTP on :3000

Go Views

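We will be using goview to define our front end. Let us get started & install it by typing “go get github.com/foolin/goview” in your terminal

Create an index.go file in the views folder and add the following content

package views

import (
  "html/template"
  "net/http"
  "time"

  "github.com/foolin/goview"
  "github.com/foolin/goview/supports/ginview"
  "github.com/gin-gonic/gin"
)

func IndexView(route *gin.Engine) {
  app := ginview.NewMiddleware(goview.Config{
     Root:      "templates/",
     Extension: ".html",
     Master:    "layouts/master",
     // The partials listed here are the files created under templates/partials below.
     Partials: []string{"partials/link", "partials/meta", "partials/title",
        "partials/scripts_head", "partials/scripts_foot"},
     Funcs: template.FuncMap{
        // copy returns the current year, e.g. for a copyright notice.
        "copy": func() string {
           return time.Now().Format("2006")
        },
     },
     DisableCache: true,
  })

  appGroup := route.Group("/", app)
  appGroup.GET("/", func(ctx *gin.Context) {
     ginview.HTML(ctx, http.StatusOK, "index", gin.H{
        "title": "Hypi App | Hello World!",
     })
  })
}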

Now let us add the view to our router

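package router

import (
  "hypi-app/routes"
  "hypi-app/views"

  "github.com/gin-gonic/gin"
)

func Router() *gin.Engine {

  router := gin.New()

  // System routes
  routes.NotFound(router)
  routes.NoMethods(router)

  // Index view
  views.IndexView(router)

  return router
}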

Now let us create a reusable HTML structure. Under the templates/layouts folder, create a new file called master.html. This will be the master layout; add the following content

<!DOCTYPE html>
<html lang="en">
<head>
   {{template "link" .}}
   {{template "meta" .}}
   {{template "title" .}}
   {{template "scripts_head" .}}
</head>
<body>
{{include "layouts/header"}}
<div class="core">
   {{template "content" .}}
</div>
{{include "layouts/footer"}}
{{template "scripts_foot" .}}
</body>
</html>

Create footer & header HTML files under layouts. We will not use them in this tutorial; however, let us create them to understand the structure.

Now we need to create some partials that can be used in all layouts when needed. Under the partials directory, create a new file named link.html

{{define "link"}}
<link rel="shortcut icon" href="[email protected]×32.png?x83512">
<link rel="stylesheet" href="" />
<link href=",300,400,700&display=swap" rel="stylesheet">
{{ end }}

Create a meta.html file and add the following code

{{ define "meta"}}
<meta charset="UTF-8">
{{ end }}

Create a scripts_foot.html file and add the scripts to be loaded in the footer section

{{define "scripts_foot"}}
<script src=""></script>
<script src=""></script>
{{end }}

For now, create a scripts_head.html file and add the following code. We will not be loading any scripts in the header for this Go-gin front-end development tutorial

{{define "scripts_head"}}
{{end }}

Now our last partial, which will hold and parse the title for each page. Create a title.html & add the following code

{{define "title"}}
<title>{{.title}}</title>
{{end }}

Now we will create our index.html file with the following code

{{define "content"}}
<div class="ui stackable grid">
<div class="ui equal width row row-vh-f">
<div class="ui column brand-bg-img row-vh-f"></div>
<div class="ui column auth-bg middle aligned padded">
<div class="ui container form segment">
<form class="ui form padded">
<div class="ui buttons two fluid">
<div class="ui animated fluid large fade button" tabindex="0">
<div class="visible content"><i class="github icon"></i></div>
<div class="hidden content"></div>
</div>
<div class="ui left floated animated fluid large fade basic button" tabindex="0">
<div class="visible content"><i class="slack icon"></i></div>
<div class="hidden content"></div>
</div>
</div>

<div class="ui buttons two fluid">
<div class="ui animated fluid large fade facebook button" tabindex="0">
<div class="visible content"><i class="facebook icon"></i></div>
<div class="hidden content"></div>
</div>
<div class="ui left floated animated fluid large fade twitter button" tabindex="0">
<div class="visible content"><i class="twitter icon"></i></div>
<div class="hidden content"></div>
</div>
</div>

<div class="ui buttons two fluid">
<div class="ui animated fluid large fade google plus button" tabindex="0">
<div class="visible content"><i class="google plus icon"></i></div>
<div class="hidden content">
                                   google plus
</div>
</div>
<div class="ui left floated animated fluid large fade youtube button" tabindex="0">
<div class="visible content"><i class="youtube icon"></i></div>
<div class="hidden content"></div>
</div>
</div>

<div class="ui buttons two fluid">
<div class="ui left floated animated fluid large fade vk button" tabindex="0">
<div class="visible content"><i class="vk icon"></i></div>
<div class="hidden content"></div>
</div>
<div class="ui left floated animated fluid large fade linkedin button" tabindex="0">
<div class="visible content"><i class="linkedin icon"></i></div>
<div class="hidden content"></div>
</div>
</div>

<div class="field">
<div class="ui checkbox">
<input type="checkbox" tabindex="0" class="hidden">
<label>I agree to the Terms and Conditions</label>
</div>
</div>
</form>
</div>
</div>
</div>
</div>
{{ end }}

All right, now let us try running the server. You should see a page like this

Serving Static Content

Great, we have created our own front-end structure with an MVC pattern. This code base can be extended into a full-fledged application once we start writing our controllers & models, but that is outside the scope of this article; hopefully I will continue in another tutorial.

But before you start playing around with this code base, there is one more thing I would like you to learn: serving static files, since we are doing front-end development. So let us create a static route and add it to our router to serve the files as intended.

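Create a static.go file under the routes folder and add the following code

package routes

import "github.com/gin-gonic/gin"

// PublicImages serves files from ./public/images under /public/images.
func PublicImages(route *gin.Engine) {
  route.Static("/public/images", "./public/images")
}

// PublicCss serves files from ./public/css under /public/css.
func PublicCss(route *gin.Engine) {
  route.Static("/public/css", "./public/css")
}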

Now include this route in router with the updated code as shown below

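package router

import (
  "hypi-app/routes"
  "hypi-app/views"

  "github.com/gin-gonic/gin"
)

func Router() *gin.Engine {

  router := gin.New()

  // System routes
  routes.NotFound(router)
  routes.NoMethods(router)

  // Static Routes
  routes.PublicImages(router)
  routes.PublicCss(router)

  // Index view
  views.IndexView(router)

  return router
}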

Now let’s add a style.css file in the css folder with the following styles.

h1, h2, h3, h4, h5, h6, p, a {
   font-family: 'Lato', sans-serif !important;
}

.brand-bg-img {
   background: #2D52A3 url("../media/images/botecho_logo_size_invert.png") no-repeat;
   background-size: contain;
   background-position: left center;
}

.row-vh-f {
   height: 100vh;
}

.auth-bg {
   background-color: #FFFFFF;
}

Import the newly added style to link.html

<link rel="stylesheet" href="../../../../public/css/style.css"/>

That’s it, now run the server with “go run main.go”, you should see your new hypi app page like this

Code for this tutorial can be found here. Have fun!

IoT in Agriculture: The Smart Farming Solution

Smart Farming

Precision Agriculture or Smart farming is one of the sectors with the highest development opportunities and with the lowest penetration, to date, of digitized solutions.

It is a sector that needs digital solutions for environmental and territorial sensors, applications for the weather, and automation of equipment for the increasingly precise management of water, fertilizers and agrochemicals. These are the areas where IoT can revolutionize farming.

In the agricultural sector, a next generation of agriculture that utilizes robots, AI and IoT has appeared and is attracting attention.

What is precision agriculture?

Precision agriculture consists of the process of managing the different crops by observing, measuring and generating actions based on the results obtained from the analysis of the different elements that affect crop productivity. For this, sensors are used that collect information that is then used to make decisions with greater precision, and also to optimize crop yields.

Stages in precision agriculture

The arrival of low-cost solutions that incorporate multiple sensors and that use Low-Power Wide-Area Network (LPWAN) networks to transmit small data packets greatly enhances the possibilities of achieving extraordinary results through precision agriculture models based on IoT.

Crop management based on smart or precision agriculture encompasses monitoring activities, support tools for decision-making and the performance of actions that automatically control one or more systems (irrigation, frost protection, fertilization, etc.).


Using temperature sensors, humidity, soil conditions, climatic conditions among many others.


Generation of KPIs, graphs, reports and alerts that provide visibility into the conditions and variables that are intended to be evaluated.


The expert will have information to take preventive or corrective decisions according to the particular conditions and objectives of the crop.


The continuous information allows us to analyze results, learn from the executed actions and plan to improve.

The solution, based on a series of sensors, devices and a computer application, provides more granular and detailed information on the crop, the soil and climatic variations than traditional agricultural techniques, directly impacting the quality of the products, the processes carried out and the raw materials or inputs used in the activity.

Purpose of smart agriculture

Efforts using advanced technology called “Smart” are being promoted in various fields, and there are many products and solutions. Smartphones, smartwatches, smart speakers, smart homes, and even smart communities using these devices have emerged.

Among them, smart agriculture has been regarded as difficult to introduce, since farming has often been thought to have little connection with technologies such as IT and ICT. Its adoption has accelerated all at once in the last few years and is growing rapidly, regardless of farm size.

The following can be considered as the purpose of introducing smart agriculture:

1. Labor-saving and labor reduction of farm work

The first is labor-saving and lighter labor in farming. The agricultural sector in many countries is experiencing a serious labor shortage due to the aging of individual farmers. IoT is needed to relieve such hardships in agriculture.

2. The succession of agricultural technology

The second is the succession of cultivation skills to new workers. There is a shortage of people to take over farms, so smart agriculture systems are needed to ensure that the agricultural know-how built up within farming families continues to be passed on.

3. Increase food self-sufficiency

The third purpose is smart agriculture as a measure to improve food self-sufficiency. In many developing and even developed countries, imports exceed domestic production, making it difficult to say that an appropriate balance is maintained.

In order to increase the yield and raise the self-sufficiency ratio in the shortage of human resources as mentioned above, automation with sensors and robots is indispensable for reliably growing agricultural products with a small number of personnel.

4. Cost reduction and quality improvement

The reduction of costs, the improvements in the process and in the care of the crops, the optimization in the use of material and human resources, the increase of the yield per hectare cultivated, the greater quality of the final product and the decrease of discard, the compliance with national and international requirements of production and product characteristics, etc. These are some of the concrete benefits of opting for a Simple IoT Smart Agriculture solution.

Example of Cost Reduction: Online monitoring of soil temperature and humidity helps detect whether the environment is conducive to the proliferation of fungi and pests in crops. This allows farmers to apply fertilizers and fungicides efficiently and accurately, reducing costs.

Example of Product Quality Increase: Optimization of production and quality improvement will be possible, among other things, by making online measurements of the size of the stem and of the fruit or crop in question, the level of water required for irrigation, the value of Photosynthetically Active Radiation (RFA / PAR) to measure the conditions of photosynthesis, etc.

5. Real-time data collection

All of this is complemented by real-time and historical weather information through different sensors and devices that allow monitoring of atmospheric pressure, air temperature and humidity, wind speed and direction, amount of rainfall, dew point, etc.

All the information collected is stored to be able to make statistical reports, provide traceability, extract data to carry out corrective or preventive actions, issue alerts, etc.

The solution not only presents the information completely and in an easy-to-visualize way; additionally, at the request of the client, it can generate notices or alerts on mobile data-processing devices or cell phones.
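
As a rough illustration of how such alerts could be derived from sensor readings, here is a minimal Go sketch. Everything in it is hypothetical: the Reading and Thresholds types, the sensor names and the threshold values are placeholders chosen for the example, and real limits depend on the crop and the pest or fungus in question.

```go
package main

import "fmt"

// Reading is one hypothetical sample from a field sensor node.
type Reading struct {
	Sensor      string
	SoilTempC   float64
	HumidityPct float64
}

// Thresholds holds illustrative limits; real values depend on crop and pathogen.
type Thresholds struct {
	MinTempC       float64
	MaxTempC       float64
	MinHumidityPct float64
}

// fungusRisk reports whether a reading falls inside the warm and humid band.
func fungusRisk(r Reading, th Thresholds) bool {
	return r.SoilTempC >= th.MinTempC &&
		r.SoilTempC <= th.MaxTempC &&
		r.HumidityPct >= th.MinHumidityPct
}

// alerts returns one message per reading that crosses the thresholds.
func alerts(readings []Reading, th Thresholds) []string {
	var out []string
	for _, r := range readings {
		if fungusRisk(r, th) {
			out = append(out, fmt.Sprintf("%s: fungus risk (%.1f C, %.0f%% humidity)",
				r.Sensor, r.SoilTempC, r.HumidityPct))
		}
	}
	return out
}

func main() {
	th := Thresholds{MinTempC: 18, MaxTempC: 30, MinHumidityPct: 85} // assumed values
	readings := []Reading{
		{Sensor: "plot-A", SoilTempC: 22.5, HumidityPct: 91},
		{Sensor: "plot-B", SoilTempC: 12.0, HumidityPct: 95},
		{Sensor: "plot-C", SoilTempC: 25.0, HumidityPct: 60},
	}
	for _, msg := range alerts(readings, th) {
		fmt.Println(msg)
	}
}
```

In a real deployment the messages would be pushed to the farmer’s phone rather than printed, and the stored readings would also feed the statistical reports mentioned above.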

Practical applications of the Internet of Things in the agricultural sector

The agricultural sector has also launched into the hyperconnected world. Thanks to Smart Agriculture we can obtain detailed information on the crop, soil and climatic variations in real time from any tablet or smartphone. New IoT technologies have broken into strong sectors such as agriculture or livestock with the aim of improving the quality of life and reducing heavy work.

Here we want to analyze some of the practical applications of the Internet of Things in this sector:

The smart tractors

These tractors replace the driver’s cab with a complete standalone system based on cameras, radar, GPS and sensors that detect obstacles and make the vehicle change direction to avoid impacts. The farmer programs it with an application and can make it work simultaneously with other tractors. Maps with the limits of the field are loaded into the system, which also includes route-planning software.


Drones

Drones are unmanned aircraft that will increasingly fly over agricultural land. The drone market is moving at a rapid pace.  Many farmers already use them to know the state of the crops accurately in real time and thus carry out precision pesticide spraying. In Poland, they have even started to work with so-called ‘bee drones’ to support pollination, given the decline in bee populations.

Online monitoring through sensors

Online monitoring through sensors allows farmers to know from their smartphone or tablet the temperature, humidity and stem size of the fruit or crop. Depending on the state of the crops, each farmer can tailor the fertilizer and fungicide treatment to each farm effectively and accurately.

Smart Devices for Livestock

Connected livestock is another of the advances that are already being made. Tools and sensors measure the movement of cattle and monitor their nutrition and even their reproductive capacity. In addition, farmers can always know the location of animals to facilitate their counting and reduce theft.

Smart Pest Control

Remote sensors for smart pest control are installed in crops and warn farmers when conditions are most suitable for pest proliferation. The actions necessary to combat them can then be performed manually or automatically thanks to the use of new technologies.

Artificial intelligence (AI) in Agriculture

AI can also be used to systematize and provide technologies and know-how for new farmers. By doing this, even people who have no experience or knowledge of agriculture can engage in agriculture, which may help solve the shortage of human resources.

Some programs have already been developed and put into practical use, such as programs that analyze the degree of growth based on the shape and color of crops to predict and judge harvest time.

In addition, it is possible to detect information on crop pests at an early stage by using AI-based image analysis and to provide countermeasures. In both cases, demonstration experiments have already begun, and there are cases where these have been partially commercialized.

Analyzing crop growth from drone images of the field has grown especially quickly in AI-based agriculture, as has detecting pests and dealing with them. A Japanese-developed “Pinpoint Pesticide Spraying Technology” sprays pesticides only on the points where detected pests are present, reducing the labor of spraying, reducing the cost of the pesticides, and, above all, keeping the impact on the natural environment and crops to the minimum necessary.

In addition, this technology can be utilized not only by large-scale farmers but also by small and medium-sized farmers who have only small fields, and it may become a technology that supports successor farmers and new farmers.

Smart Agriculture is able to cover and provide solutions to the entire production cycle and also to the harvest and storage stages of the final product.

Smart agriculture case study

As we have discussed so far, smart agriculture is by no means a dream story in pursuit of ideals. Although it takes time to spread to ordinary farmers, there are already examples of smart agriculture introduced all over the world.

The Netherlands, which is said to be an advanced country in smart agriculture, is a pioneer that is always mentioned when talking about smart agriculture. On Dutch farms, infrastructure using smartphones and tablets has been developed so that the state of crop growth can be monitored 24 hours a day.

In addition, agriculture using the latest technology, such as sensing technology using various sensors, network technology using IoT, and even renewable energy, is being deployed.

Disadvantages of smart agriculture, future challenges

Smart agriculture seems to be a good thing in this way, but naturally, there are some disadvantages if you want to introduce smart agriculture.

The initial cost is high

First of all, the initial cost of introduction is higher than that of ordinary agricultural machines. How low development costs can be kept depends on the skill of the manufacturer.

In addition, even once introduced, ICT and robots that have only just begun to be used in the agricultural field make it difficult to forecast cost-effectiveness. Many farmers say they gave up partway because they were unable to use the technology after all.

Variation in the data format of individual devices

Next, these ICT devices and robots are products that are unique to each manufacturer, so the problem is the standardization of software and data formats. These things need to be taken into consideration from a long-term perspective on data storage, management, and migration.

This is because interoperability is not possible if the market is divided into proprietary systems and standards. However, the development of these OSs and middleware often runs into the dilemma that standardization does not progress easily, because companies and other organizations often develop independently in order to gain market share.

Shortage and development of smart farmers

The viewpoint of human resource development for smart agriculture is also needed. Among aging farmers, few people can quickly learn to use such smart devices. Support systems for mastering smart devices and the development of IT-savvy personnel are urgently needed in the agricultural field.

New work burden on farmers

The introduction of such smart agriculture places a financial, time and technical burden on farmers. It will be more difficult to learn than the introduction of agricultural machines was before, and it will be very difficult for those who are not used to data input and data analysis using personal computers, smartphones, and tablets.

Taste of vegetables grown in smart agriculture

And don’t forget the taste of crops. No matter how much smart agriculture increases efficiency and yields, consumers will not choose the food unless it is delicious.

Plant factories are one effort to obtain a stable, constant yield. On the other hand, vegetables grown in a plant factory under artificial light, such as LEDs, cannot be compared with food grown in full sunlight. There must be a taste difference.

There is no doubt that smart agriculture will reduce the burden of our agricultural work. However, it is a “means” for making delicious vegetables, not a “purpose”. I want to keep in mind that smart agriculture is a way to grow nutritious and delicious vegetables.


A wave of change is about to come to the world of agriculture, which has long been called on to eliminate human resource shortages and improve productivity. IT is the driving force. “Smart farming”, which uses IoT (Internet of Things) and the cloud to optimize the whole of farm management from production management to quality control, distribution, and sales, is drawing attention.

Internet of Things for Industry 4.0

Internet of Things or IoT simply refers to the connection of devices for data collection, analysis, and sharing. The building blocks of IoT are data-gathering devices, connectivity, data analytics capability, and user interface. An example of an IoT set-up is sensors and cameras equipped with wireless connectivity, whose data is then processed and accessed via the internet through smartphones.

These characteristics of IoT have led to the proliferation of smart devices in different fields of business and consumer life. Its main attraction is that internet-enabled machines tend to work without direct human intervention. Imagine controlling your appliances through your phone while overseeing your work from home, or directing your staff while on vacation. Life is made easy with smart devices. Connectivity is the asset of IoT devices that entices more consumers and businesses to take a leap from their traditional operations.

However, IoT has so much more to offer! One key feature of this technology is data analysis. While more consumers are likely satisfied with the direct service rendered by smart devices, IoT is moving slowly in the utilization of “data analytics” as a solution to real-world problems in the community as well as in industry. IoT through data analysis adds more value to the smart gadgets in place in homes, streets, cars, parking lots, agriculture, business centers, production lines, and the list goes on!

The characteristics of IoT can be integrated with Industry 4.0, wherein the main goal is to embed automation and data connectivity in the industrial ecosystem. This goal encourages initiatives in advanced manufacturing and operations optimization. Hence, IoT is a promising technology for Industry 4.0.

Industrial IoT

 According to de Backer, Mancini, and Sharma, Industry 4.0 converges the following technologies: machine inter-connectivity, analytics and intelligence, advanced production methods, and human-machine interaction in the manufacturing sector.

  • Machine connectivity involves machine-to-machine (M2M) communication, machine-to-sensor connectivity, and cloud technology.
  • Analytics and intelligence pertain to big data analytics and learning-based automation.
  • Advanced production methods include technology to improve manufacturing techniques.
  • Human-machine interaction includes interfaces such as touch interfaces, augmented-reality tools, and collaborative robots.

One major advantage of the emergence of IoT in industrial operations is the monitoring and adjusting of machines from a remote central control room. The processes involved in the control room are hardware data collection, software analysis and decisions, and remote user manipulation. The sensing hardware collects and monitors data. These current and historical data are analyzed and processed by software to build parameter-based models for decision-making.

These methods lead to labor efficiency, increased yield, yield tracking, throughput traceability, and quality enhancement. Beyond real-time, data-driven machine adjustments, a lot more can be done with the collected data. To name a few uses, data can inform preventive maintenance of machines, prediction of parts replacement, and analysis of lost productivity and profits.
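As a rough illustration of how historical readings can feed a parameter-based decision, the sketch below flags a machine for attention when its current reading drifts far from its own history. The function name, the threshold `k`, and the vibration values are all hypothetical; a real plant would choose its own model and parameters.

```python
from statistics import mean, stdev


def needs_attention(history, current, k=3.0):
    """Return True when `current` lies more than k standard deviations
    away from the mean of the historical readings."""
    if len(history) < 2:
        return False  # not enough history to estimate the spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current != mu
    return abs(current - mu) > k * sigma


# Hypothetical vibration readings from one machine (units are illustrative)
history = [2.1, 2.0, 2.2, 1.9, 2.1, 2.0]

print(needs_attention(history, 2.1))  # a typical reading
print(needs_attention(history, 3.5))  # a reading far outside the usual range
```

A check like this is only a starting point; it treats the history as stable and does not account for trends or seasonal load.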

Data analytics and processing of IoT-gathered information can be put to good use in Industry 4.0, but industrial IoT adoption has not accelerated in recent years. In fact, few large-scale industries are adopting IoT projects in their businesses. And for some companies that have adopted IoT in their operations, a report from the McKinsey Global Institute says that there is limited use of data from IoT devices for real-time deployments and large-scale projects. Business managers and leaders are reluctant to use IoT data analytics for reasons such as accountability issues, lack of capable staff, lack of trust in the decision-making process, and attachment to old-style methods and experience. IoT security concerns are also a challenge in the integration of IoT in industry, since complete security initiatives must be adopted to address its network vulnerabilities. Furthermore, the lack of IoT standards for seamless connectivity and transmission is a general concern that casts doubt on the adoption of IoT.

If there is one field that could fully adopt what IoT has to offer, it’s the manufacturing industry! Campaigns for the use of IoT in industrial applications have to be intensified, and innovations in implementation strategies should be adopted in order to convince manufacturers of the importance of IoT in the full implementation of Industry 4.0. IoT drivers should focus on developing and enhancing data analytics applications of IoT in finance, business operations, and convenience. Moreover, designs should be tailored to specific operations to create more usability in all departments and sectors. Technical and analytical training on IoT-associated technology should be initiated so that more of the workforce is involved in spreading this enormous technology. In this way, IoT will become a way of life in the manufacturing industry.

State of the Internet of Things in 2019

Nowadays, the Internet of Things is becoming a hot topic of conversation. This concept has the power to strongly impact how we live and work. IoT is a huge network of connected things. It describes a world where nearly anything can be connected and communicate in an intelligent manner.

In 1999, the term “Internet of Things” was used by Kevin Ashton during his work at Procter & Gamble, and it became widely accepted. Between 2008 and 2009, the Internet of Things was born.

The term Internet of Things refers to the network of physical devices connected to the Internet and/or to each other, containing electronics and sensors embedded within their architecture for collecting and transmitting data via the Internet.

With the Internet of Things, the physical world is becoming a significant information system. The goal of the Internet of Things is to extend internet connectivity from standard devices like laptops, smartphones, and iPads to comparatively simpler devices like a lamp or a toaster.

Components of an IoT system

There are four basic components of IoT systems which are involved in the working of IoT. These components are given below:

  • Sensors or Devices:

Sensors/devices are a key component that helps you collect data from the external environment. The data may have many levels of complexity. A device may have different types of sensors and may perform multiple tasks other than sensing.

For example, a smartphone has multiple sensors, like the camera, but the smartphone is not able to sense these things.

  • Connectivity:

All the data collected from the sensors/devices is sent to a cloud infrastructure. It is necessary to connect the sensors or devices to the cloud through a communication medium. These communication mediums could be Bluetooth, Wi-Fi, etc.

  • Data Processing:

After the data is collected and sent to the cloud, the software processes the collected data. The processing can be simple, like checking the temperature, or very complex, like identifying objects.

  • User Interface:

The data has now been processed into useful information. This information needs to be available to the end user. This can be done by triggering alarms on the user’s phone or by sending them notifications through email and/or text messages. The user may also want an interface to examine their IoT system.

For example, a user who has a camera installed in their office may want to access its video recordings. This can be achieved with the help of a web server.
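The four components can be pictured as stages of a small pipeline. The sketch below is purely illustrative: the function names, the in-memory list standing in for the cloud, and the 30-degree limit are assumptions made for the example, not part of any real IoT platform.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    device_id: str
    temperature_c: float


def read_sensor(device_id, value):
    """Sensor/device: capture a measurement from the environment."""
    return Reading(device_id=device_id, temperature_c=value)


def send_to_cloud(reading, cloud_store):
    """Connectivity: stand-in for a Wi-Fi or Bluetooth upload.
    Here the 'cloud' is just an in-memory list."""
    cloud_store.append(reading)


def process(cloud_store, limit_c=30.0):
    """Data processing: a simple check of temperature against a limit."""
    return [r for r in cloud_store if r.temperature_c > limit_c]


def notify(alerts):
    """User interface: build the messages that would be shown to the user."""
    return [f"{r.device_id}: {r.temperature_c:.1f} C exceeds limit" for r in alerts]


cloud = []
for device, value in [("office-1", 22.5), ("server-room", 34.0)]:
    send_to_cloud(read_sensor(device, value), cloud)

messages = notify(process(cloud))
print(messages)
```

In a real deployment each stage would run on different hardware (device, network, cloud service, phone app); keeping them as separate functions here only mirrors that separation.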

Applications of IoT

IoT applications are widely used in various companies across many industries. Some of these IoT applications are given below:

  • Parking Sensors

With the help of IoT technology, users can locate available parking spaces on their phones.

  • Activity Trackers

This application of IoT helps users capture their fitness-related activities, like heart rate patterns, distance walked, and calorie consumption.

  • Smart Outlets

IoT technology allows you to remotely turn any device on or off. It also allows you to track the energy level of a device. You can also get custom notifications on your smartphone.

  • Smart City

This offers all types of use cases, like traffic management, waste management, water distribution, etc.

Role of IoT in our daily lives

The Internet of Things is an important part of our daily lives. Following are some aspects that clarify the importance of IoT in our daily lives.

  • Comfort

IoT technology provides convenience to us. Our lives are being touched by the IoT. Let us take a few examples from daily life where IoT provides convenience:

  • Waking up to the sound of our smartphone alarm.
  • Ordering food from Amazon.

These examples are kinds of connections that are designed to improve our daily lives.

  • Maintaining Health

Your health analysis can now be done with the help of wearable technology that records health data to the IoT. Blood sugar level, blood pressure, and even skin allergies can now be detected and prevented with the help of smartphone applications. Such applications of the Internet of Things are upgrading our lives and prove to be more efficient in maintaining health.

  • Safety

IoT provides advanced safety measures. With the help of IoT, sensors can now be embedded in construction materials. You can mix concrete with sensors that provide data to the IoT. Factors like internal temperature and pressure can trigger real-time communication with the IoT. With the help of these sensors embedded in the concrete, issues can be reported to engineers before conditions become severe.

  • Locating lost things:

With the help of IoT, active communication systems are installed in suitcases when they are made. These systems connect your suitcases to tracking systems that can help you locate them. If you lose your luggage, the systems tell you where it is.

Impact of the Internet of Things on technology

IoT has greatly revolutionised the whole of technology. IoT has become an immensely important technology because it is the first real evolution of the Internet. It has produced many revolutionary applications that have the capability to improve the way people live and work. IoT has made the Internet sensory (temperature, pressure, vibration, light, moisture, stress), which allows us to become more proactive and less reactive.

The ATM was the first IoT object, as far back as 1974. In the coming few years, IoT will do wonders in every field of life, such as business and agriculture. By 2020, over 250,000 vehicles will be connected to the internet. With the IoT, there are limitless opportunities for business. According to estimates, the Internet of Things will add $10-$15 trillion to global GDP in the next 20 years.

IoT has already had a large impact on technology. Patients are using IoT devices on their own bodies, which help doctors to diagnose and determine the causes of specific diseases, and very small IoT sensors can be placed on plants, animals, and geologic features to gather specific details that can be used for their betterment.

Challenges of IoT

Along with its various advantages, IoT faces many challenges. Some of these are given below:

  • Data security and privacy issues.
  • Inadequate testing and updating.
  • Complexity of software.
  • Large data volumes.
  • Integration with AI and automation.
  • Constant power supply to the devices, which is rather difficult.
  • Short-range communication.

Considerations and recommendations to combat privacy risks

Privacy is a big issue associated with the Internet of Things, as billions of devices are being connected together. It is necessary to understand the use of IoT because many IoT devices affect privacy risks differently than conventional IT devices.

Given below are some privacy risk considerations that organizations should address throughout the lifecycle of their IoT devices, as these may affect how they manage privacy risks.

  • Interactions of a device with the physical world.
  • Access, management, and monitoring features of a device.
  • Availability, efficiency, and effectiveness of privacy capabilities.

These considerations help you to mitigate the privacy risk associated with your IoT devices. The reduction of privacy risks has the following goals:

  • To protect device security.
  • To protect data security.
  • To protect individuals’ privacy.

The privacy risk considerations may help you to reduce your privacy risks, but there are also many challenges that you might face while mitigating them. Following are some recommendations to tackle the challenges of privacy risk mitigation.

  • Understand privacy risk considerations of the device and the challenges they may cause to reduce the privacy risks.
  • Adjust the organizational policies and processes to address the privacy risk mitigation challenges.
  • Implement updated mitigation practices.

Guest post by Masoora Syed

How will your life change with Application of IoT?



One of the technological advances with the most potential in recent years is the “Internet of Things” or IoT. The possibility of connecting all kinds of objects to a network and creating connections between them to optimize our time is a field that is still developing. The creativity of programmers plays a key role in shaping our near future.

Although much remains to be invented, the Internet of Things is beginning to sneak into our lives. It is already a lot more present in our daily lives, even though we may not even think about it. 

How the Internet of Things Will Change Our Lives

The Internet of Things is no longer a term for the future that we speak of in the abstract; it is a current trend that will gradually change both our daily routine and our way of understanding technology.

Today we have smartphones, tablets, laptops, televisions … all of these are connected to the Internet. But the Internet of Things goes further; it means that the network reaches all things, that everything is connected to the Internet.

There are already refrigerators, ovens and washing machines that can be controlled from a smartphone thanks to their Internet connection, but that is only the first step of everything that the Internet of Things offers us.

Both professionally and domestically, the Internet of Things could change everything, from our way of working – automated tasks, to how we navigate in our homes.

What would it be like to have your toothbrush alert you to tooth decay and schedule your dentist appointment for you? Or to have your refrigerator inform you of food near its expiration date or of what is running low, and prompt you to stock up?

With the potential of the Internet of Things, our cities will also be smarter. Our cars could communicate with traffic signals that prompt a speed reduction or instruct manoeuvres, while the flow of movement among transport and people is analyzed to achieve the most efficient traffic management.

The Internet of Things is changing how we live.

Meaning and genesis of the “Internet of Things”

The Internet of Things (IoT) is a neologism used in telecommunications, a new term born from the need to give a name to real objects connected to the Internet. The meaning of IoT is best expressed with examples: IoT is a refrigerator that orders milk when it “notices” that it has run out. IoT is a house that turns on the heat as soon as it senses you arrive. These are examples of IoT, i.e. objects that, when connected to the network, allow the real and virtual worlds to be brought together.

The term IoT (“Internet of Things”) was used for the first time by Kevin Ashton, a researcher at MIT (Massachusetts Institute of Technology), where the standard for RFID (Radio-frequency Identification) and other sensors was founded. But even if the term is new, we have talked about these concepts for a long time, basically since the birth of the internet and the semantic web.

But what does IoT mean in practice? With the Internet of Things, we mean a set of technologies that allow you to connect any type of device to the Internet. The purpose of this type of solution is basically to monitor and control and transfer information and then perform subsequent actions.

In the city environment, for example, a detector located in a street can check the street lights and indicate if the lamp works, but the same light could, if properly equipped, also report information on air quality or on the presence of people.

Areas of application for consumers and businesses

IoT application areas

The main areas of application of the Internet of Things (for end consumers like you and me as well as for companies and manufacturing) are those contexts in which there are “things” that can “talk” and generate new information, such as:

  • Home smart home, home automation
  • Smart buildings, smart building, building automation
  • Industrial monitoring, Robotics, Collaborative Robotics
  • The automotive industry, automotive, self-driving car
  • Smart health, healthcare, the biomedical world
  • All areas of telemetry
  • All areas of surveillance and security
  • Smart city, smart mobility
  • New forms of digital payment through objects
  • Smart agrifood, precision farming, field sensors
  • Animal husbandry, wearable for animals

What will we see in the future? Smart cities, more agile efforts, long processes that will take place in a matter of seconds … Tomorrow is still to be invented and we are excited about the idea of being part of the change.

Present and future of the IoT

How many objects are connected?

Major research companies, such as Accenture among others, argue that more than 25 billion IoT devices will be reached by 2020.

Many operators in the sector believe that the number will be largely exceeded, and this already represents an extraordinary business opportunity for all operators in the sector.

Authors such as Adrian McEwen (with the book Designing the Internet of Things) talk about creativity and IoT, and how the next winning ideas and products will need to connect everyday objects with the internet and technology.

And as the spread of devices and sensors grows, so do the amount of data that will have to be managed and the number of applications that will have to be developed.

From this point of view, an important business opportunity can be foreseen in terms of spreading development platforms and also in terms of connectivity solutions, and in this respect we can already see strongly growing interest from telcos. Another fundamental area of growth is represented by system integrators and consulting companies.

IoT means integration and opens up very important perspectives in terms of reviewing company information systems. From this point of view too, the IoT will represent an important development opportunity.

After several years of curiosity and experimentation in the Internet of Things, people are finding that they have different degrees of application: the most consolidated realities, the experimental ones and the embryonic ones.

Examples and applications of the Internet of Things in real life

From the refrigerator at home to the clock, to the traffic lights, all of these can be considered examples of IoT. The important thing is that these objects are connected to the network and that they have the ability to transmit and receive data. In this way, these objects become “intelligent”, and can be activated and deactivated “on their own” as needed.

For example, in Switzerland, there are intelligent traffic lights, which turn green when they “see” that a vehicle is nearing traffic lights, and that there are no other passing vehicles. 

This, like others, is an example of how objects take on “life”, and how these objects can be connected to each other and to everyday real life. Here is the future described by Orwell and dystopian novels, in the present.

These connected devices and objects can, among other things, connect to data analysis software (for example, Google Universal Analytics) and in this way transmit data and information from real life directly to computers and analysis software, leading the way to Big Data.

There are many examples and applications of IoT, from smart cities to companies. 

Below are examples of the Internet of Things in everyday life.

1. Smart City

Smart cities (some call them sensitive cities) relate to urban planning strategies that improve the quality of life in the city to try to meet the needs of citizens.

The technologies adopted to create smart cities (or parts of them) allow relating infrastructures (objects) to fulfil the needs of the inhabitants of the city. Examples of this are intelligent traffic lights (which turn green when there are no cars running in the opposite direction) or innovative systems for waste management and disposal, other environmental, energy, mobility, communication and urban planning innovations.

The sectors in which there is the greatest interest are industry and public administration. The whole world of Smart Cities is accompanied by issues related to public administration projects and more strategic issues such as those related to Open Data.

2. Smart Building and Smart Home (connected houses and buildings)

The substantial difference between smart buildings and smart homes is that smart homes are aimed primarily at a “consumer” public or final users of services (examples include regulating the temperature of the home remotely, or sensors that detect people in the house).

Smart buildings are mainly aimed at B2B, that is, the construction and optimization of buildings and offices, to equip them with intelligent objects that interact with the internal environment (for example, light management and electricity).

The world of Smart Building continues on a double track: a component that looks mainly at the domestic world (smart homes) and is turning its attention towards the consumer world, and a professional component (smart building) that has now become the common heritage of development and design by designers and architects.

3. Smart Mobility

The theme of mobility is absolutely central in determining the quality of life in our cities and, as has been emphasized several times, there can be no Smart City if there is no Smart Mobility.

There are many companies investing heavily in this sector, in Smart Cars and Connected Cars but also in applications related to rail transport, with trains controlled by IoT. This opens up huge business opportunities.

4. Smart Agriculture

What is the impact of the Internet of Things on the environment?

Precision farming or Smart Agriculture is one of the sectors with the highest development opportunities for digitized solutions.

This is a sector that requires sensors for the environment and the land, for example applications for weather, automation of equipment for the increasingly precise management of water, fertilizers and crop protection products, which all need digital solutions.

The possibilities are vast and not only linked to the use of drones, as commonly seen, but also to sensors that complement the themes of IoT to innovate logistics and drive solutions for Smart Agriculture, agro-energy and operations that aim to improve the relationship between food and sustainability.

5. IoT and Public Administration: transport, energy, sustainability, waste, environment

Today, public administrations play a fundamental role in the development of technical affairs. Often technology is regulated, financed and managed by the public sector as seen with the development of Intelligent Transport Systems.

6. Smart Manufacturing or Industry 4.0

Smart Manufacturing is also called Industry 4.0.

Smart Manufacturing overlaps with the Industry 4.0 world, though work is still to be done to implement development policies to influence the introduction of digital in the industrial world that was born in Germany with the Industry 4.0 phenomenon. Industry 4.0 has also made ground in the United States with the phenomenon of the digital factory. 

Industry 4.0 or 4.0 Industry has real life research data that shows how this market is growing. At a rate of 20% growth, this is evidence of how the manufacturing industry is changing with IoT. 

From the Internet of Things to Smart Products

IoT and smart products or devices

The IoT is the starting point for the realization of connected products: objects that “network” their ability to collect information in any context.

When connected products are “connected” to a production system that is already operating, they give access to data that makes it possible to modify the PROCESSES. When a connected product is given processing abilities (even minimal ones), it becomes a connected and intelligent product. So we enter the Smart product range: SMART PRODUCT = CONNECTION + INTELLIGENCE.

But when can we say that Smart Products change companies and competition? When there is a SMART PRODUCT ECOSYSTEM: a network of intelligent products that put the result of their respective processing capacity online to create new services and new value for users.

In 2015 the article “How Smart, Connected Products Are Transforming Companies” by Michael E. Porter and James E. Heppelmann appeared in the Harvard Business Review (HBR), dedicated to the perspectives of the Internet of Things in terms of developing smart products that change companies, while the previous year the same authors published “How Smart, Connected Products Are Transforming Competition”, which analyzed the impact of smart product development on competition between companies.

The Internet of Things economy is realized not only when costs are reduced and efficiency is increased, but when new services are created for businesses and consumers.

Cybersecurity: from the IoT to the Internet Security of Things

If it is true that by 2021, as Ericsson’s forecasts support, we will have 1.5 billion Internet of Things devices with cellular connectivity, most of which with 5G solutions, new business will open for mobile operators linked to initial data collection, then to the data communication and, not least to the analytics data associated with the IoT.

As the number of connected devices increases, the data produced increases. Unfortunately, alongside scenarios that increase business and wealth opportunities, threats will also increase, bringing general security problems and problems associated with the protection of personal data.

To address the issue of security in a new way, interest is growing in edge data collection solutions that act as a security endpoint.

Cybersecurity and Internet of Things

The Internet of Things is effective and productive only if each system interacts with the others: integration requires control, scalability, flexibility, and efficiency. And the answer to this security question comes primarily from the data themselves, or rather from the ability to know more and more precisely the needs that underlie the projects, in order to design solutions that best achieve those objectives while minimizing exposure to all known risk factors.

The IoT is, on the one hand, a new source of threats but also a source of data and knowledge to reduce exposure to risks.

Connected objects and growth of the semantic web

In terms of implementing the simplest solution, a diffusion approach continues to show strong growth.

In December 2011 almost all the meters installed in Italy were smart and the objects connected to each other by the cellular network were 3.9 million: 10% more than the previous year.

Today the most relevant area is the Smart Car, the connected objects related to this technology are 43% of the total connected objects. It is expected that this percentage will rise next year, significantly with regards to the monitoring applications of cars for insurance and mobility info purposes.

It should also be mentioned that, with the Internet of Things, alongside issues such as privacy and security, terms such as the following come up:

  • IPv6 (successor to the IPv4 Internet Protocol) which simplifies configuration and management of IP networks. 
  • Cloud computing, the technology that allows you to save data in a virtual cloud, where the data can be accessed without having to be stored on a physical machine such as a desktop computer or a laptop.
  • Big Data, which has already been discussed- a large amount of data available now that the objects are connected and are communicating data on their use. This theme, in particular, raises doubts about the security deriving from the IoT, and on topics such as privacy and processing of personal data and sensitive data.

With the IoT towards Intelligent Manufacturing

Bringing intelligence to the manufacturing world means creating new value for businesses. Thanks to the Internet of Things and to Industry 4.0, digital becomes a tool that allows the improvement of products and processes and the development and implementation of new business models.

Thanks to an IoT that “communicates” with CRM and Data-Driven Enterprise projects, we can guarantee and develop the centrality of the customer experience, in a context in which more than 80% of consumers agree to pay a premium price in exchange for a better user experience.

IoT made industry automated

In addition, the product that turns into a service is “good for business” and brings new value, so much so that most manufacturers believe that the “product as a service” will help to increase profits. But to implement these scenarios, companies must not only be connected but also be able to support and spread the development of related products.

Here it is important to read and analyze the experiences of those who are in reality “doing” and are interpreting the digital transformation and serialization in the manufacturing world: Intelligent Manufacturing contributes towards the data economy, there are connected products and serialization thanks to IoT.


In conclusion, we observe that from utilities to healthcare, from production to public administration there are now many sectors and fields of work involved in the technological innovation of the IoT (Internet of Things), with different levels of maturity.

In the future, the wave of IoT will spread further, and it is expected that a society will come in which not only IT devices and home appliances but everything is connected to the network.

Text Classification with Deep Learning


Text classification is a supervised machine learning task where text documents are classified into different categories depending upon the content of the text. Some of the most common examples of text classification include sentiment analysis, spam or ham email detection, intent classification, public opinion mining, etc. Rule-based, machine learning and deep learning approaches have been developed for text classification. Deep learning approaches for text classification are further divided into two types: supervised learning and unsupervised learning. In this article, you will study a supervised deep learning technique for text classification.


The main goal of this article is to demonstrate how to perform text classification with deep learning using Python deep learning libraries. As an example, you will create a deep learning model capable of detecting spam messages. After reading this article you will be able to solve general text classification problems such as sentiment analysis, intent detection, news article classification, etc. with Python. You will also learn about word embeddings, which are used as input for a variety of natural language tasks such as topic modeling, text generation, chatbot development, etc. You will be using Python’s Keras library to implement your deep learning model.
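Before text can be passed to an embedding layer, each message has to be turned into a fixed-length list of integer word ids. The plain-Python sketch below illustrates that idea only; the helper names `build_vocab` and `encode` and the two sample messages are made up for this example and are not the Keras utilities used later in the article.

```python
def build_vocab(texts):
    """Map each distinct lowercase word to an integer id starting at 1.
    Id 0 is reserved for padding."""
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab


def encode(text, vocab, max_len):
    """Convert text to a list of ids of length max_len.
    Unknown words are skipped; short texts are padded with 0 at the end."""
    ids = [vocab[w] for w in text.lower().split() if w in vocab]
    ids = ids[:max_len]
    return ids + [0] * (max_len - len(ids))


# Hypothetical sample messages
messages = ["free prize now", "see you now"]
vocab = build_vocab(messages)
encoded = [encode(m, vocab, max_len=4) for m in messages]

print(vocab)
print(encoded)
```

Note that this sketch pads at the end of each sequence, whereas Keras's pad_sequences pads at the front by default; either convention works as long as it is applied consistently.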


The code provided in this article is written in Python 3, therefore you are expected to have at least an intermediate level of Python knowledge. The code has been executed and tested with Google Colaboratory which is an online deep learning platform. You can also run the code locally on your PC. You will need to install TensorFlow’s Keras library in order to execute the code. Here is a very good article on how to install TensorFlow Keras on Windows. Linux users can check out this tutorial. Basic knowledge of machine learning is also assumed.

Step 1: Problem Statement & Dataset

Given a set of text messages, we want to develop a deep learning model capable of telling which of the messages are spam. Machine learning and deep learning models learn from datasets. Therefore, we need a dataset that contains ham and spam messages. The dataset we are going to use to develop our deep learning model can be downloaded from this Kaggle link. Download the CSV file to your local computer or, if you are training your models on a cloud platform, upload the CSV file to that platform.

Step 2: Importing Libraries

Execute the following script to import the Python libraries required to execute the scripts in this article.

# -*- coding: utf-8 -*-
# Created on Sun Sep 22 11:48:17 2019
# @author: usman

# Data handling and text cleaning
import pandas as pd
import numpy as np
import re
import nltk
from nltk.corpus import stopwords
from numpy import array, asarray, zeros

# Keras text preprocessing utilities
from keras.preprocessing.text import one_hot, Tokenizer
from keras.preprocessing.sequence import pad_sequences

# Keras model and layers (all layers are imported from keras.layers)
from keras.models import Sequential
from keras.layers import Activation, Dropout, Dense, Flatten, LSTM, Embedding

# Train/test splitting and plotting
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

# Notebook magic (Jupyter/Colab only): render plots inline
%matplotlib inline

Step 3: Dataset Analysis

In this step, we will import the dataset into our application and will perform some exploratory data analysis. The read_csv method from the Pandas library can be used to import the dataset.

ham_spam = pd.read_csv("/content/drive/My Drive/datasets/ham_spam.csv")

We will remove all those rows where the dataset contains a null value.

ham_spam.dropna(inplace=True)

The shape attribute of the pandas dataframe can be used to view the shape of the dataset.

ham_spam.shape

In the output, you should see (5572, 2), which shows that the dataset contains 5572 rows and 2 columns. Let’s now see the first five rows of the dataset using the head method of the pandas dataframe.

ham_spam.head()

The output shows that the dataset contains two columns: Category and Message. The Category column contains information regarding whether a text message is spam or ham, while the Message column contains the actual text of a message.

Let’s plot a pie graph which shows the distribution of ham and spam messages in the dataset. Let’s first increase the plot size since the default plot size is a bit too small.

plot_size = plt.rcParams["figure.figsize"]
plot_size[0] = 8
plot_size[1] = 6
plt.rcParams["figure.figsize"] = plot_size

The following script plots the pie graph for the two categories of messages in the dataset.

ham_spam.Category.value_counts().plot(kind='pie', autopct='%1.0f%%', colors=['skyblue', 'orange'], explode=(0.05, 0.05))

The output shows that 87% of the messages are ham while only 13% of the messages are spam.
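The percentages in the pie graph can also be checked numerically. The following is a minimal sketch that uses only the standard library; the categories list is a made-up stand-in for the Category column and class_distribution is a hypothetical helper, so the numbers it prints are not those of the actual dataset.

```python
from collections import Counter

# Hypothetical stand-in for the Category column (not the real dataset).
categories = ["ham", "ham", "spam", "ham", "ham", "ham", "spam", "ham"]


def class_distribution(labels):
    """Return the share of each label as a percentage of all labels."""
    counts = Counter(labels)
    total = len(labels)
    return {label: 100.0 * count / total for label, count in counts.items()}


distribution = class_distribution(categories)
for label, share in sorted(distribution.items()):
    print(f"{label}: {share:.0f}%")
```

With pandas, ham_spam.Category.value_counts(normalize=True) should give the same kind of proportions directly on the dataframe.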

Step 4: Data Cleaning

Text messages contain multiple empty spaces, punctuation marks, special characters, etc. It is good practice to clean the text by removing special characters and punctuation before classification. Hence, we will write a function that accepts a text string as a parameter and returns a text string without punctuation and special characters.

def clean_text(mes):
    # Remove punctuation and numbers
    message = re.sub('[^a-zA-Z]', ' ', mes)
    # Remove single characters
    message = re.sub(r"\s+[a-zA-Z]\s+", ' ', message)
    # Remove multiple spaces generated due to single character removal
    message = re.sub(r'\s+', ' ', message)
    return message

The following script cleans the text messages in our dataset:

message_list = []
texts = list(ham_spam['Message'])
for text in texts:
    # Store the cleaned version of every message
    message_list.append(clean_text(text))
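To see what the three regular expressions in clean_text do, it can help to run the function on a single message. The sketch below repeats the function so that it runs on its own; the sample message is invented for illustration and is not taken from the dataset.

```python
import re


def clean_text(mes):
    # Replace every character that is not a letter with a space
    message = re.sub('[^a-zA-Z]', ' ', mes)
    # Drop single letters that are surrounded by whitespace
    message = re.sub(r"\s+[a-zA-Z]\s+", ' ', message)
    # Collapse runs of whitespace into a single space
    message = re.sub(r'\s+', ' ', message)
    return message


# Invented sample message, used only to illustrate the cleaning steps.
sample = "WINNER!! You won a prize of 1000 pounds. Call 0800-123 now!"
cleaned = clean_text(sample)
print(repr(cleaned))
```

Digits, punctuation, and the standalone letter "a" are removed, while the case of the remaining words is kept. A leading or trailing space can survive the last substitution, so calling strip() on the result may be worthwhile if exact string matching matters later.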


Step 6: Splitting Data into Training and Test Sets

In order to create robust deep learning models, the dataset is divided into training and test sets. The models are trained on the training set, and their performance is evaluated by making predictions on the test set. Ideally, a model should perform about equally well on both sets.

Deep learning algorithms work with numeric data. However, the categories in our dataset have two unique text values, i.e. ham and spam. We will replace ham with 0 and spam with 1. To do so, execute the following script:

labels = ham_spam['Category']
labels = np.array(list(map(lambda x: 1 if x == "spam" else 0, labels)))

The following script divides our data into training and test sets. 80% of the data will be used to train our deep learning model, while 20% of the data will be used for evaluating the model performance.

X_train, X_test, y_train, y_test = train_test_split(message_list, labels, stratify=labels, test_size=0.20, random_state=42)
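The stratify argument asks train_test_split to keep the ham/spam ratio the same in both splits. The sketch below illustrates that idea with the standard library only. It is not how scikit-learn implements the function; stratified_split and the toy data are hypothetical and exist only for this example.

```python
import random
from collections import defaultdict


def stratified_split(items, labels, test_size=0.2, seed=42):
    """Split items so that each label keeps roughly the same share in both parts."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for item, label in zip(items, labels):
        by_label[label].append(item)

    train, test = [], []
    for label, group in by_label.items():
        rng.shuffle(group)
        # Take the same fraction of every class for the test set
        n_test = round(len(group) * test_size)
        test += [(item, label) for item in group[:n_test]]
        train += [(item, label) for item in group[n_test:]]
    return train, test


# Toy data: 80 ham (label 0) and 20 spam (label 1) messages.
messages = [f"message {i}" for i in range(100)]
toy_labels = [0] * 80 + [1] * 20

train, test = stratified_split(messages, toy_labels)
print(len(train), len(test))
print(sum(label for _, label in test) / len(test))
```

Without stratification, a random 20% sample of an imbalanced dataset could end up with noticeably more or fewer spam messages than the training set, which makes the test accuracy harder to interpret.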

Step 7: Embedding Words to Numeric Vectors

In the previous section, you saw that we replaced the text labels with the binary digits 1 and 0. To train deep learning models, we also need to convert our text messages into a numeric format. There are different approaches for converting text to numbers, such as bag of words, TF-IDF, n-grams, etc. Furthermore, a variety of pretrained word embeddings also exist, such as Google’s Word2Vec and Stanford’s GloVe. We will be using Stanford’s GloVe word embeddings in this article.

First of all, we need to convert our text to a series of integers, depending upon the words in the text. To do so, we can use the Tokenizer class from the keras.preprocessing.text module.

tokenizer = Tokenizer(num_words=4000)
# Build the vocabulary from the training messages before converting them
tokenizer.fit_on_texts(X_train)
X_train = tokenizer.texts_to_sequences(X_train)
X_test = tokenizer.texts_to_sequences(X_test)

The Tokenizer class converts text to numbers. However, since text messages can be of different lengths, the integer sequences created by the Tokenizer class also vary in length. To create vectors of uniform length, we can specify a fixed length for all the vectors and truncate the vectors that are longer than that length. For vectors shorter than the specified length, we can add zeros at the end. This process is called padding. Padding can be implemented via the pad_sequences function from the keras.preprocessing.sequence module. Look at the following script:

vocabulary = len(tokenizer.word_index) + 1
sen_length = 200
X_train = pad_sequences(X_train, padding='post', maxlen=sen_length)
X_test = pad_sequences(X_test, padding='post', maxlen=sen_length)

The vocabulary variable contains the number of unique words seen by the tokenizer plus one, since index 0 is reserved for padding. The sentence length is 200, which means that each sentence will be represented by a numeric vector of 200 integers.

Next, we need to create a dictionary that contains words and their corresponding vector representations as specified by Stanford’s GloVe embeddings. The following script does that:

embeddings_dictionary = dict()

glove_file = open('E:/Datasets/Word Embeddings/glove.6B.100d.txt', encoding="utf8")
for line in glove_file:
    records = line.split()
    word = records[0]
    vector_dimensions = asarray(records[1:], dtype='float32')
    embeddings_dictionary[word] = vector_dimensions
glove_file.close()

Finally, we will create an embedding matrix where each row index corresponds to the integer value of a word as assigned by the `Tokenizer` class. Each row holds the GloVe representation of the corresponding word.

embedding_matrix = zeros((vocabulary, 100))

for word, index in tokenizer.word_index.items():
    embedding_vector = embeddings_dictionary.get(word)
    if embedding_vector is not None:
        embedding_matrix[index] = embedding_vector
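The whole step, from words to integer sequences to padded vectors to an embedding matrix, can be traced on a tiny example. The sketch below uses plain Python lists so that it runs without Keras or the GloVe file. The helper functions, the three training texts, and the two-dimensional toy_embeddings are all hypothetical stand-ins; the real Tokenizer and pad_sequences handle more cases (punctuation filters, num_words, and so on).

```python
from collections import Counter


def build_word_index(texts):
    """Map each word to an integer, most frequent first; 0 is kept for padding."""
    counts = Counter(word for text in texts for word in text.lower().split())
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}


def texts_to_sequences(texts, word_index):
    """Replace known words by their integer index and skip unknown words."""
    return [[word_index[w] for w in text.lower().split() if w in word_index]
            for text in texts]


def pad_post(sequence, maxlen):
    """Keep the last maxlen items, then append zeros up to length maxlen."""
    trimmed = sequence[-maxlen:]
    return trimmed + [0] * (maxlen - len(trimmed))


train_texts = ["free prize call now", "call me later", "free free offer"]
word_index = build_word_index(train_texts)
sequences = texts_to_sequences(train_texts, word_index)
padded = [pad_post(seq, maxlen=5) for seq in sequences]

# Hypothetical 2-dimensional "pretrained" vectors standing in for GloVe.
toy_embeddings = {"free": [0.1, 0.9], "call": [0.4, 0.2], "prize": [0.7, 0.7]}
dim = 2
vocabulary_size = len(word_index) + 1  # +1 because index 0 is the padding row

# Row i holds the vector of the word with index i; unknown words stay all zeros.
toy_matrix = [[0.0] * dim for _ in range(vocabulary_size)]
for word, index in word_index.items():
    vector = toy_embeddings.get(word)
    if vector is not None:
        toy_matrix[index] = vector

print(padded)
print(toy_matrix)
```

Words that have no pretrained vector keep an all-zero row, which is also what happens in the script above for any word that is missing from the GloVe file.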

Step 7: Training the Models

In the previous steps, we converted our data into the format required to train deep learning algorithms. Now it is time to train the model itself. Several deep learning models can be used for text classification. However, Long Short-Term Memory (LSTM) networks have shown excellent performance with sequential data. Since sentences are basically sequences of words, we will use an LSTM to create our text classification model. With Keras, an LSTM model can be implemented in fewer than 10 lines of code, as shown below:

model = Sequential()
embedding_layer = Embedding(vocabulary, 100, weights=[embedding_matrix], input_length=sen_length, trainable=False)
model.add(embedding_layer)
model.add(LSTM(16))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])

In the above script, we create an LSTM model with 16 neurons. The Embedding layer specifies the word embeddings that will be used to convert the integer-encoded input text to vectors. The activation function is sigmoid and the loss function is binary_crossentropy, since there are only two possible outputs. In the case of more than two output labels, you would use categorical_crossentropy. The performance metric is accuracy.

To train the model, you simply need to call the fit method and pass it our training set.

performance = model.fit(X_train, y_train, batch_size=128, epochs=10, verbose=1, validation_split=0.2)

And that’s pretty much it. The above script trains the model. When you run it, you should see the following output:

Train on 3565 samples, validate on 892 samples

Epoch 1/10
3565/3565 [==============================] - 11s 3ms/step - loss: 0.6523 - acc: 0.8715 - val_loss: 0.5963 - val_acc: 0.8430
Epoch 2/10
3565/3565 [==============================] - 10s 3ms/step - loss: 0.4672 - acc: 0.8715 - val_loss: 0.4498 - val_acc: 0.8430
Epoch 3/10
3565/3565 [==============================] - 10s 3ms/step - loss: 0.3841 - acc: 0.8715 - val_loss: 0.4374 - val_acc: 0.8430
Epoch 4/10
3565/3565 [==============================] - 10s 3ms/step - loss: 0.3837 - acc: 0.8715 - val_loss: 0.4366 - val_acc: 0.8430
Epoch 5/10
3565/3565 [==============================] - 9s 3ms/step - loss: 0.3837 - acc: 0.8715 - val_loss: 0.4386 - val_acc: 0.8430
Epoch 6/10
3565/3565 [==============================] - 10s 3ms/step - loss: 0.3838 - acc: 0.8715 - val_loss: 0.4393 - val_acc: 0.8430
Epoch 7/10
3565/3565 [==============================] - 10s 3ms/step - loss: 0.3841 - acc: 0.8715 - val_loss: 0.4386 - val_acc: 0.8430
Epoch 8/10
3565/3565 [==============================] - 10s 3ms/step - loss: 0.3837 - acc: 0.8715 - val_loss: 0.4404 - val_acc: 0.8430
Epoch 9/10
3565/3565 [==============================] - 10s 3ms/step - loss: 0.3841 - acc: 0.8715 - val_loss: 0.4422 - val_acc: 0.8430
Epoch 10/10
3565/3565 [==============================] - 9s 3ms/step - loss: 0.3839 - acc: 0.8715 - val_loss: 0.4382 - val_acc: 0.8430

Our model achieves a training accuracy of 87.15% and a validation accuracy of 84.30%. Both values stay the same across all ten epochs.
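The sigmoid activation and the binary cross-entropy loss used when compiling the model can be written out by hand to see what the loss values in the log measure. The following is a minimal sketch in plain Python; the raw scores are invented, and the eps clipping value is an assumption of this sketch, not a statement about what Keras does internally.

```python
import math


def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Average binary cross-entropy over a batch of labels and probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Hypothetical raw scores from the last layer for four messages.
scores = [-2.0, 0.0, 3.0, -1.0]
y_true = [0, 0, 1, 1]
probs = [sigmoid(s) for s in scores]
loss = binary_crossentropy(y_true, probs)
print([round(p, 3) for p in probs], round(loss, 3))
```

The loss is small when the predicted probability is close to the true label and grows quickly when the model is confidently wrong, which is why it can keep changing even when the accuracy stays flat.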

Step 8: Evaluate Model Performance

To evaluate the performance of the trained model on the test set, you can use the evaluate method.

score = model.evaluate(X_test, y_test, verbose=1)
print("Accuracy on Test Data:", score[1])

Here is the output:

1115/1115 [==============================] - 4s 4ms/step
Accuracy on Test Data: 0.8663677136994263

The output shows an accuracy of about 86.6%, which means that our model correctly predicts whether or not a message is spam for roughly 87 out of every 100 test messages.

Let’s now plot the training and validation loss and accuracy achieved while training the model.

import matplotlib.pyplot as plt

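# Accuracy on the training and validation data per epoch
plt.plot(performance.history['acc'])
plt.plot(performance.history['val_acc'])
plt.legend(['train', 'validate'], loc='upper left')
plt.show()

# Loss on the training and validation data per epoch
plt.plot(performance.history['loss'])
plt.plot(performance.history['val_loss'])
plt.legend(['train', 'validate'], loc='upper left')
plt.show()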

The output shows that the model achieves maximum accuracy in its first epoch (one epoch refers to one training cycle over the complete training set). The loss, on the other hand, drops during the first epochs and levels off from around the third epoch, as the training log also shows.
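Because the classes are imbalanced, it helps to compare any accuracy figure with a majority-class baseline, i.e. the accuracy obtained by always predicting ham. The sketch below shows the calculation with the standard library; majority_baseline_accuracy is a hypothetical helper, and the label counts are round numbers chosen to mirror the roughly 87/13 split reported earlier, not the actual test set.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    _, majority_count = counts.most_common(1)[0]
    return majority_count / len(labels)


# Hypothetical labels: 87 ham (0) and 13 spam (1).
baseline_labels = [0] * 87 + [1] * 13
baseline = majority_baseline_accuracy(baseline_labels)
print(f"Majority-class baseline: {baseline:.2%}")
```

If a model’s test accuracy is close to this baseline, it may be predicting the majority class for most inputs, so it is worth also checking precision and recall on the spam class before drawing conclusions from accuracy alone.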


Text classification is one of the most common natural language processing tasks. This article explained the basics of text classification with deep learning. You saw how to identify whether a text message is spam or ham. The trained deep learning model achieves an accuracy of about 86.6% on the test set without any parameter tuning. I would suggest that you change the parameters and see if you can get better results. Feel free to comment if you have something to say or want to ask a question.