E-Surveillance Alert Classification

Table of Contents

  1. Business Problem
  2. Problem Statement
  3. Real-world objectives and Constraints
  4. Mapping the real-world problem to ML problem
  5. Read the data
  6. Data Preprocessing
  7. Building Machine learning Model
  8. Conclusion

1. Business Problem

Description:

Prevent break-ins before they occur using IoT security cameras with built-in computer vision capabilities, reducing the need for human intervention. Automated security to safeguard and alert against threats from intrusion or fire using multi-capability sensors such as vibration, motion, smoke, fire, panic switches etc. Ensure the safety of both monetary and intellectual assets with round-the-clock surveillance and controlled access management.

2. Problem Statement

We are tasked with classifying each alert received from the various sensors (vibration, motion, smoke, fire, panic, and shutter/door sensor) as Critical, Normal, or Testing.

True_Normal

Testing the sensors (smoke, panic, fire), smoke alerts due to AC or UPS maintenance, cash loading, bank staff pressing the panic switch unknowingly…

True_Critical

Smoke, fire, network connection error, PIR, panic (fire or theft attempt in an ATM or bank)

Breaking the ATM machine (the vibration sensor will be activated)

Thieves showing weapons to bank staff (a staff member must press the panic switch to raise the alert)

Breaking the doors (in this case, the shutter sensor will send the alert)

False_Normal

When opening the bank, staff must enter the password to change the mode; otherwise, PIR and alarm alerts will be generated.

Some interesting reasons:

* During renovation work at the bank (smoke and fire sensors will be activated due to dust)

* During sanitization of the branch

* Cleaning or sweeping the branch

* Auditing (the auditor will check by burning papers inside the branch)

* Birthday celebration (candle smoke will generate the alert)

All images Credit: Google

False_Critical

No activity or sensor malfunction (alerts keep arriving). An alert is received from a critical sensor, such as smoke, fire, or PIR (motion detection), even though there is no activity.

3. Real-world/Business Objectives and Constraints

1. The cost of misclassification can be very high.

2. No strict latency concerns.
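
Because a missed critical alert costs far more than a false alarm, overall accuracy alone can hide the errors that matter most. The following is a minimal sketch, using made-up labels rather than this dataset, of how a confusion matrix and per-class recall from scikit-learn show which classes are being confused. The label names and the two short lists are assumptions for illustration only.

```python
from sklearn.metrics import confusion_matrix, recall_score

# Hypothetical true and predicted labels, for illustration only.
labels = ["Critical", "Normal", "Testing"]
y_true = ["Critical", "Critical", "Normal", "Normal", "Normal", "Testing"]
y_hat = ["Critical", "Normal", "Normal", "Normal", "Normal", "Testing"]

# Rows are true classes, columns are predicted classes, in `labels` order.
cm = confusion_matrix(y_true, y_hat, labels=labels)
print(cm)

# Recall per class: the fraction of each true class that was caught.
per_class_recall = recall_score(y_true, y_hat, labels=labels, average=None)
for name, value in zip(labels, per_class_recall):
    print(f"{name}: recall = {value:.2f}")
```

In this toy example five of six predictions are correct, yet only half of the critical alerts are caught, which is the kind of gap a single accuracy figure does not reveal.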

4. Mapping the real-world problem to an ML problem

Type of Machine Learning Problem

Supervised Learning:

It is a multi-class classification problem: for given sensor data, we need to classify the alert as Critical, Normal, or Testing.

Train and Test Construction

We build the train and test sets by randomly splitting the data in a 70:30 or 80:20 ratio; either works, as we have sufficient data points.
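
A purely random split can leave rare classes slightly under- or over-represented in the test set. One possible variant, sketched here on toy arrays that are not part of this dataset, is a stratified split using the `stratify` parameter of `train_test_split`, which keeps the class ratio the same in both partitions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data for illustration: 80 samples of class 0 and 20 of class 1.
X_demo = np.arange(100).reshape(-1, 1)
y_demo = np.array([0] * 80 + [1] * 20)

# stratify=y_demo keeps the 80:20 class ratio in both partitions.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.30, random_state=5, stratify=y_demo
)

print(np.bincount(y_tr))  # class counts in the training set
print(np.bincount(y_te))  # class counts in the test set
```

Whether stratification changes the results on this dataset is not something the article measures; the sketch only shows the mechanism.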

Importing Necessary Libraries

import numpy as np  # needed later for np.array
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib import rcParams
from matplotlib.cm import rainbow
%matplotlib inline
import warnings
warnings.filterwarnings('ignore')

5. Read Data

# error_bad_lines was removed in pandas 2.0; on_bad_lines='skip' (pandas >= 1.3) is the equivalent
data = pd.read_csv('Master - Ack Time Analysis From April-20 to August-20 - Details.csv', on_bad_lines='skip')

Showing the first 5 rows of the dataset

data.head(5)

Shape

data.shape

Output

(185687, 16)

Counting the output values for each category

data.Status.value_counts()

Output

True_Normal      110395
Testing           36442
False_False       23755
True_Critical     10505
FALSE              4590
Name: Status, dtype: int64

Features of the data set

data.columns

Output

Index(['DATE', 'SOL ID', 'Branch', 'SENSOR (As of Portal)', 'Region', 'STATE',
'SENSOR NAME (Standard)', ' EVENT', 'EVENT DATE AND TIME',
'AKNOWLEDGE STATUS', 'Confirmation', 'LOG ID', '2nd AKNOWLEDGE STATUS',
'Status', 'Month', 'Reason'],
dtype='object')

Number of distinct observations

data.nunique()

Output

DATE                         177
SOL ID                      1721
Branch                      1793
SENSOR (As of Portal)       1327
Region                         4
STATE                         22
SENSOR NAME (Standard)        19
EVENT                         14
EVENT DATE AND TIME        73423
AKNOWLEDGE STATUS           9864
Confirmation                  36
LOG ID                    185475
2nd AKNOWLEDGE STATUS      10078
Status                         5
Month                          6
Reason                        18
dtype: int64

6. Data Preprocessing

Removing the unnecessary features

p=data.drop(['LOG ID','SENSOR (As of Portal)','DATE','SENSOR NAME (Standard)', 'Month','EVENT DATE AND TIME','Reason'], axis=1)

Checking for NaN/null values

p.isna().sum()

Output

SOL ID                   0
Branch                   0
Region                   0
STATE                    0
EVENT                    0
AKNOWLEDGE STATUS        0
Confirmation             0
2nd AKNOWLEDGE STATUS    0
Status                   0
dtype: int64

Handling categorical data

Here we are converting categorical data to numerical data using a label encoder

from sklearn.preprocessing import LabelEncoder
# Encode every column to integer codes; the encoder is refit on each column
xm = p.apply(LabelEncoder().fit_transform)
xm
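
Applying one encoder across all columns discards the mapping from integer codes back to the original labels. A minimal sketch, on a toy frame whose `Region` values are invented and whose `Status` values copy the five labels from the value counts, of keeping one fitted `LabelEncoder` per column so that codes can be decoded later. The names `toy`, `encoders`, and `encoded` are hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy frame for illustration; only the Status labels come from the article.
toy = pd.DataFrame({
    "Region": ["North", "South", "North", "East", "West"],
    "Status": ["True_Normal", "Testing", "False_False", "True_Critical", "FALSE"],
})

# Keep one fitted encoder per column so codes can be decoded later.
encoders = {col: LabelEncoder().fit(toy[col]) for col in toy.columns}
encoded = pd.DataFrame(
    {col: encoders[col].transform(toy[col]) for col in toy.columns}
)

# classes_ is sorted, so the position of each label is its integer code.
print(list(encoders["Status"].classes_))
print(encoders["Status"].inverse_transform([3, 1]))
```

`LabelEncoder` assigns codes in sorted label order, so if the Status column holds exactly these five strings, code 3 would decode to True_Critical and code 1 to False_False.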

Splitting input features

X = xm.iloc[:, :-1]
X

Output feature

y = xm.iloc[:, 8]
y

Output

0         3
1 0
2 3
3 3
4 3
5 0
6 3
7 3
8 0
9 0
10 0
11 3
12 0
13 0
14 0
15 0
16 3
17 0
18 3
19 0
20 0
21 3
22 0
23 3
24 0
25 0
26 0
27 3
28 0
29 0
..
185657 1
185658 1
185659 1
185660 1
185661 1
185662 1
185663 1
185664 1
185665 1
185666 4
185667 1
185668 4
185669 4
185670 4
185671 4
185672 4
185673 1
185674 1
185675 4
185676 4
185677 4
185678 4
185679 4
185680 4
185681 4
185682 4
185683 4
185684 1
185685 1
185686 1
Name: Status, Length: 185687, dtype: int32

Split Train and Test data

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=5)
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)

Output:

(129980, 8)
(55707, 8)
(129980,)
(55707,)

7. Building machine learning model

1. Logistic Regression

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
data = p
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=5)
model = LogisticRegression()
model.fit(X_train, y_train)
print(model.score(X_test, y_test))    # test accuracy
print(model.score(X_train, y_train))  # train accuracy

Accuracy (test, then train):

0.5982192543127435
0.5926219418372057
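
One way to put this score in context is to compare it with a classifier that always predicts the most frequent class. The sketch below rebuilds a label vector from the Status counts printed earlier and scores scikit-learn's `DummyClassifier` on it. It uses the full-dataset counts, not the actual 30% test split, so the baseline on the real test set may differ slightly; `counts`, `y_all`, and `X_dummy` are names introduced only for this illustration.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Class counts copied from the Status value counts of the full dataset.
counts = {
    "True_Normal": 110395,
    "Testing": 36442,
    "False_False": 23755,
    "True_Critical": 10505,
    "FALSE": 4590,
}

# Rebuild a label vector with those counts; the features are irrelevant here.
y_all = np.repeat(list(counts.keys()), list(counts.values()))
X_dummy = np.zeros((len(y_all), 1))

# A classifier that always predicts the most frequent class.
baseline = DummyClassifier(strategy="most_frequent").fit(X_dummy, y_all)
baseline_accuracy = baseline.score(X_dummy, y_all)
print(f"Majority-class accuracy: {baseline_accuracy:.4f}")
```

The baseline works out to roughly 59.5%, which is close to the logistic regression score above and suggests that the model may be learning little beyond the majority class.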

2. K-Nearest Neighbours

from sklearn.neighbors import KNeighborsClassifier
# Constructor arguments are not shown in the original; defaults are assumed here
knn_classifier = KNeighborsClassifier()
knn_classifier.fit(X_train, y_train)
knn_predictions = knn_classifier.predict(X_test)
print(knn_classifier.score(X_test, y_test))

Accuracy

0.962123252015007

X_ = np.array([1255, 619, 3, 11, 6, 7872, 14, 8079])
y_pred = knn_classifier.predict([X_])
X1 = np.array([705, 16, 3, 16, 10, 2682, 1, 2748])
y_pred1 = knn_classifier.predict([X1])
print(y_pred)
print(y_pred1)

[3]
[1]

3. Random Forest

from sklearn.ensemble import RandomForestClassifier
rf_classifier = RandomForestClassifier()
rf_classifier.fit(X_train, y_train)
rf_predictions = rf_classifier.predict(X_test)
print(rf_classifier.score(X_test, y_test))

Accuracy:

0.9804871919148401
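
The random forest above is built with default settings and no fixed seed, so its score can vary slightly between runs. A small sketch, on synthetic stand-in data from `make_classification` rather than this dataset, of fixing `random_state` for repeatability and reading `feature_importances_` to see which inputs the forest relies on. The parameter values chosen here are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data: 8 features, 3 classes (not the article's dataset).
X_syn, y_syn = make_classification(
    n_samples=500, n_features=8, n_informative=4, n_redundant=0,
    n_classes=3, random_state=5,
)

# Fixing random_state makes the forest, and its score, repeatable.
forest = RandomForestClassifier(n_estimators=100, random_state=5)
forest.fit(X_syn, y_syn)

# Importances are non-negative and sum to 1 across features.
for idx, importance in enumerate(forest.feature_importances_):
    print(f"feature_{idx}: {importance:.3f}")
```

On the real data, the same attribute could be paired with `X.columns` to see which encoded columns contribute most.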

X_ = np.array([1255, 619, 3, 11, 6, 7872, 14, 8079])
y_pred = rf_classifier.predict([X_])
y_pred

Output

array([3])

8. Conclusion

print('\n                     Accuracy     Error')
print('                     ----------   --------')
print('Logistic Regression : {:.04}%       {:.04}%'.format(
    model.score(X_test, y_test) * 100,
    100 - (model.score(X_test, y_test) * 100)))
print('KNN                 : {:.04}%       {:.04}% '.format(
    knn_classifier.score(X_test, y_test) * 100,
    100 - (knn_classifier.score(X_test, y_test) * 100)))
print('Random Forest       : {:.04}%       {:.04}% '.format(
    rf_classifier.score(X_test, y_test) * 100,
    100 - (rf_classifier.score(X_test, y_test) * 100)))

Output

                     Accuracy     Error
                     ----------   --------
Logistic Regression : 59.82%       40.18%
KNN                 : 96.21%       3.788%
Random Forest       : 98.05%       1.951%

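Random Forest (98.05%) and KNN (96.21%) both reach high test accuracy, so either can be chosen to get the desired output, with Random Forest performing best. Logistic Regression (59.82%) is not suitable here.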

Happy learning…..

Sridhar
