Skin Cancer Detection using Convolution Neural Network(CNN)

Charan H U
5 min read · Dec 28, 2021


Build a deep learning model to classify a given query image into one of 7 different classes of skin cancer.


Skin cancer, the most common human malignancy, is primarily diagnosed visually, beginning with an initial clinical screening, potentially followed by dermoscopic analysis, a biopsy, and histopathological examination. Automated classification of skin lesions from images is a challenging task owing to the fine-grained variability in the appearance of skin lesions.

Table of Contents:

  1. Prerequisites
  2. About Data
  3. Importing Essential Libraries
  4. Loading data and Making labels
  5. Train Test Split
  6. Exploratory data analysis (EDA)
  7. CNN Model Architecture
  8. Model Building (CNN)
  9. Setting Optimizer & Annealing
  10. Fitting the model
  11. Model Evaluation
  12. Model Deployment
  13. Model Deployment Results
  14. Conclusion
  15. References

1. Prerequisites:

This post assumes you are familiar with the basics of data preprocessing, exploratory data analysis, performance metrics, machine learning, deep learning techniques such as CNNs, Python syntax, and libraries such as NumPy, Pandas, scikit-learn, Matplotlib, Seaborn, PrettyTable, TensorFlow, and Keras.

2. About Data:

2.1 Overview:

This dataset is a more interesting alternative to digit classification, and it can help get biology and medicine students excited about machine learning and image processing.

2.2 Original Data Source:

  1. Original Challenge: https://challenge2018.isic-archive.com
  2. Training of neural networks for automated diagnosis of pigmented skin lesions is hampered by the small size and lack of diversity of available datasets of dermatoscopic images.
  3. This is the HAM10000 (“Human Against Machine with 10000 training images”) dataset. It consists of 10015 dermatoscopic images, which are released as a training set for academic machine learning purposes and are publicly available through the ISIC archive. This benchmark dataset can be used for machine learning and for comparisons with human experts.
  4. Cases include a representative collection of all important diagnostic categories in the realm of pigmented lesions: actinic keratoses and intraepithelial carcinoma / Bowen’s disease (akiec), basal cell carcinoma (bcc), benign keratosis-like lesions (solar lentigines / seborrheic keratoses and lichen-planus like keratoses, bkl), dermatofibroma (df), melanoma (mel), melanocytic nevi (nv), and vascular lesions (angiomas, angiokeratomas, pyogenic granulomas and hemorrhage, vasc).

The dataset has 7 classes of pigmented skin lesions (not all of them malignant), listed below:

  1. Melanocytic nevi
  2. Melanoma
  3. Benign keratosis-like lesions
  4. Basal cell carcinoma
  5. Actinic keratoses
  6. Vascular lesions
  7. Dermatofibroma

3. Importing Essential Libraries:

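import warnings

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.models import Sequential
# BatchNormalization and Dropout appear in the model summary in section 8
from tensorflow.keras.layers import (
    BatchNormalization, Conv2D, Dense, Dropout, Flatten, MaxPool2D,
)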

4. Loading data and Making labels:
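
The loading code did not survive in this post. As a minimal sketch, assuming the data comes from the Kaggle HAM10000 pixel table (one row per 28×28 RGB image, with the pixel values in separate columns plus an integer `label` column), it could be read with pandas. The file name `hmnist_28_28_RGB.csv` and the helper `load_pixel_table` are assumptions for illustration, not taken from the original notebook.

```python
import io

import pandas as pd


def load_pixel_table(source):
    """Read a pixel CSV (path or file-like object) and check for a 'label' column."""
    df = pd.read_csv(source)
    if "label" not in df.columns:
        raise ValueError("expected a 'label' column holding the class index")
    return df


# Tiny in-memory stand-in for the real file, so the sketch runs anywhere.
demo_csv = io.StringIO("pixel0000,pixel0001,label\n12,200,4\n35,90,6\n")
df = load_pixel_table(demo_csv)
print(df.shape)            # (2, 3)
print(df.label.unique())   # [4 6]

# With the real data (assumed file name):
# df = load_pixel_table("hmnist_28_28_RGB.csv")
```

The explicit check on the `label` column fails early if a different CSV from the dataset (for example the metadata file) is loaded by mistake.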

5. Train Test Split:

>>> df.label.unique()
array([4, 6, 2, 5, 0, 1, 3])

# reference: https://www.kaggle.com/kmader/skin-cancer-mnist-ham10000/discussion/183083
classes = {
    0: ('akiec', 'actinic keratoses and intraepithelial carcinomae'),
    1: ('bcc', 'basal cell carcinoma'),
    2: ('bkl', 'benign keratosis-like lesions'),
    3: ('df', 'dermatofibroma'),
    4: ('nv', 'melanocytic nevi'),
    5: ('vasc', 'pyogenic granulomas and hemorrhage'),
    6: ('mel', 'melanoma'),
}
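# Separate the target column from the pixel features
y_train = train_set['label']
x_train = train_set.drop(columns=['label'])

y_test = test_set['label']
x_test = test_set.drop(columns=['label'])

columns = list(x_train)

# Check whether a CUDA GPU is visible (this check uses PyTorch)
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)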
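
The snippet above uses `train_set` and `test_set` without showing how they were created. A minimal sketch of one way to produce them with scikit-learn follows. The `test_size`, `random_state`, and the use of `stratify` are assumed values rather than the author's settings, and the DataFrame is a synthetic stand-in for the real pixel table.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the pixel table: 100 rows, 4 pixel columns, 7 classes.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.integers(0, 256, size=(100, 4)),
    columns=[f"pixel{i:04d}" for i in range(4)],
)
df["label"] = np.arange(100) % 7

# test_size and random_state are assumptions, not values from the post.
# stratify keeps the class proportions similar in both splits.
train_set, test_set = train_test_split(
    df, test_size=0.2, random_state=42, stratify=df["label"]
)
print(train_set.shape, test_set.shape)   # (80, 5) (20, 5)
```

Splitting before any oversampling matters: duplicated rows should only ever appear in the training part, otherwise copies of the same image can leak into the test set.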

6. Exploratory data analysis (EDA):

import seaborn as sns

# Class distribution before balancing
sns.countplot(x=train_set['label'])

After random oversampling:

from imblearn.over_sampling import RandomOverSampler

# Duplicate minority-class rows at random until all classes are the same size
oversample = RandomOverSampler()
x_train, y_train = oversample.fit_resample(x_train, y_train)

sns.countplot(x=y_train)
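
After oversampling, each row is still a flat vector of pixel values, while the CNN summarized in section 8 takes 28×28 images with 3 channels. A minimal sketch of that conversion is below. The helper name `to_image_tensor`, the division by 255, and the assumption that pixels are stored row by row with channels last are illustrative and not confirmed by the original notebook.

```python
import numpy as np


def to_image_tensor(flat_pixels, height=28, width=28, channels=3):
    """Reshape (n, h*w*c) pixel rows into (n, h, w, c) float32 images in [0, 1]."""
    arr = np.asarray(flat_pixels, dtype=np.float32)
    return arr.reshape(-1, height, width, channels) / 255.0


# Stand-in for the oversampled x_train: 5 random flattened 28x28 RGB images.
rng = np.random.default_rng(0)
x_flat = rng.integers(0, 256, size=(5, 28 * 28 * 3))
x_img = to_image_tensor(x_flat)
print(x_img.shape)   # (5, 28, 28, 3)
```

The same function would be applied to `x_test`, so that training and evaluation see identically scaled inputs.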

7. CNN Model Architecture:

[Figure: CNN model architecture]

8. Model Building (CNN):
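
The model-building code is not shown in the post, only the summary it printed. The sketch below reproduces the layer sequence, output shapes, and parameter counts of that summary. The kernel size (3×3), the padding of the first convolution, and the 3-channel input are inferred from those counts. The ReLU activations, the softmax output, and the dropout rate are assumptions, and `build_model` is a hypothetical helper name.

```python
from tensorflow.keras import Input
from tensorflow.keras.layers import (
    BatchNormalization, Conv2D, Dense, Dropout, Flatten, MaxPool2D,
)
from tensorflow.keras.models import Sequential


def build_model(num_classes=7, dropout_rate=0.2):
    """CNN matching the summary's shapes; activations and dropout rate are assumed."""
    return Sequential([
        Input(shape=(28, 28, 3)),
        Conv2D(16, (3, 3), padding="same", activation="relu"),  # 28x28x16
        MaxPool2D(),                                            # 14x14x16
        BatchNormalization(),
        Conv2D(32, (3, 3), activation="relu"),                  # 12x12x32
        Conv2D(64, (3, 3), activation="relu"),                  # 10x10x64
        MaxPool2D(),                                            # 5x5x64
        BatchNormalization(),
        Conv2D(128, (3, 3), activation="relu"),                 # 3x3x128
        Conv2D(256, (3, 3), activation="relu"),                 # 1x1x256
        Flatten(),
        Dropout(dropout_rate),
        Dense(256, activation="relu"),
        BatchNormalization(),
        Dropout(dropout_rate),
        Dense(128, activation="relu"),
        BatchNormalization(),
        Dense(64, activation="relu"),
        BatchNormalization(),
        Dropout(dropout_rate),
        Dense(32, activation="relu"),
        BatchNormalization(),
        Dense(num_classes, activation="softmax"),
    ])


model = build_model()
model.summary()
```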

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_15 (Conv2D)           (None, 28, 28, 16)        448
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 14, 14, 16)        0
_________________________________________________________________
batch_normalization_18 (Batc (None, 14, 14, 16)        64
_________________________________________________________________
conv2d_16 (Conv2D)           (None, 12, 12, 32)        4640
_________________________________________________________________
conv2d_17 (Conv2D)           (None, 10, 10, 64)        18496
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 5, 5, 64)          0
_________________________________________________________________
batch_normalization_19 (Batc (None, 5, 5, 64)          256
_________________________________________________________________
conv2d_18 (Conv2D)           (None, 3, 3, 128)         73856
_________________________________________________________________
conv2d_19 (Conv2D)           (None, 1, 1, 256)         295168
_________________________________________________________________
flatten_3 (Flatten)          (None, 256)               0
_________________________________________________________________
dropout_9 (Dropout)          (None, 256)               0
_________________________________________________________________
dense_15 (Dense)             (None, 256)               65792
_________________________________________________________________
batch_normalization_20 (Batc (None, 256)               1024
_________________________________________________________________
dropout_10 (Dropout)         (None, 256)               0
_________________________________________________________________
dense_16 (Dense)             (None, 128)               32896
_________________________________________________________________
batch_normalization_21 (Batc (None, 128)               512
_________________________________________________________________
dense_17 (Dense)             (None, 64)                8256
_________________________________________________________________
batch_normalization_22 (Batc (None, 64)                256
_________________________________________________________________
dropout_11 (Dropout)         (None, 64)                0
_________________________________________________________________
dense_18 (Dense)             (None, 32)                2080
_________________________________________________________________
batch_normalization_23 (Batc (None, 32)                128
_________________________________________________________________
dense_19 (Dense)             (None, 7)                 231
=================================================================
Total params: 504,103
Trainable params: 502,983
Non-trainable params: 1,120
_________________________________________________________________
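
The parameter counts in the summary can be checked by hand. The short calculation below assumes 3×3 kernels and a 3-channel input, which is what the counts themselves imply. A convolution has kernel × kernel × input channels weights per filter plus one bias, a dense layer has inputs × outputs weights plus one bias per output, and batch normalization keeps four values per feature, two of which (the moving mean and variance) are not trained.

```python
def conv2d_params(kernel, in_ch, out_ch):
    """Weights (kernel * kernel * in_ch per filter) plus one bias per filter."""
    return kernel * kernel * in_ch * out_ch + out_ch


def dense_params(in_units, out_units):
    """One weight per input-output pair plus one bias per output unit."""
    return in_units * out_units + out_units


def batchnorm_params(features):
    """gamma and beta (trainable) plus moving mean and variance (non-trainable)."""
    return 4 * features


conv = [conv2d_params(3, i, o)
        for i, o in [(3, 16), (16, 32), (32, 64), (64, 128), (128, 256)]]
dense = [dense_params(i, o)
         for i, o in [(256, 256), (256, 128), (128, 64), (64, 32), (32, 7)]]
bn = [batchnorm_params(f) for f in [16, 64, 256, 128, 64, 32]]

total = sum(conv) + sum(dense) + sum(bn)
non_trainable = sum(bn) // 2
print(conv)    # [448, 4640, 18496, 73856, 295168]
print(dense)   # [65792, 32896, 8256, 2080, 231]
print(total, total - non_trainable, non_trainable)   # 504103 502983 1120
```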

9. Setting Optimizer & Annealing:
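
The code for this step is missing from the post. A common way to combine an optimizer with learning-rate annealing in Keras is Adam plus the `ReduceLROnPlateau` callback. The sketch below shows that pattern under assumed settings; the learning rate, factor, patience, and loss are placeholders, not the author's values. The tiny model is only a stand-in so the snippet runs on its own.

```python
import tensorflow as tf
from tensorflow.keras.callbacks import ReduceLROnPlateau

# Assumed settings; the original values are not shown in the post.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)

# Annealing: halve the learning rate when validation accuracy stops improving.
lr_annealer = ReduceLROnPlateau(
    monitor="val_accuracy", factor=0.5, patience=3, min_lr=1e-5, verbose=1
)

# Tiny stand-in model so the sketch is self-contained; use the real CNN instead.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 3)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(7, activation="softmax"),
])

# Integer labels (0..6) pair with the sparse categorical cross-entropy loss.
model.compile(
    optimizer=optimizer,
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```

The callback only takes effect when it is passed to `model.fit(..., callbacks=[lr_annealer])` together with validation data, because it monitors a validation metric.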

10. Fitting the model:
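
The training call is also not shown. As a minimal sketch, fitting a Keras model on image tensors with integer labels looks like the following. The data here is random, the model is a small stand-in, and `epochs`, `batch_size`, and `validation_split` are assumed values chosen only to keep the example fast.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-ins for the image tensors and integer labels.
rng = np.random.default_rng(0)
x_train = rng.random((64, 28, 28, 3), dtype=np.float32)
y_train = rng.integers(0, 7, size=64)

# Small stand-in model; in the real pipeline this would be the CNN from section 8.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 3)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(7, activation="softmax"),
])
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# history.history holds one value per epoch for each tracked metric.
history = model.fit(
    x_train, y_train,
    validation_split=0.25, epochs=2, batch_size=16, verbose=0,
)
print(sorted(history.history))
```

The returned `history` object is what the accuracy and loss curves in the evaluation section are drawn from.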

11. Model Evaluation:

[Figure: accuracy vs. epoch]
[Figure: loss vs. epoch]
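
Curves like these can be drawn from the per-epoch metrics that Keras records during training. The sketch below assumes a history dictionary with the keys `accuracy`, `val_accuracy`, `loss`, and `val_loss`; `plot_history` is a hypothetical helper, and the numbers passed to it are placeholders to exercise the function, not results from this project.

```python
import matplotlib.pyplot as plt


def plot_history(history):
    """Plot accuracy and loss curves from a Keras-style history dict."""
    fig, (ax_acc, ax_loss) = plt.subplots(1, 2, figsize=(10, 4))
    epochs = range(1, len(history["loss"]) + 1)

    ax_acc.plot(epochs, history["accuracy"], label="train")
    ax_acc.plot(epochs, history["val_accuracy"], label="validation")
    ax_acc.set_title("Accuracy vs. epoch")
    ax_acc.set_xlabel("epoch")
    ax_acc.legend()

    ax_loss.plot(epochs, history["loss"], label="train")
    ax_loss.plot(epochs, history["val_loss"], label="validation")
    ax_loss.set_title("Loss vs. epoch")
    ax_loss.set_xlabel("epoch")
    ax_loss.legend()
    return fig


# Placeholder values only; with a trained model pass history.history instead.
demo = {
    "accuracy": [0.4, 0.6, 0.7],
    "val_accuracy": [0.35, 0.55, 0.6],
    "loss": [1.6, 1.1, 0.8],
    "val_loss": [1.7, 1.2, 1.0],
}
fig = plot_history(demo)
```

A widening gap between the training and validation curves is the usual sign of overfitting, which is worth watching for when the training set contains oversampled duplicates.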

Confusion Matrix:
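
The confusion matrix itself was an image in the original post. A minimal sketch of how such a matrix can be computed with scikit-learn is shown below. The label arrays are small made-up examples, not the model's actual predictions.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Stand-in labels; with a trained model:
#   y_pred = model.predict(x_test).argmax(axis=1)
y_true = np.array([0, 1, 2, 3, 4, 5, 6, 4, 4, 6])
y_pred = np.array([0, 1, 2, 3, 4, 5, 4, 4, 6, 6])

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=list(range(7)))
print(cm)

# Per-class recall: correct predictions divided by the true count of each class.
recall = cm.diagonal() / cm.sum(axis=1)
print(recall.round(2))
```

For an imbalanced dataset such as this one, per-class recall is more informative than overall accuracy, because a model can score well overall while missing most examples of a rare class.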

12. Model Deployment:

The model is deployed to Heroku cloud through Git/GitHub.

The required files are:

Of the files mentioned above, tester.png, LICENSE, model.png, and model_architecture.png are optional.
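
Whatever web framework serves the model, an uploaded image has to be brought into the shape the network was trained on before prediction. The sketch below shows that step with Pillow and NumPy. The function names `preprocess` and `decode` are hypothetical, the scaling to [0, 1] is an assumption that must match whatever was done at training time, and the probability vector is a placeholder for `model.predict` output.

```python
import numpy as np
from PIL import Image

# Class codes in label order, taken from the classes dictionary in section 5.
CLASS_NAMES = ["akiec", "bcc", "bkl", "df", "nv", "vasc", "mel"]


def preprocess(image, size=(28, 28)):
    """Convert a PIL image to a (1, 28, 28, 3) float32 batch scaled to [0, 1]."""
    rgb = image.convert("RGB").resize(size)
    return np.asarray(rgb, dtype=np.float32)[np.newaxis, ...] / 255.0


def decode(probabilities):
    """Map a (1, 7) probability row to (class code, confidence)."""
    idx = int(np.argmax(probabilities[0]))
    return CLASS_NAMES[idx], float(probabilities[0][idx])


# Solid-colour stand-in for an uploaded photo.
batch = preprocess(Image.new("RGB", (600, 450), color=(180, 120, 100)))
print(batch.shape)   # (1, 28, 28, 3)

# Placeholder output; with a trained model: probs = model.predict(batch)
probs = np.array([[0.05, 0.05, 0.1, 0.05, 0.6, 0.05, 0.1]])
print(decode(probs))   # ('nv', 0.6)
```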

13. Model Deployment Results:

You can try the deployed app with your own skin lesion images here:

URL to deployment: https://skin-cancer-detection-cnn.herokuapp.com/

Home page:

Result Page:

14. Conclusion:

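This model is not robust across all skin images, because it was not trained on a sufficiently large set of images with equal representation of each class. Because the classes were balanced by random oversampling rather than with additional real images, it may give wrong predictions for some images.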

15. References:

  1. https://www.kaggle.com/sid321axn/step-wise-approach-cnn-model-77-0344-accuracy
  2. https://www.kaggle.com/kmader/skin-cancer-mnist-ham10000

You can reach me at:

GitHub Repository Link: https://github.com/charanhu/Google-Analytics-Customer-Revenue-Prediction

LinkedIn: https://www.linkedin.com/in/charanhu/

GitHub: https://github.com/charanhu
