COVID-19 India Analysis

By Niharika P, Fri 08 May 2020, in category Corona virus

covid-19 data analysis python

Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus.

Most people infected with the COVID-19 virus will experience mild to moderate respiratory illness and recover without requiring special treatment. Older people, and those with underlying medical problems like cardiovascular disease, diabetes, chronic respiratory disease, and cancer are more likely to develop serious illness.

The best way to prevent and slow down transmission is to be well informed about the COVID-19 virus, the disease it causes, and how it spreads. Protect yourself and others from infection by washing your hands or using an alcohol-based rub frequently, and by not touching your face.

CORONA VIRUS GIF

This notebook analyzes the spread of the coronavirus in India.

COVID-19 cases overview (Worldwide)

In [1]:
import plotly.graph_objects as go
import plotly.offline as py

# Use `hole` to create a donut-like pie chart
values=[4100000, 1400000, 282000]
labels=['Confirmed',"Recovered","Deaths"]
fig = go.Figure(data=[go.Pie(labels=labels, values=values, hole=.3)])
fig.update_traces(hoverinfo='label+percent', textinfo='value',textfont_size=15,
                  marker=dict(colors=['#00008b','#fffdd0'], line=dict(color='#FFFFFF', width=2.5)))
fig.update_layout(
    title='COVID-19 ACTIVE CASES VS CURED WORLDWIDE')
py.iplot(fig)

COVID-19 cases overview (India)

In [2]:
# Use `hole` to create a donut-like pie chart
values=[67152, 20917, 2206]
labels=['Confirmed',"Recovered","Deaths"]
fig = go.Figure(data=[go.Pie(labels=labels, values=values, hole=.3)])
fig.update_traces(hoverinfo='label+percent', textinfo='value',textfont_size=15,
                  marker=dict(colors=['#DAA520','#800000'], line=dict(color='#FFFFFF', width=2.5)))
fig.update_layout(
    title='COVID-19 ACTIVE CASES VS CURED INDIA')
py.iplot(fig)
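
The slices above plot confirmed, recovered and death counts, and the confirmed count already includes the other two. A minimal sketch, assuming active cases are confirmed minus recovered minus deaths (the definition used for the "Active cases" column later in this notebook), splits the India figures into non-overlapping parts. The variable names are illustrative.

```python
# India figures taken from the cell above.
confirmed, recovered, deaths = 67152, 20917, 2206

# Assumed definition: active = confirmed - recovered - deaths.
active = confirmed - recovered - deaths

# Non-overlapping categories that together add up to the confirmed count.
breakdown = {"Active": active, "Recovered": recovered, "Deaths": deaths}
for label, count in breakdown.items():
    print(f"{label}: {count} ({count / confirmed:.1%})")
```

Passing the values of `breakdown` to `go.Pie` instead of the raw counts would make each slice a true share of the confirmed cases.
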
In [3]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in 

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# Any results you write to the current directory are saved as output.
/kaggle/input/covid19-corona-virus-india-dataset/complete.csv
/kaggle/input/covid19-corona-virus-india-dataset/nation_level_daily.csv
/kaggle/input/covid19-corona-virus-india-dataset/tests_latest_state_level.csv
/kaggle/input/covid19-corona-virus-india-dataset/web_scraping.ipynb
/kaggle/input/covid19-corona-virus-india-dataset/patients_data.csv
/kaggle/input/covid19-corona-virus-india-dataset/tests_daily.csv
/kaggle/input/covid19-corona-virus-india-dataset/district_level_latest.csv
/kaggle/input/covid19-corona-virus-india-dataset/zones.csv
/kaggle/input/covid19-corona-virus-india-dataset/state_level_latest.csv
/kaggle/input/covid19-corona-virus-india-dataset/api.ipynb
/kaggle/input/covid19indiazones/India-District-Zones.csv
/kaggle/input/geojson-maharashta/maharashtra_district.json
/kaggle/input/indian-state-geojson-data/india_state_geo.json
/kaggle/input/covid19-in-india/ICMRTestingDetails.csv
/kaggle/input/covid19-in-india/StatewiseTestingDetails.csv
/kaggle/input/covid19-in-india/HospitalBedsIndia.csv
/kaggle/input/covid19-in-india/covid_19_india.csv
/kaggle/input/covid19-in-india/population_india_census2011.csv
/kaggle/input/covid19-in-india/ICMRTestingLabs.csv
/kaggle/input/covid19-in-india/IndividualDetails.csv
/kaggle/input/covid19-in-india/AgeGroupDetails.csv

Geospatial Analysis India

In [4]:
import pandas as pd
pd.plotting.register_matplotlib_converters()
import matplotlib.pyplot as plt
%matplotlib inline
import geopandas
import seaborn as sns
In [5]:
data=pd.read_csv("../input/covid19-corona-virus-india-dataset/complete.csv")

data.tail()
Out[5]:
Date Name of State / UT Total Confirmed cases (Indian National) Total Confirmed cases ( Foreign National ) Cured/Discharged/Migrated Latitude Longitude Death Total Confirmed cases
1816 2020-05-11 Telengana 0 0 750 18.1124 79.0193 30 1196
1817 2020-05-11 Tripura 0 0 2 23.9408 91.9882 0 150
1818 2020-05-11 Uttar Pradesh 0 0 1653 26.8467 80.9462 74 3467
1819 2020-05-11 Uttarakhand 0 0 46 30.0668 79.0193 1 68
1820 2020-05-11 West Bengal 0 0 417 22.9868 87.8550 185 1939
In [6]:
import json
import folium

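# Select several columns with a list (double brackets); tuple-style selection is no longer supported by pandas
statecases=data.groupby('Name of State / UT')[['Total Confirmed cases','Death','Cured/Discharged/Migrated']].max().reset_index()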

with open('/kaggle/input/indian-state-geojson-data/india_state_geo.json') as file:
    geojsonData = json.load(file)

for i in geojsonData['features']:
    if(i['properties']['NAME_1']=='Orissa'):
        i['properties']['NAME_1']='Odisha'
    elif(i['properties']['NAME_1']=='Uttaranchal'):
        i['properties']['NAME_1']='Uttarakhand'
        
for i in geojsonData['features']:
    i['id'] = i['properties']['NAME_1']
    

map_choropleth = folium.Map(location = [20.5937,78.9629], zoom_start = 4)

folium.Choropleth(geo_data=geojsonData,
                 data=statecases,
                 name='CHOROPLETH',
                 key_on='feature.id',
                 columns = ['Name of State / UT','Total Confirmed cases'],
                 fill_color='YlOrRd',
                 fill_opacity=0.7,
                 line_opacity=0.8,
                 legend_name='Confirmed Cases',
                 highlight=True).add_to(map_choropleth)

folium.LayerControl().add_to(map_choropleth)
display(map_choropleth)
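
The renaming loop above can also be written with a lookup table, which makes it easier to spot state names that still fail to match between the GeoJSON and the case data. A minimal sketch on a toy GeoJSON-like dictionary; `RENAMES`, `normalise_names`, and the sample features are hypothetical stand-ins, not part of the original notebook.

```python
# Old GeoJSON names mapped to the names used in the case data.
RENAMES = {"Orissa": "Odisha", "Uttaranchal": "Uttarakhand"}


def normalise_names(geojson, renames):
    """Rename NAME_1 in place and copy it to the feature id."""
    for feature in geojson["features"]:
        props = feature["properties"]
        props["NAME_1"] = renames.get(props["NAME_1"], props["NAME_1"])
        feature["id"] = props["NAME_1"]
    return geojson


# Toy stand-in for india_state_geo.json (geometry omitted).
toy_geojson = {"features": [
    {"properties": {"NAME_1": "Orissa"}},
    {"properties": {"NAME_1": "Uttaranchal"}},
    {"properties": {"NAME_1": "Kerala"}},
]}
normalise_names(toy_geojson, RENAMES)

geo_names = {f["id"] for f in toy_geojson["features"]}
# Illustrative state names as they might appear in the case data.
data_names = {"Odisha", "Uttarakhand", "Kerala", "Telengana"}

# States present in the data but absent from the map have no shape to colour.
unmatched = sorted(data_names - geo_names)
print(unmatched)
```

Running a check like this against the real files before drawing the choropleth would show which spellings still need an entry in the lookup table.
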
In [7]:
zones=pd.read_csv('/kaggle/input/covid19indiazones/India-District-Zones.csv')

Zone Division

Coronavirus cases in India are rising steadily, though experts believe the country has not yet entered stage three. So far, 590 people have died of the virus in India, and the total tally has reached 18,601. Restrictions in red zones remain in place, and it will take some time before they are lifted from such high-risk areas.

The central government's strategy for containing the coronavirus in the second phase of the lockdown includes classifying districts by their number of coronavirus cases and carrying out a major testing exercise. On this basis, the government has divided the country into three kinds of zones: red, orange and green. As per the orders, 170 of India's 720 districts have been declared 'Red Zones', also known as hotspots, and 207 districts have been marked as non-hotspot zones.

In [8]:
import plotly.express as px
fig = px.treemap(zones, path=['State','District'],
                  color='Zone', hover_data=['Zone'], color_discrete_map={'Red Zone':'red', 'Green Zone':'green', 'Orange Zone':'orange'})
py.iplot(fig)

Hover over the blocks for your state to see whether your district is in the Red Zone.
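
A count of districts per zone is a useful numeric companion to the treemap. A minimal sketch on a toy frame; the `State`, `District`, and `Zone` column names follow the treemap call above, while the rows are made up for illustration.

```python
import pandas as pd

# Toy stand-in for India-District-Zones.csv; the rows are illustrative only.
toy_zones = pd.DataFrame({
    "State": ["A", "A", "B", "B", "B"],
    "District": ["a1", "a2", "b1", "b2", "b3"],
    "Zone": ["Red Zone", "Green Zone", "Red Zone", "Orange Zone", "Red Zone"],
})

# Number of districts in each zone across the whole frame.
zone_counts = toy_zones["Zone"].value_counts()

# Districts per zone within each state (missing combinations become 0).
per_state = pd.crosstab(toy_zones["State"], toy_zones["Zone"])

print(zone_counts)
print(per_state)
```
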

In [9]:
data.shape
Out[9]:
(1821, 9)

State-wise analysis of coronavirus data

In [10]:
# Create a plot
plt.figure(figsize=(8,12))

# Add title
plt.title("Total cases by state")

grouped_data=data.groupby("Name of State / UT").sum()

sns.barplot(x=grouped_data['Total Confirmed cases'], y=grouped_data.index)

data['Total Confirmed cases'].sum()
Out[10]:
973344
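
One caveat: judging by the `tail()` output earlier, `complete.csv` appears to hold running totals per state and date, so summing a state's rows over all dates counts the same cases many times. A minimal sketch on toy data, assuming the counts are cumulative, compares the summed figure with the latest snapshot. The toy rows are made up for illustration; the column names match the dataset.

```python
import pandas as pd

# Toy cumulative counts: each row is the running total for that state and date.
toy = pd.DataFrame({
    "Date": ["2020-05-09", "2020-05-10", "2020-05-11"] * 2,
    "Name of State / UT": ["X"] * 3 + ["Y"] * 3,
    "Total Confirmed cases": [10, 15, 20, 1, 2, 4],
})

# Summing over dates adds the running totals together (inflated).
summed = toy.groupby("Name of State / UT")["Total Confirmed cases"].sum()

# Keeping only the most recent row per state gives the current totals.
latest = (toy.sort_values("Date")
             .groupby("Name of State / UT")["Total Confirmed cases"]
             .last())

print(summed.to_dict())
print(latest.to_dict())
```

If the assumption holds for the real data, replacing `.sum()` with `.last()` after sorting by date (or with `.max()`, as the choropleth cell does) would give per-state totals that match the reported figures.
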

Total confirmed cases by date

In [11]:
grouped_by_date_data=data.groupby("Date").sum()

plt.figure(figsize=(17,16))

plt.xticks(rotation=90)
sns.lineplot(data=grouped_by_date_data["Total Confirmed cases"],label="Total Confirmed cases")
Out[11]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f7428057f98>
In [12]:
symptoms={'symptom':['Fever',
        'Dry cough',
        'Fatigue',
        'Sputum production',
        'Shortness of breath',
        'Muscle pain',
        'Sore throat',
        'Headache',
        'Chills',
        'Nausea or vomiting',
        'Nasal congestion',
        'Diarrhoea',
        'Haemoptysis',
        'Conjunctival congestion'],
        'percentage':[87.9,67.7,38.1,33.4,18.6,14.8,13.9,13.6,11.4,5.0,4.8,3.7,0.9,0.8],
          'parent':['high','high','high','high','medium','medium','medium','medium','medium','medium','low','low','low',"low"]}

symptoms=pd.DataFrame(symptoms)
In [13]:
fig =px.sunburst(
    symptoms,
    path=['parent','symptom'],  # root-to-leaf order: severity group, then symptom
    values='percentage',
    color='percentage')
py.iplot(fig)

Number of Active Cases

In [14]:
plt.figure(figsize=(20,20))

# Add title
plt.title("Active cases")

#Total Active cases
data["Active cases"]=data["Total Confirmed cases"]-data["Cured/Discharged/Migrated"]-data["Death"]

grouped_active_data=data.groupby("Name of State / UT").sum()
grouped_active_data=(grouped_active_data.sort_values(by="Active cases"))
plt.xticks(rotation=90)
sns.barplot(x=grouped_active_data['Active cases'], y=grouped_active_data.index,)
Out[14]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f7417ef2240>
In [15]:
plt.figure(figsize=(12,6))
plt.plot(grouped_by_date_data["Total Confirmed cases"].diff().fillna(0),linewidth=3,label="Confirmed Cases")
plt.plot(grouped_by_date_data["Cured/Discharged/Migrated"].diff().fillna(0),linewidth=3,label="Recovered Cases")
plt.plot(grouped_by_date_data["Death"].diff().fillna(0),linewidth=3,label="Death Cases")
plt.ylabel("Increase in Number of Cases")
plt.xlabel("Date")
plt.title("Daily increase in different types of cases in India")
plt.xticks(rotation=90)
plt.legend()
Out[15]:
<matplotlib.legend.Legend at 0x7f7417d4c3c8>
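
The daily-increase lines rely on `diff()`, which subtracts the previous row from each row of a cumulative series. The first row has no predecessor, so it becomes `NaN`, and `fillna(0)` replaces it with zero. A small sketch with made-up numbers:

```python
import pandas as pd

# Made-up cumulative totals indexed by date.
cumulative = pd.Series(
    [100, 130, 180, 180],
    index=["2020-05-08", "2020-05-09", "2020-05-10", "2020-05-11"],
)

# Day-over-day change; the first entry has nothing to subtract from.
daily = cumulative.diff().fillna(0)

print(daily.tolist())
```
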
In [16]:
def color_negative_red(val):
    """
    Takes a scalar and returns a string with
    the css property `'color: red'` for values
    above 30, `'color: green'` otherwise.
    """
    color = 'red' if val > 30 else 'green'
    return 'color: %s' % color
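
A minimal sketch of how such a function could be used, assuming it is meant as an element-wise formatter. The function is repeated so the block runs on its own; the sample values are made up, and handing it to `DataFrame.style` is an assumption about the intended use rather than something the notebook does.

```python
import pandas as pd


def color_negative_red(val):
    """Return a CSS color rule: red above 30, green otherwise."""
    color = 'red' if val > 30 else 'green'
    return 'color: %s' % color


# Made-up values on either side of the threshold.
sample = pd.Series([5, 30, 31, 120])

# Element-wise application yields one CSS rule per value.
rules = sample.map(color_negative_red)
print(rules.tolist())
```
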
In [17]:
total_cases=grouped_data["Total Confirmed cases"].sum()
rec_cases=grouped_data["Cured/Discharged/Migrated"].sum()
death=grouped_data["Death"].sum()
y_axis=[total_cases,rec_cases,death]
x_axis=["Total Confirmed cases","Recovered","Deaths"]
In [18]:
plt.figure(figsize=(8,5))
sns.barplot(x=x_axis, y=y_axis)
plt.title('Status of affected people.')
plt.xlabel('Status', fontsize=15)
plt.ylabel('Count of people affected', fontsize=15)
plt.show()

Importing and visualizing state-wise testing data

In [19]:
testing=pd.read_csv("/kaggle/input/covid19-in-india/StatewiseTestingDetails.csv")
testing.tail()
Out[19]:
           Date        State  TotalSamples  Negative  Positive
912  2020-05-06  West Bengal       30141.0       NaN    1456.0
913  2020-05-07  West Bengal       32752.0       NaN    1548.0
914  2020-05-08  West Bengal       35767.0       NaN    1678.0
915  2020-05-09  West Bengal       39368.0       NaN    1786.0
916  2020-05-10  West Bengal       43414.0       NaN    1939.0
In [20]:
sttest=testing.groupby("State").sum()
In [21]:
plt.figure(figsize=(20,20))
plt.barh(sttest.index,sttest['TotalSamples'],label="Total Samples",color='black')
plt.barh(sttest.index, sttest['Positive'],label="Positive Cases",color='coral')
plt.xlabel('Count',size=30)
plt.ylabel("States",size=30)
plt.legend(frameon=True, fontsize=12)
plt.title('Total Samples and Positive Cases Statewise',fontsize = 20)
plt.show()
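
The two bars invite a comparison of positives to samples. A minimal sketch, assuming `TotalSamples` and `Positive` are cumulative, computes the share of positive samples from each state's latest row. The West Bengal rows are copied from the output above; the second state is made up for illustration.

```python
import pandas as pd

toy_testing = pd.DataFrame({
    "Date": ["2020-05-09", "2020-05-10", "2020-05-09", "2020-05-10"],
    "State": ["West Bengal", "West Bengal", "Example State", "Example State"],
    "TotalSamples": [39368.0, 43414.0, 1000.0, 1200.0],
    "Positive": [1786.0, 1939.0, 40.0, 60.0],
})

# Latest row per state, since summing cumulative rows would overcount.
latest = toy_testing.sort_values("Date").groupby("State").last()

# Share of samples that tested positive.
latest["PositiveRate"] = latest["Positive"] / latest["TotalSamples"]

print(latest[["TotalSamples", "Positive", "PositiveRate"]])
```
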
In [22]:
age=pd.read_csv("/kaggle/input/covid19-in-india/AgeGroupDetails.csv")
In [23]:
age
Out[23]:
   Sno AgeGroup  TotalCases Percentage
0    1      0-9          22      3.18%
1    2    10-19          27      3.90%
2    3    20-29         172     24.86%
3    4    30-39         146     21.10%
4    5    40-49         112     16.18%
5    6    50-59          77     11.13%
6    7    60-69          89     12.86%
7    8    70-79          28      4.05%
8    9     >=80          10      1.45%
9   10  Missing           9      1.30%
In [24]:
plt.figure(figsize=(10,10))
#plt.title("Current age group scenario in india",fontsize=50)
labels=age['AgeGroup']
# Derive the slice sizes from the Percentage column instead of hard-coding them as strings
sizes=age['Percentage'].str.rstrip('%').astype(float)
plt.pie(sizes,labels=labels,autopct='%1.1f%%')
plt.show()

Visualization of hospital and medical care unit data (beds available, hospitals, etc.)

In [25]:
hospbeds=pd.read_csv("/kaggle/input/covid19-in-india/HospitalBedsIndia.csv")
hospbeds.head()
Out[25]:
Sno State/UT NumPrimaryHealthCenters_HMIS NumCommunityHealthCenters_HMIS NumSubDistrictHospitals_HMIS NumDistrictHospitals_HMIS TotalPublicHealthFacilities_HMIS NumPublicBeds_HMIS NumRuralHospitals_NHP18 NumRuralBeds_NHP18 NumUrbanHospitals_NHP18 NumUrbanBeds_NHP18
0 1 Andaman & Nicobar Islands 27 4 NaN 3 34 1246 27 575 3 500
1 2 Andhra Pradesh 1417 198 31.0 20 1666 60799 193 6480 65 16658
2 3 Arunachal Pradesh 122 62 NaN 15 199 2320 208 2136 10 268
3 4 Assam 1007 166 14.0 33 1220 19115 1176 10944 50 6198
4 5 Bihar 2007 63 33.0 43 2146 17796 930 6083 103 5936

Filling the null values with 0.

In [26]:
hospbeds= hospbeds.fillna(0)
hospbeds.head()
Out[26]:
Sno State/UT NumPrimaryHealthCenters_HMIS NumCommunityHealthCenters_HMIS NumSubDistrictHospitals_HMIS NumDistrictHospitals_HMIS TotalPublicHealthFacilities_HMIS NumPublicBeds_HMIS NumRuralHospitals_NHP18 NumRuralBeds_NHP18 NumUrbanHospitals_NHP18 NumUrbanBeds_NHP18
0 1 Andaman & Nicobar Islands 27 4 0.0 3 34 1246 27 575 3 500
1 2 Andhra Pradesh 1417 198 31.0 20 1666 60799 193 6480 65 16658
2 3 Arunachal Pradesh 122 62 0.0 15 199 2320 208 2136 10 268
3 4 Assam 1007 166 14.0 33 1220 19115 1176 10944 50 6198
4 5 Bihar 2007 63 33.0 43 2146 17796 930 6083 103 5936
In [27]:
centers=['NumPrimaryHealthCenters_HMIS','NumCommunityHealthCenters_HMIS','NumSubDistrictHospitals_HMIS','NumDistrictHospitals_HMIS','TotalPublicHealthFacilities_HMIS','NumPublicBeds_HMIS','NumRuralHospitals_NHP18','NumRuralBeds_NHP18','NumUrbanHospitals_NHP18','NumUrbanBeds_NHP18']

Changing datatypes of columns that are not already in integer format.

In [28]:
hospbeds['NumPrimaryHealthCenters_HMIS'] = hospbeds['NumPrimaryHealthCenters_HMIS'].str.replace(',', '')
hospbeds['NumPrimaryHealthCenters_HMIS']=hospbeds['NumPrimaryHealthCenters_HMIS'].astype(str).astype(int)
In [29]:
hospbeds.dtypes
Out[29]:
Sno                                   int64
State/UT                             object
NumPrimaryHealthCenters_HMIS          int64
NumCommunityHealthCenters_HMIS        int64
NumSubDistrictHospitals_HMIS        float64
NumDistrictHospitals_HMIS             int64
TotalPublicHealthFacilities_HMIS      int64
NumPublicBeds_HMIS                    int64
NumRuralHospitals_NHP18               int64
NumRuralBeds_NHP18                    int64
NumUrbanHospitals_NHP18               int64
NumUrbanBeds_NHP18                    int64
dtype: object
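
The same comma-stripping step can be applied to every count column at once. A minimal sketch on a toy frame, assuming null values have already been filled as in the earlier cell; `to_int_column` is a hypothetical helper and the sample rows are made up.

```python
import pandas as pd


def to_int_column(series):
    """Strip thousands separators and convert a column to integers.

    Assumes the column has no missing values left.
    """
    cleaned = series.astype(str).str.replace(",", "", regex=False)
    # to_numeric handles both "1417" and "31.0"; astype(int) drops the ".0".
    return pd.to_numeric(cleaned).astype(int)


# Toy stand-in for HospitalBedsIndia.csv with one text and one float column.
toy_beds = pd.DataFrame({
    "State/UT": ["P", "Q"],
    "NumPrimaryHealthCenters_HMIS": ["1,417", "27"],
    "NumSubDistrictHospitals_HMIS": [31.0, 0.0],
})

# Convert every column except the state name.
for col in toy_beds.columns.drop("State/UT"):
    toy_beds[col] = to_int_column(toy_beds[col])

print(toy_beds.dtypes)
```
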

Statewise analysis of medical care

In [30]:
plt.figure(figsize=(20,60))
for i,col in enumerate(centers):
    plt.subplot(8,2,i+1)
    sns.barplot(data=hospbeds,y='State/UT',x=col)
    plt.xlabel('Count')
    plt.ylabel('')
    plt.title(col)
plt.tight_layout()
plt.show()
In [31]:
icmrtestlabs= pd.read_csv("/kaggle/input/covid19-in-india/ICMRTestingLabs.csv")
icmrtestlabs
Out[31]:
lab address pincode city state type
0 ICMR-Regional Medical Research Centre, Port Blair ICMR-Regional Medical Research Centre, Post Ba... 744103 Port Blair Andaman and Nicobar Islands Government Laboratory
1 Tomo Riba Institute of Health & Medical Scienc... National Highway 52A, Old Assembly Complex, Na... 791110 Naharlagun Arunachal Pradesh Collection Site
2 Sri Venkateswara Institute of Medical Sciences... Sri Venkateswara Institute of Medical Sciences... 517507 Tirupati Andhra Pradesh Government Laboratory
3 Rangaraya Medical College, Kakinada Rangaraya Medical College, Kakinada Pithampura... 533001 Kakinada Andhra Pradesh Government Laboratory
4 Sidhartha Medical College, Vijaywada Siddhartha Medical College, Vijayawada NH 16 S... 520008 Vijayawada Andhra Pradesh Government Laboratory
... ... ... ... ... ... ...
262 Tata Medical Center, Kolkata Department of Laboratory Sciences, Tata Medica... 700160 Kolkata West Bengal Private Laboratory
263 Laboratory Services, Peerless Hospitex Hospita... Laboratory Services, Peerless Hospitex Hospita... 700094 Kolkata West Bengal Private Laboratory
264 AMRI Hospitals, Department of Laboratory Medic... AMRI Hospitals, Department of Laboratory Medic... 700098 Kolkata West Bengal Private Laboratory
265 Suraksha Diagnostics Pvt. Ltd., Kolkata Suraksha Diagnostics Pvt. Ltd., 12/1, Premises... 700156 Kolkata West Bengal Private Laboratory
266 Dr. Lal PathLabs Ltd, Reference Laboratory, Ko... Dr. Lal PathLabs Ltd, Reference Laboratory, Pl... 700156 Kolkata West Bengal Private Laboratory

267 rows × 6 columns

In [32]:
import plotly.express as px
fig = px.treemap(icmrtestlabs, path=['state','city'],
                  color='city', hover_data=['lab','address'])
py.iplot(fig)