A crime map of India in R – Crimes against women

In this post I take a look at the gory crime scene across India to determine which states are the heavyweights in crime. Who is the undisputed champion of rapes in a year? Which state excels in cruelty by husbands and their relatives towards wives? Which state leads in dowry deaths? To answer these questions I analyze state-wise data on crimes against women, taken from the Crime against Women dataset on the Open Government Data (OGD) Platform India.

(Do see my post Revisiting crimes against women in India which includes an interactive Shiny app)

The data in OGD is available for crimes against women in different states under different ‘crime heads’ like rape, dowry deaths, kidnapping & abduction etc. The data covers the years 2001 to 2012. For each state and crime head, this data is plotted as a scatter plot and a linear regression line is fit to it. Based on this linear model, the incidence of crimes like rape, dowry deaths, and kidnapping & abduction is projected for each of the states. These projections are then used to build a table for each crime head, predicting the number of crimes in every state up to the year 2018. Fortunately, R crunches through the data sets quite easily. The overall projections of crimes against women, based on the linear regression for each of these states, are shown below.
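As a rough illustration of how such a projection table can be assembled, here is a minimal sketch. The two states and their counts are synthetic placeholders (not OGD figures), and `project_state` is a hypothetical helper name, not a function from the repository.

```r
years  <- 2001:2012   # years covered by the dataset
future <- 2013:2018   # years to project

# Synthetic yearly counts for two made-up states (illustration only)
counts <- rbind(
  "State A" = 1000 + 40 * (years - 2001),
  "State B" = 300 + 5 * (years - 2001)
)

# Hypothetical helper: fit a straight line to one state's counts
# and return the projected values for the future years
project_state <- function(y) {
  fit <- lm(y ~ years)
  round(predict(fit, newdata = data.frame(years = future)), 2)
}

# One row per state, one column per projected year
projections <- t(apply(counts, 1, project_state))
colnames(projections) <- future
print(projections)
```

Each row of `projections` then corresponds to one line of a table for a single crime head.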

Projections over the next couple of years
The tables below show the projected incidence of crimes under various categories, assuming that these states maintain their torrid crime rate. A cursory look at the tables clearly indicates that Uttar Pradesh is the undisputed heavyweight champion in 4 of the 5 categories shown. Maharashtra and Andhra Pradesh take 2nd and 3rd ranks in total crimes against women and are significant contenders in other categories too.

A) Projected rapes in India
The top 3 heavyweights in projected rapes over the next 5 years are 1) Madhya Pradesh 2) Uttar Pradesh 3) Maharashtra


Full table: Rape.csv
B) Projected Dowry deaths in India 

Full table: Dowry Deaths.csv
C) Kidnapping & Abduction

Full table: Kidnapping&Abduction.csv
D) Cruelty by husband & relatives

Full table: Cruelty by husbands_relatives.csv
E) Total crimes against women


Full table: Total crimes.csv
Here is a visualization of ‘Total crimes against women’  created as a choropleth map

The implementation for this analysis was done in the R language. The R code, dataset, output and the crime charts can be accessed on GitHub at crime-against-women

Directory structure
– R code
dataset used

The analysis has been completely parametrized. A quick look at the implementation is shown below. A function, statecrime, was created as given below

This function (statecrime.R) does the following
a) Creates a scatter plot for the state for the crime head
b) Computes the best-fit linear regression line and draws it
c) Uses the model parameters (coefficients) to compute the projected crime in the years to come
d) Writes the projected values to a text file
e) Creates a directory with the name of the state if it does not exist and stores the jpeg of the plot there.
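The folder handling in the last step can be sketched as follows. This is only an illustration: `plot_path` and its `base_dir` argument are hypothetical names introduced here, and the example writes under `tempdir()` rather than the working directory.

```r
# Hypothetical helper: return the path where a state's chart would be saved,
# creating the state's folder first if it does not exist yet
plot_path <- function(state, crime, base_dir = ".") {
  state_dir <- file.path(base_dir, state)
  if (!dir.exists(state_dir)) {
    dir.create(state_dir, recursive = TRUE)
  }
  file.path(state_dir, paste0(crime, ".jpg"))
}

out <- plot_path("Uttar Pradesh", "Rape", base_dir = tempdir())
print(out)
```

To store the chart at that path, the plotting calls would be wrapped between `jpeg(out)` and `dev.off()`.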

statecrime <- function(indiacrime, row, state, crime) {
  year <- c(2001:2012)
  # Make separate folders for each state
  if (!file.exists(state)) {
    dir.create(state)
  }
  crimeplot <- paste0(crime, ".jpg")

  # Plot the details of the crime
  # (thecrime, atitle, ymin and ymax are set in code not shown in this excerpt)
  plot(year, thecrime, pch = 15, col = "red", xlab = "Year", ylab = crime, main = atitle,
       xlim = c(2001, 2018), ylim = c(ymin, ymax), axes = FALSE)

A linear regression line is fit using ‘lm’

# Fit a linear regression model
lmfit <-lm(thecrime~year)
# Draw the lmfit line
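To make the fitted model concrete, the sketch below fits the same kind of model to synthetic counts (not OGD data) and reads off its two coefficients. The slope is the estimated change in incidents per year, which is what drives the projections.

```r
year <- 2001:2012
# Synthetic counts for illustration only: 500 cases in 2001, rising by 25 a year
thecrime <- 500 + 25 * (year - 2001)

lmfit <- lm(thecrime ~ year)

# The intercept is the fitted value at year 0; the slope is the yearly change
intercept <- coef(lmfit)[["(Intercept)"]]
slope     <- coef(lmfit)[["year"]]
print(c(intercept = intercept, slope = slope))
```

On an open plot, `abline(lmfit)` draws the line defined by these two values.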

The model parameters are then used to draw the line and also to project the crime counts for the years 2013 to 2018

nyears <- c(2013:2018)
nthecrime <- rep(0, length(nyears))
# Projected crime incidents from 2013 to 2018 using a linear regression model
for (i in seq_along(nyears)) {
  nthecrime[i] <- lmfit$coefficients[2] * nyears[i] + lmfit$coefficients[1]
}
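The same projection can also be obtained without an explicit loop by using R's `predict()`. The sketch below uses synthetic counts (not OGD data) and checks that both routes give the same numbers.

```r
year <- 2001:2012
# Synthetic yearly counts, for illustration only
thecrime <- c(510, 540, 555, 590, 600, 640, 655, 670, 705, 730, 745, 780)
lmfit <- lm(thecrime ~ year)

nyears <- 2013:2018

# Manual projection from the coefficients, as in the loop above
manual <- lmfit$coefficients[2] * nyears + lmfit$coefficients[1]

# Equivalent projection using predict() on the new years
projected <- predict(lmfit, newdata = data.frame(year = nyears))

print(all.equal(unname(manual), unname(projected)))
```

The column name in `newdata` has to match the predictor name used in the formula, here `year`.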

The projected data for each state is appended into an appropriate file which is then used to display the tables at the top of this post

# Write the projected crime rate in a file
nthecrime <- round(nthecrime, 2)
nthecrime <- c(state, nthecrime, "\n")
#write(nthecrime,file=fileconn, ncolumns=9, append=TRUE,sep="\t")
filename <- paste0(crime, ".txt")
# Write the output in the ./output directory
cat(nthecrime, file = filename, sep = ",", append = TRUE)
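An alternative way to append one row per state, sketched here with `write.table()`, avoids the trailing separator before the newline and quotes state names. `append_projection` is a hypothetical helper, the values are synthetic, and the file is a temporary one.

```r
# Hypothetical helper: append one state's projections as a single CSV row
append_projection <- function(state, projected, filename) {
  row <- data.frame(state = state, t(round(projected, 2)))
  write.table(row, file = filename, sep = ",", append = TRUE,
              row.names = FALSE, col.names = FALSE)
}

outfile <- tempfile(fileext = ".csv")
# Synthetic projections for two made-up states (illustration only)
append_projection("State A", c(1480, 1520, 1560, 1600, 1640, 1680), outfile)
append_projection("State B", c(360, 365, 370, 375, 380, 385), outfile)

# Read the table back: the first column is the state, the rest are 2013 to 2018
result <- read.csv(outfile, header = FALSE)
print(result)
```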

The above function is then called repeatedly for each state and for each of the crime heads. (Note: It is possible to read both the states and the crime heads with R and perform the computation in a loop. However, I have done this the manual way!)

# 1. Andhra Pradesh
i <- 1
statecrime(indiacrime, i, "Andhra Pradesh","Rape")
i <- i+38
statecrime(indiacrime, i, "Andhra Pradesh","Kidnapping& Abduction")
i <- i+38
statecrime(indiacrime, i, "Andhra Pradesh","Dowry Deaths")
i <- i+38
statecrime(indiacrime, i, "Andhra Pradesh","Assault on Women")
i <- i+38
statecrime(indiacrime, i, "Andhra Pradesh","Insult to modesty")
i <- i+38
statecrime(indiacrime, i, "Andhra Pradesh","Cruelty by husband_relatives")
i <- i+38
statecrime(indiacrime, i, "Andhra Pradesh","Imporation of girls from foreign country")
i <- i+38
statecrime(indiacrime, i, "Andhra Pradesh","Immoral traffic act")
i <- i+38
statecrime(indiacrime, i, "Andhra Pradesh","Dowry prohibition act")
i <- i+38
statecrime(indiacrime, i, "Andhra Pradesh","Indecent representation of Women Act")
i <- i+38
statecrime(indiacrime, i, "Andhra Pradesh","Commission of Sati Act")
i <- i+38
statecrime(indiacrime, i, "Andhra Pradesh","Total crimes against women")

and so on for all the states
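The manual calls could be generated in a loop along the following lines. This is a sketch under assumptions: it takes the offset of 38 rows between crime heads from the calls above, and assumes that the state at position s sits at row s of the first block. The second state and its position are placeholders, and `crime_row` is a hypothetical helper.

```r
rows_per_crime <- 38   # offset between crime heads, as in the calls above

# Hypothetical helper: row of the dataset for a given state and crime head
crime_row <- function(state_index, crime_index) {
  state_index + rows_per_crime * (crime_index - 1)
}

# Only a few entries are listed here; their order is an assumption
states <- c("Andhra Pradesh", "Arunachal Pradesh")
crimes <- c("Rape", "Kidnapping& Abduction", "Dowry Deaths")

schedule <- data.frame()
for (s in seq_along(states)) {
  for (k in seq_along(crimes)) {
    schedule <- rbind(schedule,
                      data.frame(state = states[s], crime = crimes[k],
                                 row = crime_row(s, k)))
    # The real call would be:
    # statecrime(indiacrime, crime_row(s, k), states[s], crimes[k])
  }
}
print(schedule)
```

For Andhra Pradesh this reproduces the rows 1, 39 and 77 used in the first three manual calls.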

Charts for different crimes against women

1) Uttar Pradesh

The plots for  Uttar Pradesh  are shown below

Rapes in UP


Dowry deaths in UP

Cruelty by husband/relative

Total crimes against women in Uttar Pradesh

You can find more charts in GitHub by clicking Uttar Pradesh

2) Maharashtra : Some of the charts for Maharashtra



Kidnapping & Abduction

Total crimes against women in Maharashtra

More crime charts  for Maharashtra

Crime charts can be accessed for the following states from GitHub (in alphabetical order)

3) Andhra Pradesh
4) Arunachal Pradesh
5) Assam
6) Bihar
7) Chhattisgarh
8) Delhi (Added as an exception based on its notoriety)
9) Goa
10) Gujarat
11) Haryana
12) Himachal Pradesh
13) Jammu & Kashmir
14) Jharkhand
15) Karnataka
16) Kerala
17) Madhya Pradesh
18) Manipur
19) Meghalaya
20) Mizoram
21) Nagaland
22) Odisha
23) Punjab
24) Rajasthan
25) Sikkim
26) Tamil Nadu
27) Tripura
28) Uttarakhand
29) West Bengal

The code, dataset and the charts can be cloned/forked from GitHub at crime-against-women

Let me know if you find any interesting patterns in the data.
Thoughts, comments welcome!

See also
A peek into literacy in India: Statistical Learning with R

You may also like
– Analyzing cricket’s batting legends – Through the mirage with R
– What’s up Watson? Using IBM Watson’s QAAPI with Bluemix, NodeExpress – Part 1
– Bend it like Bluemix, MongoDB with autoscaling – Part 1


40 thoughts on “A crime map of India in R – Crimes against women”

  5. Rapists should in turn be sentenced to being raped once a day for a week.
    Would they want to repeat their crimes???

  38. The plots and projections show the total number of crimes of each type. It would be better to calculate the crime rate and compare the rates across states.
