Benford's Analysis - State Elections India

This notebook supplements one of my blog posts, Benford's Law as a Fraud Detection technique. You'll be able to follow along without reading the blog post, though; you just need to know what Benford's law says. In this notebook, we'll walk through a Benford's analysis and, along the way, see whether our dataset (votes cast in five Indian state elections) follows Benford's law.
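For anyone who skipped the blog post: Benford's law says that in many naturally occurring datasets the leading digit d (1 through 9) shows up with probability log10(1 + 1/d), so 1 leads about 30% of the time and 9 less than 5%. Here is a minimal sketch of that formula; the function name `benford_expected` is just an illustrative choice and is not used elsewhere in this notebook.

```python
import math

def benford_expected(digit: int) -> float:
    """Expected share of values whose first digit is `digit` (1-9) under Benford's law."""
    return math.log10(1 + 1 / digit)

# Expected share for each possible leading digit, 1 through 9.
expected = [benford_expected(d) for d in range(1, 10)]
for d, p in zip(range(1, 10), expected):
    print(d, round(p, 3))
```

Rounded to three decimals these come out as 0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051 and 0.046, and the nine shares add up to 1.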


  1. A basic exploration first
  2. Benford's Analysis


My data source was a mailing list I follow, Data is Plural by Jeremy Singer-Vine, which described the dataset like this:

Five states in India, representing nearly 250 million residents — Punjab, Uttar Pradesh, Uttarakhand, Goa, and Manipur — have already held legislative assembly elections this year. India’s Election Commission publishes these results, but only as webpages. A couple of Hyderabad-based developers have scraped the website, and published CSVs of the data on GitHub.

In [2]:
import pandas as pd
import matplotlib.pyplot as plt

%matplotlib inline
In [3]:
data_file = "Data/Indian State Elections 2016/All_States_Constituencies_combined_2017.csv"
data = pd.read_csv(data_file)
data.head()
   Candidate                    Party                       Votes  Constituency  State
0  DR. PRAMOD PANDURANG SAWANT  Bharatiya Janata Party      10058  Sanquelim     Goa
1  DHARMESH SAGLANI             Indian National Congress     7927  Sanquelim     Goa
2  SURESH AMONKAR               Goa Suraksha Manch           3831  Sanquelim     Goa
3  PRATAP PRABHAKAR GAUNS       Nationalist Congress Party    857  Sanquelim     Goa
4  MILIND GAUNS                 Aam Aadmi Party               408  Sanquelim     Goa

1. A basic exploration first

In [5]:
data.shape
(7849, 5)

That's an okay-ish number of rows. Not that great.


In [8]:
list(data.State.unique())
['Goa', 'Manipur', 'Punjab', 'Uttarakhand', 'Uttar Pradesh']

So, as the data description says, all 5 states are in the dataset. Goa is in Western India, Manipur is in the Northeast, and the rest are in the Northern part of India.

Political Parties

In [9]:
data.Party.unique()
array(['Bharatiya Janata Party', 'Indian National Congress',
       'Goa Suraksha Manch', 'Nationalist Congress Party',
       'Aam Aadmi Party', 'Independent', 'Goa Su-Raj Party',
       'None of the Above', 'Maharashtrawadi Gomantak',
       'Samajwadi Janata Party (ChandraShekhar)', 'Goa Forward Party',
       'Communist Party of India', 'Shivsena',
       'Niz Goenkar Revolution Front', 'Goa Vikas Party',
       'United Goans Party', 'Goa Praja Party', 'Bahujan Mukti Party',
       'Ambedkarite Party of India', 'Manipur National Democratic Front',
       "National People's Party", 'Lok Jan Shakti Party',
       'All India Trinamool Congress', 'Naga Peoples Front',
       'North East India Development Party', 'Manipur People’s Party',
       'Republican Party of India (A)',
       'Peoples Resurgence and Justice Alliance', 'All India Forward Bloc',
       'Communist Party of India  (Marxist)',
       "People's Democratic Alliance",
       "Manipur Democratic Peoples's Front", 'Lok Insaaf Party',
       'Bahujan Samaj Party', 'Jammu & Kashmir National Panthers Party',
       'Aapna Punjab Party', 'Democratic Party of India (Ambedkar)',
       'Hindustan Shakti Sena', 'Sada Punjab Party', 'Shiromani Akali Dal',
       'Revolutionary Marxist Party of India',
       'Shiromani Akali Dal (Amritsar)(Simranjit Singh Mann)',
       'Inqalab Vikas Dal', 'Democratic Party of India',
       'Samaj Adhikar Kalyan Party', 'Revolutionary Socialist Party',
       'Bahujan Samaj Party  (Ambedkar)', 'Shiromani Lok Dal Party',
       'Democratic Swaraj Party', 'Bharatrashtra Democratic Party',
       'Punjab Democratic Party', 'Jai Jawan Jai Kisan Party',
       'Democratic Bharatiya Samaj Party',
       'Rashtriya Krantikari Samajwadi Party', 'Indian Krantikari Lehar',
       'Akhil Bharatiya Sangharsh Dal', 'Bharatiya Rashtravadi Paksha',
       'National Adhikar Insaf Party', 'Indian Union Muslim League',
       'Communist Party of India  (Marxist-Leninist)  (Liberation)',
       'Bharatiya Republican Paksha', 'Ojaswi Party',
       'Democratic Congress Party', 'Bahujan Sangharshh Dal',
       'Lokpriya Samaj Party', 'Swabhiman Party', 'Hindustan Utthan Party',
       'Krantikari Yuva Party', 'Hindusthan Nirman Dal',
       'Communist Party of India (Marxist-Leninist) Red Star',
       'Bharatiya Gaon Taj Dal', 'Punjab Labour Party',
       'Bharat (Integrated) Rakshak Party', 'Uttarakhand Kranti Dal',
       'UTTARAKHAND PARIVARTAN PARTY', 'Sainik Samaj Party',
       'Samajwadi Party', 'Bharatiya Subhash Sena',
       'Bharatiya Momin Front', 'Uttarakhand Kranti Dal (Democratic)',
       'Rashtriya Jan Sahay Dal', 'Bharat Ki Lok Jimmedar Party',
       'Rashtriya Lok Dal', 'Uttarakhand Raksha Morcha',
       'Prajamandal Party', 'Rashtriya Adarsh Party',
       'Indian Business Party', 'Sarv Vikas Party', 'Peace Party',
       'Bhartiya Sarvodaya Party', 'Hamari Janmanch Party',
       'AARAKSHAN VIRODHI PARTY', 'People’s Party', 'Bharat Kaumi Dal',
       'Lok Shahi Party (Secular)', 'Rashtriya Uttarakhand Party',
       'Nirbal Indian Shoshit Hamara Aam Dal', 'Rashtravyapi Janta Party',
       'Apna Dal', 'Lok Dal', 'Pichhravarg Mahapanchayat Party',
       'Hamari Apni Party', 'Mahan Dal', 'Kisan Majdoor Berojgar Sangh',
       'Sarv Sambhaav Party', 'Swatantra Jantaraj Party',
       'Jan Adhikar Party', 'Rashtriya Janshanti Party',
       'Bhartiya Shakti Chetna Party', 'Sarvodaya Bharat Party',
       'Moderate Party', 'Mahakranti Dal', 'Bhartiya  Sarvjan Party',
       'Suheldev Bhartiya Samaj Party', 'Lok Gathbandhan Party',
       'Ambedkar Samaj Party', 'All India Majlis-E-Ittehadul Muslimeen',
       'Bhartiya Bhaichara Party', 'Prajashakti Party Samdarshi',
       'Kalyankari Jantantrik Party', 'Pragatisheel Manav Samaj Party',
       'Mahamukti Dal', 'Bhartiya Krishak Dal', 'Jan Adhikar Manch',
       'Bahujan Maha Party', "All India Peoples' Front (Radical)",
       'Rashtriya Samaj Paksha', 'Moulik Adhikar Party',
       'Shoshit Samaj Dal', 'Rashtriya mahan Gantantra Party',
       'VANCHITSAMAJ INSAAF PARTY', 'Rashtriya Janadhar  Party',
       "Poorvanchal People's Party", 'Aadim Samaj Party',
       'Bhartiya Republican Party (Insan)', 'Rashtravadi Pratap Sena',
       'Manuvadi Party', 'Sabka Dal United', 'Indian Savarn Samaj Party',
       'Param Digvijay Dal', 'Rashtriya Shahri Vikas Party',
       'Jan Sevak Party', 'Apna Dal (Soneylal)',
       'Bhartiya Rashtriya Jansatta', 'Gondvana Gantantra Party',
       'Jan Jan Party', 'Rashtrawadi Praja Dal', 'Rashtriya Lokraj Party',
       'Desh Shakti Party', 'Pragatisheel Samaj Party',
       'Azad Mazdoor Kissan Party', 'Bharatiya Bahujan Parivartan  Party',
       'Rashtriya Sarvajan Party', 'Proutist Bloc, India',
       'Manav Hit Party', 'Rashtriya Parivartan Dal',
       'Rashtriya Kranti Party', 'Sarva Samaj Kalyan Party',
       'Rashtriya Sawarn Dal', 'Jay Hind Jay Bharat Party',
       'National Apni Party', 'Rashtriya Kisan Majdoor Party',
       'Bhartiya Janta Dal', 'Ittehad-E-Millait Council', 'Rajlok Party',
       'Manavtawadi Samaj Party', 'Bharatiya Janvadi Party',
       'Dr. Bhimrao Ambedkar Dal', 'Adarsh Samaj Party',
       'Asankhya Samaj Party', 'Rashtriya Lok Samta Party',
       'National Aman Party', 'Gareeb Kranti Party',
       'RASHTRIYA VIKLANG PARTY', 'Jai Hind Samaj Party',
       'Nagrik Ekta Party', 'Shoshit Samaj Party',
       'Bhartiya Imaandar Party', 'Labour Party of India (V.V. Prasad)',
       'Voters Party  International',
       'Rashtriya Congress (Babu Jagjivanram)',
       'Rashtriya Mazdoor Ekta Party', 'Rashtriya Congress(J) Party',
       'Bhartiya Samajik Kranti Dal', 'Federal Democratic Party',
       'Parivartan Samaj Party', 'Bharat Kranti Rakshak Party',
       'National Lokmat Party', 'Sarvajan Kalyan Loktantrik Party',
       'KIsan Shakti Jantantrik Party', 'Sampoorna Samaj Party',
       'All India Jan Congress Party', 'Public Political Party',
       'Jan Shakti Ekta Party', 'Jan Adhikar Party (Loktantrik)',
       'Akhil Bharatiya Jan Sangh', 'National Youth Party',
       'Manviya Bharat Party', 'Fauji Janta Party',
       'Uttar Pradesh Republican Party', 'Rashtriya Shoshit Samaj Party',
       'Bhagidari Kranti Dal', 'Akhil Bhartiya Vikas Congress Party',
       'Parcham Party of India', 'Samaj Seva Party',
       'Bharatiya Eklavya Party', 'Most Backward Classes Of India',
       'Rastriya Aam Jan Seva Party', 'Rashtriya Vikalp Party',
       'Rashtriya Lokshakti Party', 'Bhartiya Tarak Samaj Party',
       'Rashtriya Jan  Adhikar Party', 'National Democratic Front',
       'Akhil Bhartiya Swarajya Dal', 'Nav Jankranti Party',
       'Rashtriya Bahujan Congress Party', 'Rashtriya Vikas Manch Party',
       'Manavadhikar Janshakti Party', 'Bharti Samudaya Party',
       'Bharat Kalyan Party', 'Manav Kalyan Munch', 'Nirmal India Party',
       'Swarajya Party Of India', 'Jan Sangharsh Virat Party',
       'Socialist Party (India)', 'Mahila Swabhiman Party',
       'Rastriya Insaaf Party', 'Rashtriya Samanta Dal',
       'Bahujan Awam Party', 'Naveen Samajwadi Dal',
       'Bhartiya Vanchitsamaj Party', 'Rashtriya Janta Party',
       'Mahila Sashaktikaran Party', 'Bhartiya Jan Sahyog Party',
       'Kisan Kranti Dal', 'Rashtriya Samajwadi Party  (Secular)',
       'Rashtriya Janhit Sangharsh Party', 'Samaj Sevak Party',
       'Adarsh Vyawastha Party', 'Loktantrik Samajwadi Party',
       'Deshbhakt Nirman Party', 'Janta Raj Party', 'Sarvjan Samta Party',
       'Aam Janta Party Rashtriya', 'Rashtriya Janpriya Party',
       'Bharatiya Karyasth Sena',
       'Bharatiya Rashtriya Bahujan Samaj Vikas Party',
       'Akhand Rashtrawadi Party', 'Bhartiya Naujawan Inklav Party',
       'Bharatiya Samrat Subhash  Sena', 'Bharatiya Samaj Dal',
       'Adhikar Party', 'Adarshwaadi Congress Party',
       'Janvadi Party(Socialist)', 'Shoshit Sandesh Party',
       'Yuva Vikas Party', 'Jan Shakti Dal', 'Poorna Swaraj Manch',
       'Akhil Bharat Hindu Mahasabha', 'Manav Kranti Party',
       'Rashtriya Ulama Council', 'Bharatiya Kisan Parivartan Party',
       'Minorities Democratic Party', 'Vikas Party', 'Awami Vikas Party',
       'United National Party', 'Republican Party of India',
       'Rashtriya Apna Dal', 'Apni Zindgi Apna Dal', 'Saman Adhikar Party',
       'United Democratic Front Secular', 'Awami Samta Party',
       'Rashtriya Janadhikar Party', 'Vaicharik Kranti Party',
       'Eklavya Samaj Party', 'Naitik Party',
       'Kisan Majdoor Suraksha Party',
       'All India Pichhada Jan Samaj Party', 'Abhay Samaj Party',
       'Bharatiya Democratic  Socialist Alliance  Party',
       'Pashchimi Uttar Pradesh Vikas Party', 'Apna Desh Party',
       'Ambedkar Yug Party', 'Aajad Bharat Party (Democratic)',
       'Justice Party', 'Samtawadi Republican Party',
       'Purvanchal Janta Dal', 'Sanyukt Samajwadi Dal',
       'Kisan Samaj Party(S)', 'Bhartiya Harit Party',
       'Akhand Samaj Party', 'Jansatta Party',
       'National Peace Party Secular', 'Satya Kranti Party',
       'Bharatiya Jan Kranti Dal (Democratic)',
       'New All India Congress Party', 'Rashtriya Mahila Janshakti Party',
       'Hind Congress Party', 'Rashtriya Kisan Vikas Party',
       'All India Minorities Front', 'Loktantrik Manavatavadi Party',
       'Bundelkhand Kranti Dal', 'Hind Jan Congress',
       'Bhartiya Insan Party', 'Aadhi Aabadi Party', 'Sarvshreshth Dal',
       'Bharat Lok Sewak Party', 'Bharat Janta Dal (Tughlaq)',
       'Bharat Mata Party', 'Apna Dal United Party',
       'Bhartiya Janta Dal (Integrated)', 'Bharatiya Jan Sabha',
       'Hum Sabki Party', 'Bhartiya Nav Vikas Party',
       'National Democratic Peoples Front', 'Jan Seva Sahayak Party',
       'Brij Kranti Dal', 'Rashtriya Garib Dal', 'Satya Shikhar Party',
       'Hindustan Janmorcha', 'Samajwadi Samaj Party',
       'Ambedkar National Congress', 'Gandhi Ekta Party',
       'Rashtrawadi Janwadi Manch', 'Bhartiya Sangam Party',
       'Aam Janata Party', 'Bharatiya Jan Berojgar Chhatra Dal',
       'Al-Hind Party', 'Republican Sena',
       'Rashtriya Jantantrik Bharat Vikas Party',
       'Bahujan Sangharsh Party (Kanshiram)', 'Adarsh Lok Dal',
       'Islam Party Hind', 'Rashtriya Samajwadi Jankranti Party',
       'Samta Samjwadi Congress Party', 'Majdoor Kisan Union Party',
       'Bhartiya Kisan Janshakti Party', 'Pragatisheel Lok Manch',
       'Manavtawadi Kranti Dal', 'Samaj Parivartan Party',
       'Sanyukt Vikas Party', 'Nirdhan Samaj Party of India',
       'Bharat Nyay Dal', 'Swachh Swasth Swawlambijan Party',
       'Pichhada Samanya Alpsankhyak Vyapari Kisan Party',
       'Bhartiya Navodaya Party', 'Nasha Mukt Bharat Party',
       'Lok Raksha Party', 'Hind Vikas Party',
       'Bharatiya Kisan Sena Loktantrik', 'Rashtravadi Janata Party',
       'Hindustan Krantikari Dal', 'Bahujan Party', 'Lok Ekta Party',
       'Republican Party of India Ektavadi', 'Rashtriya Bandhutwa Party',
       'Manvadhikar National Party', 'Bharatiya Swajivi Nagrik Party',
       'Samajwadi Jan Parishad', 'Akhil Bharat Samagra Kranti Party',
       'Jan Kalyan Party', 'Bahujan Samaj Vikas Party',
       'Bharatiya Chhatrasangh Bharat', 'Suraksha Samaj Party'], dtype=object)

Holy shit! Don't expand that cell! That's a lot of political parties. I only knew a few of them, the major ones. Most of them must be alliances. And these are just 5 states. There are a lot of big states not in this list. I wonder how many political parties there would be in the whole of India.


In [26]:
data.Votes.describe()
count      7849.000000
mean      13981.559562
std       26560.548606
min          20.000000
25%         465.000000
50%        1023.000000
75%        9491.000000
max      262741.000000
Name: Votes, dtype: float64

The votes vary a lot. From 20 votes (in Goa. Feeling bad for the candidate :/), they go up to over 260K (in Uttar Pradesh. Dayum dude! Nailed it).

In [47]:
data[(data.Votes==data.Votes.max()) | (data.Votes==data.Votes.min())]
      Candidate           Party                    Votes  Constituency  State
227   PARSHURAM DODAMANI  Independent                 20  Panaji        Goa
6259  SUNIL KUMAR SHARMA  Bharatiya Janata Party  262741  Sahibabad     Uttar Pradesh
In [24]:
plt.figure(figsize=(15, 7))
plt.hist(data.Votes[data.Votes < 100000], bins=200)
plt.title("Histogram - Votes < 100000")

A pretty, right-skewed histogram with a long tail. A shape like the one above kind of hints that the data follows Benford's law.
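Part of why a long-tailed shape is a good sign: Benford's law tends to show up in data that spans several orders of magnitude. As a rough check, the sketch below plugs in the min and max vote counts from the summary statistics above and counts how many powers of ten lie between them. The variable names are only for illustration.

```python
import math

# Smallest and largest vote counts, copied from the summary statistics above.
votes_min = 20
votes_max = 262741

# Number of orders of magnitude (powers of ten) between the extremes.
orders_of_magnitude = math.log10(votes_max / votes_min)
print(round(orders_of_magnitude, 2))  # → 4.12
```

So the vote counts stretch across roughly four orders of magnitude, which is the kind of spread where a Benford-like first-digit pattern is plausible.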

2. Benford's Analysis

In [4]:
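# First digit of every vote count: drop any sign, leading zeros and decimal point
first_digit = data.Votes.astype(str).str.replace("-", "").str.lstrip("0").str.lstrip(".").str.lstrip("0").str[0]

# First digit frequencies as fractions of all rows, ordered by digit (1 to 9)
freq = first_digit.value_counts().sort_index() / data.shape[0]
print("First digit frequencies-\n")
print(freq * 100)  # printed as percentages

# First digit frequency plot: observed frequencies as bars (bar chart assumed),
# Benford's theoretical frequencies as the red line
plt.figure(figsize=(10, 5))
plt.bar(range(1, 10), freq, label='Calculated Frequency')
plt.plot(range(1, 10), [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046], c='red', label='Theoretical Frequency')
plt.legend(loc='upper right')
plt.title("First digit frequencies")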
First digit frequencies-

1    26.933367
2    13.606829
3    11.504650
4    10.370748
5    10.141419
6     8.752707
7     6.956300
8     6.179131
9     5.554848
Name: Votes, dtype: float64
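The first digit is pulled out of the vote counts with string operations here. As a cross-check, the same thing can be done with plain integer arithmetic. This is only a sketch: the helper `leading_digit` is a hypothetical name, and the sample values are vote counts that appear earlier in this notebook, not the full dataset.

```python
import pandas as pd

def leading_digit(n: int) -> int:
    """Return the leading decimal digit of a non-zero integer."""
    n = abs(int(n))
    # Keep dropping the last digit until only one digit is left.
    while n >= 10:
        n //= 10
    return n

# A handful of vote counts seen earlier in the notebook (head, min and max rows).
sample_votes = pd.Series([10058, 7927, 3831, 857, 408, 20, 262741])
sample_digits = sample_votes.map(leading_digit)
print(sample_digits.tolist())  # → [1, 7, 3, 8, 4, 2, 2]
```

Applying `leading_digit` to the whole `Votes` column and calling `value_counts()` on the result should give the same counts as the string-based approach.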

As expected, it does follow the trend. Although there are still hiccups here and there (digits 1 and 2 come in at about 26.9% and 13.6%, below the theoretical 30.1% and 17.6%), it follows the trend pretty closely.

It ought to follow the trend. Population data are known to follow Benford's law, and our Votes data is a kind of population data: the adult population who went to cast a vote on election day. Good enough.
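To put a rough number on how closely the observed frequencies track the theoretical ones, one simple option is the mean absolute deviation (MAD) between the two: the smaller it is, the closer the fit. This is just a sketch, not a formal statistical test. The observed percentages are copied from the output above, and the variable names are illustrative.

```python
import math

# Observed first-digit shares in percent (digits 1 to 9), copied from the output above.
observed_pct = [26.933367, 13.606829, 11.504650, 10.370748, 10.141419,
                8.752707, 6.956300, 6.179131, 5.554848]

# Benford's expected share for each digit: log10(1 + 1/d).
expected = [math.log10(1 + 1 / d) for d in range(1, 10)]

# Absolute gap between observed and expected share, per digit (as fractions).
abs_dev = [abs(o / 100 - e) for o, e in zip(observed_pct, expected)]

# Mean absolute deviation across the nine digits.
mad = sum(abs_dev) / len(abs_dev)

# Digit with the largest gap between observed and expected share.
worst_digit = max(range(1, 10), key=lambda d: abs_dev[d - 1])

print("MAD:", round(mad, 4))
print("Largest deviation at digit:", worst_digit)
```

The per-digit gaps in `abs_dev` also make it easy to see which digits are responsible for the hiccups mentioned above.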