Using BIG DATA for Formulating Government Policies

August 18, 2015

In earlier posts we looked at the general uses of big data and at its use in e-commerce. In this post we look at how big data can be used to form government policy. A government collects all sorts of data from the public: it has access to birth records, death records, land area figures and security data. It can use this data to revise policies, even in real time. For example, suppose the government has security data on how many soldiers are posted in border, coastal and hilly regions, how many infiltrations by terrorists have occurred in each region, and which soldiers and equipment are currently assigned where. From all this information it can plan the research and development of particular weapons, and likewise the training of soldiers in particular tactics.

Now let us look at a working example. We have a dataset of forest area for all the districts of the Indian states. The dataset can be downloaded from its source. Let's get into some practical action.

The objective is to find the districts with a good amount of dense forest cover, and the districts that most need to increase their forest cover. This insight into the forest area of every district helps the government decide how much budget to allocate to the forest development department.

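For this we use Hive and Pig to extract the results. Hadoop spreads both the storage and the processing of this data across a cluster, so it can scale to volumes that would strain a single-server SQL database, and it keeps working when individual nodes fail. Note that Hive queries on Hadoop run as batch jobs: they pay off on large volumes rather than giving instant answers on small tables.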

First, we loaded the data into a Hive table.

[Screenshots: loading the data into the Hive table]
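As a rough, runnable stand-in for this loading step, the sketch below uses Python's built-in sqlite3 module instead of Hive. Only the table name forest and the percentage column come from this post; the other column names and the sample rows are assumptions made up for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical sample rows; the real dataset has one row per district.
SAMPLE_CSV = """state,district,land_area_sqkm,forest_area_sqkm
State X,District A,1000,800
State X,District B,2000,150
State Y,District C,500,30
"""


def load_forest_table(csv_text):
    """Create an in-memory `forest` table and fill it from CSV text."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE forest ("
        "state TEXT, district TEXT, "
        "land_area REAL, forest_area REAL, percentage REAL)"
    )
    rows = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        land = float(rec["land_area_sqkm"])
        forest = float(rec["forest_area_sqkm"])
        # Forest cover as a percentage of the district's land area.
        rows.append((rec["state"], rec["district"], land, forest,
                     100.0 * forest / land))
    conn.executemany("INSERT INTO forest VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


conn = load_forest_table(SAMPLE_CSV)
print(conn.execute("SELECT COUNT(*) FROM forest").fetchone()[0])
```

In Hive itself the table would be created and filled with Hive's own statements; the point here is only the shape of the table that the later queries rely on.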

We were then able to find the districts with the best and the least forest cover as a percentage of their land area, using thresholds of more than 75% and less than 10% respectively.
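The two threshold filters can be sketched as follows. This is a minimal illustration that runs the same kind of SQL on Python's sqlite3 module rather than on Hive; the district names and percentages are hypothetical, and only the forest table name, the percentage column and the 75% and 10% thresholds come from this post.

```python
import sqlite3

# Hypothetical rows: (district, percentage of land area under forest).
ROWS = [
    ("District A", 82.5),
    ("District B", 7.5),
    ("District C", 40.0),
    ("District D", 75.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE forest (district TEXT, percentage REAL)")
conn.executemany("INSERT INTO forest VALUES (?, ?)", ROWS)

# Districts with the best cover: strictly more than 75%.
best = [row[0] for row in conn.execute(
    "SELECT district FROM forest WHERE percentage > 75 ORDER BY district")]

# Districts with the least cover: strictly less than 10%.
least = [row[0] for row in conn.execute(
    "SELECT district FROM forest WHERE percentage < 10 ORDER BY district")]

print("best:", best)
print("least:", least)
```

Both comparisons are strict, so a district sitting exactly on 75% or 10% falls into neither group.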

[Screenshot: MapReduce operation]

The query "select * from forest where percentage > 75;" gives the same result when run on MySQL, but it can take a long time when the number of records is huge. As the screenshots show, on Hive this query fires a MapReduce operation that divides the file into parts and searches those parts in parallel. This reduces the delay we generally experience with SQL databases when handling large volumes.
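The divide-and-search idea can be illustrated in plain Python. This is only a sketch of the data flow, not of how Hadoop is implemented: worker threads stand in for cluster nodes, the function names are hypothetical, and the records are generated rather than taken from the real dataset.

```python
from concurrent.futures import ThreadPoolExecutor


def split(records, parts):
    """Divide the records into roughly equal chunks (the input splits)."""
    size = max(1, -(-len(records) // parts))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_chunk(chunk, threshold=75.0):
    """Map step: scan one chunk and keep the matching records."""
    return [rec for rec in chunk if rec[1] > threshold]


def merge(partials):
    """Final step: combine the per-chunk matches into one result list."""
    merged = []
    for part in partials:
        merged.extend(part)
    return merged


def parallel_filter(records, parts=4):
    """Filter the records chunk by chunk, one worker per chunk."""
    chunks = split(records, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        # pool.map returns results in chunk order, so record order is kept.
        partials = list(pool.map(map_chunk, chunks))
    return merge(partials)


# Generated (district, percentage) pairs, purely for illustration.
records = [("District %d" % i, float(i % 101)) for i in range(1000)]
matches = parallel_filter(records)
print(len(matches))
```

On a real cluster each chunk lives on a different machine, which is where the time saving on large files comes from; Python threads on one machine show the structure but not the speedup.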

[Screenshot: MapReduce operation]

Next we need to find how many districts of each kind exist, so that we can see how many districts need special attention to increase their forest cover, and which districts have no very dense forest at all, so that parts of them can be declared green belts.
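One way to get all three counts in a single pass is a query with conditional sums. The sketch below again uses sqlite3 as a stand-in for Hive, with hypothetical rows; it is not necessarily how the counts in this post were produced.

```python
import sqlite3

# Hypothetical rows: (district, percentage of land area under forest).
ROWS = [
    ("District A", 82.5), ("District B", 7.5), ("District C", 40.0),
    ("District D", 91.0), ("District E", 3.0), ("District F", 9.9),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE forest (district TEXT, percentage REAL)")
conn.executemany("INSERT INTO forest VALUES (?, ?)", ROWS)

# Count every district, plus the dense (> 75%) and sparse (< 10%) ones.
QUERY = """
SELECT
    COUNT(*) AS total,
    SUM(CASE WHEN percentage > 75 THEN 1 ELSE 0 END) AS dense,
    SUM(CASE WHEN percentage < 10 THEN 1 ELSE 0 END) AS sparse
FROM forest
"""
total, dense, sparse = conn.execute(QUERY).fetchone()
print(total, dense, sparse)
```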

[Screenshots: MapReduce operations]

From the screenshots above we can see that there are 586 districts in total, of which 43 have more than 75% forest cover and 25 have less than 10%.
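To put these counts in proportion, the short calculation below turns them into shares of all districts. The three counts are the ones reported above; the helper name share is just for illustration.

```python
TOTAL_DISTRICTS = 586  # count of all districts, as reported above
DENSE = 43             # districts with more than 75% forest cover
SPARSE = 25            # districts with less than 10% forest cover


def share(part, whole):
    """Return part as a percentage of whole, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


in_between = TOTAL_DISTRICTS - DENSE - SPARSE

print(share(DENSE, TOTAL_DISTRICTS))   # 7.3
print(share(SPARSE, TOTAL_DISTRICTS))  # 4.3
print(in_between)                      # 518
```

So roughly 7% of districts are densely forested, about 4% fall below the 10% mark, and the remaining 518 districts lie between the two thresholds.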

Using this information, the government can formulate plans such as allotting a special budget to these districts for forest development. It can also give equipment grants for pipelines and employ gardeners.

Districts with very little forest cover, even counting scrub and open forest, can be asked to declare green belts to increase their greenery.

Similarly, we can use big data to find the changes in forest area between the current and past years. This forest case study is just one example; the same approach can be applied to market rates, stock market predictions, real-time pricing of products and a country's economic growth factors.


About the Author

Saharsh Jain is a Big Data and programming enthusiast. He is pursuing a B.Tech in Information Technology Engineering at Maharaja Agrasen Institute of Technology. He loves to get his hands dirty with different programming languages.
