In this article we visualize the history of air crashes that occurred between 1993 and March 2015. The data have been sourced from here. The data exclude small planes carrying fewer than six passengers and non-commercial aircraft such as cargo, military and private planes.
The source data is an MS Excel file containing a handful of columns, such as the date of the crash, the country in which it happened, the airline, the fatality count, the cause of the crash and the phase of flight during which it occurred. Let us look at the dimensions and measures that need to be understood in order to create visualizations from this dataset.
The table below does not list every column in the data source, only those that are meaningful for creating visualizations.
Data Exploration & Visualization
Step 1 – Connect to the data.
This is a preview of the data source we have connected to. As can be seen, all the dimensions and measures described in the section above are available in the data source.
Ignore the Tableau data interpreter warning.
Step 2 – Go to Sheet 1 and analyse/review the loaded data.
Step 3 – Fatality Count on Map
From this step onwards we will slice and dice the data, creating different views and visualizations to understand it better and to derive insights from it.
Double-click Country, then put the sum of Fatality count on Colour and Label. Choose whichever colour palette you like, and use the palette's advanced settings to adjust it to your needs. Here is how the visualization looks: the majority of fatalities have happened in the USA, followed by Russia, Iran, India and China.
Some locations need to be corrected manually, as shown below. Others cannot be corrected for lack of detail, so I have left them as Unknown.
You can rename the sheet to Fatality by Country.
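For readers who want to sanity-check the map outside Tableau, the aggregation in this step can be sketched in a few lines of Python. This is only an illustration: the sample rows, the `corrections` mapping and the function name `fatalities_by_country` are all made up for the example and are not taken from the real dataset.

```python
from collections import defaultdict

# Hypothetical sample rows: (country, fatality count). Not real data.
rows = [
    ("USA", 120),
    ("Russia", 80),
    ("U.S.A.", 30),   # a spelling variant that needs a manual correction
    ("India", 50),
    ("", 10),         # location missing in the source
]

# Hypothetical corrections, standing in for the manual location fixes.
corrections = {"U.S.A.": "USA", "": "Unknown"}


def fatalities_by_country(rows, corrections):
    """Sum fatalities per country after applying the manual corrections."""
    totals = defaultdict(int)
    for country, fatalities in rows:
        totals[corrections.get(country, country)] += fatalities
    # Sort descending so the worst-affected countries come first.
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


totals = fatalities_by_country(rows, corrections)
print(totals)
```

Rows that cannot be corrected are kept under Unknown rather than dropped, which mirrors the choice made in the sheet.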
Step 4 – Fatality by Airline
Another interesting question is which airlines have seen the most fatalities. We will create a tree map for this analysis and filter it to airlines that have seen more than 100 fatalities in total.
As can be seen, this analysis is in line with the one in the previous step. American Airlines and United Airlines both operate out of the USA, which has seen the highest number of fatalities. It will be interesting to see the causes of the fatalities in Russia; my guess is that most are due to en route issues.
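The filter used for the tree map (total fatalities per airline, keeping only totals above 100) can be sketched as follows. The airline names, the numbers and the helper `airlines_over_threshold` are hypothetical and serve only to show the logic, not to reproduce Tableau's filter.

```python
from collections import Counter

# Hypothetical records: (airline, fatalities in one crash). Not real data.
records = [
    ("Airline A", 90),
    ("Airline B", 40),
    ("Airline A", 60),
    ("Airline C", 100),
    ("Airline B", 20),
]


def airlines_over_threshold(records, threshold=100):
    """Total fatalities per airline, keeping totals strictly above threshold."""
    totals = Counter()
    for airline, fatalities in records:
        totals[airline] += fatalities
    return {airline: total for airline, total in totals.items() if total > threshold}


result = airlines_over_threshold(records)
print(result)
```

Note that the filter is applied to the summed total and not to individual crashes, so an airline with several smaller accidents can still pass it. An airline with exactly 100 fatalities is excluded, because the condition is "more than 100".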
Step 5 – Fatality trend over time
To create this line chart, put Date on Columns and the count of Fatality on Rows. As can be seen, fatalities have decreased over time, the one exception being 2001, the year of the 9/11 attacks. This suggests that we are learning from past mistakes and working hard to avoid fatalities. It will be interesting to compare the causes in 1993 with those in recent years.
Step 6 – Fatality Cause trend over time
To create this visualization, put Date on Columns, and Cause and the count of Fatality on Rows. As expected, fatalities due to human error and mechanical failure have generally decreased over time.
Step 7 – Fatality by phase
To create this visualization, put Phase on Columns and the count of Fatality on Rows.
You will see many different phase labels that mean the same thing; for example, take off, takeoff, initial takeoff and initial_climb are all the same, so I have grouped them together as Take off. The groups are shown below.
Now use this group in place of Phase to create the visual. As can be seen, 48% of fatalities happened during landing and another 36% while the flight was en route. It will be interesting to see which causes drive the fatalities in these phases.
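The grouping of phase labels described in this step is essentially a lookup table. A minimal Python sketch is given below; the four take-off variants come from the text above, while the sample list `phases`, the dictionary name `PHASE_GROUPS` and the helper `group_phase` are assumptions made for illustration.

```python
from collections import Counter

# Raw labels that all mean "Take off" (the four variants named in the article).
PHASE_GROUPS = {
    "take off": "Take off",
    "takeoff": "Take off",
    "initial takeoff": "Take off",
    "initial_climb": "Take off",
}


def group_phase(raw):
    """Map a raw phase label to its group; unknown labels pass through."""
    key = raw.strip().lower()
    return PHASE_GROUPS.get(key, raw.strip())


# Hypothetical sample of raw phase values, one per crash record.
phases = ["takeoff", "Take off", "initial_climb", "landing", "en route", "landing"]

counts = Counter(group_phase(p) for p in phases)
print(counts)
```

Labels that are not in the lookup are left unchanged, so any variant that was missed remains visible in the result instead of silently disappearing.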
Step 8 – Cause vs. Phase analysis
To create this visualization, put Cause on Columns and the Phase group created in the previous step on Rows. Put the count of Fatality on Size and also on Label (as a quick table calculation, percent of total).
As can be seen below, human error is the major cause of fatalities during the landing phase, followed by weather. Human error is also the main cause of fatalities while the flight is en route, and the same goes for fatalities during take off.
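To make the percent-of-total label concrete, here is a small sketch of the same arithmetic: each cause and phase cell is divided by the grand total of all records. The sample records and the function name `percent_of_total` are hypothetical; this shows the calculation only, not how Tableau implements it.

```python
from collections import Counter

# Hypothetical records: (cause, phase group), one per crash. Not real data.
crashes = [
    ("Human error", "Landing"),
    ("Human error", "Landing"),
    ("Weather", "Landing"),
    ("Human error", "En route"),
    ("Mechanical", "Take off"),
]


def percent_of_total(pairs):
    """Share of each (cause, phase) cell as a percentage of all records."""
    counts = Counter(pairs)
    total = sum(counts.values())
    return {cell: 100 * n / total for cell, n in counts.items()}


shares = percent_of_total(crashes)
for (cause, phase), pct in sorted(shares.items()):
    print(f"{cause:12s} {phase:9s} {pct:5.1f}%")
```

Because every cell is divided by the same grand total, the percentages across the whole grid add up to 100.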
Step 9 – Cause of fatality by year
Another interesting visualization is the percentage of fatalities by cause for each year. For this, put Date on Rows, the count of Fatality on Columns and Cause on Colour. The count of Fatality should be a table calculation computed using Table (across), so that the percentages are worked out within each year.
As can be seen, human error is the leading cause of fatalities in every year.
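The difference from the previous step is that the percentage here is worked out within each year instead of over the whole table. A rough Python equivalent is sketched below, using invented sample records and a hypothetical helper named `cause_share_by_year`.

```python
from collections import Counter, defaultdict

# Hypothetical records: (year, cause), one per crash. Not real data.
crashes = [
    (1993, "Human error"),
    (1993, "Human error"),
    (1993, "Weather"),
    (1993, "Mechanical"),
    (2014, "Human error"),
    (2014, "Weather"),
]


def cause_share_by_year(records):
    """For each year, the percentage of that year's records per cause."""
    per_year = defaultdict(Counter)
    for year, cause in records:
        per_year[year][cause] += 1

    shares = {}
    for year, causes in per_year.items():
        year_total = sum(causes.values())  # the denominator is per year
        shares[year] = {c: 100 * n / year_total for c, n in causes.items()}
    return shares


by_year = cause_share_by_year(crashes)
for year in sorted(by_year):
    print(year, by_year[year])
```

Each year's shares add up to 100, which is what makes years with very different crash counts comparable on one chart.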
That is it for this time; stay tuned for more learning with Tableau.
Visit the official Tableau website for more details about Tableau, its product offerings and its features.