In this blog post, we will look at how to create a word cloud systematically with the help of Tableau. Before getting into the “how-to” part, one should understand what a word cloud is and when it is typically used.
The Wikipedia definition of a word cloud (a.k.a. tag cloud) states that a “word cloud is a visual representation for text data typically used to depict keyword metadata (tags) on websites, or to visualize free form text.” One can refer to the article (and various others on the Internet) to learn more about word clouds.
The image below shows a sample word cloud of the 100 most used passwords. One can easily see from the text sizes that “123456” is the most used password, followed by “password”, then “12345678”, and so on.
This article on bbc.com analyses Mr. Narendra Modi’s speech as a PM candidate and as PM. The image below, sourced from the same article, depicts Mr. Modi’s words as Prime Minister.
The data has been sourced from howstat.com and formatted appropriately for Tableau’s consumption. Preparing the data in this way is the first, most important and often time-consuming step before data visualization and exploration can happen. We have batting data for One Day International (ODI) matches played between 1971 and 2011, with close to 60,000 data points. The table below gives a quick overview of the important dimensions and measures present in the dataset.
|Player name||Score Rate (runs per 100 balls faced)|
As always, we will start with a question. Let us begin.
Who has scored more than 1000 runs against India?
Let us first conceptualize what we are trying to visualize and construct a series of steps to achieve the same.
We need to create a word cloud of Player names from various Countries who have scored 1000 or more Runs versus India.
Note: Player, Country, Runs and Versus correspond to dimensions or measures we already have in our data.
Step 1: Connect to Data
Step 2: Go to Worksheet
Step 3: Set up a filter. In our case, the filter is Versus = India
Step 4: Drag Player on to Label
Step 5: Drag Runs (by default Sum is chosen as aggregation method) on to Size
Step 6: Put a filter on Runs with the criterion "at least 1000"
Step 7: Set the Marks type to Text instead of Automatic. This is the key to creating a word cloud in any example that you build.
Step 8: Drag Country on to Color.
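For readers who want to check the numbers outside Tableau, the filtering and aggregation in Steps 3 to 6 can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the row layout (player, country, versus, runs), the function name word_cloud_marks and the sample values are hypothetical and are not taken from the howstat.com dataset. It also does not reproduce how Tableau draws the marks.

```python
from collections import defaultdict

# Hypothetical sample rows: (player, country, versus, runs).
# Names and numbers are made up for illustration; they are not from the dataset.
SAMPLE_ROWS = [
    ("Player A", "Country X", "India", 700),
    ("Player A", "Country X", "India", 450),
    ("Player A", "Country X", "Other Team", 900),
    ("Player B", "Country Y", "India", 999),
    ("Player C", "Country Y", "India", 1000),
]


def word_cloud_marks(rows, versus="India", min_runs=1000):
    """Return (player, country, total_runs) tuples, largest total first."""
    totals = defaultdict(int)
    country_of = {}
    for player, country, opponent, runs in rows:
        if opponent != versus:        # Step 3: filter Versus = India
            continue
        totals[player] += runs        # Step 5: SUM(Runs) per Player
        country_of[player] = country  # Step 8: Country drives the colour
    marks = [
        (player, country_of[player], total)
        for player, total in totals.items()
        if total >= min_runs          # Step 6: keep totals of at least 1000
    ]
    return sorted(marks, key=lambda mark: mark[2], reverse=True)


if __name__ == "__main__":
    for player, country, total in word_cloud_marks(SAMPLE_ROWS):
        print(player, country, total)
```

Each tuple returned corresponds to one word in the cloud: the player name is the label, the total runs set the size, and the country sets the colour.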
The word cloud is ready. One can observe that Sanath Jayasuriya has scored the most runs against India, followed by Inzamam and Ricky Ponting. In general, Sri Lankan, Australian and Pakistani batsmen have scored heavily against India. The reason is that these countries and India, four in all, have played the most ODI matches and have faced each other very frequently.
Surprisingly, none of the England batsmen feature in the visualization, while three Zimbabwean batsmen appear in the list.
Here is the count of matches played by these countries against India.
Using a word cloud for the above analysis is certainly not the right choice; a tree map or bar chart is a better fit, because one would still need to know how many runs were scored, or how many matches were played, by those players against India. The takeaway from this blog is how to create a word cloud with Tableau. The best scenario for using a word cloud is analysing textual data and the frequency with which words occur. That said, one should be cautious: a word cloud emphasizes the frequency of words, not necessarily their importance. In addition, it does not provide the context in which those words are used, so word clouds are best treated as a way to do some quick exploratory analysis of text.
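To make the point about frequency concrete, here is a minimal Python sketch of the kind of data a text word cloud is built from: a count per word and a size derived from that count. The function names, the simple tokenizer and the linear mapping to font sizes between 10 and 48 points are all assumptions made for illustration; this is not how Tableau or any particular word cloud tool computes sizes.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each word occurs, ignoring case and punctuation."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def font_sizes(freqs, min_pt=10, max_pt=48):
    """Map each word's count linearly onto a font size (assumed scaling)."""
    if not freqs:
        return {}
    lowest, highest = min(freqs.values()), max(freqs.values())
    span = (highest - lowest) or 1  # avoid dividing by zero when all counts match
    return {
        word: min_pt + (count - lowest) * (max_pt - min_pt) / span
        for word, count in freqs.items()
    }


if __name__ == "__main__":
    freqs = word_frequencies("Data data DATA cloud cloud word")
    for word, size in font_sizes(freqs).items():
        print(word, freqs[word], size)
```

Note that the size depends only on the count. Nothing in this data says whether a word matters or in what context it was used, which is exactly the limitation described above.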
Stay tuned for more exciting visualizations and learning with Tableau.
Tableau (NYSE: DATA), headquartered in Seattle, Washington, has a mission to help people see and understand data. It offers a product portfolio for data visualization focused on business intelligence.
One can visit the official Tableau website to find more details about Tableau and its product offering and features.
2015 © Edupristine. ALL Rights Reserved.