R Programming Series: Data Wrangling and Visualization

Data wrangling & visualization

Previously, we explored 3D visualization in R using the plot3d package. Now we turn to data wrangling and data visualization in R.

Data wrangling, also referred to as data munging, is the process of cleaning and unifying messy, complex data sets so that they are easy to access and analyze. As the volume of data and the number of data sources grow, it becomes more important that the large amounts of available data are organized for analysis.

This process typically includes manual conversions, such as mapping data from one raw form into another format that is more convenient to consume and organize. In this article, we work with a data set that holds the attributes of athletes, so that we can analyze the ranks and medal holdings of players by country.
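As a small illustration of such a conversion, the sketch below turns raw text columns into typed columns using base R. The data frame `raw`, its column names, and its values are hypothetical and are not taken from the athletes data set.

```r
# Hypothetical raw records: every column arrives as text.
raw <- data.frame(
  name   = c("A. Runner", "B. Swimmer", "C. Jumper"),
  height = c("180", "175", "n/a"),
  medal  = c("Gold", "", "Bronze"),
  stringsAsFactors = FALSE
)

# Map placeholder strings to NA so R treats them as missing.
raw$height[raw$height == "n/a"] <- NA
raw$medal[raw$medal == ""] <- NA

# Convert each column to a type that is convenient for analysis.
clean <- data.frame(
  name   = raw$name,
  height = as.numeric(raw$height),
  medal  = factor(raw$medal, levels = c("Bronze", "Silver", "Gold")),
  stringsAsFactors = FALSE
)

str(clean)
```

Mapping the placeholders to NA before converting keeps `as.numeric()` from warning about text it cannot parse.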

Step 1

Load the libraries needed for data wrangling and visualization in R.
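The exact list of libraries is not shown here, so the package names in the sketch below (dplyr, tidyr, and ggplot2) are an assumption based on what is commonly used for this kind of work. The code only attaches the packages that are already installed and reports the rest.

```r
# Assumed package set; the article does not name the libraries it loads.
pkgs <- c("dplyr", "tidyr", "ggplot2")

# Check which of the packages are installed.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
if (any(!installed)) {
  message("Not installed: ", paste(pkgs[!installed], collapse = ", "))
  # install.packages(pkgs[!installed])  # uncomment to install them
}

# Attach the packages that are available.
for (p in pkgs[installed]) {
  suppressPackageStartupMessages(library(p, character.only = TRUE))
}
```

The remaining sketches in this article use base R only, so they run even if none of these packages are installed.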

Step 2

Understand the attributes of the data frame by inspecting its structure and dimensions. It is important to decide whether any exploratory data analysis is needed, since it forms part of the data wrangling procedure.

Figure 1: Dataset of athletes
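The inspection in this step can be sketched with a few base R calls. The `athletes` data frame below is a hypothetical stand-in with made-up rows and assumed column names; the real data set will have different columns and many more records.

```r
# Hypothetical stand-in for the athletes data frame (made-up rows).
athletes <- data.frame(
  Name  = c("A", "B", "C", "D"),
  Age   = c(24, NA, 31, 27),
  Team  = c("China", "United States", "China", "France"),
  Year  = c(2008, 2012, 2016, 2016),
  Medal = c("Gold", NA, "Silver", NA),
  stringsAsFactors = FALSE
)

dim(athletes)      # number of rows and columns
str(athletes)      # column types and a preview of the values
summary(athletes)  # per-column summaries, with NA counts for numeric columns
head(athletes, 3)  # first few rows
```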

Step 3

We can observe that there are some missing values that need to be treated before starting data analysis and visualization in R.
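One way to find and treat missing values is sketched below, again on a hypothetical `athletes` data frame. The two treatment rules, reading a missing medal as "None" and filling a missing age with the median, are assumptions chosen for illustration; the right treatment depends on the data.

```r
# Hypothetical stand-in for the athletes data frame (made-up rows).
athletes <- data.frame(
  Name  = c("A", "B", "C", "D"),
  Age   = c(24, NA, 31, 27),
  Team  = c("China", "United States", "China", "France"),
  Medal = c("Gold", NA, "Silver", NA),
  stringsAsFactors = FALSE
)

# Count the missing values in each column.
na_counts <- colSums(is.na(athletes))
print(na_counts)

# Assumption: an NA in Medal means the athlete did not win a medal.
athletes$Medal[is.na(athletes$Medal)] <- "None"

# Assumption: fill missing ages with the median of the observed ages.
athletes$Age[is.na(athletes$Age)] <- median(athletes$Age, na.rm = TRUE)

# Confirm that no missing values remain.
anyNA(athletes)
```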

Step 4

Once the missing values are treated, we can move on to analysis and visualization. Let us focus on the athletes of China who succeeded in winning medals for their country.

Figure 2: Medal tally bar graph

The bar graph shows that the medal count is higher for the United States than for the other countries in the list.
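A medal tally and its bar graph can be sketched with base R as follows. The `medals` data frame and its counts are made up for illustration and do not reproduce the figures from the real data set.

```r
# Hypothetical medal records, one row per medal won (made-up data).
medals <- data.frame(
  Team  = c("United States", "China", "United States", "France",
            "China", "United States"),
  Medal = c("Gold", "Silver", "Bronze", "Gold", "Gold", "Silver"),
  stringsAsFactors = FALSE
)

# Count medals per country and sort from highest to lowest.
tally <- sort(table(medals$Team), decreasing = TRUE)

# Draw the bar graph with base graphics.
barplot(tally, main = "Medal tally by country (toy data)",
        xlab = "Country", ylab = "Medals", col = "steelblue")

# Restrict the records to a single country, for example China.
china <- medals[medals$Team == "China", ]
```

Sorting the table before plotting puts the tallest bar first, which makes the comparison between countries easier to read.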

Step 5

Next, let us look at the medal tally of each country for every year, as described below:

Time series analysis

In this step, we summarize the medal tally for every year and break it down by country. The plot shows the resulting time series: a cyclic pattern is noticeable from 1994 to 2016, while the pattern from 1896 to 1992 is uncertain.
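The yearly summary can be sketched with `aggregate()` and plotted with one line per country. The `medals` data frame below is made up and covers only three years, so it illustrates the mechanics rather than the patterns described above.

```r
# Hypothetical medal records, one row per medal won (made-up data).
medals <- data.frame(
  Year = c(2008, 2008, 2008, 2012, 2012, 2016, 2016, 2016),
  Team = c("China", "United States", "United States", "China",
           "United States", "China", "China", "United States"),
  stringsAsFactors = FALSE
)

# Each row is one medal, so summing n per (Year, Team) gives the yearly tally.
medals$n <- 1
yearly <- aggregate(n ~ Year + Team, data = medals, FUN = sum)

# Reshape to a Year x Team matrix so each country becomes one line.
wide <- unclass(xtabs(n ~ Year + Team, data = yearly))

# Plot the medal tally over time, one line per country.
matplot(as.numeric(rownames(wide)), wide, type = "b", pch = 1, lty = 1,
        col = seq_len(ncol(wide)),
        xlab = "Year", ylab = "Medals",
        main = "Medal tally per year (toy data)")
legend("topleft", legend = colnames(wide),
       col = seq_len(ncol(wide)), lty = 1, pch = 1)
```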

So, this was all about Data Wrangling and Visualization in R programming!

In the next article, we will learn about Exploratory Data Analysis using R programming language.

