DATA SCIENCE ZING

Shiny Application to Analyse and Forecast Global Video Game Sales

12/14/2016

This analysis of global video game sales data was done as part of the discussions around public data sets on Kaggle.com. The data set can be downloaded from the Kaggle public data repository.

The data set provides sales revenue in USD for Global, North America, Europe, Japan and Other countries, for different video game publishers. It also contains the Platform, Year of sale and Genre of each video game sold.
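The R code in the following sections refers to a data frame named vgsales without showing how it is created. A minimal sketch of the loading step, assuming the Kaggle download is a CSV file: the file name vgsales.csv and the regional column names (NA_Sales, EU_Sales, JP_Sales, Other_Sales) are assumptions, and a tiny made-up sample stands in for the real file so the snippet runs on its own.

```r
# Made-up rows standing in for the downloaded file (not real sales figures)
sample_csv <- "Name,Platform,Year,Genre,Publisher,NA_Sales,EU_Sales,JP_Sales,Other_Sales,Global_Sales
Game A,PS2,2004,Action,Publisher X,1.2,0.8,0.1,0.2,2.3
Game B,Wii,2008,Sports,Publisher Y,2.0,1.5,0.4,0.3,4.2
Game C,PS2,N/A,Action,Publisher X,0.3,0.2,0.0,0.1,0.6"

# With the real download this would be something like:
#   vgsales <- read.csv("vgsales.csv", stringsAsFactors = FALSE)  # assumed file name
vgsales <- read.csv(text = sample_csv, stringsAsFactors = FALSE)

# Inspect the columns used later: Publisher, Platform, Year, Genre, Global_Sales
str(vgsales)
table(vgsales$Publisher)
```

Because some rows carry a non-numeric year such as "N/A" in this sample, Year is read as text rather than as a number, which is why the later snippets treat it as a category.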


Shiny application: VG SALES
GitHub source code: VG Sales Github

Analysis

The analysis is divided into two sections:

1) Analysis of the data in general
2) Analysis of the data for each publisher

General Data Analysis

1. Top Publishers
Top publishers are identified by building a frequency table (pivot table) of the number of titles per publisher, sorting it in descending order and keeping the top 20.

R Code:

    library(ggplot2)
    # Count titles per publisher and keep the 20 most frequent
    sales_publisher <- as.data.frame(table(vgsales$Publisher))
    colnames(sales_publisher) <- c("publisher", "numbers")
    sales_publisher <- sales_publisher[order(-sales_publisher$numbers), ]
    top_20_sales_publisher <- head(sales_publisher, n = 20)
    ggplot(top_20_sales_publisher, aes(x = reorder(publisher, numbers), y = numbers)) +
      geom_bar(stat = "identity", fill = "orange") + theme_minimal() + coord_flip() +
      geom_text(aes(label = numbers), vjust = 0.5, color = "black", size = 4.0) +
      ylab("Total Number of Sales") + xlab("Publisher") + ggtitle("Top Selling Publishers")


2. Video Game Releases per Year
The number of video games released per year is found by building a frequency table of titles by year and drawing it as a bar plot.

R Code:

    sales_year <- as.data.frame(table(vgsales$Year))
    colnames(sales_year) <- c("Year", "Numbers")
    # Drop the last row, presumably the non-numeric "N/A" year level
    sales_year <- sales_year[-nrow(sales_year), ]
    ggplot(sales_year, aes(x = Year, y = Numbers)) +
      geom_bar(stat = "identity", fill = "lightgreen") + theme(axis.text = element_text(size = 8)) +
      geom_text(aes(label = Numbers), vjust = 0.5, color = "black", size = 4.0) +
      ylab("Total Number of Sales") + xlab("Year") + ggtitle("Video Game Sales by Year")


3. Video Game Revenue per Year
Total video game sales revenue per year is calculated by summing global sales within each year.

R Code:

    sales_year_revenue <- as.data.frame(aggregate(vgsales$Global_Sales, by = list(Year = vgsales$Year), FUN = sum))
    colnames(sales_year_revenue) <- c("Year", "Sales")
    # Drop the last row, presumably the non-numeric "N/A" year level
    sales_year_revenue <- sales_year_revenue[-nrow(sales_year_revenue), ]
    ggplot(sales_year_revenue, aes(x = Year, y = Sales)) +
      geom_bar(stat = "identity", fill = "magenta") + theme(axis.text = element_text(size = 8)) +
      geom_text(aes(label = Sales), vjust = 0.5, color = "black", size = 4.0) +
      ylab("Total Sales Revenue") + xlab("Year") + ggtitle("Video Game Sales Revenue by Year")


4. Top Selling Platforms
Top selling platforms are identified by building a frequency table of titles per gaming platform, sorting it in descending order and keeping the top 20.

R Code:

    # Count titles per platform and keep the 20 most frequent
    sales_platform <- as.data.frame(table(vgsales$Platform))
    colnames(sales_platform) <- c("platform", "Numbers")
    sales_platform <- sales_platform[order(-sales_platform$Numbers), ]
    top_20_sales_platform <- head(sales_platform, n = 20)
    ggplot(top_20_sales_platform, aes(x = reorder(platform, Numbers), y = Numbers)) +
      geom_bar(stat = "identity", fill = "steelblue") + theme_minimal() + coord_flip() +
      geom_text(aes(label = Numbers), vjust = 0.5, color = "black", size = 4.0) +
      ylab("Total Number of Sales") + xlab("Platform") + ggtitle("Top Selling Video Game Platforms")

    
Analysis by Publisher

In the Shiny dashboard, the data is subset to the publisher whose name is selected from the drop-down menu. The following analyses run on that subset, vgsales_publisher.
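The filtering step itself is not shown in the post. A minimal sketch of how such a subset could be built in base R: the helper name filter_by_publisher and the toy data are hypothetical, and in a Shiny app the publisher name would presumably come from a drop-down input inside a reactive expression.

```r
# Hypothetical helper: keep one publisher's rows, highest global sales first
filter_by_publisher <- function(data, publisher) {
  rows <- data[data$Publisher == publisher, , drop = FALSE]
  rows[order(-rows$Global_Sales), , drop = FALSE]
}

# Made-up rows for illustration only
toy_sales <- data.frame(
  Name = c("Game A", "Game B", "Game C", "Game D"),
  Publisher = c("Publisher X", "Publisher Y", "Publisher X", "Publisher X"),
  Global_Sales = c(2.3, 4.2, 0.6, 3.1),
  stringsAsFactors = FALSE
)

vgsales_publisher <- filter_by_publisher(toy_sales, "Publisher X")
print(vgsales_publisher$Name)
```

Sorting inside the helper means that taking the first 20 rows of the result yields that publisher's top sellers regardless of how the input was ordered.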
1. Top Selling Games
The top selling games of a publisher are identified by sorting the games by total sales and keeping the top 20. The analysis is repeated for the individual regions and for global sales.

R Code:

    # Assumes vgsales_publisher is already sorted by Global_Sales, highest first
    ggplot(head(vgsales_publisher, n = 20), aes(x = reorder(Name, Global_Sales), y = Global_Sales)) +
      geom_bar(stat = "identity", fill = "steelblue") + theme_minimal() + coord_flip() +
      geom_text(aes(label = Global_Sales), vjust = 0.5, color = "black", size = 4.0) +
      ylab("Global Sales in Millions of Dollars") + xlab("Video Game") + ggtitle("Top Global Selling Games")

2. Top Selling Platforms
Sales by platform are found by summing sales revenue within each platform and drawing a pie chart to understand the distribution. The analysis is then repeated for the individual regions.

R Code:

    library(googleVis)
    sales_platform_global <- as.data.frame(aggregate(vgsales_publisher$Global_Sales, by = list(Platform = vgsales_publisher$Platform), FUN = sum))
    colnames(sales_platform_global) <- c("platform", "total_Sales")
    # labelvar must match the renamed column ("platform", not "Platform")
    Pie1 <- gvisPieChart(sales_platform_global, labelvar = "platform", numvar = "total_Sales",
                         options = list(title = "Global Sales by Platform", width = 1000, height = 500))


       
3. Top Selling Genre
Sales by genre are found by summing sales revenue within each genre and drawing a pie chart to understand the distribution. The analysis is then repeated for the individual regions.

R Code:

    sales_genre_global <- as.data.frame(aggregate(vgsales_publisher$Global_Sales, by = list(Genre = vgsales_publisher$Genre), FUN = sum))
    colnames(sales_genre_global) <- c("genre", "total_Sales")
    # labelvar must match the renamed column ("genre", not "Genre")
    Pie1 <- gvisPieChart(sales_genre_global, labelvar = "genre", numvar = "total_Sales",
                         options = list(title = "Global Sales by Genre", width = 1000, height = 500))

     
    
4. Sales by Year
Sales by year are calculated by summing sales within each year. The result is shown as a line chart, and the analysis is repeated for the individual regions.

R Code:

    sales_year_global <- as.data.frame(aggregate(vgsales_publisher$Global_Sales, by = list(Year = vgsales_publisher$Year), FUN = sum))
    colnames(sales_year_global) <- c("Year", "total_sales")
    # Drops the last row, intended to remove the "N/A" year; note that this
    # removes a real year if the selected publisher has no "N/A" entries
    sales_year_global <- sales_year_global[-nrow(sales_year_global), ]
    line1 <- gvisLineChart(sales_year_global, options = list(title = "Global Sales by Year", width = 1000, height = 500))
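Each publisher-level view is described as being repeated for different regions. One possible way to do that without copying the aggregation code is a small helper that takes the sales column as a parameter. This is only a sketch: the helper name sales_by_year, the regional column names NA_Sales and EU_Sales, and the toy data are assumptions, not taken from the post.

```r
# Hypothetical helper: sum one sales column per year
sales_by_year <- function(data, sales_col) {
  out <- aggregate(data[[sales_col]], by = list(Year = data$Year), FUN = sum)
  colnames(out) <- c("Year", "total_sales")
  out
}

# Made-up rows standing in for one publisher's subset
toy_publisher <- data.frame(
  Year = c("2005", "2005", "2006", "2006"),
  NA_Sales = c(1.0, 2.0, 0.5, 1.5),      # assumed column name
  EU_Sales = c(0.5, 0.5, 1.0, 1.0),      # assumed column name
  Global_Sales = c(2.0, 3.0, 2.0, 3.0),
  stringsAsFactors = FALSE
)

# One yearly table per region, stored in a named list
regions <- c("NA_Sales", "EU_Sales", "Global_Sales")
by_region <- lapply(setNames(regions, regions), function(col) {
  sales_by_year(toy_publisher, col)
})

print(by_region$NA_Sales)
```

Each element of by_region has the same two columns, so the same charting call can be applied to every region in turn.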




Forecasting

Yearly video game sales for each publisher are treated as a time series and forecast with the following two models:

1) ARIMA model
2) ETS model

The code and step-by-step procedure for building the models follow the blogs You Canalytics, Analytics Vidhya and Dataiku.
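The post does not show its forecasting code. A minimal sketch of how the two model types might be fitted, assuming the forecast package (auto.arima(), ets() and forecast()): the automatic model selection, the five-year horizon and the made-up yearly series are illustrative assumptions, not the author's settings.

```r
library(forecast)  # assumed package; provides auto.arima(), ets(), forecast()

# Made-up yearly sales totals for one publisher (illustrative values only)
yearly_sales <- c(12.1, 15.4, 18.9, 22.3, 27.8, 31.2, 35.6, 40.1,
                  38.7, 36.2, 33.5, 30.9, 28.4, 25.1, 22.7, 20.3)
sales_ts <- ts(yearly_sales, start = 2000, frequency = 1)

# Fit both model types with automatic order / component selection
arima_fit <- auto.arima(sales_ts)
ets_fit <- ets(sales_ts)

# Forecast the next five years with each model
horizon <- 5
arima_fc <- forecast(arima_fit, h = horizon)
ets_fc <- forecast(ets_fit, h = horizon)

print(arima_fc$mean)
print(ets_fc$mean)
```

Calling plot(arima_fc) or plot(ets_fc) draws the point forecasts with their prediction intervals, which makes it easy to compare the two models side by side.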

​



