DATA SCIENCE ZING

How to create histograms in R using simple, ggplot, googleVis methods

8/1/2017


Get the temperature data from the airquality dataset

library(datasets)
hist(airquality$Temp,
     main = "Temperature variation in 5 Months",
     xlab = "Temperature in Fahrenheit",
     right = FALSE)


Simple Histogram with colors

hist(airquality$Temp,
     main = "Temperature variation in 5 Months",
     xlab = "Temperature in Fahrenheit",
     right = FALSE, breaks = 12, col = "blue")


colors <- c("red", "yellow", "green", "violet", "orange", "blue", "pink", "cyan")
hist(airquality$Temp,
     main = "Temperature variation in 5 Months",
     xlab = "Temperature in Fahrenheit",
     right = FALSE, breaks = 12, col = colors)


Using ggplot2

library(ggplot2)
hist_ggplot <- ggplot(airquality, aes(x = Temp)) +
  geom_histogram(binwidth = 5, fill = "brown") +
  ggtitle("Temperature variation in 5 Months") +
  xlab("Temperature in Fahrenheit")
plot(hist_ggplot)


Using googleVis

library(googleVis)
op <- options(gvis.plot.tag = 'chart')
hist_gvis <- gvisHistogram(data = airquality["Temp"],
                           options = list(title = "Temperature variation in 5 Months",
                                          width = 900, height = 700))
plot(hist_gvis)
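Note that op <- options(gvis.plot.tag = 'chart') makes plot() return the chart's HTML tag, which is what you want when embedding the chart in a knitr document; in an interactive session it can be skipped. The previous behaviour can be restored afterwards with:

options(op)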

How to create pie-charts in R using simple, 3D, ggplot2 and googleVis methods

6/9/2017


Create a dataset - Sales of different regions

region <- c("US", "Europe", "Japan", "China", "Others")
sales <- c(25000, 12000, 10000, 5000, 2000)
region_sales <- data.frame(region, sales)

Simple Pie Chart

pie(sales, labels = region, main = "Sales per region")


Simple Pie Chart with Percentage Labels

pct <- round(sales / sum(sales) * 100)
label <- paste(region, pct)
label <- paste(label, "%", sep = "")
pie(sales, labels = label, col = rainbow(length(label)), main = "Sales per region")


3D Pie Chart

library(plotrix)
pie3D(sales, labels = region, main = "3D Pie chart of Sales per region")


Using ggplot2

library(ggplot2)
bar <- ggplot(region_sales, aes(x = "", y = sales, fill = region)) +
  geom_bar(width = 1, stat = "identity")
bar


pie <- bar + coord_polar("y", start = 0) +
  scale_fill_brewer(palette = "Dark2") + theme_minimal()
pie


Using ggplot2 - Percentage Annotation

ggplot(transform(transform(region_sales, sales=sales/sum(sales)), labPos=cumsum(sales)-sales/2), 
       aes(x="", y = sales, fill = region)) +
  geom_bar(width = 1, stat = "identity") +
  scale_fill_manual(values = c("red", "yellow","blue", "green", "cyan")) +
  coord_polar(theta = "y") +
  labs(title = "Percentage Sales per Region") + 
  geom_text(aes(y=labPos, label=scales::percent(sales)))


Using googleVis

library(googleVis)
op <- options(gvis.plot.tag = "chart")
pie <- gvisPieChart(region_sales, options = list(title = "Sales per region", 
    width = 1000, height = 500))
plot(pie)


Shiny Application to Analyse and Forecast Global Video Game Sales

12/14/2016

This analysis of global video game sales data was done as part of various discussions on public datasets on Kaggle.com. The dataset can be downloaded from the Kaggle public data repository.

The dataset provides Global, North America, Europe, Japan and Other-country sales revenue in USD for different video game publishers. It also contains the Platform, Year of sale and Genre of each video game sold.


Click here to view the Shiny application: VG SALES

Download the GitHub source code: VG Sales Github

Analysis

The analysis can be divided into two sections:

1) General analysis of the data
2) Analysis of the data for each publisher

General Data Analysis
  1. Top Publishers
  Top publishers can be identified by building a frequency table of publishers and sorting it to pick the top 20.

The R code is:

sales_publisher <- as.data.frame(table(vgsales$Publisher))
colnames(sales_publisher) <- c("publisher", "numbers")
sales_publisher <- sales_publisher[order(-sales_publisher$numbers), ]
top_20_sales_publisher <- head(sales_publisher, n = 20)
ggplot(top_20_sales_publisher, aes(x = reorder(publisher, numbers), y = numbers)) +
  geom_bar(stat = "identity", fill = "orange") +
  theme_minimal() + coord_flip() +
  geom_text(aes(label = numbers), vjust = 0.5, color = "black", size = 4.0) +
  ylab("Total Number of Sales") + xlab("Publisher") +
  ggtitle("Top Selling Publishers")


  2. Video Game Releases per Year
  The number of video game releases per year is found by building a frequency table of the Year column and plotting it as a bar chart.

R Code:
sales_year <- as.data.frame(table(vgsales$Year))
colnames(sales_year) <- c("Year", "Numbers")
sales_year <- sales_year[-nrow(sales_year), ]
ggplot(sales_year, aes(x = Year, y = Numbers)) +
  geom_bar(stat = "identity", fill = "lightgreen") +
  theme(axis.text = element_text(size = 8)) +
  geom_text(aes(label = Numbers), vjust = 0.5, color = "black", size = 4.0) +
  ylab("Total Number of Sales") + xlab("Year") +
  ggtitle("Video Game Sales by Year")


  3. Video Game Revenue per Year
  Total revenue of video game sales in a year is calculated by aggregating global video game sales by year.

R Code:
sales_year_revenue <- as.data.frame(aggregate(vgsales$Global_Sales,
                                              by = list(Year = vgsales$Year), FUN = sum))
colnames(sales_year_revenue) <- c("Year", "Sales")
sales_year_revenue <- sales_year_revenue[-nrow(sales_year_revenue), ]
ggplot(sales_year_revenue, aes(x = Year, y = Sales)) +
  geom_bar(stat = "identity", fill = "magenta") +
  theme(axis.text = element_text(size = 8)) +
  geom_text(aes(label = Sales), vjust = 0.5, color = "black", size = 4.0) +
  ylab("Total Sales Revenue") + xlab("Year") +
  ggtitle("Video Game Sales Revenue by Year")


  4. Top Selling Platforms
  Top selling platforms are identified by building a frequency table of gaming platforms and sorting it in descending order to find the top 20.

R Code: 
sales_platform <- as.data.frame(table(vgsales$Platform))
colnames(sales_platform) <- c("platform", "Numbers")
sales_platform <- sales_platform[order(-sales_platform$Numbers), ]
top_20_sales_platform <- head(sales_platform, n = 20)
ggplot(top_20_sales_platform, aes(x = reorder(platform, Numbers), y = Numbers)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  theme_minimal() + coord_flip() +
  geom_text(aes(label = Numbers), vjust = 0.5, color = "black", size = 4.0) +
  ylab("Total Number of Sales") + xlab("Platform") +
  ggtitle("Top Selling Video Game Platforms")

    
Analysis by Publisher

The data is filtered by publisher in the Shiny dashboard and subset on the publisher name selected from the drop-down menu, as in the sketch below.
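A minimal sketch of the filtering step (the input id publisher is an assumption; the actual app may use a different id):

# Subset rows for the publisher chosen in the drop-down (input id assumed)
vgsales_publisher <- subset(vgsales, Publisher == input$publisher)
# Sort by global sales so head(..., n = 20) below returns the top sellers
vgsales_publisher <- vgsales_publisher[order(-vgsales_publisher$Global_Sales), ]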
  1. Top Selling Games
  The top selling games of a publisher are identified by sorting on total sales and taking the top 20. This analysis is repeated for each region and globally.

R Code:
ggplot(head(vgsales_publisher, n = 20), aes(x = reorder(Name, Global_Sales), y = Global_Sales)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  theme_minimal() + coord_flip() +
  geom_text(aes(label = Global_Sales), vjust = 0.5, color = "black", size = 4.0) +
  ylab("Global Sales in Millions of Dollars") + xlab("Video Game") +
  ggtitle("Top Global Selling Games")

  2. Top Selling Platforms
  Sales by platform are found by aggregating sales revenue per platform and drawing pie charts to understand the distribution. This is then repeated for each region.

R Code:
sales_platform_global <- as.data.frame(aggregate(vgsales_publisher$Global_Sales,
                                                 by = list(Platform = vgsales_publisher$Platform),
                                                 FUN = sum))
colnames(sales_platform_global) <- c("Platform", "Total_Sales")
Pie1 <- gvisPieChart(sales_platform_global, labelvar = "Platform",
                     options = list(title = "Global Sales by Platform", width = 1000, height = 500))


       
  3. Top Selling Genre
  Sales by genre are found by aggregating sales revenue per genre and drawing pie charts to understand the distribution. This is then repeated for each region.

R Code:

sales_genre_global <- as.data.frame(aggregate(vgsales_publisher$Global_Sales,
                                              by = list(Genre = vgsales_publisher$Genre),
                                              FUN = sum))
colnames(sales_genre_global) <- c("Genre", "Total_Sales")
Pie1 <- gvisPieChart(sales_genre_global, labelvar = "Genre",
                     options = list(title = "Global Sales by Genre", width = 1000, height = 500))

     
    
  4. Sales by Year
  Sales by year are calculated by aggregating sales with respect to every year. The result is plotted as a line chart and repeated for each region.

R Code:
sales_year_global <- as.data.frame(aggregate(vgsales_publisher$Global_Sales,
                                             by = list(Year = vgsales_publisher$Year),
                                             FUN = sum))
colnames(sales_year_global) <- c("Year", "total_sales")
sales_year_global <- sales_year_global[-nrow(sales_year_global), ]
line1 <- gvisLineChart(sales_year_global,
                       options = list(title = "Global Sales by Year", width = 1000, height = 500))




Forecasting

Forecasting of the yearly video game sales time series for each publisher is based on the following two models:

1) ARIMA model
2) ETS model

The code and the step-by-step procedure for building the models follow the blogs You Canalytics, Analytics Vidhya and Dataiku; a minimal sketch is given below.
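A minimal sketch of both models using the forecast package (the series construction, start year and forecast horizon are illustrative assumptions):

library(forecast)

# Yearly series of a publisher's global sales (start year assumed for illustration)
sales_ts <- ts(sales_year_global$total_sales, start = 1980, frequency = 1)

# ARIMA model: auto.arima() selects the order automatically
fit_arima <- auto.arima(sales_ts)
fc_arima <- forecast(fit_arima, h = 5)  # forecast 5 years ahead

# ETS (exponential smoothing) model
fit_ets <- ets(sales_ts)
fc_ets <- forecast(fit_ets, h = 5)

plot(fc_arima)
plot(fc_ets)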


Twitter Sentiment Analysis of Rio Olympics 2016 using R and Shiny

8/23/2016

So we have finally come to the conclusion of the world's biggest sporting event, Rio Olympics 2016. It started with a grand opening ceremony on August 4th, 2016 and came to an end with another amazing closing ceremony on August 21st, 2016. Athletes from some 206 countries participated in this great event, which was followed by viewers all around the globe. One unique feature of the last few Olympics is that people all over the world share more and more opinions about athletes and countries through social media. So naturally everyone is curious to know what most people are talking about.

For this study, I considered tweets about the most popular athletes along with some general categories for Rio Olympics 2016. It was impossible to cover all medal winners and every country's athletes, so I tried to include only the most popular ones. Tweets were fetched for the particular time period when each athlete was competing; for most athletes, 5,000-15,000 tweets were extracted, depending on their popularity. Finally, sentiment analysis was performed on the tweets and visualized using Shiny and R. A minimal sketch of the extraction step is shown below.
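A minimal sketch of the extraction with the twitteR package (the athlete, dates and tweet count are illustrative; each athlete used their own competition window):

library(twitteR)
# setup_twitter_oauth() must be called first with your Twitter API credentials

phelps_tweets <- searchTwitter('Michael Phelps', since = '2016-08-06',
                               until = '2016-08-14', n = 15000, lang = "en")
phelps_tweets <- sapply(phelps_tweets, function(x) x$getText())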

Click here to view the sentiment analysis app: RIO OLYMPICS 2016 APP

The categories considered for this study are:

1) Rio Olympics 2016
2) Rio Olympics Opening Ceremony
3) Rio Olympics Closing Ceremony


USA Athletes

1) Michael Phelps - Swimming
2) Justin Gatlin - Track and Field
3) Ashton Eaton - Track and Field
4) Simone Biles - Gymnastics
5) Katie Ledecky - Swimming
6) Ryan Lochte - Swimming
7) Kayla Harrison - Judo
8) Allyson Felix - Track and Field
9) Brianna Rollins - Track and Field
10) Tori Bowie - Track and Field
11) Nia Ali - Track and Field
12) Haley Anderson - Swimming
13) Jake Dalton - Gymnastics
14) Jeff Henderson - Track and Field
15) Christian Taylor - Track and Field

JAMAICA Athletes
1) Usain Bolt - Track and Field
2) Yohan Blake - Track and Field
3) Elaine Thompson - Track and Field
4) Omar McLeod - Track and Field


UK Athletes
1) Max Whitlock - Gymnastics
2) Andy Murray - Tennis
3) Mo Farah - Track and Field
4) Adam Peaty - Swimming

CANADA Athletes
1) Andre De Grasse - Track and Field
2) Erica Wiebe - Wrestling
3) Derek Drouin - Track and Field

BRAZIL Athletes
1) Neymar - Football

ITALY Athletes
1) Fabio Basile - Judo

SOUTH AFRICA Athletes
1) Wayde van Niekerk - Track and Field

KENYA Athletes
1) Jemima Jelagat Sumgong - Track and Field

BAHAMAS Athletes
1) Shaunae Miller - Track and Field

INDIA Athletes
1) PV Sindhu - Badminton
2) Sakshi Malik - Wrestling
3) Dipa Karmakar - Gymnastics
4) Abhinav Bindra - Shooting


Detailed code and files can be obtained from my GitHub repository.










Twitter Sentiment Analysis of T20 Cricket World Cup players using R and Shiny

4/14/2016

Cricket is a very popular sport in many countries, and the International Cricket Council (ICC) conducts World Cup tournaments for two formats of the game, 20 overs and 50 overs. These tournaments are followed by millions of viewers across the globe and normally create a buzz on social media around various popular players.

For this study, I considered tweets about various players in different teams during the T20 World Cup, from 2016-03-08 until 2016-04-03, the duration of the tournament. Ten major teams participated; from each team, 1-3 popular players were considered. Tweets about each player during this time frame were extracted, and sentiment analysis and word cloud visualization were performed on each player's tweets using Shiny. For most players a maximum of 5,000 tweets were extracted, and for a few popular players a maximum of 10,000.

Click here to view the App - Cricket T20 Shiny APP


The players considered from the various teams are:

      INDIA 
  1. Virat Kohli
  2. MS Dhoni
  3. Jasprit Bumrah

        ENGLAND
  1. Joe Root
  2. Jos Buttler
  3. Ben Stokes

        WEST INDIES
  1. Chris Gayle
  2. Dwayne Bravo
  3. Carlos Brathwaite

        AUSTRALIA
  1. David Warner
  2. Shane Watson
  3. Glenn Maxwell

       BANGLADESH
  1. Mushfiqur Rahim
  2. Tamim Iqbal
  3. Mustafizur Rahman

       PAKISTAN
  1. Shahid Afridi
  2. Mohammad Amir      

       SOUTH AFRICA
  1. Quinton de Kock
  2. AB de Villiers
  3. Hashim Amla

      NEW ZEALAND
  1. Martin Guptill
  2. Mitchell Santner
  3. Ross Taylor

       SRI LANKA
  1. Angelo Mathews
  2. Tillakaratne Dilshan
  3. Lasith Malinga

       AFGHANISTAN
  1. Mohammad Shahzad

Extracting the tweets from Twitter

A detailed tutorial about using the twitteR package in R to extract tweets can be found here: Extract tweets in R. The details of tweet extraction are not covered in this blog.

Sample code to fetch tweets about the player Virat Kohli is provided below.

kohli_tweets<-searchTwitter('Virat Kohli',since = '2016-03-08',until = '2016-04-03',n=10000,lang = "en")
kohli_tweets<-sapply(kohli_tweets,function(x) x$getText())


Detailed code can be obtained from my GitHub.

Clean the tweets


Cleaning the tweets requires the following steps:
  1. Remove html links from the tweets
  2. Remove retweet entities
  3. Remove all hashtags
  4. Remove all @people
  5. Remove all punctuation
  6. Remove all numbers
  7. Remove all unnecessary white spaces 
  8. Convert all text to lowercase
  9. Remove duplicates

  Three separate functions are created for the entire cleaning of the tweets; the code can be obtained from GitHub (the same functions appear in global.R below).

Sentiment classification

Classification of sentiments can be done using the 'sentiment' package in R. First convert the tweets into a data frame.

A sample code is given as below:

library(RCurl)
require(sentiment)

###Tweets Classification

# classify emotion
class_emo = classify_emotion(kohli_tweets$tweets, algorithm="bayes", prior=1.0)

# get emotion best fit
emotion = class_emo[,7]

# classify polarity
class_pol = classify_polarity(kohli_tweets$tweets, algorithm="bayes")

# get polarity best fit
polarity = class_pol[,4]



Repeat this procedure for all the players to classify the emotions of their tweets; a batch sketch is given at the end of this section.

Sentiment Score Classification

We can generate a sentiment score by comparing the words in each tweet against positive and negative opinion lexicons.

#Scan positive words
opinion.lexicon.pos<-scan("positive-words.txt",what = 'character',comment.char = ';')

#Scan negative words
opinion.lexicon.neg<-scan("negative-words.txt",what = 'character',comment.char = ';')

pos.words = c(opinion.lexicon.pos,'upgrade')
neg.words = c(opinion.lexicon.neg,'wait','waiting', 'wtf', 'cancellation')

getSentimentScore = function(sentences, words.positive,
                             words.negative, .progress='none')
{
  require(plyr)
  require(stringr)
  scores = laply(sentences,
                 function(sentence, words.positive, words.negative) {
                   # First remove digits, punctuation and control characters:
                   sentence = gsub('[[:cntrl:]]', '', gsub('[[:punct:]]', '',
                                                           gsub('\\d+', '', sentence)))
                   # Convert everything to lowercase:
                   sentence = tolower(sentence)
                   # Split each sentence on whitespace
                   words = unlist(str_split(sentence, '\\s+'))
                   # Match each word against the positive and negative opinion lexicons
                   pos.matches = !is.na(match(words, words.positive))
                   neg.matches = !is.na(match(words, words.negative))
                   # The score is total positive matches minus total negative matches
                   score = sum(pos.matches) - sum(neg.matches)
                   return(score)
                 }, words.positive, words.negative, .progress=.progress )
  # Return a data frame with respective sentence and the score
  return(data.frame(score=scores))
}


score<-getSentimentScore(kohli_tweets$tweets,pos.words,neg.words)
kohli_tweets<-cbind(kohli_tweets, data.frame(emotion,polarity,score))
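Putting the pieces together, a hypothetical batch loop over all players (it reuses the players list and the ./Data and ./Emotions file layout from the Shiny code below):

# players: display name -> file stem, as defined in global.R below
for (stem in unlist(players)) {
  tweets <- readLines(sprintf("./Data/%s.txt", stem))
  # Best-fit emotion and polarity, exactly as above
  emotion <- classify_emotion(tweets, algorithm = "bayes", prior = 1.0)[, 7]
  polarity <- classify_polarity(tweets, algorithm = "bayes")[, 4]
  score <- getSentimentScore(tweets, pos.words, neg.words)$score
  # Write where the Shiny app reads its emotion data (./Emotions/<stem>.csv)
  write.csv(data.frame(tweets, emotion, polarity, score),
            sprintf("./Emotions/%s.csv", stem), row.names = FALSE)
}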

Shiny Code

global.R

library(tm)
library(wordcloud)
library(memoise)
library(googleVis)
library(ggplot2)


#Create a list of players

players<-list("Virat Kohli"="kohli",
              "MS Dhoni"= "dhoni",
              "Jasprit Bumrah" ="bumrah",
              "Joe Root"="root",
              "Jos Buttler"="butler",
              "Ben Stokes"="Ben Stokes",
              "Chris Gayle"="gayle",
              "Dwayne Bravo"="bravo",
              "Carlos Brathwaite"="brathwaite",
              "David Warner"="warner",
              "Shane Watson"="watson",
              "Glenn Maxwell"="maxwell",
              "Mushfiqur Rahim"="mushfiqur",
              "Tamim Iqbal"="tamim",
              "Mustafizur Rahman"="mustafizur",
              "Shahid Afridi"="afridi",
              "Mohammad Amir"="amir",
              "Quinton de Kock"="dekock",
              "AB de Villiers"="devillers",
              "Hashim Amla"="amla",
              "Martin Guptill"="guptill",
              "Mitchell Santner"="santner",
              "Ross Taylor"="taylor",
              "Angelo Mathews"="mathews",
              "Tillakaratne Dilshan"="dilshan",
              "Lasith Malinga"="malinga",
              "Mohammad Shahzad"="shahzad"

              )

catch.error = function(x)
{
  # Default result is NA, returned if the conversion fails
  y = NA

  # Attempt tolower() and catch any error (e.g. from unusual encodings)
  catch_error = tryCatch(tolower(x), error=function(e) e)

  # If no error occurred, convert x to lowercase
  if (!inherits(catch_error, "error"))
    y = tolower(x)

  # Return NA on error, otherwise the lowercased text
  return(y)
}

cleanTweets<- function(tweet){

  # Clean the tweet for sentiment analysis
  # remove html links, which are not required for sentiment analysis

  tweet = gsub("(f|ht)(tp)(s?)(://)(.*)[.|/](.*)", " ", tweet)

  # Remove retweet entities

  tweet = gsub("(RT|via)((?:\\b\\W*@\\w+)+)", " ", tweet)

  # Then remove all "#Hashtag"

  tweet = gsub("#\\w+", " ", tweet)

  # Then remove all "@people"

  tweet = gsub("@\\w+", " ", tweet)

  # Then remove all the punctuation

  tweet = gsub("[[:punct:]]", " ", tweet)

  # Then remove numbers, we need only text for analytics

  tweet = gsub("[[:digit:]]", " ", tweet)

  # finally, we remove unnecessary spaces (white spaces, tabs etc)

  tweet = gsub("[ \t]{2,}", " ", tweet)
  tweet = gsub("^\\s+|\\s+$", "", tweet)

  tweet = catch.error(tweet)

  tweet
}

cleanTweetsAndRemoveNAs<- function(Tweets) {

  TweetsCleaned = sapply(Tweets, cleanTweets)

  # Remove the "NA" tweets from this tweet list

  TweetsCleaned = TweetsCleaned[!is.na(TweetsCleaned)]

  names(TweetsCleaned) = NULL

  # Remove the repetitive tweets from this tweet list

  TweetsCleaned = unique(TweetsCleaned)

  TweetsCleaned
}

#Get the tweets cleaned

getCleanTweets <- memoise(function(player) {


  if (!(player %in% players))
    stop("Unknown player")

  tweets <-readLines(sprintf("./Data/%s.txt",player)) 

  tweetsCleaned<-cleanTweetsAndRemoveNAs(tweets)

  tweetsCleaned

})

#Generate a term matrix for word cloud

getTermMatrix <- memoise(function(player) {


  if (!(player %in% players))
    stop("Unknown Player")

  text <- readLines(sprintf("./Data/%s.txt", player),
                    encoding="latin1",warn=FALSE)
  #Create a corpus   
  myCorpus = Corpus(VectorSource(text))
  #Convert text to lowercase
  myCorpus = tm_map(myCorpus, content_transformer(tolower))
  #Remove Punctuations
  myCorpus = tm_map(myCorpus, removePunctuation)

  # remove URLs
  removeURL <- function(x) gsub("http[^[:space:]]*", "", x)
  myCorpus <- tm_map(myCorpus, content_transformer(removeURL))

  # remove anything other than English letters or space
  removeNumPunct <- function(x) gsub("[^[:alpha:][:space:]]*", "", x)
  myCorpus <- tm_map(myCorpus, content_transformer(removeNumPunct))

  #remove numbers
  myCorpus = tm_map(myCorpus, removeNumbers)
  #remove stopwords in english
  myCorpus = tm_map(myCorpus, removeWords,stopwords("en"))

  myCorpus = tm_map(myCorpus, removeWords,
                    c(stopwords("SMART"), "thy", "thou", "thee", "the", "and", "but"))
  # remove extra whitespace
  myCorpus <- tm_map(myCorpus, stripWhitespace)


  myDTM = TermDocumentMatrix(myCorpus,
                             control = list(minWordLength = 1))



  m<-as.matrix(myDTM)

  sort(rowSums(m), decreasing = TRUE)
})

#Get the data for emotions

getEmotions <- memoise(function(player) {


  if (!(player %in% players))
    stop("Unknown player")

  data <-read.csv(sprintf("./Emotions/%s.csv",player)) 

  data
})

ui.R

library(shinydashboard)
library(shiny)

dashboardPage(

  dashboardHeader(title = "T20 Cricket Players "),
  dashboardSidebar(
    h3("Choose the Player"),
    selectInput("selection", "",
                choices = players),
    actionButton("update", "Change"),
    hr(),

    sidebarMenu(
      menuItem("Wordcloud",tabName = "wordcloud",icon = icon("cloud")),
      menuItem("Top Words",tabName = "barchart",icon = icon("bar-chart")),
      menuItem("Emotions",tabName = "emotions",icon = icon("smile-o"))


    )
  ),

  dashboardBody(

    tabItems(

      tabItem(tabName ="wordcloud",



              fluidRow(

                tabBox(title = "",width = 12,

                       tabPanel(title = tagList(shiny::icon("comments"),"Tweets"),


                                box(plotOutput("wordcloud1",height=500,width = 300)),

                                box(title = "Controls",
                                    sliderInput("freq","Minimum Frequency:",
                                                min = 1,  max = 50, value = 15),
                                    sliderInput("max","Maximum Number of Words:",
                                                min = 1,  max = 300,  value = 100)
                                )
                       )
                )
              )
      ),

      tabItem(tabName = "barchart",

              fluidRow(

                tabBox(title = "",width = 12,

                       tabPanel(title = tagList(shiny::icon("heart"),"Top Words"),

                                plotOutput("bar1")
                       )
                )
              )
      ),

      tabItem(tabName = "emotions",

              fluidRow(

                tabBox(title = "",width = 12,

                       tabPanel(title = tagList(shiny::icon("smile-o"),"Polarity"),

                                htmlOutput("pie2")
                       ),

                       tabPanel(title = tagList(shiny::icon("pie-chart"),"Emotions"),

                                htmlOutput("pie1")
                       ),


                       tabPanel(title = tagList(shiny::icon("thumbs-o-up"),"Emotion Score"),

                                plotOutput("score1")

                       )


                )
              )
      )
    )
  )
)

server.R

function(input, output, session) {
  # Define a reactive expression for the document term matrix
  terms <- reactive({
    # Change when the "update" button is pressed...
    input$update
    # ...but not for anything else
    isolate({
      withProgress({
        setProgress(message = "Processing corpus...")
        getTermMatrix(input$selection)
      })
    })
  })

  data_emotion<-reactive({

    getEmotions(input$selection)

  })



  # Make the wordcloud drawing predictable during a session
  wordcloud_rep <- repeatable(wordcloud)

  #Create the wordcloud
  output$wordcloud1 <- renderPlot({
    v <- terms()
    wordcloud_rep(names(v), v, scale=c(5,0.5),
                  min.freq = input$freq, max.words=input$max,
                  colors=brewer.pal(8, "Dark2"))
  })

  #Create a barchart for high frequency terms

  output$bar1<-renderPlot({

    plot1<-head(data.frame(Freq=terms()),n=20)
    plot1$word<-row.names(plot1)
    ggplot(plot1, aes(x = reorder(word, Freq), y = Freq)) +
      geom_bar(stat = "identity", fill = "steelblue") +
      theme_minimal() + coord_flip() +
      geom_text(aes(label = Freq), vjust = 0.5, color = "black", size = 4.0) +
      ylab("Frequency of words") + xlab("Top Words") +
      ggtitle("Top frequency words")
  }) 

  #Create a pie chart for the emotions
  output$pie1<-renderGvis({

    data<-data_emotion()
    emotion1<-as.data.frame(table(data$emotion))
    Pie1<-gvisPieChart(emotion1,options = list(width=1200,height=600))
    return(Pie1)

  })

  #Create a pie chart for the polarity

  output$pie2<-renderGvis({

    data<-data_emotion()
    emotion2<-as.data.frame(table(data$polarity))
    Pie2<-gvisPieChart(emotion2,options = list(width=1200,height=600))
    return(Pie2)

  })

  #create a histogram for the emotion score
  output$score1<-renderPlot({

    data<-data_emotion()
    ggplot(data, aes(x = score)) +
      geom_histogram(bins = 50, color = "black", fill = "blue") +
      theme_minimal() + xlab("Sentiment Score") + ylab("Count") +
      ggtitle("Sentiment Scores of Tweets")
  })
}
