Social media analysis for understanding customer preferences


Assignment Problem: Social Media Analysis for Understanding Customer Preferences and Sentiments

Assignment Objective:

The learning objective of this assignment is to further develop your understanding of, and skills in, social media analytics by analyzing two case studies:

1) Case Study A: you will work as a social marketing analyst in a consulting company to uncover the impact of online advertising and communication with customers. The aim of the study is to educate the marketing teams of the company's clients (in diverse industries) on how to market their products and/or services on social media to maximize customer involvement (positive interest and sharing). The company wants to find out how keywords, shares and sentiments are related, and whether those relationships differ across topic categories such as entertainment, technology and business that are of interest to clients in various industries.

2) Case Study B: you will be a data scientist working for a hotel review firm to develop a sentiment analytics engine for Twitter that predicts consumers' review sentiments. The aim is to develop both dictionary-based and machine learning-based sentiment analytics scripts using a number of R libraries and SAS Sentiment Analysis Studio (covered in the workshop activities in Weeks 4 and 5). You are required to use the developed engine to predict hotel reviewers' sentiments and to benchmark various algorithms and analytics tools.

Case Study A:

Leveraging the power of content and social media marketing can grow a company's audience and customer base dramatically. However, using social media for marketing without any previous experience or insight can be challenging. It is vital for a marketing team to understand social media marketing fundamentals. If a company publishes exciting, high-quality content and builds an online audience of quality followers, those followers can share the content with their own audiences on Twitter, Facebook, LinkedIn, Google+, their own blogs and many other social media platforms. This sharing and discussion of content opens up new entry points for search engines like Google to find it in a keyword search. Those entry points could grow to hundreds, thousands or more potential ways for people to find a company, product or service online. Finding and understanding the online influencers in the market who have quality audiences and are likely to be interested in the product, service or business could make a huge positive impact.

The consulting company collected information on articles that were shared by people on social media. The dataset contains approximately 39,000 articles, and a total of 31 features were extracted from the HTML code of each article, including its title and content. (A description of the dataset is provided as an appendix.) Some of the features depend on characteristics of the service used and can be analyzed from the metadata provided: each article has metadata such as keywords, data channel type and the total number of shares (on Facebook, Twitter, Google+, LinkedIn, Pinterest). The data channel categories are: 'Lifestyle', 'Business', 'Entertainment', 'Social Media', 'Technology', and 'World'. In addition, several natural language processing features were also extracted.

Task Requirements:

As a data analytics team member for the consultancy firm, you are required to carry out a number of data analytics tasks for the consulting company using the data collected. You are given access to a sample of the data where some of the variables have been removed as they are not considered important for the analysis of this assignment.

For each data channel, the company is interested in:

  • Investigating the impact of the article properties on sharing;
  • Using SAS Text Miner for text analysis to identify key features in the articles and analyze their contribution towards low and high sharing.

To achieve the above, you need to carry out the following data analytics tasks:

Task 1: Explore the impact of article properties

Explore the data and investigate which properties of an article correlate with a high number of shares on social media.

A) Open the dataset 'online_news_popularity.xlsx' using Microsoft Excel.

B) Explore the dataset to understand and manage the six types of data channels (lifestyle, entertainment, bus, socmed, tech, world) and the associated data. In each data channel column, a value of 1 indicates that the row belongs to the corresponding data channel.

C) Copy the separate datasets for each channel to different Excel sheets (sort and filter by each data channel to separate).

D) In each data channel, identify the articles with a high number of shares (using the top 10% of shares in that channel's dataset as the threshold).

E) Investigate the following properties and explain how they could have affected the high number of shares. Provide an explanation to support your argument.

a) Number of tokens in the title

b) Was the article published on the weekend

c) Number of links

d) Number of images

e) Number of videos

(Hint: To do this, you can create plots in R of each of these columns against the number of shares. For continuous variables, you may want to add a fitted line to your plots to investigate the correlation.)
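
The hint asks for the plots and fitted lines to be produced in R. Purely to illustrate what a fitted line and a correlation measure, here is a minimal Python sketch of an ordinary least-squares fit together with the Pearson correlation coefficient. The function name `fit_line` and the sample values are assumptions made up for illustration; they do not come from the dataset.

```python
import math

def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept, plus Pearson's r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Made-up illustrative values, NOT taken from online_news_popularity.xlsx.
image_counts = [0, 1, 2, 3, 4]
shares = [100, 300, 500, 700, 900]

slope, intercept, r = fit_line(image_counts, shares)
print(f"slope={slope:.1f}, intercept={intercept:.1f}, r={r:.2f}")
```

A value of r near +1 or -1 suggests a strong linear relationship, while a value near 0 suggests little linear relationship. A binary property such as "published on the weekend" is usually better compared by group (for example, mean or median shares for weekend versus weekday articles) than with a fitted line.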

Task 2: Use SAS Text Miner for keyword analysis

A) Use SAS Text Miner to extract the keywords from the title in each data channel. (Hint: To do this, you can refer to the workshop activities in Week 3 and Week 4 and set the 'Title' column as the only 'Text' role in the variable settings.)

B) What are the most frequently used (top 10) topics in each category? Use the SAS Result window to explain your answers.

(Hint: 'Topic' column will need to be set as the only 'Text' role.)

C) Are there common topics that span data channels and relate to a high number of shares or a low number of shares? Use the whole dataset in SAS Text Miner to identify the relationship. Provide an explanation to support your argument.

(Hint: Use the whole dataset to identify the articles with a high number of shares and those with a low number of shares, using the top 10% and the bottom 10% of the dataset as thresholds. Separate the dataset in Excel on this basis before the analysis, and use the two resulting datasets to analyze the common topics in each of them. In this question, please use the 'Title' column as the only 'Text' role for topic modelling.)
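
The hint asks for this split to be done in Excel. To make the logic of the top 10% and bottom 10% cut explicit, the following Python sketch ranks rows by share count and takes each tail. The function name `split_by_share_deciles`, the `shares` key and the sample rows are hypothetical and chosen only for illustration.

```python
def split_by_share_deciles(rows, share_key="shares", fraction=0.10):
    """Return (low, high): the bottom and top `fraction` of rows by share count."""
    ranked = sorted(rows, key=lambda row: row[share_key])
    # Number of rows in each tail; at least one row so small samples still work.
    k = max(1, int(len(ranked) * fraction))
    return ranked[:k], ranked[-k:]

# Made-up rows standing in for the real dataset.
rows = [{"title": f"article {i}", "shares": i * 100} for i in range(1, 21)]

low, high = split_by_share_deciles(rows)
print([row["shares"] for row in low])
print([row["shares"] for row in high])
```

This is a rank-based cut, so articles tied on the boundary share count may fall on either side of it. A value-based threshold (keep every row at or above the 90th percentile share count) is an equally reasonable reading of the hint.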

You are required to:

a) Prepare a report for Case Study A with all the analytics results for the above two key tasks. (You can use an appendix for any additional screenshots, figures and tables that you feel are important for the report.) The report should be named as:

b) Save the R script after Task 1 above as: Assignment1A.r

c) Save the SAS project for Task 2 above as Assignment1_Task1.spk. You may zip the SPK files if you have more than one. The SAS project file should be named as:

Case Study B:

Sentiment analysis is a technique that aims to gauge the attitudes of customers towards topics, products and services of interest. It is a pivotal technology for providing insights that enhance the business bottom line in campaign tracking, customer-centric marketing strategy and brand awareness. Sentiment analytics approaches produce sentiment categories such as 'positive', 'negative' and 'neutral'. More specific human emotions are also a topic of interest. There are two major streams of methods for developing a sentiment analytics engine: the dictionary-based and the machine learning-based approaches. In this assignment, you are required to perform sentiment analytics using both approaches.

Task Requirements:

As a data scientist, you are required to perform a number of data analytics tasks. You are tasked with developing both dictionary-based and machine learning-based sentiment analytics engines using the R programming language and applying them to predict the sentiments of hotel review tweets from a sample of data. You are also required to use SAS Sentiment Analysis Studio to compare the results.

To achieve the above, you need to carry out the following data analytics tasks:

Task 1: Develop a dictionary-based sentiment analytics engine based on the R library 'syuzhet' to analyze the different emotions from hotel review tweets.

A) Analyze and aggregate the eight emotions (anger, anticipation, disgust, fear, joy, sadness, surprise and trust) from the hotel review tweets file 'hotel_tweets.csv' using the function 'get_nrc_sentiment'.

B) You are required to plot a chart to visualize these emotions using the R library 'ggplot2'.

C) You should combine both negative and positive tweets into one before conducting the analysis.
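
The task itself is to be done in R with 'syuzhet'. To show what a dictionary-based approach does conceptually, the Python sketch below looks each word up in a lexicon and sums the counts per emotion. The lexicon here is a tiny invented stand-in, not the NRC lexicon behind 'get_nrc_sentiment', and the function name `emotion_totals` and the sample tweets are also made up for illustration.

```python
import re
from collections import Counter

EMOTIONS = ["anger", "anticipation", "disgust", "fear",
            "joy", "sadness", "surprise", "trust"]

# Toy lexicon invented for illustration; it is NOT the NRC lexicon.
TOY_LEXICON = {
    "dirty": ["disgust"],
    "rude": ["anger"],
    "lovely": ["joy"],
    "friendly": ["joy", "trust"],
    "unexpected": ["surprise"],
}

def emotion_totals(tweets, lexicon=TOY_LEXICON):
    """Count, per emotion, how many lexicon words appear across all tweets."""
    totals = Counter({emotion: 0 for emotion in EMOTIONS})
    for tweet in tweets:
        for word in re.findall(r"[a-z']+", tweet.lower()):
            for emotion in lexicon.get(word, []):
                totals[emotion] += 1
    return totals

# Made-up tweets; negative and positive sets are combined before analysis.
negative = ["Dirty room and rude staff"]
positive = ["Lovely view, friendly staff", "Friendly and unexpected upgrade"]

totals = emotion_totals(negative + positive)
for emotion in EMOTIONS:
    print(emotion, totals[emotion])
```

The aggregated totals per emotion are the kind of values a bar chart would then display, one bar per emotion.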

Task 2: Develop a machine learning-based model using the R libraries 'tm' and 'e1071', and evaluate the predictive accuracy of the SVM classifier.

A) Develop R scripts and import the data set 'hotel_tweets.csv' for training and testing.

B) Use the first 200 negative tweets and the first 200 positive tweets as the training dataset, and use the remaining 63 negative tweets and 63 positive tweets as the testing dataset.

(Hint: You may need to use the as.character() function to convert a data frame column from factors to characters.)

C) Develop a machine learning-based sentiment analytics engine and predict sentiment categories (only 'positive' and 'negative') using 'tm' and 'e1071' with the SVM classifier.

D) Evaluate the testing accuracies and report the predicted results.
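
The classifier itself is to be built in R with 'tm' and 'e1071'. The Python sketch below illustrates only the surrounding bookkeeping: the 200/63 split per class described in B) and the accuracy calculation for D). The placeholder tweets, the helper names and the sample predictions are all invented for illustration and are not results from any model.

```python
from collections import Counter

def split_train_test(items, n_train=200):
    """First n_train items go to training; the remainder go to testing."""
    return items[:n_train], items[n_train:]

def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

def confusion_counts(actual, predicted):
    """Count each (actual, predicted) label pair."""
    return Counter(zip(actual, predicted))

# Placeholder texts standing in for 263 negative and 263 positive tweets.
negative = [f"negative tweet {i}" for i in range(263)]
positive = [f"positive tweet {i}" for i in range(263)]

neg_train, neg_test = split_train_test(negative)
pos_train, pos_test = split_train_test(positive)
print(len(neg_train) + len(pos_train), "training tweets")
print(len(neg_test) + len(pos_test), "testing tweets")

# Made-up labels that only exercise the metric functions.
actual = ["negative", "negative", "positive", "positive"]
predicted = ["negative", "positive", "positive", "positive"]
acc = accuracy(actual, predicted)
confusion = confusion_counts(actual, predicted)
print(acc, dict(confusion))
```

Reporting the confusion counts alongside the overall accuracy shows whether errors are concentrated in one class, which a single accuracy figure can hide.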

Task 3: Develop a statistical model using SAS Sentiment Analysis Studio and evaluate the accuracies (5%).

A) Use the data folder 'hotel_tweets', which contains 'negative' and 'positive' tweets for training and testing.

B) Build a statistical model using SAS Sentiment Analysis Studio (either simple or advanced). You may change the configuration of the advanced model to obtain the best training accuracy.

(Hint: Refer to the SAS Sentiment Analysis Studio tutorial.)

C) Evaluate and compare the testing accuracies for different models and report the results.

D) Compare this result with the previous predictive results using R and discuss.

You are required to:

a) Prepare a report for Case Study B with all the analytics results for the above three key tasks. (You can use an appendix for any additional screenshots that you feel are important for the report.) The report should be named as:

Assignment1B_Report.doc

b) Save the R script after Task 2 above as: Assignment1B.r

c) Save the SAS Sentiment Studio project as: Assignment1_SAS2.zip
