Prepare a data model that combines a range of data sets


Learning Outcomes:

A. Demonstrate the ability to identify and integrate data of various types from traditional and alternative sources, and make informed judgements about their use in data science research

B. Critically evaluate the methodologies applied in data collection, data processing, data analysis & dissemination of research findings

C. Critically assess the strengths and limitations of methods and data, combined with the application of R and/or Python

Introduction:

In this coursework you will prepare a data model that combines a range of data sets. We are primarily interested in the processes you take to achieve your data model, though you will need to produce a final data set and model.

Scenario:

Oxford Brookes University would like to offer a new service to staff, both to encourage the brightest and best staff to join us and in recognition of the fact that Oxford itself can be a very expensive place to live.

This new service is a town advice service that recommends towns in Oxfordshire based on certain key characteristics:

  • House prices
  • Broadband speed
  • Crime in the area over the last month

They would also like to consider other factors such as:

  • Nearby rights of way
  • Distance from Oxford vs size of the road
  • Availability of allotments

There may be other factors, so you should also gather more information from a member of Oxford Brookes academic staff to find out about any other key issues that might affect a person's choice of location.

Tasks:

You must use datasets that are published by the UK government, either centrally or through a public body, and that would be available to a member of the UK public. You should prepare a brief knowledge acquisition questionnaire and send it to a domain expert (in this case Dr. Younas) to gain an insight into any other data sources you may wish to query. Dr. Younas's email

Using this information, you should produce a unified data set and model that could be used to drive a recommendation system, documenting and explaining all the processes that you undertake to achieve this data set and model. You must ensure that -

  • All data used is normalised to at least 3NF
  • You must use the MySQL server on SOTS to store the data or another MySQL server. You should include your tables as part of the report
  • Your model must use the three key characteristics
  • Your model may use the additional characteristic(s) suggested above or that arise from the knowledge acquisition session
  • The combined data set must be stored in a MySQL server
  • You should demonstrate that you can query the data set in R
  • You should have a simple recommendation system, written in R, that allows the user to specify a value in the range 0-10 for each of the three key characteristics, then produces a score for each town and displays the top 3 towns in order
  • The towns used are in Oxfordshire.
  • You may restrict the number of towns you look at to main towns, but you must justify your selection in your report
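
As an illustration of the scoring step described above, the following is a minimal sketch of one possible approach, not a required design. The coursework asks for the recommender in R; the arithmetic is shown here in Python (also named in learning outcome C) and translates directly. It assumes each characteristic is rescaled to 0-1 with min-max scaling and combined as a weighted average using the user's 0-10 values. The town names, column names, figures, and the functions min_max and recommend are all hypothetical placeholders, not real data or a prescribed interface.

```python
# Illustrative sketch only: every town name and figure below is invented.

# (column name, True if a larger raw value is better for the resident)
CHARACTERISTICS = [
    ("house_price", False),        # cheaper is better
    ("broadband_mbps", True),      # faster is better
    ("crimes_last_month", False),  # fewer is better
]

# Hypothetical rows, as they might come back from a query on the combined data set.
TOWNS = {
    "Town A": {"house_price": 300000, "broadband_mbps": 80, "crimes_last_month": 40},
    "Town B": {"house_price": 450000, "broadband_mbps": 60, "crimes_last_month": 20},
    "Town C": {"house_price": 350000, "broadband_mbps": 100, "crimes_last_month": 60},
    "Town D": {"house_price": 400000, "broadband_mbps": 40, "crimes_last_month": 30},
}


def min_max(values, higher_is_better):
    """Rescale raw values to 0-1 so that 1 is always the preferred end."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All towns are identical on this characteristic, so it cannot rank them.
        return [0.5] * len(values)
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def recommend(towns, weights, top_n=3):
    """Return the top_n (town, score) pairs, best first, for 0-10 user weights."""
    for column, _ in CHARACTERISTICS:
        if not 0 <= weights[column] <= 10:
            raise ValueError(f"weight for {column} must be in the range 0-10")

    names = list(towns)
    scores = {name: 0.0 for name in names}
    total_weight = sum(weights[column] for column, _ in CHARACTERISTICS)
    if total_weight == 0:
        # The user expressed no preference; every town scores 0.
        return [(name, 0.0) for name in names[:top_n]]

    for column, higher_is_better in CHARACTERISTICS:
        scaled = min_max([towns[n][column] for n in names], higher_is_better)
        for name, value in zip(names, scaled):
            # Weighted average: each characteristic contributes its share of the total weight.
            scores[name] += weights[column] * value / total_weight

    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    user_weights = {"house_price": 8, "broadband_mbps": 5, "crimes_last_month": 3}
    for town, score in recommend(TOWNS, user_weights):
        print(f"{town}: {score:.2f}")
```

Min-max scaling is only one choice. It is sensitive to outliers, so your report should justify whichever scaling and weighting scheme you adopt.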

You should produce a report detailing:

  • The stages you took to identify, obtain, clean, and use the data sets associated with the three key characteristics
  • The stages you took to identify, obtain, clean and use any additional data sets that you needed to either combine or fully utilise the three key characteristics
  • A justification of the approaches used in identifying, cleaning, and using the datasets
  • How you might obtain, clean and use any one data set associated with the optional characteristics (Note: you do not have to do the actual work, just say what the issues are with this type of data and how you might incorporate it into your system)
  • The results of your knowledge acquisition questionnaire with your domain expert
  • How you might obtain, clean and use any one additional data set based on your knowledge acquisition questionnaire (Note: you do not have to do the actual work, just say what the issues are with this type of data and how you might incorporate it into your system)
  • A discussion of any legal or ethical issues with the proposed system and the data used
  • An overview/design of your R code
  • Your R code
  • Names and descriptions of the MySQL database tables
  • Testing of your system
