Perform any other needed data preparation


Data Preparation - Cleaning up any issues in the data so that it can be analyzed with software tools such as Tableau. In a project, this phase can take 80 to 90% of the overall effort.

• Decide how to handle any blank values. If a blank means the value is unknown, you may want to leave it blank. On the other hand, if a blank means "not applicable", you may want to replace the blank cell with "NA".
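The blank-handling decision above can be sketched in Python. This is only an illustration under stated assumptions: the sample data, the column names (`spouse_name`, `income`), and the helper `fill_blanks` are all hypothetical, and the policy of which columns treat a blank as "not applicable" has to come from your own knowledge of the data.

```python
import csv
import io

# Hypothetical sample data: "spouse_name" is blank when not applicable,
# "income" is blank when the value is unknown.
RAW = """id,spouse_name,income
1,Pat,52000
2,,48000
3,Lee,
"""

# Assumed per-column policy for this sketch: columns listed here treat a
# blank as "not applicable"; blanks in all other columns are left untouched.
NOT_APPLICABLE_COLUMNS = {"spouse_name"}


def fill_blanks(rows, na_columns, marker="NA"):
    """Return new rows with blanks replaced by marker, in na_columns only."""
    cleaned = []
    for row in rows:
        new_row = dict(row)  # copy so the original rows are not modified
        for column in na_columns:
            if (new_row.get(column) or "").strip() == "":
                new_row[column] = marker
        cleaned.append(new_row)
    return cleaned


rows = list(csv.DictReader(io.StringIO(RAW)))
cleaned_rows = fill_blanks(rows, NOT_APPLICABLE_COLUMNS)
```

Here the blank `spouse_name` in the second row becomes "NA", while the blank `income` in the third row stays blank because it is treated as unknown.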

• If feasible, join two or more tables that hold different information about the same objects. The tables must share a common field to be joined.
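As a minimal sketch of joining on a common field, the example below merges two small tables in plain Python. The tables, the key name `store_id`, and the helper `left_join` are hypothetical. Tools such as Tableau or a database would do the same job with their own join features.

```python
# Hypothetical tables describing the same stores, keyed by "store_id".
stores = [
    {"store_id": "S1", "city": "Austin"},
    {"store_id": "S2", "city": "Denver"},
    {"store_id": "S3", "city": "Boston"},
]
traffic = [
    {"store_id": "S1", "avg_daily_visitors": 2400},
    {"store_id": "S2", "avg_daily_visitors": 3100},
]


def left_join(left, right, key):
    """Join each left row to its matching right row on key.

    Left rows without a match are kept and simply lack the right-hand
    fields. Assumes key is unique within the right table.
    """
    right_by_key = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = right_by_key.get(row[key], {})
        joined.append({**match, **row})  # left values win on any conflict
    return joined


merged = left_join(stores, traffic, "store_id")

# Rows that found no partner are worth reviewing before analysis.
unmatched = [row["store_id"] for row in merged if "avg_daily_visitors" not in row]
```

Listing the unmatched keys is a quick way to see whether the common field really lines up across the tables.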

• Manually, or using tools if available, review the data for unusual patterns or distributions that might call its validity into question. This involves examining the data with a critical eye.

Identifying inconsistent data encodings (e.g., different abbreviations used for the same state)

Identifying suspicious responses (e.g., physically questionable numbers, or the same answer given to every question on a survey)

Are there outliers that don't seem to make sense? For example, six-figure salaries for teenagers, or store traffic that typically averages in the thousands but includes some values in the tens or in the millions.
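The three review checks above can be roughed out in a few lines of Python. Everything here is an assumption made for illustration: the alias table, the sample values, the helper names, and especially the factor-of-100 threshold for outliers, which is arbitrary. These helpers only flag candidates; a person still has to judge whether the flagged values are wrong.

```python
from statistics import median

# Hypothetical mapping of known spelling variants to one canonical code.
STATE_ALIASES = {
    "tx": "TX", "tex.": "TX", "texas": "TX",
    "co": "CO", "colo.": "CO", "colorado": "CO",
}


def normalize_state(value):
    """Map a known variant to its canonical code; None if unrecognized."""
    return STATE_ALIASES.get(value.strip().lower())


def is_straight_lined(answers):
    """True when every question on a survey received the same answer."""
    return len(answers) > 1 and len(set(answers)) == 1


def flag_by_magnitude(values, factor=100):
    """Return values more than `factor` times above or below the median.

    Assumes positive values; the default factor is an arbitrary choice.
    """
    typical = median(values)
    return [v for v in values if v > typical * factor or v < typical / factor]


states = ["TX", "Tex.", "texas", "Colo.", "??"]
normalized = [normalize_state(s) for s in states]

surveys = {"r1": [3, 4, 2, 5], "r2": [5, 5, 5, 5]}
suspicious = [rid for rid, answers in surveys.items() if is_straight_lined(answers)]

daily_traffic = [2400, 3100, 2800, 12, 2950, 4_000_000]
outliers = flag_by_magnitude(daily_traffic)
```

Comparing against the median rather than the mean keeps a single extreme value, such as the four million entry, from distorting what counts as typical.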

• Perform any other needed data preparation required. This is an open-ended step and specific details will depend on the changes needed and software tools used. Make sure to

• Compare the data provided as well as the data that you have prepared to the questions to be analyzed from the Business Understanding phase. Does it appear that it is possible to answer the questions from the data provided?

If you are missing needed data, and the sponsor neither has the data nor can generate it, the project needs to be revised or cancelled. Make sure to document the data that is needed. If feasible, determine how this data can be collected or generated for future analysis.
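One way to make this comparison concrete is to list the fields each question needs and check them against the columns you actually have. The sketch below assumes hypothetical questions, field names, and a helper called `missing_fields_by_question`. It can only tell you that a field is absent, not whether the data in a present field is good enough.

```python
# Hypothetical questions from the Business Understanding phase, each mapped
# to the fields it needs. Both the questions and the field names are made up.
REQUIRED_FIELDS = {
    "Which stores have the highest traffic?": {"store_id", "avg_daily_visitors"},
    "Does traffic vary by region?": {"store_id", "avg_daily_visitors", "region"},
}


def missing_fields_by_question(required, available):
    """Return {question: sorted missing fields} for unanswerable questions."""
    gaps = {}
    for question, fields in required.items():
        missing = fields - set(available)
        if missing:
            gaps[question] = sorted(missing)
    return gaps


available_columns = ["store_id", "city", "avg_daily_visitors"]
gaps = missing_fields_by_question(REQUIRED_FIELDS, available_columns)
```

The resulting `gaps` dictionary doubles as documentation of the data that is needed but missing.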

• Keep track of issues found during this phase. These might lead to a recommendation to the sponsor to capture the data in a different format or with a different method, reducing the effort needed to clean it. In some cases, this can be one of the more valuable contributions of your project. Data preparation can take 80 to 90% of a project's overall time and resources.

If issues can be reduced going forward, this can save a great deal of time and money and make further analysis easier to perform.
