Which instructors are most popular


Problem

Consider the following scenario, we are interested in modeling student enrollment in Stanford courses. We would like to answer questions such as:

1) Which courses are most popular? Which instructors are most popular?

2) Which courses are most popular among Data Analysis and Management degree? Among all BAS students?

3) Are there courses for which the assigned capsize is too large or too small?

We are planning to have a course enrollment fact table with the grain of one row per student per course enrollment. In other words, if a student enrolls in 5 courses there will be 5 rows for that student in the fact table.

We will use the following dimensions: Course, Program, Student, Quarter, CapSize, and Instructor. There will be a single fact measurement column, EnrollmentCount. Its value will always be equal to 1.

We are considering several options for dealing with the Instructor dimension. Interesting attributes of instructors include FirstName, LastName, Title (e.g. Assistant Professor), and Program. The difficulty is that a few courses (less than 5%) have multiple instructors. Thus, it appears we cannot include the Instructor dimension in the fact table because it doesn't match the intended grain. Here are the options under consideration:

Option A: Modify the Instructor dimension by adding special rows representing instructor teams. For example, ISIT 333 is taught by Li and Cheng, so there will be an Instructor row representing "Li/Cheng" (as well as separate rows for Li and Cheng, assuming that they sometimes teach courses as sole instructors). In this way, the Instructor dimension becomes true to the grain and we can include it in the fact table.

Option B: Change the grain of the fact table to be one row per student enrollment per course per instructor. For example, there will be two fact rows for each student enrolled in ISIT333, one that points to Li as an instructor and one that points to Cheng. However, each of the two rows will have a value of 0.5 in the EnrollmentCount field instead of a value of 1, in order to allow the fact to aggregate properly. (Enrollments are "allocated" equally among the multiple instructors.)

Option C: Create two fact tables. The first has the grain of one row per student enrollment per course and doesn't include the Instructor dimension. The second has the grain of one row per student enrollment per course per instructor and includes the Instructor dimension (as well as all the other dimensions). Unlike Option B, the value of EnrollmentCount will be 1 for all rows in the second fact. Tell warehouse users to use the second fact table for queries involving attributes of the instructor dimension and the first fact table for all other queries.

• What are the strengths and weaknesses of each option?

• Which option would you choose and why?

• Would your answer to Question 2 be different if the majority of classes had multiple instructors? How about if only one or two classes had multiple instructors?

• Can you think of another reasonable alternative design besides Options A, B, and C? If so, what are the advantages and disadvantages of your alternative design?

Request for Solution File

Ask an Expert for Answer!!
Computer Engineering: Which instructors are most popular
Reference No:- TGS03312148

Expected delivery within 24 Hours