R Tutorial: Measurement, validity & reliability

Показать описание

---
The objective of this course is to help you craft surveys possessing

both reliability and validity. These are complex concepts which you'll learn more about throughout the course. Let's introduce the basic ideas in this lesson.

We can define measurement as the process of observing and recording.

To do that, we generally need some kind of tool, or instrument.Consider, for example, a ruler. It's a tool we use to measure something -- height.

Of course, measuring something like consumer brand perceptions is a little trickier than measuring height. We need to check if it's "calibrated," so to speak. Namely, we need to check if our survey is reliable and valid. Let's take an in-depth look at each.

Measurement reliability means that we are measuring consistently and we can reproduce what we are measuring. We already checked one type of reliability: equivalence, as measured by inter-rater reliability. There are several other types. We'll cover the most common in the remainder of the course.

Now for validity. Are we measuring what we claim to measure? Just as with reliability, there are several types of validity. Here we have the so-called "three c's" of validity. Again, we already measured the first, content. We'll cover the others later in the course.

Let's touch back briefly on our flowchart here. We constructed a set of items to measure customer satisfaction, using the tools from the last lesson, and collected the data, which puts us between steps 2 and 3 in our flowchart. Now what?

We're in the exploratory data analysis, or EDA phase of the process. We want to learn about our data before modeling it.

First, let's check response frequencies.

Are respondents unanimous in rating each item, or is there a range of opinions?

We can get a great visualization of this with the likert package. To do this, we first pass our data frame to the likert() function. All items must be converted to factors for this function, so we'll do that using dplyr's mutate_if() function converting all integer variables to factors.

We can pass our new object here to plot a bar chart of our response frequencies.

We also need to check for items that are worded in the opposite way, so to speak, of the other items.

To take an example, let's examine the items of our customer satisfaction survey. All these items carry positive connotations except the item difficulttouse. A company certainly doesn't want its site to be difficult to use, so we want to measure the opposite of the responses here. This is called reverse coding.

Let's create a new variable, difficulttouse dot r, which is a recoded version of the difficultouse item.

To do that, we will use the recode() function from car. This is not to be confused with the recode() function from dplyr, so we will explicitly call from the car package here. Next, we specify how to recode the values. Ones become fives, twos become fours and so forth. Threes will stay threes, so no need to include that in our syntax!

Let's confirm what we just did there.

Using select() from dplyr to analyze just our two items of interest, we will use the response-dot-frequencies() function from psych. Let's also round the output for readability.

We can see the two items are indeed reverse images of each other!

Okay, we're ready to explore our survey data! Let's get started.

#R #RTutorial #DataCamp #Survey #Measurement #Development #marketing #research #validity #reliability