A bivariate dataset is an ordered sequence of pairs of numbers:
Bivariate Datasets
To generate a bivariate dataset we can sample two random variables from the same input set:
Generating Bivariate Datasets
Suppose we had a set of shapes and two random variables:
C - that took a shape and returned the number of corners
E - that took a shape and returned the number of edges
If we choose a shape and apply both random variables, we get a pair of numbers:
If we randomly choose a sequence of shapes and apply both random variables, we get a sequence of pairs of numbers:
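As a rough illustration of the idea, the sketch below samples shapes at random and applies both random variables to each one. The particular shapes in `SHAPES` and the helper `generate_dataset` are assumptions made for this example; they are not taken from the lesson.

```python
import random

# Hypothetical input set (an assumption for illustration):
# each shape maps to (number of corners, number of edges).
SHAPES = {
    "triangle": (3, 3),
    "square": (4, 4),
    "tetrahedron": (4, 6),
    "cube": (8, 12),
}


def C(shape):
    """Random variable C: takes a shape and returns its number of corners."""
    return SHAPES[shape][0]


def E(shape):
    """Random variable E: takes a shape and returns its number of edges."""
    return SHAPES[shape][1]


def generate_dataset(n, seed=None):
    """Randomly choose n shapes and apply both random variables to each."""
    rng = random.Random(seed)
    names = list(SHAPES)
    chosen = [rng.choice(names) for _ in range(n)]
    # Each data point is the pair (C(shape), E(shape)).
    return [(C(shape), E(shape)) for shape in chosen]


dataset = generate_dataset(5, seed=0)
print(dataset)  # an ordered sequence of 5 pairs of numbers
```

Because both numbers in a pair come from the same shape, the order of the pairs and the pairing itself both carry information.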
We can give the dataset a name and refer to each data point by an index that represents the place of that data point in the sequence.
Each random variable has its own range of possible values. By considering all possible pairs, we can obtain the range of scores for a bivariate dataset.
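One way to list every possible pair is to take the Cartesian product of the two individual ranges. The sketch below assumes hypothetical ranges for the corner and edge variables; the values are illustrative only.

```python
from itertools import product

# Hypothetical ranges of the two random variables (assumed values).
corner_values = [3, 4, 8]
edge_values = [3, 4, 6, 12]

# Every possible (corners, edges) pair: the range of scores
# for the bivariate dataset.
all_pairs = list(product(corner_values, edge_values))

print(len(all_pairs))  # 3 * 4 = 12 possible pairs
```

Not every pair in this range has to occur in a particular dataset; the range only describes which pairs are possible in principle.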
Two-Way Tables
For categorical or numerical discrete bivariate data, the range of scores is often small enough to create a two-way table:
Creating Two-Way Tables
Suppose we surveyed people about:
Whether or not they like coriander
Whether they prefer Soccer or Basketball
Suppose we had the following results:
28 people like coriander and prefer Soccer
32 people like coriander and prefer Basketball
27 people don’t like coriander and prefer Soccer
13 people don’t like coriander and prefer Basketball
We can create a frequency distribution in a two-way table:
|            | Like Coriander | Don’t Like Coriander |
|------------|----------------|----------------------|
| Soccer     | 28             | 27                   |
| Basketball | 32             | 13                   |
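The frequency distribution can also be tallied programmatically. The sketch below rebuilds the individual survey responses from the four counts given in the text and counts each (coriander, sport) pair; the variable names are assumptions made for this example.

```python
from collections import Counter

# Individual responses rebuilt from the counts in the text,
# stored as (coriander answer, preferred sport) pairs.
responses = (
    [("Like", "Soccer")] * 28
    + [("Like", "Basketball")] * 32
    + [("Don't Like", "Soccer")] * 27
    + [("Don't Like", "Basketball")] * 13
)

# Count how often each pair occurs.
counts = Counter(responses)

sports = ["Soccer", "Basketball"]      # table rows
answers = ["Like", "Don't Like"]       # table columns

# table[i][j] is the frequency for sports[i] and answers[j].
table = [[counts[(answer, sport)] for answer in answers] for sport in sports]

for sport, row in zip(sports, table):
    print(sport, row)
```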
Often we also add an extra row and column for sums:
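|            | Like Coriander | Don’t Like Coriander | Sum: |
|------------|----------------|----------------------|------|
| Soccer     | 28             | 27                   | 55   |
| Basketball | 32             | 13                   | 45   |
| Sum:       | 60             | 40                   | 100  |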
By taking the sum along rows and columns we can obtain more information. From above we can see that:
60 people like coriander
40 people don’t like coriander
55 people prefer Soccer
45 people prefer Basketball
100 people were surveyed in total
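The row and column sums above can be checked with a short calculation. This is a minimal sketch that stores the table as a nested dictionary, which is just one possible representation.

```python
# Two-way table from the survey: table[sport][coriander answer] = frequency.
table = {
    "Soccer": {"Like": 28, "Don't Like": 27},
    "Basketball": {"Like": 32, "Don't Like": 13},
}

# Sum along each row: how many people prefer each sport.
row_sums = {sport: sum(cells.values()) for sport, cells in table.items()}

# Sum down each column: how many people gave each coriander answer.
col_sums = {}
for cells in table.values():
    for answer, frequency in cells.items():
        col_sums[answer] = col_sums.get(answer, 0) + frequency

# The grand total can be reached from either set of sums.
total = sum(row_sums.values())

print(row_sums)  # {'Soccer': 55, 'Basketball': 45}
print(col_sums)  # {'Like': 60, "Don't Like": 40}
print(total)     # 100
```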
Scatterplot
If our data is numerical, we can think of each data point as a coordinate:
Scatterplot
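To make the coordinate idea concrete, the sketch below draws a very rough text scatterplot: the first number in each pair is used as the horizontal position and the second as the vertical position. The dataset and the helper `text_scatter` are hypothetical, and the sketch assumes small non-negative whole-number data.

```python
def text_scatter(points, width, height):
    """Draw points on a character grid, with the origin at the bottom left."""
    # One row per y value from height down to 0, one column per x value.
    grid = [["." for _ in range(width + 1)] for _ in range(height + 1)]
    for x, y in points:
        # Row 0 is the top of the plot, so flip the y coordinate.
        grid[height - y][x] = "*"
    return "\n".join("".join(row) for row in grid)


# Hypothetical numerical dataset (assumed values).
points = [(1, 2), (3, 1), (4, 4)]

plot = text_scatter(points, width=4, height=4)
print(plot)
```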
If our dataset has repeated data points, we need to be able to indicate this on our scatterplot:
Plotting Repeated Data Points
Suppose we had the dataset:
We can see there is a repeated data point:
One approach we can take to show this on a scatterplot is to use different colours:
If we can’t use different colours, we can instead use different shapes. Either way, it is important to always have a legend with our plot.
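A minimal sketch of the colour approach: count how many times each data point occurs, then group the points by that count so each group can be drawn in its own colour and described in the legend. The dataset and the colour scheme here are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical dataset containing one repeated data point (assumed values).
data = [(1, 2), (2, 3), (1, 2), (4, 1)]

# Count how many times each data point occurs.
counts = Counter(data)

# Assumed colour scheme: one colour per number of occurrences.
colours = {1: "blue", 2: "red"}

# Group the distinct points by how many times they occur.
groups = {}
for point, n in counts.items():
    groups.setdefault(n, []).append(point)

# Legend entries: what each colour means on the plot.
legend = {colours[n]: f"appears {n} time(s)" for n in groups}

for n, pts in groups.items():
    print(colours[n], pts)
print(legend)
```

Each group could then be passed to a plotting library as its own series, for example one `scatter` call per group in matplotlib with a `label`, followed by `legend()`. Swapping the colours for marker shapes gives the alternative approach for plots that cannot use colour.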
Bivariate Data
Review Questions
What is a bivariate dataset?
A dataset generated by a single random variable
A dataset generated by a pair of random variables
A dataset generated by an unknown number of random variables
Two-Way Tables
Review Questions
For the final two-way table in the video, how many people were male and overweight?
21
9
6
11
For the final two-way table in the video, how many people were female?
11
21
9
6
For the final two-way table in the video, how many people were overweight?
21
11
9
6