TDM 10100: Project 12 — 20223
Motivation:
In the previous project we manipulated dates; in this project we continue to work with dates.
Working with dates in R can require more attention than working with other object classes. Several R packages help simplify common tasks related to date data.
Dates and times can be complicated. For instance, not every year has 365 days. Dates are difficult because they have to account for the Earth's rotation and its orbit around the sun. We also need to handle time zones, daylight saving time, etc. Suffice it to say that, when working with dates and date-times in R, the simpler the better.
Dataset(s)
The project will use the following dataset:
- /anvil/projects/tdm/data/restaurant/orders.csv
Questions
Go ahead and use the fread function from the data.table library to read in the dataset to a data frame called orders.
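As a rough illustration of the read step, the sketch below runs fread on a small in-memory CSV instead of the real file, so it can be tried anywhere data.table is installed. The column names are taken from the questions in this project, but every value, and the name orders_toy, is made up for illustration.

```r
library(data.table)

# Hypothetical stand-in for orders.csv: the column names come from the
# questions in this project, but every value here is invented.
csv_text <- "customer_id,created_at,payment_mode,grand_total
A1,2019-08-03 14:22:10,CARD,250.5
B2,2019-08-17 09:05:00,CASH,90
A1,2020-01-05 18:30:45,CARD,200"

# On Anvil the call would instead be:
# orders <- fread("/anvil/projects/tdm/data/restaurant/orders.csv")
orders_toy <- fread(text = csv_text)

# Quick sanity checks: number of rows/columns and the first few rows
print(dim(orders_toy))
print(head(orders_toy))
```

Checking dim and head right after reading is a cheap way to confirm that the file was parsed into the expected columns before doing any date work.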
Question 1 (2 pts)
- Use the substr function to get (only) the month-and-year of each date in the created_at column. How many times does each month-and-year pair occur? You may find more information about the substr function here: R substring
- Now (instead) use the month function and the year function on the created_at column, and make sure that your results agree with the results from 1a.
- Finally, use the format function to extract the month-and-year pairs from the created_at column, and make sure that your results (again!) agree with the results from 1a.
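The three approaches in this question can be compared on a handful of made-up timestamps. This is only a sketch: it assumes that created_at looks like "YYYY-MM-DD HH:MM:SS", which should be checked against the real data, and it takes month and year from data.table (which exports both) only if that package is available.

```r
# Hypothetical timestamps; the "YYYY-MM-DD HH:MM:SS" layout is an assumption.
created_at <- c("2019-08-03 14:22:10", "2019-08-17 09:05:00",
                "2020-01-05 18:30:45", "2020-01-20 12:00:00",
                "2020-01-31 23:59:59")

# Approach 1: keep the first 7 characters ("YYYY-MM") of each string
by_substr <- substr(created_at, 1, 7)

# Approach 3: parse to date-times first, then format back to "YYYY-MM"
stamps <- as.POSIXct(created_at, tz = "UTC")
by_format <- format(stamps, "%Y-%m")

# Count how often each month-and-year pair occurs
counts_substr <- table(by_substr)
counts_format <- table(by_format)
print(counts_substr)

# The two approaches should label every row identically
stopifnot(identical(by_substr, by_format))

# Approach 2: month() and year(), here taken from data.table if installed
if (requireNamespace("data.table", quietly = TRUE)) {
  by_parts <- sprintf("%d-%02d",
                      data.table::year(stamps), data.table::month(stamps))
  stopifnot(identical(by_parts, by_substr))
}
```

Comparing the label vectors directly, rather than eyeballing two printed tables, makes a disagreement between the methods fail loudly.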
Question 2 (2 pts)
- Which customer_id placed the largest number of orders altogether? (Each row of the data set represents exactly one order.)
- For the customer_id that you found in question 2a, either use the subset function or use indexing to find the month-and-year pair in which that customer placed the most orders.
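One possible pattern for this kind of "most frequent value, then drill down" question is table followed by which.max. The sketch below uses an invented data frame (orders_toy and all of its values are hypothetical) and assumes created_at starts with "YYYY-MM".

```r
# Hypothetical data: one row per order, as in the real data set.
orders_toy <- data.frame(
  customer_id = c("A1", "B2", "A1", "C3", "A1", "B2"),
  created_at  = c("2019-08-03 14:22:10", "2019-08-17 09:05:00",
                  "2020-01-05 18:30:45", "2019-08-20 11:00:00",
                  "2020-01-20 12:00:00", "2019-09-01 10:00:00"),
  stringsAsFactors = FALSE
)

# Count orders per customer and take the name of the largest count.
# Note: which.max returns only the first maximum if there is a tie.
order_counts <- table(orders_toy$customer_id)
top_customer <- names(which.max(order_counts))

# Keep that customer's rows, then repeat the same idea on month-and-year
top_orders   <- subset(orders_toy, customer_id == top_customer)
month_counts <- table(substr(top_orders$created_at, 1, 7))
busiest_month <- names(which.max(month_counts))

print(top_customer)
print(month_counts)
```

The indexing equivalent of the subset call would be orders_toy[orders_toy$customer_id == top_customer, ].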
Question 3 (2 pts)
- There are 5 types of payments in the payment_mode column. How many times is each of these 5 types of payments used in the data set?
- If we focus on the customer_id found in question 2a, which type of payment does that customer prefer? How many times did that customer use each of the 5 types of payments?
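When counting categories for a single customer, a payment type that the customer never used simply disappears from the table. A minimal sketch of one way around that, using invented data (the payment labels CARD, CASH and UPI are hypothetical, not the real values in payment_mode):

```r
# Hypothetical data; the payment labels below are invented placeholders.
orders_toy <- data.frame(
  customer_id  = c("A1", "B2", "A1", "C3", "A1", "B2"),
  payment_mode = c("CARD", "CASH", "CARD", "CASH", "CASH", "UPI"),
  stringsAsFactors = FALSE
)

# Overall usage of each payment type
overall <- table(orders_toy$payment_mode)
print(overall)

# Usage for one customer; fixing the factor levels keeps payment types
# that this customer never used in the result, with a count of 0.
one_customer <- subset(orders_toy, customer_id == "A1")
per_customer <- table(factor(one_customer$payment_mode,
                             levels = names(overall)))
print(per_customer)

# The preferred type is the one with the largest count
preferred <- names(which.max(per_customer))
```

Without the factor step the per-customer table would list only the types that appear in that customer's rows.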
Question 4 (2 pts)
- Use the subset function to make a data frame called ordersJan2020 that contains only the orders from January 2020.
- Create a plot using the ordersJan2020 data that shows the sum of the grand_total values for each of the 7 days of the week.
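A sketch of the filter-then-summarize-then-plot pattern on invented data follows. The names orders_toy and orders_jan_toy and all values are hypothetical, and created_at is again assumed to look like "YYYY-MM-DD HH:MM:SS". The day of the week is taken from the wday field of as.POSIXlt rather than from weekdays, since weekdays returns names in the current locale's language.

```r
# Hypothetical orders spanning December 2019 to February 2020.
orders_toy <- data.frame(
  created_at  = c("2019-12-31 20:00:00", "2020-01-05 18:30:45",
                  "2020-01-06 12:00:00", "2020-01-12 09:15:00",
                  "2020-01-20 12:00:00", "2020-02-01 08:00:00"),
  grand_total = c(100, 200, 50, 25.5, 10, 75),
  stringsAsFactors = FALSE
)

# Parse once, then derive a month-and-year label and a day of the week
stamps <- as.POSIXct(orders_toy$created_at, tz = "UTC")
orders_toy$month_year <- format(stamps, "%Y-%m")

# wday runs from 0 (Sunday) to 6 (Saturday), independent of locale
day_names <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")
orders_toy$day <- factor(day_names[as.POSIXlt(stamps)$wday + 1],
                         levels = day_names)

# Keep only January 2020
orders_jan_toy <- subset(orders_toy, month_year == "2020-01")

# Sum grand_total per day; days with no orders come back as NA, so set them to 0
totals <- tapply(orders_jan_toy$grand_total, orders_jan_toy$day, sum)
totals[is.na(totals)] <- 0

barplot(totals, main = "Sum of grand_total by day (toy data)",
        xlab = "Day of week", ylab = "Sum of grand_total")
```

Fixing the factor levels in calendar order keeps all 7 bars in the plot, in a sensible order, even when some days have no orders.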
Project 12 Assignment Checklist
- Jupyter Lab notebook with your code, comments and output for the assignment
  - firstname-lastname-project12.ipynb
- R code and comments for the assignment
  - firstname-lastname-project12.R
- Submit files through Gradescope
Please double-check that your submission is complete and contains all of your code and output before submitting. If you are on a spotty internet connection, we recommend downloading your submission after submitting it, to make sure that what you think you submitted is what you actually submitted. In addition, please review our submission guidelines before submitting your project.