TDM 10100: Project 10 — 2023
Motivation: As we have learned, functions are foundational to more complex programs and behaviors.
There is an entire programming paradigm based on functions called functional programming.
Context:
We will apply functions to entire vectors of data using tapply and sapply. We learned how to create functions, and the next step is to use them on a series of data.
Dataset(s)
The project will use the following dataset(s):
- /anvil/projects/tdm/data/restaurant/orders.csv
- /anvil/projects/tdm/data/restaurant/vendors.csv
The read.csv() function uses a comma (`,`) as the field delimiter by default. You can also load the
Questions
Question 1 (2 pts)
Please load the datasets into data frames named orders and vendors.
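As a minimal sketch of how read.csv() behaves, the example below writes a tiny CSV to a temporary file and reads it back. The file contents and the name toy are hypothetical and stand in for the project data; the same call pattern should apply to the paths listed under Dataset(s).

```r
# Hypothetical toy file standing in for a real CSV; this is not the project data.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,vendor_tag_name", "1,Burgers", "2,Fries"), tmp)

# read.csv() uses sep = "," and header = TRUE by default,
# so the first line becomes the column names.
toy <- read.csv(tmp)

dim(toy)    # number of rows and number of columns
head(toy)   # first few rows, useful as a quick sanity check
```

Checking dim() and head() right after loading is a cheap way to confirm that the delimiter and header were interpreted as expected.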
There are many websites that explain how to use grep and grepl (the l stands for logical) to search for patterns. See, for example: statisticsglobe.com/grep-grepl-r-function-example
- Use the grepl function and the subset function to make a new data frame from vendors, containing only the rows with "Fries" in the column called vendor_tag_name.
- Now use the grep function and row indexing to make a data frame from vendors that (as before) contains only the rows with "Fries" in the column called vendor_tag_name.
- Verify that your data frames in questions 1a and 1b are the same size.
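The difference between grepl and grep can be sketched on a small made-up data frame. The data frame toy and its columns id and tag are hypothetical, chosen only for illustration. grepl returns one logical value per element, while grep returns the integer positions of the matches.

```r
# Hypothetical toy data; the column names are made up for illustration.
toy <- data.frame(
  id  = 1:4,
  tag = c("Burgers,Fries", "Pizza", "Fries,Shakes", NA),
  stringsAsFactors = FALSE
)

# grepl() returns TRUE/FALSE for each element (FALSE for NA),
# which subset() can use directly as a row filter.
by_grepl <- subset(toy, grepl("Fries", tag))

# grep() returns the integer positions of the matching elements,
# which can be used for row indexing.
by_grep <- toy[grep("Fries", toy$tag), ]

# Both approaches should select the same rows.
dim(by_grepl)
dim(by_grep)
```

Comparing dim() of the two results is one simple way to check that both approaches agree.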
Question 2 (2 pts)
- In the data frame vendors, there are two types of delivery_charge values: 0 (which represents free delivery) and 0.7 (which represents non-free delivery). Make a table that shows how many of each type of value there are in the delivery_charge column.
- Please use the prop.table function to convert these counts into percentages.
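A short sketch of table and prop.table on a made-up numeric vector follows. The vector charge is hypothetical and is not taken from vendors.csv. Note that prop.table returns proportions that sum to 1, so multiplying by 100 is one way to express them as percentages.

```r
# Hypothetical charges, not taken from vendors.csv.
charge <- c(0, 0.7, 0.7, 0, 0.7)

counts <- table(charge)        # how many times each distinct value occurs
props  <- prop.table(counts)   # each count divided by the total; sums to 1

counts
props
100 * props                    # rescale the proportions to percentages
```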
Question 3 (2 pts)
- Consider only the vendors with vendor_category_id == 2. Among these vendors, find the percentages of the delivery_charge column that are 0 (free delivery) and 0.7 (non-free delivery).
- Now consider only the vendors with vendor_category_id == 3, and again find the percentages of the delivery_charge column that are 0 (free delivery) and 0.7 (non-free delivery).
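One possible pattern for this kind of question is to filter a column with a logical condition and then tabulate what remains. The sketch below uses a hypothetical data frame toy with made-up columns category and charge; it illustrates the technique only and is not the project data.

```r
# Hypothetical toy data; values are made up for illustration.
toy <- data.frame(
  category = c(2, 2, 2, 3, 3, 3, 3),
  charge   = c(0, 0.7, 0.7, 0, 0, 0, 0.7)
)

# Keep only the charges that belong to one category, then tabulate them.
charge_cat2 <- toy$charge[toy$category == 2]
p2 <- prop.table(table(charge_cat2))

# Repeat the same steps for the other category.
charge_cat3 <- toy$charge[toy$category == 3]
p3 <- prop.table(table(charge_cat3))

p2
p3
```

Repeating the same steps once per category works, but it motivates the grouped approach with tapply in the next question.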
Question 4 (1 pt)
- Solve questions 3a and 3b again, but this time, solve these two questions with one application of the tapply command, which provides the answers to both questions. (It is fine to give only the counts here, in question 4a, and convert the counts to percentages in question 4b.)
- Now (instead) use a user-defined function inside the tapply call to convert your answer from counts into percentages.
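As a rough sketch of the grouped approach, tapply takes a vector of values, a grouping vector, and a function to apply to each group. The data frame toy and the helper name to_props below are hypothetical, introduced only for illustration.

```r
# Hypothetical toy data; values are made up for illustration.
toy <- data.frame(
  category = c(2, 2, 2, 3, 3, 3, 3),
  charge   = c(0, 0.7, 0.7, 0, 0, 0, 0.7)
)

# One tapply() call: split charge by category and tabulate each group.
counts_by_cat <- tapply(toy$charge, toy$category, table)

# A user-defined function (hypothetical name) that returns proportions
# for whatever vector it is given.
to_props <- function(x) {
  prop.table(table(x))
}

# The same tapply() call, now with the user-defined function.
props_by_cat <- tapply(toy$charge, toy$category, to_props)

counts_by_cat
props_by_cat
```

Because the function returns more than one number per group, tapply gives back a list-like result with one entry per category, which can be accessed by name, for example counts_by_cat[["2"]].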
Question 5 (1 pt)
- Starting with your solution to question 4a, now use the sapply command to convert your answer from counts into percentages. Your solution should agree with the percentages that you found in question 4b.
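The idea of applying a function to each element of a grouped result can be sketched as follows. The data frame toy is hypothetical, with made-up values. sapply calls prop.table on each per-group table of counts and, when every group yields a result of the same length, simplifies the output to a matrix with one column per group.

```r
# Hypothetical toy data; values are made up for illustration.
toy <- data.frame(
  category = c(2, 2, 2, 3, 3, 3, 3),
  charge   = c(0, 0.7, 0.7, 0, 0, 0, 0.7)
)

# Counts per category, one table per group.
counts_by_cat <- tapply(toy$charge, toy$category, table)

# Apply prop.table() to each group's table of counts.
props_by_cat <- sapply(counts_by_cat, prop.table)

props_by_cat
colSums(props_by_cat)   # each column should sum to 1
```

If some group were missing one of the values, the per-group results would have different lengths and sapply would likely return a list rather than a matrix, so it is worth inspecting the result before relying on its shape.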
Project 10 Assignment Checklist
- Jupyter Lab notebook with your code, comments and output for the assignment
  - firstname-lastname-project10.ipynb
- R code and comments for the assignment
  - firstname-lastname-project10.R
- Submit files through Gradescope
Please double-check that your submission is complete and contains all of your code and output before submitting. If you are on a spotty internet connection, it is recommended to download your submission after submitting it, to make sure that what you think you submitted is what you actually submitted. In addition, please review our submission guidelines before submitting your project.