TDM 30200: Project 1 — 2023
Motivation: Welcome back! This semester should be a bit more straightforward than last semester in many ways. In the first project back, we will do a bit of UNIX review, a bit of Python review, and I’ll ask you to learn and write about some terminology.
Context: This is the first project of the semester! We will be taking it easy and slowly getting back to it.
Scope: UNIX, Python
Questions
Question 1
Google the difference between synchronous and asynchronous — there is a lot of information online about this.
Classify each of the following tasks (in day-to-day usage) as asynchronous or synchronous, and explain why.
- Communicating via email.
- Watching a live lecture.
- Watching a lecture that is recorded.

- Code used to solve this problem.
- Output from running the code.
Question 2
Given the following scenario and rules, explain the synchronous and asynchronous ways of completing the task.
You have 2 reports to write and 2 wooden pencils. 1 sharpened pencil will write 1/2 of 1 report. You have a helper who is willing to sharpen 1 pencil at a time for you, and who can sharpen a pencil in the time it takes to write 1/2 of 1 report.
Please assume you start with 2 sharpened pencils.
- Code used to solve this problem.
- Output from running the code.
Question 3
Write synchronous Python code that simulates the scenario in question 2. Let sharpening a pencil take 2 seconds and writing 0.5 reports take 5 seconds.
Use |
How much time does it take to write the reports in theory?
Here is some skeleton code to get you started.
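One way such a starting point could look is sketched below. This is a minimal sketch under stated assumptions, not the project's official skeleton: the function names (`sharpen_pencil`, `write_half_report`, `write_reports_sync`) and their parameters are hypothetical, and `time.sleep` simply stands in for the real work.

```python
import time


def sharpen_pencil(seconds: float) -> None:
    """Simulate the helper sharpening one pencil (hypothetical helper name)."""
    time.sleep(seconds)


def write_half_report(seconds: float) -> None:
    """Simulate writing half of one report with one sharp pencil."""
    time.sleep(seconds)


def write_reports_sync(n_reports: int = 2, n_pencils: int = 2,
                       write_s: float = 5.0, sharpen_s: float = 2.0) -> float:
    """Write every report synchronously and return the elapsed seconds.

    Assumption: when no sharp pencil is left, the writer stops and waits
    for one sharpening to finish before writing again.
    """
    halves_needed = 2 * n_reports   # one sharp pencil writes half a report
    sharp = n_pencils               # the scenario starts with all pencils sharp
    start = time.perf_counter()
    for _ in range(halves_needed):
        if sharp == 0:
            sharpen_pencil(sharpen_s)   # nothing else happens while waiting
            sharp += 1
        write_half_report(write_s)
        sharp -= 1                      # the pencil is dull after half a report
    return time.perf_counter() - start
```

Calling `write_reports_sync()` with the defaults really sleeps, for roughly the total of four writes plus two sharpenings. Passing smaller values for `write_s` and `sharpen_s` keeps experiments quick.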
- Code used to solve this problem.
- Output from running the code.
Question 4
Read the StackOverflow post and think about the asynchronous version of the scenario in question 2. Assume the time it takes to sharpen a pencil is 2 seconds and the time it takes to write 0.5 reports is 5 seconds.
How much time does it take to write the reports in theory, if you use the asynchronous method? Explain.
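To check a theoretical answer against a simulation, the asynchronous version could be sketched with Python's `asyncio`, as below. This is only an illustration under stated assumptions: the names `sharpen` and `write_reports_async` are hypothetical, `asyncio.sleep` stands in for the work, and the writer is assumed to hand over each dull pencil and keep writing without waiting for it.

```python
import asyncio
import time


async def sharpen(sharp_pencils: asyncio.Queue, helper: asyncio.Lock,
                  seconds: float) -> None:
    """Helper sharpens one pencil, then returns it to the sharp pile."""
    async with helper:                  # the helper handles one pencil at a time
        await asyncio.sleep(seconds)
    sharp_pencils.put_nowait("pencil")


async def write_reports_async(n_reports: int = 2, n_pencils: int = 2,
                              write_s: float = 5.0,
                              sharpen_s: float = 2.0) -> float:
    """Write every report while sharpening happens in the background.

    Returns the elapsed seconds until the last half report is written.
    """
    sharp_pencils: asyncio.Queue = asyncio.Queue()
    for _ in range(n_pencils):          # the scenario starts with sharp pencils
        sharp_pencils.put_nowait("pencil")
    helper = asyncio.Lock()
    pending = []
    start = time.perf_counter()
    for _ in range(2 * n_reports):      # one sharp pencil writes half a report
        await sharp_pencils.get()       # waits only if no pencil is sharp yet
        await asyncio.sleep(write_s)    # write half a report
        # Hand the dull pencil to the helper and carry on without waiting.
        pending.append(asyncio.create_task(
            sharpen(sharp_pencils, helper, sharpen_s)))
    elapsed = time.perf_counter() - start
    for task in pending:                # pencils still being sharpened are no longer needed
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return elapsed
```

A run would look like `asyncio.run(write_reports_async())`. Because the sharpening here takes no longer than the writing, the writer in this sketch never has to wait for a pencil.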
- Code used to solve this problem.
- Output from running the code.
Question 5
In your own words, describe the difference between concurrency and parallelism. Then, look at the flights datasets here: /anvil/projects/tdm/data/flights/subset. Describe an operation that you could do to the entire dataset as a whole. Describe how you (in theory) could parallelize that process.
Now, assume that you had the entire frontend system at your disposal. Use a UNIX command to find out how many cores the frontend has. If processing 1 file took 10 seconds, how many seconds would it take to process all of the files one after another? Approximately how many seconds would it take to process all of the files if you had the ability to parallelize on this system?
Don’t worry about overhead or the like. Just think at a very high level.
Best make sure this sounds like a task you’d actually like to do; I may be asking you to do it in the not-too-distant future.
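The high-level estimate can be written down as a small calculation. The sketch below is a hypothetical illustration, not a required solution: `estimate_seconds` and `count_files` are made-up helper names, overhead is ignored as the question suggests, and the files are assumed to be processed in equal-length "waves" of one file per worker.

```python
import math
from pathlib import Path


def count_files(directory: str) -> int:
    """Count the regular files directly inside a directory."""
    return sum(1 for p in Path(directory).iterdir() if p.is_file())


def estimate_seconds(n_files: int, seconds_per_file: float,
                     n_workers: int = 1) -> float:
    """Rough wall-clock estimate that ignores all overhead.

    Assumption: each worker handles one file at a time, so the files are
    processed in ceil(n_files / n_workers) waves.
    """
    if n_files < 0 or n_workers < 1:
        raise ValueError("need n_files >= 0 and n_workers >= 1")
    waves = math.ceil(n_files / n_workers)
    return waves * seconds_per_file


# Example usage on the system (not run here):
# n = count_files("/anvil/projects/tdm/data/flights/subset")
# serial = estimate_seconds(n, 10)             # one file after another
# parallel = estimate_seconds(n, 10, n_cores)  # n_cores from your UNIX command
```

Here `n_cores` is a placeholder for whatever number the UNIX command reports. `nproc` is one common choice, and `os.cpu_count()` is a rough Python counterpart.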
- Code used to solve this problem.
- Output from running the code.
Please double-check that your submission is complete and contains all of your code and output before submitting. If you are on a spotty internet connection, it is recommended that you download your submission after submitting it, to make sure that what you think you submitted is what you actually submitted. In addition, please review our submission guidelines before submitting your project.