TDM 40200: Project 13 — 2023
Motivation: Containers are everywhere and a very popular method of packaging an application with all of the requisite dependencies. In the previous series of projects you’ve built a web application. While right now it may be easy to share and run your application with another individual, as time goes on and packages are updated, this is less and less likely to be the case. Containerizing your application ensures that the application will have the proper versions of the proper packages available in the proper location to run.
Context: This is the third in a series of projects focused on containers. The end goal of this series is to solidify the concept of a container, and enable you to "containerize" the application you’ve spent the semester building. You will even get the opportunity to deploy your containerized application!
Scope: Python, containers, UNIX
Questions
Question 1
The end goal of this project is to containerize your frontend and backend (into two different containers), and make sure that they can communicate with each other. The following is a rough sketch of the steps involved in this process, so you have a general idea what is next at each step.
-
On Anvil, launch and connect to your VM with Docker pre-installed.
-
Copy the frontend and backend code from Anvil to your VM.
-
Create a Dockerfile (or, what can more generically be referred to as a Containerfile) for each of the frontend and backend.
-
Use Docker to build a container image for each of the frontend and backend.
-
Run the containers and make sure they can communicate with each other.
Ultimately, in the next project, you will be deploying your frontend and backend on a Kubernetes cluster, Geddes, behind a URL! So, at the very end of this project, we will ask you to verify your access to Geddes (which you’ve hopefully already been granted).
For this question, simply prep your working environment. Launch a SLURM job, prop up your VM, and ensure you can connect to it. The only thing you need to submit is a screenshot showing that you can connect to your VM.
-
Get a terminal on Anvil. You may complete this part however you like. I like to use ssh to connect to Anvil from my local machine; however, you may also use ondemand.anvil.rcac.purdue.edu, launch a Jupyter Lab session, and launch a terminal from within Jupyter Lab. Either works equally well.
-
Clear out any potential SLURM environment variables:
for i in $(env | awk -F= '/SLURM/ {print $1}'); do unset $i; done;
-
Launch a SLURM job with 8 cores and about 16 GB of memory and get a shell on the given backend node:
salloc -A cis220051 -p shared -n 8 -c 1 -t 04:00:00
This job will only buy you 4 hours of time on the backend node. If you need more time, you will need to re-launch the job and change the arguments to salloc to request more time.
-
Once you have a shell on the backend node, you will need to load the qemu module:
module load qemu
-
Next, copy over a fresh VM image to use for this project:
cp /anvil/projects/tdm/apps/qemu/images/alpine.qcow2 $SCRATCH
If at any time you want to start fresh, you can simply copy over a new VM image from /anvil/projects/tdm/apps/qemu/images/alpine.qcow2 to your $SCRATCH directory. Any changes you made to the previous image will be lost. This is good to know in case you want to try something crazy but are worried about breaking something! No need to worry, you can simply re-copy the VM image and start fresh anytime!
-
The previous command will result in a new file called alpine.qcow2 in your $SCRATCH directory. This is the VM image you will be using for this project. Now, you will need to launch the VM:
qemu-system-x86_64 -vnc none,ipv4 -hda $SCRATCH/alpine.qcow2 -m 8G -smp 4 -enable-kvm -net nic -net user,hostfwd=tcp::2200-:22 &
The last part of the previous command forwards traffic from port 2200 on Anvil to port 22 on the VM. If you receive an error about port 2200 being used, you can change this number to any other unused port number. To find an unused port, you can use a utility we have made available to you.
module use /anvil/projects/tdm/opt/core
module load tdm
find_port
The find_port command will output an unused port for you to use. If, for example, it output 12345, then you would change the qemu command to the following.
qemu-system-x86_64 -vnc none,ipv4 -hda $SCRATCH/alpine.qcow2 -m 8G -smp 4 -enable-kvm -net nic -net user,hostfwd=tcp::12345-:22 &
-
After launching the VM, it will be running in the background as a process (this is what the & at the end of the command does). After about 15-30 seconds, the VM will be fully booted and you can connect to the VM from Anvil using the ssh command.
ssh -p 2200 tdm@localhost -o StrictHostKeyChecking=no
You may be prompted for a password for the user tdm. The password is simply purdue.
If in a previous step you changed the port from, say, 2200 to something like 12345, you would change the ssh command accordingly.
-
Finally, you should be connected to the VM and have a new shell running inside the VM, great! If you were successful, contents of the terminal should look very similar to the following.
|
If at any time you would like to "save" your progress and restart the project at a later date or time, you can do this by exiting the VM by running the |
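The one-line loop that clears SLURM variables in the steps above can be hard to read. The following is a minimal sketch of the same idea written as a function with comments. The function name clear_slurm_env is hypothetical, and unlike the original one-liner this version matches on the variable name only (not the value) and skips anything that is not a valid shell variable name.

```shell
# Unset every environment variable whose NAME contains "SLURM".
# (clear_slurm_env is a hypothetical helper name, not part of Anvil or SLURM.)
clear_slurm_env() {
    # env prints NAME=value pairs; awk splits each line on "=" and prints
    # the name when it is a valid identifier containing "SLURM".
    for name in $(env | awk -F= '$1 ~ /^[A-Za-z_][A-Za-z0-9_]*$/ && $1 ~ /SLURM/ {print $1}'); do
        unset "$name"
    done
}

# Usage (run in the same shell you will call salloc from):
#   clear_slurm_env
```

Because the function runs in your current shell rather than a subshell, the variables are actually removed from the session you go on to use.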
-
Code used to solve this problem.
-
Output from running the code.
Question 2
The next step is to copy the application /anvil/projects/tdm/etc/project13 and the database /anvil/projects/tdm/data/movies_and_tv/imdb.db to the VM (the database belongs in /home/tdm for this project). You can do this by using the scp command. scp uses ssh to securely transfer files between hosts. Remember, your VM is essentially another machine with open port 2200 for ssh (and scp). Figure out how to accomplish this task and then copy the application to the VM.
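As a starting point, here is one possible sketch of a copy helper, run from Anvil. It assumes the VM is reachable on localhost with the tdm user, as in Question 1. The function name copy_to_vm and the SCP override variable are hypothetical and exist only for illustration. Note that scp uses a capital -P for the port, while ssh uses a lowercase -p.

```shell
# Copy a file or directory from Anvil into the VM over the forwarded SSH port.
# Usage: copy_to_vm <source> <destination-on-vm> [port]
# (copy_to_vm and the SCP variable are hypothetical names for this sketch.)
copy_to_vm() {
    src=$1
    dest=$2
    port=${3:-2200}   # default to the port forwarded in Question 1
    # -P sets the port, -r copies directories recursively.
    # Setting SCP=echo prints the arguments instead of copying (a dry run).
    ${SCP:-scp} -P "$port" -r -o StrictHostKeyChecking=no "$src" "tdm@localhost:$dest"
}

# Usage:
#   copy_to_vm /anvil/projects/tdm/etc/project13 /home/tdm/
#   copy_to_vm /anvil/projects/tdm/data/movies_and_tv/imdb.db /home/tdm/
```

If you changed the forwarded port in Question 1, pass it as the third argument.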
For this question, submit a screenshot of the following on the VM.
ls -la /home/tdm/project13/frontend
ls -la /home/tdm/project13/frontend/templates
ls -la /home/tdm/project13/backend/api
-
Code used to solve this problem.
-
Output from running the code.
Question 3
Create two Dockerfile files:
-
/home/tdm/project13/frontend/Dockerfile
-
/home/tdm/project13/backend/Dockerfile
As long as your images build and work correctly, you can use any base image you want. However, if you want the potential to get better/faster help (via Piazza), you should use the following base image: python:3.11.3-slim-bullseye (hub.docker.com/_/python/tags?page=1&name=3.11).
Here are some general guidelines for your Dockerfile files.
Frontend
-
Use the python:3.11.3-slim-bullseye base image.
-
Optionally use the WORKDIR command to set an internal (to the container) working directory /app.
-
Copy the project13/frontend directory to the container, maybe in the /app workdir.
You can use COPY . /app/ to copy the contents of the current directory (the directory where your Dockerfile lives) to the /app directory in the container.
-
Install the required Python packages using pip.
The following are the required Python packages: httpx and "fastapi[all]" (the double quotes are needed).
-
Use EXPOSE to mark port 8888 as being used by the container.
-
Use CMD or ENTRYPOINT to start the application.
Use the --host argument to uvicorn and specify 0.0.0.0 to listen on all network interfaces.
Since you are running your application from a different perspective than before, you will need to modify frontend.endpoints:app to endpoints:app.
|
To build the image, you can use the following command.
|
Backend
-
Use the python:3.11.3-slim-bullseye base image.
-
Optionally use the WORKDIR command to set an internal (to the container) working directory /app.
-
Copy the project13/backend directory to the container, maybe in the /app workdir.
You can use COPY . /app/ to copy the contents of the current directory (the directory where your Dockerfile lives) to the /app directory in the container.
-
Install the required Python packages using pip.
The following are the required Python packages: httpx, "fastapi[all]", aiosql==7.2, and pydantic (the double quotes are needed).
-
Use EXPOSE to mark port 7777 as being used by the container.
-
Use VOLUME to specify a mount point inside the container. This will be where we will mount imdb.db so that our application can access the database outside of the container. You should use the location /data.
-
Use CMD or ENTRYPOINT to start the application.
Use the --host argument to uvicorn and specify 0.0.0.0 to listen on all network interfaces.
Since you are running your application from a different perspective than before, you will need to modify backend.api.api:app to api.api:app.
|
To build the image, you can use the following command.
|
For this question, include the contents of both of your Dockerfile files in your submission. If you make mistakes and need to modify your Dockerfile files in future questions, please update your submission for this question to be the functioning Dockerfile files.
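To make the frontend guidelines concrete, here is a rough sketch of a shell function that writes a Dockerfile following them. It is one possible layout under stated assumptions, not the required answer: the function name write_frontend_dockerfile is hypothetical, the CMD form and the explicit --port 8888 are choices made for this sketch, and your application may need adjustments.

```shell
# Write a sketch of a frontend Dockerfile into the given directory.
# (write_frontend_dockerfile is a hypothetical helper name.)
write_frontend_dockerfile() {
    # The quoted 'EOF' keeps the heredoc body literal (no variable expansion).
    cat > "$1/Dockerfile" <<'EOF'
FROM python:3.11.3-slim-bullseye
WORKDIR /app
COPY . /app/
RUN pip install --no-cache-dir httpx "fastapi[all]"
EXPOSE 8888
CMD ["uvicorn", "endpoints:app", "--host", "0.0.0.0", "--port", "8888"]
EOF
}

# Usage (on the VM):
#   write_frontend_dockerfile /home/tdm/project13/frontend
```

A backend Dockerfile would follow the same shape, differing per the backend guidelines: the extra packages, port 7777, a VOLUME instruction for /data, and api.api:app as the application to start.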
-
Code used to solve this problem.
-
Output from running the code.
Question 4
Okay, awesome! You now have a couple of container images built and available on your VM, named client and server. You should be able to see these images by running the following command.
docker images
Okay, the next step is to run both of the containers, making sure that they can communicate. Our ultimate goal here is to run the following command and get the following results.
curl localhost:8888/people/nm0000148
<html>
<head>
<title>Harrison Ford</title>
<script src="https://unpkg.com/htmx.org@1.8.6"></script>
</head>
<body>
<div hx-target="this" hx-swap="outerHTML">
<div>
<label for="person_id">Person ID:</label> nm0000148
</div>
<div>
<label for="name">Name:</label> Harrison Ford
</div>
<div>
<label for="born">Born:</label> 1942
</div>
<div>
<label for="died">Died:</label> None
</div>
<button hx-get="http://localhost:8888/people/nm0000148/update">Click to update</button>
</div>
</body>
</html>
We want those results because it demonstrates, in a single command, a variety of important things:
-
We can access the frontend from the host machine (our VM).
-
The frontend can access the backend.
-
The backend can access the database.
This is enough evidence for us to say that our containers are communicating properly and are good enough to deploy (in the next project).
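If you would rather check the result than read the HTML by eye, a small sketch like the following may help. The function name page_has_title is hypothetical. It reads a page on standard input and succeeds only if the page contains the expected title element.

```shell
# Succeed if the HTML on stdin contains <title>EXPECTED</title>.
# (page_has_title is a hypothetical helper name.)
page_has_title() {
    # -q: no output, exit status only; -F: treat the pattern as a fixed string.
    grep -qF "<title>$1</title>"
}

# Usage (on the VM, once both containers are running):
#   curl -s localhost:8888/people/nm0000148 | page_has_title "Harrison Ford" && echo "OK"
```

A passing check touches all three links in the chain, since the title can only be filled in if the frontend reached the backend and the backend reached the database.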
First things first. By default, Docker will add any running container to the default bridge network. You can see this network listed by running the following.
docker network ls
NETWORK ID     NAME      DRIVER    SCOPE
6c21df067202   bridge    bridge    local
8acdd7457852   host      host      local
78e8c707cf0c   none      null      local
In theory, if you ran our frontend on the bridge network at 0.0.0.0:8888 and the server on the same network at 0.0.0.0:7777, they should be able to communicate. However, with the way we have our frontend configured in endpoints.py, it will not work. We can’t just specify localhost and move on. Instead, we would need to specify the actual IP address that the server is assigned on the bridge network. This is a bit of a pain, so we are going to create a new user-defined network and run our containers on that network. This way, we can refer to other containers on the same network by their name rather than their IP address.
Let’s create this network. We can call it anything, however, we will call it tdm-net.
docker network create tdm-net
Upon success, you should see the network in your list of networks.
docker network ls
NETWORK ID     NAME      DRIVER    SCOPE
6c21df067202   bridge    bridge    local
8acdd7457852   host      host      local
78e8c707cf0c   none      null      local
40574054296e   tdm-net   bridge    local
Now, in order to run our client (frontend) and server (backend) on the tdm-net network, we just need to add --net tdm-net to our docker run commands. Great!
Frontend
|
The |
|
It would be best to run this container using |
|
Don’t forget to run this container on the |
Backend
|
By default, we have |
|
The |
|
It would be best to run this container using |
|
Use the |
|
Don’t forget to run this container on the |
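Putting the hints above together, a docker run invocation for either container might look roughly like the sketch below. Everything here is an assumption for illustration: the helper name run_on_tdm_net and the DOCKER override are hypothetical, the -d flag (detached) is one choice among several, and the bind mount of /home/tdm/imdb.db to /data/imdb.db assumes that is where your backend looks for the database.

```shell
# Start a container on the tdm-net network, publishing one port.
# Usage: run_on_tdm_net <container-name> <image> <port> [extra docker run options...]
# (run_on_tdm_net and the DOCKER variable are hypothetical names for this sketch.)
run_on_tdm_net() {
    name=$1
    image=$2
    port=$3
    shift 3   # anything left in "$@" is passed through to docker run
    # Setting DOCKER=echo prints the arguments instead of running (a dry run).
    ${DOCKER:-docker} run -d --name "$name" --net tdm-net -p "$port:$port" "$@" "$image"
}

# Usage (on the VM):
#   run_on_tdm_net server server 7777 -v /home/tdm/imdb.db:/data/imdb.db
#   run_on_tdm_net client client 8888
```

Giving the backend container the name server matters on a user-defined network, because the frontend can then reach it by that name instead of by IP address.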
General tips
|
You can see if your containers are running properly by running |
|
If you need to tear down a running container named
|
|
If
|
|
If you want to "pop into" a running container, for example, the client, you can do so by running the following.
|
|
You may be wondering why we are using It is very common to have a need to persist some type of data. When this is needed, look towards using |
For this question, simply include a screenshot showing the successful curl command and output.
-
Code used to solve this problem.
-
Output from running the code.
Question 5
Finally, please verify that you have access to two resources for the next project (even if you don’t plan on doing it). On the Purdue VPN or on a Purdue network, please visit the following links:
-
Log in using your two-factor authentication (Purdue Login on Duo Mobile).
-
Click on the "geddes" name under the "Clusters" section.
-
Click on the Projects/Namespaces under the "Cluster" tab on the left-hand side.
-
Make sure you can see something like "The Data Mine - Students (tdm-students)". If you can, take a screenshot and you are done with this part. If you cannot, please post in Piazza with your Purdue username and specify that you could not see the Geddes project.
-
Log in using your Purdue alias and regular password.
-
If you log in successfully, take a screenshot and you are done with this part. If you cannot, please post in Piazza with your Purdue username and specify that you could not log in to the Geddes registry.
Include both screenshots for this question. If you failed on one or more of the steps, please just specify that you posted in Piazza and you will receive full credit.
-
Code used to solve this problem.
-
Output from running the code.
Please double check that your submission is complete and contains all of your code and output before submitting. If you are on a spotty internet connection, we recommend downloading your submission after submitting it to confirm that what you think you submitted is what you actually submitted. In addition, please review our submission guidelines before submitting your project.