Something I don't understand about Docker (as a new user)
Posted by zPlus in c/questions (5 votes, 1 comment)

I’m new to Docker. I’m trying to learn it because it looks like something that I could use. I went through their intro tutorial, but there’s something I don’t understand.

The tutorial basically creates a new image by pulling another one called “Python3”, and then it installs a minimal Flask app with only one route for showing a cat picture.

Question 1: how can I tell what the base distro is? I have a feeling it’s either Debian or BusyBox, but I can’t tell.

Question 2: the final image is ~1GB. Is this… normal? Am I missing some details? The Flask app is just 2KB, but the final image is 1GB. For people working on deploying Docker services, is it acceptable to deploy 1GB when the app (with dependencies) is maybe a few MB at most?

This is the Dockerfile used for building the image:

FROM python:3.8
WORKDIR /usr/src/app
COPY . .
RUN pip install --no-cache-dir -r requirements.txt
CMD ["python", "./"]

The base distro is whatever the base image you chose is built on. The first line of your Dockerfile (FROM python:3.8) names the base image of your container. That image is itself based on another image, which is based on another image, and so on. What is at the root of the chain? ¯\_(ツ)_/¯
In general, a Dockerfile is just a set of instructions for building an image. FROM says that you want to inherit from another image, a bit like including a header file in programming.

Do you need to know what the base distro is? Go to wherever you get your base images from, most likely Docker Hub. Then look up your specific base image and its tag. And in your case it is

Yeah, it is normal for your image to be around 1GB, because the python image is based on Debian.
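
If you'd rather check the distro from inside the image than look it up, most Linux distros describe themselves in /etc/os-release. Since the image already ships Python, here is a minimal sketch that reads that file. The helper names (parse_os_release, read_distro) are made up for illustration, and very minimal images may not have the file at all, in which case this just reports that.

```python
def parse_os_release(text):
    """Parse the KEY=value lines of an os-release file into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes.
        info[key] = value.strip().strip("\"'")
    return info


def read_distro(path="/etc/os-release"):
    """Return a short distro description, or None if the file is missing."""
    try:
        with open(path, encoding="utf-8") as f:
            info = parse_os_release(f.read())
    except FileNotFoundError:
        return None
    return info.get("PRETTY_NAME") or info.get("NAME")


if __name__ == "__main__":
    print(read_distro() or "no /etc/os-release found")
```

Assuming you save it as distro.py (a hypothetical file name), something like `docker run --rm -i python:3.8 python - < distro.py` should run it inside the container without rebuilding your image.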

One important thing about how Docker works: it is not a VM. It relies on cgroups and other resource-management and isolation features of the kernel.
In general, you need to pack the whole system except the kernel into your Docker image (I don’t know this for certain, but that’s what logic tells me). The kernel creates a separate, virtual view of the process list (not a VM, there is no emulation), in which the processes can see the kernel and the other prisoners of the sandbox. They don’t know they are in a sandbox and work with each other as expected. The kernel separates the scopes so that the main system doesn’t affect the sandbox, and the sandbox doesn’t interact or conflict with the programs of the main system.
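
The "everything except the kernel" part is something you can check for yourself. Below is a rough sketch, assuming a Linux host, that prints the kernel release the process sees and the cgroups it belongs to. The function names are hypothetical; platform.release() and /proc/self/cgroup are standard Python and Linux, respectively.

```python
import os
import platform


def kernel_release():
    """Kernel release string as reported by the running kernel (like uname -r)."""
    return platform.release()


def cgroup_membership(path="/proc/self/cgroup"):
    """Return the cgroup lines for this process, or [] if the file is unavailable."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


if __name__ == "__main__":
    print("kernel:", kernel_release())
    for line in cgroup_membership():
        print("cgroup:", line)
```

Run it once on the host and once inside the container. On a Linux host the kernel line should be the same in both places even though the image is a different distro, while the cgroup lines will usually differ. With Docker Desktop on macOS or Windows you would see the kernel of the Linux VM that Docker runs in instead.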