Dockerfile change #11

Open
@wilberh

I had to make the two local changes listed below to the Dockerfile to get it working. Only the first build took a long time, because it had to download the jupyter/pyspark-notebook base image(s) and all of the spaCy packages. I could be wrong about this, but the process appeared to use at least 40GB of my local drive (including my attempts to find the correct tag for the base image) in order to produce a 12.9GB Docker image.

Dockerfile changes:

  • had to set a specific Python 3.8 version (the python-3.8 tag of the base image)
  • added an ENTRYPOINT that starts "jupyter-lab"

I also created a docker-compose file so that a single CLI command [ docker compose up -d --build ] builds and (re)deploys/runs the image.

# Based on the Dockerfiles from the Jupyter Development Team which 
# are Copyright (c) Jupyter Development Team and distributed under 
# the terms of the Modified BSD License.
ARG OWNER=jupyter
ARG BASE_CONTAINER=$OWNER/pyspark-notebook:python-3.8
FROM $BASE_CONTAINER

LABEL maintainer="Paul Deitel <[email protected]>"

# Fix: https://github.com/hadolint/hadolint/wiki/DL4006
# Fix: https://github.com/koalaman/shellcheck/wiki/SC3014
SHELL ["/bin/bash", "-o", "pipefail", "-c"]

RUN mamba install --yes \
    'dnspython' \
    'folium' \
    'geopy' \
    'imageio' \
    'nltk'  \
    'pymongo' \
    'scikit-learn' \
    'spacy' \
    'tweepy'

RUN pip install --upgrade \
    'tensorflow' \
    'openai' \
    'beautifulsoup4' \
    'deepl' \
    'mastodon.py' \
    'better_profanity'  \
    'tweet-preprocessor' \
    'ibm-watson' \
    'pubnub' \
    'textblob' \
    'wordcloud' \
    'dweepy' \
    'sounddevice'

# download data required by textblob and spacy
RUN python -m textblob.download_corpora && \
    python -m spacy download en_core_web_sm && \
    python -m spacy download en_core_web_md && \
    python -m spacy download en_core_web_lg 

# clean up
RUN mamba clean --all -f -y && \
    fix-permissions "${CONDA_DIR}" && \
    fix-permissions "/home/${NB_USER}"

ENTRYPOINT ["start.sh", "jupyter-lab"]
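A possible follow-up on the image size: files deleted in a later RUN instruction still take up space in the earlier layers that created them, so the separate "clean up" step above probably does not shrink the final image. A minimal sketch of one alternative, assuming the same base image and its fix-permissions helper, is to clean up inside each install step. The package lists are shortened here for illustration, and this variant has not been built or measured, so any size saving is an assumption.

```dockerfile
# Sketch only: same base image as the Dockerfile above,
# with a shortened package list for illustration.
ARG OWNER=jupyter
ARG BASE_CONTAINER=$OWNER/pyspark-notebook:python-3.8
FROM $BASE_CONTAINER

SHELL ["/bin/bash", "-o", "pipefail", "-c"]

# Install and clean up in the same RUN instruction so the
# package cache is never committed to an image layer.
RUN mamba install --yes \
    'nltk' \
    'spacy' && \
    mamba clean --all -f -y && \
    fix-permissions "${CONDA_DIR}" && \
    fix-permissions "/home/${NB_USER}"

# --no-cache-dir keeps pip's download cache out of the layer.
RUN pip install --no-cache-dir --upgrade \
    'tensorflow' \
    'textblob' && \
    fix-permissions "${CONDA_DIR}" && \
    fix-permissions "/home/${NB_USER}"
```

The remaining packages and the corpus/model downloads would follow the same pattern, each RUN ending with its own cleanup.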

Docker compose file:

version: "3"

services:
  deitelpydsft:
    container_name: deitelpydsft
    user: root
    volumes:
      - .:/home/jovyan/work
    build: .
    restart: always
    # env_file: .env
    ports:
      - "8888:8888"
      - "4040:4040"
