Notebook Servers: Difference between revisions
Brian Wilson (talk | contribs)
Revision as of 22:30, 9 August 2022
Basically Jupyter already runs as a server on your local machine, but now there are a bunch of other ways to run "notebooks".
I am looking at alternatives to the ArcGIS Notebook Server because it's $20,000 plus $5,000/year for what appears to be basically a Docker manager. Esri uses the commercial version of Docker, which means they have to license it from Mirantis.
I've never had a need to license Docker, I just use the community version.
Resources
DataScienceNotebook -- I started making a list of options but then I found https://datasciencenotebook.org/ which was created by someone at the Deepnote project. I marched through the ones marked "OpenSource". The problem is the emphasis here is not on Python notebooks. It's on "data science" notebooks, some of which include Python support.
I found many links to people giving suggestions on how to set up just a plain old Jupyter server for example here. I need other features though.
My requirements
- Must support Conda so that I can install arcgis.
- Can I schedule jobs to run?
- Is there a dark mode?
- Can I store notebooks in git?
Okay maybe the last one is not a requirement.
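To keep track while marching through the candidates, the checklist above can be written down as data. This is only a sketch: the requirement keys are my shorthand for the list items, git is treated as optional, and the `example` entry is a made-up placeholder, not a result for any real product.

```python
# Shorthand keys for the requirements listed above.
REQUIREMENTS = ["conda", "scheduling", "dark_mode", "git"]
# git is "maybe not a requirement", so it is left out of the hard set.
REQUIRED = {"conda", "scheduling", "dark_mode"}


def unmet(candidate):
    """Return the hard requirements that are not confirmed True, in list order.

    A value of None means "not checked yet" and counts as unmet.
    """
    return [r for r in REQUIREMENTS if r in REQUIRED and candidate.get(r) is not True]


# Hypothetical candidate, for illustration only.
example = {"conda": True, "scheduling": None, "dark_mode": True, "git": False}
print(unmet(example))  # ['scheduling']
```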
I just decided to look at Deepnote first. It's running already on someone else's server.
Deepnote
They don't charge for it, so does it do what I need?
YES, in fact it appears to check all the boxes. I have not tried storing a project in GitHub or running a local copy yet.
I used my brian32768@github account to access it.
I was able to run an arcgis task in it.
Can I install the arcgis module?
Of course you can.
Create a notebook and install conda:
# 1. Install Conda and make Conda packages available in current environment
!wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
!chmod +x Miniconda3-latest-Linux-x86_64.sh
!sudo bash ./Miniconda3-latest-Linux-x86_64.sh -b -f -p /usr/local
import sys
sys.path.append('/usr/local/lib/python3.7/site-packages/')
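The hard-coded `python3.7` in that path only works while the notebook kernel runs Python 3.7. A possible variation, sketched here, builds the path from the running interpreter instead. The helper name `conda_site_packages` is mine, and it assumes conda was installed under `/usr/local` with the same Python minor version as the kernel; if Miniconda brings a different Python, its packages land in a different directory and may not be importable at all.

```python
import sys


def conda_site_packages(prefix="/usr/local", version=None):
    """Build <prefix>/lib/pythonX.Y/site-packages/ for a Python version.

    version is a (major, minor) tuple; it defaults to the running interpreter.
    """
    major, minor = version if version is not None else sys.version_info[:2]
    return f"{prefix}/lib/python{major}.{minor}/site-packages/"


# Make the conda-installed packages visible to this kernel (once).
path = conda_site_packages()
if path not in sys.path:
    sys.path.append(path)
```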
Install a package:
!sudo conda install -y arcgis -c esri
Use it:
from arcgis.gis import GIS

# Fill in your portal URL and credentials.
gis = GIS(url="", username="", password="")
cm = gis.content
maps = cm.search("", item_type='Web Map', outside_org=False, max_items=-1)
# Print a numbered list of the web map titles.
for thismap, item in enumerate(maps, start=1):
    print(f"{thismap}: {item.title}")
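Since the notebook lives on someone else's server, I would rather not type the portal password into a cell. A minimal sketch, assuming the notebook environment lets you define environment variables; the names `PORTAL_URL`, `PORTAL_USER` and `PORTAL_PASSWORD` are hypothetical, and the helper is not part of arcgis.

```python
import os


def portal_credentials(env=None):
    """Collect url/username/password from environment variables.

    The variable names are hypothetical placeholders; rename them to match
    whatever your notebook server actually provides.
    """
    env = os.environ if env is None else env
    names = {
        "url": "PORTAL_URL",
        "username": "PORTAL_USER",
        "password": "PORTAL_PASSWORD",
    }
    # Fail early with a readable message instead of logging in with blanks.
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in names.items()}
```

With arcgis installed, the returned dictionary could be passed along as `GIS(**portal_credentials())`.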
Okay, so that took all of 10 minutes.
Point goes to Deepnote.
What about scheduling?
Yes, another point for Deepnote. See How to schedule a notebook
Running locally
In theory I can run it in Docker, but I have to set up access to the Google Docker repos.
It does not seem like they want me to do this. So I won't; I will use Zeppelin.
Put this in a Dockerfile
FROM gcr.io/deepnote-200602/templates/deepnote
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh
RUN bash ~/miniconda.sh -b -p $HOME/miniconda
ENV PATH $HOME/miniconda/bin:$PATH
RUN conda install python=3.7 ipykernel -y
RUN conda install <insert packages here> -y
RUN python -m ipykernel install --user --name=conda
ENV DEFAULT_KERNEL_NAME "conda"
docker build -t deepnote .
It fails because I don't have access to the Google data. See https://docs.deepnote.com/integrations/google-container-repository
Git integration
You can store projects in GitHub. https://docs.deepnote.com/integrations/github
Zeppelin
docker run -p 8080:8080 --rm --name zeppelin apache/zeppelin:0.10.0
Okay now what -- that worked. I can type Python in a browser window and run it.
Can I do the same things I did in Deepnote to install the arcgis module?
- You don't have to install conda, it's already installed for you.
- Nothing actually works the way you expect.
- I gave up.
Fails--
%python.conda install arcgis -c esri
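One way to tell whether an install attempt like this actually made the module usable is to ask the interpreter itself, without importing anything. This is a generic standard-library sketch, not something Zeppelin-specific; in Zeppelin it would presumably go in a `%python` paragraph. The helper name `module_available` is mine.

```python
import importlib.util


def module_available(name):
    """Return True if the current interpreter can locate the top-level module `name`."""
    return importlib.util.find_spec(name) is not None


# Reports False when the install did not reach this interpreter.
print("arcgis importable:", module_available("arcgis"))
```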
Polynote
Runs on Apache Spark.
Python depends on pip, strangely awkward. Moving on.
Install: https://polynote.org/latest/docs/installation/
In Docker, they give me a blank page with an "edit" pencil. Huh. https://polynote.org/latest/docs/docker/
See also https://hub.docker.com/r/polynote/polynote :-)
And for actual instructions, see https://github.com/polynote/polynote/tree/master/docker
cat > config.yml <<'EOF'
listen:
  host: 0.0.0.0

storage:
  dir: /opt/notebooks
  mounts:
    examples:
      dir: examples
EOF
Then run this; if you don't create 'notebooks', Docker will create it and it won't be writeable.
mkdir notebooks
docker run --rm -it -p 8192:8192 -p 4040-4050:4040-4050 \
  -v `pwd`/config.yml:/opt/config/config.yml \
  -v `pwd`/notebooks:/opt/notebooks/ \
  polynote/polynote:latest --config /opt/config/config.yml
Then go to http://cc-testmaps:8192/
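The setup steps before `docker run` (write config.yml, create the notebooks directory up front so Docker does not create an unwriteable one) could also be scripted. A minimal sketch: the helper name `prepare_polynote_dir` is mine, and the YAML nesting is my reading of the flattened config shown earlier, so check it against the Polynote Docker instructions.

```python
from pathlib import Path

# Assumed layout of config.yml; the nesting is reconstructed, not verified.
CONFIG_TEXT = """\
listen:
  host: 0.0.0.0

storage:
  dir: /opt/notebooks
  mounts:
    examples:
      dir: examples
"""


def prepare_polynote_dir(base="."):
    """Create <base>/notebooks and write <base>/config.yml; return both paths."""
    base = Path(base)
    notebooks = base / "notebooks"
    # Creating the directory ourselves keeps it owned by the current user.
    notebooks.mkdir(parents=True, exist_ok=True)
    config = base / "config.yml"
    config.write_text(CONFIG_TEXT)
    return config, notebooks
```

Run it in the directory you will start Docker from, for example `prepare_polynote_dir(".")`, and then use the same `docker run` command as before.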
I might be able to create my own image with arcgis pre-installed in it?
I was able to download and install Miniconda interactively, which means I should be able to run it in a Dockerfile?
JupyterHub
Looks insanely complicated.
CoCalc
"On Prem" = $999 / year
The list says it's "Open Source" but it looks like that is no longer true.
nteract (sic)
Not even sure what this is.
Querybook
Forget this, there is no option to use Python. Querybook is "science for dummies". Might be a great way to experiment with SQL queries.
It's friendly though. I wonder if I can tone that down: FRIENDLINESS_LEVEL=10 # Default:10 Set to an integer, 0-10
Looks like it wants a lot of memory.
git clone https://github.com/pinterest/querybook.git
cd querybook
make
http://cc-testmaps.clatsop.co.clatsop.or.us:10001/
Did not complete. If I can't start a service in Docker, I'm thinking it's time to move on. So unfair of me, and yet, I've already seen two candidates for this project that look promising, Deepnote and Zeppelin.
Okay okay I gave it a second try. I just want to see its web page. It's spitting out lots of warning messages but in fact it did start. I can see redis, mysql, elasticsearch, a "scheduler", a "worker", and a web server.
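With that many containers starting at once, it is hard to tell from the log noise when the web server is actually up. A small sketch that polls a TCP port until something accepts a connection; the helper name `wait_for_port` and its defaults are mine, and it only proves the port is open, not that the page renders.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means some service is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the Querybook attempt above that would be something like `wait_for_port("cc-testmaps.clatsop.co.clatsop.or.us", 10001)`.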