Notebook Servers

From Wildsong
Brian Wilson (talk | contribs)
Revision as of 15:55, 29 July 2022

Basically Jupyter already runs as a server on your local machine, but now there are a bunch of other ways to run "notebooks".

I am looking at alternatives to the ArcGIS Notebook Server because it's $20000 + $5000/year for what appears to be basically a Docker manager. Esri uses the commercial version of Docker, which means they have to license it from Mirantis.

I've never needed to license Docker; I just use the community version.

Resources

DataScienceNotebook -- I started making a list of options but then I found https://datasciencenotebook.org/ which was created by someone at the Deepnote project. I marched through the ones marked "OpenSource". The problem is the emphasis here is not on Python notebooks. It's on "data science" notebooks, some of which include Python support.

I found many links to people giving suggestions on how to set up just a plain old Jupyter server, for example https://ipython.org/ipython-doc/3/notebook/public_server.html. I need other features though.

My requirements

  1. Must support Conda so that I can install arcgis.
  2. Can I schedule jobs to run?
  3. Is there a dark mode?
  4. Can I store notebooks in git?

Okay maybe the last one is not a requirement.

I just decided to look at Deepnote first. It's running already on someone else's server.


Deepnote

They don't charge for it, so does it do what I need?

YES, in fact it appears to check all the boxes. I have not tried storing a project in GitHub or running a local copy yet.

I used my brian32768@github account to access it.

I was able to run an arcgis task in it.

Can I install the arcgis module?

Of course you can.

Create a notebook and install conda:

# 1. Install Conda and make Conda packages available in current environment

!wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
!chmod +x Miniconda3-latest-Linux-x86_64.sh
!sudo bash ./Miniconda3-latest-Linux-x86_64.sh -b -f -p /usr/local

import sys
# Adjust python3.7 to match the Python version Miniconda actually installed under /usr/local/lib
sys.path.append('/usr/local/lib/python3.7/site-packages/')
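The hard-coded python3.7 in that path only works if the installer happens to put that version under /usr/local. A minimal sketch of one way around it, assuming the same /usr/local prefix as above: look for whatever python3.* directory exists and append its site-packages. The helper names are made up for this example, and packages built for a Python version different from the notebook kernel's may still fail to import.

```python
import sys
from pathlib import Path


def find_site_packages(prefix="/usr/local"):
    """Return every lib/python3.*/site-packages directory found under prefix."""
    lib = Path(prefix) / "lib"
    return sorted(str(p) for p in lib.glob("python3.*/site-packages") if p.is_dir())


def add_conda_packages(prefix="/usr/local"):
    """Append the directories found under prefix to sys.path, skipping duplicates."""
    added = []
    for path in find_site_packages(prefix):
        if path not in sys.path:
            sys.path.append(path)
            added.append(path)
    return added
```

Calling add_conda_packages() in a notebook cell after the installer finishes returns the list of paths it added, which is a quick way to see which Python version was installed.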

Install a package:

!sudo conda install -y arcgis -c esri

Use it:

from arcgis.gis import GIS

# Fill in the portal URL and credentials.
gis = GIS(url="", username="", password="")
cm = gis.content
maps = cm.search("", item_type='Web Map', outside_org=False, max_items=-1)
# Number the web maps starting at 1 and print their titles.
for thismap, webmap in enumerate(maps, start=1):
    print(f"{thismap}: {webmap.title}")

Okay, so that took all of 10 minutes.

Point goes to Deepnote.
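The login in the arcgis example has empty strings where the portal address and credentials go. A minimal sketch of one way to fill them in without typing a password into a shared notebook, assuming the values are stored as environment variables. The variable names ARCGIS_URL, ARCGIS_USERNAME and ARCGIS_PASSWORD are made up for this example.

```python
import os


def portal_credentials(env=os.environ):
    """Collect GIS login settings from environment variables.

    The variable names are hypothetical; use whatever names your setup defines.
    Missing variables come back as empty strings.
    """
    return {
        "url": env.get("ARCGIS_URL", ""),
        "username": env.get("ARCGIS_USERNAME", ""),
        "password": env.get("ARCGIS_PASSWORD", ""),
    }
```

The keys match the url, username and password keyword arguments of arcgis.gis.GIS, so the result can be passed along as GIS(**portal_credentials()).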

What about scheduling?

Yes, another point for Deepnote. See How to schedule a notebook

Running locally

In theory I can run it in a Docker container, but I have to set up access to the Google Docker repos.

Put this in a Dockerfile

FROM gcr.io/deepnote-200602/templates/deepnote
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh
RUN bash ~/miniconda.sh -b -p $HOME/miniconda
ENV PATH $HOME/miniconda/bin:$PATH
RUN conda install python=3.7 ipykernel -y
RUN conda install <insert packages here> -y
RUN python -m ipykernel install --user --name=conda
ENV DEFAULT_KERNEL_NAME "conda"

Then build the image:

docker build -t deepnote .

The build fails because I don't have access to the Google container repository. See https://docs.deepnote.com/integrations/google-container-repository

Git integration

You can store projects in GitHub. https://docs.deepnote.com/integrations/github

Zeppelin

https://zeppelin.apache.org/

docker run -p 8080:8080 --rm --name zeppelin apache/zeppelin:0.10.0

Okay now what -- that worked. I can type Python in a browser window and run it.

Can I do the same things I did in Deepnote to install the arcgis module?

  1. You don't have to install conda, it's already installed for you.
  2. Nothing actually works the way you expect.
  3. I gave up.

This fails:

%python.conda install arcgis -c esri


Polynote

Runs on Apache Spark.

Python support relies on pip, which is strangely awkward. Moving on.

Install: https://polynote.org/latest/docs/installation/
In Docker, they give me a blank page with an "edit" pencil. Huh. https://polynote.org/latest/docs/docker/
See also https://hub.docker.com/r/polynote/polynote :-)
And for actual instructions, see https://github.com/polynote/polynote/tree/master/docker

cat > config.yml <<'EOF'
listen:
  host: 0.0.0.0

storage:
  dir: /opt/notebooks
  mounts:
    examples:
      dir: examples
EOF

Then run this; if you don't create 'notebooks', Docker will create it and it won't be writeable.

mkdir notebooks
docker run --rm -it -p 8192:8192 -p 4040-4050:4040-4050 \
  -v "$(pwd)/config.yml:/opt/config/config.yml" \
  -v "$(pwd)/notebooks:/opt/notebooks/" \
  polynote/polynote:latest --config /opt/config/config.yml

Then go to http://cc-testmaps:8192/
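Before opening the browser it can help to confirm that the container is actually listening. A minimal sketch using only the Python standard library, assuming the host name and port from the docker run command above. port_open is a made-up helper name, and a successful connection only shows that something accepts TCP connections on that port, not that the notebook UI is healthy.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

For example, port_open("cc-testmaps", 8192) checks the Polynote container, and the same call with port 8080 checks Zeppelin.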

I might be able to create my own image with arcgis pre-installed in it?

I was able to download and install Miniconda interactively, which means I should be able to run it in a Dockerfile?

JupyterHub

Looks insanely complicated.

CoCalc

"On Prem" = $999 / year

The list says it's "Open Source" but it looks like that is no longer true.

nteract (sic)

Not even sure what this is.

Querybook

Forget this; there is no option to use Python. Querybook is "science for dummies". Might be a great way to experiment with SQL queries.

It's friendly though. I wonder if I can tone that down:

FRIENDLINESS_LEVEL=10 # Default:10 Set to an integer, 0-10

Looks like it wants a lot of memory.

git clone https://github.com/pinterest/querybook.git
cd querybook
make

http://cc-testmaps.clatsop.co.clatsop.or.us:10001/

Did not complete. If I can't start a service in Docker, I'm thinking it's time to move on. So unfair of me, and yet I've already seen two candidates for this project that look promising, Deepnote and Zeppelin.

Okay okay I gave it a second try. I just want to see its web page. It's spitting out lots of warning messages but in fact it did start. I can see redis, mysql, elasticsearch, a "scheduler", a "worker", and a web server.