Pandas

From Wildsong

All this stuff is amazing but hard to wrap my head around because it's more like working with SQL than nice linear procedural programming.

When something is hard for me, I write about it in my Wiki.

For now, this page is really all about GeoPandas.

There is also the ArcGIS version, and I should write about that here too.

Pandas

This is a Python package for manipulating data that implements dataframes. Dataframes are, well, amazing. They are also extremely well documented elsewhere.

So run along and learn Pandas THEN come back and learn GeoPandas.
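To make the SQL comparison at the top of this page a bit more concrete, here is a minimal sketch of the same job done both ways. The data is made up (loosely modeled on the road attributes shown further down), so treat the column names and numbers as assumptions, not real county data.

```python
import pandas as pd

# Made-up sample rows; the column names mimic the roads table shown later.
roads = pd.DataFrame({
    "STREET": ["HIGHLANDS LN", "HAWKINS RD", "MAKI RD", "LOWER NEHALEM RD"],
    "FUNCLASSM": ["Local", "Local", "Local", "Collector"],
    "Shape_Leng": [960.0, 1039.0, 326.0, 6089.0],
})

# Procedural style: walk the rows and keep a running total per class.
totals_loop = {}
for _, row in roads.iterrows():
    key = row["FUNCLASSM"]
    totals_loop[key] = totals_loop.get(key, 0.0) + row["Shape_Leng"]

# Dataframe style, roughly:
#   SELECT FUNCLASSM, SUM(Shape_Leng) FROM roads GROUP BY FUNCLASSM
totals_df = roads.groupby("FUNCLASSM")["Shape_Leng"].sum()

# Roughly: SELECT * FROM roads WHERE Shape_Leng > 1000
long_roads = roads[roads["Shape_Leng"] > 1000]

print(totals_df)
print(long_roads["STREET"].tolist())
```

Both approaches give the same totals; the dataframe version just says what you want instead of how to loop to get it, which is the part that feels like SQL.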

Esri "ArcGIS API for Python"

Yeah, so you can go install this and learn about it, but the docs are... difficult.

Esri installs these packages with ArcGIS Pro (this is the "conda list" output):

# packages in environment at C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3:
#
# Name                    Version                   Build  Channel
appdirs                   1.4.4                      py_0
arcgis                    1.8.3                 py37_1668    esri
arcgispro                 2.7                           0    esri
arcpy                     2.7             py37_arcgispro_26810  [arcgispro]  esri
argon2-cffi               20.1.0           py37he774522_1
asn1crypto                1.4.0                      py_0
async_generator           1.10                       py_0
atomicwrites              1.4.0                      py_0
attrs                     20.2.0                     py_0
backcall                  0.2.0                      py_0
black                     19.10b0                    py_0
blas                      1.0                         mkl
bleach                    3.2.1                      py_0
blinker                   1.4                      py37_0
brotlipy                  0.7.0           py37he774522_1000
cached-property           1.5.2                      py_0
certifi                   2020.6.20                py37_0
cffi                      1.14.3           py37h7a1dbc1_0
cftime                    1.0.0b1                  py37_0    esri
chardet                   3.0.4                 py37_1003
click                     7.1.2                      py_0
colorama                  0.4.3                      py_0
cppzmq                    4.4.1                         2    esri
cryptography              2.8                      py37_0    esri
cycler                    0.10.0                   py37_0
decorator                 4.4.2                      py_0
defusedxml                0.6.0                      py_0
despatch                  0.1.0                    py37_0    esri
entrypoints               0.3                      py37_0
et_xmlfile                1.0.1                   py_1001
fastcache                 1.1.0            py37he774522_0
flake8                    3.8.3                      py_0
freetype                  2.10.1                   vc14_0  [vc14]  esri
future                    0.18.2                   py37_0    esri
gdal                      2.3.3           arcgispro_py37_16713  [arcgispro]  esri
h5py                      2.10.0          py37_arcgispro_9  [arcgispro]  esri
html5lib                  1.1                        py_0
icc_rt                    2019.0.5            arcgispro_0  [arcgispro]  esri
idna                      2.10                       py_0
importlib-metadata        1.7.0                    py37_0
importlib_metadata        1.7.0                         0
iniconfig                 1.0.1                      py_0
intel-openmp              2020.0            arcgispro_166  [arcgispro]  esri
ipykernel                 5.3.4            py37h5ca1d4c_0
ipython                   7.18.1                   py37_0    esri
ipython_genutils          0.2.0                    py37_0
ipywidgets                7.5.1                      py_1
jdcal                     1.4.1                      py_0
jedi                      0.17.2                   py37_0    esri
jinja2                    2.11.2                     py_0
jpeg                      9d                            0    esri
json5                     0.9.4                    py37_0    esri
jsonschema                3.2.0                    py37_1
jupyter_client            6.1.7                      py_0    esri
jupyter_console           6.2.0                      py_2    esri
jupyter_contrib_core      0.3.3                    py37_3    esri
jupyter_contrib_nbextensions 0.5.1                   py37_10    esri
jupyter_core              4.6.3                    py37_2    esri
jupyter_highlight_selected_word 0.2.0                    py37_2    esri
jupyter_latex_envs        1.4.4                    py37_1    esri
jupyter_nbextensions_configurator 0.4.1                    py37_1    esri
jupyterlab                2.2.7                      py_0    esri
jupyterlab_pygments       0.1.1                      py_1    esri
jupyterlab_server         1.2.0                      py_0
keyring                   21.4.0                   py37_0    esri
kiwisolver                1.2.0            py37h74a9793_0
lerc                      2.2                        py_0    esri
libpng                    1.6.37               h2a8f88b_0
libsodium                 1.0.18                        1    esri
libtiff                   4.1.0                         0    esri
libxml2                   2.9.10              arcgispro_0  [arcgispro]  esri
libxslt                   1.1.34               he774522_0
lxml                      4.5.2            py37h1350720_0
lz4-c                     1.9.2                hf4a77e7_3
markupsafe                1.1.1            py37hfa6e2cd_1
matplotlib                3.3.1           py37_arcgispro_0  [arcgispro]  esri
mccabe                    0.6.1                    py37_1
mistune                   0.8.4           py37hfa6e2cd_1001
mkl                       2020.0            arcgispro_167  [arcgispro]  esri
mkl-service               2.3.0            py37hb782905_0
mkl_fft                   1.2.0            py37h45dec08_0
mkl_random                1.2.0                    py37_0    esri
mpmath                    1.1.0                    py37_0
mypy_extensions           0.4.3                    py37_0
nbclient                  0.5.0                      py_0    esri
nbconvert                 5.6.1                    py37_0    esri
nbformat                  5.0.7                      py_1    esri
nest-asyncio              1.3.2                      py_0    esri
netcdf4                   1.5.4           py37_arcgispro_5  [arcgispro]  esri
networkx                  2.5                      py37_0    esri
nlohmann_json             3.7.0                         1    esri
nose                      1.3.7                 py37_1004
notebook                  5.7.10                   py37_0
ntlm-auth                 1.4.0                      py_0    esri
numexpr                   2.7.1            py37h25d0782_0
numpy                     1.19.1           py37h5510c5b_0
numpy-base                1.19.1           py37ha3acd2a_0
oauthlib                  3.1.0                      py_0
olefile                   0.46                     py37_0
openpyxl                  3.0.5                      py_0
openssl                   1.1.1g                        2    esri
packaging                 20.4                       py_0
pandas                    1.1.1            py37ha925a31_0
pandocfilters             1.4.2                    py37_1
parso                     0.7.0                      py_0
pathspec                  0.7.0                      py_0
pefile                    2019.4.18                  py_0
pickleshare               0.7.5                 py37_1001
pillow-simd               7.1.2                    py37_3    esri
pip                       20.2.2                   py37_0
pluggy                    0.13.1                   py37_0
pro_notebook_integration  2.7                      py37_1    esri
prometheus_client         0.8.0                      py_0    esri
prompt_toolkit            3.0.5                      py_0    esri
psutil                    5.7.2            py37he774522_0
py                        1.9.0                      py_0
pybind11                  2.3.0                         1    esri
pycodestyle               2.6.0                      py_0
pycparser                 2.20                       py_2
pyflakes                  2.2.0                      py_0
pygments                  2.7.0                      py_0    esri
pyjwt                     1.7.1                    py37_0
pyopenssl                 19.1.0                     py_1
pyparsing                 2.4.7                      py_0
pypdf2                    1.26.0                     py_2    esri
pyrsistent                0.17.3           py37he774522_0
pyshp                     2.1.2                      py_0
pysocks                   1.7.1                    py37_1
pytest                    6.1.1                    py37_0    esri
python                    3.7.9                         2    esri
python-certifi-win32      1.2                      py37_0    esri
python-dateutil           2.8.1                      py_0
pytz                      2020.1                   py37_0    esri
pywin32-ctypes            0.2.0                    pypi_0    pypi
pywin32-security          228                      py37_3    esri
pywinpty                  0.5.7                    py37_0    esri
pyyaml                    5.3.1            py37he774522_1
pyzmq                     19.0.2                   py37_1    esri
regex                     2020.7.14        py37he774522_0
requests                  2.24.0                     py_0
requests-kerberos         0.12.0                        0    esri
requests-negotiate-sspi   0.5.2                    py37_1    esri
requests-oauthlib         1.3.0                      py_0
requests-toolbelt         0.9.1                      py_0
requests_ntlm             1.1.0                      py_0    esri
scipy                     1.5.2            py37h9439919_0
send2trash                1.5.0                    py37_0
setuptools                49.6.0                   py37_0
simplegeneric             0.8.1                    py37_2
six                       1.15.0                     py_0
sqlite                    3.33.0               h2a8f88b_0
sympy                     1.5.1                    py37_0    esri
terminado                 0.8.3                    py37_0
testpath                  0.4.4                      py_0
toml                      0.10.1                     py_0
tornado                   6.0.4            py37he774522_1
traitlets                 4.3.3                    py37_0
typed-ast                 1.4.1            py37he774522_0
typing_extensions         3.7.4.3                    py_0
ujson                     3.1.0            py37ha925a31_0
urllib3                   1.25.10                    py_0
vc                        14.1                 h0510ff6_4
vs2015_runtime            14.16.27012          hf0eaf9b_0    esri
wcwidth                   0.2.5                      py_0
webencodings              0.5.1                    py37_1
wheel                     0.35.1                     py_0
widgetsnbextension        3.5.1                    py37_0
win_inet_pton             1.1.0                    py37_0
wincertstore              0.2                      py37_0
winkerberos               0.7.0                    py37_1
winpty                    0.4.3                         4
wrapt                     1.12.1           py37he774522_1
x86cpu                    0.4                      py37_1    esri
xarray                    0.16.0                     py_0
xeus                      0.24.1                        1    esri
xlrd                      1.2.0                      py_0
xlwt                      1.3.0                    py37_0
xtl                       0.6.5                         1    esri
xz                        5.2.5                h62dcd97_0
yaml                      0.2.5                         0    esri
zeromq                    4.3.2                         2    esri
zipp                      3.1.0                      py_0
zlib                      1.2.11               h62dcd97_4
zstd                      1.4.5                h04227a9_0

GeoPandas

GeoPandas is geospatially enabled Pandas. Go right to the page for full info, https://geopandas.org.

Installation

Apparently there is a problem using the geopandas package and its dependencies from conda-forge. The long form is a problem with fiona, see https://github.com/geopandas/geopandas/issues/989. The short answer: disable conda-forge in .condarc and start again.

conda remove -n geopandas --all
conda create -n geopandas python autopep8 geopandas matplotlib jupyter jupyterlab
conda activate geopandas

In "conda list" I see that conda found packages in the esri channel; I'm not sure where exactly it picked that up. So be it. I ran the same commands on my tiny Linux server, Bellman, and it happily complied. I note it did not grab anything from esri there, so I must have had something cached at work?
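If you want to see which packages came from which channel without squinting at the listing, a small parser does the job. This is only a sketch: parse_conda_list is a hypothetical helper name, the sample text is a few rows copied from the listing above, and I am assuming the column layout stays name, version, build, optional bracketed tag, channel.

```python
def parse_conda_list(text):
    """Turn `conda list` text into a list of dicts (name, version, build, channel)."""
    packages = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the "#" header lines
        # Drop bracketed tags such as "[arcgispro]" so the channel is the 4th field.
        fields = [f for f in line.split() if not f.startswith("[")]
        if len(fields) < 3:
            continue  # not a package row
        name, version, build = fields[:3]
        # A blank channel column is kept as an empty string.
        channel = fields[3] if len(fields) > 3 else ""
        packages.append(
            {"name": name, "version": version, "build": build, "channel": channel}
        )
    return packages


# A few rows copied from the listing earlier on this page.
sample = """
# Name                    Version                   Build  Channel
appdirs                   1.4.4                      py_0
arcgis                    1.8.3                 py37_1668    esri
gdal                      2.3.3           arcgispro_py37_16713  [arcgispro]  esri
pandas                    1.1.1            py37ha925a31_0
"""

packages = parse_conda_list(sample)
esri_packages = [p["name"] for p in packages if p["channel"] == "esri"]
print(esri_packages)
```

To use it for real, paste in the text that "conda list -n geopandas" prints instead of the sample.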

Hello World from GeoPandas

In VSCODE, start a Jupyter Notebook. Follow along with someone else's sample; see the References section below.

import pandas as pd
import geopandas
import matplotlib.pyplot as plt
roads = geopandas.read_file("K:/LISData/roads_county.shp") # Of course put your own shapefile here!
roads.head(5)

and I glory in this!

STREET	FROMLEFT	TOLEFT	FROMRIGHT	TORIGHT	LESN	RESN	NOTES	COMMENTS	ST_CHK	...	SYMBOL	CITY	IRISNUM	sign_lomil	sign_himil	FUNCLASSD	FUNCLASSM	FUNCLASSN	Shape_Leng	geometry
0	HIGHLANDS LN	33301	33399	33300	33398	None	None	None	VERIFY ADDRESS RANGE	GSI	...	18	None	1010.0	0.15	0.29	Rural Local	Local	4	960.080746	LINESTRING (7333565.484 887114.321, 7333598.68...
1	HAWKINS RD	90200	90298	90201	90299	None	None	None	None	GSI	...	18	None	1020.0	0.19	0.51	Rural Local	Local	4	1038.814776	LINESTRING (7336949.000 902144.750, 7336920.40...
2	MAKI RD	0	0	0	0	None	None	None	None	None	...	18	None	411.0	0.00	0.93	Rural Local	Local	4	325.748159	LINESTRING (7399846.775 921085.451, 7399836.70...
3	LOWER NEHALEM RD	79500	79998	79501	79999	None	None	None	None	GSI	...	19	None	1404.0	3.21	5.25	Rural Major Collector	Collector	3	6089.149044	LINESTRING (7411398.616 799370.758, 7411406.96...
4	LOWER NEHALEM RD	

and CRIKEY it works, right there in VSCODE, the wonder of it!

On the Linux box Bellman, I started ipython3 from the command line and pasted the code in. Over there, I had to use the shapefile /home/bwilson/source/GIS/libspatialite-4.3.0/test/shp/merano-3d/roads.shp and it instantly responded with

   PK_UID  FEATURE_ID  CODE  SUB_TYPE  ...     E_ID  WINKEL    Length                                           geometry
0     156       92002     0        20  ...  9200200       0  0.192562  LINESTRING (666766.765 5169845.031, 666762.979...
1     185       92002     0        20  ...  9200200       0  0.011543  LINESTRING (666544.351 5170211.610, 666554.939...
2     187       92002     0        20  ...  9200200       0  0.138272  LINESTRING (666544.351 5170211.610, 666544.345...
3     188       92002     0        20  ...  9200200       0  0.115159  LINESTRING (666354.540 5170396.642, 666350.671...
4     189       92002     0        20  ...  9200200       0  0.134453  LINESTRING (666354.540 5170396.642, 666354.545... 
[5 rows x 25 columns]

Amazed I am. "roads.plot()" will run, but since I was in a terminal window it had no place to open the plot. I'd need X11 working, but I thought I'd try VSCODE in remote mode first. And it, too, works! Thrilled I am.
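Another way around the missing display, which I have not described above, is to skip the window entirely and write the plot to a file. Here is a minimal sketch, assuming matplotlib's non-interactive "Agg" backend and two made-up line segments standing in for a real shapefile; the output file name is also just an example.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: renders to files, needs no display

import geopandas
from shapely.geometry import LineString

# Two made-up segments; replace with geopandas.read_file(<your shapefile>).
roads = geopandas.GeoDataFrame(
    {"STREET": ["A ST", "B ST"]},
    geometry=[LineString([(0, 0), (1, 1)]), LineString([(1, 1), (2, 0)])],
)

ax = roads.plot()  # returns a matplotlib Axes
out_path = os.path.join(tempfile.gettempdir(), "roads.png")
ax.figure.savefig(out_path)  # write the figure to a PNG instead of opening a window
print("wrote", out_path)
```

Copy the PNG somewhere you can look at it, or open it from the VSCODE file browser.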

References

Mastering pandas - Second Edition has a short but sweet chapter on GeoPandas.

Learning Geospatial Analysis with Python - Third Edition has just a brief description of GeoPandas.

There are some older references in O'Reilly as well.

This video with Andy Eschbacher has some talk about GeoPandas starting at about 30:00.

More tools to look at (ugh): matplotlib, Folium