
A few months ago I attended the Spatial Data Science Conference in New York City, hosted by Carto.
Spatial Data Science is a field that deals with spatial data – that is, data points that have a geography associated with them. For example, a database of sales by ZIP code would be spatial data. In the planning world, spatial data tends to show up in conjunction with GIS (Geographic Information Systems) software, the most prevalent of which is ArcGIS (its best-known open-source alternative is QGIS).
In college, I worked for a semester at the maps library at the University of Michigan. I like maps, and spatial data, so when I found out about this conference I was excited to see how spatial data was used in industry.
I will have to type up my notes from the conference in a different post, but I’m writing this post because I want to learn more about spatial data science. I’m ~manifesting~ this by writing about it, hoping that it will keep me accountable. Back in 2023, I wrote about GIS being one of the three technical skills I wanted to learn.
Here are some things about spatial data science that interest me and that I would like to learn. I don’t have a plan or goal associated with this yet, because I want to make sure that anything I commit to is realistic (which I haven’t done the greatest job of in the past…oops).
Things I want to learn about Spatial Data
GIS
I’m lucky enough to have access to ArcGIS at work, and it’s something that my manager and others have said would be super useful to learn. I’d like to take advantage of more of the tutorials that Esri offers, and build a good base in GIS.
In particular, I’ve been told to focus on the following GIS topics:
- merging shapefiles
- creating new attributes
- manually editing those attributes
- buffer analyses (and general geoprocessing)
- network analyses
- ArcGIS Online (including being able to share maps with other people and allowing them to comment)
GeoPandas
Pandas is a Python library that’s widely used in data science for working with tabular data. Its name is usually explained as short for the Python Data Analysis Library. GeoPandas is an “open source project to make working with geospatial data in python easier.”
I’d like to learn to use GeoPandas to solve real-world problems. I’d also like to learn more about its two principal data structures, the GeoSeries and the GeoDataFrame.
GeoJSON
GeoJSON is a format that you can use to share geographic data between different platforms. I first heard about it because it’s one of the supported formats for Kepler.gl, an open-source data visualization tool. It can also be used with GeoPandas. I’d like to learn more about it.
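To make the format a bit more concrete, here is a minimal sketch of what a GeoJSON document looks like, built with nothing but Python’s standard json module. The store names, coordinates, and sales figures are made up for illustration; the only parts that come from the GeoJSON format itself are the "type", "features", "geometry", "coordinates", and "properties" keys, and the fact that coordinates are written longitude first, then latitude.

```python
import json

# Hypothetical sample data: two made-up stores with made-up sales totals.
# GeoJSON coordinates are ordered [longitude, latitude].
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [-73.99, 40.73]},
            "properties": {"name": "Store A", "sales": 120},
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [-83.74, 42.28]},
            "properties": {"name": "Store B", "sales": 45},
        },
    ],
}

# Serialize to GeoJSON text, which is just JSON with an agreed-upon structure.
geojson_text = json.dumps(feature_collection, indent=2)

# Read it back in and pull out each point's location and attributes.
parsed = json.loads(geojson_text)
for feature in parsed["features"]:
    lon, lat = feature["geometry"]["coordinates"]
    print(feature["properties"]["name"], lon, lat)
```

Saved to a file with a .geojson extension, text like this should be something that tools such as Kepler.gl or GeoPandas (through geopandas.read_file) can load, though I’d treat that as something to try out rather than a guarantee.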
PySAL
At the Spatial Data Science Conference, I attended a talk by Eli Knaap, an urban social / spatial data scientist at San Diego State University. His talk was about PySAL, which is a Python library for geospatial data science.
It looks really cool, and I’d love to learn more about it.