I am working toward plotting a large GIS dataset, of which I've shown a sample above of about 1/6 of the data. I'm happy with how quickly the data loads, and bokeh
renders the html almost instantaneously. However, I've hit a pretty hot loop in my code that isn't scaling well as I increase 1) the number of rows and 2) the resolution of the polygons. I'm just getting killed in the #count points
loop and am wondering if there isn't a better way of doing this?
I found the suggestion for the loop on a GIS readthedocs.io page and was happy with its performance for a few thousand points a couple of months ago. But now the project needs to process a GeoDataFrame
with >730000 rows. Is there a better method I'm supposed to be using to count the number of points in each polygon? I'm doing the computation on a modern desktop, but the project has access to Azure resources, so maybe that's how most people do this sort of computation professionally? I'd prefer to do the computation locally, but that means my desktop might have to sit at max CPU overnight or longer, which isn't a thrilling prospect. I'm using Python 3.8.2 & Conda 4.3.2.
from shapely.geometry import Polygon
import pysal.viz.mapclassify as mc
import geopandas as gpd
def count_points(main_df, geo_grid, ranges=5):
    """
    Outputs a gdf of polygons with a column of classifiers to be used for color mapping
    """
    pts = gpd.GeoDataFrame(main_df["geometry"]).copy()

    #counts points
    pts_in_polys = []
    for i, poly in geo_grid.iterrows():
        pts_in_this_poly = []
        for j, pt in pts.iterrows():
            if poly.geometry.contains(pt.geometry):
                pts_in_this_poly.append(pt.geometry)
                pts = pts.drop([j])
        nums = len(pts_in_this_poly)
        pts_in_polys.append(nums)
    geo_grid['number of points'] = gpd.GeoSeries(pts_in_polys) # Adds number of points in each polygon

    # Adds Quantiles column
    classifier = mc.Quantiles.make(k=ranges)
    geo_grid["class"] = geo_grid[["number of points"]].apply(classifier)

    # Adds Polygon grid points to new geodataframe
    # (getPolyCoords is a helper defined elsewhere in my code)
    geo_grid["x"] = geo_grid.apply(getPolyCoords, geom="geometry", coord_type="x", axis=1)
    geo_grid["y"] = geo_grid.apply(getPolyCoords, geom="geometry", coord_type="y", axis=1)
    polygons = geo_grid.drop("geometry", axis=1).copy()
    return polygons
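
To make the scaling problem concrete: the nested loop tests points against polygons one pair at a time, so the work grows roughly with (number of points) x (number of polygons). Below is a minimal sketch of one way around that, under the loud assumption that geo_grid is a regular, axis-aligned square grid (the name suggests it, but nothing above confirms it). In that case each point's cell can be computed arithmetically in a single pass, with no containment test at all. The function name and the parameters x_min, y_min, cell_size, n_cols and n_rows are hypothetical and not part of my code. For arbitrary polygons this shortcut does not apply, and a spatial index or a spatial join (geopandas has sjoin) would be the thing to look into instead.

```python
from collections import Counter
from math import floor


def count_points_in_grid(points, x_min, y_min, cell_size, n_cols, n_rows):
    """Count points per cell of a regular square grid in one pass.

    ASSUMPTION: the grid is axis-aligned with square cells of side
    `cell_size`, and its lower-left corner is at (x_min, y_min).
    `points` is an iterable of (x, y) tuples.
    Returns a dict mapping (col, row) -> count for non-empty cells.
    """
    counts = Counter()
    for x, y in points:
        # The cell index follows directly from the coordinates.
        col = floor((x - x_min) / cell_size)
        row = floor((y - y_min) / cell_size)
        # Skip points that fall outside the grid extent.
        if 0 <= col < n_cols and 0 <= row < n_rows:
            counts[(col, row)] += 1
    return dict(counts)


# Tiny made-up sample: a 3 x 3 grid of unit cells starting at the origin.
sample_points = [(0.5, 0.5), (0.7, 0.2), (1.5, 0.5), (2.5, 2.5), (9.0, 9.0)]
cell_counts = count_points_in_grid(
    sample_points, x_min=0.0, y_min=0.0, cell_size=1.0, n_cols=3, n_rows=3
)
print(cell_counts)
```

With real data the (x, y) tuples could be taken from the point geometries, and the per-cell counts mapped back onto the rows of geo_grid by each cell's column and row index. How that mapping looks depends on how the grid was built, so it is left out here.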