I’ve a desk known as places with ~70000 information which features a geometry column of factors and a spatial index as follows:
UPDATE places SET geom = ST_GeomFromText(CONCAT( 'POINT(', latitude, ' ', longitude, ')' ), 4326);
ALTER TABLE places MODIFY geom GEOMETRY NOT NULL;
ALTER TABLE places ADD SPATIAL INDEX(geom);
I’ve one other desk known as location_boundaries with ~350 information and a geometry column for nation/area boundaries transformed from a geojson string of polygons/multipolygons as follows:
UPDATE location_boundaries SET geom = ST_GeomFromGeoJSON(geo_json, 1, 4326);
ALTER TABLE location_boundaries MODIFY geom GEOMETRY NOT NULL;
ALTER TABLE location_boundaries ADD SPATIAL INDEX(geom);
I’ve a desk known as jobs and I’ve the next question to return all jobs inside a specific area – on this case North West England (observe the geojson boundry for this area was taken from the next website: https://nominatim.openstreetmap.org/search?q=North%20West%20England&countrycodes=gb&polygon_geojson=1&format=json
The purpose of this question is to generate a paginated consequence set. Nonetheless the question takes perpetually to execute (properly over 40 minutes earlier than I needed to cancel it). To filter via ~6000 jobs is way too sluggish.
SET @g = (SELECT geom
FROM location_boundaries
WHERE NAME = 'North West England');
SELECT DISTINCT jobs.*
FROM jobs
INNER JOIN places
ON places.id = jobs.location_id
WHERE St_contains(@g, places.geom)
AND jobs.start_at <= now()
AND jobs.end_at >= now()
AND jobs.status_id = 4
ORDER BY Coalesce(IF(jobs.product_id = Four AND jobs.refreshed_at <= UTC_TIMESTAMP(), jobs.refreshed_at, jobs.start_at)) DESC
, jobs.id DESC
LIMIT 25 offset 0
That is the clarify:
+----+-------------+-----------+------------+--------+-------------------------------------------------------------------------------+--------- +---------+-------------------------+------+----------+----------------------------------------------+
| id | select_type | desk | partitions | sort | possible_keys | key | key_len | ref | rows | filtered | Additional |
+----+-------------+-----------+------------+--------+-------------------------------------------------------------------------------+---------+---------+-------------------------+------+----------+----------------------------------------------+
| 1 | SIMPLE | jobs | NULL | ALL | jobs_location_id_foreign,jobs_status_id_start_at_index,jobs_status_id_foreign | NULL | NULL | NULL | 5731 | 33.33 | Utilizing the place; Utilizing momentary; Utilizing filesort |
| 1 | SIMPLE | places | NULL | eq_ref | PRIMARY,geom | PRIMARY | 4 | testdb.jobs.location_id | 1 | 5.00 | Utilizing the place; Distinct |
+----+-------------+-----------+------------+--------+-------------------------------------------------------------------------------+---------+---------+-------------------------+------+----------+----------------------------------------------+
Why arnt the spatial indexes getting used? How can I optimize this question. Any assist appreciated.
Surprisingly when swap the boundary to a unique multipolygon say to nation degree as follows I can get the question to execute in 32 seconds which continues to be too sluggish.
SET @g = (SELECT geom
FROM location_boundaries
WHERE NAME = 'United Kingdom');
The multipolygon for United Kingdom was taken from geonames shapes_simplified_low.txt file right here: http://obtain.geonames.org/export/dump/ so might or not it’s one thing to do with the polygons? Some work and others do not.
for instance I’ve jobs on the following level in Canada:
@level = ST_GeomFromText('POINT(60.1086700 -113.6425800)', 4326)
After I search a multipolygon which has the boundary for Canada from the geonames dump, the question crashes.
Lets assume the polygons are possibly invalid however what concerning the ones that do work. Why are they nonetheless so sluggish?