Geomancer is a geospatial feature engineering library. It lets you query a geospatial data warehouse to create features for downstream tasks (analysis, modelling, visualization, etc.). Its features include:
- Feature primitives for geospatial feature engineering
- Ability to switch out data warehouses
- Compiling and sharing your features
The basic building blocks in Geomancer are called Spells. These are SQL queries packaged into logical groups. Given a set of coordinates, you can obtain features such as the distance to the nearest point of interest (POI), the number of POIs within a certain range, and so on.
For example, suppose we wish to obtain the distance to the nearest embassy for a sample of coordinates:
In : # Load the sample_points as a dataframe
In : from tests.conftest import sample_points
In : df = sample_points
In : df.head()
Out :
                              WKT  code
0  POINT (121.0042183 14.6749145)  2082
1  POINT (121.0052375 14.6767411)  2110
2     POINT (121.009712 14.68067)  2082
3  POINT (121.0093311 14.6799482)  2082
4  POINT (121.0073296 14.6783498)  2082
The geometries are encoded as strings in a column named WKT (well-known text). The code column stands in for any other column in your data. Geomancer simply adds another column for your chosen feature while retaining the columns you originally have.
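To make the WKT encoding concrete, here is a minimal sketch that extracts longitude and latitude from point strings like the ones in the sample. The helper `parse_wkt_point` and the `rows` list are hypothetical names introduced for illustration and are not part of Geomancer. The sketch assumes plain 2D points written in decimal notation.

```python
import re

# Matches a 2D WKT point such as "POINT (121.0042183 14.6749145)".
# Assumption: plain decimal coordinates, no Z/M values, no exponents.
_POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$",
    re.IGNORECASE,
)


def parse_wkt_point(wkt):
    """Return (longitude, latitude) from a 2D WKT POINT string."""
    match = _POINT_RE.match(wkt)
    if match is None:
        raise ValueError(f"not a 2D WKT POINT: {wkt!r}")
    x, y = match.groups()
    # WKT stores x before y, which is longitude before latitude here.
    return float(x), float(y)


# Two rows copied from the sample data shown above.
rows = [
    {"WKT": "POINT (121.0042183 14.6749145)", "code": 2082},
    {"WKT": "POINT (121.0052375 14.6767411)", "code": 2110},
]
coords = [parse_wkt_point(row["WKT"]) for row in rows]
print(coords)
```

For anything beyond simple points, a dedicated geometry library is likely a better choice than a regular expression.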
In : from geomancer.spells import DistanceToNearest
In : # Configure and cast the spell
In : spell = DistanceToNearest("embassy",
                               source_table="geospatial.ph_osm.gis_osm_pois_free_1",
                               feature_name="dist_embassy")
In : df_with_features = spell.cast(df, dburl="bigquery://geospatial")
It then returns a DataFrame with an added column, dist_embassy:
In : df_with_features.head()
Out :
                              WKT  code  dist_embassy
0  POINT (121.0042183 14.6749145)  2082   4948.580211
1  POINT (121.0052375 14.6767411)  2110   5084.787270
2     POINT (121.009712 14.68067)  2082   5319.746371
3  POINT (121.0093311 14.6799482)  2082   5256.165257
4  POINT (121.0073296 14.6783498)  2082   5162.177598
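To give a feel for what a "distance to nearest" feature represents, the following is a rough pure-Python sketch using the haversine formula. It is not Geomancer's implementation, which runs as SQL inside the data warehouse. The function names, the POI coordinates, and the choice of meters as the unit are all assumptions made for illustration; the source does not state the unit of dist_embassy.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in meters (assumed unit)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def distance_to_nearest(point, pois):
    """Smallest haversine distance from point to any POI, in meters."""
    lon, lat = point
    return min(haversine_m(lon, lat, p_lon, p_lat) for p_lon, p_lat in pois)


# Hypothetical POI coordinates, made up for illustration only.
pois = [(121.05, 14.60), (120.98, 14.58)]
point = (121.0042183, 14.6749145)  # first row of the sample data
print(round(distance_to_nearest(point, pois), 1))
```

A linear scan like this works for a handful of points. Pushing the computation into the warehouse, as Geomancer does, lets the database handle large POI tables.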
Data Warehouse Flexibility
Geomancer relies on a data warehouse backend to engineer features, so you can compile features from different sources through the same API. So far, we support (or plan to support) the following database backends:
- Google BigQuery, an analytics data warehouse from the Google Cloud Platform
- PostGIS, a geospatial extension for PostgreSQL
- SpatiaLite, a geospatial extension for SQLite
First, you need to set up your data warehouse to accommodate Geomancer. For instructions, see the Setup section of this documentation.
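The dburl argument in the earlier example ("bigquery://geospatial") looks like a database URL whose scheme names the backend. As a sketch under that assumption, the snippet below shows one way a URL scheme could be mapped to the backends listed above. The `BACKENDS` table and the `backend_for` function are hypothetical and do not describe Geomancer's internals. The PostgreSQL and SQLite URLs in the usage lines are illustrative examples.

```python
from urllib.parse import urlsplit

# Hypothetical mapping from URL scheme to backend name, for illustration only.
BACKENDS = {
    "bigquery": "Google BigQuery",
    "postgresql": "PostGIS",
    "sqlite": "SpatiaLite",
}


def backend_for(dburl):
    """Return the backend name implied by the scheme of a database URL."""
    scheme = urlsplit(dburl).scheme
    # Some URL styles carry a driver suffix, e.g. "postgresql+psycopg2".
    dialect = scheme.split("+", 1)[0]
    try:
        return BACKENDS[dialect]
    except KeyError:
        raise ValueError(f"unsupported database URL: {dburl!r}") from None


print(backend_for("bigquery://geospatial"))
print(backend_for("postgresql+psycopg2://user@localhost/gis"))
print(backend_for("sqlite:///data/source.sqlite"))
```

Keying the choice of backend on the URL means the spell configuration itself would not need to change when the warehouse does.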