PL/Python¶
The PostgreSQL documentation has a complete reference on authoring PL/Python functions.
This section will cover uses of PL/Python with plpygis
.
Enabling PL/Python¶
Prior to using PL/Python, it must be loaded in the current database:
# CREATE LANGUAGE plpython3u;
Warning
PL/Python is an “untrusted” language, meaning that Python code will have unrestricted access to the system at the same level as the database administrator.
Python 2 and Python 3¶
plpygis
is compatible with both Python 2 and Python 3. For Python 3, the language is plpython3u
and for Python 2 it is plpython2u
(the generic plpythonu
currently refers to Python 2 in PostgreSQL but this may change in the future).
Function declarations¶
PL/Python function declarations follow the following template:
CREATE FUNCTION funcname (argument-list)
RETURNS return-type
AS $$
# PL/Python function body
$$ LANGUAGE plpython3u;
Named arguments are provided as a comma-separated list, with the argument name preceding the argument type:
CREATE OR REPLACE FUNCTION make_point(x FLOAT, y FLOAT)
RETURNS geometry
AS $$
# PL/Python function body
$$ LANGUAGE plpython3u;
Warning
Variables passed as arguments should never be assigned to in a PL/Python function.
Type mappings¶
The mapping between types in PL/Python and is PostgreSQL is covered in the Data Values section of the documentation; it is the role of plpygis
to assist in mapping between PL/Python and PostGIS types.
PostGIS types¶
When authoring a Postgres function that takes a PostGIS geometry as an input parameter or returns a geometry as output, Geometry
objects will provide the automatic conversion between types.
CREATE OR REPLACE FUNCTION make_point(x FLOAT, y FLOAT)
RETURNS geometry
AS $$
from plpygis import Point
p = Point((x, y))
return p
$$ LANGUAGE plpython3u;
Input parameter¶
A PostGIS geometry passed as the argument to Geometry()
will initialize the instance.
CREATE OR REPLACE FUNCTION find_hemisphere(geom geometry)
RETURNS TEXT
AS $$
from plpygis import Geometry
point = Geometry(geom)
if point.type != "Point":
return None
gj = point.geojson
lon = gj["coordinates"][0]
lat = gj["coordinates"][1]
if lon < 0:
return "West"
elif lon > 0:
return "East"
else:
return "Meridian"
$$ LANGUAGE plpython3u;
db=# SELECT name, find_hemisphere(ST_Centroid(geom)) FROM countries LIMIT 10;
name | find_hemisphere
-------------------------+-----------------
Aruba | West
Afghanistan | East
Angola | East
Anguilla | West
Albania | East
American Samoa | West
Andorra | East
Argentina | West
Armenia | East
Bulgaria | East
(10 rows)
Return value¶
A Geometry
can be returned directly from a PL/Python function.
CREATE OR REPLACE FUNCTION make_point(x FLOAT, y FLOAT)
RETURNS geometry
AS $$
from plpygis import Point
return Point((x, y))
$$ LANGUAGE plpython3u;
db=# SELECT make_point(-52, 0);
make_point
--------------------------------------------
01010000000000000000004AC00000000000000000
(1 row)
This custom make_point(x, y)
functions identically to PostGIS’s native ST_MakePoint(x, y).
db=# SELECT ST_MakePoint(-52, 0);
st_makepoint
--------------------------------------------
01010000000000000000004AC00000000000000000
(1 row)
geometry
and geography
¶
Both PostGIS geometry
and geography
types may be used as arguments or return types. plpygis
does not support box2d
, box3d
, raster
or any topology types.
geometry
and geography
arguments will be treated identically by plpygis
, as they share an common WKB format.
However, a PL/Python function that has a return value of geography
must not have an SRID of any value except 4326. It will also be treated differently by certain PostGIS functions.
Imagine two PL/Python functions that both create a polygon with lower-left coordinates at (0, 0)
and upper-right coordinates at (50, 50)
. If box_geom
has a return type of geometry
and box_geog
has a return type of geography
, area calculations will be evaluated as follows:
db=# SELECT ST_Area(box_geom());
st_area
------------------
2500
(1 row)
db=# SELECT ST_Area(box_geog());
st_area
------------------
27805712533424.3
(1 row)
Arrays and sets¶
In addition to returning single values, plpygis
functions may return a list of geometries that can be either interpreted as a PostgreSQL array or set.
db=# CREATE OR REPLACE FUNCTION make_points(x FLOAT, y FLOAT)
RETURNS SETOF geometry
AS $$
from plpygis import Geometry
from shapely.geometry import Point
p1 = Point(x, y)
p2 = Point(y, x)
return [Geometry.shape(p1), Geometry.shape(p2)]
$$ LANGUAGE plpython3u;
db=# SELECT ST_AsText(make_points(10,20));
st_astext
--------------
POINT(10 20)
POINT(20 10)
Python’s yield
keyword may also be used to return elements in a set rather than returning them as elements in a list.
plpy
¶
The plpy
module provides access to helper functions, notably around logging to PostgreSQL’s standard log files.
See Utility Functions in the PostgreSQL documentation.
Aggregate functions¶
PostGIS includes several spatial aggregate functions that accept a set of geometries as input parameters. An aggregate function definition requires different syntax from a normal PL/Python function:
CREATE AGGREGATE agg_fn (
SFUNC = _state_function,
STYPE = geometry,
BASETYPE = geometry, -- optional
FINALFUNC = wrapup_func, -- optional
INITCOND = 'POINT(0 0)' -- optional
);
An aggregate will accept individual inputs of the type defined by BASETYPE
and incrementally producing a single type defined by STYPE
. If many geometries will be collapsed down to a single geometry, then both BASETYPE
and STYPE
will be geometry
. If many geometries will produce more than one geometry, then the types will be geometry
and geometry[]
respectively.
An example aggregate function would be point_cluster
, which takes n input geometries and outputs m geometries, where m < n.
CREATE AGGREGATE point_cluster (
SFUNC = _point_cluster,
BASETYPE = geometry,
STYPE = geometry[],
INITCOND = '{}'
);
The function indicated by SFUNC
must accept the STYPE
as the first parameter and BASETYPE
as the second parameter, returning another instance of STYPE
. If INITCOND
is provided, this will be the value of the first argument passed to the first call of SFUNC
. If it is omitted, the value will be initially set to None
.
CREATE FUNCTION _point_cluster(geoms geometry[], newgeom geometry)
RETURNS geometry[]
AS $$
# incremental clustering algorithm here
$$ LANGUAGE plpython3u;
Alternatively, the SFUNC
can simply collect all the individual geometries into a list and then rely on a single FINALFUNC
to create a new list of geometries that represents the clustered points.
CREATE AGGREGATE point_cluster (
SFUNC = array_append,
BASETYPE = geometry,
STYPE = geometry[],
INITCOND = '{}',
FINALFUNC = _point_cluster
);
The parameter of the FINALFUNC
will be a single geometry[]
, representing the collection of individual points.
CREATE FUNCTION _point_cluster(geoms geometry[])
RETURNS geometry[]
AS $$
# clustering algorithm here
$$ LANGUAGE plpython3u;