Validation by Filtering
Filters can be expressed in the CQL2 text encoding and embedded directly in the original CWL Workflow as a Hint.
0. Prepare the inputs
Let's download 2 arbitrary STAC Items and load them into a request body envelope:
In [1]:
from pystac_client import Client

API_URL = "https://earth-search.aws.element84.com/v1"  # public STAC API

# 1) Open the STAC API
client = Client.open(API_URL)

# 2) Search for two Sentinel-2 L2A items (Rome area, low clouds, example dates)
search = client.search(
    collections=["sentinel-2-l2a"],
    bbox=[12.3, 41.7, 12.7, 42.0],  # Rome area
    datetime="2024-06-01/2024-06-30",  # any window with data works
    query={"eo:cloud_cover": {"lt": 30}},
    max_items=2,
)

# 3) Get PySTAC Items (already parsed objects)
items = list(search.items())  # -> List[pystac.Item]
print(f"Fetched {len(items)} items")

request_body = {
    "inputs": {
        "item_1": items[0].to_dict(),
        "item_2": items[1].to_dict(),
        "aoi": {"crs": "CRS84", "bbox": [12.3, 41.7, 12.7, 42.0]},
        "aoi2": "12.3, 41.7, 12.7, 42.0",
    }
}
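Indexing `items[0]` and `items[1]` assumes the search returned at least two items. A minimal sketch of a guard follows; `build_request_body` is a hypothetical helper (it is not part of `pystac_client`), and plain dictionaries stand in for the output of `item.to_dict()` so the example runs offline.

```python
from typing import Any, Dict, List, Sequence


def build_request_body(
    item_dicts: Sequence[Dict[str, Any]],
    bbox: List[float],
) -> Dict[str, Any]:
    """Assemble the request body envelope from two STAC Item dictionaries."""
    if len(item_dicts) < 2:
        raise ValueError(f"Expected at least 2 STAC Items, got {len(item_dicts)}")

    return {
        "inputs": {
            "item_1": item_dicts[0],
            "item_2": item_dicts[1],
            # Same AOI expressed as a structured bbox and as a comma-separated string
            "aoi": {"crs": "CRS84", "bbox": list(bbox)},
            "aoi2": ", ".join(str(v) for v in bbox),
        }
    }


# Stand-in dictionaries; in the notebook these would come from items[i].to_dict()
stub_items = [{"id": "item-a"}, {"id": "item-b"}]
body = build_request_body(stub_items, [12.3, 41.7, 12.7, 42.0])
print(body["inputs"]["aoi2"])
```

Failing early with a clear message is usually easier to diagnose than an `IndexError` raised while the envelope is being built.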
Let's dump the request body to show the structure of the parsed STAC Items:
In [2]:
import json
print(json.dumps(request_body, indent=2))
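A full dump of two STAC Items can be long. As an alternative, the sketch below prints only the fields that the filters later rely on; `summarize_item` is a hypothetical helper, and `stub_item` is an invented dictionary shaped like a STAC Item, not real search output.

```python
from typing import Any, Dict


def summarize_item(item: Dict[str, Any]) -> Dict[str, Any]:
    """Pick a few fields of a STAC Item dictionary that the filters rely on."""
    properties = item.get("properties", {})
    return {
        "id": item.get("id"),
        "platform": properties.get("platform"),
        "bbox": item.get("bbox"),
    }


# Invented stand-in for items[0].to_dict()
stub_item = {
    "type": "Feature",
    "id": "example-item",
    "bbox": [12.0, 41.0, 13.0, 42.0],
    "properties": {"platform": "sentinel-2a", "eo:cloud_cover": 12.5},
}
print(summarize_item(stub_item))
```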
Let's declare the set of assertion filters we want to check against the inputs. Supported features:
- nested keys via dot paths
- standard operators (AND/OR, =, <, BETWEEN, LIKE, IS NULL, etc.)
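To illustrate what a dot path such as `inputs.item_1.properties.platform` refers to, here is a minimal sketch of a resolver over nested dictionaries. `resolve_path` is a hypothetical function written for this illustration; it is not claimed to be how the CQL2 evaluator resolves properties internally.

```python
from collections.abc import Mapping
from typing import Any


def resolve_path(document: Mapping, path: str) -> Any:
    """Walk a nested mapping along a dot-separated path; return None if a key is missing."""
    current: Any = document
    for key in path.split("."):
        if not isinstance(current, Mapping) or key not in current:
            return None
        current = current[key]
    return current


# Invented request body with only the fields needed for the comparison
body = {
    "inputs": {
        "item_1": {"properties": {"platform": "sentinel-2a"}},
        "item_2": {"properties": {"platform": "sentinel-2b"}},
    }
}

left = resolve_path(body, "inputs.item_1.properties.platform")
right = resolve_path(body, "inputs.item_2.properties.platform")
print(left == right)  # False: the two stub items report different platforms
```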
In [3]:
from assertions_mate import Cql2FilterHint, Cql2Query

# NOTE: the name `filter` shadows the Python builtin of the same name
filter = Cql2FilterHint(
    custom_functions="""
from shapely import geometry
from typing import Any, List, Mapping, Union

def ensure_bbox(input: Union[Mapping[str, Any], List[float], str]):
    value = []

    if isinstance(input, dict):
        # .get() so that a missing key reaches the ValueError below instead of raising KeyError
        value = input.get("bbox")
        if not value:
            raise ValueError(f"Input {input} doesn't have a 'bbox' property")
    elif isinstance(input, str):
        value = [float(x) for x in str(input).split(",")]
    else:
        value = input

    return geometry.box(*value)
""",
    queries=[
        Cql2Query(
            id="Items platforms are the same",
            cql2="inputs.item_1.properties.platform = inputs.item_2.properties.platform",
            message="It is expected that all items are part of the same platform",
        ),
        Cql2Query(
            id="First item intersects AOI",
            cql2="s_intersects(inputs.item_1, ensure_bbox(inputs.aoi))",
            message="First item expected to intersect the AOI",
        ),
        Cql2Query(
            id="Second item intersects AOI",
            cql2="s_intersects(inputs.item_2, ensure_bbox(inputs.aoi2))",
            message="Second item expected to intersect the AOI",
        ),
        Cql2Query(
            id="Items intersect each other",
            cql2="s_intersects(inputs.item_1, ensure_spatial(inputs.item_2))",
            message="Items are expected to intersect each other",
        ),
    ],
)
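The `ensure_bbox` custom function accepts the AOI in three shapes: a mapping with a `bbox` key, a comma-separated string, or a plain list. The sketch below mirrors that normalization in pure Python so it can be tried without `shapely`. `normalize_bbox` and `bboxes_intersect` are hypothetical stand-ins: they return and compare coordinate tuples rather than `shapely` geometries, and the item bounding box is invented.

```python
from typing import Any, List, Mapping, Tuple, Union

BBox = Tuple[float, float, float, float]


def normalize_bbox(value: Union[Mapping[str, Any], List[float], str]) -> BBox:
    """Accept a {'bbox': [...]} mapping, a comma-separated string, or a list of 4 numbers."""
    if isinstance(value, dict):
        coords = value.get("bbox")
        if not coords:
            raise ValueError(f"Input {value} doesn't have a 'bbox' property")
    elif isinstance(value, str):
        coords = [float(x) for x in value.split(",")]
    else:
        coords = list(value)

    if len(coords) != 4:
        raise ValueError(f"Expected 4 coordinates, got {len(coords)}")

    min_x, min_y, max_x, max_y = (float(c) for c in coords)
    return (min_x, min_y, max_x, max_y)


def bboxes_intersect(a: BBox, b: BBox) -> bool:
    """True when two axis-aligned boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# The two AOI encodings used in the request body describe the same box
aoi = normalize_bbox({"crs": "CRS84", "bbox": [12.3, 41.7, 12.7, 42.0]})
aoi2 = normalize_bbox("12.3, 41.7, 12.7, 42.0")
print(aoi == aoi2)

# Invented item bounding box that overlaps the AOI
item_bbox = normalize_bbox([12.0, 41.0, 12.5, 41.9])
print(bboxes_intersect(item_bbox, aoi))
```

Normalizing every accepted shape to one representation before the spatial test keeps the CQL2 expressions themselves short.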
1. Instantiate the validator
In [4]:
validator = filter.validator()
2. Evaluation
In [5]:
error = validator.validate_inputs(request_body)

if error:
    # Violations found: print the error report as JSON
    print(error.model_dump_json(indent=2, exclude_none=True))
else:
    # No violations: choose what to output. A simple success marker is fine:
    print("Go ahead and submit the /execution request")
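Conceptually, the evaluation step runs each declared assertion against the request body and reports those that fail, together with their messages. The sketch below shows that pattern only. `Check` and `run_checks` are hypothetical names, plain Python callables stand in for CQL2 expressions, and none of this is claimed to reflect how `assertions_mate` is implemented.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Check:
    """A named assertion: a predicate over the request body plus a failure message."""
    id: str
    predicate: Callable[[Dict[str, Any]], bool]
    message: str


def run_checks(checks: List[Check], body: Dict[str, Any]) -> List[Dict[str, str]]:
    """Return one entry per failed check; an empty list means the inputs are valid."""
    violations = []
    for check in checks:
        if not check.predicate(body):
            violations.append({"id": check.id, "message": check.message})
    return violations


# Invented request body with mismatching platforms
body = {
    "inputs": {
        "item_1": {"properties": {"platform": "sentinel-2a"}},
        "item_2": {"properties": {"platform": "sentinel-2b"}},
    }
}

checks = [
    Check(
        id="Items platforms are the same",
        predicate=lambda b: (
            b["inputs"]["item_1"]["properties"]["platform"]
            == b["inputs"]["item_2"]["properties"]["platform"]
        ),
        message="It is expected that all items are part of the same platform",
    ),
]

violations = run_checks(checks, body)
if violations:
    print(violations)
else:
    print("Go ahead and submit the /execution request")
```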