Validation by Filtering

Filters can be written in the CQL2 text encoding and embedded directly in the original CWL Workflow as a Hint.

0. Prepare the inputs

Let's download two arbitrary STAC Items and load them into a request body envelope:

from pystac_client import Client

API_URL = "https://earth-search.aws.element84.com/v1"  # public STAC API

# 1) Open the STAC API
client = Client.open(API_URL)

# 2) Search for two Sentinel-2 L2A items (Rome area, low clouds, example dates)
search = client.search(
    collections=["sentinel-2-l2a"],
    bbox=[12.3, 41.7, 12.7, 42.0],  # Rome area
    datetime="2024-06-01/2024-06-30",  # any window with data works
    query={"eo:cloud_cover": {"lt": 30}},
    max_items=2,
)

# 3) Get PySTAC Items (already parsed objects)
items = list(search.items())  # -> List[pystac.Item]

print(f"Fetched {len(items)} items")

request_body = {
    "inputs": {
        "item_1": items[0].to_dict(),
        "item_2": items[1].to_dict(),
        "aoi": {"crs": "CRS84", "bbox": [12.3, 41.7, 12.7, 42.0]},
        "aoi2": "12.3, 41.7, 12.7, 42.0",
    }
}
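The cell above indexes `items[0]` and `items[1]` directly, so a search that returns fewer than two Items would fail with an `IndexError`. A minimal sketch of a guard follows; `build_request_body` is a hypothetical helper (not part of assertions-mate or pystac-client) that works on plain dictionaries and produces the same envelope layout as above.

```python
from typing import Any, Dict, List, Sequence


def build_request_body(
    item_dicts: Sequence[Dict[str, Any]], bbox: List[float]
) -> Dict[str, Any]:
    """Assemble the request envelope, failing early on unusable input.

    Hypothetical helper for illustration only.
    """
    if len(item_dicts) < 2:
        raise ValueError(f"Expected at least 2 STAC Items, got {len(item_dicts)}")
    if len(bbox) != 4:
        raise ValueError(f"Expected a 4-value bbox, got {len(bbox)} values")

    return {
        "inputs": {
            "item_1": item_dicts[0],
            "item_2": item_dicts[1],
            # Same AOI expressed twice: once as an object, once as a string
            "aoi": {"crs": "CRS84", "bbox": list(bbox)},
            "aoi2": ", ".join(str(v) for v in bbox),
        }
    }
```

With the search results from the cell above, this could be called as `build_request_body([item.to_dict() for item in items], [12.3, 41.7, 12.7, 42.0])`.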

Let's dump the request body to show the structure of the parsed STAC Items:

import json

print(json.dumps(request_body, indent=2))

Let's declare the set of assertion filters to check against the inputs. Supported features include:

  • nested keys via dot paths
  • standard operators (AND/OR, =, <, BETWEEN, LIKE, IS NULL, etc.)
from assertions_mate import Cql2FilterHint, Cql2Query

filter = Cql2FilterHint(
    custom_functions="""
from shapely import geometry
from typing import Any, List, Mapping, Union

def ensure_bbox(input: Union[Mapping[str, Any], List[float], str]):
    value = []

    if isinstance(input, dict):
        value = input.get("bbox")
        if not value:
            raise ValueError(f"Input {input} doesn't have a 'bbox' property")
    elif isinstance(input, str):
        value = [float(x) for x in str(input).split(",")]
    else:
        value = input

    return geometry.box(*value)
    """,
    queries=[
        Cql2Query(
            id="Items platforms are the same",
            cql2="inputs.item_1.properties.platform = inputs.item_2.properties.platform",
            message="It is expected that all items are part of the same platform",
        ),
        Cql2Query(
            id="First item intersects AOI",
            cql2="s_intersects(inputs.item_1, ensure_bbox(inputs.aoi))",
            message="First item expected to intersect the AOI",
        ),
        Cql2Query(
            id="Second item intersects AOI",
            cql2="s_intersects(inputs.item_2, ensure_bbox(inputs.aoi2))",
            message="Second item expected to intersect the AOI",
        ),
        Cql2Query(
            id="Items intersect each other",
            cql2="s_intersects(inputs.item_1, ensure_spatial(inputs.item_2))",
            message="Items do not intersect each other",
        ),
    ],
)
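The `ensure_bbox` custom function accepts three input shapes: an object with a `bbox` property, a comma-separated string, or a plain list. To see what each shape reduces to without installing shapely, here is a pure-Python sketch. `normalize_bbox` and `bboxes_overlap` are hypothetical names used only for illustration, and the overlap test compares bounding boxes only, which is coarser than a true geometry intersection.

```python
from typing import Any, List, Mapping, Tuple, Union

BBox = Tuple[float, float, float, float]


def normalize_bbox(value: Union[Mapping[str, Any], List[float], str]) -> BBox:
    """Reduce the three accepted AOI shapes to (min_x, min_y, max_x, max_y)."""
    if isinstance(value, dict):
        coords = value.get("bbox")
        if not coords:
            raise ValueError(f"Input {value} doesn't have a 'bbox' property")
    elif isinstance(value, str):
        # float() tolerates the whitespace left around each comma-separated part
        coords = [float(part) for part in value.split(",")]
    else:
        coords = value

    if len(coords) != 4:
        raise ValueError(f"Expected 4 bbox values, got {len(coords)}")

    min_x, min_y, max_x, max_y = (float(c) for c in coords)
    return (min_x, min_y, max_x, max_y)


def bboxes_overlap(a: BBox, b: BBox) -> bool:
    """True when the two boxes share at least one point (edges included)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


aoi = normalize_bbox({"crs": "CRS84", "bbox": [12.3, 41.7, 12.7, 42.0]})
aoi2 = normalize_bbox("12.3, 41.7, 12.7, 42.0")
print(aoi == aoi2)  # both encodings describe the same box
```

Normalising every shape to one tuple first keeps the comparison logic independent of how the caller chose to encode the AOI.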

1. Instantiate the validator

validator = filter.validator()

2. Evaluation

error = validator.validate_inputs(request_body)

if error:
    print(error.model_dump_json(indent=2, exclude_none=True))
else:
    # No violations: choose what to output. A simple success marker is fine.
    print("Go ahead and submit the /execution request")
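When the check sits in front of an actual submission, it can help to wrap it in a function that returns a boolean. The sketch below assumes only what the cell above shows: `validate_inputs` returns a falsy value on success and otherwise an object exposing `model_dump_json`. `check_before_submit` is a hypothetical helper name.

```python
import sys
from typing import Any, Mapping


def check_before_submit(validator: Any, request_body: Mapping[str, Any]) -> bool:
    """Return True when validation passes; report the error and return False otherwise.

    `validator` is any object with a `validate_inputs(request_body)` method,
    such as the one created in step 1.
    """
    error = validator.validate_inputs(request_body)
    if error:
        # Send the report to stderr so stdout stays clean for callers
        print(error.model_dump_json(indent=2, exclude_none=True), file=sys.stderr)
        return False
    return True
```

A caller could then guard the submission with `if check_before_submit(validator, request_body): ...` and skip the `/execution` request when the function returns `False`.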

License CC BY-SA 4.0, by Creative Commons
