Filter Examples
This tutorial demonstrates how to use the different leaf filters.
In this tutorial, we will use the wind turbine model from the pygen docs.
Note that in this model all connections are modeled as direct relations, except the relation from WindTurbine to MetMast, which goes through the Distance edge.
Setting up a Cognite client
import os
from cognite.client import CogniteClient
from dotenv import load_dotenv
load_dotenv("../../.env")
client = CogniteClient.default_oauth_client_credentials(
    project=os.environ["CDF_PROJECT"],
    cdf_cluster=os.environ["CDF_CLUSTER"],
    tenant_id=os.environ["IDP_TENANT_ID"],
    client_id=os.environ["IDP_CLIENT_ID"],
    client_secret=os.environ["IDP_CLIENT_SECRET"],
)
Nested
from cognite.client import data_modeling as dm
from cognite.client.data_classes import filters
Query: List wind turbines connected to a nacelle with a given yaw direction sensor
Relevant part of Nacelle type:
type Nacelle {
    yaw_direction: SensorTimeSeries
    ...
}
Relevant part of WindTurbine type:
type WindTurbine {
    name: String
    nacelle: Nacelle
    ...
}
nacelle_view = dm.ViewId("sp_pygen_power", "Nacelle", "1")
turbine_view = dm.ViewId("sp_pygen_power", "WindTurbine", "1")
yaw_sensor = client.data_modeling.instances.list(
    sources=nacelle_view,
)[0]["yaw_direction"]
yaw_sensor
{'space': 'sp_wind', 'externalId': 'V52-WindTurbine.yaw'}
Our goal is to list the wind turbines whose nacelle uses the yaw direction sensor above.
# Match turbines whose "nacelle" relation points to a nacelle with the selected sensor.
is_selected_turbine = filters.Nested(
    turbine_view.as_property_ref("nacelle"),
    filters.Equals(nacelle_view.as_property_ref("yaw_direction"), yaw_sensor),
)
turbines = client.data_modeling.instances.list(
    sources=turbine_view,
    filter=is_selected_turbine,
    limit=-1,
)
turbines
| | space | external_id | version | last_updated_time | created_time | instance_type | name | capacity | rotor | blades | nacelle | windfarm | datasheets |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | sp_wind | hornsea_1_mill_3 | 8 | 2024-12-17 09:57:38.908 | 2024-11-16 14:08:01.544 | node | hornsea_1_mill_3 | 7 | {'space': 'sp_wind', 'externalId': 'hornsea_1_... | [{'space': 'sp_wind', 'externalId': 'hornsea_1... | {'space': 'sp_wind', 'externalId': 'hornsea_1_... | Hornsea 1 | [{'space': 'sp_wind', 'externalId': 'windmill_... |
| 1 | sp_wind | hornsea_1_mill_2 | 8 | 2024-12-17 09:57:38.908 | 2024-11-16 14:08:01.544 | node | hornsea_1_mill_2 | 7 | {'space': 'sp_wind', 'externalId': 'hornsea_1_... | [{'space': 'sp_wind', 'externalId': 'hornsea_1... | {'space': 'sp_wind', 'externalId': 'hornsea_1_... | Hornsea 1 | [{'space': 'sp_wind', 'externalId': 'windmill_... |
| 2 | sp_wind | hornsea_1_mill_1 | 8 | 2024-12-17 09:57:38.908 | 2024-11-16 14:08:01.544 | node | hornsea_1_mill_1 | 7 | {'space': 'sp_wind', 'externalId': 'hornsea_1_... | [{'space': 'sp_wind', 'externalId': 'hornsea_1... | {'space': 'sp_wind', 'externalId': 'hornsea_1_... | Hornsea 1 | [{'space': 'sp_wind', 'externalId': 'windmill_... |
| 3 | sp_wind | hornsea_1_mill_4 | 8 | 2024-12-17 09:57:38.908 | 2024-11-16 14:08:01.544 | node | hornsea_1_mill_4 | 7 | {'space': 'sp_wind', 'externalId': 'hornsea_1_... | [{'space': 'sp_wind', 'externalId': 'hornsea_1... | {'space': 'sp_wind', 'externalId': 'hornsea_1_... | Hornsea 1 | [{'space': 'sp_wind', 'externalId': 'windmill_... |
| 4 | sp_wind | hornsea_1_mill_5 | 8 | 2024-12-17 09:57:38.908 | 2024-11-16 14:08:01.544 | node | hornsea_1_mill_5 | 7 | {'space': 'sp_wind', 'externalId': 'hornsea_1_... | [{'space': 'sp_wind', 'externalId': 'hornsea_1... | {'space': 'sp_wind', 'externalId': 'hornsea_1_... | Hornsea 1 | [{'space': 'sp_wind', 'externalId': 'windmill_... |
Note: In this case, there are 5 wind turbines that all have a nacelle with the given yaw_direction sensor, i.e., one sensor is shared by all 5 turbines.
Performance issues
Nested filtering can be expensive. The performance depends on how the data model is implemented (for example, the use of indexes and the container size). Note that the above query can be split into two:
Query 1: List all nacelles with a given yaw direction sensor
Query 2: List all turbines with any of the nacelles returned by the first query
yaw_sensor
{'space': 'sp_wind', 'externalId': 'V52-WindTurbine.yaw'}
nacelle_list = client.data_modeling.instances.list(
    sources=nacelle_view,
    filter=filters.Equals(nacelle_view.as_property_ref("yaw_direction"), yaw_sensor),
    limit=-1,
)
nacelle_list.as_ids()
[NodeId(space='sp_wind', external_id='hornsea_1_mill_1_nacelle'), NodeId(space='sp_wind', external_id='hornsea_1_mill_3_nacelle'), NodeId(space='sp_wind', external_id='hornsea_1_mill_2_nacelle'), NodeId(space='sp_wind', external_id='hornsea_1_mill_4_nacelle'), NodeId(space='sp_wind', external_id='hornsea_1_mill_5_nacelle')]
turbines = client.data_modeling.instances.list(
    sources=turbine_view,
    filter=filters.In(
        turbine_view.as_property_ref("nacelle"),
        [nacelle_id.dump(include_instance_type=False) for nacelle_id in nacelle_list.as_ids()],
    ),
    limit=-1,
)
turbines
| | space | external_id | version | last_updated_time | created_time | instance_type | name | capacity | rotor | blades | nacelle | windfarm | datasheets |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | sp_wind | hornsea_1_mill_3 | 8 | 2024-12-17 09:57:38.908 | 2024-11-16 14:08:01.544 | node | hornsea_1_mill_3 | 7 | {'space': 'sp_wind', 'externalId': 'hornsea_1_... | [{'space': 'sp_wind', 'externalId': 'hornsea_1... | {'space': 'sp_wind', 'externalId': 'hornsea_1_... | Hornsea 1 | [{'space': 'sp_wind', 'externalId': 'windmill_... |
| 1 | sp_wind | hornsea_1_mill_2 | 8 | 2024-12-17 09:57:38.908 | 2024-11-16 14:08:01.544 | node | hornsea_1_mill_2 | 7 | {'space': 'sp_wind', 'externalId': 'hornsea_1_... | [{'space': 'sp_wind', 'externalId': 'hornsea_1... | {'space': 'sp_wind', 'externalId': 'hornsea_1_... | Hornsea 1 | [{'space': 'sp_wind', 'externalId': 'windmill_... |
| 2 | sp_wind | hornsea_1_mill_1 | 8 | 2024-12-17 09:57:38.908 | 2024-11-16 14:08:01.544 | node | hornsea_1_mill_1 | 7 | {'space': 'sp_wind', 'externalId': 'hornsea_1_... | [{'space': 'sp_wind', 'externalId': 'hornsea_1... | {'space': 'sp_wind', 'externalId': 'hornsea_1_... | Hornsea 1 | [{'space': 'sp_wind', 'externalId': 'windmill_... |
| 3 | sp_wind | hornsea_1_mill_4 | 8 | 2024-12-17 09:57:38.908 | 2024-11-16 14:08:01.544 | node | hornsea_1_mill_4 | 7 | {'space': 'sp_wind', 'externalId': 'hornsea_1_... | [{'space': 'sp_wind', 'externalId': 'hornsea_1... | {'space': 'sp_wind', 'externalId': 'hornsea_1_... | Hornsea 1 | [{'space': 'sp_wind', 'externalId': 'windmill_... |
| 4 | sp_wind | hornsea_1_mill_5 | 8 | 2024-12-17 09:57:38.908 | 2024-11-16 14:08:01.544 | node | hornsea_1_mill_5 | 7 | {'space': 'sp_wind', 'externalId': 'hornsea_1_... | [{'space': 'sp_wind', 'externalId': 'hornsea_1... | {'space': 'sp_wind', 'externalId': 'hornsea_1_... | Hornsea 1 | [{'space': 'sp_wind', 'externalId': 'windmill_... |
The result is the same as above. Note that if the first query returns several thousand nacelles, the In filter is likely to time out. In that case, you need to split the second query into multiple queries. For example, fetch the turbines by filtering on the nacelles in chunks of 100.
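The chunking described above could be sketched roughly as follows. The helper names `chunked` and `fetch_in_chunks`, and the `fetch` callable, are hypothetical and not part of the Cognite SDK. They only illustrate the pattern, and the chunk size of 100 is the one suggested in the text, not a tuned value.

```python
from __future__ import annotations

from collections.abc import Callable, Iterator, Sequence
from typing import TypeVar

T = TypeVar("T")
R = TypeVar("R")


def chunked(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items` holding at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start : start + size]


def fetch_in_chunks(
    ids: Sequence[T],
    fetch: Callable[[Sequence[T]], Sequence[R]],
    size: int = 100,
) -> list[R]:
    """Call `fetch` once per chunk of `ids` and concatenate the results in order."""
    results: list[R] = []
    for chunk in chunked(ids, size):
        results.extend(fetch(chunk))
    return results


# Possible wiring in this notebook (assumes `client`, `filters`, `turbine_view`
# and `nacelle_list` from the cells above; left commented out because it
# needs a live CDF project):
#
# def fetch_turbines(nacelle_ids):
#     return client.data_modeling.instances.list(
#         sources=turbine_view,
#         filter=filters.In(
#             turbine_view.as_property_ref("nacelle"),
#             [nacelle_id.dump(include_instance_type=False) for nacelle_id in nacelle_ids],
#         ),
#         limit=-1,
#     )
#
# turbines = fetch_in_chunks(nacelle_list.as_ids(), fetch_turbines, size=100)
```

Since `nacelle` is a single direct relation on WindTurbine, each turbine should match at most one chunk, so the concatenated result should not contain duplicates. Note that `fetch_in_chunks` returns a plain Python list rather than an SDK list object.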