Querying / Selecting¶
We assume that you have generated an SDK for the WindTurbine
model and have a client ready to go.
Querying/selecting differs from retrieving data with .list(), .retrieve(), and .search() in that it supports retrieving nested data structures.
SDKs generated by pygen have two ways of querying/selecting, which we will refer to as Python and GraphQL querying. In this section, we go through both of them, list their limitations, and explain when to use one over the other.
from wind_turbine import WindTurbineClient
pygen = WindTurbineClient.from_toml("config.toml")
Comparing: Python and GraphQL based Querying¶
Python Based Querying/Selecting¶
Advantages
- Discoverability in the IDE through autocompletion.
- The returned data classes have required/optional set as in the data model, which is useful for static type checking and IDE autocompletion.
- Regular filtering has the same syntax as querying along any edge, direct relation, or reverse direct relation.
Limitations
- You can only query along one chain of edges. For example, if we start from
WindTurbine
above we can either go to blades or metmast, not both.
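One way to live with this limitation is to run one query per chain and combine the results on the client side. The sketch below shows only the combining step, using plain dictionaries as stand-ins for the returned items. The helper name merge_by_external_id and the sample records are hypothetical and not part of the generated SDK.

```python
from typing import Any


def merge_by_external_id(*results: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Merge rows from several single-chain queries, keyed on (space, external_id)."""
    merged: dict[tuple[str, str], dict[str, Any]] = {}
    for rows in results:
        for row in rows:
            key = (row["space"], row["external_id"])
            target = merged.setdefault(key, {})
            # Keep values that are already set; only fill in missing or None ones.
            for name, value in row.items():
                if target.get(name) is None:
                    target[name] = value
    return list(merged.values())


# Made-up stand-ins for two separate queries: one along blades, one along metmast.
with_blades = [
    {"space": "sp_wind", "external_id": "turbine_1", "blades": ["blade_A"], "metmast": None}
]
with_metmast = [
    {"space": "sp_wind", "external_id": "turbine_1", "blades": None, "metmast": ["mast_1"]}
]

combined = merge_by_external_id(with_blades, with_metmast)
print(combined)
```

If you need both branches in a single round trip, the GraphQL approach described next is likely the better fit.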
GraphQL Based Querying¶
Advantages
- Flexibility in querying. You can retrieve anything you can write as a GraphQL query.
Limitations
- Queries are difficult to write by hand; typically you would use the CDF UI to create the query.
- The returned data class has all properties as optional and is thus less structured.
Summary¶
GraphQL based querying is more flexible, but gives you less structure in the returned data classes. In addition, the query has to be created outside of your IDE, which requires context switching.
Python Based Querying¶
This approach relies on linking together multiple calls to describe how you want to filter and what data to retrieve.
Query: All turbines with nacelle:
result = pygen.wind_turbine.select().nacelle.list_full()
result
 | space | external_id | capacity | name | blades | datasheets | nacelle | rotor | windfarm | data_record |
---|---|---|---|---|---|---|---|---|---|---|
0 | sp_wind | hornsea_1_mill_3 | 7.0 | hornsea_1_mill_3 | [hornsea_1_mill_3_blade_A, hornsea_1_mill_3_bl... | [windmill_schematics] | {'space': 'sp_wind', 'external_id': 'hornsea_1... | hornsea_1_mill_3_rotor | Hornsea 1 | {'version': 4, 'last_updated_time': 2024-11-16... |
1 | sp_wind | hornsea_1_mill_2 | 7.0 | hornsea_1_mill_2 | [hornsea_1_mill_2_blade_B, hornsea_1_mill_2_bl... | [windmill_schematics] | {'space': 'sp_wind', 'external_id': 'hornsea_1... | hornsea_1_mill_2_rotor | Hornsea 1 | {'version': 4, 'last_updated_time': 2024-11-16... |
2 | sp_wind | hornsea_1_mill_1 | 7.0 | hornsea_1_mill_1 | [hornsea_1_mill_1_blade_A, hornsea_1_mill_1_bl... | [windmill_schematics] | {'space': 'sp_wind', 'external_id': 'hornsea_1... | hornsea_1_mill_1_rotor | Hornsea 1 | {'version': 4, 'last_updated_time': 2024-11-16... |
3 | sp_wind | hornsea_1_mill_4 | 7.0 | hornsea_1_mill_4 | [hornsea_1_mill_4_blade_C, hornsea_1_mill_4_bl... | [windmill_schematics] | {'space': 'sp_wind', 'external_id': 'hornsea_1... | hornsea_1_mill_4_rotor | Hornsea 1 | {'version': 4, 'last_updated_time': 2024-11-16... |
4 | sp_wind | hornsea_1_mill_5 | 7.0 | hornsea_1_mill_5 | [hornsea_1_mill_5_blade_B, hornsea_1_mill_5_bl... | [windmill_schematics] | {'space': 'sp_wind', 'external_id': 'hornsea_1... | hornsea_1_mill_5_rotor | Hornsea 1 | {'version': 4, 'last_updated_time': 2024-11-16... |
Query: Get all blades for the windturbine named "hornsea_1_mill_1"
result = pygen.wind_turbine.select().name.equals("hornsea_1_mill_1").blades.list_blade()
result
 | space | external_id | is_damaged | name | data_record |
---|---|---|---|---|---|
0 | sp_wind | hornsea_1_mill_1_blade_A | True | A | {'version': 1, 'last_updated_time': 2024-11-16... |
1 | sp_wind | hornsea_1_mill_1_blade_B | False | B | {'version': 1, 'last_updated_time': 2024-11-16... |
2 | sp_wind | hornsea_1_mill_1_blade_C | False | C | {'version': 1, 'last_updated_time': 2024-11-16... |
Query: Get all damaged blades for the turbines with external ID "hornsea_1_mill_3" or "hornsea_1_mill_4"
result = (
pygen.wind_turbine.select()
.external_id.in_(["hornsea_1_mill_3", "hornsea_1_mill_4"])
.blades.is_damaged.equals(True)
.list_blade()
)
result
 | space | external_id | is_damaged | name | data_record |
---|---|---|---|---|---|
0 | sp_wind | hornsea_1_mill_4_blade_C | True | C | {'version': 1, 'last_updated_time': 2024-11-16... |
Same query as above, but return the turbines:
result = (
pygen.wind_turbine.select()
.external_id.in_(["hornsea_1_mill_3", "hornsea_1_mill_4"])
.blades.is_damaged.equals(True)
.list_full()
)
result
 | space | external_id | capacity | name | blades | datasheets | nacelle | rotor | windfarm | data_record |
---|---|---|---|---|---|---|---|---|---|---|
0 | sp_wind | hornsea_1_mill_4 | 7.0 | hornsea_1_mill_4 | [{'space': 'sp_wind', 'external_id': 'hornsea_... | [windmill_schematics] | hornsea_1_mill_4_nacelle | hornsea_1_mill_4_rotor | Hornsea 1 | {'version': 4, 'last_updated_time': 2024-11-16... |
We can also inspect the query we are doing:
pygen.wind_turbine.query().nacelle.gearbox
Query
Call .list_full() to return a list of Windturbine and .list_gearbox() to return a list of Gearbox.
In the query above, we go from the WindTurbine node through the nacelle direct relation to the Nacelle node, and continue through the gearbox direct relation to the Gearbox node.
Warning Notebook IDE
The autocomplete shown in the screenshots below works best in a Jupyter notebook IDE. In other environments, for example VS Code notebooks, PyCharm notebooks, or Jupyter Lite (CDF Notebook), you might not get the same autocomplete when typing . and pressing tab.
This style of querying is initiated by calling the .select() method on the place we want to start the query. All properties that pygen supports filtering on are then available as attributes on the returned object. In addition, all types of connections (edges, direct relations, and reverse direct relations) can be traversed, as illustrated in the screenshot below. To get the list of possible options, type . and press tab.
We can select a property and, depending on its type, the available filters will show up. For example, name is a string property, which makes the equals, in_, and prefix filters available.
Once we have entered the filter values, we are back at the source node and can continue to filter on properties of this node or traverse to the next one. In the example below, we traverse to the nacelle:
To make the query more readable, it is recommended that you use the parenthesis syntax.
Finally, finish with either .list_full() to return all nodes and edges in the query, or .list_<type>() to return only the last node.
result = (
pygen.wind_turbine.select()
.capacity.range(6.0, 8.0)
.name.prefix("hornsea")
.blades.is_damaged.equals(True)
.list_blade()
)
result
 | space | external_id | is_damaged | name | data_record |
---|---|---|---|---|---|
0 | sp_wind | hornsea_1_mill_1_blade_A | True | A | {'version': 1, 'last_updated_time': 2024-11-16... |
1 | sp_wind | hornsea_1_mill_4_blade_C | True | C | {'version': 1, 'last_updated_time': 2024-11-16... |
2 | sp_wind | hornsea_1_mill_2_blade_B | True | B | {'version': 1, 'last_updated_time': 2024-11-16... |
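The chaining walked through above can be pictured as a builder that collects filters and evaluates nothing until the terminating call. The following is a toy sketch of that pattern in plain Python, not pygen's implementation: the class ToySelect, its methods, the inclusive range bounds, and the sample rows are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

Row = dict[str, Any]


@dataclass
class ToySelect:
    """Collects filters through chained calls and applies them when listing."""

    rows: list[Row]
    predicates: list[Callable[[Row], bool]] = field(default_factory=list)

    def equals(self, prop: str, value: Any) -> "ToySelect":
        self.predicates.append(lambda row: row[prop] == value)
        return self  # returning self is what makes chaining possible

    def prefix(self, prop: str, start: str) -> "ToySelect":
        self.predicates.append(lambda row: row[prop].startswith(start))
        return self

    def range(self, prop: str, low: float, high: float) -> "ToySelect":
        # Inclusive bounds are an assumption made for this toy example.
        self.predicates.append(lambda row: low <= row[prop] <= high)
        return self

    def list_rows(self) -> list[Row]:
        # Filters are only evaluated here, in the terminating call.
        return [row for row in self.rows if all(p(row) for p in self.predicates)]


# Made-up sample data.
turbines = [
    {"name": "demo_a", "capacity": 7.0},
    {"name": "demo_b", "capacity": 9.5},
    {"name": "other_c", "capacity": 7.0},
]

result = (
    ToySelect(turbines)
    .range("capacity", 6.0, 8.0)
    .prefix("name", "demo")
    .list_rows()
)
print(result)
```

The generated SDK exposes properties as attributes rather than string arguments, which is what enables the autocompletion described earlier; the toy version uses strings only to stay short.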
Sorting¶
You can also sort the results returned by the .select() method. This is done by calling .sort_ascending() or .sort_descending() on the property you want to sort by.
Query: Get all damaged blades connected to turbines with a capacity between 6.0 and 8.0 and a name that starts with "hornsea". Sort the blades by name.
result = (
pygen.wind_turbine.select()
.capacity.range(6.0, 8.0)
.name.prefix("hornsea")
.blades.is_damaged.equals(True)
.name.sort_ascending()
.list_blade()
)
result
 | space | external_id | is_damaged | name | data_record |
---|---|---|---|---|---|
0 | sp_wind | hornsea_1_mill_1_blade_A | True | A | {'version': 1, 'last_updated_time': 2024-11-16... |
1 | sp_wind | hornsea_1_mill_2_blade_B | True | B | {'version': 1, 'last_updated_time': 2024-11-16... |
2 | sp_wind | hornsea_1_mill_4_blade_C | True | C | {'version': 1, 'last_updated_time': 2024-11-16... |
In addition, for properties of type TimeStamp and Date you can use .latest() and .earliest(). These are shorthand for calling .sort_descending() or .sort_ascending(), respectively, and setting limit=1.
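The semantics of this shorthand can be pictured in plain Python: sort on the timestamp and keep a single item. The sketch below uses made-up records and hypothetical helper names (latest, earliest); it illustrates the idea only and does not call the SDK.

```python
from datetime import datetime, timezone

# Made-up records standing in for datasheets with an upload timestamp.
datasheets = [
    {"name": "a.pdf", "uploaded_time": datetime(2024, 1, 5, tzinfo=timezone.utc)},
    {"name": "b.pdf", "uploaded_time": datetime(2024, 3, 9, tzinfo=timezone.utc)},
    {"name": "c.pdf", "uploaded_time": datetime(2023, 11, 2, tzinfo=timezone.utc)},
]


def latest(rows, prop):
    """Sort descending on prop and keep one item (limit=1)."""
    return sorted(rows, key=lambda row: row[prop], reverse=True)[:1]


def earliest(rows, prop):
    """Sort ascending on prop and keep one item (limit=1)."""
    return sorted(rows, key=lambda row: row[prop])[:1]


print(latest(datasheets, "uploaded_time")[0]["name"])
print(earliest(datasheets, "uploaded_time")[0]["name"])
```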
Query: Get the latest uploaded datasheet for the "hornsea_1_mill_1" turbine
result = pygen.wind_turbine.select().name.equals("hornsea_1_mill_1").datasheets.uploaded_time.latest().list_data_sheet()
result
 | space | external_id | is_uploaded | mime_type | name | uploaded_time | data_record |
---|---|---|---|---|---|---|---|
0 | sp_wind | windmill_schematics | True | application/pdf | windmill_schematics.pdf | 2024-11-16 14:19:28.484000+00:00 | {'version': 2, 'last_updated_time': 2024-11-16... |
GraphQL based Querying¶
When querying with GraphQL, we must include __typename for the top-level items, as pygen uses it to understand how to parse the object.
The querying method is available on the top-level client, as it is not specific to any of the data types in your data model:
my_query = """{
listWindTurbine(first:1){
items{
__typename
name
capacity
nacelle{
externalId
}
rotor{
externalId
}
blades{
items{
name
is_damaged
}
}
}
}
}"""
result = pygen.graphql_query(my_query)
result
 | blades | capacity | nacelle | name | rotor |
---|---|---|---|---|---|
0 | [{'space': None, 'external_id': None, 'data_re... | 7.0 | {'space': None, 'external_id': 'hornsea_1_mill... | hornsea_1_mill_3 | {'space': None, 'external_id': 'hornsea_1_mill... |
result[0].model_dump(exclude_none=True)
{'blades': [{'is_damaged': False, 'name': 'A'}, {'is_damaged': False, 'name': 'B'}, {'is_damaged': False, 'name': 'C'}], 'capacity': 7.0, 'nacelle': {'external_id': 'hornsea_1_mill_3_nacelle'}, 'name': 'hornsea_1_mill_3', 'rotor': {'external_id': 'hornsea_1_mill_3_rotor'}}
turbine = result[0]
turbine.nacelle.external_id
'hornsea_1_mill_3_nacelle'
Pitfalls¶
If you forget to include __typename on the top-level items, pygen will raise a RuntimeError:
my_invalid_query = """
{
listWindTurbine(first:1){
items{
name
}
}
}
"""
pygen.graphql_query(my_invalid_query)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[19], line 1
----> 1 pygen.graphql_query(my_invalid_query)

File ~\Projects\internal\pygen\examples\wind_turbine\_api_client.py:242, in WindTurbineClient.graphql_query(self, query, variables)
    240 data_model_id = dm.DataModelId("sp_pygen_power", "WindTurbine", "1")
    241 result = self._client.data_modeling.graphql.query(data_model_id, query, variables)
--> 242 return GraphQLQueryResponse(data_model_id).parse(result)

File ~\Projects\internal\pygen\examples\wind_turbine\_api\_core.py:567, in GraphQLQueryResponse.parse(self, response)
    565     raise RuntimeError(response["errors"])
    566 _, data = list(response.items())[0]
--> 567 self._parse_item(data)
    568 if "pageInfo" in data:
    569     self._output.page_info = PageInfo.load(data["pageInfo"])

File ~\Projects\internal\pygen\examples\wind_turbine\_api\_core.py:575, in GraphQLQueryResponse._parse_item(self, data)
    573 if "items" in data:
    574     for item in data["items"]:
--> 575         self._parse_item(item)
    576 elif "__typename" in data:
    577     try:

File ~\Projects\internal\pygen\examples\wind_turbine\_api\_core.py:584, in GraphQLQueryResponse._parse_item(self, data)
    582     self._output.append(item)
    583 else:
--> 584     raise RuntimeError("Missing '__typename' in GraphQL response. Cannot determine the type of the response.")

RuntimeError: Missing '__typename' in GraphQL response. Cannot determine the type of the response.
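The check that triggers this error can be reproduced on a raw response dictionary. The sketch below mirrors the logic visible in the traceback (descend into "items", otherwise require "__typename") but reports the offending paths instead of raising. The function name find_items_missing_typename and the sample responses are hypothetical; the response shape is an assumption based on the traceback.

```python
from typing import Any


def find_items_missing_typename(data: Any, path: str = "") -> list[str]:
    """Return the paths of top-level items that lack '__typename'."""
    if isinstance(data, dict) and "items" in data:
        missing: list[str] = []
        for index, item in enumerate(data["items"]):
            missing.extend(find_items_missing_typename(item, f"{path}.items[{index}]"))
        return missing
    if isinstance(data, dict) and "__typename" in data:
        return []
    return [path or "<root>"]


# Made-up responses shaped like {"<query name>": {"items": [...]}}.
bad_response = {"listWindTurbine": {"items": [{"name": "hornsea_1_mill_3"}]}}
good_response = {
    "listWindTurbine": {"items": [{"__typename": "WindTurbine", "name": "hornsea_1_mill_3"}]}
}

for response in (bad_response, good_response):
    query_name, data = next(iter(response.items()))
    print(find_items_missing_typename(data, query_name))
```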
Data Classes¶
When you call .list(), .retrieve(), and .search(), pygen returns the read format of the data classes. This read format matches the required/optional properties of the type/view.
When you call graphql_query as above, all properties are optional, as pygen cannot know which properties you included in your query. pygen therefore returns a special GraphQL format of the data class:
my_query = """{
listWindTurbine(first:1){
items{
name
__typename
}
}
}"""
result = pygen.graphql_query(my_query)
type(result)
wind_turbine.data_classes._core.base.GraphQLList
type(result[0])
wind_turbine.data_classes._wind_turbine.WindTurbineGraphQL
This data class can be converted to the regular write or read format by calling .as_write() or .as_read().
Warning If you have not included all required properties in your GraphQL query, pygen will raise a ValueError when you make this call.
result[0].as_read()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[25], line 1
----> 1 result[0].as_read()

File ~\Projects\internal\pygen\examples\wind_turbine\data_classes\_wind_turbine.py:154, in WindTurbineGraphQL.as_read(self)
    152 """Convert this GraphQL format of wind turbine to the reading format."""
    153 if self.data_record is None:
--> 154     raise ValueError("This object cannot be converted to a read format because it lacks a data record.")
    155 return WindTurbine(
    156     space=self.space,
    157     external_id=self.external_id,
   (...)
    172     windfarm=self.windfarm,
    173 )

ValueError: This object cannot be converted to a read format because it lacks a data record.
my_query = """{
listWindTurbine(first:1){
items{
name
space
externalId
createdTime
lastUpdatedTime
__typename
}
}
}"""
result = pygen.graphql_query(my_query)
windmill_read = result[0].as_read()
windmill_read
 | value |
---|---|
space | sp_wind |
external_id | hornsea_1_mill_3 |
data_record | {'version': 0, 'last_updated_time': 2024-11-16... |
node_type | None |
capacity | None |
description | None |
name | hornsea_1_mill_3 |
blades | [] |
datasheets | [] |
metmast | [] |
nacelle | None |
power_curve | None |
rotor | None |
windfarm | None |
type(windmill_read)
wind_turbine.data_classes._wind_turbine.WindTurbine
windmill_write = result[0].as_write()
windmill_write
 | value |
---|---|
space | sp_wind |
external_id | hornsea_1_mill_3 |
data_record | {'existing_version': 0} |
node_type | None |
capacity | None |
description | None |
name | hornsea_1_mill_3 |
blades | [] |
datasheets | [] |
metmast | [] |
nacelle | None |
power_curve | None |
rotor | None |
windfarm | None |
type(windmill_write)
wind_turbine.data_classes._wind_turbine.WindTurbineWrite
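The relationship between the all-optional GraphQL format and the stricter read format can be sketched with two small dataclasses. This is a simplified illustration, not the generated code: the class names ToyTurbineGraphQL and ToyTurbineRead are hypothetical, and the extra check on space and external_id is an assumption added for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyTurbineRead:
    """Strict form: identifiers and data record are required."""

    space: str
    external_id: str
    data_record: dict
    name: Optional[str] = None


@dataclass
class ToyTurbineGraphQL:
    """Lenient form: any field may be absent from the query."""

    space: Optional[str] = None
    external_id: Optional[str] = None
    data_record: Optional[dict] = None
    name: Optional[str] = None

    def as_read(self) -> ToyTurbineRead:
        if self.data_record is None:
            raise ValueError("Cannot convert to read format without a data record.")
        # Assumption for this sketch: identifiers are also required.
        if self.space is None or self.external_id is None:
            raise ValueError("Cannot convert to read format without space and external_id.")
        return ToyTurbineRead(self.space, self.external_id, self.data_record, self.name)


partial = ToyTurbineGraphQL(name="hornsea_1_mill_3")
try:
    partial.as_read()
except ValueError as error:
    print(f"Conversion failed: {error}")

complete = ToyTurbineGraphQL(
    space="sp_wind",
    external_id="hornsea_1_mill_3",
    data_record={"version": 0},
    name="hornsea_1_mill_3",
)
print(complete.as_read())
```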
Paging¶
If we include pageInfo in our query, it will be available directly on the result returned from the .graphql_query method:
my_query = """
{
listWindTurbine{
items{
__typename
name
}
pageInfo{
hasNextPage
hasPreviousPage
startCursor
endCursor
}
}
}"""
result = pygen.graphql_query(my_query)
result.page_info.has_next_page
True
result.page_info.end_cursor[:20]
'Z0FBQUFBQm5PTjBsRlUx'
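With has_next_page and end_cursor available, fetching all pages becomes a simple loop. The sketch below is a minimal version of that loop under stated assumptions: the helper iterate_pages and the stand-in fake_run_query are hypothetical, and it assumes the query declares first and after variables and that the cursor is passed back through after. Check the GraphQL schema of your data model before relying on those names.

```python
from types import SimpleNamespace
from typing import Any, Callable, Iterator


def iterate_pages(
    run_query: Callable[[str, dict[str, Any]], Any],
    query: str,
    page_size: int = 2,
) -> Iterator[Any]:
    """Yield one result per page until has_next_page is False."""
    cursor = None
    while True:
        result = run_query(query, {"first": page_size, "after": cursor})
        yield result
        if not result.page_info.has_next_page:
            break
        cursor = result.page_info.end_cursor


def fake_run_query(query: str, variables: dict[str, Any]) -> Any:
    """Stand-in for a real client call; pages through a fixed list."""
    data = ["t1", "t2", "t3"]
    start = int(variables["after"] or 0)
    end = start + variables["first"]
    page_info = SimpleNamespace(has_next_page=end < len(data), end_cursor=str(end))
    return SimpleNamespace(items=data[start:end], page_info=page_info)


pages = [page.items for page in iterate_pages(fake_run_query, "<query using first, after, pageInfo>")]
print(pages)
```

Since graphql_query takes query and variables (as seen in the traceback earlier), passing pygen.graphql_query as run_query may work as is, but that has not been verified here.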
Next section: Creating and Deleting