Querying / Selecting

We assume that you have generated an SDK for the WindTurbine model and have a client ready to go.

Querying/selecting differs from .list(), .retrieve(), and .search() in that it supports retrieving nested data structures. SDKs generated by pygen offer two ways of querying/selecting, which we will refer to as Python and GraphQL querying. In this section, we go through both, list their limitations, and discuss when to use one over the other.

In [2]:

from wind_turbine import WindTurbineClient

In [3]:

pygen = WindTurbineClient.from_toml("config.toml")

Comparing: Python and GraphQL based Querying

Python Based Querying/Selecting

Advantages

Discoverability in the IDE through autocompletion.
The returned data classes have required/optional set as in the data model, which is useful for static type checking and IDE autocompletion.
Regular filtering uses the same syntax as querying along any edge, direct relation, or reverse direct relation.

Limitations

You can only query along one chain of edges. For example, starting from WindTurbine, we can go to either blades or metmast, but not both in the same query.

GraphQL Based Querying

Advantages

Flexibility in querying. You can retrieve anything you can write as a GraphQL query.

Limitations

Queries are harder to write; typically you would use the CDF UI to create the query.
The returned data class has all properties as optional and is thus less structured.

Summary

GraphQL based querying is more flexible, but gives you less structure on the returned data classes. In addition, the query has to be created outside of your IDE, which requires context switching.

Python Based Querying

This approach relies on chaining multiple calls to describe how you want to filter and what data to retrieve.

Query: All turbines with nacelle:

In [4]:

result = pygen.wind_turbine.select().nacelle.list_full()
result

Out[4]:

	space	external_id	capacity	name	blades	datasheets	nacelle	rotor	windfarm	data_record
0	sp_wind	hornsea_1_mill_3	7.0	hornsea_1_mill_3	[hornsea_1_mill_3_blade_A, hornsea_1_mill_3_bl...	[windmill_schematics]	{'space': 'sp_wind', 'external_id': 'hornsea_1...	hornsea_1_mill_3_rotor	Hornsea 1	{'version': 4, 'last_updated_time': 2024-11-16...
1	sp_wind	hornsea_1_mill_2	7.0	hornsea_1_mill_2	[hornsea_1_mill_2_blade_B, hornsea_1_mill_2_bl...	[windmill_schematics]	{'space': 'sp_wind', 'external_id': 'hornsea_1...	hornsea_1_mill_2_rotor	Hornsea 1	{'version': 4, 'last_updated_time': 2024-11-16...
2	sp_wind	hornsea_1_mill_1	7.0	hornsea_1_mill_1	[hornsea_1_mill_1_blade_A, hornsea_1_mill_1_bl...	[windmill_schematics]	{'space': 'sp_wind', 'external_id': 'hornsea_1...	hornsea_1_mill_1_rotor	Hornsea 1	{'version': 4, 'last_updated_time': 2024-11-16...
3	sp_wind	hornsea_1_mill_4	7.0	hornsea_1_mill_4	[hornsea_1_mill_4_blade_C, hornsea_1_mill_4_bl...	[windmill_schematics]	{'space': 'sp_wind', 'external_id': 'hornsea_1...	hornsea_1_mill_4_rotor	Hornsea 1	{'version': 4, 'last_updated_time': 2024-11-16...
4	sp_wind	hornsea_1_mill_5	7.0	hornsea_1_mill_5	[hornsea_1_mill_5_blade_B, hornsea_1_mill_5_bl...	[windmill_schematics]	{'space': 'sp_wind', 'external_id': 'hornsea_1...	hornsea_1_mill_5_rotor	Hornsea 1	{'version': 4, 'last_updated_time': 2024-11-16...

Query: Get all blades for the wind turbine named "hornsea_1_mill_1"

In [5]:

result = pygen.wind_turbine.select().name.equals("hornsea_1_mill_1").blades.list_blade()
result

Out[5]:

	space	external_id	is_damaged	name	data_record
0	sp_wind	hornsea_1_mill_1_blade_A	True	A	{'version': 1, 'last_updated_time': 2024-11-16...
1	sp_wind	hornsea_1_mill_1_blade_B	False	B	{'version': 1, 'last_updated_time': 2024-11-16...
2	sp_wind	hornsea_1_mill_1_blade_C	False	C	{'version': 1, 'last_updated_time': 2024-11-16...

Query: Get all damaged blades for the wind turbines with external ID "hornsea_1_mill_3" or "hornsea_1_mill_4"

In [6]:

result = (
    pygen.wind_turbine.select()
    .external_id.in_(["hornsea_1_mill_3", "hornsea_1_mill_4"])
    .blades.is_damaged.equals(True)
    .list_blade()
)
result

Out[6]:

	space	external_id	is_damaged	name	data_record
0	sp_wind	hornsea_1_mill_4_blade_C	True	C	{'version': 1, 'last_updated_time': 2024-11-16...

Same query as above, but returning the turbines:

In [7]:

result = (
    pygen.wind_turbine.select()
    .external_id.in_(["hornsea_1_mill_3", "hornsea_1_mill_4"])
    .blades.is_damaged.equals(True)
    .list_full()
)
result

Out[7]:

	space	external_id	capacity	name	blades	datasheets	nacelle	rotor	windfarm	data_record
0	sp_wind	hornsea_1_mill_4	7.0	hornsea_1_mill_4	[{'space': 'sp_wind', 'external_id': 'hornsea_...	[windmill_schematics]	hornsea_1_mill_4_nacelle	hornsea_1_mill_4_rotor	Hornsea 1	{'version': 4, 'last_updated_time': 2024-11-16...

We can also inspect the query we are doing:

In [8]:

pygen.wind_turbine.query().nacelle.gearbox

Out[8]:

Query

Call .list_full() to return a list of Windturbine and .list_gearbox() to return a list of Gearbox.

In the query above, we go from the WindTurbine node through the nacelle direct relation to the Nacelle node, and continue through the gearbox direct relation to the Gearbox node.

Warning Notebook IDE

The autocomplete shown in the screenshots below works best in a Jupyter notebook environment, for example VS Code notebooks, PyCharm notebooks, or JupyterLite (CDF Notebooks). Elsewhere, you might not get the same autocomplete when typing . and pressing tab.

This style of querying is initiated by calling the .select() method on the place we want to start the query. All properties that pygen supports filtering on are then available as attributes on the object returned by .select(). In addition, all types of connections (edges, direct relations, and reverse direct relations) can be traversed, as illustrated in the screenshot below. To get the list of possible options, write . and press tab.

We can select a property and, depending on its type, we get the available filters. For example, name is a string property, which makes the equals, in_, and prefix filters available.

Once we have input the filter values, we are back at the source node and can continue to filter on properties of this node or traverse to the next one. In the example below, we traverse to the nacelle:

To make the query more readable, it is recommended that you wrap it in parentheses and put each step on its own line.

Finally, finish with either .list_full() to return all nodes and edges in the query, or .list_<type>() to return only the last node.
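To make the chaining pattern concrete, here is a minimal, self-contained sketch of how a fluent builder of this kind could work. It is not pygen's implementation: SCHEMA, NodeStep, PropertyFilter, and the tuple-based filter records are hypothetical names used only for illustration. What it shows is that every filter call returns the node-level object, which is why you are back at the source node after supplying a filter value, while accessing a connection moves you to the next node.

```python
# Hypothetical, simplified schema: which properties and connections each node has.
SCHEMA = {
    "WindTurbine": {"properties": {"name", "capacity"}, "connections": {"blades": "Blade"}},
    "Blade": {"properties": {"name", "is_damaged"}, "connections": {}},
}


class PropertyFilter:
    """Filter methods for one property; each call returns the owning node step."""

    def __init__(self, step, prop):
        self._step = step
        self._prop = prop

    def equals(self, value):
        return self._step._with_filter(self._prop, "equals", value)

    def prefix(self, value):
        return self._step._with_filter(self._prop, "prefix", value)


class NodeStep:
    """The current position in the chain, for example WindTurbine or Blade."""

    def __init__(self, node, filters=None):
        self._node = node
        self._filters = filters if filters is not None else []

    def _with_filter(self, prop, operator, value):
        # Record the filter and hand back the node step so chaining can continue.
        self._filters.append((self._node, prop, operator, value))
        return self

    def __getattr__(self, name):
        # Only reached for names that are not regular attributes or methods.
        if name.startswith("_"):
            raise AttributeError(name)
        spec = SCHEMA[self._node]
        if name in spec["properties"]:
            return PropertyFilter(self, name)
        if name in spec["connections"]:
            # Traverse to the next node, carrying the collected filters along.
            return NodeStep(spec["connections"][name], self._filters)
        raise AttributeError(f"{self._node} has no property or connection {name!r}")

    def describe(self):
        return list(self._filters)


query = (
    NodeStep("WindTurbine")
    .name.prefix("hornsea")
    .blades.is_damaged.equals(True)
)
print(query.describe())
```

A real builder would translate the collected filters into a request against the data model instead of just listing them.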

In [11]:

result = (
    pygen.wind_turbine.select()
    .capacity.range(6.0, 8.0)
    .name.prefix("hornsea")
    .blades.is_damaged.equals(True)
    .list_blade()
)
result

Out[11]:

	space	external_id	is_damaged	name	data_record
0	sp_wind	hornsea_1_mill_1_blade_A	True	A	{'version': 1, 'last_updated_time': 2024-11-16...
1	sp_wind	hornsea_1_mill_4_blade_C	True	C	{'version': 1, 'last_updated_time': 2024-11-16...
2	sp_wind	hornsea_1_mill_2_blade_B	True	B	{'version': 1, 'last_updated_time': 2024-11-16...

Sorting

You can also sort the results returned by the .select() method. This is done by calling .sort_ascending() or .sort_descending() on the property you are sorting by.

Query: Get all damaged blades connected to turbines with capacity between 6.0 and 8.0 and a name that starts with "hornsea". Sort the blades by name

In [6]:

result = (
    pygen.wind_turbine.select()
    .capacity.range(6.0, 8.0)
    .name.prefix("hornsea")
    .blades.is_damaged.equals(True)
    .name.sort_ascending()
    .list_blade()
)
result

Out[6]:

	space	external_id	is_damaged	name	data_record
0	sp_wind	hornsea_1_mill_1_blade_A	True	A	{'version': 1, 'last_updated_time': 2024-11-16...
1	sp_wind	hornsea_1_mill_2_blade_B	True	B	{'version': 1, 'last_updated_time': 2024-11-16...
2	sp_wind	hornsea_1_mill_4_blade_C	True	C	{'version': 1, 'last_updated_time': 2024-11-16...

In addition, for properties of type Timestamp and Date you can use .latest() and .earliest(). These are shorthands for calling .sort_descending() or .sort_ascending(), respectively, and setting limit=1.
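The idea behind these shorthands can be illustrated on plain Python data. This sketch does not call pygen: the datasheets list holds made-up sample records, and the latest and earliest helpers are hypothetical stand-ins that sort on a key and keep a single item.

```python
from datetime import datetime, timezone

# Made-up sample records standing in for data sheet nodes.
datasheets = [
    {"name": "sheet_a.pdf", "uploaded_time": datetime(2024, 1, 5, tzinfo=timezone.utc)},
    {"name": "sheet_b.pdf", "uploaded_time": datetime(2024, 11, 16, tzinfo=timezone.utc)},
    {"name": "sheet_c.pdf", "uploaded_time": datetime(2023, 6, 1, tzinfo=timezone.utc)},
]


def latest(items, key):
    """Sort descending on `key` and keep one item (limit=1)."""
    return sorted(items, key=lambda item: item[key], reverse=True)[:1]


def earliest(items, key):
    """Sort ascending on `key` and keep one item (limit=1)."""
    return sorted(items, key=lambda item: item[key])[:1]


print(latest(datasheets, "uploaded_time")[0]["name"])    # sheet_b.pdf
print(earliest(datasheets, "uploaded_time")[0]["name"])  # sheet_c.pdf
```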

Query: Get the latest uploaded datasheet for the "hornsea_1_mill_1" turbine

In [7]:

result = pygen.wind_turbine.select().name.equals("hornsea_1_mill_1").datasheets.uploaded_time.latest().list_data_sheet()
result

Out[7]:

	space	external_id	is_uploaded	mime_type	name	uploaded_time	data_record
0	sp_wind	windmill_schematics	True	application/pdf	windmill_schematics.pdf	2024-11-16 14:19:28.484000+00:00	{'version': 2, 'last_updated_time': 2024-11-16...

GraphQL based Querying

When querying with GraphQL, we must include __typename on the top-level items, as pygen uses it to understand how to parse the object.

The querying method is available on the top-level client, as it is not particular to any of the data types in your data model.

In [12]:

my_query = """{
  listWindTurbine(first:1){
    items{
      __typename
      name
      capacity
      nacelle{
        externalId
      }
      rotor{
        externalId
      }
      blades{
        items{
          name
          is_damaged
        }
      }
    }
  }
}"""

In [13]:

result = pygen.graphql_query(my_query)

In [14]:

result

Out[14]:

	blades	capacity	nacelle	name	rotor
0	[{'space': None, 'external_id': None, 'data_re...	7.0	{'space': None, 'external_id': 'hornsea_1_mill...	hornsea_1_mill_3	{'space': None, 'external_id': 'hornsea_1_mill...

In [15]:

result[0].model_dump(exclude_none=True)

Out[15]:

{'blades': [{'is_damaged': False, 'name': 'A'},
  {'is_damaged': False, 'name': 'B'},
  {'is_damaged': False, 'name': 'C'}],
 'capacity': 7.0,
 'nacelle': {'external_id': 'hornsea_1_mill_3_nacelle'},
 'name': 'hornsea_1_mill_3',
 'rotor': {'external_id': 'hornsea_1_mill_3_rotor'}}

In [16]:

turbine = result[0]

In [17]:

turbine.nacelle.external_id

Out[17]:

'hornsea_1_mill_3_nacelle'

Pitfalls

If you forget to include __typename on the top-level item, pygen will raise a RuntimeError.

In [18]:

my_invalid_query = """
{
  listWindTurbine(first:1){
    items{
      name
  }
 }
}
"""

In [19]:

pygen.graphql_query(my_invalid_query)

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[19], line 1
----> 1 pygen.graphql_query(my_invalid_query)

File ~\Projects\internal\pygen\examples\wind_turbine\_api_client.py:242, in WindTurbineClient.graphql_query(self, query, variables)
    240 data_model_id = dm.DataModelId("sp_pygen_power", "WindTurbine", "1")
    241 result = self._client.data_modeling.graphql.query(data_model_id, query, variables)
--> 242 return GraphQLQueryResponse(data_model_id).parse(result)

File ~\Projects\internal\pygen\examples\wind_turbine\_api\_core.py:567, in GraphQLQueryResponse.parse(self, response)
    565     raise RuntimeError(response["errors"])
    566 _, data = list(response.items())[0]
--> 567 self._parse_item(data)
    568 if "pageInfo" in data:
    569     self._output.page_info = PageInfo.load(data["pageInfo"])

File ~\Projects\internal\pygen\examples\wind_turbine\_api\_core.py:575, in GraphQLQueryResponse._parse_item(self, data)
    573 if "items" in data:
    574     for item in data["items"]:
--> 575         self._parse_item(item)
    576 elif "__typename" in data:
    577     try:

File ~\Projects\internal\pygen\examples\wind_turbine\_api\_core.py:584, in GraphQLQueryResponse._parse_item(self, data)
    582         self._output.append(item)
    583 else:
--> 584     raise RuntimeError("Missing '__typename' in GraphQL response. Cannot determine the type of the response.")

RuntimeError: Missing '__typename' in GraphQL response. Cannot determine the type of the response.
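The traceback suggests that the parser walks the response and picks a data class based on each item's __typename. The following is a simplified sketch of that dispatch pattern, not the generated code itself: parse_items and the type_lookup mapping are hypothetical names, and the parser here returns a plain tuple instead of a data class.

```python
def parse_items(data, type_lookup):
    """Collect parsed objects, choosing the parser from each item's __typename."""
    output = []

    def visit(node):
        if "items" in node:
            # A list wrapper: descend into every item.
            for item in node["items"]:
                visit(item)
        elif "__typename" in node:
            # A concrete item: look up how to parse it from its type name.
            parser = type_lookup[node["__typename"]]
            output.append(parser(node))
        else:
            # Without __typename there is no way to pick a parser.
            raise RuntimeError("Missing '__typename' in GraphQL response.")

    visit(data)
    return output


# Hypothetical parser table: one entry per type in the data model.
type_lookup = {"WindTurbine": lambda node: ("WindTurbine", node["name"])}

ok = {"items": [{"__typename": "WindTurbine", "name": "hornsea_1_mill_3"}]}
print(parse_items(ok, type_lookup))

bad = {"items": [{"name": "hornsea_1_mill_3"}]}
try:
    parse_items(bad, type_lookup)
except RuntimeError as error:
    print(error)
```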

Data Classes

When you call .list(), .retrieve(), and .search(), pygen returns the read format of the data classes. This read format matches the required/optional properties of the type/view.

When you call graphql_query as above, all properties are optional because pygen cannot know which properties you included in your query. pygen therefore returns a special GraphQL format of the data class.

In [20]:

my_query = """{
  listWindTurbine(first:1){
    items{
      name
      __typename
  }
 }
}"""

In [21]:

result = pygen.graphql_query(my_query)

In [22]:

type(result)

Out[22]:

wind_turbine.data_classes._core.base.GraphQLList

In [24]:

type(result[0])

Out[24]:

wind_turbine.data_classes._wind_turbine.WindTurbineGraphQL

This data class can be converted to the regular write or read format by calling as_write() or as_read().

Warning If you have not included all required properties in your GraphQL query, pygen will raise a ValueError when you make this call.

In [25]:

result[0].as_read()

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[25], line 1
----> 1 result[0].as_read()

File ~\Projects\internal\pygen\examples\wind_turbine\data_classes\_wind_turbine.py:154, in WindTurbineGraphQL.as_read(self)
    152 """Convert this GraphQL format of wind turbine to the reading format."""
    153 if self.data_record is None:
--> 154     raise ValueError("This object cannot be converted to a read format because it lacks a data record.")
    155 return WindTurbine(
    156     space=self.space,
    157     external_id=self.external_id,
   (...)
    172     windfarm=self.windfarm,
    173 )

ValueError: This object cannot be converted to a read format because it lacks a data record.
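The relationship between the all-optional GraphQL format and the stricter read format can be sketched with plain dataclasses. This is an illustration under simplifying assumptions, not the generated classes: TurbineRead and TurbineGraphQL are hypothetical stand-ins, and a single version field replaces the data record that the real conversion checks for.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TurbineRead:
    """Stand-in for a read class: identifying fields are required."""

    space: str
    external_id: str
    version: int
    name: Optional[str] = None


@dataclass
class TurbineGraphQL:
    """Stand-in for a GraphQL class: every field is optional."""

    space: Optional[str] = None
    external_id: Optional[str] = None
    version: Optional[int] = None
    name: Optional[str] = None

    def as_read(self) -> TurbineRead:
        # The conversion only succeeds if the query returned the required fields.
        if self.space is None or self.external_id is None or self.version is None:
            raise ValueError("Cannot convert to read format: required fields are missing.")
        return TurbineRead(self.space, self.external_id, self.version, self.name)


partial = TurbineGraphQL(name="hornsea_1_mill_3")
try:
    partial.as_read()
except ValueError as error:
    print(error)

complete = TurbineGraphQL(
    space="sp_wind", external_id="hornsea_1_mill_3", version=0, name="hornsea_1_mill_3"
)
print(complete.as_read())
```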

In [27]:

my_query = """{
  listWindTurbine(first:1){
    items{
      name
      space
      externalId
      createdTime
      lastUpdatedTime
      __typename
  }
 }
}"""

In [28]:

result = pygen.graphql_query(my_query)

In [29]:

windmill_read = result[0].as_read()
windmill_read

Out[29]:

	value
space	sp_wind
external_id	hornsea_1_mill_3
data_record	{'version': 0, 'last_updated_time': 2024-11-16...
node_type	None
capacity	None
description	None
name	hornsea_1_mill_3
blades	[]
datasheets	[]
metmast	[]
nacelle	None
power_curve	None
rotor	None
windfarm	None

In [30]:

type(windmill_read)

Out[30]:

wind_turbine.data_classes._wind_turbine.WindTurbine

In [31]:

windmill_write = result[0].as_write()
windmill_write

Out[31]:

	value
space	sp_wind
external_id	hornsea_1_mill_3
data_record	{'existing_version': 0}
node_type	None
capacity	None
description	None
name	hornsea_1_mill_3
blades	[]
datasheets	[]
metmast	[]
nacelle	None
power_curve	None
rotor	None
windfarm	None

In [32]:

type(windmill_write)

Out[32]:

wind_turbine.data_classes._wind_turbine.WindTurbineWrite

Paging

If we include pageInfo in our query, it will be available directly on the result returned from the .graphql_query method.

In [33]:

my_query = """
{
  listWindTurbine{
    items{
      __typename
      name
    }
    pageInfo{
      hasNextPage
      hasPreviousPage
      startCursor
      endCursor
    }
  }
}"""

In [34]:

result = pygen.graphql_query(my_query)

In [35]:

result.page_info.has_next_page

Out[35]:

True

In [36]:

result.page_info.end_cursor[:20]

Out[36]:

'Z0FBQUFBQm5PTjBsRlUx'
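The cursor can be used to fetch the next page. The sketch below shows one possible loop, under two assumptions that are not confirmed in this section: that the list query accepts an after argument taking the cursor, and that graphql_query accepts a variables dictionary as its second argument (the signature in the traceback earlier suggests it does). PAGED_QUERY, iterate_all, FakePage, and fake_run_query are hypothetical names; the fake runner only exists so the loop can be exercised without a CDF connection.

```python
from types import SimpleNamespace

# Assumed query shape: `after` takes the cursor from the previous page.
PAGED_QUERY = """
query Turbines($cursor: String) {
  listWindTurbine(first: 2, after: $cursor) {
    items {
      __typename
      name
    }
    pageInfo {
      hasNextPage
      endCursor
    }
  }
}"""


def iterate_all(run_query, query=PAGED_QUERY):
    """Yield items page by page until has_next_page is False.

    `run_query(query, variables)` must return a list-like result
    that exposes a `page_info` attribute.
    """
    cursor = None
    while True:
        page = run_query(query, {"cursor": cursor})
        yield from page
        if not page.page_info.has_next_page:
            break
        cursor = page.page_info.end_cursor


class FakePage(list):
    """In-memory stand-in for the query result, used only to exercise the loop."""

    def __init__(self, items, has_next_page, end_cursor):
        super().__init__(items)
        self.page_info = SimpleNamespace(has_next_page=has_next_page, end_cursor=end_cursor)


def fake_run_query(query, variables):
    pages = {
        None: FakePage(["t1", "t2"], True, "c1"),
        "c1": FakePage(["t3"], False, None),
    }
    return pages[variables["cursor"]]


names = list(iterate_all(fake_run_query))
print(names)  # ['t1', 't2', 't3']
```

With a real client, you would pass pygen.graphql_query as run_query, for example list(iterate_all(pygen.graphql_query)), provided the assumptions above hold for your data model.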


Next section: Creating and Deleting