Input Output
Code is generated from the input and output definitions of a Studio workflow function. For a dataset input and a dataset output, the code editor would add:
@dataclass(kw_only=True)
class Inputs(BaseInputs):
    dsi: In.Dataset  # "dsi"

@dataclass(kw_only=True)
class Outputs(BaseOutputs):
    dso: Out.Dataset  # "dso"
The Inputs and Outputs classes will have all the inputs and outputs defined as attributes. The commented strings show the original names of the inputs and outputs as defined in the workflow function.
An example of a function using these generated classes:
def f(inputs: Inputs) -> Outputs:
    arrow_table = inputs.dsi.arrow()
    outputs = Outputs(
        dso=Out.Dataset(arrow_table)
    )
    return outputs
Here the input dsi is used to create a pyarrow.Table with the .arrow() method, and the
output dso is returned as a Studio Dataset using the Out.Dataset class.
The dataset input and output have an attribute named multiple. Enabling it generates In.Datasets and Out.Datasets:
@dataclass(kw_only=True)
class Inputs(BaseInputs):
    mdsi: In.Datasets  # "mdsi"

@dataclass(kw_only=True)
class Outputs(BaseOutputs):
    mdso: Out.Datasets  # "mdso"
Multiple inputs and outputs (also called Packages) can be used when a function needs to produce or consume several files or datasets.
An example of a function using these generated classes:
def f(inputs: Inputs) -> Outputs:
    tables = []
    for item in inputs.mdsi.iter():
        inp = item.load()
        tables.append(inp.arrow())
    outputs = Outputs(
        mdso=tables
    )
    return outputs
- class workflow_utils.input_output.BaseOutputTransport
Bases: object
Base class for output transport definitions used by the user function to upload files and datasets.
Example
The Outputs class can be reconfigured to inherit from BaseOutputTransport:
@dataclass(init=False)
class Outputs(BaseOutputTransport):
    out_file: Upload.File
    out_files: Upload.Files
    out_dataset: Upload.Dataset
    out_datasets: Upload.Datasets
Used like this:
@dataclass(kw_only=True)
class Inputs(BaseInputs):
    in_file: In.File
    in_dataset: In.Dataset

def f(inputs: Inputs, outputs: Outputs) -> None:
    outputs.out_file.upload(inputs.in_file)
    outputs.out_files.upload(inputs.in_file)
    outputs.out_dataset.upload(inputs.in_dataset)
    outputs.out_datasets.upload(inputs.in_dataset)
Note that outputs has to be initialized in main:
def main(wio: WorkflowIo):
    outputs = Outputs()
    outputs.init_members(wio)
    inputs = Inputs(**collect_inputs(input_cls=Inputs, io_instance=wio))
    f(inputs, outputs)
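To illustrate why an explicit initialization step is needed with `@dataclass(init=False)`, here is a rough sketch of the pattern. It is not the library's implementation: `TransportStandIn` and `UploaderStandIn` are hypothetical stand-ins, and a plain list replaces the `WorkflowIo` object.

```python
from dataclasses import dataclass
from typing import get_type_hints


class UploaderStandIn:
    """Hypothetical stand-in for Upload.File and friends; records uploads."""

    def __init__(self, field_name: str, sink: list) -> None:
        self.field_name = field_name
        self._sink = sink

    def upload(self, value) -> None:
        self._sink.append((self.field_name, value))


class TransportStandIn:
    """Hypothetical stand-in for BaseOutputTransport."""

    def init_members(self, sink: list) -> None:
        # Create one uploader per annotated field, bound to the shared sink.
        for field_name, field_type in get_type_hints(type(self)).items():
            setattr(self, field_name, field_type(field_name, sink))


@dataclass(init=False)
class Outputs(TransportStandIn):
    out_file: UploaderStandIn
    out_dataset: UploaderStandIn


sink: list = []
outputs = Outputs()  # init=False: no fields are set yet
ready_before_init = hasattr(outputs, "out_file")
outputs.init_members(sink)
outputs.out_file.upload("report.txt")
outputs.out_dataset.upload([1, 2, 3])
```

In this sketch the members only exist after `init_members` has run, which mirrors the requirement that outputs be initialized in main before the user function is called.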
- init_members(wio: WorkflowIo) -> None
Initialize the members of this class.
- class workflow_utils.input_output.In
Bases: object
Class used by the Studio workflow builder when generating code from user input in the function authoring tool.
- class Dataset(value: Table)
Bases: object
Class for passing a dataset value as argument to a Studio User Function.
- arrow() -> Table
Use the pyarrow Table object.
- pandas() -> DataFrame
Convert the dataset to a pandas DataFrame.
- polars() -> DataFrame
Convert the dataset to a polars DataFrame.
- value: Table
- class Datasets(value: PackageInfo, presigned_base_url: str)
Bases: Package
Class generated for a Package Dataset Input.
Use the iter() method to iterate over the package content information represented by PackageItem. Use PackageItem.load() to load the dataset as an In.Dataset object like regular dataset inputs.
- class Enum(value: str)
Bases: object
Class generated for an enum input.
value: the enum value as a string.
- value: str
- class File(file_name: str, content_type: str | None = None)
Bases: object
Class for passing a file value as argument to a Studio User Function.
- Parameters:
file_name (str) – The path to the downloaded payload.
content_type (str) – The Content-Type header value or “application/octet-stream” if not set.
- arrow(**kwargs: Any) -> Table
Parse the file content as a pyarrow Table.
- bytes() -> bytes
Read a binary file and return the content as bytes.
- content_type: str | None = None
- file_name: str
- json() -> Any
Read file as JSON and return the content as a dictionary.
- pandas(**kwargs: Any) -> DataFrame
Parse the file content as a pandas DataFrame.
- polars(**kwargs: Any) -> DataFrame
Parse the file content as a polars DataFrame.
- text() -> str
Read file as text and return the content as a string.
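As an illustration of how the readers above might be chosen from the Content-Type, here is a small sketch. `FileStandIn` is a hypothetical stand-in that mirrors only `file_name`, `content_type`, `text()`, `json()` and `bytes()`, and `read_payload` is a helper invented for this example. The dispatch rules are assumptions, not library behaviour.

```python
import os
import tempfile
from dataclasses import dataclass
from json import loads
from typing import Any, Optional


@dataclass
class FileStandIn:
    """Hypothetical stand-in mirroring a few In.File members."""
    file_name: str
    content_type: Optional[str] = None

    def text(self) -> str:
        with open(self.file_name, encoding="utf-8") as handle:
            return handle.read()

    def json(self) -> Any:
        return loads(self.text())

    def bytes(self) -> bytes:
        with open(self.file_name, "rb") as handle:
            return handle.read()


def read_payload(file: FileStandIn) -> Any:
    # Strip parameters such as ";charset=UTF-8"; default to octet-stream.
    raw_type = file.content_type or "application/octet-stream"
    content_type = raw_type.split(";")[0].strip()
    if content_type == "application/json":
        return file.json()
    if content_type.startswith("text/"):
        return file.text()
    return file.bytes()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "payload.json")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write('{"rows": 3}')
    as_json = read_payload(FileStandIn(path, "application/json"))
    as_text = read_payload(FileStandIn(path, "text/plain;charset=UTF-8"))
    as_bytes = read_payload(FileStandIn(path))
```

Falling back to raw bytes matches the documented default of "application/octet-stream" when no Content-Type is set.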
- class Files(value: PackageInfo, presigned_base_url: str)
Bases: Package
Class generated for a Package File Input.
Use the iter() method to iterate over the package content information represented by PackageItem. Use PackageItem.load() to load the file as an In.File object like regular file inputs.
- class JSON(value: dict[Any, Any])
Bases: object
Class generated for a JSON input.
value: The JSON as a dictionary.
- value: dict[Any, Any]
- class Secret(secret: str)
Bases: object
Class generated for a secret input.
Use reveal() on this object to get the secret plain text.
- reveal()
Reveal the secret. Use this function to pass the secret to the applicable credential input.
- Returns:
The secret in clear text
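A short sketch of the intended usage pattern: keep the secret wrapped and call reveal() only where the plain text is needed. `SecretStandIn` and `build_auth_header` are hypothetical names for this example. Masking the `repr` is an assumption of the sketch, not documented behaviour of `In.Secret`.

```python
class SecretStandIn:
    """Hypothetical stand-in for In.Secret."""

    def __init__(self, secret: str) -> None:
        self._secret = secret

    def __repr__(self) -> str:
        # Masking the repr is an assumption of this sketch.
        return "SecretStandIn(***)"

    def reveal(self) -> str:
        return self._secret


def build_auth_header(token: SecretStandIn) -> dict[str, str]:
    # Reveal only at the point where the credential is consumed.
    return {"Authorization": f"Bearer {token.reveal()}"}


token = SecretStandIn("s3cr3t")
header = build_auth_header(token)
log_line = f"calling API with {token!r}"
```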
- class Stream(_parameter: int, _wio: WorkflowIo)
Bases: object
Class generated for a data stream input.
- class workflow_utils.input_output.Out
Bases: object
Class generated to handle outputs from a workflow function.
- class Arrow(value: DataFrame | DataFrame | Series | Series | Table | Dataset, content_type: str | None = None, name: str | None = None, prio: int | None = None)
Bases: File
Class for returning a dataframe value with apache-arrow ipc stream serialization as a Studio File.
- content_type: str | None = 'application/vnd.apache.arrow.stream'
- class Bytes(value: bytes, content_type: str | None = None, name: str | None = None, prio: int | None = None)
Bases: File
Class for returning an arbitrary byte value as a Studio File.
- content_type: str | None = 'application/octet-stream'
- value: bytes
- class CSV(value: DataFrame | DataFrame | Series | Series | Table | Dataset, content_type: str | None = None, name: str | None = None, prio: int | None = None)
Bases: File
Class for returning a dataframe value serialized as CSV as a Studio File.
- content_type: str | None = 'text/csv;charset=UTF-8'
- class Dataset(value: DataFrame | DataFrame | Series | Series | Table | Dataset, name: str | None = None, prio: int | None = None)
Bases: object
Class for returning a dataframe value as a Studio Dataset.
- Parameters:
name (str, optional) – The name of the uploaded dataset. If not set, the output port name is used.
prio (int, optional) – For package items: the prio of the uploaded package item; lower numbers are higher priority.
- name: str | None = None
- prio: int | None = None
- class Datasets(values: list[Dataset | DataFrame | DataFrame | Series | Series | Table | Dataset])
Bases: object
Class for returning a number of dataframe values as a Studio Package.
Each item can have additional info such as name, prio and position if values are passed as Out.Dataset with package_options: Out.PkgOpt member.
- class File(value: DataFrame | DataFrame | Series | Series | Table | Dataset | str | bytes, content_type: str | None = None, name: str | None = None, prio: int | None = None)
Bases: object
Class for returning a value as Studio File.
- Parameters:
value – The payload to upload. Can be a dataset/dataframe reference, a file name, or bytes.
content_type (str, optional) – When uploading files, the Content-Type header is set to this value.
name (str, optional) – The name of the uploaded dataset. If not set, the output port name is used.
prio (int, optional) – For package items: the prio of the uploaded package item; lower numbers are higher priority.
- content_type: str | None = None
- name: str | None = None
- prio: int | None = None
- class Files(values: list[File | DataFrame | DataFrame | Series | Series | Table | Dataset | str | bytes])
Bases: object
Class for returning a value as Studio File.
- class Stream(value: DataFrame | DataFrame | Series | Series | bytes | list)
Bases: object
Class for returning a dataframe value as a Studio Stream.
- value: DataFrame | DataFrame | Series | Series | bytes | list
- class Text(value: str, content_type: str | None = None, name: str | None = None, prio: int | None = None)
Bases: File
Class for returning a string value as a UTF-8 encoded Studio File.
- content_type: str | None = 'text/plain;charset=UTF-8'
- value: str
- class XLSX(value: DataFrame | DataFrame | Series | Series | Table | Dataset, content_type: str | None = None, name: str | None = None, prio: int | None = None)
Bases: File
Class for returning a dataframe value serialized as Excel (xlsx) as a Studio File.
- content_type: str | None = 'application/vnd.openxmlformatsofficedocument.spreadsheetml.sheet'
- class workflow_utils.input_output.Package(value: PackageInfo, presigned_base_url: str)
Bases: object
Class generated for a Package Input.
By contrast, package outputs just take a list of things to add.
value: PackageInfo
- iter() -> Iterator[PackageItem]
- presigned_base_url: str
- value: PackageInfo
- class workflow_utils.input_output.PackageItem(*, order: int, position: int, prio: int | None, type: str, name: str, version_number: int, size: int, mime_type: str, _uri_rest: str, _uri_flight: str, _presigned_id: str, _item_id: str)
Bases: object
- mime_type: str
- name: str
- order: int
- position: int
- prio: int | None
- size: int
- type: str
- version_number: int
- class workflow_utils.input_output.PackageUploader(parameter: int, field_name: str, wio: workflow_utils.workflow_io.WorkflowIo)
Bases: object
- field_name: str
- parameter: int
- upload_dataset(value: Any, name: str | None = None, prio: int | None = None, format: str | None = None) -> bool
- upload_file(value: Any, name: str | None = None, prio: int | None = None, format: str | None = None) -> bool
- wio: WorkflowIo
- class workflow_utils.input_output.Upload
Bases: object
- class Dataset(parameter: int, field_name: str, wio: workflow_utils.workflow_io.WorkflowIo)
Bases: PackageUploader
- upload(value)
- class Datasets(parameter: int, field_name: str, wio: workflow_utils.workflow_io.WorkflowIo)
Bases: PackageUploader
- upload(value, name: str | None = None, prio: int | None = None)
- class File(parameter: int, field_name: str, wio: workflow_utils.workflow_io.WorkflowIo)
Bases: PackageUploader
- upload(value)
- class Files(parameter: int, field_name: str, wio: workflow_utils.workflow_io.WorkflowIo)
Bases: PackageUploader
- upload(value, name: str | None = None, prio: int | None = None)