Input Output

This code is generated from the input and output definitions of a Studio workflow function. For one dataset input and one dataset output, the code editor would add:

@dataclass(kw_only=True)
class Inputs(BaseInputs):
    dsi: In.Dataset  # "dsi"

@dataclass(kw_only=True)
class Outputs(BaseOutputs):
    dso: Out.Dataset  # "dso"

The Inputs and Outputs classes will have all the inputs and outputs defined as attributes. The commented strings show the original names of the inputs and outputs as defined in the workflow function.

An example of a function using these generated classes:

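def f(inputs: Inputs) -> Outputs:
    # Read the input dataset as a pyarrow.Table.
    arrow_table = inputs.dsi.arrow()

    # Wrap the table so it is returned as a Studio Dataset.
    outputs = Outputs(
        dso=Out.Dataset(arrow_table),
    )

    return outputs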

Here the input dsi is used to create a pyarrow.Table with the .arrow() method, and the output dso is returned as a Studio Dataset using the Out.Dataset class.

The dataset input and output each have an attribute named multiple. Enabling it generates In.Datasets and Out.Datasets instead:

@dataclass(kw_only=True)
class Inputs(BaseInputs):
    mdsi: In.Datasets  # "mdsi"

@dataclass(kw_only=True)
class Outputs(BaseOutputs):
    mdso: Out.Datasets  # "mdso"

Multi-inputs and multi-outputs (also called Packages) can be used when a function needs to consume or produce several files or datasets.

An example of a function using these generated classes:

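def f(inputs: Inputs) -> Outputs:
    tables = []

    # Load each package item as an In.Dataset and convert it to a pyarrow.Table.
    for item in inputs.mdsi.iter():
        inp = item.load()
        tables.append(inp.arrow())

    # The package output is given the list of tables.
    outputs = Outputs(
        mdso=tables,
    )

    return outputs

class workflow_utils.input_output.BaseOutputTransport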

Bases: object

Base class for output transport definitions used by the user function to upload files and datasets.

Example

The Outputs class can be reconfigured to inherit from BaseOutputTransport:

@dataclass(init=False)
class Outputs(BaseOutputTransport):
    out_file: Upload.File
    out_files: Upload.Files
    out_dataset: Upload.Dataset
    out_datasets: Upload.Datasets

Used like this:

@dataclass(kw_only=True)
class Inputs(BaseInputs):
    in_file: In.File
    in_dataset: In.Dataset

def f(inputs: Inputs, outputs: Outputs) -> None:
    outputs.out_file.upload(inputs.in_file)
    outputs.out_files.upload(inputs.in_file)
    outputs.out_dataset.upload(inputs.in_dataset)
    outputs.out_datasets.upload(inputs.in_dataset)

Note that outputs has to be initialized in main:

def main(wio: WorkflowIo):
    outputs = Outputs()
    outputs.init_members(wio)
    inputs = Inputs(**collect_inputs(input_cls=Inputs, io_instance=wio))
    f(inputs, outputs)

init_members(io: WorkflowIo) None

Initialize the members of this class.
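
The need to call init_members() before using the outputs can be illustrated with a small sketch. Everything in it is a hypothetical stand-in (FakeWio, FakeUploader, FakeTransport, and the use of the field position as the parameter index are assumptions made for illustration); it only mirrors the documented constructor shape (parameter, field_name, wio) and is not the library's implementation.

```python
from dataclasses import dataclass, fields


class FakeWio:
    """Hypothetical stand-in for WorkflowIo; records uploads instead of sending them."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, object]] = []


class FakeUploader:
    """Hypothetical stand-in for an Upload.* member."""

    def __init__(self, parameter: int, field_name: str, wio: FakeWio) -> None:
        self.parameter = parameter
        self.field_name = field_name
        self.wio = wio

    def upload(self, value: object) -> None:
        self.wio.sent.append((self.field_name, value))


class FakeTransport:
    """Stand-in base class that creates one uploader per annotated field."""

    def init_members(self, io: FakeWio) -> None:
        # Assumption: the field position is used as the parameter index.
        for index, field in enumerate(fields(self)):
            setattr(self, field.name, field.type(index, field.name, io))


@dataclass(init=False)
class Outputs(FakeTransport):
    out_file: FakeUploader
    out_dataset: FakeUploader


wio = FakeWio()
outputs = Outputs()  # init=False: no members exist yet
members_before = hasattr(outputs, "out_file")
outputs.init_members(wio)  # members are created here
outputs.out_file.upload("report.txt")
```

In this sketch the annotated members do not exist until init_members() has run, which is why the call has to come before the user function is invoked.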

class workflow_utils.input_output.In

Bases: object

Class used by the Studio workflow builder when generating code from user input in the function authoring tool.

class CSV(*, _uri: str | None = None)

Bases: Typedfile

Class generated for a CSV file input.

arrow(**kwargs: Any) Table

Parse the CSV file content as a pyarrow Table.

content_type: str = 'text/csv;charset=UTF-8'
dataset(**kwargs: Any) Dataset

Parse the CSV file content as a workflow_utils Dataset.

pandas(**kwargs: Any) DataFrame

Parse the CSV file content as a pandas DataFrame.

polars(**kwargs: Any) DataFrame

Parse the CSV file content as a polars DataFrame.

class Dataset(value: Table)

Bases: object

Class for passing a dataset value as argument to a Studio User Function.

arrow() Table

Use the pyarrow Table object.

dataset() Dataset

Use the workflow_utils Dataset object.

pandas() DataFrame

Convert the dataset to a pandas DataFrame.

polars() DataFrame

Convert the dataset to a polars DataFrame.

value: Table
class Datasets(value: PackageInfo, presigned_base_url: str)

Bases: Package

Class generated for a Package Dataset Input.

Use the iter() method to iterate over the package content information represented by PackageItem. Use PackageItem.load() to load the dataset as an In.Dataset object like regular dataset inputs.

class Enum(value: str)

Bases: object

Class generated for an enum input.

value: the enum value as a string.

value: str
class File(*, content_type: str | None = None, _uri: str | None = None)

Bases: object

Class for passing a file value as argument to a Studio User Function.

The default behaviour of this class is to load the data each time bytes() or one of the converting methods (e.g. polars()) is called. If the payload needs to be accessed several times in its raw form, it should be memoized by the user. The store() method can be used to load the payload and store it on disk.
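
One way to follow the memoization advice is to wrap the read in a cached helper so the payload is loaded only once. The sketch below is a minimal illustration using only the standard library; FakeFile and make_cached_reader are hypothetical names, and FakeFile only mimics the bytes() method while counting how often the payload is loaded.

```python
from functools import lru_cache


class FakeFile:
    """Hypothetical stand-in for In.File: every bytes() call reloads the payload."""

    def __init__(self, payload: bytes) -> None:
        self._payload = payload
        self.load_count = 0  # number of times the payload was loaded

    def bytes(self) -> bytes:
        self.load_count += 1
        return self._payload


def make_cached_reader(file: FakeFile):
    """Return a zero-argument function that loads the payload at most once."""

    @lru_cache(maxsize=1)
    def read() -> bytes:
        return file.bytes()

    return read


in_file = FakeFile(b"id,value\n1,10\n2,20\n")
read_raw = make_cached_reader(in_file)

header = read_raw().split(b"\n")[0]      # first access triggers the load
row_count = read_raw().count(b"\n") - 1  # second access reuses the cached bytes
```

Keeping the result of a single bytes() call in a local variable achieves the same thing; the cached helper is mainly useful when the raw payload is needed in several places.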

arrow(**kwargs: Any) Table

Parse the file content as a pyarrow Table.

bytes() bytes

Read a binary file and return the content as bytes.

content_type: str | None = None
dataset(**kwargs: Any) Dataset

Parse the file content as a workflow_utils.dataset.Dataset.

json() Any

Read file as JSON and return the content as a dictionary.

pandas(**kwargs: Any) DataFrame

Parse the file content as a pandas DataFrame.

polars(**kwargs: Any) DataFrame

Parse the file content as a polars DataFrame.

store() StoredFile

Convert self to an In.StoredFile, which stores the content on disk.

text() str

Read file as text and return the content as a string.

class Files(value: PackageInfo, presigned_base_url: str)

Bases: Package

Class generated for a Package File Input.

Use the iter() method to iterate over the package content information represented by PackageItem. Use PackageItem.load() to load the file as an In.File object like regular file inputs.

class JSON(value: dict[Any, Any])

Bases: object

Class generated for a JSON input.

value: The JSON as a dictionary.

value: dict[Any, Any]
class JSONFile(*, _uri: str | None = None)

Bases: Typedfile

Class generated for a JSON file input.

content_type: str = 'application/json;charset=UTF-8'
load() Any

Read the file and parse it as JSON, returning the parsed content.

model(model_type: type[_T]) _T

Parse the JSON file and validate it into a Pydantic model.

Parameters:

model_type (type[T]) – The Pydantic model class to validate against.

Returns:

A validated instance of the given model type.

Example:

@dataclass(kw_only=True)
class Inputs(BaseInputs):
    config: In.JSONFile

class MyConfig(BaseModel):
    threshold: float
    name: str

config = inputs.config.model(MyConfig)

class Markdown(*, _uri: str | None = None)

Bases: Typedfile

Class generated for a Markdown file input.

content_type: str = 'text/markdown;charset=UTF-8'
class Python(*, _uri: str | None = None)

Bases: Typedfile

Class generated for a Python file input.

content_type: str = 'text/x-python;charset=UTF-8'
class Secret(secret: str)

Bases: object

Class generated for a secret input.

Use reveal() on this object to get the secret plain text.

reveal()

Reveal the secret. Use this function to pass the secret to the applicable credential input.

Returns:

The secret in clear text
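
As an illustration of passing a revealed secret to a credential input, the sketch below builds an HTTP Basic Authorization header. FakeSecret is a hypothetical stand-in that only mimics reveal(), and basic_auth_header is an illustrative helper, not part of workflow_utils.

```python
import base64


class FakeSecret:
    """Hypothetical stand-in for In.Secret; only reveal() is mimicked."""

    def __init__(self, secret: str) -> None:
        self._secret = secret

    def reveal(self) -> str:
        return self._secret


def basic_auth_header(user: str, password: FakeSecret) -> dict[str, str]:
    """Build an HTTP Basic Authorization header, revealing the secret only here."""
    token = base64.b64encode(f"{user}:{password.reveal()}".encode("utf-8"))
    return {"Authorization": "Basic " + token.decode("ascii")}


headers = basic_auth_header("svc-user", FakeSecret("s3cr3t"))
```

Calling reveal() as late as possible, at the point where the credential is consumed, keeps the plain text out of intermediate variables and log output.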

class StoredFile(*, content_type: str | None = None, _uri: str | None = None, file_name: str | None = None)

Bases: File

Class for passing a file value as argument to a Studio User Function.

Parameters:
  • file_name (str) – The path to the downloaded payload.

  • content_type (str) – The Content-Type header value or “application/octet-stream” if not set.

arrow(**kwargs: Any) Table

Parse the file content as a pyarrow Table.

bytes() bytes

Read a binary file and return the content as bytes.

dataset(**kwargs: Any) Dataset

Parse the file content as a workflow_utils.dataset.Dataset.

file_name: str | None = None
json() Any

Read file as JSON and return the content as a dictionary.

pandas(**kwargs: Any) DataFrame

Parse the file content as a pandas DataFrame.

polars(**kwargs: Any) DataFrame

Parse the file content as a polars DataFrame.

text() str

Read file as text and return the content as a string.

class Stream(_parameter: int, _wio: WorkflowIo)

Bases: object

Class generated for a data stream input.

values(count: int | None = None) list[Message]

Retrieve messages from the data stream.

class Text(*, _uri: str | None = None)

Bases: Typedfile

Class generated for a text file input.

content_type: str = 'text/plain;charset=UTF-8'
class Typedfile(*, _uri: str | None = None)

Bases: File

Base class for typed file inputs with a fixed content type.

Parameters:

file_name (str) – The path to the downloaded file payload.

bytes() bytes

Read the file and return its content as bytes.

text() str

Read the file and return its content as a string.

class XML(*, _uri: str | None = None)

Bases: Typedfile

Class generated for an XML file input.

content_type: str = 'application/xml;charset=UTF-8'
parse() Element

Parse the file as XML and return the root element.
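
A short sketch of working with the root element follows. It assumes the returned Element is the standard library's xml.etree.ElementTree.Element, which this page does not state explicitly; the XML content is made up for illustration, and ET.fromstring stands in for the value that parse() would return on an XML input.

```python
import xml.etree.ElementTree as ET

# Stand-in for the root element returned by parse() on an In.XML input.
root = ET.fromstring(
    "<config>"
    "<sensor id='a' unit='C'>21.5</sensor>"
    "<sensor id='b' unit='C'>19.0</sensor>"
    "</config>"
)

# Collect each sensor reading keyed by its id attribute.
readings = {sensor.get("id"): float(sensor.text) for sensor in root.iter("sensor")}
```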

class workflow_utils.input_output.Out

Bases: object

Class generated to handle outputs from a workflow function.

class Arrow(value: DataFrame | DataFrame | Series | Series | Table | Dataset, name: str | None = None, prio: int | None = None)

Bases: File

Class for returning a dataframe value, serialized as an Apache Arrow IPC stream, as a Studio File.

content_type: str = 'application/vnd.apache.arrow.stream'
value: DataFrame | DataFrame | Series | Series | Table | Dataset
class Bytes(value: bytes, name: str | None = None, prio: int | None = None)

Bases: File

Class for returning an arbitrary byte value as a Studio File.

content_type: str = 'application/octet-stream'
value: bytes
class CSV(value: DataFrame | DataFrame | Series | Series | Table | Dataset, name: str | None = None, prio: int | None = None)

Bases: File

Class for returning a dataframe value serialized as CSV as a Studio File.

content_type: str = 'text/csv;charset=UTF-8'
value: DataFrame | DataFrame | Series | Series | Table | Dataset
class Dataset(value: DataFrame | DataFrame | Series | Series | Table | Dataset, name: str | None = None, prio: int | None = None)

Bases: object

Class for returning a dataframe value as a Studio Dataset.

Parameters:

  • name (str, optional) – The name of the uploaded dataset. If not set, the output port name is used.

  • prio (int, optional) – For package items, the priority of the uploaded package item; lower numbers are higher priority.

name: str | None = None
prio: int | None = None
value: DataFrame | DataFrame | Series | Series | Table | Dataset
class Datasets(values: list[Dataset | DataFrame | DataFrame | Series | Series | Table | Dataset])

Bases: object

Class for returning a number of dataframe values as a Studio Package.

Each Item can have additional info such as name, prio and position if values are passed as Out.Dataset with package_options: Out.PkgOpt member.

values: list[Dataset | DataFrame | DataFrame | Series | Series | Table | Dataset]
class File(value: DataFrame | DataFrame | Series | Series | Table | Dataset | str | bytes, content_type: str | None = None, name: str | None = None, prio: int | None = None)

Bases: object

Class for returning a value as a Studio File.

Parameters:
  • value – The payload to upload. Can be a dataset/dataframe reference, a file name, or bytes.

  • content_type (str, optional) – When uploading files, the Content-Type header is set to this value.

  • name (str, optional) – The name of the uploaded file. If not set, the output port name is used.

  • prio (int, optional) – For package items, the priority of the uploaded package item; lower numbers are higher priority.

content_type: str | None = None
name: str | None = None
prio: int | None = None
value: DataFrame | DataFrame | Series | Series | Table | Dataset | str | bytes
class Files(values: list[File | DataFrame | DataFrame | Series | Series | Table | Dataset | str | bytes])

Bases: object

Class for returning a number of values as a Studio Package of files.

values: list[File | DataFrame | DataFrame | Series | Series | Table | Dataset | str | bytes]
class JSONFile(value: str, name: str | None = None, prio: int | None = None)

Bases: File

Class for returning a string value as a JSON Studio File.

content_type: str = 'application/json;charset=UTF-8'
value: str
class Markdown(value: str, name: str | None = None, prio: int | None = None)

Bases: File

Class for returning a string value as a Markdown Studio File.

content_type: str = 'text/markdown;charset=UTF-8'
value: str
class Python(value: str, name: str | None = None, prio: int | None = None)

Bases: File

Class for returning a string value as a Python Studio File.

content_type: str = 'text/x-python;charset=UTF-8'
value: str
class Stream(value: DataFrame | DataFrame | Series | Series | bytes | list)

Bases: object

Class for returning a dataframe value as a Studio Stream.

value: DataFrame | DataFrame | Series | Series | bytes | list
class Text(value: str, name: str | None = None, prio: int | None = None)

Bases: File

Class for returning a string value as a UTF-8 encoded Studio File.

content_type: str = 'text/plain;charset=UTF-8'
value: str
class XLSX(value: DataFrame | DataFrame | Series | Series | Table | Dataset, name: str | None = None, prio: int | None = None)

Bases: File

Class for returning a dataframe value serialized as Excel (xlsx) as a Studio File.

content_type: str = 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'
value: DataFrame | DataFrame | Series | Series | Table | Dataset
class XML(value: str, name: str | None = None, prio: int | None = None)

Bases: File

Class for returning a string value as an XML Studio File.

content_type: str = 'application/xml;charset=UTF-8'
value: str
class workflow_utils.input_output.Package(value: PackageInfo, presigned_base_url: str)

Bases: object

Class generated for a Package Input.

By contrast, package outputs just take a list of items to add.

value: PackageInfo

iter() Iterator[PackageItem]
presigned_base_url: str
value: PackageInfo
class workflow_utils.input_output.PackageItem(*, order: int, position: int, prio: int | None, type: str, name: str, version_number: int, size: int, mime_type: str, _uri_rest: str, _uri_flight: str, _presigned_id: str, _item_id: str)

Bases: object

load() File | Dataset | None
mime_type: str
name: str
order: int
position: int
prio: int | None
size: int
type: str
version_number: int
class workflow_utils.input_output.PackageUploader(parameter: int, field_name: str, wio: workflow_utils.workflow_io.WorkflowIo)

Bases: object

field_name: str
parameter: int
upload_dataset(value: Any, name: str | None = None, prio: int | None = None, format: str | None = None) bool
upload_file(value: Any, name: str | None = None, prio: int | None = None, format: str | None = None) bool
wio: WorkflowIo
class workflow_utils.input_output.Upload

Bases: object

class Dataset(parameter: int, field_name: str, wio: workflow_utils.workflow_io.WorkflowIo)

Bases: PackageUploader

upload(value)
class Datasets(parameter: int, field_name: str, wio: workflow_utils.workflow_io.WorkflowIo)

Bases: PackageUploader

upload(value, name: str | None = None, prio: int | None = None)
class File(parameter: int, field_name: str, wio: workflow_utils.workflow_io.WorkflowIo)

Bases: PackageUploader

upload(value)
class Files(parameter: int, field_name: str, wio: workflow_utils.workflow_io.WorkflowIo)

Bases: PackageUploader

upload(value, name: str | None = None, prio: int | None = None)