API Calling: Python APIs

API calling for large Python APIs is currently experimental. In particular, we observe that stability decreases as the total number of parameters offered to the LLM grows. Because of this limitation, we recommend benchmarking the stability of the calls using our benchmarking framework. If you're interested in the performance of a specific API / LLM combination, don't hesitate to get in touch.

Generic Python API ingestion

Using Pydantic parsing, we autogenerate API descriptions for tool bindings. While this allows better scaling (given suitable structure of the ingested code, particularly with respect to the docstrings), it offers less control than the manual implementation of API descriptions. For instance, it is much harder to reduce the set of parameters to the essentials.
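Since stability depends on how many parameters are offered to the LLM, it can help to count them before ingesting a module. The following is a minimal sketch using only the standard library; it is not part of BioChatter. The helper `count_offered_parameters` and the `demo` module are hypothetical names, and variadic parameters are skipped by kind here (an assumption of this sketch) rather than by name.

```python
import inspect
from types import ModuleType


def count_offered_parameters(module: ModuleType) -> dict[str, int]:
    """Count the parameters of each public function in a module."""
    counts = {}
    for name, func in inspect.getmembers(module, inspect.isfunction):
        # Private helpers would not be offered as tools
        if name.startswith("_"):
            continue
        params = inspect.signature(func).parameters.values()
        # Ignore *args / **kwargs, which cannot be described as single fields
        counts[name] = sum(
            1 for p in params if p.kind not in (p.VAR_POSITIONAL, p.VAR_KEYWORD)
        )
    return counts


def read_table(path, sep=",", header=True, **kwargs):
    return path


def normalise(data, target_sum=None):
    return data


def _helper(x):
    return x


# Hypothetical stand-in module with two public functions and one private one
demo = ModuleType("demo")
demo.read_table = read_table
demo.normalise = normalise
demo._helper = _helper

counts = count_offered_parameters(demo)
print(counts)                # {'normalise': 2, 'read_table': 3}
print(sum(counts.values()))  # 5
```

A large total is a hint that benchmarking the specific API / LLM combination is worthwhile before relying on the generated tools.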

Module for ingesting any Python module and generating a query builder.

`GenericQueryBuilder`

Bases: BaseQueryBuilder

A class for building a generic query using LLM tools.

The query builder works by ingesting a Python module and generating a list of Pydantic classes for each callable in the module. It then uses these classes to parameterise a query using LLM tool binding.

Source code in biochatter/api_agent/python/generic_agent.py

class GenericQueryBuilder(BaseQueryBuilder):
    """A class for building a generic query using LLM tools.

    The query builder works by ingesting a Python module and generating a list
    of Pydantic classes for each callable in the module. It then uses these
    classes to parameterise a query using LLM tool binding.
    """

    def create_runnable(
        self,
        query_parameters: list["BaseAPIModel"],
        conversation: Conversation,
    ) -> Callable:
        """Create a runnable object for the query builder.

        Args:
        ----
            query_parameters: The list of Pydantic classes to be used for the
                query.

            conversation: The conversation object used for parameterising the
                query.

        Returns:
        -------
            The runnable object for the query builder.

        """
        runnable = conversation.chat.bind_tools(query_parameters, tool_choice="required")
        return runnable | PydanticToolsParser(tools=query_parameters)

    def parameterise_query(
        self,
        question: str,
        prompt: str,
        conversation: "Conversation",
        module: ModuleType,
        generated_classes: list[BaseAPIModel] | None = None,
    ) -> list[BaseAPIModel]:
        """Parameterise tool calls for any Python module.

        Generate a list of parameterised BaseModel instances based on the given
        question, prompt, and BioChatter conversation. Uses a Pydantic model
        to define the API fields.

        Uses LangChain's `bind_tools` method to allow the LLM to parameterise
        the function call, based on the functions available in the module.

        Relies on defined structure and annotation of the passed module.

        Args:
        ----
            question (str): The question to be answered.

            prompt (str): The prompt to be used for the query, instructing the
                LLM of its task and the module context.

            conversation: The conversation object used for parameterising the
                query.

            module: The Python module to be used for the query.

            generated_classes: The list of Pydantic classes to be used for the
                query. If not provided, the classes will be generated from the
                module. Allows for external injection of classes for testing
                purposes.

        Returns:
        -------
            list[BaseAPIModel]: the parameterised query object (Pydantic
                model)

        """
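        # Use the injected classes if provided; otherwise generate them from
        # the module, so that `tools` is always defined.
        tools = generated_classes if generated_classes is not None else generate_pydantic_classes(module)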

        runnable = self.create_runnable(
            conversation=conversation,
            query_parameters=tools,
        )

        query = [
            ("system", prompt),
            ("human", question),
        ]

        return runnable.invoke(query)

`create_runnable(query_parameters, conversation)`

Create a runnable object for the query builder.

query_parameters: The list of Pydantic classes to be used for the
    query.

conversation: The conversation object used for parameterising the
    query.

The runnable object for the query builder.

Source code in biochatter/api_agent/python/generic_agent.py

def create_runnable(
    self,
    query_parameters: list["BaseAPIModel"],
    conversation: Conversation,
) -> Callable:
    """Create a runnable object for the query builder.

    Args:
    ----
        query_parameters: The list of Pydantic classes to be used for the
            query.

        conversation: The conversation object used for parameterising the
            query.

    Returns:
    -------
        The runnable object for the query builder.

    """
    runnable = conversation.chat.bind_tools(query_parameters, tool_choice="required")
    return runnable | PydanticToolsParser(tools=query_parameters)

`parameterise_query(question, prompt, conversation, module, generated_classes=None)`

Parameterise tool calls for any Python module.

Generate a list of parameterised BaseModel instances based on the given question, prompt, and BioChatter conversation. Uses a Pydantic model to define the API fields.

Uses LangChain's `bind_tools` method to allow the LLM to parameterise the function call, based on the functions available in the module.

Relies on defined structure and annotation of the passed module.

question (str): The question to be answered.

prompt (str): The prompt to be used for the query, instructing the
    LLM of its task and the module context.

conversation: The conversation object used for parameterising the
    query.

module: The Python module to be used for the query.

generated_classes: The list of Pydantic classes to be used for the
    query. If not provided, the classes will be generated from the
    module. Allows for external injection of classes for testing
    purposes.

list[BaseAPIModel]: the parameterised query object (Pydantic
    model)

Source code in biochatter/api_agent/python/generic_agent.py

def parameterise_query(
    self,
    question: str,
    prompt: str,
    conversation: "Conversation",
    module: ModuleType,
    generated_classes: list[BaseAPIModel] | None = None,
) -> list[BaseAPIModel]:
    """Parameterise tool calls for any Python module.

    Generate a list of parameterised BaseModel instances based on the given
    question, prompt, and BioChatter conversation. Uses a Pydantic model
    to define the API fields.

    Uses LangChain's `bind_tools` method to allow the LLM to parameterise
    the function call, based on the functions available in the module.

    Relies on defined structure and annotation of the passed module.

    Args:
    ----
        question (str): The question to be answered.

        prompt (str): The prompt to be used for the query, instructing the
            LLM of its task and the module context.

        conversation: The conversation object used for parameterising the
            query.

        module: The Python module to be used for the query.

        generated_classes: The list of Pydantic classes to be used for the
            query. If not provided, the classes will be generated from the
            module. Allows for external injection of classes for testing
            purposes.

    Returns:
    -------
        list[BaseAPIModel]: the parameterised query object (Pydantic
            model)

    """
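    # Use the injected classes if provided; otherwise generate them from
    # the module, so that `tools` is always defined.
    tools = generated_classes if generated_classes is not None else generate_pydantic_classes(module)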

    runnable = self.create_runnable(
        conversation=conversation,
        query_parameters=tools,
    )

    query = [
        ("system", prompt),
        ("human", question),
    ]

    return runnable.invoke(query)

Autogenerate Pydantic classes for each callable.

This module provides a function to generate Pydantic classes for each callable (function/method) in a given module. It reads parameter names and defaults from each function's signature, takes the parameter descriptions from the docstring using docstring-parser, and creates Pydantic models with fields corresponding to the parameters. If a parameter name conflicts with BaseModel attributes, the field is renamed with a `_param` suffix and the original name is kept as its alias.

Examples

>>> import scanpy as sc
>>> generated_classes = generate_pydantic_classes(sc.tl)
>>> for model in generated_classes:
...     print(model.schema())
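The aliasing rule can be illustrated without Pydantic. This is a sketch under stated assumptions: `StandInBase` is a hypothetical stand-in for the model base class, and `resolve_field_names` is an illustrative helper, not a BioChatter function. It applies the same idea as the generator: a clashing parameter gets a `_param` suffix as its field name and keeps the original name as its alias.

```python
class StandInBase:
    """Hypothetical base class standing in for a Pydantic model base."""

    def copy(self):
        return self

    def json(self):
        return "{}"


def resolve_field_names(param_names, base=StandInBase):
    """Map each parameter name to a (field_name, alias) pair.

    The alias is None when the name does not clash with a base attribute.
    """
    reserved = set(dir(base))
    resolved = {}
    for name in param_names:
        if name in reserved:
            # Rename the field, keep the original name as the alias
            resolved[name] = (f"{name}_param", name)
        else:
            resolved[name] = (name, None)
    return resolved


resolved = resolve_field_names(["adata", "copy"])
print(resolved)
# {'adata': ('adata', None), 'copy': ('copy_param', 'copy')}
```

Keeping the original name as the alias means the LLM still sees and fills the parameter under the name the wrapped function expects.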

`generate_pydantic_classes(module)`

Generate Pydantic classes for each callable.

Generates one class for each callable (function/method) in a given module. Extracts parameters from docstrings using docstring-parser. Each generated class has fields corresponding to the parameters of the function. If a parameter name conflicts with BaseModel attributes, it is aliased.

Params:

module : ModuleType
    The Python module from which to extract functions and generate models.

Returns

list[Type[BaseModel]]
    A list of Pydantic model classes corresponding to each function found in `module`.

Notes

- For now, all parameter types are set to `Any` to avoid complications with complex or external classes that are not easily JSON-serializable.
- Optional parameters (those with a None default) are represented as `Optional[Any]`.
- Required parameters (no default) use `...` to indicate that the field is required.
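The default-value rules in these notes can be sketched with the standard library alone. The helper `json_safe_default` and the function `example` are hypothetical names used for illustration; the sketch assumes that strict JSON output is the goal, which is why non-finite floats and `MappingProxyType` defaults are converted.

```python
import inspect
import json
import math
from types import MappingProxyType

REQUIRED = ...  # Ellipsis marks a field without a default as required


def json_safe_default(param: inspect.Parameter):
    """Return a default value that can be serialised to strict JSON."""
    if param.default is inspect.Parameter.empty:
        return REQUIRED
    default = param.default
    # Read-only mappings are not JSON-serialisable; copy them into a dict
    if isinstance(default, MappingProxyType):
        return dict(default)
    # inf, -inf and nan are not valid JSON numbers; store them as strings
    if isinstance(default, float) and not math.isfinite(default):
        return str(default)
    return default


def example(data, threshold=float("inf"), ratio=float("nan"), scale=None,
            opts=MappingProxyType({"a": 1})):
    return data


defaults = {
    name: json_safe_default(param)
    for name, param in inspect.signature(example).parameters.items()
}
print(defaults["data"] is REQUIRED)  # True

# Everything except the required marker now serialises under strict JSON rules
optional = {k: v for k, v in defaults.items() if v is not REQUIRED}
print(json.dumps(optional, allow_nan=False))
# {"threshold": "inf", "ratio": "nan", "scale": null, "opts": {"a": 1}}
```

Note that `math.isfinite` also catches NaN, which an equality or membership test would miss because NaN never compares equal to itself.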

Source code in biochatter/api_agent/python/autogenerate_model.py

def generate_pydantic_classes(module: ModuleType) -> list[type[BaseAPIModel]]:
    """Generate Pydantic classes for each callable.

    Generates one class for each callable (function/method) in a given module.
    Extracts parameters from docstrings using docstring-parser. Each generated
    class has fields corresponding to the parameters of the function. If a
    parameter name conflicts with BaseModel attributes, it is aliased.

    Params:
    -------
    module : ModuleType
        The Python module from which to extract functions and generate models.

    Returns
    -------
    list[Type[BaseModel]]
        A list of Pydantic model classes corresponding to each function found in
            `module`.

    Notes
    -----
    - For now, all parameter types are set to `Any` to avoid complications with
      complex or external classes that are not easily JSON-serializable.
    - Optional parameters (those with a None default) are represented as
      `Optional[Any]`.
    - Required parameters (no default) use `...` to indicate that the field is
      required.

    """
    base_attributes = set(dir(BaseAPIModel))
    classes_list = []

    for name, func in inspect.getmembers(module, inspect.isfunction):
        # Skip private/internal functions (e.g., _something)
        if name.startswith("_"):
            continue

        # Parse docstring for parameter descriptions
        doc = inspect.getdoc(func) or ""
        parsed_doc = parse(doc)
        doc_params = {p.arg_name: p.description or "No description available." for p in parsed_doc.params}

        sig = inspect.signature(func)
        fields = {}

        for param_name, param in sig.parameters.items():
            # Skip *args and **kwargs for now
            if param_name in ("args", "kwargs"):
                continue

            # Fetch docstring description or fallback
            description = doc_params.get(param_name, "No description available.")

            # Determine default value
            # If no default, we use `...` indicating a required field
            if param.default is not inspect.Parameter.empty:
                default_value = param.default

                # Convert MappingProxyType to a dict for JSON compatibility
                if isinstance(default_value, MappingProxyType):
                    default_value = dict(default_value)

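                # Handle non-JSON-compliant float values (inf, -inf, nan) by
                # converting to string. A membership test against a list would
                # miss NaN, which never compares equal to itself.
                if isinstance(default_value, float) and (
                    default_value != default_value  # True only for NaN
                    or default_value in (float("inf"), float("-inf"))
                ):
                    default_value = str(default_value)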
            else:
                default_value = ...  # No default means required

            # For now, all parameter types are Any
            annotation = Any

            # Append the original annotation as a note in the description if
            # available
            if param.annotation is not inspect.Parameter.empty:
                description += f"\nOriginal type annotation: {param.annotation}"

            # If default_value is None, parameter can be Optional
            # If not required, mark as Optional[Any]
            if default_value is None:
                annotation = Any | None

            # Prepare field kwargs
            field_kwargs = {"description": description, "default": default_value}

            # If field name conflicts with BaseModel attributes, alias it
            field_name = param_name
            if param_name in base_attributes:
                alias_name = param_name + "_param"
                field_kwargs["alias"] = param_name
                field_name = alias_name

            fields[field_name] = (annotation, Field(**field_kwargs))

        # Create the Pydantic model

        tl_parameters_model = create_model(
            name,
            **fields,
            __base__=BaseAPIModel,
        )
        classes_list.append(tl_parameters_model)
    return classes_list

Scanpy modules

We manually define the API descriptions for select Scanpy modules.

Module for generating anndata queries using LLM tools.

`AnnDataIOQueryBuilder`

Bases: BaseQueryBuilder

A class for building an AnnDataIO query object.

Source code in biochatter/api_agent/python/anndata_agent.py

class AnnDataIOQueryBuilder(BaseQueryBuilder):
    """A class for building an AnnDataIO query object."""

    def create_runnable(
        self,
        query_parameters: list["BaseAPIModel"],
        conversation: "Conversation",
    ) -> Callable:
        """Create a runnable object for executing queries.

        Create the runnable by binding the tools to the chat model with the
        LangChain `bind_tools` method and parsing the output with
        `PydanticToolsParser`.

        Args:
        ----
            query_parameters: A list of Pydantic data models that specify the
                fields of the API that should be queried.

            conversation: A BioChatter conversation object.

        Returns:
        -------
            A Callable object that can execute the query.

        """
        runnable = conversation.chat.bind_tools(query_parameters, tool_choice="required")
        return runnable | PydanticToolsParser(tools=query_parameters)

    def parameterise_query(
        self,
        question: str,
        conversation: "Conversation",
    ) -> list["BaseModel"]:
        """Generate an AnnDataIOQuery object.

        Generates the object based on the given question and BioChatter
        conversation, using `ANNDATA_IO_QUERY_PROMPT` as the system prompt.
        Uses Pydantic models to define the API fields. Creates a runnable that
        can be invoked on LLMs that are qualified to parameterise functions.

        Args:
        ----
            question (str): The question to be answered.

            conversation: The conversation object used for parameterising the
                AnnDataIOQuery.

        Returns:
        -------
            list[BaseModel]: the parameterised query objects (Pydantic models)

        """
        tools = [
            ReadCSV,
            ReadExcel,
            ReadH5AD,
            ReadHDF,
            ReadLoom,
            ReadMTX,
            ReadText,
            ReadZarr,
            ConcatenateAnnData,
            MapAnnData,
        ]
        runnable = self.create_runnable(
            conversation=conversation,
            query_parameters=tools,
        )
        query = [
            ("system", ANNDATA_IO_QUERY_PROMPT),
            ("human", f"{question}"),
        ]
        return runnable.invoke(
            query,
        )

`create_runnable(query_parameters, conversation)`

Create a runnable object for executing queries.

Create the runnable by binding the tools to the chat model with the LangChain `bind_tools` method and parsing the output with `PydanticToolsParser`.

query_parameters: A list of Pydantic data models that specify the
    fields of the API that should be queried.

conversation: A BioChatter conversation object.

A Callable object that can execute the query.

Source code in biochatter/api_agent/python/anndata_agent.py

def create_runnable(
    self,
    query_parameters: list["BaseAPIModel"],
    conversation: "Conversation",
) -> Callable:
    """Create a runnable object for executing queries.

    Create the runnable by binding the tools to the chat model with the
    LangChain `bind_tools` method and parsing the output with
    `PydanticToolsParser`.

    Args:
    ----
        query_parameters: A list of Pydantic data models that specify the
            fields of the API that should be queried.

        conversation: A BioChatter conversation object.

    Returns:
    -------
        A Callable object that can execute the query.

    """
    runnable = conversation.chat.bind_tools(query_parameters, tool_choice="required")
    return runnable | PydanticToolsParser(tools=query_parameters)

`parameterise_query(question, conversation)`

Generate an AnnDataIOQuery object.

Generates the object based on the given question and BioChatter conversation, using `ANNDATA_IO_QUERY_PROMPT` as the system prompt. Uses Pydantic models to define the API fields. Creates a runnable that can be invoked on LLMs that are qualified to parameterise functions.

question (str): The question to be answered.

conversation: The conversation object used for parameterising the
    AnnDataIOQuery.

list[BaseModel]: the parameterised query objects (Pydantic models)

Source code in biochatter/api_agent/python/anndata_agent.py

def parameterise_query(
    self,
    question: str,
    conversation: "Conversation",
) -> list["BaseModel"]:
    """Generate an AnnDataIOQuery object.

    Generates the object based on the given question and BioChatter
    conversation, using `ANNDATA_IO_QUERY_PROMPT` as the system prompt.
    Uses Pydantic models to define the API fields. Creates a runnable that
    can be invoked on LLMs that are qualified to parameterise functions.

    Args:
    ----
        question (str): The question to be answered.

        conversation: The conversation object used for parameterising the
            AnnDataIOQuery.

    Returns:
    -------
        list[BaseModel]: the parameterised query objects (Pydantic models)

    """
    tools = [
        ReadCSV,
        ReadExcel,
        ReadH5AD,
        ReadHDF,
        ReadLoom,
        ReadMTX,
        ReadText,
        ReadZarr,
        ConcatenateAnnData,
        MapAnnData,
    ]
    runnable = self.create_runnable(
        conversation=conversation,
        query_parameters=tools,
    )
    query = [
        ("system", ANNDATA_IO_QUERY_PROMPT),
        ("human", f"{question}"),
    ]
    return runnable.invoke(
        query,
    )

`ConcatenateAnnData`

Bases: BaseAPIModel

Concatenate AnnData objects along an axis.

Source code in biochatter/api_agent/python/anndata_agent.py

class ConcatenateAnnData(BaseAPIModel):
    """Concatenate AnnData objects along an axis."""

    method_name: str = Field(default="anndata.concat", description="NEVER CHANGE")
    adatas: list | dict = Field(
        ...,
        description=(
            "The objects to be concatenated. "
            "Either a list of AnnData objects or a mapping of keys to AnnData objects."
        ),
    )
    axis: str = Field(
        default="obs",
        description="Axis to concatenate along. Can be 'obs' (0) or 'var' (1). Default is 'obs'.",
    )
    join: str = Field(
        default="inner",
        description="How to align values when concatenating. Options: 'inner' or 'outer'. Default is 'inner'.",
    )
    merge: str | Callable | None = Field(
        default=None,
        description=(
            "How to merge elements not aligned to the concatenated axis. "
            "Strategies include 'same', 'unique', 'first', 'only', or a callable function."
        ),
    )
    uns_merge: str | Callable | None = Field(
        default=None,
        description="How to merge the .uns elements. Uses the same strategies as 'merge'.",
    )
    label: str | None = Field(
        default=None,
        description="Column in axis annotation (.obs or .var) to place batch information. Default is None.",
    )
    keys: list | None = Field(
        default=None,
        description=(
            "Names for each object being concatenated. "
            "Used for column values or appended to the index if 'index_unique' is not None. "
            "Default is None."
        ),
    )
    index_unique: str | None = Field(
        default=None,
        description="Delimiter for making the index unique. When None, original indices are kept.",
    )
    fill_value: Any | None = Field(
        default=None,
        description="Value used to fill missing indices when join='outer'. Default behavior depends on array type.",
    )
    pairwise: bool = Field(
        default=False,
        description="Include pairwise elements along the concatenated dimension. Default is False.",
    )

`MapAnnData`

Bases: BaseAPIModel

Apply mapping functions to elements of AnnData.

Source code in biochatter/api_agent/python/anndata_agent.py

class MapAnnData(BaseAPIModel):
    """Apply mapping functions to elements of AnnData."""

    method_name: str = Field(
        default="anndata.obs|var['annotation_name'].map",
        description=(
            "ALWAYS ALWAYS ALWAYS REPLACE THE anndata BY THE ONE GIVEN BY THE INPUT. "
            "Specifies the AnnData attribute and operation being performed. "
            "For example, 'obs.map' applies a mapping function or dictionary to the specified column in `adata.obs`. "
            "This must always include the AnnData component and the `.map` operation. "
            "Adapt the component (e.g., 'obs', 'var', etc.) to the specific use case."
        ),
    )
    dics: dict | None = Field(default=None, description="Dictionary to map over.")

`ReadCSV`

Bases: BaseAPIModel

Read .csv file.

Source code in biochatter/api_agent/python/anndata_agent.py

class ReadCSV(BaseAPIModel):
    """Read .csv file."""

    method_name: str = Field(default="io.read_csv", description="NEVER CHANGE")
    filename: str = Field(
        default="placeholder.csv",
        description="Path to the .csv file",
    )
    delimiter: str | None = Field(
        None,
        description="Delimiter used in the .csv file",
    )
    first_column_names: bool | None = Field(
        None,
        description="Whether the first column contains names",
    )

`ReadExcel`

Bases: BaseAPIModel

Read .xlsx (Excel) file.

Source code in biochatter/api_agent/python/anndata_agent.py

class ReadExcel(BaseAPIModel):
    """Read .xlsx (Excel) file."""

    method_name: str = Field(default="io.read_excel", description="NEVER CHANGE")
    filename: str = Field(
        default="placeholder.xlsx",
        description="Path to the .xlsx file",
    )
    sheet: str | None = Field(None, description="Sheet name or index to read from")
    dtype: str | None = Field(
        None,
        description="Data type for the resulting dataframe",
    )

`ReadH5AD`

Bases: BaseAPIModel

Read .h5ad-formatted hdf5 file.

Source code in biochatter/api_agent/python/anndata_agent.py

class ReadH5AD(BaseAPIModel):
    """Read .h5ad-formatted hdf5 file."""

    method_name: str = Field(default="io.read_h5ad", description="NEVER CHANGE")
    filename: str = Field(default="dummy.h5ad", description="Path to the .h5ad file")
    backed: str | None = Field(
        default=None,
        description="Mode to access file: None, 'r' for read-only",
    )
    as_sparse: str | None = Field(
        default=None,
        description="Convert to sparse format: 'csr', 'csc', or None",
    )
    as_sparse_fmt: str | None = Field(
        default=None,
        description="Sparse format if converting, e.g., 'csr'",
    )
    index_unique: str | None = Field(
        default=None,
        description="Make index unique by appending suffix if needed",
    )

`ReadHDF`

Bases: BaseAPIModel

Read .h5 (hdf5) file.

Source code in biochatter/api_agent/python/anndata_agent.py

class ReadHDF(BaseAPIModel):
    """Read .h5 (hdf5) file."""

    method_name: str = Field(default="io.read_hdf", description="NEVER CHANGE")
    filename: str = Field(default="placeholder.h5", description="Path to the .h5 file")
    key: str | None = Field(None, description="Group key within the .h5 file")

`ReadLoom`

Bases: BaseAPIModel

Read .loom-formatted hdf5 file.

Source code in biochatter/api_agent/python/anndata_agent.py

class ReadLoom(BaseAPIModel):
    """Read .loom-formatted hdf5 file."""

    method_name: str = Field(default="io.read_loom", description="NEVER CHANGE")
    filename: str = Field(
        default="placeholder.loom",
        description="Path to the .loom file",
    )
    sparse: bool | None = Field(None, description="Whether to read data as sparse")
    cleanup: bool | None = Field(None, description="Clean up invalid entries")
    X_name: str | None = Field(None, description="Name to use for X matrix")
    obs_names: str | None = Field(
        None,
        description="Column to use for observation names",
    )
    var_names: str | None = Field(
        None,
        description="Column to use for variable names",
    )

`ReadMTX`

Bases: BaseAPIModel

Read .mtx file.

Source code in biochatter/api_agent/python/anndata_agent.py

class ReadMTX(BaseAPIModel):
    """Read .mtx file."""

    method_name: str = Field(default="io.read_mtx", description="NEVER CHANGE")
    filename: str = Field(
        default="placeholder.mtx",
        description="Path to the .mtx file",
    )
    dtype: str | None = Field(None, description="Data type for the matrix")

`ReadText`

Bases: BaseAPIModel

Read .txt, .tab, .data (text) file.

Source code in biochatter/api_agent/python/anndata_agent.py

class ReadText(BaseAPIModel):
    """Read .txt, .tab, .data (text) file."""

    method_name: str = Field(default="io.read_text", description="NEVER CHANGE")
    filename: str = Field(
        default="placeholder.txt",
        description="Path to the text file",
    )
    delimiter: str | None = Field(None, description="Delimiter used in the file")
    first_column_names: bool | None = Field(
        None,
        description="Whether the first column contains names",
    )

`ReadZarr`

Bases: BaseAPIModel

Read from a hierarchical Zarr array store.

Source code in biochatter/api_agent/python/anndata_agent.py

class ReadZarr(BaseAPIModel):
    """Read from a hierarchical Zarr array store."""

    method_name: str = Field(default="io.read_zarr", description="NEVER CHANGE")
    filename: str = Field(
        default="placeholder.zarr",
        description="Path or URL to the Zarr store",
    )
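Each of the reader models above carries a fixed `method_name` next to its arguments. One way such a parameterised object could be turned into an actual call is sketched below. This is an assumption for illustration, not BioChatter's execution logic: `ReadCSVStandIn`, `fake_anndata`, `fake_read_csv`, and `call_from_model` are hypothetical names, and a dataclass is used in place of the Pydantic model so that the sketch only needs the standard library.

```python
from dataclasses import asdict, dataclass
from functools import reduce
from types import SimpleNamespace
from typing import Optional


@dataclass
class ReadCSVStandIn:
    """Hypothetical stand-in for a parameterised `ReadCSV` instance."""

    method_name: str = "io.read_csv"
    filename: str = "placeholder.csv"
    delimiter: Optional[str] = None
    first_column_names: Optional[bool] = None


def fake_read_csv(filename, delimiter=",", first_column_names=False):
    """Stand-in for the real reader; it only echoes its arguments."""
    return {
        "filename": filename,
        "delimiter": delimiter,
        "first_column_names": first_column_names,
    }


# Hypothetical namespace standing in for the package that owns `io.read_csv`
fake_anndata = SimpleNamespace(io=SimpleNamespace(read_csv=fake_read_csv))


def call_from_model(root, model):
    """Resolve the dotted `method_name` on `root` and call it."""
    params = asdict(model)
    # Walk the dotted path attribute by attribute, e.g. root.io.read_csv
    target = reduce(getattr, params.pop("method_name").split("."), root)
    # Drop unset (None) fields so the callee's own defaults apply
    kwargs = {key: value for key, value in params.items() if value is not None}
    return target(**kwargs)


result = call_from_model(
    fake_anndata,
    ReadCSVStandIn(filename="counts.csv", delimiter=";"),
)
print(result)
# {'filename': 'counts.csv', 'delimiter': ';', 'first_column_names': False}
```

This only covers plain dotted names; a template such as the `method_name` of `MapAnnData` would need to be filled in before it could be resolved.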

Module for interacting with the scanpy API for plotting (pl).

`ScanpyPlDrawGraphQueryParameters`

Bases: BaseModel

Parameters for querying the Scanpy pl.draw_graph API.

Source code in biochatter/api_agent/python/scanpy_pl_full.py

class ScanpyPlDrawGraphQueryParameters(BaseModel):
    """Parameters for querying the Scanpy `pl.draw_graph` API."""

    method_name: str = Field(
        default="sc.pl.draw_graph",
        description="The name of the method to call.",
    )
    question_uuid: str | None = Field(
        default=None,
        description="Unique identifier for the question.",
    )
    adata: str = Field(
        ...,
        description="Annotated data matrix.",
    )
    color: str | list[str] | None = Field(
        default=None,
        description="Keys for annotations of observations/cells or variables/genes.",
    )
    gene_symbols: str | None = Field(
        default=None,
        description="Column name in `.var` DataFrame that stores gene symbols.",
    )
    use_raw: bool | None = Field(
        default=None,
        description="Use `.raw` attribute of `adata` for coloring with gene expression.",
    )
    sort_order: bool = Field(
        default=True,
        description=(
            "For continuous annotations used as color parameter, "
            "plot data points with higher values on top of others."
        ),
    )
    edges: bool = Field(
        default=False,
        description="Show edges.",
    )
    edges_width: float = Field(
        default=0.1,
        description="Width of edges.",
    )
    edges_color: str | list[float] | list[str] = Field(
        default="grey",
        description="Color of edges.",
    )
    neighbors_key: str | None = Field(
        default=None,
        description="Where to look for neighbors connectivities.",
    )
    arrows: bool = Field(
        default=False,
        description="Show arrows (deprecated in favor of `scvelo.pl.velocity_embedding`).",
    )
    arrows_kwds: dict[str, Any] | None = Field(
        default=None,
        description="Arguments passed to `quiver()`.",
    )
    groups: str | list[str] | None = Field(
        default=None,
        description="Restrict to a few categories in categorical observation annotation.",
    )
    components: str | list[str] | None = Field(
        default=None,
        description="For instance, ['1,2', '2,3']. To plot all available components use components='all'.",
    )
    projection: str = Field(
        default="2d",
        description="Projection of plot.",
    )
    legend_loc: str = Field(
        default="right margin",
        description="Location of legend.",
    )
    legend_fontsize: int | float | str | None = Field(
        default=None,
        description="Numeric size in pt or string describing the size.",
    )
    legend_fontweight: int | str = Field(
        default="bold",
        description="Legend font weight.",
    )
    legend_fontoutline: int | None = Field(
        default=None,
        description="Line width of the legend font outline in pt.",
    )
    colorbar_loc: str | None = Field(
        default="right",
        description="Where to place the colorbar for continuous variables.",
    )
    size: float | list[float] | None = Field(
        default=None,
        description="Point size. If None, is automatically computed as 120000 / n_cells.",
    )
    color_map: str | Any | None = Field(
        default=None,
        description="Color map to use for continuous variables.",
    )
    palette: str | list[str] | Any | None = Field(
        default=None,
        description="Colors to use for plotting categorical annotation groups.",
    )
    na_color: str | tuple[float, ...] = Field(
        default="lightgray",
        description="Color to use for null or masked values.",
    )
    na_in_legend: bool = Field(
        default=True,
        description="If there are missing values, whether they get an entry in the legend.",
    )
    frameon: bool | None = Field(
        default=None,
        description="Draw a frame around the scatter plot.",
    )
    vmin: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="The value representing the lower limit of the color scale.",
    )
    vmax: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="The value representing the upper limit of the color scale.",
    )
    vcenter: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="The value representing the center of the color scale.",
    )
    norm: Any | None = Field(
        default=None,
        description="Normalization for the colormap.",
    )
    add_outline: bool = Field(
        default=False,
        description="Add a thin border around groups of dots.",
    )
    outline_width: tuple[float, ...] = Field(
        default=(0.3, 0.05),
        description="Width of the outline as a fraction of the scatter dot size.",
    )
    outline_color: tuple[str, ...] = Field(
        default=("black", "white"),
        description="Colors for the outline: border color and gap color.",
    )
    ncols: int = Field(
        default=4,
        description="Number of panels per row.",
    )
    hspace: float = Field(
        default=0.25,
        description="Height of the space between multiple panels.",
    )
    wspace: float | None = Field(
        default=None,
        description="Width of the space between multiple panels.",
    )
    return_fig: bool | None = Field(
        default=None,
        description="Return the matplotlib figure.",
    )
    show: bool | None = Field(
        default=None,
        description="Show the plot; do not return axis.",
    )
    save: str | bool | None = Field(
        default=None,
        description="If `True` or a `str`, save the figure.",
    )
    ax: Any | None = Field(
        default=None,
        description="A matplotlib axes object.",
    )
    layout: str | None = Field(
        default=None,
        description="One of the `draw_graph()` layouts.",
    )
    kwargs: dict[str, Any] | None = Field(
        default=None,
        description="Additional arguments passed to `matplotlib.pyplot.scatter()`.",
    )

`ScanpyPlPcaQueryParameters`

Bases: BaseModel

Parameters for querying the scanpy pl.pca API.

Source code in biochatter/api_agent/python/scanpy_pl_full.py

class ScanpyPlPcaQueryParameters(BaseModel):
    """Parameters for querying the scanpy `pl.pca` API."""

    method_name: str = Field(
        default="sc.pl.pca",
        description="The name of the method to call.",
    )
    question_uuid: str | None = Field(
        default=None,
        description="Unique identifier for the question.",
    )
    adata: str = Field(
        ...,
        description="Annotated data matrix.",
    )
    color: str | list[str] | None = Field(
        default=None,
        description="Keys for annotations of observations/cells or variables/genes.",
    )
    components: str | list[str] = Field(
        default="1,2",
        description="For example, ['1,2', '2,3']. To plot all available components use 'all'.",
    )
    projection: str = Field(
        default="2d",
        description="Projection of plot.",
    )
    legend_loc: str = Field(
        default="right margin",
        description="Location of legend.",
    )
    legend_fontsize: int | float | str | None = Field(
        default=None,
        description="Font size for legend.",
    )
    legend_fontweight: int | str | None = Field(
        default=None,
        description="Font weight for legend.",
    )
    color_map: str | None = Field(
        default=None,
        description="String denoting matplotlib color map.",
    )
    palette: str | list[str] | dict | None = Field(
        default=None,
        description="Colors to use for plotting categorical annotation groups.",
    )
    frameon: bool | None = Field(
        default=None,
        description="Draw a frame around the scatter plot.",
    )
    size: int | float | None = Field(
        default=None,
        description="Point size. If `None`, is automatically computed as 120000 / n_cells.",
    )
    show: bool | None = Field(
        default=None,
        description="Show the plot, do not return axis.",
    )
    save: str | bool | None = Field(
        default=None,
        description="If `True` or a `str`, save the figure.",
    )
    ax: str | None = Field(
        default=None,
        description="A matplotlib axes object.",
    )
    return_fig: bool = Field(
        default=False,
        description="Return the matplotlib figure object.",
    )
    marker: str | None = Field(
        default=".",
        description="Marker symbol.",
    )
    annotate_var_explained: bool = Field(
        default=False,
        description="Annotate the percentage of explained variance.",
    )
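
Each parameter class carries two bookkeeping fields, `method_name` and `question_uuid`, next to the actual plotting arguments. The following is a minimal sketch of how a consumer might split a parameterised model into a method name and keyword arguments. It assumes the model has already been converted to a plain dictionary (for instance via Pydantic's `model_dump()`); the helper `to_call_kwargs` and the example values are hypothetical and not part of BioChatter.

```python
from typing import Any

# Bookkeeping fields that describe the call rather than parameterise it.
META_FIELDS = {"method_name", "question_uuid"}


def to_call_kwargs(params: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    """Split a dumped parameter model into (method name, keyword arguments).

    Fields left at `None` are dropped so that the target function can fall
    back to its own defaults. Hypothetical helper, for illustration only.
    """
    method_name = params["method_name"]
    kwargs = {
        key: value
        for key, value in params.items()
        if key not in META_FIELDS and value is not None
    }
    return method_name, kwargs


# Stand-in for the dictionary form of a ScanpyPlPcaQueryParameters instance
# (only a few of its fields are shown).
dumped = {
    "method_name": "sc.pl.pca",
    "question_uuid": None,
    "adata": "adata",
    "color": "louvain",
    "components": "1,2",
    "color_map": None,
    "annotate_var_explained": False,
}

method, kwargs = to_call_kwargs(dumped)
print(method, kwargs)
```

Note that `False` is kept while `None` is dropped: only unset optional fields are filtered out, not falsy values in general.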

`ScanpyPlQueryBuilder`

Bases: BaseQueryBuilder

A class for building a Scanpy plotting (pl) query object.

Source code in biochatter/api_agent/python/scanpy_pl_full.py

class ScanpyPlQueryBuilder(BaseQueryBuilder):
    """A class for building a Scanpy plotting (`pl`) query object."""

    def create_runnable(
        self,
        query_parameters: list["BaseAPIModel"],
        conversation: "Conversation",
    ) -> Callable:
        """Create a runnable object for executing queries.

        Create the runnable by binding the Pydantic tool classes to the chat
        model with `bind_tools` and piping the result into a
        `PydanticToolsParser`.

        Args:
        ----
            query_parameters: A list of Pydantic data models that specify the
                fields of the API that should be queried.

            conversation: A BioChatter conversation object.

        Returns:
        -------
            A Callable object that can execute the query.

        """
        runnable = conversation.chat.bind_tools(query_parameters)
        return runnable | PydanticToolsParser(tools=query_parameters)

    def parameterise_query(
        self,
        question: str,
        conversation: "Conversation",
    ) -> list["BaseModel"]:
        """Generate parameterised Scanpy plotting queries.

        Generates the queries based on the given question and BioChatter
        conversation. Uses Pydantic models to define the API fields. Creates a
        runnable that can be invoked on LLMs that are qualified to
        parameterise functions.

        Args:
        ----
            question (str): The question to be answered.

            conversation: The conversation object used for parameterising the
                query.

        Returns:
        -------
            list[BaseModel]: the parameterised query objects (Pydantic
                models), one per tool call made by the LLM.

        """
        tools = [
            ScanpyPlScatterQueryParameters,
            ScanpyPlPcaQueryParameters,
            ScanpyPlTsneQueryParameters,
            ScanpyPlUmapQueryParameters,
            ScanpyPlDrawGraphQueryParameters,
            ScanpyPlSpatialQueryParameters,
        ]
        runnable = self.create_runnable(conversation=conversation, query_parameters=tools)
        return runnable.invoke(question)

`create_runnable(query_parameters, conversation)`

Create a runnable object for executing queries.

Create the runnable by binding the Pydantic tool classes to the chat model with bind_tools and piping the result into a PydanticToolsParser.

Parameters:

query_parameters: A list of Pydantic data models that specify the fields of
    the API that should be queried.

conversation: A BioChatter conversation object.

Returns:

A Callable object that can execute the query.

Source code in biochatter/api_agent/python/scanpy_pl_full.py

def create_runnable(
    self,
    query_parameters: list["BaseAPIModel"],
    conversation: "Conversation",
) -> Callable:
    """Create a runnable object for executing queries.

    Create the runnable by binding the Pydantic tool classes to the chat
    model with `bind_tools` and piping the result into a
    `PydanticToolsParser`.

    Args:
    ----
        query_parameters: A list of Pydantic data models that specify the
            fields of the API that should be queried.

        conversation: A BioChatter conversation object.

    Returns:
    -------
        A Callable object that can execute the query.

    """
    runnable = conversation.chat.bind_tools(query_parameters)
    return runnable | PydanticToolsParser(tools=query_parameters)
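
The `|` in the return statement chains two steps: the tool-bound chat model produces raw tool calls, and the parser turns them into Pydantic objects. The sketch below illustrates only this composition pattern with a toy class. `Step`, `fake_model` and `fake_parser` are hypothetical stand-ins, and LangChain's real runnables do considerably more than this.

```python
from typing import Any, Callable


class Step:
    """Toy stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # `a | b` yields a new step that feeds the output of `a` into `b`.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


# Hypothetical first stage: pretends to be a chat model with tools bound and
# answers every question with a single raw tool call.
fake_model = Step(
    lambda question: [
        {"name": "ScanpyPlPcaQueryParameters", "args": {"adata": "adata"}}
    ]
)

# Hypothetical second stage: pretends to be the parser and extracts the
# arguments of each tool call.
fake_parser = Step(lambda tool_calls: [call["args"] for call in tool_calls])

chain = fake_model | fake_parser
result = chain.invoke("Plot a PCA of adata.")
print(result)
```

The composed object is itself invocable, which is why `create_runnable` can return it as a single `Callable`.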

`parameterise_query(question, conversation)`

Generate parameterised Scanpy plotting queries.

Generates the queries based on the given question and BioChatter conversation. Uses Pydantic models to define the API fields. Creates a runnable that can be invoked on LLMs that are qualified to parameterise functions.

Parameters:

question (str): The question to be answered.

conversation: The conversation object used for parameterising the
    query.

Returns:

list[BaseModel]: the parameterised query objects (Pydantic models), one per tool call made by the LLM.

Source code in biochatter/api_agent/python/scanpy_pl_full.py

def parameterise_query(
    self,
    question: str,
    conversation: "Conversation",
) -> list["BaseModel"]:
    """Generate parameterised Scanpy plotting queries.

    Generates the queries based on the given question and BioChatter
    conversation. Uses Pydantic models to define the API fields. Creates a
    runnable that can be invoked on LLMs that are qualified to
    parameterise functions.

    Args:
    ----
        question (str): The question to be answered.

        conversation: The conversation object used for parameterising the
            query.

    Returns:
    -------
        list[BaseModel]: the parameterised query objects (Pydantic
            models), one per tool call made by the LLM.

    """
    tools = [
        ScanpyPlScatterQueryParameters,
        ScanpyPlPcaQueryParameters,
        ScanpyPlTsneQueryParameters,
        ScanpyPlUmapQueryParameters,
        ScanpyPlDrawGraphQueryParameters,
        ScanpyPlSpatialQueryParameters,
    ]
    runnable = self.create_runnable(conversation=conversation, query_parameters=tools)
    return runnable.invoke(question)
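
Because several tool classes are offered at once, the LLM chooses which plotting function fits the question, and the result is a list of parameter objects. The sketch below shows the idea of mapping raw tool calls to the matching class by name. It uses stdlib dataclasses as simplified, hypothetical stand-ins for the Pydantic models, and `parse_tool_calls` is an illustrative helper rather than the actual parser implementation.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class PcaParams:
    """Simplified, hypothetical stand-in for ScanpyPlPcaQueryParameters."""

    adata: str
    color: Optional[str] = None
    method_name: str = "sc.pl.pca"


@dataclass
class UmapParams:
    """Simplified, hypothetical stand-in for ScanpyPlUmapQueryParameters."""

    adata: str
    color: Optional[str] = None
    method_name: str = "sc.pl.umap"


def parse_tool_calls(tool_calls: list[dict[str, Any]], tools: list[type]) -> list[Any]:
    """Instantiate the tool class named in each call with that call's arguments."""
    registry = {tool.__name__: tool for tool in tools}
    return [registry[call["name"]](**call["args"]) for call in tool_calls]


# Hypothetical raw tool call, as an LLM might emit it for the question
# "Show a UMAP coloured by cell type".
raw_calls = [{"name": "UmapParams", "args": {"adata": "adata", "color": "cell_type"}}]

queries = parse_tool_calls(raw_calls, tools=[PcaParams, UmapParams])
print(queries[0].method_name, queries[0].color)
```

The `method_name` default on each class is what lets downstream code recover which plotting function was selected without inspecting the class itself.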

`ScanpyPlScatterQueryParameters`

Bases: BaseModel

Parameters for querying the scanpy pl.scatter API.

Source code in biochatter/api_agent/python/scanpy_pl_full.py

class ScanpyPlScatterQueryParameters(BaseModel):
    """Parameters for querying the scanpy `pl.scatter` API."""

    method_name: str = Field(
        default="sc.pl.scatter",
        description="The name of the method to call.",
    )
    question_uuid: str = Field(
        default_factory=lambda: str(uuid.uuid4()),
        description="Unique identifier for the question.",
    )
    adata: str = Field(description="Annotated data matrix.")
    x: str | None = Field(default=None, description="x coordinate.")
    y: str | None = Field(default=None, description="y coordinate.")
    color: str | tuple[float, ...] | list[str | tuple[float, ...]] | None = Field(
        default=None,
        description="Keys for annotations of observations/cells or variables/genes, or a hex color specification.",
    )
    use_raw: bool | None = Field(
        default=None,
        description="Whether to use raw attribute of adata. Defaults to True if .raw is present.",
    )
    layers: str | list[str] | None = Field(
        default=None,
        description="Layer(s) to use from adata's layers attribute.",
    )
    basis: str | None = Field(
        default=None,
        description="String that denotes a plotting tool that computed coordinates (e.g., 'pca', 'tsne', 'umap').",
    )
    sort_order: bool = Field(
        default=True,
        description="For continuous annotations used as color parameter, plot data points with higher values on top.",
    )
    groups: str | list[str] | None = Field(
        default=None,
        description="Restrict to specific categories in categorical observation annotation.",
    )
    projection: str = Field(
        default="2d",
        description="Projection of plot ('2d' or '3d').",
    )
    legend_loc: str | None = Field(
        default="right margin",
        description="Location of legend ('none', 'right margin', 'on data', etc.).",
    )
    size: int | float | None = Field(
        default=None,
        description="Point size. If None, automatically computed as 120000 / n_cells.",
    )
    color_map: str | None = Field(
        default=None,
        description="Color map to use for continuous variables (e.g., 'magma', 'viridis').",
    )
    show: bool | None = Field(
        default=None,
        description="Show the plot, do not return axis.",
    )
    save: str | bool | None = Field(
        default=None,
        description="If True or a str, save the figure. String is appended to default filename.",
    )
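
Unlike the other parameter classes in this module, `ScanpyPlScatterQueryParameters` fills `question_uuid` through a `default_factory`, so every instance receives a fresh identifier instead of `None`. The sketch below demonstrates the per-instance behaviour with a stdlib dataclass, whose `field(default_factory=...)` works analogously for this purpose; `ScatterParamsSketch` is a hypothetical, heavily trimmed stand-in for the real model.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ScatterParamsSketch:
    """Trimmed, hypothetical stand-in for ScanpyPlScatterQueryParameters."""

    adata: str
    # The factory is called once per instance, not once at class definition.
    question_uuid: str = field(default_factory=lambda: str(uuid.uuid4()))


first = ScatterParamsSketch(adata="adata")
second = ScatterParamsSketch(adata="adata")

# Two instances with identical arguments still get distinct identifiers.
print(first.question_uuid != second.question_uuid)
```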

`ScanpyPlSpatialQueryParameters`

Bases: BaseModel

Parameters for querying the Scanpy pl.spatial API.

Source code in biochatter/api_agent/python/scanpy_pl_full.py

class ScanpyPlSpatialQueryParameters(BaseModel):
    """Parameters for querying the Scanpy `pl.spatial` API."""

    method_name: str = Field(
        default="sc.pl.spatial",
        description="The name of the method to call.",
    )
    question_uuid: str | None = Field(
        default=None,
        description="Unique identifier for the question.",
    )
    adata: str = Field(
        ...,
        description="Annotated data matrix.",
    )
    color: str | list[str] | None = Field(
        default=None,
        description="Keys for annotations of observations/cells or variables/genes.",
    )
    gene_symbols: str | None = Field(
        default=None,
        description="Column name in `.var` DataFrame that stores gene symbols.",
    )
    use_raw: bool | None = Field(
        default=None,
        description="Use `.raw` attribute of `adata` for coloring with gene expression.",
    )
    layer: str | None = Field(
        default=None,
        description="Name of the AnnData object layer to plot.",
    )
    library_id: str | None = Field(
        default=None,
        description="Library ID for Visium data, e.g., key in `adata.uns['spatial']`.",
    )
    img_key: str | None = Field(
        default=None,
        description=(
            "Key for image data, used to get `img` and `scale_factor` from "
            "'images' and 'scalefactors' entries for this library."
        ),
    )
    img: Any | None = Field(
        default=None,
        description="Image data to plot, overrides `img_key`.",
    )
    scale_factor: float | None = Field(
        default=None,
        description="Scaling factor used to map from coordinate space to pixel space.",
    )
    spot_size: float | None = Field(
        default=None,
        description="Diameter of spot (in coordinate space) for each point.",
    )
    crop_coord: tuple[int, ...] | None = Field(
        default=None,
        description="Coordinates to use for cropping the image (left, right, top, bottom).",
    )
    alpha_img: float = Field(
        default=1.0,
        description="Alpha value for image.",
    )
    bw: bool = Field(
        default=False,
        description="Plot image data in grayscale.",
    )
    sort_order: bool = Field(
        default=True,
        description=(
            "For continuous annotations used as color parameter, plot data points "
            "with higher values on top of others."
        ),
    )
    groups: str | list[str] | None = Field(
        default=None,
        description="Restrict to specific categories in categorical observation annotation.",
    )
    components: str | list[str] | None = Field(
        default=None,
        description="For example, ['1,2', '2,3']. To plot all available components, use 'all'.",
    )
    projection: str = Field(
        default="2d",
        description="Projection of plot.",
    )
    legend_loc: str = Field(
        default="right margin",
        description="Location of legend.",
    )
    legend_fontsize: int | float | str | None = Field(
        default=None,
        description="Numeric size in pt or string describing the size.",
    )
    legend_fontweight: int | str = Field(
        default="bold",
        description="Legend font weight.",
    )
    legend_fontoutline: int | None = Field(
        default=None,
        description="Line width of the legend font outline in pt.",
    )
    colorbar_loc: str | None = Field(
        default="right",
        description="Where to place the colorbar for continuous variables.",
    )
    size: float = Field(
        default=1.0,
        description="Point size.",
    )
    color_map: str | Any | None = Field(
        default=None,
        description="Color map to use for continuous variables.",
    )
    palette: str | list[str] | Any | None = Field(
        default=None,
        description="Colors to use for plotting categorical annotation groups.",
    )
    na_color: str | tuple[float, ...] | None = Field(
        default=None,
        description="Color to use for null or masked values.",
    )
    na_in_legend: bool = Field(
        default=True,
        description="If there are missing values, whether they get an entry in the legend.",
    )
    frameon: bool | None = Field(
        default=None,
        description="Draw a frame around the scatter plot.",
    )
    vmin: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="The value representing the lower limit of the color scale.",
    )
    vmax: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="The value representing the upper limit of the color scale.",
    )
    vcenter: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="The value representing the center of the color scale.",
    )
    norm: Any | None = Field(
        default=None,
        description="Normalization for the colormap.",
    )
    add_outline: bool = Field(
        default=False,
        description="Add a thin border around groups of dots.",
    )
    outline_width: tuple[float, ...] = Field(
        default=(0.3, 0.05),
        description="Width of the outline as a fraction of the scatter dot size.",
    )
    outline_color: tuple[str, ...] = Field(
        default=("black", "white"),
        description="Colors for the outline: border color and gap color.",
    )
    ncols: int = Field(
        default=4,
        description="Number of panels per row.",
    )
    hspace: float = Field(
        default=0.25,
        description="Height of the space between multiple panels.",
    )
    wspace: float | None = Field(
        default=None,
        description="Width of the space between multiple panels.",
    )
    return_fig: bool | None = Field(
        default=None,
        description="Return the matplotlib figure.",
    )
    show: bool | None = Field(
        default=None,
        description="Show the plot; do not return axis.",
    )
    save: str | bool | None = Field(
        default=None,
        description="If `True` or a `str`, save the figure.",
    )
    ax: Any | None = Field(
        default=None,
        description="A matplotlib axes object.",
    )
    kwargs: dict[str, Any] | None = Field(
        default=None,
        description="Additional arguments passed to `matplotlib.pyplot.scatter()`.",
    )

`ScanpyPlTsneQueryParameters`

Bases: BaseModel

Parameters for querying the Scanpy pl.tsne API.

Source code in biochatter/api_agent/python/scanpy_pl_full.py

class ScanpyPlTsneQueryParameters(BaseModel):
    """Parameters for querying the Scanpy `pl.tsne` API."""

    method_name: str = Field(
        default="sc.pl.tsne",
        description="The name of the method to call.",
    )
    question_uuid: str | None = Field(
        default=None,
        description="Unique identifier for the question.",
    )
    adata: str = Field(
        ...,
        description="Annotated data matrix.",
    )
    color: str | list[str] | None = Field(
        default=None,
        description="Keys for annotations of observations/cells or variables/genes.",
    )
    gene_symbols: str | None = Field(
        default=None,
        description="Column name in `.var` DataFrame that stores gene symbols.",
    )
    use_raw: bool | None = Field(
        default=None,
        description="Use `.raw` attribute of `adata` for coloring with gene expression.",
    )
    sort_order: bool = Field(
        default=True,
        description="Plot data points with higher values on top for continuous annotations.",
    )
    edges: bool = Field(
        default=False,
        description="Show edges.",
    )
    edges_width: float = Field(
        default=0.1,
        description="Width of edges.",
    )
    edges_color: str | list[float] | list[str] = Field(
        default="grey",
        description="Color of edges.",
    )
    neighbors_key: str | None = Field(
        default=None,
        description="Key for neighbors connectivities.",
    )
    arrows: bool = Field(
        default=False,
        description="Show arrows (deprecated in favor of `scvelo.pl.velocity_embedding`).",
    )
    arrows_kwds: dict[str, Any] | None = Field(
        default=None,
        description="Arguments passed to `quiver()`.",
    )
    groups: str | None = Field(
        default=None,
        description="Restrict to specific categories in categorical observation annotation.",
    )
    components: str | list[str] | None = Field(
        default=None,
        description="Components to plot, e.g., ['1,2', '2,3']. Use 'all' to plot all available components.",
    )
    projection: str = Field(
        default="2d",
        description="Projection of plot ('2d' or '3d').",
    )
    legend_loc: str = Field(
        default="right margin",
        description="Location of legend.",
    )
    legend_fontsize: int | float | str | None = Field(
        default=None,
        description="Font size for legend.",
    )
    legend_fontweight: int | str = Field(
        default="bold",
        description="Font weight for legend.",
    )
    legend_fontoutline: int | None = Field(
        default=None,
        description="Line width of the legend font outline in pt.",
    )
    size: float | list[float] | None = Field(
        default=None,
        description="Point size. If `None`, computed as 120000 / n_cells.",
    )
    color_map: str | Any | None = Field(
        default=None,
        description="Color map for continuous variables.",
    )
    palette: str | list[str] | Any | None = Field(
        default=None,
        description="Colors for plotting categorical annotation groups.",
    )
    na_color: str | tuple[float, ...] = Field(
        default="lightgray",
        description="Color for null or masked values.",
    )
    na_in_legend: bool = Field(
        default=True,
        description="Include missing values in the legend.",
    )
    frameon: bool | None = Field(
        default=None,
        description="Draw a frame around the scatter plot.",
    )
    vmin: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="Lower limit of the color scale.",
    )
    vmax: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="Upper limit of the color scale.",
    )
    vcenter: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="Center of the color scale, useful for diverging colormaps.",
    )
    norm: Any | None = Field(
        default=None,
        description="Normalization for the colormap.",
    )
    add_outline: bool = Field(
        default=False,
        description="Add a thin border around groups of dots.",
    )
    outline_width: tuple[float, ...] = Field(
        default=(0.3, 0.05),
        description="Width of the outline as a fraction of the scatter dot size.",
    )
    outline_color: tuple[str, ...] = Field(
        default=("black", "white"),
        description="Colors for the outline: border color and gap color.",
    )
    ncols: int = Field(
        default=4,
        description="Number of panels per row.",
    )
    hspace: float = Field(
        default=0.25,
        description="Height of the space between multiple panels.",
    )
    wspace: float | None = Field(
        default=None,
        description="Width of the space between multiple panels.",
    )
    return_fig: bool | None = Field(
        default=None,
        description="Return the matplotlib figure.",
    )
    show: bool | None = Field(
        default=None,
        description="Show the plot; do not return axis.",
    )
    save: str | bool | None = Field(
        default=None,
        description="If `True` or a `str`, save the figure.",
    )
    ax: Any | None = Field(
        default=None,
        description="A matplotlib axes object.",
    )
    kwargs: dict[str, Any] | None = Field(
        default=None,
        description="Additional arguments passed to `matplotlib.pyplot.scatter()`.",
    )

`ScanpyPlUmapQueryParameters`

Bases: BaseModel

Parameters for querying the Scanpy pl.umap API.

Source code in biochatter/api_agent/python/scanpy_pl_full.py

class ScanpyPlUmapQueryParameters(BaseModel):
    """Parameters for querying the Scanpy `pl.umap` API."""

    method_name: str = Field(
        default="sc.pl.umap",
        description="The name of the method to call.",
    )
    question_uuid: str | None = Field(
        default=None,
        description="Unique identifier for the question.",
    )
    adata: str = Field(
        ...,
        description="Annotated data matrix.",
    )
    color: str | list[str] | None = Field(
        default=None,
        description="Keys for annotations of observations/cells or variables/genes.",
    )
    mask_obs: str | None = Field(
        default=None,
        description="Mask for observations.",
    )
    gene_symbols: str | None = Field(
        default=None,
        description="Column name in `.var` DataFrame that stores gene symbols.",
    )
    use_raw: bool | None = Field(
        default=None,
        description="Use `.raw` attribute of `adata` for coloring with gene expression.",
    )
    sort_order: bool = Field(
        default=True,
        description="Plot data points with higher values on top for continuous annotations.",
    )
    edges: bool = Field(
        default=False,
        description="Show edges.",
    )
    edges_width: float = Field(
        default=0.1,
        description="Width of edges.",
    )
    edges_color: str | list[float] | list[str] = Field(
        default="grey",
        description="Color of edges.",
    )
    neighbors_key: str | None = Field(
        default=None,
        description="Key for neighbors connectivities.",
    )
    arrows: bool = Field(
        default=False,
        description="Show arrows (deprecated in favor of `scvelo.pl.velocity_embedding`).",
    )
    arrows_kwds: dict[str, Any] | None = Field(
        default=None,
        description="Arguments passed to `quiver()`.",
    )
    groups: str | None = Field(
        default=None,
        description="Restrict to specific categories in categorical observation annotation.",
    )
    components: str | list[str] | None = Field(
        default=None,
        description="Components to plot, e.g., ['1,2', '2,3']. Use 'all' to plot all available components.",
    )
    dimensions: int | None = Field(
        default=None,
        description="Number of dimensions to plot.",
    )
    layer: str | None = Field(
        default=None,
        description="Name of the AnnData object layer to plot.",
    )
    projection: str = Field(
        default="2d",
        description="Projection of plot ('2d' or '3d').",
    )
    scale_factor: float | None = Field(
        default=None,
        description="Scale factor for the plot.",
    )
    color_map: str | Any | None = Field(
        default=None,
        description="Color map for continuous variables.",
    )
    cmap: str | Any | None = Field(
        default=None,
        description="Alias for `color_map`.",
    )
    palette: str | list[str] | Any | None = Field(
        default=None,
        description="Colors for plotting categorical annotation groups.",
    )
    na_color: str | tuple[float, ...] = Field(
        default="lightgray",
        description="Color for null or masked values.",
    )
    na_in_legend: bool = Field(
        default=True,
        description="Include missing values in the legend.",
    )
    size: float | list[float] | None = Field(
        default=None,
        description="Point size. If `None`, computed as 120000 / n_cells.",
    )
    frameon: bool | None = Field(
        default=None,
        description="Draw a frame around the scatter plot.",
    )
    legend_fontsize: int | float | str | None = Field(
        default=None,
        description="Font size for legend.",
    )
    legend_fontweight: int | str = Field(
        default="bold",
        description="Font weight for legend.",
    )
    legend_loc: str = Field(
        default="right margin",
        description="Location of legend.",
    )
    legend_fontoutline: int | None = Field(
        default=None,
        description="Line width of the legend font outline in pt.",
    )
    colorbar_loc: str = Field(
        default="right",
        description="Location of the colorbar.",
    )
    vmax: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="Upper limit of the color scale.",
    )
    vmin: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="Lower limit of the color scale.",
    )
    vcenter: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="Center of the color scale, useful for diverging colormaps.",
    )
    norm: Any | None = Field(
        default=None,
        description="Normalization for the colormap.",
    )
    add_outline: bool = Field(
        default=False,
        description="Add a thin border around groups of dots.",
    )
    outline_width: tuple[float, ...] = Field(
        default=(0.3, 0.05),
        description="Width of the outline as a fraction of the scatter dot size.",
    )
    outline_color: tuple[str, ...] = Field(
        default=("black", "white"),
        description="Colors for the outline: border color and gap color.",
    )
    ncols: int = Field(
        default=4,
        description="Number of panels per row.",
    )
    hspace: float = Field(
        default=0.25,
        description="Height of the space between multiple panels.",
    )
    wspace: float | None = Field(
        default=None,
        description="Width of the space between multiple panels.",
    )
    show: bool | None = Field(
        default=None,
        description="Show the plot; do not return axis.",
    )
    save: str | bool | None = Field(
        default=None,
        description="If `True` or a `str`, save the figure.",
    )
    ax: Any | None = Field(
        default=None,
        description="A matplotlib axes object.",
    )
    return_fig: bool | None = Field(
        default=None,
        description="Return the matplotlib figure.",
    )
    marker: str = Field(
        default=".",
        description="Marker symbol.",
    )
    kwargs: dict[str, Any] | None = Field(
        default=None,
        description="Additional arguments passed to `matplotlib.pyplot.scatter()`.",
    )

Module for interacting with the scanpy API for plotting (pl).

`ScanpyPlDrawGraphQueryParameters`

Bases: BaseModel

Parameters for querying the Scanpy pl.draw_graph API.

Source code in biochatter/api_agent/python/scanpy_pl_reduced.py

class ScanpyPlDrawGraphQueryParameters(BaseModel):
    """Parameters for querying the Scanpy `pl.draw_graph` API."""

    method_name: str = Field(
        default="sc.pl.draw_graph",
        description="The name of the method to call.",
    )
    question_uuid: str | None = Field(
        default=None,
        description="Unique identifier for the question.",
    )
    adata: str = Field(
        ...,
        description="Annotated data matrix.",
    )
    color: str | list[str] | None = Field(
        default=None,
        description="Keys for annotations of observations/cells or variables/genes.",
    )
    gene_symbols: str | None = Field(
        default=None,
        description="Column name in `.var` DataFrame that stores gene symbols.",
    )
    color_map: str | Any | None = Field(
        default=None,
        description="Color map to use for continuous variables.",
    )
    palette: str | list[str] | Any | None = Field(
        default=None,
        description="Colors to use for plotting categorical annotation groups.",
    )
    vmin: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="The value representing the lower limit of the color scale.",
    )
    vmax: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="The value representing the upper limit of the color scale.",
    )
    vcenter: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="The value representing the center of the color scale.",
    )

`ScanpyPlPcaQueryParameters`

Bases: BaseModel

Parameters for querying the scanpy pl.pca API.

Source code in biochatter/api_agent/python/scanpy_pl_reduced.py

class ScanpyPlPcaQueryParameters(BaseModel):
    """Parameters for querying the scanpy `pl.pca` API."""

    method_name: str = Field(
        default="sc.pl.pca",
        description="The name of the method to call.",
    )
    question_uuid: str | None = Field(
        default=None,
        description="Unique identifier for the question.",
    )
    adata: str = Field(
        ...,
        description="Annotated data matrix.",
    )
    color: str | list[str] | None = Field(
        default=None,
        description="Keys for annotations of observations/cells or variables/genes.",
    )
    color_map: str | None = Field(
        default=None,
        description="String denoting matplotlib color map.",
    )
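
As a usage illustration, the sketch below redefines a trimmed copy of this model so that it runs without BioChatter installed; in practice the class would be imported from `biochatter.api_agent.python.scanpy_pl_reduced`. The class name `PcaParams` and the value `"cell_type"` are illustrative assumptions. Only `adata` is required, and fields left at `None` can be dropped when serialising.

```python
from typing import Optional, Union

from pydantic import BaseModel, Field


class PcaParams(BaseModel):
    """Trimmed, hypothetical stand-in for ScanpyPlPcaQueryParameters."""

    method_name: str = Field(default="sc.pl.pca", description="The name of the method to call.")
    adata: str = Field(..., description="Annotated data matrix.")
    color: Optional[Union[str, list[str]]] = Field(default=None, description="Annotation keys.")
    color_map: Optional[str] = Field(default=None, description="Matplotlib color map name.")


# `adata` is passed as a string naming the variable, as in the original model.
params = PcaParams(adata="adata", color="cell_type")

# `.dict(exclude_none=True)` mirrors the call used by the formatters on this page;
# Pydantic v2 offers `model_dump` as the newer equivalent.
set_fields = params.dict(exclude_none=True)
print(set_fields)
```

Here `color_map` is absent from `set_fields` because it was never set, which keeps the generated call limited to the arguments the LLM actually chose.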

`ScanpyPlQueryBuilder`

Bases: BaseQueryBuilder

A class for building a ScanpyPl query object.

Source code in biochatter/api_agent/python/scanpy_pl_reduced.py

class ScanpyPlQueryBuilder(BaseQueryBuilder):
    """A class for building a ScanpyPl query object."""

    def create_runnable(
        self,
        query_parameters: list["BaseAPIModel"],
        conversation: "Conversation",
    ) -> Callable:
        """Create a runnable object for executing queries.

        Create the runnable by binding the query parameter models as tools
        with `bind_tools` and piping the result into a `PydanticToolsParser`.

        Args:
        ----
            query_parameters: A list of Pydantic data models that specify the
                fields of the API that should be queried.

            conversation: A BioChatter conversation object.

        Returns:
        -------
            A Callable object that can execute the query.

        """
        runnable = conversation.chat.bind_tools(query_parameters, tool_choice="required")
        return runnable | PydanticToolsParser(tools=query_parameters)

    def parameterise_query(
        self,
        question: str,
        conversation: "Conversation",
    ) -> list["BaseModel"]:
        """Generate a ScanpyPl query object.

        Generates the object based on the given question, prompt, and
        BioChatter conversation. Uses a Pydantic model to define the API fields.
        Creates a runnable that can be invoked on LLMs that are qualified to
        parameterise functions.

        Args:
        ----
            question (str): The question to be answered.

            conversation: The conversation object used for parameterising the
                ScanpyPlQuery.

        Returns:
        -------
            ScanpyPlQuery: the parameterised query object (Pydantic model)

        """
        tools = [
            ScanpyPlScatterQueryParameters,
            ScanpyPlPcaQueryParameters,
            ScanpyPlTsneQueryParameters,
            ScanpyPlUmapQueryParameters,
            ScanpyPlDrawGraphQueryParameters,
            ScanpyPlSpatialQueryParameters,
        ]
        runnable = self.create_runnable(conversation=conversation, query_parameters=tools)
        return runnable.invoke(question)

`create_runnable(query_parameters, conversation)`

Create a runnable object for executing queries.

Create the runnable by binding the query parameter models as tools with `bind_tools` and piping the result into a `PydanticToolsParser`.

Args:

query_parameters: A list of Pydantic data models that specify the fields of the API that should be queried.

conversation: A BioChatter conversation object.

Returns:

A Callable object that can execute the query.

Source code in biochatter/api_agent/python/scanpy_pl_reduced.py

def create_runnable(
    self,
    query_parameters: list["BaseAPIModel"],
    conversation: "Conversation",
) -> Callable:
    """Create a runnable object for executing queries.

    Create the runnable by binding the query parameter models as tools
    with `bind_tools` and piping the result into a `PydanticToolsParser`.

    Args:
    ----
        query_parameters: A list of Pydantic data models that specify the
            fields of the API that should be queried.

        conversation: A BioChatter conversation object.

    Returns:
    -------
        A Callable object that can execute the query.

    """
    runnable = conversation.chat.bind_tools(query_parameters, tool_choice="required")
    return runnable | PydanticToolsParser(tools=query_parameters)
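
`bind_tools` exposes the parameter classes to the LLM as tools, and `PydanticToolsParser` turns the tool calls in the model's reply into instances of the matching classes. The following is a conceptual sketch of that second step using only the standard library. The dataclasses, the `parse_tool_calls` helper, and the hard-coded tool call are illustrative stand-ins, not LangChain's implementation.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class UmapParams:
    """Hypothetical stand-in for ScanpyPlUmapQueryParameters."""

    adata: str
    color: Optional[str] = None
    method_name: str = "sc.pl.umap"


@dataclass
class PcaParams:
    """Hypothetical stand-in for ScanpyPlPcaQueryParameters."""

    adata: str
    color: Optional[str] = None
    method_name: str = "sc.pl.pca"


def parse_tool_calls(tool_calls: list[dict[str, Any]], tools: list[type]) -> list[Any]:
    """Instantiate the tool class whose name matches each tool call."""
    by_name = {tool.__name__: tool for tool in tools}
    return [by_name[call["name"]](**call["args"]) for call in tool_calls]


# Shape of a tool call as an LLM might return it (hard-coded for the sketch).
fake_tool_calls = [{"name": "UmapParams", "args": {"adata": "adata", "color": "leiden"}}]

parsed = parse_tool_calls(fake_tool_calls, tools=[UmapParams, PcaParams])
print(parsed)
```

The result is a list because a single reply may contain several tool calls, which is consistent with the `list["BaseModel"]` return annotation of `parameterise_query`.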

`parameterise_query(question, conversation)`

Generate a ScanpyPl query object.

Generates the object based on the given question, prompt, and BioChatter conversation. Uses a Pydantic model to define the API fields. Creates a runnable that can be invoked on LLMs that are qualified to parameterise functions.

Args:

question (str): The question to be answered.

conversation: The conversation object used for parameterising the ScanpyPlQuery.

Returns:

ScanpyPlQuery: the parameterised query object (Pydantic model)

Source code in biochatter/api_agent/python/scanpy_pl_reduced.py

def parameterise_query(
    self,
    question: str,
    conversation: "Conversation",
) -> list["BaseModel"]:
    """Generate a ScanpyPl query object.

    Generates the object based on the given question, prompt, and
    BioChatter conversation. Uses a Pydantic model to define the API fields.
    Creates a runnable that can be invoked on LLMs that are qualified to
    parameterise functions.

    Args:
    ----
        question (str): The question to be answered.

        conversation: The conversation object used for parameterising the
            ScanpyPlQuery.

    Returns:
    -------
        ScanpyPlQuery: the parameterised query object (Pydantic model)

    """
    tools = [
        ScanpyPlScatterQueryParameters,
        ScanpyPlPcaQueryParameters,
        ScanpyPlTsneQueryParameters,
        ScanpyPlUmapQueryParameters,
        ScanpyPlDrawGraphQueryParameters,
        ScanpyPlSpatialQueryParameters,
    ]
    runnable = self.create_runnable(conversation=conversation, query_parameters=tools)
    return runnable.invoke(question)

`ScanpyPlScatterQueryParameters`

Bases: BaseModel

Parameters for querying the scanpy pl.scatter API.

Source code in biochatter/api_agent/python/scanpy_pl_reduced.py

class ScanpyPlScatterQueryParameters(BaseModel):
    """Parameters for querying the scanpy `pl.scatter` API."""

    method_name: str = Field(
        default="sc.pl.scatter",
        description="The name of the method to call.",
    )
    question_uuid: str = Field(
        default_factory=lambda: str(uuid.uuid4()),
        description="Unique identifier for the question.",
    )
    adata: str = Field(description="Annotated data matrix.")
    x: str | None = Field(default=None, description="x coordinate.")
    y: str | None = Field(default=None, description="y coordinate.")
    color: str | tuple[float, ...] | list[str | tuple[float, ...]] | None = Field(
        default=None,
        description="Keys for annotations of observations/cells or variables/genes, or a hex color specification.",
    )
    use_raw: bool | None = Field(
        default=None,
        description="Whether to use raw attribute of adata. Defaults to True if .raw is present.",
    )
    layers: str | list[str] | None = Field(
        default=None,
        description="Layer(s) to use from adata's layers attribute.",
    )
    basis: str | None = Field(
        default=None,
        description="String that denotes a plotting tool that computed coordinates (e.g., 'pca', 'tsne', 'umap').",
    )
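
Unlike the other plotting models in this module, `question_uuid` here uses `default_factory`, so each instance receives a fresh UUID string rather than `None`. A minimal sketch with a trimmed, hypothetical copy of the model (the name `ScatterParams` is an assumption):

```python
import uuid

from pydantic import BaseModel, Field


class ScatterParams(BaseModel):
    """Trimmed, hypothetical stand-in for ScanpyPlScatterQueryParameters."""

    question_uuid: str = Field(
        default_factory=lambda: str(uuid.uuid4()),
        description="Unique identifier for the question.",
    )
    adata: str = Field(description="Annotated data matrix.")


# The factory runs once per instance, so the identifiers differ.
first = ScatterParams(adata="adata")
second = ScatterParams(adata="adata")
print(first.question_uuid != second.question_uuid)
```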

`ScanpyPlSpatialQueryParameters`

Bases: BaseModel

Parameters for querying the Scanpy pl.spatial API.

Source code in biochatter/api_agent/python/scanpy_pl_reduced.py

class ScanpyPlSpatialQueryParameters(BaseModel):
    """Parameters for querying the Scanpy `pl.spatial` API."""

    method_name: str = Field(
        default="sc.pl.spatial",
        description="The name of the method to call.",
    )
    question_uuid: str | None = Field(
        default=None,
        description="Unique identifier for the question.",
    )
    adata: str = Field(
        ...,
        description="Annotated data matrix.",
    )
    color: str | list[str] | None = Field(
        default=None,
        description="Keys for annotations of observations/cells or variables/genes.",
    )
    gene_symbols: str | None = Field(
        default=None,
        description="Column name in `.var` DataFrame that stores gene symbols.",
    )
    layer: str | None = Field(
        default=None,
        description="Name of the AnnData object layer to plot.",
    )
    library_id: str | None = Field(
        default=None,
        description="Library ID for Visium data, e.g., key in `adata.uns['spatial']`.",
    )
    img_key: str | None = Field(
        default=None,
        description=(
            "Key for image data, used to get `img` and `scale_factor` from "
            "'images' and 'scalefactors' entries for this library."
        ),
    )
    img: Any | None = Field(
        default=None,
        description="Image data to plot, overrides `img_key`.",
    )
    scale_factor: float | None = Field(
        default=None,
        description="Scaling factor used to map from coordinate space to pixel space.",
    )
    spot_size: float | None = Field(
        default=None,
        description="Diameter of spot (in coordinate space) for each point.",
    )
    vmin: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="The value representing the lower limit of the color scale.",
    )
    vmax: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="The value representing the upper limit of the color scale.",
    )
    vcenter: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="The value representing the center of the color scale.",
    )

`ScanpyPlTsneQueryParameters`

Bases: BaseModel

Parameters for querying the Scanpy pl.tsne API.

Source code in biochatter/api_agent/python/scanpy_pl_reduced.py

class ScanpyPlTsneQueryParameters(BaseModel):
    """Parameters for querying the Scanpy `pl.tsne` API."""

    method_name: str = Field(
        default="sc.pl.tsne",
        description="The name of the method to call.",
    )
    question_uuid: str | None = Field(
        default=None,
        description="Unique identifier for the question.",
    )
    adata: str = Field(
        ...,
        description="Annotated data matrix.",
    )
    color: str | list[str] | None = Field(
        default=None,
        description="Keys for annotations of observations/cells or variables/genes.",
    )
    gene_symbols: str | None = Field(
        default=None,
        description="Column name in `.var` DataFrame that stores gene symbols.",
    )
    groups: str | None = Field(
        default=None,
        description="Restrict to specific categories in categorical observation annotation.",
    )
    vmin: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="Lower limit of the color scale.",
    )
    vmax: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="Upper limit of the color scale.",
    )
    vcenter: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="Center of the color scale, useful for diverging colormaps.",
    )

`ScanpyPlUmapQueryParameters`

Bases: BaseModel

Parameters for querying the Scanpy pl.umap API.

Source code in biochatter/api_agent/python/scanpy_pl_reduced.py

class ScanpyPlUmapQueryParameters(BaseModel):
    """Parameters for querying the Scanpy `pl.umap` API."""

    method_name: str = Field(
        default="sc.pl.umap",
        description="The name of the method to call.",
    )
    question_uuid: str | None = Field(
        default=None,
        description="Unique identifier for the question.",
    )
    adata: str = Field(
        ...,
        description="Annotated data matrix.",
    )
    color: str | list[str] | None = Field(
        default=None,
        description="Keys for annotations of observations/cells or variables/genes.",
    )
    gene_symbols: str | None = Field(
        default=None,
        description="Column name in `.var` DataFrame that stores gene symbols.",
    )
    layer: str | None = Field(
        default=None,
        description="Name of the AnnData object layer to plot.",
    )
    vmax: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="Upper limit of the color scale.",
    )
    vmin: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="Lower limit of the color scale.",
    )
    vcenter: str | float | Any | list[str | float | Any] | None = Field(
        default=None,
        description="Center of the color scale, useful for diverging colormaps.",
    )

Scanpy Preprocessing (scanpy.pp) Query Builder.

TODO: not sure if the approach below is functional yet.

`ScanpyPpFuncs`

Bases: BaseTools

Scanpy Preprocessing (scanpy.pp) Query Builder.

Source code in biochatter/api_agent/python/scanpy_pp_reduced.py

class ScanpyPpFuncs(BaseTools):
    """Scanpy Preprocessing (scanpy.pp) Query Builder."""

    tools_params = {}

    tools_params["filter_cells"] = {
        "data": (str, Field(..., description="The (annotated) data matrix.")),
        "min_counts": (Optional[int], Field(None, description="Minimum counts per cell.")),
        "min_genes": (Optional[int], Field(None, description="Minimum genes expressed in a cell.")),
        "max_counts": (Optional[int], Field(None, description="Maximum counts per cell.")),
        "max_genes": (Optional[int], Field(None, description="Maximum genes expressed in a cell.")),
        "inplace": (bool, Field(True, description="Whether to modify the data in place.")),
    }

    tools_params["filter_genes"] = {
        "data": (str, Field(..., description="The (annotated) data matrix.")),
        "min_counts": (Optional[int], Field(None, description="Minimum counts per gene.")),
        "min_cells": (Optional[int], Field(None, description="Minimum number of cells expressing the gene.")),
        "max_counts": (Optional[int], Field(None, description="Maximum counts per gene.")),
        "max_cells": (Optional[int], Field(None, description="Maximum number of cells expressing the gene.")),
        "inplace": (bool, Field(True, description="Whether to modify the data in place.")),
    }

    tools_params["highly_variable_genes"] = {
        "adata": (str, Field(..., description="Annotated data matrix.")),
        "n_top_genes": (Optional[int], Field(None, description="Number of highly-variable genes to keep.")),
        "min_mean": (float, Field(0.0125, description="Minimum mean expression for highly-variable genes.")),
        "max_mean": (float, Field(3, description="Maximum mean expression for highly-variable genes.")),
        "flavor": (str, Field("seurat", description="Method for identifying highly-variable genes.")),
        "inplace": (bool, Field(True, description="Whether to place metrics in .var or return them.")),
    }

    tools_params["log1p"] = {
        "data": (str, Field(..., description="The data matrix.")),
        "base": (Optional[float], Field(None, description="Base of the logarithm.")),
        "copy": (bool, Field(False, description="If True, return a copy of the data.")),
        "chunked": (Optional[bool], Field(None, description="Process data in chunks.")),
    }

    tools_params["pca"] = {
        "data": (str, Field(..., description="The (annotated) data matrix.")),
        "n_comps": (Optional[int], Field(None, description="Number of principal components to compute.")),
        "layer": (Optional[str], Field(None, description="Element of layers to use for PCA.")),
        "zero_center": (bool, Field(True, description="Whether to zero-center the data.")),
        "svd_solver": (Optional[str], Field(None, description="SVD solver to use.")),
        "copy": (bool, Field(False, description="If True, return a copy of the data.")),
    }

    tools_params["normalize_total"] = {
        "adata": (str, Field(..., description="The annotated data matrix.")),
        "target_sum": (Optional[float], Field(None, description="Target sum after normalization.")),
        "exclude_highly_expressed": (bool, Field(False, description="Whether to exclude highly expressed genes.")),
        "inplace": (bool, Field(True, description="Whether to update adata or return normalized data.")),
    }

    tools_params["regress_out"] = {
        "adata": (str, Field(..., description="The annotated data matrix.")),
        "keys": (Union[str, Collection[str]], Field(..., description="Keys for regression.")),
        "copy": (bool, Field(False, description="If True, return a copy of the data.")),
    }

    tools_params["scale"] = {
        "data": (str, Field(..., description="The data matrix.")),
        "zero_center": (bool, Field(True, description="Whether to zero-center the data.")),
        "copy": (bool, Field(False, description="If True, return a copy of the data instead of modifying it in place.")),
    }

    tools_params["subsample"] = {
        "data": (str, Field(..., description="The data matrix.")),
        "fraction": (Optional[float], Field(None, description="Fraction of observations to subsample.")),
        "n_obs": (Optional[int], Field(None, description="Number of observations to subsample.")),
        "copy": (bool, Field(False, description="If True, return a copy of the data.")),
    }

    tools_params["downsample_counts"] = {
        "adata": (str, Field(..., description="The annotated data matrix.")),
        "counts_per_cell": (Optional[int | str], Field(None, description="Target total counts per cell.")),
        "replace": (bool, Field(False, description="Whether to sample with replacement.")),
        "copy": (bool, Field(False, description="If True, return a copy of the data.")),
    }

    tools_params["combat"] = {
        "adata": (str, Field(..., description="The annotated data matrix.")),
        "key": (str, Field("batch", description="Key for batch effect removal.")),
        "inplace": (bool, Field(True, description="Whether to replace the data inplace.")),
    }

    tools_params["scrublet"] = {
        "adata": (str, Field(..., description="Annotated data matrix.")),
        "sim_doublet_ratio": (float, Field(2.0, description="Number of doublets to simulate.")),
        "threshold": (Optional[float], Field(None, description="Doublet score threshold.")),
        "copy": (bool, Field(False, description="If True, return a copy of the data.")),
    }

    tools_params["scrublet_simulate_doublets"] = {
        "adata": (str, Field(..., description="Annotated data matrix.")),
        "sim_doublet_ratio": (float, Field(2.0, description="Number of doublets to simulate.")),
        "random_seed": (int, Field(0, description="Random seed for reproducibility.")),
    }
    tools_params["calculate_qc_metrics"] = {
        "adata": (str, Field(..., description="The annotated data matrix.")),
        "expr_type": (str, Field("counts", description="Name of kind of values in X.")),
        "var_type": (str, Field("genes", description="The kind of thing the variables are.")),
        "qc_vars": (
            Collection[str],
            Field(
                (),
                description="Keys for boolean columns of .var which identify variables you could want to control for (e.g., “ERCC” or “mito”).",
            ),
        ),
        "percent_top": (
            Collection[int],
            Field(
                (50, 100, 200, 500),
                description="List of ranks at which cumulative proportion of expression will be reported as a percentage.",
            ),
        ),
        "layer": (
            Optional[str],
            Field(None, description="If provided, use adata.layers[layer] for expression values instead of adata.X."),
        ),
        "use_raw": (
            bool,
            Field(False, description="If True, use adata.raw.X for expression values instead of adata.X."),
        ),
        "inplace": (bool, Field(False, description="Whether to place calculated metrics in adata’s .obs and .var.")),
        "log1p": (bool, Field(True, description="Set to False to skip computing log1p transformed annotations.")),
    }

    tools_params["recipe_zheng17"] = {
        "adata": (str, Field(..., description="The annotated data matrix.")),
        "n_top_genes": (int, Field(1000, description="Number of genes to keep.")),
        "log": (bool, Field(True, description="Take logarithm of the data.")),
        "plot": (bool, Field(False, description="Show a plot of the gene dispersion vs. mean relation.")),
        "copy": (bool, Field(False, description="Return a copy of adata instead of updating it.")),
    }

    tools_params["recipe_weinreb17"] = {
        "adata": (str, Field(..., description="The annotated data matrix.")),
        "log": (bool, Field(True, description="Logarithmize the data?")),
        "mean_threshold": (float, Field(0.01, description="Mean expression threshold for gene selection.")),
        "cv_threshold": (float, Field(2, description="Coefficient of variation threshold for gene selection.")),
        "n_pcs": (int, Field(50, description="Number of principal components to compute.")),
        "svd_solver": (str, Field("randomized", description="SVD solver to use for PCA.")),
        "random_state": (int, Field(0, description="Random seed for reproducibility.")),
        "copy": (
            bool,
            Field(False, description="Return a copy if true, otherwise modifies the original adata object."),
        ),
    }

    tools_params["recipe_seurat"] = {
        "adata": (str, Field(..., description="The annotated data matrix.")),
        "log": (bool, Field(True, description="Logarithmize the data?")),
        "plot": (bool, Field(False, description="Show a plot of the gene dispersion vs. mean relation.")),
        "copy": (
            bool,
            Field(False, description="Return a copy if true, otherwise modifies the original adata object."),
        ),
    }

    def __init__(self, tools_params: dict = tools_params) -> None:
        """Initialise the ScanpyPpFuncs class."""
        super().__init__()
        self.tools_params = tools_params

`__init__(tools_params=tools_params)`

Initialise the ScanpyPpFuncs class.

Source code in biochatter/api_agent/python/scanpy_pp_reduced.py

def __init__(self, tools_params: dict = tools_params) -> None:
    """Initialise the ScanpyPpFuncs class."""
    super().__init__()
    self.tools_params = tools_params
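
Each `tools_params` entry maps a parameter name to a `(type, Field)` tuple, which is the field-definition format accepted by Pydantic's `create_model`. The `make_pydantic_tools` method inherited from `BaseTools` is not shown in this section, so the sketch below only illustrates how such a dictionary can be converted into a model. The helper name `make_tool` and the reduced parameter set are assumptions.

```python
from typing import Optional

from pydantic import BaseModel, Field, create_model

# Reduced copy of the `filter_cells` entry, for illustration only.
filter_cells_params = {
    "data": (str, Field(..., description="The (annotated) data matrix.")),
    "min_counts": (Optional[int], Field(None, description="Minimum counts per cell.")),
    "inplace": (bool, Field(True, description="Whether to modify the data in place.")),
}


def make_tool(name: str, params: dict) -> type[BaseModel]:
    """Build a Pydantic model named `name` from (type, Field) tuples."""
    return create_model(name, **params)


FilterCells = make_tool("filter_cells", filter_cells_params)

# Required fields must be supplied; the others fall back to their defaults.
call = FilterCells(data="adata", min_counts=200)
print(call)
```

A model built this way can be passed to `bind_tools` like the hand-written parameter classes elsewhere on this page.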

`ScanpyPpQueryBuilder`

Bases: BaseQueryBuilder

A class for building a ScanpyPp query object.

Source code in biochatter/api_agent/python/scanpy_pp_reduced.py

class ScanpyPpQueryBuilder(BaseQueryBuilder):
    """A class for building a ScanpyPp query object."""

    def create_runnable(
        self,
        query_parameters: list["BaseAPIModel"],
        conversation: "Conversation",
    ) -> Callable:
        """Create a runnable object for executing queries.

        Create the runnable by binding the query parameter models as tools
        with `bind_tools` and piping the result into a `PydanticToolsParser`.

        Args:
        ----
            query_parameters: A list of Pydantic data models that specify the
                fields of the API that should be queried.

            conversation: A BioChatter conversation object.

        Returns:
        -------
            A Callable object that can execute the query.

        """
        runnable = conversation.chat.bind_tools(query_parameters)
        return runnable | PydanticToolsParser(tools=query_parameters)

    def parameterise_query(
        self,
        question: str,
        conversation: "Conversation",
    ) -> list["BaseModel"]:
        """Generate a ScanpyPp query object.

        Generates the object based on the given question, prompt, and
        BioChatter conversation. Uses a Pydantic model to define the API fields.
        Creates a runnable that can be invoked on LLMs that are qualified to
        parameterise functions.

        Args:
        ----
            question (str): The question to be answered.

            conversation: The conversation object used for parameterising the
                ScanpyPpQuery.

        Returns:
        -------
            ScanpyPpQuery: the parameterised query object (Pydantic model)

        """
        tool_maker = ScanpyPpFuncs()
        tools = tool_maker.make_pydantic_tools()
        runnable = self.create_runnable(conversation=conversation, query_parameters=tools)
        return runnable.invoke(
            question,
        )

`create_runnable(query_parameters, conversation)`

Create a runnable object for executing queries.

Create the runnable by binding the query parameter models as tools with `bind_tools` and piping the result into a `PydanticToolsParser`.

Args:

query_parameters: A list of Pydantic data models that specify the fields of the API that should be queried.

conversation: A BioChatter conversation object.

Returns:

A Callable object that can execute the query.

Source code in biochatter/api_agent/python/scanpy_pp_reduced.py

def create_runnable(
    self,
    query_parameters: list["BaseAPIModel"],
    conversation: "Conversation",
) -> Callable:
    """Create a runnable object for executing queries.

    Create the runnable by binding the query parameter models as tools
    with `bind_tools` and piping the result into a `PydanticToolsParser`.

    Args:
    ----
        query_parameters: A list of Pydantic data models that specify the
            fields of the API that should be queried.

        conversation: A BioChatter conversation object.

    Returns:
    -------
        A Callable object that can execute the query.

    """
    runnable = conversation.chat.bind_tools(query_parameters)
    return runnable | PydanticToolsParser(tools=query_parameters)

`parameterise_query(question, conversation)`

Generate a ScanpyPp query object.

Generates the object based on the given question, prompt, and BioChatter conversation. Uses a Pydantic model to define the API fields. Creates a runnable that can be invoked on LLMs that are qualified to parameterise functions.

Args:

question (str): The question to be answered.

conversation: The conversation object used for parameterising the ScanpyPpQuery.

Returns:

ScanpyPpQuery: the parameterised query object (Pydantic model)

Source code in biochatter/api_agent/python/scanpy_pp_reduced.py

def parameterise_query(
    self,
    question: str,
    conversation: "Conversation",
) -> list["BaseModel"]:
    """Generate a ScanpyPp query object.

    Generates the object based on the given question, prompt, and
    BioChatter conversation. Uses a Pydantic model to define the API fields.
    Creates a runnable that can be invoked on LLMs that are qualified to
    parameterise functions.

    Args:
    ----
        question (str): The question to be answered.

        conversation: The conversation object used for parameterising the
            ScanpyPpQuery.

    Returns:
    -------
        ScanpyPpQuery: the parameterised query object (Pydantic model)

    """
    tool_maker = ScanpyPpFuncs()
    tools = tool_maker.make_pydantic_tools()
    runnable = self.create_runnable(conversation=conversation, query_parameters=tools)
    return runnable.invoke(
        question,
    )

API Calling: Utility functions

Formatters to parse the calls

Formatters for API calls (Pydantic models to strings).

`format_as_python_call(model)`

Convert a parameter model into a Python method call string.

Args:

model: Pydantic model containing method parameters

Returns:

String representation of the Python method call

Source code in biochatter/api_agent/base/formatters.py

def format_as_python_call(model: BaseAPIModel) -> str:
    """Convert a parameter model into a Python method call string.

    Args:
    ----
        model: Pydantic model containing method parameters

    Returns:
    -------
        String representation of the Python method call

    """
    params = model.dict(exclude_none=True)
    method_name = params.pop("method_name", None)
    params.pop("question_uuid", None)
    if isinstance(model, MapAnnData):
        param_str = params.pop("dics", {})
    else:
        param_str = ", ".join(f"{k}={v!r}" for k, v in params.items())

    return f"{method_name}({param_str})"
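
To make the output format concrete, the sketch below applies the same formatting steps to a plain dictionary standing in for `model.dict(exclude_none=True)`. The helper name `format_call` and the example values are assumptions, and the `MapAnnData` special case is omitted.

```python
from typing import Any


def format_call(params: dict[str, Any]) -> str:
    """Format a parameter dict as a Python call string (simplified sketch)."""
    params = dict(params)  # copy so the caller's dict is not mutated
    method_name = params.pop("method_name", None)
    params.pop("question_uuid", None)  # bookkeeping field, not a call argument
    param_str = ", ".join(f"{k}={v!r}" for k, v in params.items())
    return f"{method_name}({param_str})"


call = format_call(
    {
        "method_name": "sc.pl.umap",
        "question_uuid": "1234",
        "adata": "adata",
        "color": ["leiden", "CD4"],
    }
)
print(call)  # sc.pl.umap(adata='adata', color=['leiden', 'CD4'])
```

Because values are rendered with `repr`, string values such as the `adata` variable name appear quoted in the resulting call string.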

`format_as_rest_call(model)`

Convert a parameter model (BaseModel) into a REST API call string.

Args:

model: Pydantic model containing API call parameters

Returns:

String representation of the REST API call

Source code in biochatter/api_agent/base/formatters.py

def format_as_rest_call(model: BaseModel) -> str:
    """Convert a parameter model (BaseModel) into a REST API call string.

    Args:
    ----
        model: Pydantic model containing API call parameters

    Returns:
    -------
        String representation of the REST API call

    """
    params = model.dict(exclude_none=True)
    endpoint = params.pop("endpoint")
    base_url = params.pop("base_url")
    params.pop("question_uuid", None)

    full_url = f"{base_url.rstrip('/')}/{endpoint.strip('/')}"
    return f"{full_url}?{urlencode(params)}"
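
The URL assembly can be illustrated with a plain dictionary in place of `model.dict(exclude_none=True)`. The helper name `format_rest`, the base URL, the endpoint, and the query parameters below are hypothetical values chosen for the sketch.

```python
from typing import Any
from urllib.parse import urlencode


def format_rest(params: dict[str, Any]) -> str:
    """Format a parameter dict as a REST call string (simplified sketch)."""
    params = dict(params)  # copy so the caller's dict is not mutated
    endpoint = params.pop("endpoint")
    base_url = params.pop("base_url")
    params.pop("question_uuid", None)  # bookkeeping field, not a query parameter
    # Normalise slashes so exactly one separates base URL and endpoint.
    full_url = f"{base_url.rstrip('/')}/{endpoint.strip('/')}"
    return f"{full_url}?{urlencode(params)}"


url = format_rest(
    {
        "base_url": "https://api.example.org/",
        "endpoint": "/search/",
        "query": "TP53 human",
        "limit": 5,
    }
)
print(url)  # https://api.example.org/search?query=TP53+human&limit=5
```

All remaining fields become query parameters, and `urlencode` takes care of escaping (the space in the query value becomes `+`).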

