
API Agent Reference

This module connects BioChatter to external software tools: the LLM parameterises the API calls, and the agent submits them and summarises the results.

The abstract base classes

BaseFetcher

Bases: ABC

Abstract base class for fetchers. A fetcher submits queries (in systems where submission and fetching are separate steps), then fetches and saves the results. Subclasses must implement a fetch_results() method, which can wrap a multi-step submit-and-retrieve procedure, and should implement retry logic to account for connectivity issues or long processing times.

Source code in biochatter/api_agent/abc.py
class BaseFetcher(ABC):
    """Abstract base class for fetchers. A fetcher is responsible for submitting
    queries (in systems where submission and fetching are separate) and fetching
    and saving results of queries. It has to implement a `fetch_results()`
    method, which can wrap a multi-step procedure to submit and retrieve. Should
    implement retry method to account for connectivity issues or processing
    times.
    """

    @abstractmethod
    def fetch_results(
        self,
        query_model: BaseModel,
        retries: int | None = 3,
    ):
        """Fetches results by submitting a query. Can implement a multi-step
        procedure if submitting and fetching are distinct processes (e.g., in
        the case of long processing times as in the case of BLAST).

        Args:
        ----
            query_model: the Pydantic model describing the parameterised query

        """

fetch_results(query_model, retries=3) abstractmethod

Fetches results by submitting a query. Can implement a multi-step procedure if submitting and fetching are distinct processes (e.g., when processing times are long, as with BLAST).

Args:
query_model: the Pydantic model describing the parameterised query
retries: the number of retries (defaults to 3)
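
The retry behaviour asked for in the class description can be sketched as follows. This is a minimal illustration, not BioChatter code: BaseFetcher is redefined locally as a stand-in so the example runs without BioChatter installed, and FlakyEchoFetcher, its _request helper, and the simulated ConnectionError are hypothetical names introduced only for this example.

```python
import time
from abc import ABC, abstractmethod


class BaseFetcher(ABC):
    """Local stand-in mirroring the abstract interface documented here."""

    @abstractmethod
    def fetch_results(self, query_model, retries=3):
        ...


class FlakyEchoFetcher(BaseFetcher):
    """Hypothetical fetcher that fails a few times before succeeding."""

    def __init__(self, failures_before_success=2, delay=0.0):
        self.remaining_failures = failures_before_success
        self.delay = delay  # seconds to wait between attempts

    def _request(self, query_model):
        # Simulated network call; no real API is contacted.
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise ConnectionError("simulated connectivity issue")
        return f"results for {query_model}"

    def fetch_results(self, query_model, retries=3):
        last_error = None
        for _ in range(retries):
            try:
                return self._request(query_model)
            except ConnectionError as error:
                last_error = error
                time.sleep(self.delay)
        # Every attempt failed: surface the last error as the cause.
        raise TimeoutError(f"gave up after {retries} attempts") from last_error


fetcher = FlakyEchoFetcher(failures_before_success=2)
print(fetcher.fetch_results("example-query", retries=3))
```

A real fetcher would replace _request with the actual API call and would likely use a non-zero delay between attempts.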
Source code in biochatter/api_agent/abc.py
@abstractmethod
def fetch_results(
    self,
    query_model: BaseModel,
    retries: int | None = 3,
):
    """Fetches results by submitting a query. Can implement a multi-step
    procedure if submitting and fetching are distinct processes (e.g., in
    the case of long processing times as in the case of BLAST).

    Args:
    ----
        query_model: the Pydantic model describing the parameterised query

    """

BaseInterpreter

Bases: ABC

Abstract base class for result interpreters. The interpreter is aware of the nature and structure of the results and can extract and summarise information from them.

Source code in biochatter/api_agent/abc.py
class BaseInterpreter(ABC):
    """Abstract base class for result interpreters. The interpreter is aware of the
    nature and structure of the results and can extract and summarise
    information from them.
    """

    @abstractmethod
    def summarise_results(
        self,
        question: str,
        conversation_factory: Callable,
        response_text: str,
    ) -> str:
        """Summarises an answer based on the given parameters.

        Args:
        ----
            question (str): The question that was asked.

            conversation_factory (Callable): A function that creates a
                BioChatter conversation.

            response_text (str): The response.text returned from the request.

        Returns:
        -------
            A summary of the answer.

        Todo:
        ----
            Genericise (remove file path and n_lines parameters, and use a
            generic way to get the results). The child classes should manage the
            specifics of the results.

        """

summarise_results(question, conversation_factory, response_text) abstractmethod

Summarises an answer based on the given parameters.


Args:
question (str): The question that was asked.

conversation_factory (Callable): A function that creates a
    BioChatter conversation.

response_text (str): The response.text returned from the request.

Returns:
A summary of the answer.
Todo:
Genericise (remove file path and n_lines parameters, and use a
generic way to get the results). The child classes should manage the
specifics of the results.
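
As a rough illustration of how a child class might implement summarise_results, the sketch below caps the amount of result text passed on to the model. It is an assumption-laden example: BaseInterpreter is redefined locally as a stand-in, FakeConversation and its ask method are hypothetical (they are not the BioChatter conversation API), and TruncatingInterpreter with its max_chars parameter is invented for this example.

```python
from abc import ABC, abstractmethod
from collections.abc import Callable


class BaseInterpreter(ABC):
    """Local stand-in mirroring the abstract interface documented here."""

    @abstractmethod
    def summarise_results(
        self,
        question: str,
        conversation_factory: Callable,
        response_text: str,
    ) -> str:
        ...


class FakeConversation:
    """Hypothetical conversation; `ask` is NOT part of the BioChatter API."""

    def ask(self, prompt: str) -> str:
        # A real conversation would send the prompt to an LLM.
        return f"[{len(prompt)} prompt characters summarised]"


class TruncatingInterpreter(BaseInterpreter):
    """Hypothetical interpreter that caps the context passed to the LLM."""

    def __init__(self, max_chars: int = 200):
        self.max_chars = max_chars

    def summarise_results(self, question, conversation_factory, response_text):
        conversation = conversation_factory()  # fresh conversation per call
        context = response_text[: self.max_chars]
        prompt = f"Question: {question}\nResults:\n{context}"
        return conversation.ask(prompt)


interpreter = TruncatingInterpreter(max_chars=10)
print(interpreter.summarise_results("Any hits?", FakeConversation, "x" * 500))
```

Passing a factory rather than a conversation instance lets each call start from a clean conversation state.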
Source code in biochatter/api_agent/abc.py
@abstractmethod
def summarise_results(
    self,
    question: str,
    conversation_factory: Callable,
    response_text: str,
) -> str:
    """Summarises an answer based on the given parameters.

    Args:
    ----
        question (str): The question that was asked.

        conversation_factory (Callable): A function that creates a
            BioChatter conversation.

        response_text (str): The response.text returned from the request.

    Returns:
    -------
        A summary of the answer.

    Todo:
    ----
        Genericise (remove file path and n_lines parameters, and use a
        generic way to get the results). The child classes should manage the
        specifics of the results.

    """

BaseQueryBuilder

Bases: ABC

An abstract base class for query builders.

Source code in biochatter/api_agent/abc.py
class BaseQueryBuilder(ABC):
    """An abstract base class for query builders."""

    @property
    def structured_output_prompt(self) -> ChatPromptTemplate:
        """Defines a structured output prompt template. This provides a default
        implementation for an API agent that can be overridden by subclasses to
        return a ChatPromptTemplate-compatible object.
        """
        return ChatPromptTemplate.from_messages(
            [
                (
                    "system",
                    "You are a world class algorithm for extracting information in structured formats.",
                ),
                (
                    "human",
                    "Use the given format to extract information from the following input: {input}",
                ),
                ("human", "Tip: Make sure to answer in the correct format"),
            ],
        )

    @abstractmethod
    def create_runnable(
        self,
        query_parameters: "BaseModel",
        conversation: "Conversation",
    ) -> Callable:
        """Creates a runnable object for executing queries. Must be implemented by
        subclasses. Should use the LangChain `create_structured_output_runnable`
        method to generate the Callable.

        Args:
        ----
            query_parameters: A Pydantic data model that specifies the fields of
                the API that should be queried.

            conversation: A BioChatter conversation object.

        Returns:
        -------
            A Callable object that can execute the query.

        """

    @abstractmethod
    def parameterise_query(
        self,
        question: str,
        conversation: "Conversation",
    ) -> BaseModel:
        """Parameterises a query object (a Pydantic model with the fields of the
        API) based on the given question using a BioChatter conversation
        instance. Must be implemented by subclasses.

        Args:
        ----
            question (str): The question to be answered.

            conversation: The BioChatter conversation object containing the LLM
                that should parameterise the query.

        Returns:
        -------
            A parameterised instance of the query object (Pydantic BaseModel)

        """

structured_output_prompt: ChatPromptTemplate property

Defines a structured output prompt template. This provides a default implementation for an API agent that can be overridden by subclasses to return a ChatPromptTemplate-compatible object.

create_runnable(query_parameters, conversation) abstractmethod

Creates a runnable object for executing queries. Must be implemented by subclasses. Should use the LangChain create_structured_output_runnable method to generate the Callable.


Args:
query_parameters: A Pydantic data model that specifies the fields of
    the API that should be queried.

conversation: A BioChatter conversation object.

Returns:
A Callable object that can execute the query.
Source code in biochatter/api_agent/abc.py
@abstractmethod
def create_runnable(
    self,
    query_parameters: "BaseModel",
    conversation: "Conversation",
) -> Callable:
    """Creates a runnable object for executing queries. Must be implemented by
    subclasses. Should use the LangChain `create_structured_output_runnable`
    method to generate the Callable.

    Args:
    ----
        query_parameters: A Pydantic data model that specifies the fields of
            the API that should be queried.

        conversation: A BioChatter conversation object.

    Returns:
    -------
        A Callable object that can execute the query.

    """

parameterise_query(question, conversation) abstractmethod

Parameterises a query object (a Pydantic model with the fields of the API) based on the given question using a BioChatter conversation instance. Must be implemented by subclasses.


Args:
question (str): The question to be answered.

conversation: The BioChatter conversation object containing the LLM
    that should parameterise the query.

Returns:
A parameterised instance of the query object (Pydantic BaseModel)
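
The division of labour between create_runnable and parameterise_query can be sketched without an LLM. Everything here is a stand-in chosen for illustration: BaseQueryBuilder is redefined locally, GeneQuery is a hypothetical dataclass (the real query models are Pydantic classes), the runnable is a plain Python callable rather than a LangChain runnable, and fake_llm replaces the conversation's model with a trivial keyword picker.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class GeneQuery:
    """Hypothetical query model; the real classes use Pydantic."""

    gene: str
    organism: str = "human"


class BaseQueryBuilder(ABC):
    """Local stand-in mirroring the two abstract methods documented here."""

    @abstractmethod
    def create_runnable(self, query_parameters, conversation):
        ...

    @abstractmethod
    def parameterise_query(self, question, conversation):
        ...


class KeywordQueryBuilder(BaseQueryBuilder):
    """Hypothetical builder that fills the model from a fake LLM."""

    def create_runnable(self, query_parameters, conversation):
        def runnable(payload):
            # `conversation` is assumed to map text to a dict of field values.
            fields = conversation(payload["input"])
            return query_parameters(**fields)

        return runnable

    def parameterise_query(self, question, conversation):
        runnable = self.create_runnable(GeneQuery, conversation)
        return runnable({"input": question})


def fake_llm(text):
    # Stand-in for structured LLM output: take the last word as the gene.
    return {"gene": text.rstrip("?").split()[-1]}


builder = KeywordQueryBuilder()
print(builder.parameterise_query("Which pathways involve BRCA1?", fake_llm))
```

The point of the split is that create_runnable binds a schema to a model once, while parameterise_query decides what input text the model sees.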
Source code in biochatter/api_agent/abc.py
@abstractmethod
def parameterise_query(
    self,
    question: str,
    conversation: "Conversation",
) -> BaseModel:
    """Parameterises a query object (a Pydantic model with the fields of the
    API) based on the given question using a BioChatter conversation
    instance. Must be implemented by subclasses.

    Args:
    ----
        question (str): The question to be answered.

        conversation: The BioChatter conversation object containing the LLM
            that should parameterise the query.

    Returns:
    -------
        A parameterised instance of the query object (Pydantic BaseModel)

    """

The API Agent

APIAgent

Source code in biochatter/api_agent/api_agent.py
class APIAgent:
    def __init__(
        self,
        conversation_factory: Callable,
        query_builder: "BaseQueryBuilder",
        fetcher: "BaseFetcher",
        interpreter: "BaseInterpreter",
    ):
        """API agent class to interact with a tool's API for querying and fetching
        results.  The query fields have to be defined in a Pydantic model
        (`BaseModel`) and used (i.e., parameterised by the LLM) in the query
        builder. Specific API agents are defined in submodules of this directory
        (`api_agent`). The agent's logic is implemented in the `execute` method.

        Attributes
        ----------
            conversation_factory (Callable): A function used to create a
                BioChatter conversation, providing LLM access.

            query_builder (BaseQueryBuilder): An instance of a child of the
                BaseQueryBuilder class.

            fetcher (BaseFetcher): An instance of a child of the
                BaseFetcher class.

            interpreter (BaseInterpreter): An instance of a child of the
                BaseInterpreter class.

        """
        self.conversation_factory = conversation_factory
        self.query_builder = query_builder
        self.fetcher = fetcher
        self.interpreter = interpreter
        self.final_answer = None

    def parameterise_query(self, question: str) -> BaseModel | None:
        """Use LLM to parameterise a query (a Pydantic model) based on the given
        question using a BioChatter conversation instance.
        """
        try:
            conversation = self.conversation_factory()
            return self.query_builder.parameterise_query(question, conversation)
        except Exception as e:
            print(f"Error generating query: {e}")
            return None

    def fetch_results(self, query_model: BaseModel) -> str | None:
        """Fetch the results of the query using the individual API's implementation
        (either single-step or submit-retrieve).

        Args:
        ----
            query_model: the parameterised query Pydantic model

        """
        try:
            return self.fetcher.fetch_results(query_model, 100)
        except Exception as e:
            print(f"Error fetching results: {e}")
            return None

    def summarise_results(
        self,
        question: str,
        response_text: str,
    ) -> str | None:
        """Summarise the retrieved results to extract the answer to the question."""
        try:
            return self.interpreter.summarise_results(
                question=question,
                conversation_factory=self.conversation_factory,
                response_text=response_text,
            )
        except Exception as e:
            print(f"Error extracting answer: {e}")
            return None

    def execute(self, question: str) -> str | None:
        """Wrapper that uses class methods to execute the API agent logic. Consists
        of 1) query generation, 2) query submission, 3) results fetching, and
        4) answer extraction. The final answer is stored in the final_answer
        attribute.

        Args:
        ----
            question (str): The question to be answered.

        """
        # Generate query
        try:
            query_model = self.parameterise_query(question)
            if not query_model:
                raise ValueError("Failed to generate query.")
        except ValueError as e:
            print(e)

        # Fetch results
        try:
            response_text = self.fetch_results(
                query_model=query_model,
            )
            if not response_text:
                raise ValueError("Failed to fetch results.")
        except ValueError as e:
            print(e)

        # Extract answer from results
        try:
            final_answer = self.summarise_results(question, response_text)
            if not final_answer:
                raise ValueError("Failed to extract answer from results.")
        except ValueError as e:
            print(e)

        self.final_answer = final_answer
        return final_answer

    def get_description(self, tool_name: str, tool_desc: str):
        return f"This API agent interacts with {tool_name}'s API for querying and fetching results. {tool_desc}"

__init__(conversation_factory, query_builder, fetcher, interpreter)

API agent class to interact with a tool's API for querying and fetching results. The query fields have to be defined in a Pydantic model (BaseModel) and used (i.e., parameterised by the LLM) in the query builder. Specific API agents are defined in submodules of this directory (api_agent). The agent's logic is implemented in the execute method.

Attributes
conversation_factory (Callable): A function used to create a
    BioChatter conversation, providing LLM access.

query_builder (BaseQueryBuilder): An instance of a child of the
    BaseQueryBuilder class.

fetcher (BaseFetcher): An instance of a child of the
    BaseFetcher class.

interpreter (BaseInterpreter): An instance of a child of the
    BaseInterpreter class.
Source code in biochatter/api_agent/api_agent.py
def __init__(
    self,
    conversation_factory: Callable,
    query_builder: "BaseQueryBuilder",
    fetcher: "BaseFetcher",
    interpreter: "BaseInterpreter",
):
    """API agent class to interact with a tool's API for querying and fetching
    results.  The query fields have to be defined in a Pydantic model
    (`BaseModel`) and used (i.e., parameterised by the LLM) in the query
    builder. Specific API agents are defined in submodules of this directory
    (`api_agent`). The agent's logic is implemented in the `execute` method.

    Attributes
    ----------
        conversation_factory (Callable): A function used to create a
            BioChatter conversation, providing LLM access.

        query_builder (BaseQueryBuilder): An instance of a child of the
            BaseQueryBuilder class.

        fetcher (BaseFetcher): An instance of a child of the
            BaseFetcher class.

        interpreter (BaseInterpreter): An instance of a child of the
            BaseInterpreter class.

    """
    self.conversation_factory = conversation_factory
    self.query_builder = query_builder
    self.fetcher = fetcher
    self.interpreter = interpreter
    self.final_answer = None

execute(question)

Wrapper that uses class methods to execute the API agent logic. Consists of 1) query generation, 2) query submission, 3) results fetching, and 4) answer extraction. The final answer is stored in the final_answer attribute.


Args:
question (str): The question to be answered.
Source code in biochatter/api_agent/api_agent.py
def execute(self, question: str) -> str | None:
    """Wrapper that uses class methods to execute the API agent logic. Consists
    of 1) query generation, 2) query submission, 3) results fetching, and
    4) answer extraction. The final answer is stored in the final_answer
    attribute.

    Args:
    ----
        question (str): The question to be answered.

    """
    # Generate query
    try:
        query_model = self.parameterise_query(question)
        if not query_model:
            raise ValueError("Failed to generate query.")
    except ValueError as e:
        print(e)

    # Fetch results
    try:
        response_text = self.fetch_results(
            query_model=query_model,
        )
        if not response_text:
            raise ValueError("Failed to fetch results.")
    except ValueError as e:
        print(e)

    # Extract answer from results
    try:
        final_answer = self.summarise_results(question, response_text)
        if not final_answer:
            raise ValueError("Failed to extract answer from results.")
    except ValueError as e:
        print(e)

    self.final_answer = final_answer
    return final_answer

fetch_results(query_model)

Fetch the results of the query using the individual API's implementation (either single-step or submit-retrieve).


Args:
query_model: the parameterised query Pydantic model
Source code in biochatter/api_agent/api_agent.py
def fetch_results(self, query_model: BaseModel) -> str | None:
    """Fetch the results of the query using the individual API's implementation
    (either single-step or submit-retrieve).

    Args:
    ----
        query_model: the parameterised query Pydantic model

    """
    try:
        return self.fetcher.fetch_results(query_model, 100)
    except Exception as e:
        print(f"Error fetching results: {e}")
        return None

parameterise_query(question)

Use LLM to parameterise a query (a Pydantic model) based on the given question using a BioChatter conversation instance.

Source code in biochatter/api_agent/api_agent.py
def parameterise_query(self, question: str) -> BaseModel | None:
    """Use LLM to parameterise a query (a Pydantic model) based on the given
    question using a BioChatter conversation instance.
    """
    try:
        conversation = self.conversation_factory()
        return self.query_builder.parameterise_query(question, conversation)
    except Exception as e:
        print(f"Error generating query: {e}")
        return None

summarise_results(question, response_text)

Summarise the retrieved results to extract the answer to the question.

Source code in biochatter/api_agent/api_agent.py
def summarise_results(
    self,
    question: str,
    response_text: str,
) -> str | None:
    """Summarise the retrieved results to extract the answer to the question."""
    try:
        return self.interpreter.summarise_results(
            question=question,
            conversation_factory=self.conversation_factory,
            response_text=response_text,
        )
    except Exception as e:
        print(f"Error extracting answer: {e}")
        return None
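
The three stages that execute chains together (parameterise, fetch, summarise) can be tried out with stand-in components. This sketch does not import BioChatter: EchoQuery, the three Stub classes, and run_pipeline are hypothetical names, and the placeholder conversation is a bare object. Unlike the listing above, run_pipeline returns as soon as a stage yields nothing, which is one possible design choice rather than the library's behaviour.

```python
from dataclasses import dataclass


@dataclass
class EchoQuery:
    """Hypothetical stand-in for a parameterised Pydantic query model."""

    term: str


class StubQueryBuilder:
    def parameterise_query(self, question, conversation):
        # A real builder would ask the LLM held by `conversation`.
        return EchoQuery(term=question.split()[-1].rstrip("?"))


class StubFetcher:
    def fetch_results(self, query_model, retries=3):
        # A real fetcher would call the tool's API here.
        return f"raw record for {query_model.term}"


class StubInterpreter:
    def summarise_results(self, question, conversation_factory, response_text):
        # A real interpreter would prompt an LLM with the response text.
        return f"Summary: {response_text}"


def run_pipeline(question, conversation_factory, builder, fetcher, interpreter):
    """Run the three stages in order, stopping at the first empty result."""
    query_model = builder.parameterise_query(question, conversation_factory())
    if not query_model:
        return None
    response_text = fetcher.fetch_results(query_model)
    if not response_text:
        return None
    return interpreter.summarise_results(
        question, conversation_factory, response_text
    )


answer = run_pipeline(
    "What is known about TP53?",
    conversation_factory=lambda: object(),  # placeholder conversation
    builder=StubQueryBuilder(),
    fetcher=StubFetcher(),
    interpreter=StubInterpreter(),
)
print(answer)
```

With the real classes, the same three components would be passed to the APIAgent constructor and the question to its execute method.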

The API Agent for the BLAST tool

Module for handling BLAST API interactions.

Provides functionality for building queries, fetching results, and interpreting BLAST (Basic Local Alignment Search Tool) sequence alignment data.

BlastFetcher

Bases: BaseFetcher

A class for retrieving API results from BLAST.

Retrieves results from BLAST given a parameterised BlastQueryParameters object.

TODO add a limit of characters to be returned from the response.text?

Source code in biochatter/api_agent/blast.py
class BlastFetcher(BaseFetcher):
    """A class for retrieving API results from BLAST.

    Retrieves results from BLAST given a parameterised BlastQuery.

    TODO add a limit of characters to be returned from the response.text?
    """

    def _submit_query(self, request_data: BlastQueryParameters) -> str:
        """POST the BLAST query and retrieve the RID.

        The method submits the structured BlastQuery object and returns the RID.

        Args:
        ----
            request_data: BlastQuery object containing the BLAST query
                parameters.

        Returns:
        -------
            str: The Request ID (RID) for the submitted BLAST query.

        """
        data = {
            "CMD": request_data.cmd,
            "PROGRAM": request_data.program,
            "DATABASE": request_data.database,
            "QUERY": request_data.query,
            "FORMAT_TYPE": request_data.format_type,
            "MEGABLAST": request_data.megablast,
            "HITLIST_SIZE": request_data.max_hits,
        }
        # Include any other_params if provided
        if request_data.other_params:
            data.update(request_data.other_params)
        # Make the API call
        query_string = urlencode(data)
        # Combine base URL with the query string
        full_url = f"{request_data.url}?{query_string}"
        # Print the full URL
        request_data.full_url = full_url
        print("Full URL built by retriever:\n", request_data.full_url)
        response = requests.post(request_data.url, data=data, timeout=10)
        response.raise_for_status()
        # Extract RID from response
        print(response)
        match = re.search(r"RID = (\w+)", response.text)
        if match:
            return match.group(1)

        msg = "RID not found in BLAST submission response."
        raise ValueError(msg)

    def _fetch_results(
        self,
        rid: str,
        question_uuid: str,
        retries: int = 10000,
    ) -> str:
        """Fetch BLAST query data given RID.

        The second function to be called for a BLAST query.
        """
        ###
        ###    TO DO: Implement logging for all BLAST queries
        ###
        base_url = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"
        check_status_params = {
            "CMD": "Get",
            "FORMAT_OBJECT": "SearchInfo",
            "RID": rid,
        }
        get_results_params = {
            "CMD": "Get",
            "FORMAT_TYPE": "XML",
            "RID": rid,
        }

        # Check the status of the BLAST job
        for attempt in range(retries):
            status_response = requests.get(base_url, params=check_status_params, timeout=10)
            status_response.raise_for_status()
            status_text = status_response.text
            print("evaluating status")
            if "Status=WAITING" in status_text:
                print(f"{question_uuid} results not ready, waiting...")
                time.sleep(15)
            elif "Status=FAILED" in status_text:
                msg = "BLAST query FAILED."
                raise RuntimeError(msg)
            elif "Status=UNKNOWN" in status_text:
                msg = "BLAST query expired or does not exist."
                raise RuntimeError(msg)
            elif "Status=READY" in status_text:
                if "ThereAreHits=yes" in status_text:
                    print(f"{question_uuid} results are ready, retrieving.")
                    results_response = requests.get(
                        base_url,
                        params=get_results_params,
                        timeout=10,
                    )
                    results_response.raise_for_status()
                    return results_response.text
                return "No hits found"
            if attempt == retries - 1:
                msg = "Maximum attempts reached. Results may not be ready."
                raise TimeoutError(msg)
        return None

    def fetch_results(
        self,
        query_model: BlastQueryParameters,
        retries: int = 20,
    ) -> str:
        """Submit request and fetch results from BLAST API.

        Wraps individual submission and retrieval of results.

        Args:
        ----
            query_model: the Pydantic model of the query

            retries: the number of maximum retries

        Returns:
        -------
            str: the result from the BLAST API

        """
        rid = self._submit_query(request_data=query_model)
        return self._fetch_results(
            rid=rid,
            question_uuid=query_model.question_uuid,
            retries=retries,
        )
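
How _submit_query maps model fields to request keys can be illustrated offline. The sketch below makes no network request; SubmissionParameters is a hypothetical dataclass standing in for the Pydantic model, and its field values (including the EXPECT entry in other_params) are illustrative assumptions rather than recommended settings. It also strips a trailing "?" from the base URL so the separator is not doubled when the query string is appended.

```python
from dataclasses import dataclass, field
from urllib.parse import urlencode


@dataclass
class SubmissionParameters:
    """Stand-in for the Pydantic query model; all values are illustrative."""

    url: str = "https://blast.ncbi.nlm.nih.gov/Blast.cgi?"
    cmd: str = "Put"
    program: str = "blastn"
    database: str = "nt"
    query: str = "ACGTACGTACGT"
    format_type: str = "XML"
    megablast: str = "on"
    max_hits: int = 5
    other_params: dict = field(default_factory=dict)


def build_payload(params: SubmissionParameters) -> dict:
    """Map model fields to the upper-case keys used in `_submit_query`."""
    payload = {
        "CMD": params.cmd,
        "PROGRAM": params.program,
        "DATABASE": params.database,
        "QUERY": params.query,
        "FORMAT_TYPE": params.format_type,
        "MEGABLAST": params.megablast,
        "HITLIST_SIZE": params.max_hits,
    }
    # Extra parameters are merged last, so they can override the defaults.
    payload.update(params.other_params)
    return payload


def build_full_url(params: SubmissionParameters) -> str:
    # Strip a trailing "?" so the separator is not doubled.
    return f"{params.url.rstrip('?')}?{urlencode(build_payload(params))}"


params = SubmissionParameters(other_params={"EXPECT": "0.001"})
print(build_full_url(params))
```

The full URL is only used for display in the listing; the actual submission sends the same dictionary as POST form data.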

fetch_results(query_model, retries=20)

Submit request and fetch results from BLAST API.

Wraps individual submission and retrieval of results.


Args:
query_model: the Pydantic model of the query

retries: the number of maximum retries

Returns:
str: the result from the BLAST API
Source code in biochatter/api_agent/blast.py
def fetch_results(
    self,
    query_model: BlastQueryParameters,
    retries: int = 20,
) -> str:
    """Submit request and fetch results from BLAST API.

    Wraps individual submission and retrieval of results.

    Args:
    ----
        query_model: the Pydantic model of the query

        retries: the number of maximum retries

    Returns:
    -------
        str: the result from the BLAST API

    """
    rid = self._submit_query(request_data=query_model)
    return self._fetch_results(
        rid=rid,
        question_uuid=query_model.question_uuid,
        retries=retries,
    )
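
The submit-then-poll logic wrapped by fetch_results can be exercised without contacting NCBI. The sketch below reuses the regular expression and status markers from the class listing; extract_rid, classify_status, and poll_until_done are hypothetical helper names, the sample response strings are made up to contain those markers, and treating unrecognised responses as "keep polling" is an assumption of this sketch.

```python
import re
import time


def extract_rid(submission_text: str) -> str:
    """Pull the request ID out of a submission response."""
    match = re.search(r"RID = (\w+)", submission_text)
    if match is None:
        raise ValueError("RID not found in BLAST submission response.")
    return match.group(1)


def classify_status(status_text: str) -> str:
    """Translate the status markers into a small set of states."""
    if "Status=WAITING" in status_text:
        return "waiting"
    if "Status=FAILED" in status_text:
        return "failed"
    if "Status=UNKNOWN" in status_text:
        return "unknown"
    if "Status=READY" in status_text:
        return "ready" if "ThereAreHits=yes" in status_text else "no_hits"
    return "unrecognised"


def poll_until_done(get_status_text, retries: int = 20, delay: float = 0.0) -> str:
    """Call `get_status_text` until a final state is seen or retries run out."""
    for _ in range(retries):
        state = classify_status(get_status_text())
        if state in {"waiting", "unrecognised"}:
            time.sleep(delay)
            continue
        return state
    raise TimeoutError("Maximum attempts reached. Results may not be ready.")


# Made-up responses that only contain the markers the code looks for.
responses = iter(
    ["Status=WAITING", "Status=WAITING", "Status=READY\nThereAreHits=yes"]
)
print(extract_rid("    RID = ABC123XYZ\n    RTOE = 30"))
print(poll_until_done(lambda: next(responses), retries=5))
```

Injecting the status source as a callable keeps the polling loop testable; in a real fetcher that callable would perform the HTTP GET with the request ID.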

BlastInterpreter

Bases: BaseInterpreter

A class for interpreting BLAST results.

Source code in biochatter/api_agent/blast.py
class BlastInterpreter(BaseInterpreter):
    """A class for interpreting BLAST results."""

    def summarise_results(
        self,
        question: str,
        conversation_factory: Callable,
        response_text: str,
    ) -> str:
        """Extract the answer from the BLAST results.

        Args:
        ----
            question (str): The question to be answered.
            conversation_factory (Callable): A function that creates a
                BioChatter conversation.
            response_text (str): The response.text returned by NCBI.

        Returns:
        -------
            str: The extracted answer from the BLAST results.

        """
        prompt = ChatPromptTemplate.from_messages(
            [
                (
                    "system",
                    "You are a world class molecular biologist who knows everything about NCBI and BLAST results.",
                ),
                ("user", "{input}"),
            ],
        )
        summary_prompt = BLAST_SUMMARY_PROMPT.format(
            question=question,
            context=response_text,
        )
        output_parser = StrOutputParser()
        conversation = conversation_factory()
        chain = prompt | conversation.chat | output_parser
        return chain.invoke({"input": summary_prompt})

summarise_results(question, conversation_factory, response_text)

Extract the answer from the BLAST results.


Args:
question (str): The question to be answered.
conversation_factory (Callable): A function that creates a BioChatter conversation.
response_text (str): The response.text returned by NCBI.

Returns:
str: The extracted answer from the BLAST results.
Source code in biochatter/api_agent/blast.py
def summarise_results(
    self,
    question: str,
    conversation_factory: Callable,
    response_text: str,
) -> str:
    """Extract the answer from the BLAST results.

    Args:
    ----
        question (str): The question to be answered.
        conversation_factory (Callable): A function that creates a
            BioChatter conversation.
        response_text (str): The response.text returned by NCBI.

    Returns:
    -------
        str: The extracted answer from the BLAST results.

    """
    prompt = ChatPromptTemplate.from_messages(
        [
            (
                "system",
                "You are a world class molecular biologist who knows everything about NCBI and BLAST results.",
            ),
            ("user", "{input}"),
        ],
    )
    summary_prompt = BLAST_SUMMARY_PROMPT.format(
        question=question,
        context=response_text,
    )
    output_parser = StrOutputParser()
    conversation = conversation_factory()
    chain = prompt | conversation.chat | output_parser
    return chain.invoke({"input": summary_prompt})

BlastQueryBuilder

Bases: BaseQueryBuilder

A class for building a BlastQueryParameters object.

Source code in biochatter/api_agent/blast.py
class BlastQueryBuilder(BaseQueryBuilder):
    """A class for building a BlastQuery object."""

    def create_runnable(
        self,
        query_parameters: "BlastQueryParameters",
        conversation: "Conversation",
    ) -> Callable:
        """Create a runnable object for executing queries.

        Creates a runnable using the LangChain
        `create_structured_output_runnable` method.

        Args:
        ----
            query_parameters: A Pydantic data model that specifies the fields of
                the API that should be queried.

            conversation: A BioChatter conversation object.

        Returns:
        -------
            A Callable object that can execute the query.

        """
        return create_structured_output_runnable(
            output_schema=query_parameters,
            llm=conversation.chat,
            prompt=self.structured_output_prompt,
        )

    def parameterise_query(
        self,
        question: str,
        conversation: "Conversation",
    ) -> BlastQueryParameters:
        """Generate a BlastQuery object.

        Generates the object based on the given question, prompt, and
        BioChatter conversation. Uses a Pydantic model to define the API fields.
        Creates a runnable that can be invoked on LLMs that are qualified to
        parameterise functions.

        Args:
        ----
            question (str): The question to be answered.

            conversation: The conversation object used for parameterising the
                BlastQuery.

        Returns:
        -------
            BlastQuery: the parameterised query object (Pydantic model)

        """
        runnable = self.create_runnable(
            query_parameters=BlastQueryParameters,
            conversation=conversation,
        )
        blast_call_obj = runnable.invoke(
            {"input": f"Answer:\n{question} based on:\n {BLAST_QUERY_PROMPT}"},
        )
        blast_call_obj.question_uuid = str(uuid.uuid4())
        return blast_call_obj

create_runnable(query_parameters, conversation)

Create a runnable object for executing queries.

Creates a runnable using the LangChain create_structured_output_runnable method.


Args:
query_parameters: A Pydantic data model that specifies the fields of
    the API that should be queried.

conversation: A BioChatter conversation object.

Returns:
A Callable object that can execute the query.
Source code in biochatter/api_agent/blast.py
def create_runnable(
    self,
    query_parameters: "BlastQueryParameters",
    conversation: "Conversation",
) -> Callable:
    """Create a runnable object for executing queries.

    Creates a runnable using the LangChain
    `create_structured_output_runnable` method.

    Args:
    ----
        query_parameters: A Pydantic data model that specifies the fields of
            the API that should be queried.

        conversation: A BioChatter conversation object.

    Returns:
    -------
        A Callable object that can execute the query.

    """
    return create_structured_output_runnable(
        output_schema=query_parameters,
        llm=conversation.chat,
        prompt=self.structured_output_prompt,
    )

parameterise_query(question, conversation)

Generate a BlastQuery object.

Generates the object based on the given question, prompt, and BioChatter conversation. Uses a Pydantic model to define the API fields. Creates a runnable that can be invoked on LLMs that are qualified to parameterise functions.


question (str): The question to be answered.

conversation: The conversation object used for parameterising the
    BlastQuery.

BlastQueryParameters: the parameterised query object (Pydantic model)
Source code in biochatter/api_agent/blast.py
def parameterise_query(
    self,
    question: str,
    conversation: "Conversation",
) -> BlastQueryParameters:
    """Generate a BlastQuery object.

    Generates the object based on the given question, prompt, and
    BioChatter conversation. Uses a Pydantic model to define the API fields.
    Creates a runnable that can be invoked on LLMs that are qualified to
    parameterise functions.

    Args:
    ----
        question (str): The question to be answered.

        conversation: The conversation object used for parameterising the
            BlastQuery.

    Returns:
    -------
        BlastQueryParameters: the parameterised query object (Pydantic model)

    """
    runnable = self.create_runnable(
        query_parameters=BlastQueryParameters,
        conversation=conversation,
    )
    blast_call_obj = runnable.invoke(
        {"input": f"Answer:\n{question} based on:\n {BLAST_QUERY_PROMPT}"},
    )
    blast_call_obj.question_uuid = str(uuid.uuid4())
    return blast_call_obj

BlastQueryParameters

Bases: BaseModel

Pydantic model for the parameters of a BLAST query request.

The class is used for configuring and sending a request to the NCBI BLAST query API. The fields are dynamically configured by the LLM based on the user's question.

Source code in biochatter/api_agent/blast.py
class BlastQueryParameters(BaseModel):
    """Pydantic model for the parameters of a BLAST query request.

    The class is used for configuring and sending a request to the NCBI BLAST
    query API. The fields are dynamically configured by the LLM based on the
    user's question.

    """

    url: str | None = Field(
        default="https://blast.ncbi.nlm.nih.gov/Blast.cgi?",
        description="ALWAYS USE DEFAULT, DO NOT CHANGE",
    )
    cmd: str | None = Field(
        default="Put",
        description="Command to execute, 'Put' for submitting query, 'Get' for retrieving results.",
    )
    program: str | None = Field(
        default="blastn",
        description=(
            "BLAST program to use, e.g., 'blastn' for nucleotide-nucleotide BLAST, "
            "'blastp' for protein-protein BLAST."
        ),
    )
    database: str | None = Field(
        default="nt",
        description=(
            "Database to search, e.g., 'nt' for nucleotide database, 'nr' for "
            "non redundant protein database, 'pdb' the Protein Data Bank "
            "database, which is used specifically for protein structures, "
            "'refseq_rna' and 'refseq_genomic': specialized databases for "
            "RNA sequences and genomic sequences"
        ),
    )
    query: str | None = Field(
        None,
        description=(
            "Nucleotide or protein sequence for the BLAST or blat query, "
            "make sure to always keep the entire sequence given."
        ),
    )
    format_type: str | None = Field(
        default="Text",
        description="Format of the BLAST results, e.g., 'Text', 'XML'.",
    )
    rid: str | None = Field(
        None,
        description="Request ID for retrieving BLAST results.",
    )
    other_params: dict | None = Field(
        default={"email": "user@example.com"},
        description="Other optional BLAST parameters, including user email.",
    )
    max_hits: int | None = Field(
        default=15,
        description="Maximum number of hits to return in the BLAST results.",
    )
    sort_by: str | None = Field(
        default="score",
        description="Criterion to sort BLAST results by, e.g., 'score', 'evalue'.",
    )
    megablast: str | None = Field(
        default="on",
        description="Set to 'on' for human genome alignments",
    )
    question_uuid: str | None = Field(
        default_factory=lambda: str(uuid.uuid4()),
        description="Unique identifier for the question.",
    )
    full_url: str | None = Field(
        default="TBF",
        description="Full URL to be used to submit the BLAST query",
    )
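To illustrate how such a parameter set could end up in a submission request, here is a minimal sketch using only the standard library. `BlastParamsSketch` and `build_submit_url` are hypothetical stand-ins, not part of BioChatter; the upper-case keys follow the NCBI BLAST URL API naming, but how BioChatter maps the model fields onto them is assumed here.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass
class BlastParamsSketch:
    """Hypothetical stand-in mirroring a few BlastQueryParameters fields."""

    url: str = "https://blast.ncbi.nlm.nih.gov/Blast.cgi?"
    cmd: str = "Put"
    program: str = "blastn"
    database: str = "nt"
    query: str = ""
    megablast: str = "on"
    max_hits: int = 15


def build_submit_url(params: BlastParamsSketch) -> str:
    """Assemble a submission URL; the field-to-key mapping is an assumption."""
    if not params.query:
        raise ValueError("a query sequence is required for submission")
    fields = {
        "CMD": params.cmd,
        "PROGRAM": params.program,
        "DATABASE": params.database,
        "QUERY": params.query,
        "MEGABLAST": params.megablast,
        "HITLIST_SIZE": params.max_hits,
    }
    # The default base URL already ends in "?", so the encoded pairs
    # are appended directly.
    return params.url + urlencode(fields)


submit_url = build_submit_url(BlastParamsSketch(query="ATGGCCATTGTAATGGGCCGC"))
print(submit_url)
```

Keeping the defaults on the model means the LLM only has to fill in the fields that depend on the question, such as `query`, `program`, and `database`.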

The API Agent for the OncoKB tool

OncoKBFetcher

Bases: BaseFetcher

A class for retrieving API results from OncoKB given a parameterized OncoKBQuery.

Source code in biochatter/api_agent/oncokb.py
class OncoKBFetcher(BaseFetcher):
    """A class for retrieving API results from OncoKB given a parameterized
    OncoKBQuery.
    """

    def __init__(self, api_token="demo"):
        self.headers = {
            "Authorization": f"Bearer {api_token}",
            "Accept": "application/json",
        }
        self.base_url = "https://demo.oncokb.org/api/v1"

    def fetch_results(
        self,
        request_data: OncoKBQueryParameters,
        retries: int | None = 3,
    ) -> str:
        """Function to submit the OncoKB query and fetch the results directly.
        No multi-step procedure, thus no wrapping of submission and retrieval in
        this case.

        Args:
        ----
            request_data: OncoKBQuery object (Pydantic model) containing the
                OncoKB query parameters.

        Returns:
        -------
            str: The results of the OncoKB query.

        """
        # Submit the query and get the URL
        params = request_data.dict(exclude_unset=True)
        endpoint = params.pop("endpoint")
        params.pop("question_uuid")
        full_url = f"{self.base_url}/{endpoint}"
        response = requests.get(full_url, headers=self.headers, params=params)
        response.raise_for_status()

        # Fetch the results from the URL
        results_response = requests.get(response.url, headers=self.headers)
        results_response.raise_for_status()

        return results_response.text

fetch_results(request_data, retries=3)

Function to submit the OncoKB query and fetch the results directly. No multi-step procedure, thus no wrapping of submission and retrieval in this case.


request_data: OncoKBQuery object (Pydantic model) containing the
    OncoKB query parameters.

str: The results of the OncoKB query.
Source code in biochatter/api_agent/oncokb.py
def fetch_results(
    self,
    request_data: OncoKBQueryParameters,
    retries: int | None = 3,
) -> str:
    """Function to submit the OncoKB query and fetch the results directly.
    No multi-step procedure, thus no wrapping of submission and retrieval in
    this case.

    Args:
    ----
        request_data: OncoKBQuery object (Pydantic model) containing the
            OncoKB query parameters.

    Returns:
    -------
        str: The results of the OncoKB query.

    """
    # Submit the query and get the URL
    params = request_data.dict(exclude_unset=True)
    endpoint = params.pop("endpoint")
    params.pop("question_uuid")
    full_url = f"{self.base_url}/{endpoint}"
    response = requests.get(full_url, headers=self.headers, params=params)
    response.raise_for_status()

    # Fetch the results from the URL
    results_response = requests.get(response.url, headers=self.headers)
    results_response.raise_for_status()

    return results_response.text
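`fetch_results` accepts a `retries` argument, but the body shown above does not use it. The following is a minimal sketch of how a retry loop could wrap a fetch call. `fetch_with_retries` and `flaky_fetch` are hypothetical names, and `ConnectionError` stands in for whatever transient failure a real HTTP client would raise.

```python
import time
from typing import Callable


def fetch_with_retries(
    fetch: Callable[[], str],
    retries: int = 3,
    delay_seconds: float = 0.0,
) -> str:
    """Call `fetch` up to `retries` times and return the first success."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return fetch()
        except ConnectionError as error:  # stand-in for a transient failure
            last_error = error
            time.sleep(delay_seconds * attempt)  # simple linear back-off
    raise RuntimeError(f"giving up after {retries} attempts") from last_error


# Hypothetical flaky fetch that fails twice before succeeding.
calls = {"count": 0}


def flaky_fetch() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return '{"status": "ok"}'


result = fetch_with_retries(flaky_fetch, retries=3)
print(result, calls["count"])
```

Passing the fetch as a zero-argument callable keeps the retry logic independent of the request details, so the same helper could wrap any fetcher.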

OncoKBInterpreter

Bases: BaseInterpreter

Source code in biochatter/api_agent/oncokb.py
class OncoKBInterpreter(BaseInterpreter):
    def summarise_results(
        self,
        question: str,
        conversation_factory: Callable,
        response_text: str,
    ) -> str:
        """Function to extract the answer from the OncoKB results.

        Args:
        ----
            question (str): The question to be answered.
            conversation_factory: A callable that returns a BioChatter
                conversation object.
            response_text (str): The response.text returned by OncoKB.

        Returns:
        -------
            str: The extracted answer from the OncoKB results.

        """
        prompt = ChatPromptTemplate.from_messages(
            [
                (
                    "system",
                    "You are a world class molecular biologist who knows "
                    "everything about OncoKB and cancer genomics. Your task is "
                    "to interpret results from OncoKB API calls and summarise "
                    "them for the user.",
                ),
                ("user", "{input}"),
            ],
        )
        summary_prompt = ONCOKB_SUMMARY_PROMPT.format(
            question=question,
            context=response_text,
        )
        output_parser = StrOutputParser()
        conversation = conversation_factory()
        chain = prompt | conversation.chat | output_parser
        answer = chain.invoke({"input": summary_prompt})
        return answer

summarise_results(question, conversation_factory, response_text)

Function to extract the answer from the OncoKB results.


question (str): The question to be answered.
conversation_factory: A callable that returns a BioChatter conversation object.
response_text (str): The response.text returned by OncoKB.

str: The extracted answer from the OncoKB results.
Source code in biochatter/api_agent/oncokb.py
def summarise_results(
    self,
    question: str,
    conversation_factory: Callable,
    response_text: str,
) -> str:
    """Function to extract the answer from the OncoKB results.

    Args:
    ----
        question (str): The question to be answered.
        conversation_factory: A callable that returns a BioChatter
            conversation object.
        response_text (str): The response.text returned by OncoKB.

    Returns:
    -------
        str: The extracted answer from the OncoKB results.

    """
    prompt = ChatPromptTemplate.from_messages(
        [
            (
                "system",
                "You are a world class molecular biologist who knows "
                "everything about OncoKB and cancer genomics. Your task is "
                "to interpret results from OncoKB API calls and summarise "
                "them for the user.",
            ),
            ("user", "{input}"),
        ],
    )
    summary_prompt = ONCOKB_SUMMARY_PROMPT.format(
        question=question,
        context=response_text,
    )
    output_parser = StrOutputParser()
    conversation = conversation_factory()
    chain = prompt | conversation.chat | output_parser
    answer = chain.invoke({"input": summary_prompt})
    return answer
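Because `summarise_results` calls `conversation_factory()` rather than receiving a conversation directly, each call can work on a fresh conversation. The sketch below illustrates that pattern with hypothetical stand-ins (`ConversationSketch`, `summarise_sketch`); the motivation of avoiding shared message history is an assumption, not something the source states.

```python
from typing import Callable


class ConversationSketch:
    """Hypothetical stand-in for a conversation that accumulates messages."""

    def __init__(self) -> None:
        self.messages = []


def summarise_sketch(
    question: str,
    conversation_factory: Callable[[], ConversationSketch],
) -> ConversationSketch:
    # A new conversation is created on every call, mirroring
    # `conversation = conversation_factory()` in the method above.
    conversation = conversation_factory()
    conversation.messages.append(question)
    return conversation


first = summarise_sketch("Is BRAF V600E oncogenic?", ConversationSketch)
second = summarise_sketch("What is the role of TP53?", ConversationSketch)
print(first.messages, second.messages)
```

Any zero-argument callable works as the factory, including a class, a `lambda`, or a `functools.partial` that pre-binds model settings.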

OncoKBQueryBuilder

Bases: BaseQueryBuilder

A class for building an OncoKBQuery object.

Source code in biochatter/api_agent/oncokb.py
class OncoKBQueryBuilder(BaseQueryBuilder):
    """A class for building an OncoKBQuery object."""

    def create_runnable(
        self,
        query_parameters: "OncoKBQueryParameters",
        conversation: "Conversation",
    ) -> Callable:
        """Creates a runnable object for executing queries using the LangChain
        `create_structured_output_runnable` method.

        Args:
        ----
            query_parameters: A Pydantic data model that specifies the fields of
                the API that should be queried.

            conversation: A BioChatter conversation object.

        Returns:
        -------
            A Callable object that can execute the query.

        """
        return create_structured_output_runnable(
            output_schema=query_parameters,
            llm=conversation.chat,
            prompt=self.structured_output_prompt,
        )

    def parameterise_query(
        self,
        question: str,
        conversation: "Conversation",
    ) -> OncoKBQueryParameters:
        """Generates an OncoKBQuery object based on the given question, prompt, and
        BioChatter conversation. Uses a Pydantic model to define the API fields.
        Creates a runnable that can be invoked on LLMs that are qualified to
        parameterise functions.

        Args:
        ----
            question (str): The question to be answered.

            conversation: The conversation object used for parameterising the
                OncoKBQuery.

        Returns:
        -------
            OncoKBQueryParameters: the parameterised query object (Pydantic model)

        """
        runnable = self.create_runnable(
            query_parameters=OncoKBQueryParameters,
            conversation=conversation,
        )
        oncokb_call_obj = runnable.invoke(
            {"input": f"Answer:\n{question} based on:\n {ONCOKB_QUERY_PROMPT}"},
        )
        oncokb_call_obj.question_uuid = str(uuid.uuid4())
        return oncokb_call_obj

create_runnable(query_parameters, conversation)

Creates a runnable object for executing queries using the LangChain create_structured_output_runnable method.


query_parameters: A Pydantic data model that specifies the fields of
    the API that should be queried.

conversation: A BioChatter conversation object.

A Callable object that can execute the query.
Source code in biochatter/api_agent/oncokb.py
def create_runnable(
    self,
    query_parameters: "OncoKBQueryParameters",
    conversation: "Conversation",
) -> Callable:
    """Creates a runnable object for executing queries using the LangChain
    `create_structured_output_runnable` method.

    Args:
    ----
        query_parameters: A Pydantic data model that specifies the fields of
            the API that should be queried.

        conversation: A BioChatter conversation object.

    Returns:
    -------
        A Callable object that can execute the query.

    """
    return create_structured_output_runnable(
        output_schema=query_parameters,
        llm=conversation.chat,
        prompt=self.structured_output_prompt,
    )

parameterise_query(question, conversation)

Generates an OncoKBQuery object based on the given question, prompt, and BioChatter conversation. Uses a Pydantic model to define the API fields. Creates a runnable that can be invoked on LLMs that are qualified to parameterise functions.


question (str): The question to be answered.

conversation: The conversation object used for parameterising the
    OncoKBQuery.

OncoKBQueryParameters: the parameterised query object (Pydantic model)
Source code in biochatter/api_agent/oncokb.py
def parameterise_query(
    self,
    question: str,
    conversation: "Conversation",
) -> OncoKBQueryParameters:
    """Generates an OncoKBQuery object based on the given question, prompt, and
    BioChatter conversation. Uses a Pydantic model to define the API fields.
    Creates a runnable that can be invoked on LLMs that are qualified to
    parameterise functions.

    Args:
    ----
        question (str): The question to be answered.

        conversation: The conversation object used for parameterising the
            OncoKBQuery.

    Returns:
    -------
        OncoKBQueryParameters: the parameterised query object (Pydantic model)

    """
    runnable = self.create_runnable(
        query_parameters=OncoKBQueryParameters,
        conversation=conversation,
    )
    oncokb_call_obj = runnable.invoke(
        {"input": f"Answer:\n{question} based on:\n {ONCOKB_QUERY_PROMPT}"},
    )
    oncokb_call_obj.question_uuid = str(uuid.uuid4())
    return oncokb_call_obj
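The input handed to the runnable is a single string that combines the user's question with the API-specific prompt. A minimal sketch of that composition follows; `build_runnable_input` and `QUERY_PROMPT_PLACEHOLDER` are hypothetical, and the placeholder text does not reproduce the real `ONCOKB_QUERY_PROMPT`.

```python
import uuid

# Placeholder text; the real ONCOKB_QUERY_PROMPT is defined in BioChatter
# and is not reproduced here.
QUERY_PROMPT_PLACEHOLDER = "<API-specific instructions and examples>"


def build_runnable_input(question: str, query_prompt: str) -> dict:
    """Compose the single `input` entry passed to `runnable.invoke`."""
    return {"input": f"Answer:\n{question} based on:\n {query_prompt}"}


payload = build_runnable_input(
    "What is the oncogenic potential of BRAF V600E?",
    QUERY_PROMPT_PLACEHOLDER,
)
# A fresh identifier, analogous to the one assigned to `question_uuid`
# after the runnable returns.
question_uuid = str(uuid.uuid4())
print(payload["input"])
```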