API Agent Reference

This module connects BioChatter to external software tools by having the LLM parameterise the API calls.

The abstract base classes

BaseFetcher

Bases: ABC

Abstract base class for fetchers. A fetcher is responsible for submitting queries (in systems where submission and fetching are separate) and for fetching and saving the results of queries. It has to implement a fetch_results() method, which can wrap a multi-step procedure to submit and retrieve. Implementations should retry to account for connectivity issues or long processing times.

Source code in biochatter/api_agent/abc.py
class BaseFetcher(ABC):
    """
    Abstract base class for fetchers. A fetcher is responsible for submitting
    queries (in systems where submission and fetching are separate) and for
    fetching and saving the results of queries. It has to implement a
    `fetch_results()` method, which can wrap a multi-step procedure to submit
    and retrieve. Implementations should retry to account for connectivity
    issues or long processing times.
    """

    @abstractmethod
    def fetch_results(
        self,
        query_model: BaseModel,
        retries: Optional[int] = 3,
    ):
        """
        Fetches results by submitting a query. Can implement a multi-step
        procedure if submitting and fetching are distinct processes (e.g.,
        when processing times are long, as with BLAST).

        Args:
            query_model: the Pydantic model describing the parameterised query

            retries: the maximum number of retries
        """
        pass

fetch_results(query_model, retries=3) abstractmethod

Fetches results by submitting a query. Can implement a multi-step procedure if submitting and fetching are distinct processes (e.g., when processing times are long, as with BLAST).

Parameters:

Name Type Description Default
query_model BaseModel

the Pydantic model describing the parameterised query

required
retries Optional[int]

the maximum number of retries

3
Source code in biochatter/api_agent/abc.py
@abstractmethod
def fetch_results(
    self,
    query_model: BaseModel,
    retries: Optional[int] = 3,
):
    """
    Fetches results by submitting a query. Can implement a multi-step
    procedure if submitting and fetching are distinct processes (e.g.,
    when processing times are long, as with BLAST).

    Args:
        query_model: the Pydantic model describing the parameterised query

        retries: the maximum number of retries
    """
    pass
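
A minimal sketch of how a concrete fetcher might honour the `retries` argument. It is dependency-free on purpose: the local `BaseFetcher` only mirrors the interface shown above, and `EchoQuery`, `RetryingFetcher`, and `make_flaky_send` are hypothetical names that do not exist in BioChatter. A dataclass stands in for the Pydantic query model.

```python
import time
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, Optional


class BaseFetcher(ABC):
    """Local stand-in mirroring the abstract interface documented above."""

    @abstractmethod
    def fetch_results(self, query_model, retries: Optional[int] = 3):
        ...


@dataclass
class EchoQuery:
    """Hypothetical query model; a real one would be a Pydantic BaseModel."""

    text: str


class RetryingFetcher(BaseFetcher):
    """Calls `send` until it succeeds or the retry budget is used up."""

    def __init__(self, send: Callable[[EchoQuery], str], wait_seconds: float = 0.0):
        self.send = send
        self.wait_seconds = wait_seconds

    def fetch_results(self, query_model: EchoQuery, retries: Optional[int] = 3) -> str:
        attempts = retries if retries is not None else 3
        last_error: Optional[Exception] = None
        for _ in range(attempts):
            try:
                return self.send(query_model)
            except ConnectionError as error:
                # Remember the failure and wait before the next attempt.
                last_error = error
                time.sleep(self.wait_seconds)
        raise TimeoutError(f"No result after {attempts} attempts") from last_error


def make_flaky_send(failures: int) -> Callable[[EchoQuery], str]:
    """Build a fake transport that fails `failures` times, then echoes."""
    state = {"calls": 0}

    def send(query: EchoQuery) -> str:
        state["calls"] += 1
        if state["calls"] <= failures:
            raise ConnectionError("simulated network error")
        return f"result for {query.text}"

    return send


fetcher = RetryingFetcher(make_flaky_send(failures=2))
result = fetcher.fetch_results(EchoQuery(text="ping"), retries=3)
print(result)
```

Injecting the transport as a callable keeps the retry logic testable without network access; a real fetcher would place its HTTP calls where `send` is invoked.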

BaseInterpreter

Bases: ABC

Abstract base class for result interpreters. The interpreter is aware of the nature and structure of the results and can extract and summarise information from them.

Source code in biochatter/api_agent/abc.py
class BaseInterpreter(ABC):
    """
    Abstract base class for result interpreters. The interpreter is aware of the
    nature and structure of the results and can extract and summarise
    information from them.
    """

    @abstractmethod
    def summarise_results(
        self,
        question: str,
        conversation_factory: Callable,
        response_text: str,
    ) -> str:
        """
        Summarises the API response into an answer to the given question.

        Args:
            question (str): The question that was asked.

            conversation_factory (Callable): A function that creates a
                BioChatter conversation.

            response_text (str): The response.text returned from the request.

        Returns:
            A summary of the answer.

        Todo:
            Genericise (remove file path and n_lines parameters, and use a
            generic way to get the results). The child classes should manage the
            specifics of the results.
        """
        pass

summarise_results(question, conversation_factory, response_text) abstractmethod

Summarises the API response into an answer to the given question.

Parameters:

Name Type Description Default
question str

The question that was asked.

required
conversation_factory Callable

A function that creates a BioChatter conversation.

required
response_text str

The response.text returned from the request.

required

Returns:

Type Description
str

A summary of the answer.

Todo

Genericise (remove file path and n_lines parameters, and use a generic way to get the results). The child classes should manage the specifics of the results.

Source code in biochatter/api_agent/abc.py
@abstractmethod
def summarise_results(
    self,
    question: str,
    conversation_factory: Callable,
    response_text: str,
) -> str:
    """
    Summarises the API response into an answer to the given question.

    Args:
        question (str): The question that was asked.

        conversation_factory (Callable): A function that creates a
            BioChatter conversation.

        response_text (str): The response.text returned from the request.

    Returns:
        A summary of the answer.

    Todo:
        Genericise (remove file path and n_lines parameters, and use a
        generic way to get the results). The child classes should manage the
        specifics of the results.
    """
    pass
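
The following sketch illustrates one way a child class could implement `summarise_results`. The local `BaseInterpreter` only mirrors the interface above. `StubConversation` and its `answer` method are hypothetical placeholders for an LLM-backed conversation and are not BioChatter APIs; `FirstLinesInterpreter` is likewise an invented name.

```python
from abc import ABC, abstractmethod
from typing import Callable


class BaseInterpreter(ABC):
    """Local stand-in mirroring the abstract interface documented above."""

    @abstractmethod
    def summarise_results(
        self, question: str, conversation_factory: Callable, response_text: str
    ) -> str:
        ...


class StubConversation:
    """Hypothetical LLM wrapper; `answer` is not a BioChatter method."""

    def __init__(self):
        self.last_prompt = ""

    def answer(self, prompt: str) -> str:
        # A real conversation would send the prompt to an LLM here.
        self.last_prompt = prompt
        return f"(summary of {len(prompt)} prompt characters)"


class FirstLinesInterpreter(BaseInterpreter):
    """Keeps the first `max_lines` lines of the response, then asks the LLM."""

    def __init__(self, max_lines: int = 5):
        self.max_lines = max_lines

    def summarise_results(self, question, conversation_factory, response_text):
        excerpt = "\n".join(response_text.splitlines()[: self.max_lines])
        prompt = f"Question: {question}\nResults:\n{excerpt}"
        conversation = conversation_factory()  # factory is called, not reused
        return conversation.answer(prompt)


shared_conversation = StubConversation()
interpreter = FirstLinesInterpreter(max_lines=2)
summary = interpreter.summarise_results(
    question="Which hit is best?",
    conversation_factory=lambda: shared_conversation,
    response_text="hit A\nhit B\nhit C",
)
print(summary)
```

The interpreter receives a factory rather than a conversation so that each summary can start from a clean conversation state; the knowledge of the result structure (here, "one hit per line") lives in the child class.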

BaseQueryBuilder

Bases: ABC

An abstract base class for query builders.

Source code in biochatter/api_agent/abc.py
class BaseQueryBuilder(ABC):
    """
    An abstract base class for query builders.
    """

    @property
    def structured_output_prompt(self) -> ChatPromptTemplate:
        """
        Defines a structured output prompt template. This provides a default
        implementation for an API agent that can be overridden by subclasses to
        return a ChatPromptTemplate-compatible object.
        """
        return ChatPromptTemplate.from_messages(
            [
                (
                    "system",
                    "You are a world class algorithm for extracting information in structured formats.",
                ),
                (
                    "human",
                    "Use the given format to extract information from the following input: {input}",
                ),
                ("human", "Tip: Make sure to answer in the correct format"),
            ]
        )

    @abstractmethod
    def create_runnable(
        self,
        query_parameters: "BaseModel",
        conversation: "Conversation",
    ) -> Callable:
        """
        Creates a runnable object for executing queries. Must be implemented by
        subclasses. Should use the LangChain `create_structured_output_runnable`
        method to generate the Callable.

        Args:
            query_parameters: A Pydantic data model that specifies the fields of
                the API that should be queried.

            conversation: A BioChatter conversation object.

        Returns:
            A Callable object that can execute the query.
        """
        pass

    @abstractmethod
    def parameterise_query(
        self,
        question: str,
        conversation: "Conversation",
    ) -> BaseModel:
        """

        Parameterises a query object (a Pydantic model with the fields of the
        API) based on the given question using a BioChatter conversation
        instance. Must be implemented by subclasses.

        Args:
            question (str): The question to be answered.

            conversation: The BioChatter conversation object containing the LLM
                that should parameterise the query.

        Returns:
            A parameterised instance of the query object (Pydantic BaseModel)
        """
        pass

structured_output_prompt: ChatPromptTemplate property

Defines a structured output prompt template. This provides a default implementation for an API agent that can be overridden by subclasses to return a ChatPromptTemplate-compatible object.

create_runnable(query_parameters, conversation) abstractmethod

Creates a runnable object for executing queries. Must be implemented by subclasses. Should use the LangChain create_structured_output_runnable method to generate the Callable.

Parameters:

Name Type Description Default
query_parameters BaseModel

A Pydantic data model that specifies the fields of the API that should be queried.

required
conversation Conversation

A BioChatter conversation object.

required

Returns:

Type Description
Callable

A Callable object that can execute the query.

Source code in biochatter/api_agent/abc.py
@abstractmethod
def create_runnable(
    self,
    query_parameters: "BaseModel",
    conversation: "Conversation",
) -> Callable:
    """
    Creates a runnable object for executing queries. Must be implemented by
    subclasses. Should use the LangChain `create_structured_output_runnable`
    method to generate the Callable.

    Args:
        query_parameters: A Pydantic data model that specifies the fields of
            the API that should be queried.

        conversation: A BioChatter conversation object.

    Returns:
        A Callable object that can execute the query.
    """
    pass

parameterise_query(question, conversation) abstractmethod

Parameterises a query object (a Pydantic model with the fields of the API) based on the given question using a BioChatter conversation instance. Must be implemented by subclasses.

Parameters:

Name Type Description Default
question str

The question to be answered.

required
conversation Conversation

The BioChatter conversation object containing the LLM that should parameterise the query.

required

Returns:

Type Description
BaseModel

A parameterised instance of the query object (Pydantic BaseModel)

Source code in biochatter/api_agent/abc.py
@abstractmethod
def parameterise_query(
    self,
    question: str,
    conversation: "Conversation",
) -> BaseModel:
    """

    Parameterises a query object (a Pydantic model with the fields of the
    API) based on the given question using a BioChatter conversation
    instance. Must be implemented by subclasses.

    Args:
        question (str): The question to be answered.

        conversation: The BioChatter conversation object containing the LLM
            that should parameterise the query.

    Returns:
        A parameterised instance of the query object (Pydantic BaseModel)
    """
    pass
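
As a rough illustration of the two abstract methods working together, the sketch below builds a runnable and uses it to parameterise a query. It swaps in plain JSON parsing for the LangChain `create_structured_output_runnable` call that the docstring recommends, so that it runs without dependencies. `SearchQuery`, `StubConversation`, its `complete` method, and `JsonQueryBuilder` are hypothetical; a dataclass stands in for the Pydantic model.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass, fields


class BaseQueryBuilder(ABC):
    """Local stand-in mirroring the abstract interface documented above."""

    @abstractmethod
    def create_runnable(self, query_parameters, conversation):
        ...

    @abstractmethod
    def parameterise_query(self, question, conversation):
        ...


@dataclass
class SearchQuery:
    """Hypothetical query model; stands in for a Pydantic BaseModel."""

    term: str
    max_hits: int = 10


class StubConversation:
    """Hypothetical LLM wrapper returning a canned JSON answer."""

    def complete(self, prompt: str) -> str:
        return '{"term": "TP53", "max_hits": 5, "unexpected": true}'


class JsonQueryBuilder(BaseQueryBuilder):
    def create_runnable(self, query_parameters, conversation):
        allowed = {f.name for f in fields(query_parameters)}

        def run(question: str):
            raw = json.loads(conversation.complete(question))
            # Drop keys the model does not declare before instantiating it.
            return query_parameters(**{k: v for k, v in raw.items() if k in allowed})

        return run

    def parameterise_query(self, question, conversation):
        runnable = self.create_runnable(SearchQuery, conversation)
        return runnable(question)


query = JsonQueryBuilder().parameterise_query("Find TP53 hits", StubConversation())
print(query)
```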

The API Agent

APIAgent

Source code in biochatter/api_agent/api_agent.py
class APIAgent:
    def __init__(
        self,
        conversation_factory: Callable,
        query_builder: "BaseQueryBuilder",
        fetcher: "BaseFetcher",
        interpreter: "BaseInterpreter",
    ):
        """

        API agent class to interact with a tool's API for querying and fetching
        results.  The query fields have to be defined in a Pydantic model
        (`BaseModel`) and used (i.e., parameterised by the LLM) in the query
        builder. Specific API agents are defined in submodules of this directory
        (`api_agent`). The agent's logic is implemented in the `execute` method.

        Attributes:
            conversation_factory (Callable): A function used to create a
                BioChatter conversation, providing LLM access.

            query_builder (BaseQueryBuilder): An instance of a child of the
                BaseQueryBuilder class.

            fetcher (BaseFetcher): An instance of a child of the
                BaseFetcher class.

            interpreter (BaseInterpreter): An instance of a child of the
                BaseInterpreter class.
        """
        self.conversation_factory = conversation_factory
        self.query_builder = query_builder
        self.fetcher = fetcher
        self.interpreter = interpreter
        self.final_answer = None

    def parameterise_query(self, question: str) -> Optional[BaseModel]:
        """
        Use LLM to parameterise a query (a Pydantic model) based on the given
        question using a BioChatter conversation instance.
        """
        try:
            conversation = self.conversation_factory()
            return self.query_builder.parameterise_query(question, conversation)
        except Exception as e:
            print(f"Error generating query: {e}")
            return None

    def fetch_results(self, query_model: BaseModel) -> Optional[str]:
        """
        Fetch the results of the query using the individual API's implementation
        (either single-step or submit-retrieve).

        Args:
            query_model: the parameterised query Pydantic model
        """
        try:
            return self.fetcher.fetch_results(query_model, 100)
        except Exception as e:
            print(f"Error fetching results: {e}")
            return None

    def summarise_results(
        self, question: str, response_text: str
    ) -> Optional[str]:
        """
        Summarise the retrieved results to extract the answer to the question.
        """
        try:
            return self.interpreter.summarise_results(
                question=question,
                conversation_factory=self.conversation_factory,
                response_text=response_text,
            )
        except Exception as e:
            print(f"Error extracting answer: {e}")
            return None

    def execute(self, question: str) -> Optional[str]:
        """
        Wrapper that uses class methods to execute the API agent logic. Consists
        of 1) query generation, 2) query submission, 3) results fetching, and
        4) answer extraction. The final answer is stored in the final_answer
        attribute.

        Args:
            question (str): The question to be answered.
        """
        # Generate query
        try:
            query_model = self.parameterise_query(question)
            if not query_model:
                raise ValueError("Failed to generate query.")
        except ValueError as e:
            print(e)

        # Fetch results
        try:
            response_text = self.fetch_results(
                query_model=query_model,
            )
            if not response_text:
                raise ValueError("Failed to fetch results.")
        except ValueError as e:
            print(e)

        # Extract answer from results
        try:
            final_answer = self.summarise_results(question, response_text)
            if not final_answer:
                raise ValueError("Failed to extract answer from results.")
        except ValueError as e:
            print(e)

        self.final_answer = final_answer
        return final_answer

    def get_description(self, tool_name: str, tool_desc: str):
        return (
            f"This API agent interacts with {tool_name}'s API for querying and "
            f"fetching results. {tool_desc}"
        )

__init__(conversation_factory, query_builder, fetcher, interpreter)

API agent class to interact with a tool's API for querying and fetching results. The query fields have to be defined in a Pydantic model (BaseModel) and used (i.e., parameterised by the LLM) in the query builder. Specific API agents are defined in submodules of this directory (api_agent). The agent's logic is implemented in the execute method.

Attributes:

Name Type Description
conversation_factory Callable

A function used to create a BioChatter conversation, providing LLM access.

query_builder BaseQueryBuilder

An instance of a child of the BaseQueryBuilder class.

fetcher BaseFetcher

An instance of a child of the BaseFetcher class.

interpreter BaseInterpreter

An instance of a child of the BaseInterpreter class.

Source code in biochatter/api_agent/api_agent.py
def __init__(
    self,
    conversation_factory: Callable,
    query_builder: "BaseQueryBuilder",
    fetcher: "BaseFetcher",
    interpreter: "BaseInterpreter",
):
    """

    API agent class to interact with a tool's API for querying and fetching
    results.  The query fields have to be defined in a Pydantic model
    (`BaseModel`) and used (i.e., parameterised by the LLM) in the query
    builder. Specific API agents are defined in submodules of this directory
    (`api_agent`). The agent's logic is implemented in the `execute` method.

    Attributes:
        conversation_factory (Callable): A function used to create a
            BioChatter conversation, providing LLM access.

        query_builder (BaseQueryBuilder): An instance of a child of the
            BaseQueryBuilder class.

        fetcher (BaseFetcher): An instance of a child of the
            BaseFetcher class.

        interpreter (BaseInterpreter): An instance of a child of the
            BaseInterpreter class.
    """
    self.conversation_factory = conversation_factory
    self.query_builder = query_builder
    self.fetcher = fetcher
    self.interpreter = interpreter
    self.final_answer = None

execute(question)

Wrapper that uses class methods to execute the API agent logic. Consists of 1) query generation, 2) query submission, 3) results fetching, and 4) answer extraction. The final answer is stored in the final_answer attribute.

Parameters:

Name Type Description Default
question str

The question to be answered.

required
Source code in biochatter/api_agent/api_agent.py
def execute(self, question: str) -> Optional[str]:
    """
    Wrapper that uses class methods to execute the API agent logic. Consists
    of 1) query generation, 2) query submission, 3) results fetching, and
    4) answer extraction. The final answer is stored in the final_answer
    attribute.

    Args:
        question (str): The question to be answered.
    """
    # Generate query
    try:
        query_model = self.parameterise_query(question)
        if not query_model:
            raise ValueError("Failed to generate query.")
    except ValueError as e:
        print(e)

    # Fetch results
    try:
        response_text = self.fetch_results(
            query_model=query_model,
        )
        if not response_text:
            raise ValueError("Failed to fetch results.")
    except ValueError as e:
        print(e)

    # Extract answer from results
    try:
        final_answer = self.summarise_results(question, response_text)
        if not final_answer:
            raise ValueError("Failed to extract answer from results.")
    except ValueError as e:
        print(e)

    self.final_answer = final_answer
    return final_answer
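
The stages that `execute` chains can be sketched without any BioChatter dependency as below. `run_pipeline` and the three stub stages are hypothetical names. Unlike the listing above, this sketch returns as soon as a stage yields nothing, which is one possible design and not necessarily the library's behaviour.

```python
from typing import Callable, Optional


def run_pipeline(
    question: str,
    parameterise: Callable[[str], Optional[dict]],
    fetch: Callable[[dict], Optional[str]],
    summarise: Callable[[str, str], Optional[str]],
) -> Optional[str]:
    """Chain the three stages; stop at the first stage that yields nothing."""
    query = parameterise(question)
    if not query:
        print("Failed to generate query.")
        return None

    response_text = fetch(query)
    if not response_text:
        print("Failed to fetch results.")
        return None

    answer = summarise(question, response_text)
    if not answer:
        print("Failed to extract answer from results.")
        return None
    return answer


# Stub stages standing in for a query builder, a fetcher, and an interpreter.
def stub_parameterise(question: str) -> dict:
    return {"query": question.upper()}


def stub_fetch(query: dict) -> str:
    return f"hits for {query['query']}"


def stub_summarise(question: str, response_text: str) -> str:
    return f"{question}: {response_text}"


answer = run_pipeline("acgt", stub_parameterise, stub_fetch, stub_summarise)
failed = run_pipeline("acgt", lambda q: None, stub_fetch, stub_summarise)
print(answer, failed)
```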

fetch_results(query_model)

Fetch the results of the query using the individual API's implementation (either single-step or submit-retrieve).

Parameters:

Name Type Description Default
query_model BaseModel

the parameterised query Pydantic model

required
Source code in biochatter/api_agent/api_agent.py
def fetch_results(self, query_model: BaseModel) -> Optional[str]:
    """
    Fetch the results of the query using the individual API's implementation
    (either single-step or submit-retrieve).

    Args:
        query_model: the parameterised query Pydantic model
    """
    try:
        return self.fetcher.fetch_results(query_model, 100)
    except Exception as e:
        print(f"Error fetching results: {e}")
        return None

parameterise_query(question)

Use LLM to parameterise a query (a Pydantic model) based on the given question using a BioChatter conversation instance.

Source code in biochatter/api_agent/api_agent.py
def parameterise_query(self, question: str) -> Optional[BaseModel]:
    """
    Use LLM to parameterise a query (a Pydantic model) based on the given
    question using a BioChatter conversation instance.
    """
    try:
        conversation = self.conversation_factory()
        return self.query_builder.parameterise_query(question, conversation)
    except Exception as e:
        print(f"Error generating query: {e}")
        return None

summarise_results(question, response_text)

Summarise the retrieved results to extract the answer to the question.

Source code in biochatter/api_agent/api_agent.py
def summarise_results(
    self, question: str, response_text: str
) -> Optional[str]:
    """
    Summarise the retrieved results to extract the answer to the question.
    """
    try:
        return self.interpreter.summarise_results(
            question=question,
            conversation_factory=self.conversation_factory,
            response_text=response_text,
        )
    except Exception as e:
        print(f"Error extracting answer: {e}")
        return None

The API Agent for the BLAST tool

BlastFetcher

Bases: BaseFetcher

A class for retrieving API results from BLAST given a parameterised BlastQueryParameters object.

TODO add a limit of characters to be returned from the response.text?

Source code in biochatter/api_agent/blast.py
class BlastFetcher(BaseFetcher):
    """
    A class for retrieving API results from BLAST given a parameterised
    BlastQueryParameters object.

    TODO add a limit of characters to be returned from the response.text?
    """

    def _submit_query(self, request_data: BlastQueryParameters) -> str:
        """Function to POST the BLAST query and retrieve RID.
        Submits the structured BlastQueryParameters object and returns the RID.

        Args:
            request_data: BlastQueryParameters object containing the BLAST
                query parameters.

        Returns:
            str: The Request ID (RID) for the submitted BLAST query.
        """
        data = {
            "CMD": request_data.cmd,
            "PROGRAM": request_data.program,
            "DATABASE": request_data.database,
            "QUERY": request_data.query,
            "FORMAT_TYPE": request_data.format_type,
            "MEGABLAST": request_data.megablast,
            "HITLIST_SIZE": request_data.max_hits,
        }
        # Include any other_params if provided
        if request_data.other_params:
            data.update(request_data.other_params)
        # Build the full URL for logging only; the request itself is a POST
        query_string = urlencode(data)
        full_url = f"{request_data.url}?{query_string}"
        request_data.full_url = full_url
        print("Full URL built by retriever:\n", request_data.full_url)
        # Make the API call
        response = requests.post(request_data.url, data=data)
        response.raise_for_status()
        # Extract RID from response
        print(response)
        match = re.search(r"RID = (\w+)", response.text)
        if match:
            return match.group(1)
        else:
            raise ValueError("RID not found in BLAST submission response.")

    def _fetch_results(
        self,
        rid: str,
        question_uuid: str,
        retries: int = 10000,
    ):
        """SECOND function to be called for a BLAST query.
        Will look for the RID to fetch the data
        """
        ###
        ###    TO DO: Implement logging for all BLAST queries
        ###
        # log_question_uuid_json(request_data.question_uuid,question, file_name, log_file_path,request_data.full_url)
        base_url = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"
        check_status_params = {
            "CMD": "Get",
            "FORMAT_OBJECT": "SearchInfo",
            "RID": rid,
        }
        get_results_params = {
            "CMD": "Get",
            "FORMAT_TYPE": "XML",
            "RID": rid,
        }

        # Check the status of the BLAST job
        for attempt in range(retries):
            status_response = requests.get(base_url, params=check_status_params)
            status_response.raise_for_status()
            status_text = status_response.text
            print("evaluating status")
            if "Status=WAITING" in status_text:
                print(f"{question_uuid} results not ready, waiting...")
                time.sleep(15)
            elif "Status=FAILED" in status_text:
                raise RuntimeError("BLAST query FAILED.")
            elif "Status=UNKNOWN" in status_text:
                raise RuntimeError("BLAST query expired or does not exist.")
            elif "Status=READY" in status_text:
                if "ThereAreHits=yes" in status_text:
                    print(f"{question_uuid} results are ready, retrieving.")
                    results_response = requests.get(
                        base_url, params=get_results_params
                    )
                    results_response.raise_for_status()
                    # Return the raw XML results to the caller
                    return results_response.text
                else:
                    return "No hits found"
        # The loop ended without returning: the retry budget is exhausted
        raise TimeoutError(
            "Maximum attempts reached. Results may not be ready."
        )

    def fetch_results(
        self, query_model: BlastQueryParameters, retries: int = 20
    ) -> str:
        """
        Submit request and fetch results from BLAST API. Wraps individual
        submission and retrieval of results.

        Args:
            query_model: the Pydantic model of the query

            retries: the number of maximum retries

        Returns:
            str: the result from the BLAST API
        """
        rid = self._submit_query(request_data=query_model)
        return self._fetch_results(
            rid=rid,
            question_uuid=query_model.question_uuid,
            retries=retries,
        )
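
The submission step can be exercised offline with a small sketch. It reuses the `urlencode` call and the `RID = (\w+)` pattern from `_submit_query` above; `build_submission_url` and `extract_rid` are hypothetical helper names, the parameter values are illustrative, and the sample response text is invented for the example and is not captured NCBI output.

```python
import re
from urllib.parse import urlencode


def build_submission_url(base_url: str, params: dict) -> str:
    """Join the base URL and the URL-encoded parameters (for logging)."""
    return f"{base_url}?{urlencode(params)}"


def extract_rid(response_text: str) -> str:
    """Pull the Request ID out of a submission response."""
    match = re.search(r"RID = (\w+)", response_text)
    if match is None:
        raise ValueError("RID not found in BLAST submission response.")
    return match.group(1)


# Illustrative parameter values, not a recommendation.
params = {"CMD": "Put", "PROGRAM": "blastn", "DATABASE": "nt", "QUERY": "ACGT"}
url = build_submission_url("https://blast.ncbi.nlm.nih.gov/Blast.cgi", params)

# Invented response body containing the line the regex looks for.
sample_response = "QBlastInfoBegin\n    RID = ABC123XYZ\n    RTOE = 30\nQBlastInfoEnd"
rid = extract_rid(sample_response)
print(url)
print(rid)
```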

fetch_results(query_model, retries=20)

Submit request and fetch results from BLAST API. Wraps individual submission and retrieval of results.

Parameters:

Name Type Description Default
query_model BlastQueryParameters

the Pydantic model of the query

required
retries int

the number of maximum retries

20

Returns:

Type Description
str

the result from the BLAST API

Source code in biochatter/api_agent/blast.py
def fetch_results(
    self, query_model: BlastQueryParameters, retries: int = 20
) -> str:
    """
    Submit request and fetch results from BLAST API. Wraps individual
    submission and retrieval of results.

    Args:
        query_model: the Pydantic model of the query

        retries: the number of maximum retries

    Returns:
        str: the result from the BLAST API
    """
    rid = self._submit_query(request_data=query_model)
    return self._fetch_results(
        rid=rid,
        question_uuid=query_model.question_uuid,
        retries=retries,
    )
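
A sketch of the retrieval half of this wrapper, with the network call replaced by an injected callable so it can run offline. The status strings are the ones checked in `_fetch_results`; `poll_until_ready`, `get_status`, and `wait_seconds` are hypothetical names, and treating an unrecognised reply like `WAITING` is an assumption of this sketch.

```python
import time
from typing import Callable


def poll_until_ready(
    get_status: Callable[[], str],
    retries: int = 20,
    wait_seconds: float = 15.0,
) -> bool:
    """Return True if hits are available, False if the search found none."""
    for _ in range(retries):
        status_text = get_status()
        if "Status=FAILED" in status_text:
            raise RuntimeError("BLAST query FAILED.")
        if "Status=UNKNOWN" in status_text:
            raise RuntimeError("BLAST query expired or does not exist.")
        if "Status=READY" in status_text:
            return "ThereAreHits=yes" in status_text
        # WAITING, or a reply this sketch does not recognise: wait and retry.
        time.sleep(wait_seconds)
    raise TimeoutError("Maximum attempts reached. Results may not be ready.")


# Simulated status replies: two waits, then a finished search with hits.
replies = iter(["Status=WAITING", "Status=WAITING", "Status=READY\nThereAreHits=yes"])
has_hits = poll_until_ready(lambda: next(replies), retries=5, wait_seconds=0)
print(has_hits)
```

The upper bound on total waiting time is `retries * wait_seconds`, which is worth keeping in mind when a caller passes a large retry count.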

BlastInterpreter

Bases: BaseInterpreter

Source code in biochatter/api_agent/blast.py
class BlastInterpreter(BaseInterpreter):
    def summarise_results(
        self,
        question: str,
        conversation_factory: Callable,
        response_text: str,
    ) -> str:
        """
        Function to extract the answer from the BLAST results.

        Args:
            question (str): The question to be answered.
            conversation_factory: A function that creates a BioChatter
                conversation.
            response_text (str): The response.text returned by NCBI.

        Returns:
            str: The extracted answer from the BLAST results.

        """
        prompt = ChatPromptTemplate.from_messages(
            [
                (
                    "system",
                    "You are a world class molecular biologist who knows everything about NCBI and BLAST results.",
                ),
                ("user", "{input}"),
            ]
        )
        summary_prompt = BLAST_SUMMARY_PROMPT.format(
            question=question, context=response_text
        )
        output_parser = StrOutputParser()
        conversation = conversation_factory()
        chain = prompt | conversation.chat | output_parser
        answer = chain.invoke({"input": summary_prompt})
        return answer

summarise_results(question, conversation_factory, response_text)

Function to extract the answer from the BLAST results.

Parameters:

Name Type Description Default
question str

The question to be answered.

required
conversation_factory Callable

A function that creates a BioChatter conversation.

required
response_text str

The response.text returned by NCBI.

required

Returns:

Type Description
str

The extracted answer from the BLAST results.

Source code in biochatter/api_agent/blast.py
def summarise_results(
    self,
    question: str,
    conversation_factory: Callable,
    response_text: str,
) -> str:
    """
    Function to extract the answer from the BLAST results.

    Args:
        question (str): The question to be answered.
        conversation_factory: A function that creates a BioChatter
            conversation.
        response_text (str): The response.text returned by NCBI.

    Returns:
        str: The extracted answer from the BLAST results.

    """
    prompt = ChatPromptTemplate.from_messages(
        [
            (
                "system",
                "You are a world class molecular biologist who knows everything about NCBI and BLAST results.",
            ),
            ("user", "{input}"),
        ]
    )
    summary_prompt = BLAST_SUMMARY_PROMPT.format(
        question=question, context=response_text
    )
    output_parser = StrOutputParser()
    conversation = conversation_factory()
    chain = prompt | conversation.chat | output_parser
    answer = chain.invoke({"input": summary_prompt})
    return answer
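
The prompt assembly in this method can be sketched on its own, together with the character limit that the `BlastFetcher` TODO raises as an open question. `SUMMARY_TEMPLATE` is a hypothetical stand-in for `BLAST_SUMMARY_PROMPT` (whose actual text is not shown here), and `build_summary_prompt` and `max_chars` are invented for illustration.

```python
# Hypothetical template; only the two placeholder names come from the source.
SUMMARY_TEMPLATE = (
    "Answer the question using the results.\n"
    "Question: {question}\n"
    "Results: {context}"
)


def build_summary_prompt(question: str, response_text: str, max_chars: int = 1000) -> str:
    """Fill the template, truncating very long responses first."""
    context = response_text
    if len(context) > max_chars:
        context = context[:max_chars] + "\n[truncated]"
    return SUMMARY_TEMPLATE.format(question=question, context=context)


prompt = build_summary_prompt("Which organism?", "x" * 50, max_chars=10)
print(prompt)
```

The resulting string is what would be passed as the `input` value of the chain. Cutting at a fixed character count can split an XML element, so a real limit would more likely operate on parsed hits.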

BlastQueryBuilder

Bases: BaseQueryBuilder

A class for building a BlastQuery object.

Source code in biochatter/api_agent/blast.py
class BlastQueryBuilder(BaseQueryBuilder):
    """A class for building a BlastQuery object."""

    def create_runnable(
        self,
        query_parameters: "BlastQueryParameters",
        conversation: "Conversation",
    ) -> Callable:
        """
        Creates a runnable object for executing queries using the LangChain
        `create_structured_output_runnable` method.

        Args:
            query_parameters: A Pydantic data model that specifies the fields of
                the API that should be queried.

            conversation: A BioChatter conversation object.

        Returns:
            A Callable object that can execute the query.
        """
        return create_structured_output_runnable(
            output_schema=query_parameters,
            llm=conversation.chat,
            prompt=self.structured_output_prompt,
        )

    def parameterise_query(
        self,
        question: str,
        conversation: "Conversation",
    ) -> BlastQueryParameters:
        """
        Generates a BlastQueryParameters object based on the given question,
        prompt, and BioChatter conversation. Uses a Pydantic model to define
        the API fields. Creates a runnable that can be invoked on LLMs that
        are qualified to parameterise functions.

        Args:
            question (str): The question to be answered.

            conversation: The conversation object used for parameterising the
                query.

        Returns:
            BlastQueryParameters: the parameterised query object (Pydantic
                model)
        """
        runnable = self.create_runnable(
            query_parameters=BlastQueryParameters,
            conversation=conversation,
        )
        blast_call_obj = runnable.invoke(
            {"input": f"Answer:\n{question} based on:\n {BLAST_QUERY_PROMPT}"}
        )
        blast_call_obj.question_uuid = str(uuid.uuid4())
        return blast_call_obj

create_runnable(query_parameters, conversation)

Creates a runnable object for executing queries using the LangChain create_structured_output_runnable method.

Parameters:

Name Type Description Default
query_parameters BlastQueryParameters

A Pydantic data model that specifies the fields of the API that should be queried.

required
conversation Conversation

A BioChatter conversation object.

required

Returns:

Type Description
Callable

A Callable object that can execute the query.

Source code in biochatter/api_agent/blast.py
def create_runnable(
    self,
    query_parameters: "BlastQueryParameters",
    conversation: "Conversation",
) -> Callable:
    """
    Creates a runnable object for executing queries using the LangChain
    `create_structured_output_runnable` method.

    Args:
        query_parameters: A Pydantic data model that specifies the fields of
            the API that should be queried.

        conversation: A BioChatter conversation object.

    Returns:
        A Callable object that can execute the query.
    """
    return create_structured_output_runnable(
        output_schema=query_parameters,
        llm=conversation.chat,
        prompt=self.structured_output_prompt,
    )

parameterise_query(question, conversation)

Generates a BlastQuery object based on the given question, prompt, and BioChatter conversation. Uses a Pydantic model to define the API fields. Creates a runnable that can be invoked on LLMs that are qualified to parameterise functions.

Parameters:

Name Type Description Default
question str

The question to be answered.

required
conversation Conversation

The conversation object used for parameterising the BlastQuery.

required

Returns:

Name Type Description
BlastQueryParameters BlastQueryParameters

the parameterised query object (Pydantic model)

Source code in biochatter/api_agent/blast.py
def parameterise_query(
    self,
    question: str,
    conversation: "Conversation",
) -> BlastQueryParameters:
    """
    Generates a BlastQuery object based on the given question, prompt, and
    BioChatter conversation. Uses a Pydantic model to define the API fields.
    Creates a runnable that can be invoked on LLMs that are qualified to
    parameterise functions.

    Args:
        question (str): The question to be answered.

        conversation: The conversation object used for parameterising the
            BlastQuery.

    Returns:
        BlastQueryParameters: the parameterised query object (Pydantic model)
    """
    runnable = self.create_runnable(
        query_parameters=BlastQueryParameters,
        conversation=conversation,
    )
    blast_call_obj = runnable.invoke(
        {"input": f"Answer:\n{question} based on:\n {BLAST_QUERY_PROMPT}"}
    )
    blast_call_obj.question_uuid = str(uuid.uuid4())
    return blast_call_obj
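
The flow of `parameterise_query` can be tried without an LLM by swapping in stand-ins. The sketch below is illustrative only: `FakeBlastQueryParameters`, `FakeRunnable`, and the placeholder `BLAST_QUERY_PROMPT` text are hypothetical and not part of BioChatter. It only mirrors the three steps shown above: build the input string, invoke the runnable, and overwrite `question_uuid` with a fresh UUID.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional

# Placeholder for the real BLAST_QUERY_PROMPT constant, whose text is not shown here.
BLAST_QUERY_PROMPT = "<BLAST query instructions>"


@dataclass
class FakeBlastQueryParameters:
    """Hypothetical, reduced stand-in for the BlastQueryParameters model."""

    program: str = "blastn"
    query: Optional[str] = None
    question_uuid: str = field(default_factory=lambda: str(uuid.uuid4()))


class FakeRunnable:
    """Hypothetical stand-in for the structured-output runnable (no LLM call)."""

    def __init__(self):
        self.last_input = None

    def invoke(self, payload: dict) -> FakeBlastQueryParameters:
        # Record what would be sent to the LLM and return a canned result.
        self.last_input = payload["input"]
        return FakeBlastQueryParameters(query="ATGGCGTACG")


def parameterise_query(question: str, runnable: FakeRunnable) -> FakeBlastQueryParameters:
    # Same steps as the method above: build the input, invoke, assign a fresh UUID.
    call_obj = runnable.invoke(
        {"input": f"Answer:\n{question} based on:\n {BLAST_QUERY_PROMPT}"}
    )
    call_obj.question_uuid = str(uuid.uuid4())
    return call_obj


runnable = FakeRunnable()
result = parameterise_query("Which organism does ATGGCGTACG come from?", runnable)
print(result.program, result.query, result.question_uuid)
```

In real use, the runnable would come from `create_runnable` and the returned object would be a `BlastQueryParameters` instance filled in by the LLM.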

BlastQueryParameters

Bases: BaseModel

BlastQueryParameters is a Pydantic model for the parameters of a BLAST query request, used for configuring and sending a request to the NCBI BLAST query API. The fields are dynamically configured by the LLM based on the user's question.

Source code in biochatter/api_agent/blast.py
class BlastQueryParameters(BaseModel):
    """

    BlastQuery is a Pydantic model for the parameters of a BLAST query request,
    used for configuring and sending a request to the NCBI BLAST query API. The
    fields are dynamically configured by the LLM based on the user's question.

    """

    url: Optional[str] = Field(
        default="https://blast.ncbi.nlm.nih.gov/Blast.cgi?",
        description="ALWAYS USE DEFAULT, DO NOT CHANGE",
    )
    cmd: Optional[str] = Field(
        default="Put",
        description="Command to execute, 'Put' for submitting query, 'Get' for retrieving results.",
    )
    program: Optional[str] = Field(
        default="blastn",
        description="BLAST program to use, e.g., 'blastn' for nucleotide-nucleotide BLAST, 'blastp' for protein-protein BLAST.",
    )
    database: Optional[str] = Field(
        default="nt",
        description="Database to search, e.g., 'nt' for nucleotide database, 'nr' for non redundant protein database, pdb the Protein Data Bank database, which is used specifically for protein structures, 'refseq_rna' and 'refseq_genomic': specialized databases for RNA sequences and genomic sequences",
    )
    query: Optional[str] = Field(
        None,
        description="Nucleotide or protein sequence for the BLAST or blat query, make sure to always keep the entire sequence given.",
    )
    format_type: Optional[str] = Field(
        default="Text",
        description="Format of the BLAST results, e.g., 'Text', 'XML'.",
    )
    rid: Optional[str] = Field(
        None, description="Request ID for retrieving BLAST results."
    )
    other_params: Optional[dict] = Field(
        default={"email": "user@example.com"},
        description="Other optional BLAST parameters, including user email.",
    )
    max_hits: Optional[int] = Field(
        default=15,
        description="Maximum number of hits to return in the BLAST results.",
    )
    sort_by: Optional[str] = Field(
        default="score",
        description="Criterion to sort BLAST results by, e.g., 'score', 'evalue'.",
    )
    megablast: Optional[str] = Field(
        default="on", description="Set to 'on' for human genome alignemnts"
    )
    question_uuid: Optional[str] = Field(
        default_factory=lambda: str(uuid.uuid4()),
        description="Unique identifier for the question.",
    )
    full_url: Optional[str] = Field(
        default="TBF",
        description="Full URL to be used to submit the BLAST query",
    )
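
To make the role of `url`, `cmd`, and `full_url` more concrete, here is a minimal sketch of how a submission URL could be assembled from these fields. The helper `build_submit_url` is hypothetical, and the mapping from model fields to upper-case keys (`CMD`, `PROGRAM`, `DATABASE`, `QUERY`, `FORMAT_TYPE`, `MEGABLAST`, `HITLIST_SIZE`) is an assumption based on the NCBI BLAST URL API conventions. The mapping BioChatter itself uses is not shown in this section.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Values mirror the defaults listed in BlastQueryParameters above;
# the query sequence is an arbitrary example.
params = {
    "url": "https://blast.ncbi.nlm.nih.gov/Blast.cgi?",
    "cmd": "Put",
    "program": "blastn",
    "database": "nt",
    "query": "ATGGCGTACGTTAGC",
    "format_type": "Text",
    "megablast": "on",
    "max_hits": 15,
}


def build_submit_url(p: dict) -> str:
    """Hypothetical helper: map model fields to NCBI-style upper-case keys."""
    query_string = urlencode(
        {
            "CMD": p["cmd"],
            "PROGRAM": p["program"],
            "DATABASE": p["database"],
            "QUERY": p["query"],
            "FORMAT_TYPE": p["format_type"],
            "MEGABLAST": p["megablast"],
            "HITLIST_SIZE": p["max_hits"],
        }
    )
    # The default base URL already ends with "?", so the query string is appended directly.
    return p["url"] + query_string


full_url = build_submit_url(params)
print(full_url)
```

Retrieving results would follow the same pattern with `cmd` set to `'Get'` and the `rid` returned by the submission step.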

The API Agent for the OncoKB tool

OncoKBFetcher

Bases: BaseFetcher

A class for retrieving API results from OncoKB given a parameterized OncoKBQuery.

Source code in biochatter/api_agent/oncokb.py
class OncoKBFetcher(BaseFetcher):
    """
    A class for retrieving API results from OncoKB given a parameterized
    OncoKBQuery.
    """

    def __init__(self, api_token="demo"):
        self.headers = {
            "Authorization": f"Bearer {api_token}",
            "Accept": "application/json",
        }
        self.base_url = "https://demo.oncokb.org/api/v1"

    def fetch_results(
        self, request_data: OncoKBQueryParameters, retries: Optional[int] = 3
    ) -> str:
        """Function to submit the OncoKB query and fetch the results directly.
        No multi-step procedure, thus no wrapping of submission and retrieval in
        this case.

        Args:
            request_data: OncoKBQueryParameters object (Pydantic model)
                containing the OncoKB query parameters.

            retries: Number of retries. Currently unused by this
                implementation.

        Returns:
            str: The results of the OncoKB query.
        """
        # Submit the query and get the URL
        params = request_data.dict(exclude_unset=True)
        endpoint = params.pop("endpoint")
        params.pop("question_uuid")
        full_url = f"{self.base_url}/{endpoint}"
        response = requests.get(full_url, headers=self.headers, params=params)
        response.raise_for_status()

        # Fetch the results from the URL
        results_response = requests.get(response.url, headers=self.headers)
        results_response.raise_for_status()

        return results_response.text

fetch_results(request_data, retries=3)

Function to submit the OncoKB query and fetch the results directly. No multi-step procedure, thus no wrapping of submission and retrieval in this case.

Parameters:

Name Type Description Default
request_data OncoKBQueryParameters

OncoKBQueryParameters object (Pydantic model) containing the OncoKB query parameters.

required
retries Optional[int]

Number of retries. Currently unused by this implementation.

3

Returns:

Name Type Description
str str

The results of the OncoKB query.

Source code in biochatter/api_agent/oncokb.py
def fetch_results(
    self, request_data: OncoKBQueryParameters, retries: Optional[int] = 3
) -> str:
    """Function to submit the OncoKB query and fetch the results directly.
    No multi-step procedure, thus no wrapping of submission and retrieval in
    this case.

    Args:
        request_data: OncoKBQueryParameters object (Pydantic model)
            containing the OncoKB query parameters.

        retries: Number of retries. Currently unused by this
            implementation.

    Returns:
        str: The results of the OncoKB query.
    """
    # Submit the query and get the URL
    params = request_data.dict(exclude_unset=True)
    endpoint = params.pop("endpoint")
    params.pop("question_uuid")
    full_url = f"{self.base_url}/{endpoint}"
    response = requests.get(full_url, headers=self.headers, params=params)
    response.raise_for_status()

    # Fetch the results from the URL
    results_response = requests.get(response.url, headers=self.headers)
    results_response.raise_for_status()

    return results_response.text
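
The signature accepts `retries`, but the body above does not use it, while `BaseFetcher` recommends a retry mechanism for connectivity issues. One way such a mechanism could look is sketched below. `fetch_with_retries` and `flaky_fetch` are hypothetical names, and `ConnectionError` stands in for whatever transient error the HTTP client raises. This is not BioChatter's implementation.

```python
import time
from typing import Callable


def fetch_with_retries(fetch: Callable[[], str], retries: int = 3, delay: float = 0.0) -> str:
    """Hypothetical helper: call `fetch` up to `retries` times before giving up."""
    last_error = None
    for attempt in range(retries):
        try:
            return fetch()
        except ConnectionError as err:  # stand-in for a transient network failure
            last_error = err
            if attempt < retries - 1:
                time.sleep(delay)  # wait before the next attempt
    raise RuntimeError(f"giving up after {retries} attempts") from last_error


# Simulated endpoint that fails twice, then succeeds.
calls = {"count": 0}


def flaky_fetch() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return '{"query": "ok"}'


result = fetch_with_retries(flaky_fetch, retries=3)
print(result, calls["count"])
```

In a fetcher, the `fetch` callable would wrap the `requests.get(...)` call together with `raise_for_status()`, and the caught exception type would be adjusted accordingly.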

OncoKBInterpreter

Bases: BaseInterpreter

Source code in biochatter/api_agent/oncokb.py
class OncoKBInterpreter(BaseInterpreter):
    def summarise_results(
        self,
        question: str,
        conversation_factory: Callable,
        response_text: str,
    ) -> str:
        """
        Function to extract the answer from the BLAST results.

        Args:
            question (str): The question to be answered.
            conversation_factory: A BioChatter conversation object.
            response_text (str): The response.text returned by OncoKB.

        Returns:
            str: The extracted answer from the BLAST results.

        """
        prompt = ChatPromptTemplate.from_messages(
            [
                (
                    "system",
                    "You are a world class molecular biologist who knows "
                    "everything about OncoKB and cancer genomics. Your task is "
                    "to interpret results from OncoKB API calls and summarise "
                    "them for the user.",
                ),
                ("user", "{input}"),
            ]
        )
        summary_prompt = ONCOKB_SUMMARY_PROMPT.format(
            question=question, context=response_text
        )
        output_parser = StrOutputParser()
        conversation = conversation_factory()
        chain = prompt | conversation.chat | output_parser
        answer = chain.invoke({"input": summary_prompt})
        return answer

summarise_results(question, conversation_factory, response_text)

Function to extract the answer from the OncoKB results.

Parameters:

Name Type Description Default
question str

The question to be answered.

required
conversation_factory Callable

A callable that returns a BioChatter conversation object.

required
response_text str

The response.text returned by OncoKB.

required

Returns:

Name Type Description
str str

The extracted answer from the OncoKB results.

Source code in biochatter/api_agent/oncokb.py
def summarise_results(
    self,
    question: str,
    conversation_factory: Callable,
    response_text: str,
) -> str:
    """
    Function to extract the answer from the BLAST results.

    Args:
        question (str): The question to be answered.
        conversation_factory: A BioChatter conversation object.
        response_text (str): The response.text returned by OncoKB.

    Returns:
        str: The extracted answer from the BLAST results.

    """
    prompt = ChatPromptTemplate.from_messages(
        [
            (
                "system",
                "You are a world class molecular biologist who knows "
                "everything about OncoKB and cancer genomics. Your task is "
                "to interpret results from OncoKB API calls and summarise "
                "them for the user.",
            ),
            ("user", "{input}"),
        ]
    )
    summary_prompt = ONCOKB_SUMMARY_PROMPT.format(
        question=question, context=response_text
    )
    output_parser = StrOutputParser()
    conversation = conversation_factory()
    chain = prompt | conversation.chat | output_parser
    answer = chain.invoke({"input": summary_prompt})
    return answer
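
`summarise_results` receives a `conversation_factory` and calls it to obtain the conversation it uses. The sketch below illustrates that pattern with stand-ins. `FakeConversation`, `summarise`, and `factory` are hypothetical, the response strings are made-up placeholders rather than real OncoKB output, and the stand-in's `chat` method only imitates an LLM call. It shows one plausible benefit of the pattern: each summary gets its own conversation object.

```python
from typing import Callable


class FakeConversation:
    """Hypothetical stand-in for a BioChatter conversation; no LLM is called."""

    def __init__(self):
        self.messages = []

    def chat(self, text: str) -> str:
        self.messages.append(text)
        return f"summary of {len(text)} characters"


def summarise(
    question: str,
    conversation_factory: Callable[[], FakeConversation],
    response_text: str,
) -> str:
    # Calling the factory yields a new conversation used for this summary only.
    conversation = conversation_factory()
    prompt = f"Question: {question}\nContext: {response_text}"
    return conversation.chat(prompt)


created = []


def factory() -> FakeConversation:
    conv = FakeConversation()
    created.append(conv)  # keep a reference so the example can inspect it
    return conv


first = summarise("Is BRAF V600E oncogenic?", factory, "<placeholder response 1>")
second = summarise("Is TP53 a tumour suppressor?", factory, "<placeholder response 2>")
print(len(created), len(created[0].messages), len(created[1].messages))
```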

OncoKBQueryBuilder

Bases: BaseQueryBuilder

A class for building an OncoKBQuery object.

Source code in biochatter/api_agent/oncokb.py
class OncoKBQueryBuilder(BaseQueryBuilder):
    """A class for building an OncoKBQuery object."""

    def create_runnable(
        self,
        query_parameters: "OncoKBQueryParameters",
        conversation: "Conversation",
    ) -> Callable:
        """
        Creates a runnable object for executing queries using the LangChain
        `create_structured_output_runnable` method.

        Args:
            query_parameters: A Pydantic data model that specifies the fields of
                the API that should be queried.

            conversation: A BioChatter conversation object.

        Returns:
            A Callable object that can execute the query.
        """
        return create_structured_output_runnable(
            output_schema=query_parameters,
            llm=conversation.chat,
            prompt=self.structured_output_prompt,
        )

    def parameterise_query(
        self,
        question: str,
        conversation: "Conversation",
    ) -> OncoKBQueryParameters:
        """
        Generates an OncoKBQuery object based on the given question, prompt, and
        BioChatter conversation. Uses a Pydantic model to define the API fields.
        Creates a runnable that can be invoked on LLMs that are qualified to
        parameterise functions.

        Args:
            question (str): The question to be answered.

            conversation: The conversation object used for parameterising the
                OncoKBQuery.

        Returns:
            OncoKBQueryParameters: the parameterised query object (Pydantic model)
        """
        runnable = self.create_runnable(
            query_parameters=OncoKBQueryParameters,
            conversation=conversation,
        )
        oncokb_call_obj = runnable.invoke(
            {"input": f"Answer:\n{question} based on:\n {ONCOKB_QUERY_PROMPT}"}
        )
        oncokb_call_obj.question_uuid = str(uuid.uuid4())
        return oncokb_call_obj

create_runnable(query_parameters, conversation)

Creates a runnable object for executing queries using the LangChain create_structured_output_runnable method.

Parameters:

Name Type Description Default
query_parameters OncoKBQueryParameters

A Pydantic data model that specifies the fields of the API that should be queried.

required
conversation Conversation

A BioChatter conversation object.

required

Returns:

Type Description
Callable

A Callable object that can execute the query.

Source code in biochatter/api_agent/oncokb.py
def create_runnable(
    self,
    query_parameters: "OncoKBQueryParameters",
    conversation: "Conversation",
) -> Callable:
    """
    Creates a runnable object for executing queries using the LangChain
    `create_structured_output_runnable` method.

    Args:
        query_parameters: A Pydantic data model that specifies the fields of
            the API that should be queried.

        conversation: A BioChatter conversation object.

    Returns:
        A Callable object that can execute the query.
    """
    return create_structured_output_runnable(
        output_schema=query_parameters,
        llm=conversation.chat,
        prompt=self.structured_output_prompt,
    )

parameterise_query(question, conversation)

Generates an OncoKBQuery object based on the given question, prompt, and BioChatter conversation. Uses a Pydantic model to define the API fields. Creates a runnable that can be invoked on LLMs that are qualified to parameterise functions.

Parameters:

Name Type Description Default
question str

The question to be answered.

required
conversation Conversation

The conversation object used for parameterising the OncoKBQuery.

required

Returns:

Name Type Description
OncoKBQueryParameters OncoKBQueryParameters

the parameterised query object (Pydantic model)

Source code in biochatter/api_agent/oncokb.py
def parameterise_query(
    self,
    question: str,
    conversation: "Conversation",
) -> OncoKBQueryParameters:
    """
    Generates an OncoKBQuery object based on the given question, prompt, and
    BioChatter conversation. Uses a Pydantic model to define the API fields.
    Creates a runnable that can be invoked on LLMs that are qualified to
    parameterise functions.

    Args:
        question (str): The question to be answered.

        conversation: The conversation object used for parameterising the
            OncoKBQuery.

    Returns:
        OncoKBQueryParameters: the parameterised query object (Pydantic model)
    """
    runnable = self.create_runnable(
        query_parameters=OncoKBQueryParameters,
        conversation=conversation,
    )
    oncokb_call_obj = runnable.invoke(
        {"input": f"Answer:\n{question} based on:\n {ONCOKB_QUERY_PROMPT}"}
    )
    oncokb_call_obj.question_uuid = str(uuid.uuid4())
    return oncokb_call_obj