API

GenericApi

Generic API class.

BsdApi

This is an API object specifically designed for the BioSamples Database.

API module. This module consists of several classes that provide the endpoints needed to interact with the archival services.

The main functions are to submit, retrieve with an accession, and update with an accession. However, as many methods as necessary can be added.

Mandatory arguments:

  • authenticator: Subclass of GenericAuthenticator

  • base_uri: base URI for the archive’s API (can be set automatically by subclasses; see the BsdApi object for an example)

Optional arguments:

  • verbose: set to True if you want INFO and above-level logging events. If not set or set to False, only WARNING and above will be displayed

Environment variables:

  • API_ENVIRONMENT: must be set if you want to use a ‘dev’ authenticator. Please note that this environment variable is shared with the Authenticator, to avoid inconsistent API/Authenticator combinations (even with all these checks and constraints, errors are still likely).

Subclasses of GenericApi must define the following methods/properties:

  • _submit: Function called by submit when only one entity is sent to submit

  • _submit_multiple: Function called by submit when multiple entities are sent to submit

  • Same with retrieve and update.
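The split between the public method and the two private hooks can be sketched as follows. This is a minimal illustration and not the library's implementation: SketchApi and InMemoryApi are hypothetical stand-ins, entities are plain dictionaries rather than GenericEntity subclasses, and the rule "one entity goes to _submit, several go to _submit_multiple" is taken from the descriptions above.

```python
from typing import Any


class SketchApi:
    """Hypothetical stand-in for GenericApi; shows only the dispatch idea."""

    def submit(self, entities: list[Any], **kwargs: Any) -> list[Any]:
        # One entity goes to _submit, several go to _submit_multiple.
        # The public method always returns a list.
        if len(entities) == 1:
            return [self._submit(entities[0], kwargs)]
        return self._submit_multiple(entities, kwargs)

    def _submit(self, entity: Any, kwargs: dict) -> Any:
        raise NotImplementedError

    def _submit_multiple(self, entities: list[Any], kwargs: dict) -> list[Any]:
        raise NotImplementedError


class InMemoryApi(SketchApi):
    """Toy subclass that 'archives' dictionaries by assigning an accession."""

    def __init__(self) -> None:
        self.counter = 0

    def _submit(self, entity: dict, kwargs: dict) -> dict:
        self.counter += 1
        return {**entity, "accession": f"TEST{self.counter}"}

    def _submit_multiple(self, entities: list[dict], kwargs: dict) -> list[dict]:
        return [self._submit(entity, kwargs) for entity in entities]


api = InMemoryApi()
single = api.submit([{"name": "sample_a"}])
several = api.submit([{"name": "sample_b"}, {"name": "sample_c"}])
```

A subclass written this way only has to fill in the private hooks; callers keep using submit() and always get a list back.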

class GenericApi(authenticator, base_uri, verbose=True)

Bases: object

Generic API class. This class defines the minimal functions and class properties needed for the rest of the API classes.

Parameters:
  • authenticator (GenericAuthenticator) – Authenticator object. Requests are handled through the authenticator.

  • base_uri (str) – Base (root) uri of the API.

  • verbose (bool) – Boolean indicating if the logger should be verbose.

submit(entities, **kwargs)

Generic function for submitting an iterable of entities to the archive.

Parameters:
  • entities (list[GenericEntity]) – list of GenericEntity subclasses.

  • kwargs (dict) – Keyword arguments needed for subclasses for submitting.

Return type:

list[GenericEntity]

Returns:

list of GenericEntity subclasses after archival/deposition.

_submit(entity, kwargs)

Generic function for submitting an entity to an archive.

Parameters:
  • entity (GenericEntity) – Subclass of GenericEntity

  • kwargs (dict) – Keyword arguments needed for subclasses’ method.

Return type:

GenericEntity

Returns:

Submitted GenericEntity subclass

_submit_multiple(entities, kwargs)

Generic function for submitting multiple entities to an archive.

Parameters:
  • entities (list[GenericEntity]) – List of subclasses of GenericEntity

  • kwargs (dict) – Keyword arguments needed for subclasses’ method.

Return type:

list[GenericEntity]

Returns:

Submitted GenericEntity subclasses

retrieve(accession)

Generic function for retrieving one or more entities from the API via one or more unique identifiers (accessions). Depending on the type of the input parameter, calls _retrieve() or _retrieve_multiple().

Parameters:

accession (str | list[str]) – Unique identifier for the entity. Can be a string or a list of strings.

Return type:

list[GenericEntity]

Returns:

List of entities retrieved from the API. MUST always return a list for consistency.
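The "string or list in, list out" contract can be illustrated with a small standalone function. This is a sketch under assumptions: retrieve_any, fetch_one and fetch_many are hypothetical names standing in for retrieve(), _retrieve() and _retrieve_multiple(), and the fake fetchers return dictionaries instead of entities.

```python
from typing import Callable, Iterable, Union


def retrieve_any(
    accession: Union[str, Iterable[str]],
    fetch_one: Callable[[str], dict],
    fetch_many: Callable[[list], list],
) -> list:
    """Dispatch on the input type and always return a list."""
    # A str is itself iterable, so it has to be checked first.
    if isinstance(accession, str):
        return [fetch_one(accession)]
    return fetch_many(list(accession))


def fake_fetch_one(acc: str) -> dict:
    # Hypothetical fetcher: echoes the accession back.
    return {"accession": acc}


def fake_fetch_many(accs: list) -> list:
    return [fake_fetch_one(acc) for acc in accs]


one = retrieve_any("ACC1", fake_fetch_one, fake_fetch_many)
many = retrieve_any(("ACC1", "ACC2"), fake_fetch_one, fake_fetch_many)
```

Returning a list in both cases means calling code can iterate over the result without checking how many accessions were requested.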

_retrieve(accession)

Retrieve one entity via a unique identifier (accession).

Parameters:

accession (str) – Unique identifier for the entity.

Return type:

GenericEntity

Returns:

An entity retrieved from the API.

_retrieve_multiple(accession_list)

Retrieve multiple entities via a list of unique identifiers (accessions).

Parameters:

accession_list (list[str]) – List of unique identifiers for the entities to retrieve.

Return type:

list[GenericEntity]

Returns:

List of entities.

update(entity)

Update an entity. Should always take a list of entities as input, and each subclass decides how to handle the update.

Parameters:

entity (list[GenericEntity]) – List of GenericEntity subclasses

Return type:

list[GenericEntity]

Returns:

List of updated GenericEntity subclasses

_update(entity)

Update an entity via the API.

Parameters:

entity (GenericEntity) – GenericEntity subclass, corresponding to an entry in the database.

Return type:

GenericEntity

Returns:

An updated GenericEntity subclass

_update_multiple(entities)

Update multiple entities via the API.

Parameters:

entities (list[GenericEntity]) – list of GenericEntity subclasses, corresponding to several entries in the database.

Return type:

list[GenericEntity]

Returns:

list of updated GenericEntity subclasses

class BsdApi(authenticator, verbose=True)

Bases: GenericApi

This is an API object specifically designed for the BioSamples Database.

Please note: If you need to access the ‘dev’ environment, please set the environment variable ‘API_ENVIRONMENT’ to the value ‘dev’. Otherwise, this API object will point to the production BioSamples archive.
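A minimal sketch of how an environment variable can select between a ‘dev’ and a production base URI. The URIs below are placeholders on example.org (the real BioSamples endpoints are not stated in this section), and resolve_base_uri is a hypothetical helper, not part of the documented API.

```python
import os
from typing import Mapping

# Placeholder URIs, assumed for illustration only.
BASE_URIS = {
    "dev": "https://dev.example.org/api",
    "prod": "https://example.org/api",
}


def resolve_base_uri(environ: Mapping[str, str] = os.environ) -> str:
    """Return the dev URI only when API_ENVIRONMENT is exactly 'dev'."""
    if environ.get("API_ENVIRONMENT") == "dev":
        return BASE_URIS["dev"]
    # Any other value, or an unset variable, falls back to production.
    return BASE_URIS["prod"]


dev_uri = resolve_base_uri({"API_ENVIRONMENT": "dev"})
default_uri = resolve_base_uri({})
```

Because the same variable is read by the authenticator, setting it once before constructing either object keeps the two pointed at the same environment.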

Parameters:
  • authenticator (GenericAuthenticator) – Subclass instance from the authenticator module. For BioSamples, it’s recommended to use the WebinAuthenticator.

  • verbose (bool) – True if logger should be set to INFO. Default WARNING.

_submit(entity, kwargs)

Submit a single Biosample entity to BSD.

Parameters:
  • entity (Biosample) – Biosample GenericEntity subclass.

  • kwargs (dict) – Keyword arguments. Not used by this function.

Return type:

Biosample

Returns:

a single, archived Biosample entity.

_submit_multiple(entities, kwargs)

Submit a list of BioSample entities to BioSamples, using the bulk-submit endpoint.

Parameters:
  • entities (list[Biosample]) – Iterable (List/Tuple) of BioSample objects. Must always be an iterable.

  • kwargs (dict) –

    Keyword arguments:

    • ’chunk_size’: integer, may be set up to determine the size of chunks to send to BSD at once. Due to BSD technical limitations, capped at 500.

    • ’process_relationships’: bool, if set to true, after submission, updates the samples with the relationships.

Return type:

list[Biosample]

Returns:

a list of BioSample entities
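The chunking behaviour described for ’chunk_size’ can be sketched as below. The cap of 500 comes from the description above; the function name chunked and the decision to clamp larger values silently (rather than raise) are assumptions made for this example.

```python
from typing import Sequence

MAX_CHUNK_SIZE = 500  # cap stated in the chunk_size description


def chunked(entities: Sequence, chunk_size: int = MAX_CHUNK_SIZE) -> list:
    """Split entities into consecutive chunks of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    # Assumption: requests above the cap are clamped, not rejected.
    size = min(chunk_size, MAX_CHUNK_SIZE)
    return [entities[i:i + size] for i in range(0, len(entities), size)]


chunks = chunked(list(range(1200)), chunk_size=1000)
chunk_lengths = [len(chunk) for chunk in chunks]
```

Each chunk would then be sent to the bulk-submit endpoint in turn and the results concatenated into one list.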

_retrieve(accession)

Retrieve a sample from BioSamples by using an accession.

Parameters:

accession (str) – Accession ID, in BioSamples format

Return type:

Biosample

Returns:

Biosample entity retrieved from the BioSample database

_retrieve_multiple(accession_list)

Retrieve multiple samples from BioSamples by providing a list of accessions.

Parameters:

accession_list (list[str]) – Iterable (tuple|list) with accessions

Return type:

list[Biosample]

Returns:

List of BioSample entities retrieved from BioSamples API

_update(entity)

Update a sample that is already in the BioSamples database. Samples must be updated using the FULL metadata, as per BSD specifications https://www.ebi.ac.uk/biosamples/docs/references/api/submit#_update_sample

Parameters:

entity (Biosample) – Biosample object loaded with the metadata, including the accession

Return type:

Biosample

Returns:

Updated sample contained in Biosample object

_update_multiple(entities)

Updates multiple samples in the BSD database. Since they can only be updated one at a time, calls _update() once per sample in the list.

Parameters:

entities (list[Biosample]) – List of Biosample entities to update

Return type:

list[Biosample]

Returns:

List with updated Biosample entities

validate_sample(entity)

Validate a sample before submission. The errors returned are the same as the ones you get when you submit, so they are handled in the same way.

Parameters:

entity (Biosample) – Biosample entity to be validated.

Returns:

process_relationships(entities)

Process the relationships from a list of submitted entities. Assumes the relationships are defined in the metadata as characteristics.derived_from/same_as, and that the entities are linked via their name, not accession.

If multiple relationships of the same type have to be defined, please use the delimiter as the input value (e.g. same_sample1||same_sample2 under same_as property)

Parameters:

entities (list[Biosample]) – List of Biosample entities to update their relationships.

Return type:

list[Biosample]

Returns:

list of updated entities
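As a rough illustration of the delimiter convention used by process_relationships, the sketch below splits a relationship value and maps sample names to accessions. The ‘||’ delimiter is inferred from the example same_sample1||same_sample2; the function names and the name-to-accession dictionary are hypothetical.

```python
DELIMITER = "||"  # assumed from the example value in the description


def split_relationship_targets(value: str) -> list:
    """Split a delimited relationship value into individual sample names."""
    return [name.strip() for name in value.split(DELIMITER) if name.strip()]


def resolve_targets(value: str, accession_by_name: dict) -> list:
    """Translate the sample names in a relationship value into accessions."""
    return [accession_by_name[name] for name in split_relationship_targets(value)]


# Hypothetical lookup built from the entities returned after submission.
accessions = {"same_sample1": "ACC_A", "same_sample2": "ACC_B"}
names = split_relationship_targets("same_sample1||same_sample2")
targets = resolve_targets("same_sample1||same_sample2", accessions)
```

Resolving names only after submission matters because the accessions do not exist until the archive has assigned them.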

search_samples(text='', attributes=None)

Search for samples in the BioSamples database. You can search either with free text (which can be refined using the query syntax specified here: https://www.ebi.ac.uk/ebisearch/documentation) or by attribute values. Provide the attributes as a dictionary.

Parameters:
  • text (str) – free text for the search. Can use query syntax for search engines (AND/OR etc)

  • attributes – Attributes to filter by. Has to be provided as a dictionary {<attribute_name>: <attr. value>}

Return type:

list[Biosample]

Returns:

list of Biosamples or an empty list.

submit_structured_data(structured_data)

Submit structured data to a sample in BioSamples. The data is checked before submission. May raise:

  • StructuredDataError: Pre-submission errors

  • StructuredDataSubmissionError: Post-submission errors

Parameters:

structured_data (dict) – Structured data that’s going to be posted in BSD. Must follow the format in https://www.ebi.ac.uk/biosamples/docs/references/api/submit#_submit_structured_data

Return type:

list[Biosample]

Returns:

Biosample entity with the structured data

_check_structured_data(structured_data)

Check the structured data is correct using the data models and pydantic. Model used: biobroker.metadata_entity.data_model.StructuredDataModel

Parameters:

structured_data (dict) – Structured data.

Raises:

StructuredDataError

_submit_errors(response)

Submission errors and how they should be handled. BioSamples sometimes returns responses that cannot be parsed as JSON, so this method handles the type and display of errors during submission.

Errors being raised:
Parameters:

response (Response) – response obtained during submission. Usually r.status_code > 300

Return type:

None

Returns:

None if no errors are detected.
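One way to cope with error responses that may or may not be JSON is sketched below. This is not the library's logic: extract_error_message is a hypothetical helper that works on a status code and a body string, and it returns a message instead of raising the exceptions listed in the Exceptions section.

```python
import json
from typing import Optional


def extract_error_message(status_code: int, body: str) -> Optional[str]:
    """Return a printable error message, or None for a successful status."""
    if status_code < 300:
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Not JSON: fall back to the raw text, or to a generic message.
        return body.strip() or f"HTTP {status_code} with no error message"
    # JSON body: pretty-print it so nested errors stay readable.
    return json.dumps(payload, indent=2, sort_keys=True)


ok = extract_error_message(201, "{}")
plain = extract_error_message(400, "Bad request")
empty = extract_error_message(500, "")
structured = extract_error_message(400, '{"error": "missing field"}')
```

Handling the non-JSON case first keeps a parsing failure from masking the original submission error.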

static _build_search_query(text, attributes)

Build the search query for BSD. Attributes need to be joined. page=0 is specified so that the BioSamples API returns paginated results (undocumented behaviour).
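A possible shape for such a query builder is sketched below with the standard library. The parameter names text, filter and page, and the attr:<name>:<value> filter format, are assumptions about the BioSamples search API rather than something this section confirms; build_search_query is a hypothetical name.

```python
from typing import Optional
from urllib.parse import urlencode


def build_search_query(text: str = "", attributes: Optional[dict] = None) -> str:
    """Join free text and attribute filters into a URL query string."""
    params = []
    if text:
        params.append(("text", text))
    # Assumed filter format: one 'filter' parameter per attribute.
    for name, value in (attributes or {}).items():
        params.append(("filter", f"attr:{name}:{value}"))
    # page=0 is always added, as the description above specifies.
    params.append(("page", 0))
    return urlencode(params)


query = build_search_query("liver", {"organism": "Homo sapiens"})
empty_query = build_search_query()
```

Passing a list of pairs to urlencode lets the same key appear several times, which a dictionary would not allow.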

Parameters:
  • text (str) – Free text to search by.

  • attributes (dict) – Dictionary of attributes and values to filter by.

Return type:

str

Returns:

static _is_invalid_for_update(entity)

Checks if the sample is invalid for update.

Parameters:

entity (Biosample) – Sample to be checked.

Return type:

list[str] | bool

Returns:

A list of validation errors, or False

Exceptions

parse_checklist_validation_errors(validation_errors)

Parse a checklist validation error and return it in a printable state.

Parameters:

validation_errors (list[dict]) – Response list with validation error dictionaries, containing dataPath and errors.

Returns:

printable string.
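A sketch of what "printable" could look like for this input. Only the dataPath and errors keys come from the description above; the output layout, the fallback for a missing path, the sample error text and the function name format_validation_errors are assumptions.

```python
def format_validation_errors(validation_errors: list) -> str:
    """Render one line per validation error dictionary."""
    lines = []
    for item in validation_errors:
        path = item.get("dataPath", "<unknown path>")
        messages = item.get("errors", [])
        # Several messages for the same path are joined on one line.
        lines.append(f"{path}: {'; '.join(messages)}")
    return "\n".join(lines)


# Hypothetical response fragment used only to exercise the function.
sample_errors = [
    {"dataPath": "/characteristics/organism", "errors": ["is required"]},
    {"dataPath": "/name", "errors": ["too short", "invalid characters"]},
]
report = format_validation_errors(sample_errors)
```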

exception AccessionHasIncorrectFormat(accession, logger)

Bases: Exception

exception CantBeUpdatedApiError(sample_id, response, logger)

Bases: Exception

exception CantBeUpdatedLocalError(sample_id, reasons, logger)

Bases: Exception

exception BiosamplesValidationError(response_text, logger)

Bases: Exception

exception BiosamplesNoErrorMessageError(status_code, logger)

Bases: Exception

exception ChecklistValidationError(response_text, logger)

Bases: Exception

exception StructuredDataError(logger, errors)

Bases: Exception

exception StructuredDataSubmissionError(logger, response)

Bases: Exception