{ "cells": [ { "cell_type": "markdown", "id": "a569fb6a", "metadata": {}, "source": [ "# Submit to biosamples\n", "\n", "This notebook will serve to show how to use this library, in the simplest way:\n", "\n", "- Before you start\n", " - Generate a valid input file with metadata about 1 sample (A very, very simple TSV)\n", "- What components do we need\n", "- What do we need to input to each component\n", "- How to correct our samples before submission\n", "- How to submit\n", "- See the results in Excel (or TSV, your decision)\n", "\n", "This is the simplest example; in other notebooks, we will explore how to submit multiple samples, with relationships defined amongst them, how to validate our samples against ENA checklists, and how to transform input data.\n", "\n", "## Before you start\n", "\n", "- Please make sure you have `python 3.10` or higher\n", "- Please make sure you have a webin acount set up in [webin-dev](https://wwwdev.ebi.ac.uk/ena/submit/webin/login)\n", "- Please make sure you have the latest biobroker library installed: `pip install biobroker`" ] }, { "cell_type": "code", "execution_count": 1, "id": "d0c20907", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Collecting biobroker==0.0.4\n", " Downloading biobroker-0.0.4-py3-none-any.whl.metadata (42 kB)\n", "Requirement already satisfied: numpy~=1.23.1 in /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages (from biobroker==0.0.4) (1.23.1)\n", "Requirement already satisfied: openpyxl==3.1.2 in /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages (from biobroker==0.0.4) (3.1.2)\n", "Requirement already satisfied: pandas~=2.0.3 in /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages (from biobroker==0.0.4) (2.0.3)\n", "Requirement already satisfied: progressbar2~=4.4.2 in /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages (from biobroker==0.0.4) (4.4.2)\n", "Requirement already satisfied: requests>=2.31.0 in /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages (from biobroker==0.0.4) (2.31.0)\n", "Requirement already satisfied: et-xmlfile in /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages (from openpyxl==3.1.2->biobroker==0.0.4) (1.1.0)\n", "Requirement already satisfied: python-dateutil>=2.8.2 in /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages (from pandas~=2.0.3->biobroker==0.0.4) (2.9.0.post0)\n", "Requirement already satisfied: pytz>=2020.1 in /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages (from pandas~=2.0.3->biobroker==0.0.4) (2024.2)\n", "Requirement already satisfied: tzdata>=2022.1 in /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages (from pandas~=2.0.3->biobroker==0.0.4) (2024.2)\n", "Requirement already satisfied: python-utils>=3.8.1 in /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages (from progressbar2~=4.4.2->biobroker==0.0.4) (3.9.0)\n", "Requirement already satisfied: charset-normalizer<4,>=2 in /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages (from requests>=2.31.0->biobroker==0.0.4) (3.3.2)\n", "Requirement already satisfied: idna<4,>=2.5 in /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages (from requests>=2.31.0->biobroker==0.0.4) (3.10)\n", "Requirement already satisfied: urllib3<3,>=1.21.1 in /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages (from requests>=2.31.0->biobroker==0.0.4) (2.2.3)\n", "Requirement already satisfied: certifi>=2017.4.17 in /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages (from requests>=2.31.0->biobroker==0.0.4) (2024.8.30)\n", "Requirement already satisfied: six>=1.5 in /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages (from python-dateutil>=2.8.2->pandas~=2.0.3->biobroker==0.0.4) (1.16.0)\n", "Requirement already satisfied: typing-extensions>3.10.0.2 in /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages (from python-utils>=3.8.1->progressbar2~=4.4.2->biobroker==0.0.4) (4.12.2)\n", "Downloading biobroker-0.0.4-py3-none-any.whl (55 kB)\n", "Installing collected packages: biobroker\n", "Successfully installed biobroker-0.0.4\n", "Note: you may need to restart the kernel to use updated packages.\n" ] } ], "source": [ "%pip install --upgrade biobroker" ] }, { "cell_type": "markdown", "id": "5c5df12c", "metadata": {}, "source": [ "### Generate input file\n", "\n", "I don't want to have an example file of this kind in the examples, so, let's generate it ourselves! Let's do a simple example with 2 attributes: \"name\" and \"collected_at\"" ] }, { "cell_type": "code", "execution_count": 2, "id": "c4803020", "metadata": {}, "outputs": [], "source": [ "sample_tsv = [\n", " [\"name\", \"collected_at\"],\n", " [\"sumple\", \"noon\"] \n", "]\n", "\n", "writable_sample = \"\\n\".join([\"\\t\".join(row) for row in sample_tsv])\n", "with open(\"simple_sample_sumple.tsv\", \"w\") as f:\n", " f.write(writable_sample)" ] }, { "cell_type": "markdown", "id": "e4dd82cd", "metadata": {}, "source": [ "\"name\" is a especial property in biosamples. We'll talk about it later; for now, just remember that this property has to always be set up.\n", "\n", "## What components do we need\n", "\n", "Given that you've read the documentation in the main page (I know you've done it, I wrote it with lots of love) you will know by now that, in order to submit, we need:\n", "- An authenticator: To authenticate ourselves to the archive\n", "- An api: To store/execute the instructions to submit to the archive\n", "- At least 1 metadata entity: To store the metadata about our samples\n", "- An input processor: To process the input file with the metadata\n", "\n", "Additionally, I will also import an output processor; Not needed, but we will be able to save our brokering results to a very nice, very demure and readable excel file.\n", "\n", "Additionally number 2: Metadata entities have a very nice thing, they have a `static method` (That just means you can call the method without creating an instance) that gives you guidelines on how to fill out the metadata. Lovely, eh?" ] }, { "cell_type": "code", "execution_count": 3, "id": "cc118db6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A Biosamples entity MUST have the following properties set:\n", "\t- name: a descriptive title for the sample\n", "\t- organism: a string that validates against NCBITaxon records \n", "\t- release: date of release for the metadata of the entity, in YYYY-MM-DD format. Accepts iso format\n", "For more information, please see https://www.ebi.ac.uk/biosamples/docs/references/api/submit#_submission_minimal_fields.\n", "\n", "To indicate relationships in the samples, please use a field named after the relationshipitself: namely, 'derived_from', 'same_as', 'has_member' or 'child_of'.\n", "Please seehttps://www.ebi.ac.uk/biosamples/docs/guides/relationships\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2024-10-16 21:36:25,593 - Biosample - ERROR - Metadata content has failed validation for 'sumple':\n", "\t- root: Missing mandatory field 'release'. Provided value: '{'characteristics': {'collected_at': [{'text': 'noon'}]}, 'name': 'sumple'}'\n", "\t- characteristics: Value error, 'organism' must be set. Please use the keys 'organism', 'Organism', 'species' or 'Species'. Provided value: '{'collected_at': [{'text': 'noon'}]}'\n", "2024-10-16 21:37:42,280 - BsdApi - INFO - Set up BSD API successfully: using base uri 'https://wwwdev.ebi.ac.uk/biosamples/samples'\n" ] } ], "source": [ "from biobroker.authenticator import WebinAuthenticator # Biosamples uses the WebinAuthenticator\n", "from biobroker.api import BsdApi # BioSamples Database (BSD) API\n", "from biobroker.metadata_entity import Biosample # The metadata entity\n", "from biobroker.input_processor import TsvInputProcessor # An input processor\n", "from biobroker.output_processor import XlsxOutputProcessor # An output processor\n", "\n", "print(Biosample.guidelines())" ] }, { "cell_type": "markdown", "id": "bc0b4d25", "metadata": {}, "source": [ "Now we have imported everything! See how easy it is?\n", "\n", "You may have noticed that, aside from the `name`, there are 2 other mandatory fields:\n", "- organism: Biosamples requires for the samples to identify which is the taxonomic classification for the organism the sample comes from. This is not important now - It will throw an error later and we will correct it.\n", "- release: As with any archive, the [meta]data can be stored as private for an amount of time. This sets the release date. We need to set it up to the second, but `biobroker` provides a relaxed parser, so we'll set it for today.\n", "\n", "## How to set up each component\n", "\n", "Alrighty! Let's start setting up the components:\n", "\n", "### 1. Set up the input processor\n", "\n", "For the input processor, we just need to give the path to the input file :)" ] }, { "cell_type": "code", "execution_count": 4, "id": "76f31163", "metadata": {}, "outputs": [], "source": [ "path = \"simple_sample_sumple.tsv\" # This is the file we created previously\n", "\n", "input_processor = TsvInputProcessor(input_data=path)" ] }, { "cell_type": "markdown", "id": "a72980fb", "metadata": {}, "source": [ "Let's check it out!" ] }, { "cell_type": "code", "execution_count": 5, "id": "f57e97eb", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'name': 'sumple', 'collected_at': 'noon'}]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "input_processor.input_data" ] }, { "cell_type": "markdown", "id": "649f18e5", "metadata": {}, "source": [ "There's another functionality of the input processors: the `transform` function. I will discuss it in further notebooks!\n", "\n", "For now, we have the input processor set up. Cool!\n", "\n", "### 2. Set up the samples\n", "\n", "Now that we have the data in an object... where does that go?\n", "\n", "Well, it's as simple as: the input processors have a method called `process`. You give, as an input to this function, the class of `metadata_entity`s that you want to create, and it returns a list of those entities created from the `.input_data`. If you want to see more documentation on that, refer to [ReadTheDocs](https://biobroker.readthedocs.io/en/latest/biobroker.input_processor.html#biobroker.input_processor.input_processor.GenericInputProcessor.process)" ] }, { "cell_type": "code", "execution_count": 6, "id": "01fa6a9e", "metadata": {}, "outputs": [ { "ename": "EntityValidationError", "evalue": "Metadata content has failed validation for 'sumple':\n\t- root: Missing mandatory field 'release'. Provided value: '{'characteristics': {'collected_at': [{'text': 'noon'}]}, 'name': 'sumple'}'\n\t- characteristics: Value error, 'organism' must be set. Please use the keys 'organism', 'Organism', 'species' or 'Species'. Provided value: '{'collected_at': [{'text': 'noon'}]}'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mEntityValidationError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[6], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m my_sample \u001b[38;5;241m=\u001b[39m \u001b[43minput_processor\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mprocess\u001b[49m\u001b[43m(\u001b[49m\u001b[43mBiosample\u001b[49m\u001b[43m)\u001b[49m \u001b[38;5;66;03m# We're giving it a Biosample class to process\u001b[39;00m\n\u001b[1;32m 2\u001b[0m \u001b[38;5;28mprint\u001b[39m(my_sample)\n", "File \u001b[0;32m~/PycharmProjects/biobroker/biobroker/input_processor/input_processor.py:55\u001b[0m, in \u001b[0;36mGenericInputProcessor.process\u001b[0;34m(self, entity)\u001b[0m\n\u001b[1;32m 53\u001b[0m entities \u001b[38;5;241m=\u001b[39m []\n\u001b[1;32m 54\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m json_entity \u001b[38;5;129;01min\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39minput_data:\n\u001b[0;32m---> 55\u001b[0m new_entity \u001b[38;5;241m=\u001b[39m \u001b[43mentity\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmetadata_content\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdeepcopy\u001b[49m\u001b[43m(\u001b[49m\u001b[43mjson_entity\u001b[49m\u001b[43m)\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 56\u001b[0m entities\u001b[38;5;241m.\u001b[39mappend(new_entity)\n\u001b[1;32m 57\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m entities\n", "File \u001b[0;32m~/PycharmProjects/biobroker/biobroker/metadata_entity/metadata_entity.py:154\u001b[0m, in \u001b[0;36mBiosample.__init__\u001b[0;34m(self, metadata_content, data_model, delimiter, verbose)\u001b[0m\n\u001b[1;32m 151\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21m__init__\u001b[39m(\u001b[38;5;28mself\u001b[39m, metadata_content: \u001b[38;5;28mdict\u001b[39m, data_model: Type[BaseModel] \u001b[38;5;241m=\u001b[39m BiosampleGeneralModel,\n\u001b[1;32m 152\u001b[0m delimiter: \u001b[38;5;28mstr\u001b[39m \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m||\u001b[39m\u001b[38;5;124m\"\u001b[39m, verbose: \u001b[38;5;28mbool\u001b[39m \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mFalse\u001b[39;00m):\n\u001b[1;32m 153\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mdelimiter \u001b[38;5;241m=\u001b[39m delimiter\n\u001b[0;32m--> 154\u001b[0m \u001b[38;5;28;43msuper\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[38;5;21;43m__init__\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43mmetadata_content\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mdata_model\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdata_model\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mverbose\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mverbose\u001b[49m\u001b[43m)\u001b[49m\n", "File \u001b[0;32m~/PycharmProjects/biobroker/biobroker/metadata_entity/metadata_entity.py:35\u001b[0m, in \u001b[0;36mGenericEntity.__init__\u001b[0;34m(self, metadata_content, data_model, verbose)\u001b[0m\n\u001b[1;32m 33\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_entity \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m\n\u001b[1;32m 34\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mentity \u001b[38;5;241m=\u001b[39m metadata_content\n\u001b[0;32m---> 35\u001b[0m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mvalidate\u001b[49m\u001b[43m(\u001b[49m\u001b[43mdata_model\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdata_model\u001b[49m\u001b[43m)\u001b[49m\n", "File \u001b[0;32m~/PycharmProjects/biobroker/biobroker/metadata_entity/metadata_entity.py:82\u001b[0m, in \u001b[0;36mGenericEntity.validate\u001b[0;34m(self, data_model)\u001b[0m\n\u001b[1;32m 80\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mentity \u001b[38;5;241m=\u001b[39m json\u001b[38;5;241m.\u001b[39mloads(data_model(\u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mentity)\u001b[38;5;241m.\u001b[39mmodel_dump_json(exclude_unset\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mTrue\u001b[39;00m, by_alias\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mTrue\u001b[39;00m))\n\u001b[1;32m 81\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m pydantic_core\u001b[38;5;241m.\u001b[39mValidationError \u001b[38;5;28;01mas\u001b[39;00m pydantic_error:\n\u001b[0;32m---> 82\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m EntityValidationError(\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mlogger, entity_id\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mid, errors\u001b[38;5;241m=\u001b[39mpydantic_error\u001b[38;5;241m.\u001b[39merrors()) \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m\n", "\u001b[0;31mEntityValidationError\u001b[0m: Metadata content has failed validation for 'sumple':\n\t- root: Missing mandatory field 'release'. Provided value: '{'characteristics': {'collected_at': [{'text': 'noon'}]}, 'name': 'sumple'}'\n\t- characteristics: Value error, 'organism' must be set. Please use the keys 'organism', 'Organism', 'species' or 'Species'. Provided value: '{'collected_at': [{'text': 'noon'}]}'" ] } ], "source": [ "my_sample = input_processor.process(Biosample) # We're giving it a Biosample class to process\n", "print(my_sample)" ] }, { "cell_type": "markdown", "id": "587eef34-fd97-456e-8f57-7997e5b1b09f", "metadata": {}, "source": [ "As we can see, the library is already complaining; when setting up a BioSample entity, it gets validated against a BioSample general model generated with pydantic. This model requires the release date and the organism to be set up. Let's dissect one of the messages:\n", "\n", "- root: Missing mandatory field 'release'. Provided value: '{'characteristics': {'collected_at': [{'text': 'noon'}]}, 'name': 'sumple'}'\n", "\n", "This message is composed of three parts:\n", "- `root:`: This indicates where the error happened. In this case, it happened at the root of the sample metadata.\n", "- `Missing mandatory field 'release'.`: This is the error message. It's telling you it's missing a mandatory field, and the name of the field\n", "- `Provided value: '{'characteristics': {'collected_at': [{'text': 'noon'}]}, 'name': 'sumple'}'`: This is telling you what you provided at the level at which the error happened. For missing characteristics, it's not super useful, but for incorrect ones it reminds you what was the value you sent.\n", "\n", "Let's fix this and process the samples again! \n", "\n", "(Please note: it is way simpler to fix the input data from the source, but I am not going to create another section with the exact same steps and just modifying the tsv)" ] }, { "cell_type": "code", "execution_count": 7, "id": "9b1e5958-2499-47a0-8875-e0394e8a364b", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[]\n" ] } ], "source": [ "input_processor.input_data[0]['release'] = \"2024-10-13\"\n", "input_processor.input_data[0]['organism'] = \"Homo sapiens\" # Characteristics are handled automatically. Fields are set by the Biosample.\n", "\n", "my_sample = input_processor.process(Biosample)\n", "print(my_sample)" ] }, { "cell_type": "markdown", "id": "968305c3", "metadata": {}, "source": [ "It's... a list of objects?\n", "\n", "Yup! The `process` function always returns a list of objects (`Biosample` entities in this case). This makes writing against the output much easier, as you don't need to handle methods to work against a list or a single entity. Don't be lazy - Write against the list!\n", "\n", "(Also, let's see how the metadata inside has been transformed)" ] }, { "cell_type": "code", "execution_count": 8, "id": "44aa1d6a", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'characteristics': {'collected_at': [{'text': 'noon'}], 'organism': [{'text': 'Homo sapiens'}]}, 'name': 'sumple', 'release': '2024-10-13T00:00:00Z'}\n" ] } ], "source": [ "print(my_sample[0].entity)" ] }, { "cell_type": "markdown", "id": "6025d8f4", "metadata": {}, "source": [ "Now the sample is set up! See how it has re-structured the metadata?\n", "\n", "This is the format that biosamples expects their metadata to be. You don't need to understand everything - Just know, there are certain keywords (e.g. name) that get treated differently, and everything else is stored under `characteristics`. You can review the list of properties in the RTD docs: [ROOT PROPERTIES](https://biobroker.readthedocs.io/en/latest/biobroker.metadata_entity.html#biobroker.metadata_entity.metadata_entity.ROOT_PROPERTIES)\n", "\n", "### 3. Setting up the authenticator + API\n", "\n", "Now, we need to set up the authenticator and the API. For this example, we're going to use BioSamples dev - the testing environment.\n", "\n", "For that, we will set up an environment variable, `API_ENVIRONMENT`, and we will provide the authenticator with our webin-dev username and password." ] }, { "cell_type": "code", "execution_count": 9, "id": "92f3d5f6", "metadata": {}, "outputs": [], "source": [ "import os\n", "os.environ['API_ENVIRONMENT'] = \"dev\" # There are multiple ways to set up environment variables\n", "\n", "username = \"\" # Your username goes here\n", "password = \"\" # Your password goes here\n", "authenticator = WebinAuthenticator(username=username, password=password)\n", "\n", "api = BsdApi(authenticator=authenticator)" ] }, { "cell_type": "markdown", "id": "49e70db7", "metadata": {}, "source": [ "For your password and username in a workflow environment, I would recommend either to set them up as environment variables and load them in your script, or use a config file that you're sure it's not going to be pushed to the repository. Be mindful!\n", "\n", "Now that we have everything set up, let's try to submit!\n", "\n", "### 4. Submitting your sample\n", "\n", "This step is very easy - Since we've done everything, we just need to hit submit on the API object and pass the samples we generated!" ] }, { "cell_type": "code", "execution_count": 10, "id": "6775d9e4", "metadata": {}, "outputs": [], "source": [ "submitted_samples = api.submit(my_sample)" ] }, { "cell_type": "markdown", "id": "b97ed239", "metadata": {}, "source": [ "One very cool thing is that `metadata_entity` objects are set up as dictionaries. The same way you would add a key:value pair to a dictionary, you can do the same with a metadata entity - And the entity will handle where and how to write it.\n", "\n", "Let's see the content!" ] }, { "cell_type": "code", "execution_count": 11, "id": "aa8e2693", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'characteristics': {'SRA accession': [{'text': 'ERS31055558'}], 'collected_at': [{'text': 'noon'}], 'organism': [{'text': 'Homo sapiens'}]}, 'name': 'sumple', 'accession': 'SAMEA131421913', 'release': '2024-10-13T00:00:00Z', 'sraAccession': 'ERS31055558', 'webinSubmissionAccountId': 'Webin-64342', 'taxId': 9606, 'status': 'PUBLIC', 'update': '2024-10-16T20:37:46.178Z', 'submitted': '2024-10-16T20:37:46.178Z', 'submittedVia': 'JSON_API', 'create': '2024-10-16T20:37:46.178Z', '_links': {'self': {'href': 'https://wwwdev.ebi.ac.uk/biosamples/samples'}, 'curationDomain': {'href': 'https://wwwdev.ebi.ac.uk/biosamples/samples{?curationdomain}', 'templated': True}, 'curationLinks': {'href': 'https://wwwdev.ebi.ac.uk/biosamples/samples/SAMEA131421913/curationlinks'}, 'curationLink': {'href': 'https://wwwdev.ebi.ac.uk/biosamples/samples/SAMEA131421913/curationlinks/{hash}', 'templated': True}, 'structuredData': {'href': 'https://wwwdev.ebi.ac.uk/biosamples/structureddata/SAMEA131421913'}}}\n" ] } ], "source": [ "print(submitted_samples[0].entity)" ] }, { "cell_type": "markdown", "id": "1b46b4b9", "metadata": {}, "source": [ "And it's submitted! if you want to see it, it's already available in biosamples dev:" ] }, { "cell_type": "code", "execution_count": 12, "id": "e6a8ebe2", "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "https://wwwdev.ebi.ac.uk/biosamples/samples/SAMEA131421913\n" ] } ], "source": [ "print(f\"https://wwwdev.ebi.ac.uk/biosamples/samples/{submitted_samples[0]['accession']}\")" ] }, { "cell_type": "markdown", "id": "a6956141", "metadata": {}, "source": [ "(You may need to wait a bit - Biosamples dev operates a bit slower, as it's normal for testing grounds. It may take a while to make the sample public)\n", "\n", "Let's create an output file so you can be happy with your local version of the metadata!" ] }, { "cell_type": "code", "execution_count": 13, "id": "bf18c202-0aee-4d73-b1ee-e67e908ba513", "metadata": {}, "outputs": [], "source": [ "from biobroker.output_processor import XlsxOutputProcessor\n", "output_processor = XlsxOutputProcessor(output_path=\"simple_sample_submitted.xlsx\", sheet_name=\"Awesome submission\")\n", "output_processor.save(submitted_samples)" ] }, { "cell_type": "markdown", "id": "bbfb7719", "metadata": {}, "source": [ "And you should see something like this! Isn't this demure?\n" ] }, { "cell_type": "markdown", "id": "d6598626-8c43-45f2-ad9c-c1f71992b1a6", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "id": "7d1a23ad-2099-4937-9810-c2e1002b39eb", "metadata": {}, "source": [ "### 5. Enjoy!" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.11" } }, "nbformat": 4, "nbformat_minor": 5 }