Output processor
- GenericOutputProcessor: Generic output processor.
- TsvOutputProcessor: TSV output processor.
- XlsxOutputProcessor: Excel output processor.
Output metadata processor. The goal of this module is to take an input list of GenericEntity objects, transform it into a dataframe, and save it in different formats using pandas. Pretty simple!
Mandatory arguments:
- output_path: path to the file where the metadata is saved.
Optional arguments:
- verbose: set to True to display logging events of level INFO and above. If not set, or set to False, only WARNING and above are displayed.
Subclasses of GenericOutputProcessor must define the following methods/properties:
- _save
- class GenericOutputProcessor(output_path, verbose=False)
Bases: object
Generic output processor. Defines the methods that subclasses need in order to work.
- Parameters:
output_path (str) – Path where the file is saved. Please include the name and extension of the file.
- save(entities)
Transform the entities into a dataframe to use pandas functionality to save.
- Parameters:
entities (list[GenericEntity]) – Instances of GenericEntity subclasses.
- _save(dataframe)
Function to be overridden by subclasses. Takes a dataframe and saves the output to self.path.
- Parameters:
dataframe (DataFrame) – Dataframe containing the flattened metadata from the GenericEntity subclasses.
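The split between save() and _save() described above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: `SketchEntity`, `SketchOutputProcessor`, and `RecordingProcessor` are hypothetical stand-ins, and flattening entities with `dataclasses.asdict` plus `pandas.json_normalize` is only an assumption about what "flattened metadata" means here. The way `verbose` maps to logging levels is likewise assumed from the module description.

```python
import logging
from dataclasses import asdict, dataclass, field

import pandas as pd


@dataclass
class SketchEntity:
    """Hypothetical stand-in for a GenericEntity subclass."""
    alias: str
    attributes: dict = field(default_factory=dict)


class SketchOutputProcessor:
    """Hypothetical base class mirroring the documented interface."""

    def __init__(self, output_path, verbose=False):
        self.path = output_path
        # Assumption: verbose switches the logger between INFO and WARNING.
        self.logger = logging.getLogger(type(self).__name__)
        self.logger.setLevel(logging.INFO if verbose else logging.WARNING)

    def save(self, entities):
        # One row per entity; nested fields become dotted column names.
        dataframe = pd.json_normalize([asdict(entity) for entity in entities])
        self.logger.info("Saving %d entities to %s", len(entities), self.path)
        self._save(dataframe)

    def _save(self, dataframe):
        raise NotImplementedError("Subclasses must define _save")


class RecordingProcessor(SketchOutputProcessor):
    """Toy subclass that keeps the dataframe instead of writing a file."""

    def _save(self, dataframe):
        self.saved = dataframe


processor = RecordingProcessor("unused.txt")
processor.save([
    SketchEntity("sample_1", {"organism": "human"}),
    SketchEntity("sample_2", {"organism": "mouse"}),
])
print(processor.saved)
```

The base class owns the entity-to-dataframe step, so a subclass only has to decide how a finished dataframe is written out.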
- class TsvOutputProcessor(output_path)
Bases: GenericOutputProcessor
TSV output processor. Takes a list of entities and outputs a TSV file with the processed metadata.
- Parameters:
output_path (str) – Path to the file being saved. Please include the ‘.tsv’ extension.
- _save(dataframe)
Save the resulting dataframe from save() into a TSV file, using pandas functionality. NO, the delimiter is not customizable. Create another subclass if you want that. TSV means TAB-Separated Values, not commas, not pipes, not anything else. You weirdo.
- Parameters:
dataframe (DataFrame) – Dataframe containing the flattened metadata from the GenericEntity subclasses.
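For readers who do want another delimiter, the "create another subclass" route might look something like the sketch below. Everything named `Sketch...` is hypothetical: `SketchBase` stands in for GenericOutputProcessor so the block runs on its own, `SketchTsvProcessor` shows one plausible way a tab-separated `_save` could be written with `DataFrame.to_csv`, and `SketchPipeProcessor` is the separate subclass with a different delimiter. Whether the real TsvOutputProcessor drops the index column is an assumption.

```python
import os
import tempfile

import pandas as pd


class SketchBase:
    """Hypothetical stand-in for GenericOutputProcessor."""

    def __init__(self, output_path):
        self.path = output_path

    def _save(self, dataframe):
        raise NotImplementedError("Subclasses must define _save")


class SketchTsvProcessor(SketchBase):
    def _save(self, dataframe):
        # Assumption: tab delimiter and no index column in the output.
        dataframe.to_csv(self.path, sep="\t", index=False)


class SketchPipeProcessor(SketchBase):
    def _save(self, dataframe):
        # A different delimiter lives in its own subclass, not in a parameter.
        dataframe.to_csv(self.path, sep="|", index=False)


frame = pd.DataFrame(
    {"alias": ["sample_1", "sample_2"], "organism": ["human", "mouse"]}
)

with tempfile.TemporaryDirectory() as tmp_dir:
    tsv_path = os.path.join(tmp_dir, "metadata.tsv")
    SketchTsvProcessor(tsv_path)._save(frame)
    with open(tsv_path, encoding="utf-8") as handle:
        tsv_lines = handle.read().splitlines()

    pipe_path = os.path.join(tmp_dir, "metadata.psv")
    SketchPipeProcessor(pipe_path)._save(frame)
    with open(pipe_path, encoding="utf-8") as handle:
        pipe_lines = handle.read().splitlines()

print(tsv_lines)
print(pipe_lines)
```

With the real module, the subclass would inherit from GenericOutputProcessor and be driven through save() rather than by calling _save directly.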
- class XlsxOutputProcessor(output_path, sheet_name='Sheet1')
Bases: GenericOutputProcessor
Excel output processor. Takes a list of entities and outputs an Excel file with the processed metadata.
- Parameters:
output_path – Path to the file being saved. Please include the ‘.xlsx’ extension.