openspi.utils
Functions
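count_files(folder_path) | Counts the number of files in a given directory.
reformat_path(path) | Reformats a raw string file path (path\to\folder) to contain double backslashes (path\\to\\folder), as required by the R programming language.
identical_list_items(lst) | Checks if all the values in a given list are the same.
plastic_matches_checked(row, index) | Appends a note to a given list which marks the index of the subsequent plastic match.
nonpolymer_matches_checked(row, nested_list) | Checks a given dataframe (in the form of a nested list) to determine if all matches are empty wells or not.
subsequent_matches_checked(df, nested_list) | Checks if the first match (highest r val) for each sample is plastic, and if it is not, checks the subsequent matches for any plastic matches.
save_df_to_excel(excel_path, df, sheetname) | Saves a Pandas dataframe to an Excel file. If the Excel file does not yet exist, it will create it and save the sheet.
check_excel_sheet(excel_path, sheetname) | Checks if an Excel spreadsheet exists within the (already existing) Excel workbook and creates one if not.
empty_wells_count(df) | Counts how many times 'empty well' appears in a given dataframe.
matches_checked_sheet(excel_path, nrel=False, n=5) | Adds a 'Notes' sheet with the number of nonpolymer matches and empty wells.
count_matches(df) |
list_to_df_to_sheet(df_lst, columns_list, excel_path, sheet_name) | Converts a nested list into a Pandas dataframe, which is then saved to an Excel worksheet.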
Module Contents
- openspi.utils.count_files(folder_path)
Counts the number of files in a given directory.
- Parameters:
folder_path – The path to the directory.
- Returns:
The number of files in the directory, or -1 if the directory does not exist.
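The documented behavior (a file count, or -1 for a missing directory) could be sketched roughly as follows. This is not the openspi source: the name count_files_sketch is hypothetical, and whether the real function counts subdirectories or recurses into them is not stated, so the sketch assumes it counts only regular files directly inside the folder.

```python
import os


def count_files_sketch(folder_path):
    """Hypothetical stand-in for count_files; not the openspi implementation."""
    if not os.path.isdir(folder_path):
        return -1  # documented sentinel for a directory that does not exist
    # Assumption: count only regular files at the top level of the folder;
    # subdirectories and their contents are ignored.
    with os.scandir(folder_path) as entries:
        return sum(1 for entry in entries if entry.is_file())
```

Because -1 is a sentinel rather than an exception, callers would need to check for it explicitly before using the result as a count.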
- openspi.utils.reformat_path(path)
Reformats a raw string file path (path\to\folder) to contain double backslashes (path\\to\\folder), as required by the R programming language. Paths that use forward slashes (/) are left unchanged.
- Parameters:
path (str) – The path to be reformatted.
- Returns:
reformatted_path – The path with every \ replaced by \\.
- Return type:
str
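A minimal sketch of the backslash doubling described above, assuming a plain string replacement is enough. The name reformat_path_sketch is hypothetical and this is not the openspi source. Note that in this sketch a path that already contains doubled backslashes would be doubled again; how the real function treats that case is not documented.

```python
def reformat_path_sketch(path):
    """Hypothetical stand-in for reformat_path; not the openspi implementation."""
    # Double every backslash. A forward-slash path contains no backslashes,
    # so it passes through unchanged.
    return path.replace("\\", "\\\\")


# A raw string keeps the single backslashes as typed.
print(reformat_path_sketch(r"path\to\folder"))   # path\\to\\folder
print(reformat_path_sketch("path/to/folder"))    # path/to/folder
```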
- openspi.utils.identical_list_items(lst)
Checks if all the values in a given list are the same.
- Parameters:
lst (list) – A list of values.
- Returns:
True if all the values are the same, False if not.
- Return type:
bool
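One way such a check might be written is to compare every item against the first. The name identical_list_items_sketch is hypothetical and this is not the openspi source. The behavior for an empty list is not documented; this sketch assumes it returns True.

```python
def identical_list_items_sketch(lst):
    """Hypothetical stand-in for identical_list_items; not the openspi implementation."""
    # Compare each item with the first one. Unlike len(set(lst)) == 1, this
    # also works for unhashable items such as nested lists.
    # Assumption: an empty list yields True, because all() of no items is True
    # and lst[0] is never evaluated in that case.
    return all(item == lst[0] for item in lst)
```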
- openspi.utils.plastic_matches_checked(row, index)
Appends a note to a given list which marks the index of the subsequent plastic match.
- Parameters:
row (list) – A list containing data for one library match for one file.
index (int) – The index of the subsequent plastic match (see subsequent_matches_checked).
- Returns:
row – An updated version of the original row var. It now contains an additional item.
- Return type:
list
- openspi.utils.nonpolymer_matches_checked(row, nested_list)
Checks a given dataframe (in the form of a nested list) to determine if all matches are empty wells or not.
- Parameters:
row (list) – A list containing data for one library match for one file.
nested_list (list) – A list of lists, where each list in the outer list represents a row.
- Returns:
row – An updated version of the original row var. It now contains an additional item.
- Return type:
list
- openspi.utils.subsequent_matches_checked(df, nested_list)
Checks whether the first match (highest r value) for each sample is plastic. If it is not, the subsequent matches are checked for plastic matches, and the place (1st, 2nd, etc.) of any plastic match is noted. If there are no plastic matches, the function checks whether all subsequent matches are for an empty well and notes that. Otherwise, it notes that all subsequent matches are ‘nonpolymer’.
- Parameters:
df (df) – A Pandas dataframe.
nested_list (list) – A list of lists, where each list in the outer list represents a row.
- Returns:
nested_list – An updated version of the original nested_list. Each inner list is now one item longer.
- Return type:
list
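The decision order described for subsequent_matches_checked can be sketched for a single sample as below. Everything here is an illustrative assumption rather than the openspi implementation: the function name note_for_sample, the plastic_names set, the wording of the notes, the empty note when the top match is plastic, and counting positions from the top match as 1. Only the 'empty well' label comes from the documentation.

```python
def note_for_sample(identities, plastic_names):
    """Return a note for one sample (hypothetical helper, not part of openspi).

    identities: match labels for the sample, best match (highest r) first.
    plastic_names: labels treated as plastic (assumed lookup set).
    """
    if identities[0] in plastic_names:
        return ""  # assumption: no note when the top match is already plastic
    rest = identities[1:]
    # Look for the first plastic match among the subsequent matches.
    for position, identity in enumerate(rest, start=2):
        if identity in plastic_names:
            return f"plastic match at position {position}"
    # No plastic match: separate the all-empty-well case from the rest.
    if rest and all(identity == "empty well" for identity in rest):
        return "all subsequent matches are empty wells"
    return "all subsequent matches are nonpolymer"


plastics = {"polyethylene", "polystyrene"}  # hypothetical labels
print(note_for_sample(["cellulose", "empty well", "polyethylene"], plastics))
# plastic match at position 3
```

In the documented function the note is appended to each row of nested_list, so each inner list ends up one item longer; the sketch only shows how the note itself might be chosen.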
- openspi.utils.save_df_to_excel(excel_path, df, sheetname)
Saves a Pandas dataframe to an Excel file. If the Excel file does not yet exist, it will create it and save the sheet.
- Parameters:
excel_path (str) – The full path to an .xlsx file.
df (df) – The dataframe to be saved as an Excel sheet.
sheetname (str) – The name of the Excel sheet.
- Return type:
None.
- openspi.utils.check_excel_sheet(excel_path, sheetname)
Checks if an Excel spreadsheet exists within the (already existing) Excel workbook and creates one if not. Note that this differs from save_df_to_excel and is used for directly editing cells in an Excel spreadsheet.
- Parameters:
excel_path (str) – The full path to an already existing .xlsx file.
sheetname (str) – The name of the Excel sheet.
- Returns:
ws – An Excel sheet with the desired sheetname.
- Return type:
openpyxl worksheet object
- openspi.utils.empty_wells_count(df)
Counts how many times ‘empty well’ appears in a given dataframe.
- Parameters:
df (df) – A Pandas dataframe.
- Returns:
count – The number of times 'empty well' appears in the 'spectrum_identity' column.
- Return type:
int
- openspi.utils.matches_checked_sheet(excel_path, nrel=False, n=5)
Adds a ‘Notes’ sheet with the number of nonpolymer matches and empty wells.
- Parameters:
excel_path (str) – The full path to an .xlsx file.
nrel (bool) – If True, the function will also check polymer count of the first well individually.
n (int) – The number of top matches for each file. Equal to top_n in openspi_main. Default is 5.
- Return type:
None.
- openspi.utils.count_matches(df)
- openspi.utils.list_to_df_to_sheet(df_lst, columns_list, excel_path, sheet_name)
Converts a nested list into a Pandas dataframe, which is then saved to an Excel worksheet.
- Parameters:
excel_path (str) – The full path to an .xlsx file.
df_lst (list) – The nested list.
columns_list (list) – The list of column names.
sheet_name (str) – The name of the Excel sheet.
- Return type:
None.