openspi.utils
=============

.. py:module:: openspi.utils


Functions
---------

.. autoapisummary::

   openspi.utils.count_files
   openspi.utils.reformat_path
   openspi.utils.identical_list_items
   openspi.utils.plastic_matches_checked
   openspi.utils.nonpolymer_matches_checked
   openspi.utils.subsequent_matches_checked
   openspi.utils.save_df_to_excel
   openspi.utils.check_excel_sheet
   openspi.utils.empty_wells_count
   openspi.utils.matches_checked_sheet
   openspi.utils.count_matches
   openspi.utils.list_to_df_to_sheet


Module Contents
---------------

.. py:function:: count_files(folder_path)

   Counts the number of files in a given directory.

   :param folder_path: The path to the directory.

   :returns: The number of files in the directory, or -1 if the directory does not exist.


.. py:function:: reformat_path(path)

   Reformats a raw string file path (``path\to\folder``) to contain double
   backslashes (``path\\to\\folder``), as required by the R programming
   language. Paths that use forward slashes (``/``) are left unchanged.

   :param path: The path to be reformatted.
   :type path: str

   :returns: **reformatted_path** -- All ``\`` are replaced with ``\\``.
   :rtype: str


.. py:function:: identical_list_items(lst)

   Checks if all the values in a given list are the same.

   :param lst: A list of values.
   :type lst: list

   :returns: ``True`` if all the values are the same, ``False`` if not.
   :rtype: bool


.. py:function:: plastic_matches_checked(row, index)

   Appends a note to a given list which marks the index of the subsequent
   plastic match.

   :param row: A list containing data for one library match for one file.
   :type row: list
   :param index: The index of the subsequent plastic match (see
                 ``subsequent_matches_checked``).
   :type index: int

   :returns: **row** -- An updated version of the original ``row``, now
             containing one additional item.
   :rtype: list


.. py:function:: nonpolymer_matches_checked(row, nested_list)

   Checks a given dataframe (in the form of a nested list) to determine
   whether all matches are empty wells.

   :param row: A list containing data for one library match for one file.
   :type row: list
   :param nested_list: A list of lists, where each list in the outer list
                       represents a row.
   :type nested_list: list

   :returns: **row** -- An updated version of the original ``row``, now
             containing one additional item.
   :rtype: list


.. py:function:: subsequent_matches_checked(df, nested_list)

   Checks whether the first match (highest r value) for each sample is
   plastic. If it is not, the subsequent matches are checked for plastic
   matches, and the place (1st, 2nd, etc.) of any plastic match is noted.
   If there are no plastic matches, the function checks whether all
   subsequent matches are for an empty well and makes a note of it.
   Otherwise, it notes that all subsequent matches are 'nonpolymer'.

   :param df: A Pandas dataframe.
   :type df: pandas.DataFrame
   :param nested_list: A list of lists, where each list in the outer list
                       represents a row.
   :type nested_list: list

   :returns: **nested_list** -- An updated version of the original
             ``nested_list``. Each inner list is now one item longer.
   :rtype: list


.. py:function:: save_df_to_excel(excel_path, df, sheetname)

   Saves a Pandas dataframe to an Excel file. If the Excel file does not
   yet exist, it is created and the sheet is saved to it.

   :param excel_path: The full path to an .xlsx file.
   :type excel_path: str
   :param df: The dataframe to be saved as an Excel sheet.
   :type df: pandas.DataFrame
   :param sheetname: The name of the Excel sheet.
   :type sheetname: str

   :rtype: None


.. py:function:: check_excel_sheet(excel_path, sheetname)

   Checks whether a sheet named ``sheetname`` exists within an already
   existing Excel workbook and creates it if not. Note that this differs
   from ``save_df_to_excel`` and is used for directly editing cells in an
   Excel spreadsheet.

   :param excel_path: The full path to an already existing .xlsx file.
   :type excel_path: str
   :param sheetname: The name of the Excel sheet.
   :type sheetname: str

   :returns: **ws** -- An Excel sheet with the desired ``sheetname``.
   :rtype: openpyxl worksheet object


.. py:function:: empty_wells_count(df)

   Counts how many times 'empty well' appears in a given dataframe.

   :param df: A Pandas dataframe.
   :type df: pandas.DataFrame

   :returns: **count** -- The number of times ``'empty well'`` appears in
             the ``'spectrum_identity'`` column.
   :rtype: int


.. py:function:: matches_checked_sheet(excel_path, nrel=False, n=5)

   Adds a 'Notes' sheet with the number of nonpolymer matches and empty
   wells.

   :param excel_path: The full path to an .xlsx file.
   :type excel_path: str
   :param nrel: If ``True``, the function will also check the polymer count
                of the first well individually.
   :type nrel: bool
   :param n: The number of top matches for each file. Equal to ``top_n`` in
             ``openspi_main``. Default is 5.
   :type n: int

   :rtype: None


.. py:function:: count_matches(df)


.. py:function:: list_to_df_to_sheet(df_lst, columns_list, excel_path, sheet_name)

   Converts a nested list into a Pandas dataframe, which is then saved to
   an Excel worksheet.

   :param excel_path: The full path to an .xlsx file.
   :type excel_path: str
   :param df_lst: The nested list.
   :type df_lst: list
   :param columns_list: The list of column names.
   :type columns_list: list
   :param sheet_name: The name of the Excel sheet.
   :type sheet_name: str

   :rtype: None
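
The documented behaviour of the three simplest helpers (``count_files``,
``reformat_path`` and ``identical_list_items``) can be illustrated with a
minimal, standard-library-only sketch. These are hypothetical
re-implementations written from the descriptions above, not the package's
actual source. In particular, whether ``count_files`` skips subdirectories
and how ``identical_list_items`` treats an empty list are assumptions.

.. code-block:: python

   import os
   import tempfile


   def count_files(folder_path):
       """Return the number of files in folder_path, or -1 if it is missing."""
       if not os.path.isdir(folder_path):
           return -1
       # Assumption: only regular files are counted, not subdirectories.
       return sum(
           1
           for name in os.listdir(folder_path)
           if os.path.isfile(os.path.join(folder_path, name))
       )


   def reformat_path(path):
       """Double every backslash; forward-slash paths come back unchanged."""
       return path.replace("\\", "\\\\")


   def identical_list_items(lst):
       """Return True if every value in lst equals the first value."""
       # Assumption: an empty list counts as "all the same".
       return all(item == lst[0] for item in lst)


   # Usage: create two files in a temporary folder and count them.
   with tempfile.TemporaryDirectory() as tmp:
       for name in ("a.csv", "b.csv"):
           open(os.path.join(tmp, name), "w").close()
       n_files = count_files(tmp)
       n_missing = count_files(os.path.join(tmp, "missing"))

   r_path = reformat_path(r"C:\data\spectra")
   same = identical_list_items(["PET", "PET", "PET"])

Doubling the backslashes matters because R, like Python, treats a single
backslash inside a quoted string as the start of an escape sequence.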