openspi.utils
=============

.. py:module:: openspi.utils


Functions
---------

.. autoapisummary::

   openspi.utils.count_files
   openspi.utils.reformat_path
   openspi.utils.identical_list_items
   openspi.utils.plastic_matches_checked
   openspi.utils.nonpolymer_matches_checked
   openspi.utils.subsequent_matches_checked
   openspi.utils.save_df_to_excel
   openspi.utils.check_excel_sheet
   openspi.utils.empty_wells_count
   openspi.utils.matches_checked_sheet
   openspi.utils.count_matches
   openspi.utils.list_to_df_to_sheet


Module Contents
---------------

.. py:function:: count_files(folder_path)

   Counts the number of files in a given directory.

   :param folder_path: The path to the directory.

   :returns: The number of files in the directory, or -1 if the directory does not exist.
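
   The documented behavior can be sketched as follows. This is a minimal illustration, not the module's actual implementation: ``count_files_sketch`` is a hypothetical name, and skipping subdirectories when counting is an assumption.

   .. code-block:: python

      import os


      def count_files_sketch(folder_path):
          """Hypothetical stand-in for count_files (not the real implementation)."""
          if not os.path.isdir(folder_path):
              return -1  # documented sentinel for a directory that does not exist
          # Assumption: only regular files are counted, subdirectories are skipped.
          return sum(
              1
              for name in os.listdir(folder_path)
              if os.path.isfile(os.path.join(folder_path, name))
          )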


.. py:function:: reformat_path(path)

   Reformats a raw string file path (path\\to\\folder) so that every backslash
   is doubled (path\\\\to\\\\folder), as required by the R programming language.
   Paths that use forward slashes (/) are left unchanged.

   :param path: The path to be reformatted.
   :type path: str

   :returns: **reformatted_path** -- The path with every ``\`` replaced by ``\\``.
   :rtype: str
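
   A minimal sketch of the replacement described above, assuming a plain ``str.replace`` is sufficient. ``reformat_path_sketch`` is a hypothetical name, not the function shipped in this module.

   .. code-block:: python

      def reformat_path_sketch(path):
          """Hypothetical stand-in for reformat_path (not the real implementation)."""
          # In Python source, "\\" is one backslash and "\\\\" is two.
          return path.replace("\\", "\\\\")


      print(reformat_path_sketch(r"path\to\folder"))  # path\\to\\folder
      print(reformat_path_sketch("path/to/folder"))   # path/to/folder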


.. py:function:: identical_list_items(lst)

   Checks if all the values in a given list are the same.

   :param lst: A list of values.
   :type lst: list

   :returns: ``True`` if all the values are the same, ``False`` otherwise.
   :rtype: bool
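
   One way such a check could be written is shown below. ``identical_list_items_sketch`` is a hypothetical name, and returning ``True`` for an empty list is an assumption that the documentation above does not confirm.

   .. code-block:: python

      def identical_list_items_sketch(lst):
          """Hypothetical stand-in for identical_list_items (not the real implementation)."""
          # Compare every item with the first one; an empty list yields True (assumption).
          return all(item == lst[0] for item in lst)


      print(identical_list_items_sketch([1, 1, 1]))  # True
      print(identical_list_items_sketch([1, 2, 1]))  # False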


.. py:function:: plastic_matches_checked(row, index)

   Appends a note to a given list which marks the index of the subsequent
   plastic match.

   :param row: A list containing data for one library match for one file.
   :type row: list
   :param index: The index of the subsequent plastic match.
                 (see ``subsequent_matches_checked``)
   :type index: int

   :returns: **row** -- An updated version of the original ``row`` var. It now contains an
             additional item.
   :rtype: list


.. py:function:: nonpolymer_matches_checked(row, nested_list)

   Checks a given dataframe (in the form of a nested list) to determine if all
   matches are empty wells or not.

   :param row: A list containing data for one library match for one file.
   :type row: list
   :param nested_list: A list of lists, where each list in the outer list represents a row.
   :type nested_list: list

   :returns: **row** -- An updated version of the original ``row`` var. It now contains an
             additional item.
   :rtype: list


.. py:function:: subsequent_matches_checked(df, nested_list)

   Checks whether the first match (highest r value) for each sample is plastic.
   If it is not, the subsequent matches are checked for a plastic match, and the
   place of that match (1st, 2nd, etc.) is noted. If there are no plastic
   matches, the function checks whether all subsequent matches are empty wells
   and notes this. Otherwise, it notes that all subsequent matches are
   'nonpolymer'.

   :param df: A Pandas dataframe.
   :type df: pandas.DataFrame
   :param nested_list: A list of lists, where each list in the outer list represents a row.
   :type nested_list: list

   :returns: **nested_list** -- An updated version of the original ``nested_list``. Each inner list is
             now one item longer.
   :rtype: list


.. py:function:: save_df_to_excel(excel_path, df, sheetname)

   Saves a Pandas dataframe to an Excel file. If the Excel file does not yet
   exist, it will create it and save the sheet.

   :param excel_path: The full path to an .xlsx file.
   :type excel_path: str
   :param df: The dataframe to be saved as an Excel sheet.
   :type df: pandas.DataFrame
   :param sheetname: The name of the Excel sheet.
   :type sheetname: str

   :rtype: None.
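
   The create-or-append behavior described above could be sketched with ``pandas.ExcelWriter`` as follows. This is an illustration under stated assumptions, not the module's code: ``save_df_to_excel_sketch`` is a hypothetical name, replacing a sheet that already has the same name and omitting the index are assumptions, and ``if_sheet_exists`` requires pandas 1.3 or later with ``openpyxl`` installed.

   .. code-block:: python

      import os

      import pandas as pd


      def save_df_to_excel_sketch(excel_path, df, sheetname):
          """Hypothetical stand-in for save_df_to_excel (not the real implementation)."""
          if os.path.exists(excel_path):
              # Append to the existing workbook; replacing a same-named sheet is an assumption.
              writer_kwargs = {"mode": "a", "if_sheet_exists": "replace"}
          else:
              # No workbook yet: create it.
              writer_kwargs = {"mode": "w"}
          with pd.ExcelWriter(excel_path, engine="openpyxl", **writer_kwargs) as writer:
              # index=False is an assumption; the real function may keep the index.
              df.to_excel(writer, sheet_name=sheetname, index=False)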


.. py:function:: check_excel_sheet(excel_path, sheetname)

   Checks whether a worksheet with the given name exists within the (already
   existing) Excel workbook and creates it if not. Note that this differs from
   ``save_df_to_excel`` and is used for directly editing cells in an Excel
   worksheet.

   :param excel_path: The full path to an already existing .xlsx file.
   :type excel_path: str
   :param sheetname: The name of the Excel sheet.
   :type sheetname: str

   :returns: **ws** -- An Excel sheet with the desired ``sheetname``.
   :rtype: openpyxl worksheet object
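
   A get-or-create lookup of this kind might look like the sketch below, which uses ``openpyxl`` directly. ``check_excel_sheet_sketch`` is a hypothetical name, and the sketch does not save the workbook; whether the real function does so is not stated here.

   .. code-block:: python

      from openpyxl import load_workbook


      def check_excel_sheet_sketch(excel_path, sheetname):
          """Hypothetical stand-in for check_excel_sheet (not the real implementation)."""
          wb = load_workbook(excel_path)
          if sheetname in wb.sheetnames:
              ws = wb[sheetname]
          else:
              ws = wb.create_sheet(sheetname)
          return ws

   In this sketch the workbook stays reachable through ``ws.parent``, so edits to cells can be persisted with ``ws.parent.save(excel_path)``.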


.. py:function:: empty_wells_count(df)

   Counts how many times 'empty well' appears in a given dataframe.

   :param df: A Pandas dataframe.
   :type df: pandas.DataFrame

   :returns: **count** -- The number of times ``'empty well'`` appears in the
             ``'spectrum_identity'`` column.
   :rtype: int
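
   As an illustration, the count could be computed with a boolean mask over the ``'spectrum_identity'`` column. ``empty_wells_count_sketch`` is a hypothetical name, and exact, case-sensitive string matching is an assumption.

   .. code-block:: python

      import pandas as pd


      def empty_wells_count_sketch(df):
          """Hypothetical stand-in for empty_wells_count (not the real implementation)."""
          # Assumption: a cell counts only if it equals 'empty well' exactly.
          return int((df["spectrum_identity"] == "empty well").sum())


      example = pd.DataFrame(
          {"spectrum_identity": ["empty well", "polyethylene", "empty well"]}
      )
      print(empty_wells_count_sketch(example))  # 2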


.. py:function:: matches_checked_sheet(excel_path, nrel=False, n=5)

   Adds a 'Notes' sheet with the number of nonpolymer matches and empty wells.

   :param excel_path: The full path to an .xlsx file.
   :type excel_path: str
   :param nrel: If ``True``, the function will also check the polymer count of the first well
                individually.
   :type nrel: bool
   :param n: The number of top matches for each file. Equal to ``top_n`` in
             ``openspi_main``. Default is 5.
   :type n: int

   :rtype: None.


.. py:function:: count_matches(df)

.. py:function:: list_to_df_to_sheet(df_lst, columns_list, excel_path, sheet_name)

   Converts a nested list into a Pandas dataframe, which is then saved to an
   Excel worksheet.

   :param excel_path: The full path to an .xlsx file.
   :type excel_path: str
   :param df_lst: The nested list.
   :type df_lst: list
   :param columns_list: The list of column names.
   :type columns_list: list
   :param sheet_name: The name of the Excel sheet.
   :type sheet_name: str

   :rtype: None.