Preparing
Functions to be used for preparing the experimental data for batch processing.
check_column_headers(data_dir, exception_headers=None)
Check that all files in a directory have the same column headers and that column headers don't contain spaces. A ValueError will be raised if the column headers don't match or if a column header contains a space.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data_dir` | `str` | Path to the directory containing the files to be checked. | required |
| `exception_headers` | `List[str]` | List of column headers that are allowed to be different between files. | `None` |
Source code in paramaterial\preparing.py
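The check described above can be sketched with the standard library alone. This is an illustrative sketch, not the paramaterial implementation: `check_headers_sketch` is a hypothetical name, the files are assumed to be CSV files with the headers on the first line, and applying the space check to exception headers as well is an assumption.

```python
import csv
import tempfile
from pathlib import Path


def check_headers_sketch(data_dir, exception_headers=None):
    """Raise ValueError if CSV headers differ between files or contain spaces."""
    allowed = set(exception_headers or [])
    reference = None  # headers of the first file, minus the allowed exceptions
    for path in sorted(Path(data_dir).glob("*.csv")):
        with open(path, newline="") as f:
            headers = next(csv.reader(f), [])
        for header in headers:
            if " " in header:
                raise ValueError(f"{path.name}: header {header!r} contains a space")
        # Headers listed as exceptions are ignored when comparing files.
        comparable = [h for h in headers if h not in allowed]
        if reference is None:
            reference = comparable
        elif comparable != reference:
            raise ValueError(f"{path.name}: headers {comparable} differ from {reference}")


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.csv").write_text("Strain,Stress_MPa\n0.0,1.0\n")
    Path(tmp, "b.csv").write_text("Strain,Stress_MPa,Time\n0.1,2.0,0.5\n")
    try:
        check_headers_sketch(tmp)
        mismatch_error = None
    except ValueError as err:
        mismatch_error = str(err)  # b.csv has an extra "Time" column
    # Declaring "Time" as an allowed exception makes the same directory pass.
    check_headers_sketch(tmp, exception_headers=["Time"])
```

In this sketch the first file read sets the reference headers, so the error message names the later file that deviates from it.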
check_for_duplicate_files(data_dir)
Check that there are no duplicate files in a directory by hashing the contents of the files. A ValueError will be raised if there are duplicate files.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data_dir` | `str` | Path to the directory containing the files to be checked. | required |
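Duplicate detection by content hashing can be sketched as follows. The function names are hypothetical and the choice of SHA-256 is an assumption; the documentation above does not say which hash the library uses.

```python
import hashlib
import tempfile
from pathlib import Path


def find_duplicate_files_sketch(data_dir):
    """Return groups of file names whose contents share a SHA-256 digest."""
    names_by_digest = {}
    for path in sorted(Path(data_dir).iterdir()):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            names_by_digest.setdefault(digest, []).append(path.name)
    # Only digests shared by more than one file indicate duplicates.
    return [names for names in names_by_digest.values() if len(names) > 1]


def check_no_duplicates_sketch(data_dir):
    """Raise ValueError if any two files in data_dir have identical contents."""
    duplicates = find_duplicate_files_sketch(data_dir)
    if duplicates:
        raise ValueError(f"Duplicate files found: {duplicates}")


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "test_1.csv").write_text("Strain,Stress\n0.0,1.0\n")
    Path(tmp, "test_2.csv").write_text("Strain,Stress\n0.0,1.0\n")  # same content
    Path(tmp, "test_3.csv").write_text("Strain,Stress\n0.1,2.0\n")
    duplicate_groups = find_duplicate_files_sketch(tmp)
```

Hashing the bytes rather than comparing file names catches copies that were saved under different names.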
check_formatting(ds)
Check that the formatting of the data is correct. This includes checking that the column headers are the same in all files, that the column headers don't contain spaces, and that there are no duplicate files. A ValueError will be raised if any of these conditions are not met.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `ds` | `DataSet` | DataSet object containing the data to be checked. | required |
convert_gleeble_output_files_to_csv(directory_path)
Convert all files in a directory from Gleeble output format to csv format.
copy_data_and_rename_by_test_id(data_in, data_out, info_table, test_id_col='test_id')
Copy the files in a data directory to a new directory, renaming each file by its test_id from the info table. The info_table must have a column named 'old_filename' containing the original filenames and a column of test_ids (named by test_id_col, 'test_id' by default). The new filenames will be the test_ids with the extension '.csv'.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data_in` | `str` | Path to the directory containing the data to be copied. | required |
| `data_out` | `str` | Path to the directory where the data will be copied. | required |
| `info_table` | `pd.DataFrame` | DataFrame containing the metadata for the tests. | required |
| `test_id_col` | | Column in the info table containing the test_ids. | `'test_id'` |
Returns:
| Type | Description |
|---|---|
| `None` | |
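The copy-and-rename step can be sketched with the standard library. This is a hypothetical stand-in rather than the library's code: `copy_and_rename_sketch` takes the info table as a list of row dictionaries (the shape that `info_table.to_dict("records")` yields for a pandas DataFrame), and the file names and test IDs are invented for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def copy_and_rename_sketch(data_in, data_out, rows, test_id_col="test_id"):
    """Copy each file named in 'old_filename' to data_out as '<test_id>.csv'."""
    out_dir = Path(data_out)
    out_dir.mkdir(parents=True, exist_ok=True)
    for row in rows:
        source = Path(data_in) / row["old_filename"]
        target = out_dir / f"{row[test_id_col]}.csv"
        shutil.copy(source, target)  # originals in data_in are left untouched


with tempfile.TemporaryDirectory() as tmp:
    data_in, data_out = Path(tmp, "raw"), Path(tmp, "renamed")
    data_in.mkdir()
    Path(data_in, "run_A.csv").write_text("Strain,Stress\n0.0,1.0\n")
    Path(data_in, "run_B.csv").write_text("Strain,Stress\n0.1,2.0\n")
    # Illustrative metadata rows standing in for the info table.
    rows = [
        {"old_filename": "run_A.csv", "test_id": "T01"},
        {"old_filename": "run_B.csv", "test_id": "T02"},
    ]
    copy_and_rename_sketch(data_in, data_out, rows)
    renamed_files = sorted(p.name for p in data_out.iterdir())
```

Copying instead of renaming in place keeps the raw data directory intact, so the step can be rerun if the info table changes.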
experimental_matrix(info_table, index, columns, as_heatmap=False, title=None, xlabel=None, ylabel=None, tick_params=None, **kwargs)
Make an experimental matrix showing the distribution of tests across metadata categories.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `info_table` | `pd.DataFrame` | DataFrame containing the metadata for the tests. | required |
| `index` | `Union[str, List[str]]` | Column(s) of the info_table to use as the index of the matrix. | required |
| `columns` | `Union[str, List[str]]` | Column(s) of the info_table to use as the columns of the matrix. | required |
| `as_heatmap` | `bool` | If True, return a heatmap of the matrix. If False, return the matrix as a DataFrame. | `False` |
| `title` | `str` | Title of the heatmap. | `None` |
| `xlabel` | `str` | Label for the x-axis of the heatmap. | `None` |
| `ylabel` | `str` | Label for the y-axis of the heatmap. | `None` |
| `tick_params` | `Dict` | Parameters to pass to the ax.tick_params method of matplotlib. | `None` |
| `**kwargs` | | Additional keyword arguments to pass to the heatmap function. | `{}` |
Returns:
| Type | Description |
|---|---|
| `Union[pd.DataFrame, plt.Axes]` | If as_heatmap is False, returns a DataFrame of the experimental matrix. |
| `Union[pd.DataFrame, plt.Axes]` | If as_heatmap is True, returns a heatmap of the experimental matrix. |
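To illustrate what such a matrix contains, the sketch below counts tests for each combination of two metadata columns. It is a standard-library illustration, not the library's implementation: `count_matrix_sketch`, the column names `temperature` and `rate`, and the rows are all hypothetical, and it handles only a single column for each of `index` and `columns`.

```python
from collections import Counter

# Hypothetical metadata rows; column names and values are illustrative only.
rows = [
    {"test_id": "T01", "temperature": 300, "rate": 1},
    {"test_id": "T02", "temperature": 300, "rate": 1},
    {"test_id": "T03", "temperature": 300, "rate": 10},
    {"test_id": "T04", "temperature": 400, "rate": 10},
]


def count_matrix_sketch(rows, index, columns):
    """Count how many rows fall into each (index value, column value) cell."""
    counts = Counter((row[index], row[columns]) for row in rows)
    index_values = sorted({row[index] for row in rows})
    column_values = sorted({row[columns] for row in rows})
    # Counter returns 0 for combinations that never occur, exposing gaps.
    return {i: {c: counts[(i, c)] for c in column_values} for i in index_values}


matrix = count_matrix_sketch(rows, index="temperature", columns="rate")
for index_value, cells in matrix.items():
    print(index_value, cells)
```

Cells with a count of zero show combinations of conditions that have not been tested. With pandas, `pd.crosstab` produces a comparable table from two DataFrame columns; whether paramaterial builds its matrix that way is not stated here.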