Aggregating
This module provides functions for making representative curves from test data and for computing summary statistics of the associated metadata.
Functions
- _generate_filter_permutations(info_table, group_by) - Generates filter permutations for grouping data.
- make_representative_data(ds, info_path, data_dir, repres_col, group_by_keys, interp_by, interp_res, interp_range, group_info_cols) - Creates representative curves from a DataSet and saves them to a directory.
- make_representative_info(ds, group_by_keys, group_info_cols) - Creates a table of representative information for each group in a DataSet.
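The idea of "filter permutations" can be illustrated with a small sketch. This is not the library's implementation of `_generate_filter_permutations`. It assumes that a filter is a `{column: value}` dictionary and that one filter is produced for every combination of unique values in the grouping columns. The names `filter_permutations_sketch` and `select_rows` and the sample table are hypothetical.

```python
from itertools import product

import pandas as pd


def filter_permutations_sketch(info_table, group_by):
    """Hypothetical helper: one {column: value} filter per combination of unique values."""
    unique_values = [info_table[col].unique().tolist() for col in group_by]
    return [dict(zip(group_by, combo)) for combo in product(*unique_values)]


def select_rows(info_table, filt):
    """Hypothetical helper: keep the rows that match every column/value pair in the filter."""
    mask = pd.Series(True, index=info_table.index)
    for col, value in filt.items():
        mask &= info_table[col] == value
    return info_table[mask]


# Made-up info table for illustration only.
info_table = pd.DataFrame({
    'test_id': ['T1', 'T2', 'T3', 'T4'],
    'material': ['AA6061', 'AA6061', 'AA5083', 'AA5083'],
    'temperature': [20, 300, 20, 20],
})

filters = filter_permutations_sketch(info_table, ['material', 'temperature'])
group_sizes = {tuple(f.values()): len(select_rows(info_table, f)) for f in filters}
```

Under this assumption some combinations may match no tests at all (here, AA5083 at 300), so a grouping routine built this way would need to skip empty groups.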
make_representative_data(ds, info_path, data_dir, repres_col, group_by_keys, interp_by, interp_res=200, interp_range='outer', group_info_cols=None)
  Make representative curves of the DataSet and save them to a directory.
This function takes a DataSet, groups it by specific keys, and creates representative curves. The curves are then saved to a specified directory. It is useful for generating aggregated data curves that represent groups of similar tests.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| ds | DataSet | The DataSet to make representative curves from. | required | 
| info_path | str | The path to the info file where the representative information will be saved. | required | 
| data_dir | str | The directory to save the representative curves to. | required | 
| group_by_keys | List[str] | The info columns to group the tests by. | required | 
| repres_col | str | The data column to aggregate for the y-axis of the representative curves. | required | 
| interp_by | str | The data column to interpolate for the x-axis of the representative curves. | required | 
| interp_res | int | The resolution of the interpolation. | 200 | 
| interp_range | Union[str, Tuple[float, float]] | Can be either "outer", "inner", or a tuple of floats, defining the domain on the x-axis for interpolation. | 'outer' | 
| group_info_cols | Optional[List[str]] | The info categories to include in the aggregated info_table. | None | 
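The table does not define "outer" and "inner" precisely. The sketch below shows one plausible reading, stated as an assumption: "outer" spans from the smallest to the largest x value found in any curve of the group, "inner" covers only the range shared by all curves, and a tuple is used as given. The helper `resolve_interp_domain` is hypothetical and is not part of paramaterial.

```python
import numpy as np


def resolve_interp_domain(x_arrays, interp_range='outer'):
    """Hypothetical helper: return (lower, upper) bounds of the interpolation domain."""
    starts = [float(np.min(x)) for x in x_arrays]
    ends = [float(np.max(x)) for x in x_arrays]
    if isinstance(interp_range, str):
        if interp_range == 'outer':
            # Assumed meaning: widest span covered by at least one curve.
            return min(starts), max(ends)
        if interp_range == 'inner':
            # Assumed meaning: span covered by every curve.
            return max(starts), min(ends)
        raise ValueError(f"Unknown interp_range: {interp_range!r}")
    lower, upper = interp_range
    return float(lower), float(upper)


# Two made-up strain arrays with different extents.
strains = [np.array([0.0, 0.1, 0.2]), np.array([0.05, 0.15, 0.3])]

outer = resolve_interp_domain(strains, 'outer')
inner = resolve_interp_domain(strains, 'inner')
custom = resolve_interp_domain(strains, (0.0, 0.1))

# interp_res would then set the number of points on the common x-axis grid.
grid = np.linspace(*outer, 200)
```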
Returns:
| Type | Description | 
|---|---|
| None | Nothing is returned; the representative curves and the info table are written to disk. | 
Examples:
Imagine you have performed a series of stress tests on different materials at various temperatures. You have collected all the data in a DataSet and want to create representative stress-strain curves for each combination of material and temperature. Here's how you can use this function:
>>> import paramaterial as pam
>>> ds = pam.DataSet('info/test_info.csv', 'data/tests')  # Load your dataset
>>> pam.make_representative_data(ds, 'info/representative_info.xlsx', 'data/representative_curves',
...                              repres_col='Stress_MPa', group_by_keys=['material', 'temperature'],
...                              interp_by='Strain')
This will create a representative curve for each material and temperature group, saving the curves to the specified directory and the group information to an Excel file.
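To make the aggregation step concrete, here is a minimal sketch of how the curves in one group could be combined. It is not the library's implementation. It assumes each curve is a DataFrame, that the curves are linearly interpolated onto a shared x-axis grid with `np.interp`, and that the representative curve is the pointwise mean with a standard deviation alongside it. The function name and the output column names `mean_Stress_MPa` and `std_Stress_MPa` are hypothetical.

```python
import numpy as np
import pandas as pd


def representative_curve_sketch(curves, x_col, y_col, grid):
    """Hypothetical helper: pointwise mean and std of curves resampled onto a common grid."""
    # np.interp expects increasing x values and holds the end values
    # constant for grid points outside a curve's own range.
    stacked = np.vstack([
        np.interp(grid, curve[x_col].to_numpy(), curve[y_col].to_numpy())
        for curve in curves
    ])
    return pd.DataFrame({
        x_col: grid,
        f'mean_{y_col}': stacked.mean(axis=0),
        f'std_{y_col}': stacked.std(axis=0),  # population std (ddof=0)
    })


# Two made-up tests from the same material/temperature group.
curve_a = pd.DataFrame({'Strain': [0.0, 0.1, 0.2], 'Stress_MPa': [0.0, 100.0, 200.0]})
curve_b = pd.DataFrame({'Strain': [0.0, 0.1, 0.2], 'Stress_MPa': [0.0, 120.0, 240.0]})

grid = np.linspace(0.0, 0.2, 5)
rep = representative_curve_sketch([curve_a, curve_b], 'Strain', 'Stress_MPa', grid)
```

In this sketch a larger grid plays the role of a higher `interp_res`, and the grid limits play the role of `interp_range`.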
Source code in paramaterial\aggregating.py
make_representative_info(ds, group_by_keys, group_info_cols=None)
  Make a table of representative info for each group in a DataSet.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| ds | DataSet | DataSet to make representative info from. | required | 
| group_by_keys | List[str] | Columns to group by and make representative info for. | required | 
| group_info_cols | Optional[List[str]] | Columns to include in representative info table. | None | 
Returns:
| Type | Description | 
|---|---|
| pd.DataFrame | A pandas DataFrame containing the representative information table. | 
Examples:
To create a summary table that includes specific mechanical properties like Elastic Modulus (E), Proof Stress (PS), Ultimate Tensile Strength (UTS), for each temperature and material type:
>>> import paramaterial as pam
>>> ds = pam.DataSet('info/test_info.csv', 'data/tests')  # Load your dataset
>>> table = pam.make_representative_info(ds, group_by_keys=['temperature', 'material'], group_info_cols=['E', 'PS', 'UTS'])
>>> print(table.head())
The result will be a DataFrame containing representative information for each group, including the mean, standard deviation, maximum, minimum, and 1st and 3rd quartiles of the specified columns.
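The statistics listed above can be reproduced on a plain pandas table, which may help when checking the output by hand. This sketch uses `DataFrame.groupby` directly rather than paramaterial. The sample values and the output column names (`mean`, `std`, `max`, `min`, `q1`, `q3`) are assumptions and need not match the columns that `make_representative_info` produces.

```python
import pandas as pd


def q1(series):
    """First quartile (25th percentile)."""
    return series.quantile(0.25)


def q3(series):
    """Third quartile (75th percentile)."""
    return series.quantile(0.75)


# Made-up info table: one row per test.
info_table = pd.DataFrame({
    'material': ['A', 'A', 'B', 'B'],
    'temperature': [20, 20, 20, 20],
    'UTS': [300.0, 320.0, 410.0, 430.0],
})

# One row per (temperature, material) group, one column per statistic.
summary = (
    info_table
    .groupby(['temperature', 'material'])['UTS']
    .agg(['mean', 'std', 'max', 'min', q1, q3])
    .reset_index()
)
print(summary)
```

Note that pandas computes `std` as the sample standard deviation (`ddof=1`) by default.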