tests.system.providers.amazon.aws.example_sagemaker
Module Contents
Functions
generate_data() | Generates a very simple CSV dataset with headers. |
Attributes
- tests.system.providers.amazon.aws.example_sagemaker.PREPROCESS_SCRIPT_TEMPLATE = Multiline-String
    import boto3
    import numpy as np
    import pandas as pd

    def main():
        # Load the dataset from {input_path}/input.csv, split it into train/test
        # subsets, and write them to {output_path}/ for the Processing Operator.

        data = pd.read_csv('{input_path}/input.csv')

        # Split into test and train data
        data_train, data_test = np.split(
            data.sample(frac=1, random_state=np.random.RandomState()), [int(0.7 * len(data))]
        )

        # Remove the "answers" from the test set
        data_test.drop(['class'], axis=1, inplace=True)

        # Write the splits to disk
        data_train.to_csv('{output_path}/train.csv', index=False, header=False)
        data_test.to_csv('{output_path}/test.csv', index=False, header=False)

        print('Preprocessing Done.')

    if __name__ == "__main__":
        main()
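The template carries two placeholders, {input_path} and {output_path}, and no other curly braces, so it can presumably be filled in with str.format. The following is a minimal sketch under that assumption: TEMPLATE is a shortened stand-in for the real value, and render_script and the two directory paths are hypothetical names chosen for illustration, not part of this module.

```python
# Shortened stand-in for PREPROCESS_SCRIPT_TEMPLATE; only the placeholders matter here.
TEMPLATE = """\
data = pd.read_csv('{input_path}/input.csv')
data_train.to_csv('{output_path}/train.csv', index=False, header=False)
data_test.to_csv('{output_path}/test.csv', index=False, header=False)
"""


def render_script(template: str, input_path: str, output_path: str) -> str:
    """Substitute both placeholders and return the script source (hypothetical helper)."""
    return template.format(input_path=input_path, output_path=output_path)


# Hypothetical paths, used only to show the substitution.
script = render_script(TEMPLATE, "/opt/ml/processing/input", "/opt/ml/processing/output")
print(script)
```

The rendered string is ordinary Python source, so it can be written to a file and handed to whatever runs the preprocessing step.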
- tests.system.providers.amazon.aws.example_sagemaker.generate_data()
Generates a very simple CSV dataset with headers.
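The implementation of generate_data() is not shown on this page. As a rough illustration of what "a very simple CSV dataset with headers" could look like, the sketch below builds one with the standard library. The function name make_sample_csv, the x and y columns, and the row values are all assumptions; only the class column is grounded in the source, since the preprocessing script drops it from the test split.

```python
import csv
import io


def make_sample_csv(rows: int = 10) -> str:
    """Return a small CSV string with a header row (hypothetical stand-in)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header row; 'class' is the label column that the preprocessing script removes.
    writer.writerow(["class", "x", "y"])
    for i in range(rows):
        # Arbitrary illustrative values: a binary label and two numeric features.
        writer.writerow([i % 2, i, rows - i])
    return buffer.getvalue()


print(make_sample_csv(3))
```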