Clean Data
PreprocessResource.iClean(request, sub_analysis_id, **kwargs)
Prior mandatory steps: 1) Upload dataset 2) Create analysis 3) Create sub analysis
When a dataset is uploaded, Intuceo checks the data and runs a cleaning process. The process replaces each special character in the data with an underscore (_), a short name (e.g. usd), and another underscore. Currency symbols and comma separators are removed from numeric attribute values. A special character is anything other than an alphanumeric character or an underscore; underscores are retained.
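The numeric clean-up described above can be sketched as follows. This is an illustration only, not Intuceo's implementation: the function name `clean_numeric_value` is hypothetical, and it assumes a "currency symbol" means any character in the Unicode category `Sc`.

```python
import unicodedata


def clean_numeric_value(raw: str) -> float:
    """Strip currency symbols and comma separators, then parse as a float.

    Assumption: a currency symbol is any character whose Unicode general
    category is "Sc" (for example $, the euro sign, the pound sign).
    """
    kept = [
        ch for ch in raw
        if ch != "," and unicodedata.category(ch) != "Sc"
    ]
    return float("".join(kept).strip())
```

For example, `clean_numeric_value("$1,234.50")` parses the value once the dollar sign and the comma are dropped.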
Column name cleansing:
- Names are lowercased.
- Non-alphanumeric characters are replaced with the special identifiers listed below.
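A minimal sketch of the column name rules, under stated assumptions. Only the short name usd comes from this page; the `%` entry, the `FALLBACK` name, and the function name `clean_column_name` are hypothetical placeholders, and the sketch treats only ASCII letters and digits as alphanumeric.

```python
import re

# Only "usd" appears in the documentation. The "%" entry is a hypothetical
# placeholder for the identifier table, which is not reproduced here.
SHORT_NAMES = {"$": "usd", "%": "pct"}
FALLBACK = "sym"  # hypothetical short name for characters with no mapping


def clean_column_name(name: str) -> str:
    """Lowercase the name, then replace each special character with
    an underscore, a short name, and another underscore."""
    lowered = name.lower()

    def replace(match: re.Match) -> str:
        short = SHORT_NAMES.get(match.group(0), FALLBACK)
        return f"_{short}_"

    # Underscores are kept as they are, per the cleaning rules above.
    return re.sub(r"[^a-z0-9_]", replace, lowered)
```

Under these assumptions, a column named `Price$` would become `price_usd_`, while `Net_Total` would only be lowercased.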
Arguments
sub_analysis_id: Give sub analysis id
Possible errors
Error message: Invalid sub analysis id
GET Request Example
curl -u username:password {url_prefix}/iclean/{sub_analysis_id}/
Response Example
{
  "error": false,
  "error_msg": "",
  "result": {
    "missingdata_greater_than_ten_catg": [],
    "missingdata_greater_than_ten_num": [],
    "number_of_consi_categorical_attributes": 10,
    "number_of_consi_numeric_attributes": 7,
    "number_of_considered_attributes": 17,
    "number_of_records_removed": 0,
    "removed_attributes": [],
    "categorical_uni_attr": {},
    "date_text_id_attrs": [],
    "number_of_categorical_attrs": 0,
    "number_of_numeric_attrs": 0,
    "numeric_uni_attr": {},
    "orig_colNames_dict": {
      "age": "age",
      "balance": "balance",
      "campaign": "campaign",
      "contact": "contact",
      "day": "day",
      "default": "default",
      "duration": "duration",
      "education": "education",
      "housing": "housing",
      "job": "job",
      "loan": "loan",
      "marital": "marital",
      "month": "month",
      "pdays": "pdays",
      "poutcome": "poutcome",
      "previous": "previous",
      "y": "y"
    },
    "percent_missing_cell": 0.0,
    "target_type_vlaues": {
      "no": "4000",
      "yes": "521"
    },
    "total_attributes": 17,
    "total_categorical_attrs": 10,
    "total_missing_cell_count": 0,
    "total_numeric_attrs": 7,
    "total_records": 4521,
    "user_defined_missing_percentage_value": 10
  }
}
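The same call can be sketched in Python with only the standard library. The helper names `build_iclean_request` and `summarize_clean_result` are hypothetical. The sketch assumes that `curl -u` corresponds to HTTP Basic authentication, and that a failed call returns the same envelope with `error` set to true and the message in `error_msg`; this page does not show an error response.

```python
import base64
import urllib.request


def build_iclean_request(url_prefix, sub_analysis_id, username, password):
    """Build (but do not send) the GET request shown in the curl example."""
    url = f"{url_prefix.rstrip('/')}/iclean/{sub_analysis_id}/"
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


def summarize_clean_result(payload):
    """Pick a few fields out of a parsed response.

    Assumption: errors are reported through "error" and "error_msg".
    """
    if payload["error"]:
        raise RuntimeError(payload["error_msg"])
    result = payload["result"]
    return {
        "total_records": result["total_records"],
        "records_removed": result["number_of_records_removed"],
        "removed_attributes": result["removed_attributes"],
        "percent_missing_cell": result["percent_missing_cell"],
    }
```

To send the request, pass it to `urllib.request.urlopen`, decode the body with `json.load`, and hand the resulting dictionary to `summarize_clean_result`.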