Example Usage

To use cleanml in a project, import it and check the installed version

import cleanml
print(cleanml.__version__)
0.1.0

Import the required packages

from cleanml.cleanml import make_column_names, remove_spl_chars_in_columns
import pandas as pd

Read and view the data

data = pd.read_csv("../data/ask_a_manager.csv")
data = data[['Timestamp','How old are you?','What industry do you work in?','Job title']]
data.head()
   Timestamp           How old are you?  What industry do you work in?  Job title
0  4/27/2021 11:02:10  25-34             Education (Higher Education)   Research and Instruction Librarian
1  4/27/2021 11:02:22  25-34             Computing or Tech              Change & Internal Communications Manager
2  4/27/2021 11:02:38  25-34             Accounting, Banking & Finance  Marketing Specialist
3  4/27/2021 11:02:41  25-34             Nonprofits                     Program Manager
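
The read-then-select step above can also be done in one call with the usecols parameter of pd.read_csv. The sketch below is self-contained: it reads from an in-memory string rather than the real file, and the "Unused column" field is a made-up placeholder for whatever other columns the CSV contains.

```python
import io

import pandas as pd

# In-memory stand-in for the CSV file; "Unused column" is a hypothetical extra field.
csv_text = (
    "Timestamp,How old are you?,What industry do you work in?,Job title,Unused column\n"
    "4/27/2021 11:02:10,25-34,Education (Higher Education),"
    "Research and Instruction Librarian,x\n"
)

wanted = [
    "Timestamp",
    "How old are you?",
    "What industry do you work in?",
    "Job title",
]

# usecols restricts parsing to the listed columns.
subset = pd.read_csv(io.StringIO(csv_text), usecols=wanted)
print(subset.columns.to_list())
```

With a real file, the first argument would be the file path instead of the io.StringIO object.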

Prepare a dictionary that maps original column names to new names, for input to make_column_names

columns_dict = {
    'How old are you?': 'age',
    'What industry do you work in?': "industry?",
    }
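
A typo in a dictionary key is easy to miss. Whether make_column_names reports keys that match no column is not stated here, so a quick check beforehand may help. A minimal sketch, assuming the four column names shown in the output above:

```python
import pandas as pd

# Column names as they appear in the data.head() output.
columns = pd.Index([
    "Timestamp",
    "How old are you?",
    "What industry do you work in?",
    "Job title",
])

columns_dict = {
    "How old are you?": "age",
    "What industry do you work in?": "industry?",
}

# Keys that do not match any existing column would not rename anything.
missing = [key for key in columns_dict if key not in columns]
print(missing)  # []
```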

Clean column names

data = make_column_names(data, column_dict=columns_dict)
data = remove_spl_chars_in_columns(data, spl_chars_excepted=['_'])
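
The internals of make_column_names and remove_spl_chars_in_columns are not shown in this example. The sketch below is a pandas-only approximation, not cleanml's implementation: the helper names rename_and_normalize and strip_special_chars are hypothetical, and the lowercasing and space-to-underscore rules are assumptions inferred from the column list printed at the end of this example.

```python
import re

import pandas as pd


def rename_and_normalize(df, column_dict):
    """Hypothetical stand-in: rename via the mapping, then lowercase and snake_case."""
    out = df.rename(columns=column_dict)
    out.columns = [str(c).strip().lower().replace(" ", "_") for c in out.columns]
    return out


def strip_special_chars(df, allowed=("_",)):
    """Hypothetical stand-in: drop every character that is not alphanumeric or allowed."""
    keep = "".join(re.escape(ch) for ch in allowed)
    pattern = re.compile(rf"[^0-9a-zA-Z{keep}]")
    out = df.copy()
    out.columns = [pattern.sub("", str(c)) for c in out.columns]
    return out


# Empty frame with the same column names as the example data.
sample = pd.DataFrame(columns=[
    "Timestamp",
    "How old are you?",
    "What industry do you work in?",
    "Job title",
])
columns_dict = {
    "How old are you?": "age",
    "What industry do you work in?": "industry?",
}

cleaned = strip_special_chars(rename_and_normalize(sample, columns_dict))
print(cleaned.columns.to_list())
# ['timestamp', 'age', 'industry', 'job_title']
```

The trailing "?" in "industry?" survives the rename and is removed only by the second step, which is why the example calls both functions.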

Verify that the changes are correct

print(data.columns.to_list())
['timestamp', 'age', 'industry', 'job_title']
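
Beyond eyeballing the printed list, the result can be checked programmatically. A minimal sketch, assuming the intended convention is lowercase letters, digits, and underscores only (that convention is inferred from the output above, not documented here):

```python
import re

# The column list printed above.
columns = ['timestamp', 'age', 'industry', 'job_title']

# Assumed naming convention: lowercase letters, digits, underscores.
valid_name = re.compile(r"[a-z0-9_]+")

invalid = [c for c in columns if not valid_name.fullmatch(c)]
assert not invalid, f"unexpected characters in: {invalid}"

# Cleaning can collapse distinct names into one, so check uniqueness too.
assert len(set(columns)) == len(columns), "duplicate column names after cleaning"
```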