Example Usage
To use cleanml in a project:
import cleanml
print(cleanml.__version__)
0.1.0
Import the required packages
from cleanml.cleanml import make_column_names
import pandas as pd
Read and view the data
data = pd.read_csv("../data/ask_a_manager.csv")
data = data[['Timestamp', 'How old are you?', 'What industry do you work in?', 'Job title']]
data.head()
| | Timestamp | How old are you? | What industry do you work in? | Job title |
|---|---|---|---|---|
| 0 | 4/27/2021 11:02:10 | 25-34 | Education (Higher Education) | Research and Instruction Librarian |
| 1 | 4/27/2021 11:02:22 | 25-34 | Computing or Tech | Change & Internal Communications Manager |
| 2 | 4/27/2021 11:02:38 | 25-34 | Accounting, Banking & Finance | Marketing Specialist |
| 3 | 4/27/2021 11:02:41 | 25-34 | Nonprofits | Program Manager |
Prepare a dictionary for input to the function
columns_dict = {
    'How old are you?': 'age',
    'What industry do you work in?': 'industry',
}
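Before passing the mapping along, it can help to confirm that every key matches an existing column name exactly, since a mistyped key would not match anything. Whether make_column_names raises an error for unknown keys is not stated here, so the check below is only a precaution. It is a minimal sketch that uses a plain list of the column names from the preview above in place of the DataFrame.

```python
# Column names as they appear in the preview above.
# With the real DataFrame this would be list(data.columns).
columns = [
    "Timestamp",
    "How old are you?",
    "What industry do you work in?",
    "Job title",
]

columns_dict = {
    "How old are you?": "age",
    "What industry do you work in?": "industry",
}

# Collect mapping keys that do not correspond to any column name.
missing = [key for key in columns_dict if key not in columns]

# An empty list means every key matches a column.
print(missing)
```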
Clean column names
data = make_column_names(data, column_dict=columns_dict)
Validate that the changes are correct
print(data.columns.to_list())
['timestamp', 'age', 'industry', 'job_title']
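The output suggests two steps: columns listed in the dictionary take their mapped names, and the remaining columns are lowercased with spaces turned into underscores. The sketch below reproduces that result with the standard library only. The helper name `sketch_column_names` and its cleaning rule are assumptions made for illustration; this is not cleanml's implementation, and the library may handle edge cases differently.

```python
import re


def sketch_column_names(columns, column_dict=None):
    """Hypothetical helper (not part of cleanml): apply explicit renames,
    then convert every resulting name to snake_case."""
    column_dict = column_dict or {}
    cleaned = []
    for name in columns:
        # Use the mapped name when one is given, otherwise keep the original.
        name = column_dict.get(name, name)
        # Lowercase, collapse runs of non-alphanumeric characters into "_",
        # and trim underscores left at either end.
        cleaned.append(re.sub(r"[^0-9a-z]+", "_", str(name).lower()).strip("_"))
    return cleaned


columns = ["Timestamp", "How old are you?", "What industry do you work in?", "Job title"]
mapping = {"How old are you?": "age", "What industry do you work in?": "industry"}

result = sketch_column_names(columns, column_dict=mapping)
print(result)  # ['timestamp', 'age', 'industry', 'job_title']
```

With a DataFrame, the same list could be assigned back through `data.columns = sketch_column_names(data.columns, mapping)`.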