Add data cleansing to T-SQL scripts for people's demographic data. Some examples to check for are zip code format, email address format, and proper capitalization. Records with errors should go into the table for InValid records with a comment describing the error; you can add a column to that table for the error type. Records that are valid should go into the table for valid records. I have already created the tables and have the create scripts for them. I have also written some data cleansing, such as checking for duplicates. Make recommendations on what further data cleansing should be added. There might be a data cleansing project for AdventureWorks on the web or GitHub; you could use that and add comments on anything you think would be better. Message me and I will send you the scripts I have so far. I'm using AdventureWorks2008R2 and AdventureWorksDW2008R2 from the Microsoft downloads. I will send you the download link if you message me.
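To make the valid/invalid split concrete, here is a minimal sketch of the routing logic in Python rather than T-SQL, purely as an illustration; the same rules would need to be re-expressed in the actual scripts. The field names (`first_name`, `last_name`, `zip`, `email`), the `error_type` column, and the specific patterns are assumptions, not taken from the existing tables.

```python
import re

# Hypothetical rules; adjust to the real column definitions.
ZIP_RE = re.compile(r"\d{5}(-\d{4})?")             # US ZIP or ZIP+4 only
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")  # loose shape check, not full validation


def find_errors(record):
    """Return a list of error descriptions for one record (empty if valid)."""
    errors = []
    if not ZIP_RE.fullmatch(record.get("zip") or ""):
        errors.append("Invalid zip code format")
    if not EMAIL_RE.fullmatch(record.get("email") or ""):
        errors.append("Invalid email format")
    for field in ("first_name", "last_name"):
        value = record.get(field) or ""
        if not value:
            errors.append(f"Missing {field}")
        elif value != value.capitalize():
            # Simple rule: first letter upper case, the rest lower case.
            errors.append(f"Improper capitalization in {field}")
    return errors


def split_records(records):
    """Route each record to the valid list or the invalid list."""
    valid, invalid = [], []
    for record in records:
        errors = find_errors(record)
        if errors:
            # The error comment travels with the rejected record.
            invalid.append({**record, "error_type": "; ".join(errors)})
        else:
            valid.append(record)
    return valid, invalid
```

Note that the simple capitalization rule would also flag legitimate names such as "McDonald" or "O'Brien", so a real implementation would probably need an exception list or a softer warning for that check.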
I have some website links that show code for data cleansing steps, as listed below.
Remove irrelevant data
Standardize capitalization
Convert data type
Clear formatting
Fix errors
Handle missing values
Remove irrelevant data
Remove duplicate data
Fix structural errors
Do type conversion
Handle missing data
Deal with outliers
Standardize/Normalize data
Validate data
Column operations (Select columns, Sort columns, Drop columns, Rename columns)
Handle duplicates (Distinct, Drop duplicate)
Remove spaces (Trim, Left trim, Right trim)
Cleaning Functions for Number fields (Is numeric, Convert, Round, Trunc)
Cleaning Functions for Text fields (Capitalize, Lower, Upper, Sub strings, Reverse, Split string)
Cleaning Functions for Date fields (Cast to date, Convert date, Extract)
Handle missing values (Fill NA, Drop NA, If null)
Coding verification (Encode, Decode)
Any data type (Left pad, Right pad, Length)
Advanced functions to normalize data: JSON Normalize, Pivot, Unpivot
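A few of the listed operations (trimming spaces, handling missing values, left padding, and dropping duplicates) can be sketched as small helpers. This is an illustrative Python sketch under assumed names (`clean_text`, `pad_zip`, `drop_duplicates` are hypothetical), not code from the linked sites or the existing scripts.

```python
def clean_text(value, default=""):
    """Trim and collapse whitespace; replace a missing value with a default."""
    if value is None:
        return default
    return " ".join(str(value).split())


def pad_zip(value, width=5):
    """Left-pad a digits-only code with zeros, e.g. when a leading zero was lost."""
    text = clean_text(value)
    return text.zfill(width) if text.isdigit() else text


def drop_duplicates(rows, key_fields):
    """Keep the first row per key; keys are compared as cleaned, lower-cased text."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(clean_text(row.get(field)).lower() for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

For example, `pad_zip(2134)` restores a ZIP code that was stored as a number, and `drop_duplicates(rows, ["email"])` treats two rows as the same person when their email addresses differ only in case or surrounding spaces. Whether that matching rule is appropriate depends on the data.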
Please reply with:
The time of day that you are available. I am available 7am to 10pm UTC-06:00 Central Time (US & Canada).
Do you speak English?
Are you available to log in remotely and work on my remote machine?
What remote login app do you want to use to work remotely?
When can you complete the project? I need the project completed within 48 hours.
Confidentiality.
a) No Use. Recipient agrees not to use the Confidential Information in any way.
b) No Disclosure. Recipient agrees to use its best efforts to prevent and protect the Confidential Information, or any part thereof.
c) Protection of Secrecy. Recipient agrees to take all steps reasonably necessary to protect the secrecy of the Confidential Information, and to prevent the Confidential Information from falling into the public domain or into the possession of unauthorized persons.
d) Scope. The scope of Confidentiality is deemed to be in all contracts, present, past, and future, with the Parties to this request.
I understand you are looking for someone to add data cleansing to T-SQL scripts for people's demographic data and to recommend what further data cleansing should be added. I believe I am the perfect fit for this project due to my extensive knowledge of data analysis, business intelligence, NLP, data warehousing, data visualization (Excel dashboards), SQL, and web development.
I am confident that I have the skills necessary to complete this project within 48 hours including data analysis, statistical modeling (using R and Python), data visualization, SPARK with R (SparklyR) and ETL processes. My commitment to confidentiality, protection of secrecy and scope of Confidentiality make me an ideal candidate for this project. Please feel free to contact me if you have any questions or would like additional information about me or my services.
$60 USD in 7 days
3 freelancers are bidding on average $40 USD for this job
Hi,
Thank you for inviting me to apply for this job. I am a very hard worker who’s in serious need of a job. I am willing to do anything to get this project.
I am willing to offer a cheaper price for my services and I would be extremely grateful if you could hire me.
I’ve been working in this industry for 11 years and use the latest Azure technologies.
Regards,
Sachin