Help Needed: Merging 3 Datasets For Junior Data Engineer Assignment

Hi everyone,

I’m currently working on an assignment for a Junior Data Engineer role, and I could use some guidance. The task involves merging three datasets from different sources (Facebook, Google, and Company Website) into one comprehensive dataset. The columns I’m focusing on are:

Domain (most reliable)
Phone Number (second most reliable)
Name
Category
Address

I’ve mostly cleaned the datasets, but I need to merge them accurately. My main goals are to:

Merge the datasets using one or two key columns (Domain and Phone Number).
Ensure there is no duplicated information and that matching rows from the different sources complement each other, so the result is the most accurate and reliable data.
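To make those goals concrete, here is a rough sketch of one way such a merge could work, written in plain Python rather than Power Query or MySQL. Everything in it is an assumption for illustration: the function name `merge_sources`, the lowercase field names, the sample rows, and the source ranking (website first, then Google, then Facebook). It matches rows on domain first, falls back to phone only when one side has no domain, and fills each field from the most trusted source that has a value.

```python
FIELDS = ("domain", "phone", "name", "category", "address")  # hypothetical column names


def merge_sources(sources):
    """Merge records from several sources, listed from most to least trusted.

    Rows are matched on domain first and on phone as a fallback. For each
    field, the first non-empty value seen (the most trusted source) is kept.
    """
    merged = []      # one dict per company
    by_domain = {}   # domain -> merged record
    by_phone = {}    # phone  -> merged record

    for records in sources:
        for rec in records:
            domain = rec.get("domain") or None
            phone = rec.get("phone") or None

            # 1) Try the most reliable key: domain.
            target = by_domain.get(domain) if domain else None

            # 2) Fall back to phone, but never join two different domains.
            if target is None and phone:
                candidate = by_phone.get(phone)
                if candidate is not None and (not domain or not candidate["domain"]):
                    target = candidate

            # 3) No match: start a new merged record.
            if target is None:
                target = dict.fromkeys(FIELDS)
                merged.append(target)

            # Fill only the gaps, so earlier (more trusted) values are kept.
            for field in FIELDS:
                if target[field] is None and rec.get(field):
                    target[field] = rec[field]

            # Keep both lookup tables up to date.
            if target["domain"]:
                by_domain.setdefault(target["domain"], target)
            if target["phone"]:
                by_phone.setdefault(target["phone"], target)

    return merged


# Made-up sample rows, assumed to be already cleaned and normalized.
website = [{"domain": "acme.com", "phone": "", "name": "Acme Inc",
            "category": "", "address": "1 Main St"}]
google = [{"domain": "acme.com", "phone": "5550100", "name": "ACME",
           "category": "Hardware", "address": ""}]
facebook = [{"domain": "", "phone": "5550100", "name": "Acme FB",
             "category": "Shop", "address": "Other"},
            {"domain": "", "phone": "5550199", "name": "Bolt Cafe",
             "category": "Cafe", "address": ""}]

result = merge_sources([website, google, facebook])
for row in result:
    print(row)
```

This only works if domains and phone numbers are normalized to one format before merging (for example lowercase domains and digits-only phone numbers). The same idea should carry over to SQL as joins on the key columns combined with `COALESCE` across the sources.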

Could anyone suggest the best steps to take for this process? Should I use tools like Power Query or MySQL? Any recommendations for tutorials or YouTube videos would also be greatly appreciated.

Thanks in advance for your help!

submitted by /u/FortaDeMunca