I hope this is the right place to ask this question. What would be a good approach to managing a large dataset of about 60 million rows, where several columns need to be manipulated either to find duplicate rows or to perform calculations on financial columns? The end goal is to produce a file with no duplicate rows and with the final calculated figures. Thanks in advance!
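
To make the task concrete, here is a minimal sketch of one possible approach: a single streaming pass over a CSV file that drops repeated rows and totals a financial column. Everything specific in it is an assumption for illustration, not something stated above: the CSV format, the column names `id`, `account`, and `amount`, the helper name `dedupe_and_total`, and the idea that a "duplicate" means an exact match on the chosen key columns. `Decimal` is used instead of floats so that currency sums do not pick up rounding error.

```python
import csv
import hashlib
import io
from decimal import Decimal


def dedupe_and_total(src, dst, key_columns, amount_column):
    """Copy rows from src to dst, skipping rows whose key columns were
    already seen, and return (total, kept, dropped).

    Hypothetical helper: src and dst are text file objects holding CSV data.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()

    seen = set()          # 16-byte digests of keys, not the full rows
    total = Decimal("0")  # exact decimal arithmetic for money
    kept = dropped = 0

    for row in reader:
        # Join the key columns with a separator unlikely to occur in the data.
        key = "\x1f".join(row[col].strip() for col in key_columns)
        digest = hashlib.blake2b(key.encode("utf-8"), digest_size=16).digest()
        if digest in seen:
            dropped += 1
            continue
        seen.add(digest)
        writer.writerow(row)
        total += Decimal(row[amount_column])
        kept += 1

    return total, kept, dropped


# Tiny in-memory example; with real files, pass open(..., newline="") objects.
sample = "id,account,amount\n1,A,10.10\n2,B,5.25\n1,A,10.10\n3,A,0.05\n"
out = io.StringIO()
total, kept, dropped = dedupe_and_total(
    io.StringIO(sample), out, ["id", "account", "amount"], "amount"
)
print(total, kept, dropped)
```

Because rows are read one at a time, only the set of key digests stays in memory. At 60 million rows that set could still need on the order of gigabytes of RAM in Python, so an on-disk alternative (for example, loading the data into a database and deduplicating there) may be a better fit if memory is tight.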
submitted by /u/MsVee21