Hey guys
I’m building a physical 3-node cluster (1 Master, 2 Workers, Docker Swarm) for a backend class. I need to distribute a heavy workload to process massive text/JSON data, but I want the final presentation to be actually funny. No boring corporate data!!!!
I’m looking for ideas on what exactly to analyze. I want to calculate crazy metrics, find weird patterns, etc.
So far I was thinking of:
• Analyzing League of Legends chat logs, but that feels meh
The dataset needs to be easy to find (Kaggle, Hugging Face, APIs) but large enough to justify parallel processing on a cluster, pleaaaase.
Any crazy ideas or dataset links? Thanks! 😀
submitted by /u/Much_Palpitation9699