How To Improve Dataset Quality For A Machine Learning Forecast Project

I have a dataset of IT ticket logs from 2020 to 2023. I have structured the columns as follows: day, month, year, holiday (0 if it is not a holiday, 1 if it is), day of the week (1 to 7), hour of the day (0 to 23), bank campaign (only for July and December), bonus, and finally the number of tickets per day and hour. When I organize the logs only by date, the dataset has 1014 rows. If I add the hour attribute, it ends up with 6000 rows. I want to train ML algorithms (random forest and LSTM) to forecast the number of IT tickets for a given date and hour, but my metrics are underperforming. Is there a way to improve my metrics? Could the problem be related to the algorithms? How could I improve the quality of my dataset (if that's even possible)?
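To make the layout concrete, here is a minimal sketch of how rows like these could be built from raw ticket timestamps. It is only an illustration under assumptions: the function name `build_hourly_rows`, the column names, the sample timestamps, and the holiday set are all hypothetical, day 1 is assumed to be Monday, and the campaign flag is assumed to cover all of July and December. The bonus column is left out because its definition is not given here.

```python
from collections import Counter
from datetime import date, datetime


def build_hourly_rows(ticket_times, holidays=frozenset()):
    """Aggregate raw ticket timestamps into one row per (date, hour)."""
    # Count how many tickets fall into each (calendar date, hour) bucket.
    counts = Counter((t.date(), t.hour) for t in ticket_times)
    rows = []
    for (d, hour), n in sorted(counts.items()):
        rows.append({
            "day": d.day,
            "month": d.month,
            "year": d.year,
            "holiday": 1 if d in holidays else 0,
            "day_of_week": d.isoweekday(),  # assumed convention: 1 = Monday ... 7 = Sunday
            "hour": hour,
            "campaign": 1 if d.month in (7, 12) else 0,  # assumed: whole month flagged
            "tickets": n,
        })
    return rows


# Hypothetical sample timestamps, for illustration only.
sample = [
    datetime(2021, 7, 5, 9, 15),
    datetime(2021, 7, 5, 9, 40),
    datetime(2021, 7, 5, 14, 5),
    datetime(2021, 12, 25, 10, 0),
]
rows = build_hourly_rows(sample, holidays={date(2021, 12, 25)})
for row in rows:
    print(row)
```

Note that this sketch only emits a row for hours that had at least one ticket. Hours with zero tickets produce no row at all unless they are added explicitly.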

Thanks in advance for your help!

submitted by /u/CheisonVS