Hi, I’m a developer working on a project. I’m not sure if this is the right place, but I thought I’d ask.
This project has a core business feature where pricing is tied to a vehicle’s category, so the user can price out packages based on vehicle type.
Here is where the problems begin. I usually use NHTSA for vehicle data: it’s public, fast, and free, but it’s not complete enough. It returns ambiguous ‘types’ like ‘mpv,bus,truck,car’ rather than sedan, SUV, exotic, etc.
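To make the gap concrete, here is a rough sketch of the kind of mapping I end up needing. Everything in it is an assumption for illustration: the category names, the `COARSE_TO_CATEGORY` table, the `EXOTIC_MAKES` set, and the `pricing_category` helper are all hypothetical and not part of any NHTSA API.

```python
from typing import Optional

# Hypothetical mapping from the coarse source types to internal pricing
# categories. The right-hand values are made up for this example.
COARSE_TO_CATEGORY = {
    "car": "sedan",
    "mpv": "suv",
    "truck": "truck",
    "bus": "oversize",
}

# Hypothetical override list: the coarse type alone cannot tell an exotic
# apart from an ordinary car, so something like a make lookup is needed.
EXOTIC_MAKES = {"ferrari", "lamborghini", "mclaren"}


def pricing_category(coarse_type: str, make: Optional[str] = None) -> str:
    """Return a pricing category, or 'unclassified' when the type is unknown."""
    if make and make.strip().lower() in EXOTIC_MAKES:
        return "exotic"
    return COARSE_TO_CATEGORY.get(coarse_type.strip().lower(), "unclassified")
```

Even with overrides like this, a single coarse type such as ‘car’ still lumps sedans, coupes, and hatchbacks together, which is why I’m after a richer source.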
I then tried the EPA fuel economy dataset, since it had 12,000 rows and came in CSV format for easy parsing. But it also proved too incomplete: it was missing newer vehicles such as 2024 3/4-ton trucks, among others.
For speed, I built my own ‘source of truth’ table in my database, seeded by a populate job, but I still need a clean, reliable data source to feed that job. I can get by with the NHTSA data for the time being, but a more complete solution is necessary to scale.
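For context, the populate job looks roughly like the sketch below. This is a simplified stand-in, not my actual code: it assumes SQLite, and the `vehicle_category` table, its columns, and the `seed` function are hypothetical names. It uses `INSERT OR IGNORE` so that rows already in the table (for example, ones corrected by hand) are not overwritten by a later, lower-quality source.

```python
import sqlite3
from typing import Iterable, Tuple

# (year, make, model, category) -- a hypothetical row shape for this example.
Row = Tuple[int, str, str, str]

SCHEMA = """
CREATE TABLE IF NOT EXISTS vehicle_category (
    year     INTEGER NOT NULL,
    make     TEXT    NOT NULL,
    model    TEXT    NOT NULL,
    category TEXT    NOT NULL,
    source   TEXT    NOT NULL,
    PRIMARY KEY (year, make, model)
)
"""


def seed(conn: sqlite3.Connection, rows: Iterable[Row], source: str) -> int:
    """Insert rows that are not in the table yet; return how many were added."""
    conn.execute(SCHEMA)
    before = conn.total_changes
    conn.executemany(
        "INSERT OR IGNORE INTO vehicle_category "
        "(year, make, model, category, source) VALUES (?, ?, ?, ?, ?)",
        # Normalize make/model so 'Ford' and 'ford' hit the same primary key.
        [(y, mk.strip().lower(), md.strip().lower(), cat, source)
         for y, mk, md, cat in rows],
    )
    conn.commit()
    # Rows skipped by OR IGNORE do not count toward total_changes.
    return conn.total_changes - before
```

Running the job against the most trusted source first means its rows win, and later sources only fill in vehicles that are still missing.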
submitted by /u/Square-Display555