Trying to help and see if I can manage to match you with people, organizations, or companies that might have the dataset you’re looking for 🙂
submitted by /u/nobilis_rex_
[link] [comments]
Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?
I have a dataset containing Swiss zip codes. I would like to calculate statistics per canton, but this requires aggregating the data to the canton level using zip codes. I understand that zip codes do not follow canton borders perfectly, but I wonder if anyone is aware of a file that allows matching between them. A list of all zip codes (starting digits) contained in each canton would be great.
Any help in the right direction is much appreciated.
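For what it's worth, the aggregation step itself could look roughly like the sketch below once such a file exists. Everything in it is an assumption for illustration: the crosswalk contents, the split weights for zip codes that straddle a canton border, and the function name `aggregate_by_canton` are made up and do not come from any official source.

```python
from collections import defaultdict

# Hypothetical crosswalk: zip code -> list of (canton, weight) pairs.
# The entries and weights are illustrative only, not official data.
ZIP_TO_CANTON = {
    "8001": [("ZH", 1.0)],
    "3011": [("BE", 1.0)],
    "9999": [("ZH", 0.5), ("BE", 0.5)],  # fictional zip code straddling two cantons
}


def aggregate_by_canton(values_by_zip, crosswalk):
    """Sum a per-zip-code value up to canton level, splitting by weight.

    Returns the canton totals and the list of zip codes that had no match.
    """
    totals = defaultdict(float)
    unmatched = []
    for zip_code, value in values_by_zip.items():
        shares = crosswalk.get(zip_code)
        if shares is None:
            unmatched.append(zip_code)
            continue
        for canton, weight in shares:
            totals[canton] += value * weight
    return dict(totals), unmatched


# Made-up input values per zip code.
sample = {"8001": 10, "3011": 4, "9999": 6, "0000": 1}
canton_totals, missing = aggregate_by_canton(sample, ZIP_TO_CANTON)
```

Keeping the unmatched zip codes in a separate list makes it easy to see how much of the data the crosswalk fails to cover.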
submitted by /u/AtkinsonStiglitz
[link] [comments]
I would like to understand how the data is arranged, what variables are included, the best way to extract, clean, preprocess the data for my research questions. Thank you in advance!!!
submitted by /u/Patient_Ad1095
[link] [comments]
The FBI apparently launched a Swatting database project in about 2022 (or 2021?). I see references to this online, but haven’t seen the data anywhere, so far. Does anyone have access to this data?
submitted by /u/bobbyfiend
[link] [comments]
Hello everyone! I’m doing some research on existing data search engines like Google Dataset Search, FindData, OpenAIRE, Datacite Search and so on. I’ve noticed a lack of academic papers and/or other research on how people search for datasets, what kind of features they need or don’t need. I know it’s quite common to search big data catalogues like Kaggle, Data.gov and others instead of search engines.
Personally, I miss geospatial search features like setting geospatial filters and coordinates. Only some specific data catalogues and search engines support this.
How do you search for datasets? Are existing dataset search engines good or bad, effective or ineffective? Which are the most helpful?
submitted by /u/ivan-begtin
[link] [comments]
I’m looking to find all the Direct-to-Consumer (D2C) companies using Shopify. I’ve considered buying a list from a site like BuiltWith, but 1) they’re pretty expensive and 2) ideally I want to use an API or web scraping so I can update the list on a somewhat regular cadence.
Any recommendations?
submitted by /u/HereToLearnArt
[link] [comments]
I’m fine-tuning an LLM, and I need a database of inputs and outputs, where the inputs are common novice Java programming problems and the outputs are those problems tested with JUnit 5. I need to collect these from somewhere. Any ideas?
submitted by /u/curry_licker
[link] [comments]
Hi,
I am looking for a quick way to download recent and forthcoming documents from the Defense Technical Information Center (DTIC). Ideally, someone has already crawled the corpus and has it stashed somewhere, or has a running spider they can share.
submitted by /u/fredzannarbor
[link] [comments]
Hello, does anyone have any good sources for any and all things supply chain data? I’m interested in international trade, suppliers/vendors, really anything I can get my hands on.
submitted by /u/drrednirgskizif
[link] [comments]
Hey guys, I am developing a livestock and field management system. One of its main features will be a calendar with all the rainfall recorded month by month for each loaded field. I need a rainfall history API by coordinates to provide this data. The data needs to be accurate and at least one year old. It would be a plus if the API were free or had a free tier.
Does anyone have any experience with an API like this or can recommend one?
EDIT: data needs to be relatively accurate in Argentine rural areas
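To make the calendar requirement concrete, here is a minimal sketch of the month-by-month aggregation, assuming the API returns daily rainfall in millimetres per field. The record layout `(field_id, date, mm)`, the sample values, and the function name `monthly_rainfall` are all hypothetical and not tied to any real provider.

```python
from collections import defaultdict
from datetime import date


def monthly_rainfall(daily_records):
    """Sum daily rainfall (mm) into totals keyed by (field_id, year, month)."""
    totals = defaultdict(float)
    for field_id, day, mm in daily_records:
        totals[(field_id, day.year, day.month)] += mm
    return dict(totals)


# Invented sample data: (field_id, date, rainfall in mm).
records = [
    ("field-a", date(2023, 1, 5), 12.0),
    ("field-a", date(2023, 1, 20), 3.5),
    ("field-a", date(2023, 2, 1), 7.0),
    ("field-b", date(2023, 1, 5), 1.0),
]
monthly = monthly_rainfall(records)
```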
submitted by /u/Axeloe
[link] [comments]
I’m doing a research project for university and I’m looking for data that includes Asian people and generational status as well as mental health (can be self-perceived or diagnosed) and education (highest level or average years).
I am having trouble finding a data set that has all of these because of the condition that it must have generational status. Or is there an alternative phrasing that is more commonly measured instead of generational status?
submitted by /u/Mediocre-Tea9286
[link] [comments]
Hello everyone! I’m looking for a dataset that has cars’ model & make and prices from 2008 to 2018 (or around that time period). I’ve been looking into scraping Craigslist, but Craigslist removes old/expired listings so I couldn’t scrape past data comprehensively. Any leads would be greatly appreciated!
submitted by /u/Both_Armadillo7524
[link] [comments]
Looking for a dataset that preferably contains the effects of sleep, exercise, or meditation on teenager wellbeing. I know this is specific, but anything to do with effects on teenager wellbeing would also be good. Thanks in advance.
submitted by /u/dreamsmpfanmandan
[link] [comments]
I was looking for datasets containing video game sales and reviews for my first data visualization project, and I found this dataset (https://www.kaggle.com/datasets/thedevastator/video-game-sales-and-ratings). I thought it was great because it had everything I was looking for, but then I realized it states that GTA V sold approximately 1 million copies on PC, which is obviously wrong. The dataset was created 5 years ago, so that’s not really an explanation for this. So I’m wondering: can you trust Kaggle datasets? And where can I find something similar that provides correct data?
submitted by /u/Beneficial-Daikon202
[link] [comments]
I am trying to access a dataset on the website for an assignment, but I cannot seem to do so. I’ve checked multiple posts but their original date of posting is a few years ago. Is it already down?
submitted by /u/InfamousMarketing748
[link] [comments]
I’ve seen reports posted previously of people analyzing, say, the top 500 earning YouTube videos/channels. They go into thumbnail, video title, genre, audience, etc. I can’t find anything when I Google it though. I keep getting ‘YouTube Analytics’ for your own individual channel.
Anybody have any idea? Thanks
submitted by /u/TaoTeCha
[link] [comments]
I am looking for spare data for forecasting and optimisation. Also open to the idea of creating synthetic data. All help is appreciated
submitted by /u/Suspicious_Low7612
[link] [comments]
I’m looking for a dataset targeting technical roles that includes elements such as industry, location, job title, whether the role is managerial/supervisory/has direct reports, gender, salary, and company size. I’ve tried a number of places including data.world, Kaggle and O*NET but haven’t been able to find something similar. My goal is to identify technical managers (regardless of job title) for further analysis. Can anyone point me at a good source, or good datasets?
submitted by /u/_AriC
[link] [comments]
1. I looked around TCIA, but I couldn’t find a dataset with actual radiologist tumor segmentation (Duke and ISPY, as far as I checked, don’t include segmentation). The only thing I found is Breast_Cancer_DCE-MRI_Data from Zenodo. Are there more datasets?
2. Are there breast MRI datasets that are normal, with no findings?
submitted by /u/dark16sider
[link] [comments]
Hi, could you please suggest where I can find examples of real datasets with small n and large p (more features than observations)? I am working on regression and am looking for continuous variables, to predict a variable Y from a high-dimensional data matrix X.
submitted by /u/StrongCollection1687
[link] [comments]
Hello!
I’m looking for yearly IMF lending data for different countries. All I could find so far is net or cumulative lending data, which accounts for how much the countries are paying back to the IMF, whereas I need just the amounts lent by the IMF to the countries. Any help?
Thanks!!
submitted by /u/Puzzleheaded-Pie-671
[link] [comments]
I am searching for a Large Song Dataset including mood and similarities between artists. I found the Million Song Dataset but it seems that they don’t have valence in the fields, so I would need to query Spotify.
However, it seems like there is no way currently to go from Echo Nest ID to Spotify ID.
Does anybody know a Large Dataset I could use which would have everything I need? Or a way to link the Million Song Dataset with Spotify API?
submitted by /u/MusicAIPerson
[link] [comments]
Hello, do you guys know where to find a labelled image dataset of the Caucasus and Central Asia?
submitted by /u/ReportLess1819
[link] [comments]
I’m training a data model for electric two-wheeler bikes and am wondering if I could buy the data from somewhere to make the process faster.
submitted by /u/trieuvietvuong
[link] [comments]
Dataset recommendation request:
I’m looking for any existing publicly available datasets with many examples of isolated instruments being played with no accompaniment and minimal ambient noise.
I need isolated instruments to train individual instrument source separation and detection models for [bar,ts,as,ss,tp,cl,dm,b,etc., etc.] – basically all of the most commonly found instruments in jazz sessions with the exception of piano (which I have no problem sourcing isolated recordings of).
I can probably source sufficient material from YouTube, but I am hoping there are some new datasets I haven’t heard of yet with isolated instruments.
submitted by /u/returnstack
[link] [comments]
Hello,
I’m very new to this, so I’m extremely sorry for any beginner terminology used here. I plan to do my bachelor’s thesis on the use of Twitter for online mobilization during the “Dalit Lives Matter” movement (a movement based in India, very similar to the Black Lives Matter movement, in case that is not obvious; tweet timeline from 2016 to 2021). I am planning to do a content or sentiment analysis of the tweets.
I was looking for ways to access such datasets. I have heard that X’s API has been put behind a paywall and that the free version cannot support archival search. I contacted a third party for access to the tweets, and they are charging a hundred dollars for it.
Please let me know what is the best way to go about this, if required I can connect in DMs to give out additional details.
Thank You 🙂
submitted by /u/Prestigious_Aioli140
[link] [comments]
Anything related to EDS, hEDS, vEDS, etc. would be welcome. I’m looking to do basic time series analysis and such.
submitted by /u/codyfernfan
[link] [comments]