I’m working on a code summarization project for C, and for that I need a dataset containing code snippets with a corresponding explanation of each snippet. Any pointers would be very helpful.
submitted by /u/TLE_champion
[link] [comments]
Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?
I am looking for a small dataset (csv/xlsx) of around 500-100 rows with around 10 columns to visualise in Power BI. If anyone has that kind of dataset lying around, it would be much appreciated. It is for a uni project. Thank you.
submitted by /u/FardinShafi
[link] [comments]
Hey guys hope you are having a nice day,
I am looking for a dataset that contains labeled ear (auricular) acupuncture points for my graduation project.
So if anyone could help me find an open-source labeled dataset, it would be awesome <3
submitted by /u/HassanSalama
[link] [comments]
Hi, I hope someone here can help – I am looking for a messy dataset for my assignment and I am hitting a wall.
Not as simple as just any set – the assignment is to find a clean one and a messy one, join them on a common variable, and then perform analysis. So these need to be somewhat related topic-wise and include a common variable.
I would like to work on the subject of gender representation, and I already have several clean sets with general demographic info, but I just cannot find anything messy enough (this is still beginner level, so they don’t just want me to do standardisation etc.; I need something that includes observations as variables, missing data, etc.). I was hoping to find something on gender representation in politics by country to then join to my clean sets on the country variable. Any help much, much, much appreciated!!
submitted by /u/MagdaMc85
[link] [comments]
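For the clean/messy join described in the post above, here is a minimal sketch of the two steps involved: reshaping a table that stores observations as variables (one column per year) into one row per observation, then joining it to a clean table on country. The column names and every number are invented placeholders, not real statistics, and the helper names (`normalise`, `wide_to_long`, `left_join`) are hypothetical.

```python
# Hypothetical messy table: one row per country, one column per year
# (observations stored as variables), with a blank for missing data.
# All values are invented placeholders.
messy_rows = [
    {"Country": " Sweden ", "2019": "47.3", "2020": "47.0"},
    {"Country": "japan", "2019": "10.1", "2020": ""},
]

# Hypothetical clean table keyed by country (values also invented).
clean_rows = [
    {"country": "Sweden", "population_m": 10.4},
    {"country": "Japan", "population_m": 125.7},
]


def normalise(name):
    """Trim whitespace and unify case so join keys match."""
    return name.strip().casefold()


def wide_to_long(rows, id_column, value_name):
    """Turn each year column into its own row; empty cells become None."""
    long_rows = []
    for row in rows:
        for column, cell in row.items():
            if column == id_column:
                continue
            value = float(cell) if cell.strip() else None
            long_rows.append({
                "country": row[id_column].strip(),
                "year": int(column),
                value_name: value,
            })
    return long_rows


def left_join(left, right, key):
    """Attach the matching right-hand row (if any) to each left-hand row."""
    lookup = {normalise(r[key]): r for r in right}
    joined = []
    for row in left:
        match = lookup.get(normalise(row[key]), {})
        extra = {k: v for k, v in match.items() if k != key}
        joined.append({**row, **extra})
    return joined


long_rows = wide_to_long(messy_rows, "Country", "women_in_parliament_pct")
joined = left_join(long_rows, clean_rows, "country")
for row in joined:
    print(row)
```

Comparing country names after trimming and case-folding is only the simplest possible key cleanup; real country lists usually also need a mapping for alternative spellings.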
My professor wants me to build an object detection project that has a real-life application. The dataset should be collected manually. The idea should be very practical and useful.
Any ideas would be appreciated. My classmates have already taken the common ideas, like detecting potholes, bins, etc.
submitted by /u/kthxbubye
[link] [comments]
Hello! As a training project, I want to build several demo dashboards:
– financial statements: profit and loss, cashflow, balance sheet;
– sales report.
In this regard, I’m looking for a high-quality data set. If you have data that you can provide for my purposes or information about sources where it can be found or how it can be generated, I’ll be grateful.
submitted by /u/According_Scheme_553
[link] [comments]
Hello r/datasets Community,
We’re excited to introduce the Chinese Corpora Internet (CCI) dataset v1.0.0, a high-quality Chinese internet language dataset, meticulously developed by BAAI with the support of leading institutions and tech partners. CCI is designed to be the cornerstone of AI research requiring high-quality Chinese language data.
CCI’s standout features:
– Vast Scale: CCI offers an impressive 104GB of data, providing a broad spectrum of linguistic information.
– Time Span: The dataset encompasses over two decades of data, from January 2001 to November 2023, offering historical depth and contemporary relevance.
– Quality Sources: Data is sourced from trusted and authoritative Chinese internet platforms, ensuring high fidelity and relevance.
– Rigorous Processing: CCI has undergone extensive cleaning, deduplication, and quality checks to ensure the highest standards of data integrity.
– Safe and Reliable: With a focus on safety and reliability, CCI has been filtered through advanced techniques to remove any sensitive or inappropriate content.
– Benchmark Filtering: Unique to CCI, we’ve implemented stringent checks against mainstream Chinese benchmark datasets to prevent “teaching to the test” in model training.
Download CCI and join us in shaping the future of AI:
– BAAI Open Data Repository: https://data.baai.ac.cn/details/BAAI-CCI
– HuggingFace: https://huggingface.co/datasets/BAAI/CCI-Data
We’re eager to see the innovative applications and research that will emerge from the community’s use of CCI. Your participation and feedback are crucial to the continuous improvement of this dataset.
Cheers,
The BAAI Team
Supported by: CSAC, Beijing Municipal Cyberspace Administration, Beijing Municipal Science & Technology Commission, Zhongguancun Administrative Committee, Haidian District Government, our tech partners TRS and Wenge.
submitted by /u/lukai-baai
[link] [comments]
Looking to create a dataset of offensive and defensive cybersecurity techniques. The dataset would be used in a class project for teaching and refining an AI model’s chat responses. Can anyone recommend some sources, and what a row/column would look like? I know the preferred method is quantitative data, so how would this work with qualitative data? Also, any recommendations for a web scraping application, besides developing a script myself? Thanks
submitted by /u/Aisechopeful
[link] [comments]
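On the row/column question in the post above: for qualitative text, one common shape is one record per question-and-answer pair, stored as JSON Lines rather than as a numeric table. The sketch below is only an illustration of that idea. The `TechniqueRecord` class, its field names, and the sample row are all hypothetical, not an established schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class TechniqueRecord:
    """One row of a hypothetical chat-tuning dataset (all field names are assumptions)."""
    category: str   # "offensive" or "defensive"
    technique: str  # short name of the technique
    prompt: str     # the question a user might ask
    response: str   # the answer the model should learn to give
    source: str     # where the answer text came from


def to_jsonl(records):
    """Serialise records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(asdict(record)) for record in records)


# Illustrative placeholder row, not taken from any real dataset.
records = [
    TechniqueRecord(
        category="defensive",
        technique="network segmentation",
        prompt="What does network segmentation protect against?",
        response="It limits how far an intruder can move after compromising one machine.",
        source="placeholder: cite the document the answer was taken from",
    )
]

print(to_jsonl(records))
```

The categorical column (`category`) is what lets qualitative rows be counted, filtered, and balanced the way quantitative data would be.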
Hey guys,
Quick question – how does an individual go about selling their personal data at a strictly individual level (e.g. browsing history, shopping habits, location, etc.)?
Also, what data can be sold at this level?
Thinking of starting a super user-friendly app for individuals to sell their data and make a few extra dollars per month.
submitted by /u/AsadExec
[link] [comments]
Hi, I am currently doing research for a small team on crude oil futures. I am looking for WTI (CL=F) and Brent (BZ=F) data from the past 10 years, at 15-minute granularity. Are there any sources where I can get this? Paid sources are fine as well.
submitted by /u/theonlyQuan
[link] [comments]
I would prefer a data set on higher education, community education, emergencies, emergency medical technicians, films, and/or anything to do with social gerontology. I am supposed to be improving my SAS, Stata and SPSS skills. I’m supposed to be working with data for my research project, but the data I have is either too big for me to open, needs approval I can’t get, or isn’t a big enough dataset. I am trying to get better at using datasets, but I need ones that are free to use. Please save me from the failure that is writing my own dataset.
submitted by /u/Rajah_1994
[link] [comments]
Hi Everyone,
I created a platform that aggregates and stores data from the web, and has an LLM chat assistant to help you find the data best suited to your use case.
I would be happy to hear any feedback you have, including how it compares to more traditional methods of finding data through a search bar.
Feel free to use it below and let me know :), hope it helps:
submitted by /u/XhoniShollaj
[link] [comments]
Hey guys,
I am looking to find or purchase a large amount of conversational data for our chatbot. We are in the presales market, but we are also open to other conversations between customers and agents. Feel free to DM me if you have anything like this.
Thanks again
submitted by /u/jellydotsadventure
[link] [comments]
I need to find a data set that has variables that lend themselves to analysis by some form of multiple regression; it must have at least 15 cases per predictor; it must have at least 3 predictor variables; it should have both quantitative and categorical predictors; and it should have at least one quantitative dependent variable.
Is there a site where I can filter all these specifics?
submitted by /u/kevinalways
[link] [comments]
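I don’t know of a site that filters on all of those criteria at once, but once a candidate dataset is loaded they are easy to check mechanically. Below is a rough sketch of such a check. The function names are hypothetical, and treating "every value is a number" as quantitative and everything else as categorical is a simplifying assumption.

```python
from numbers import Real


def is_quantitative(values):
    """Treat a column as quantitative if every value is a number (bools excluded)."""
    return all(isinstance(v, Real) and not isinstance(v, bool) for v in values)


def check_dataset(rows, outcome, predictors, cases_per_predictor=15):
    """Check a list of row dicts against the criteria listed in the post."""
    columns = {name: [row[name] for row in rows] for name in [outcome, *predictors]}
    quantitative = [p for p in predictors if is_quantitative(columns[p])]
    categorical = [p for p in predictors if p not in quantitative]
    return {
        "at_least_3_predictors": len(predictors) >= 3,
        "enough_cases": len(rows) >= cases_per_predictor * len(predictors),
        "has_quantitative_predictor": bool(quantitative),
        "has_categorical_predictor": bool(categorical),
        "quantitative_outcome": is_quantitative(columns[outcome]),
    }


# Synthetic rows, generated only to exercise the check.
rows = [
    {"score": 50.0 + i, "hours": i % 10, "age": 18 + i % 5, "group": "a" if i % 2 else "b"}
    for i in range(45)
]

report = check_dataset(rows, "score", ["hours", "age", "group"])
for criterion, passed in report.items():
    print(criterion, passed)
```

With three predictors, the 15-cases-per-predictor rule means at least 45 rows, which is why the synthetic example uses exactly 45.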
Hi all, I hope you are having a good day. Let’s get to the point:
this is a huge dataset from Princeton University that was used for various studies. Check the image, and if you are interested, DM me.
full data image
submitted by /u/DataExpx
[link] [comments]
Hello.
My goal is to find certain statistical information about different countries of the European Union (related to things like employment, crime, cost of living, immigration, social nets, etc.); however, I’m quite new to this and I have no idea where to look.
I have found two major sources of data, Eurostat and UNdata, but I was wondering if there are other sources out there that I couldn’t find on Google?
submitted by /u/420-big-chungus-kean
[link] [comments]
Hello, so I’ve been searching for over an hour on various repositories. I’m looking for a dataset that has before-and-after numerical results. It can be test grades before and after an intervention, blood pressure before and after an intervention, etc… anything like that. I feel like I just don’t know how to properly search for this.
submitted by /u/Enochwel
[link] [comments]
As the title says, I need a bitcoin options dataset for analysis. All the data seems to be behind a paywall; is there a free dataset?
submitted by /u/VeLVeT-_–_-ThuNdeR
[link] [comments]
I am looking for a dataset about indicators and symptoms of Celiac Disease to build a decision support system for celiac disease diagnosis. Where can I find any data related to it?
submitted by /u/hashim_qureshi
[link] [comments]
Hi everyone,
I’m currently working on an object detection task focusing on ID proof documents. Specifically, I’m looking for datasets that contain labeled objects like photos, signatures, tables, etc., within documents.
If anyone has recommendations or knows where I can find such datasets, I would greatly appreciate your help! Thank you!
submitted by /u/SmokeBeatRepeat
[link] [comments]
Is there a dataset with the names and locations (XY coordinates) of jails/prisons in the US? Or a way to find them? Thanks
submitted by /u/DDPDietDoctorPepper
[link] [comments]
Hello, I am searching for datasets on prescription dosages in relation to a person’s height, weight, age, etc. I’ve been struggling to find any, and honestly I am not sure where to search beyond the initial Google search. Thanks for the help!
submitted by /u/SomethingOrAnyone
[link] [comments]
Hi all,
I need a non-aggregated, individual-level, non-synthesized dataset, in English and from a credible source, with a combination of qualitative and quantitative data.
This is for an assignment and the lecturer is not amenable to any deviations from the above.
I thought I could use census data but a lot of the data I found is aggregated. Surveys are often simulated.
Any help at all would be appreciated. Thank you!
submitted by /u/reader20not
[link] [comments]
Hey guys,
I’m looking for a finished survey with over 100 questions. It doesn’t have to have a lot of participants, but the more, the better, of course. It’s for my thesis in mathematics. There is a new theory we are trying to use in practice, so I don’t care what field it is in or how old it is. Any hint or dataset would be appreciated.
Thanks
submitted by /u/juggerjaxen
[link] [comments]
Hello,
I am looking for an Australian stock market dataset covering all companies, for a client project. The client gave me a link to the Yahoo Finance website, as they need the company stock data from there. At first I thought of scraping, but the site may change and I need dynamic data. Is there any API for stock data on all Australian companies?
submitted by /u/Turbulent_Setting_59
[link] [comments]
We are currently looking for a retail dataset (e.g. Walmart, Target, etc.) that contains sales information, some dummy customer information, and store information, so that we can do some analytics on it. We are looking for more than 500 MB of data so we can present it as a big data project.
submitted by /u/maximus_deUX
[link] [comments]
So, the thing is, I want a little bit of code that checks what the user enters as the name of the player character. If it matches an offensive word, the game will trigger a secret easter egg that makes a few funny comments and then basically says, you can’t do that, bro.
I have the code set up and working. But it’s so hard to manually type in everything I can think of.
I just need a list of those words. I found a list on the internet, but the sad thing is… well… according to the list, ‘arab’ is an offensive word. So is ‘black’ or ‘whites’.
I just need a good list, with solid words, that will NOT cause any controversies.
submitted by /u/INGENAREL
[link] [comments]
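One way to work with an over-broad third-party list, as in the post above, is to keep it but subtract your own allowlist, and to compare whole names rather than substrings so innocent names that merely contain a listed word are not flagged. This is only a sketch of that idea: the blocklist entries are neutral placeholders, the allowlist holds the three terms the post mentions, and the function names and substitution table are assumptions.

```python
import re

# Placeholder entries standing in for a third-party word list.
BLOCKLIST = {"badword", "rudeword"}

# Terms the post says should not be treated as offensive.
ALLOWLIST = {"arab", "black", "whites"}

# Undo a few common character substitutions before comparing (an assumed, partial table).
SUBSTITUTIONS = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s"}
)


def normalise(name):
    """Lower-case, undo simple substitutions, and drop everything but letters."""
    lowered = name.casefold().translate(SUBSTITUTIONS)
    return re.sub(r"[^a-z]", "", lowered)


def is_blocked(name, blocklist=BLOCKLIST, allowlist=ALLOWLIST):
    """Return True only if the whole normalised name is a blocked word."""
    candidate = normalise(name)
    if candidate in allowlist:
        return False
    return candidate in blocklist


for name in ["Alice", "Black", "B4dW0rd", " bad-word "]:
    print(repr(name), is_blocked(name))
```

Exact matching misses a listed word that is padded with extra letters; that is the trade-off for avoiding false positives on ordinary names.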
Using a dataset on book sales from Kaggle, I did my best to do a simple analysis of the data, with charts, calculations, etc. What am I missing? What should I include? Be brutal, I have thick skin.
https://github.com/Blion6868/Data_Visualization/blob/main/Book_Analysis.ipynb
submitted by /u/Tyron_Slothrop
[link] [comments]