Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting, I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Free Platform For Finding Any Data Using LLM

Hi Everyone,

I created a platform that aggregates and stores data from across the web, and has an LLM chat assistant to help you find the data best suited to your use case.

I would be happy to hear any feedback, including how it compares to more traditional methods of finding data through a search bar.

Feel free to use it below and let me know :), hope it helps:

https://www.cognidex.net/

submitted by /u/XhoniShollaj

Looking For Specific Data Set For Multiple Regression

I need to find a data set that has variables that lend themselves to analysis by some form of multiple regression; it must have at least 15 cases per predictor; it must have at least 3 predictor variables; it should have both quantitative and categorical predictors; and it should have at least one quantitative dependent variable.

Is there a site where I can filter all these specifics?
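No site is named here that filters on all of these at once, but the requirements themselves are easy to check once a candidate dataset is downloaded. A minimal sketch in plain Python, assuming the data has already been loaded as a list of dictionaries; the function name `meets_regression_criteria` and the column names in the usage lines are hypothetical:

```python
from typing import Dict, List, Union

Value = Union[int, float, str]


def meets_regression_criteria(
    rows: List[Dict[str, Value]],
    outcome: str,
    predictors: List[str],
    min_cases_per_predictor: int = 15,
    min_predictors: int = 3,
) -> Dict[str, bool]:
    """Check a candidate dataset against the requirements listed in the post."""

    def is_quantitative(column: str) -> bool:
        # bool is a subclass of int in Python, so exclude it explicitly.
        return bool(rows) and all(
            isinstance(row[column], (int, float))
            and not isinstance(row[column], bool)
            for row in rows
        )

    quantitative = [p for p in predictors if is_quantitative(p)]
    categorical = [p for p in predictors if not is_quantitative(p)]
    return {
        "enough_predictors": len(predictors) >= min_predictors,
        "enough_cases": len(rows) >= min_cases_per_predictor * len(predictors),
        "mixed_predictor_types": bool(quantitative) and bool(categorical),
        "quantitative_outcome": is_quantitative(outcome),
    }


# Illustrative rows only; the column names are placeholders.
example_rows = [
    {"age": 20 + i, "income": 1000.0 * i, "region": "north" if i % 2 else "south", "score": float(i)}
    for i in range(45)
]
report = meets_regression_criteria(example_rows, "score", ["age", "income", "region"])
```

Each key in `report` maps to one of the stated requirements, so a failing dataset shows which condition it misses.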

submitted by /u/kevinalways

Data On EU Countries. Something Other Than UNdata And Eurostat?

Hello.
My goal is to find statistical information about different countries of the European Union (employment, crime, cost of living, immigration, social safety nets, etc.). However, I’m quite new to this and have no idea where to look.
I have found two major sources of data, Eurostat and UNdata, but I was wondering whether there are other sources out there that I couldn’t find on Google.

submitted by /u/420-big-chungus-kean

Is Searching For Datasets The Hardest Part? Looking For CSV Paired Real-world Dataset So I Can Run Some Python Analysis.

Hello, so I’ve been searching for over an hour on various repositories. I’m looking for a dataset that has before-and-after numerical results. It could be test grades before and after an intervention, blood pressure before and after an intervention, etc. Anything like that. I feel like I just don’t know how to search for this properly.
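For context, the usual analysis for this kind of before-and-after data is a paired comparison on the per-subject differences. A minimal sketch using only the standard library; the function name `paired_t_statistic` is hypothetical and the numbers are placeholders, not real measurements:

```python
import math
import statistics
from typing import Sequence, Tuple


def paired_t_statistic(
    before: Sequence[float], after: Sequence[float]
) -> Tuple[float, float, int]:
    """Return (mean difference, t statistic, degrees of freedom) for paired data."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    if len(before) < 2:
        raise ValueError("need at least two pairs")
    # One difference per subject: after minus before.
    diffs = [a - b for b, a in zip(before, after)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return mean_diff, t, len(diffs) - 1


# Placeholder values for illustration only.
before_scores = [1.0, 2.0, 3.0, 4.0]
after_scores = [2.0, 4.0, 3.0, 7.0]
mean_diff, t_value, dof = paired_t_statistic(before_scores, after_scores)
```

Any CSV with one row per subject and two numeric columns fits this shape, which may help when judging whether a dataset found on a repository is usable.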

submitted by /u/Enochwel

Non Aggregated Individual Level Dataset Needed Urgently

Hi all,

I need a non-aggregated, individual-level, non-synthesized dataset, in English and from a credible source. It should combine qualitative and quantitative data.

This is for an assignment and the lecturer is not amenable to any deviations from the above.

I thought I could use census data but a lot of the data I found is aggregated. Surveys are often simulated.

Any help at all would be appreciated. Thank you!

submitted by /u/reader20not

Looking For Survey With +100 Questions

Hey guys,

I’m looking for a finished survey with over 100 questions. It doesn’t have to have a lot of participants, but the more, the better of course. It’s for my thesis in mathematics. There is a new theory we are trying to use in practice, so I don’t care what field it is in or how old it is. Any hint or dataset would be appreciated.

Thanks

submitted by /u/juggerjaxen

Looking For Australian Stock Dataset

Hello,

I am looking for an Australian stock market dataset covering all companies, for a client project. The client gave me a link to the Yahoo Finance website, as they need company stock data from there. At first I thought of scraping it, but the site may change and I need dynamic data. Is there an API that provides stock data for all Australian companies?

submitted by /u/Turbulent_Setting_59

I Need A List Of Offensive Words And Slurs

So, the thing is, I want a little bit of code that checks what the user enters as the name of their player character. If it matches an offensive word, the game will throw a secret easter egg, commenting funny things and then basically saying, you can’t do that bro.
I have the code set up and working. But it’s so hard to manually enter everything I can think of.
I just need a list of those words. I found a list on the internet, but the sad thing is… well… according to the list, ‘arab’ is an offensive word. So is ‘black’ or ‘whites’.
I just need a good list, with solid words, that will NOT cause any controversies.
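One way to live with an imperfect downloaded list is to pair it with a small allowlist of words to drop, and to match only the whole name rather than substrings. A minimal sketch, assuming both lists are plain lists of words; `normalize` and `is_blocked` are hypothetical names and "badword" is a stand-in for a real entry:

```python
import re
import unicodedata
from typing import Iterable, Set


def normalize(name: str) -> str:
    """Lowercase, strip accents, and keep only the letters a-z."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z]", "", ascii_only.lower())


def is_blocked(name: str, blocklist: Iterable[str], allowlist: Iterable[str] = ()) -> bool:
    """True if the whole normalized name is blocked and not explicitly allowed."""
    blocked: Set[str] = {normalize(word) for word in blocklist}
    blocked -= {normalize(word) for word in allowlist}
    blocked.discard("")  # an empty name should never count as a match
    return normalize(name) in blocked


# Placeholder list; the two allowed words are the false positives named in the post.
downloaded_list = ["badword", "arab", "black"]
manually_allowed = ["arab", "black"]
triggers_easter_egg = is_blocked("B.a.d-Word", downloaded_list, manually_allowed)
```

Exact matching on the normalized name is a deliberate choice here: substring matching would also flag innocent names that happen to contain a listed word.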

submitted by /u/INGENAREL

Trying To Create A SAS Dataset And Not Making Any Progress

This is not homework; this is for my job. I am working on creating a SAS database for research. I am out of options and almost out of time, and I am losing my mind trying to make this data set work. I am editing this because I am hoping that I am creating a good data set, but right now I can’t run any of the analysis that my employer needs by Tuesday. Last week he told me that it was not up to par, and I had to rewrite the dataset. I did, but now of course it does not work. I could lose my job. Does anyone have any ideas for something like this?

data ListA;
input Category MovieTitle $50. Mental_Change Sickness_Disease Relationships_Timeline Family_Based_Relationships Health_Wellness Intimacy Employment Money Maslow Government Computers Multicultural Animals Diversity Children Miscellaneous;
Datalines;
1 "Movie_A" 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0
2 "Movie_B" 0 0 0 1 1 0 1 1 0 0 0 0 0 0 0
3 "Movie_C" 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0
4 "Movie_D" 0 1 1 0 0 1 0 1 0 0 0 0 0 0 0
5 "Movie_E" 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
7 "Movie_F" 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0
8 "Movie_G" 1 0 0 1 0 1 0 1 0 0 1 0 0 0 1
9 "Movie_F" 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0
;
run;

I have also tried running this:

data ListA;
input Category MovieTitle $50. Mental_Change=1 Sickness_Disease=2 Relationships_Timeline =3 Family_Based_Relationships=4 Health_Wellness=5 Intimacy=6 Employment=7 Money=8 Maslow=9 Government=10 Computers=11 Multicultural=12 Animals=13 Diversity=14 Children=14 Miscellaneous=15
Datalines;
1 "Movie_A" 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0
2 "Movie_B" 0 0 0 1 1 0 1 1 0 0 0 0 0 0 0
3 "Movie_C" 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0
4 "Movie_D" 0 1 1 0 0 1 0 1 0 0 0 0 0 0 0
5 "Movie_E" 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
7 "Movie_F" 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0
8 "Movie_G" 1 0 0 1 0 1 0 1 0 0 1 0 0 0 1
9 "Movie_F" 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0
;
run;

I have no idea why I can’t write this dataset and run the statistical tests on it. I am debating doing them by hand. Does anyone have any ideas? This is very important for my career. I spend all of my time watching videos on SAS and have spent hundreds on books. I have tried writing this data set ten times. The last dataset I wrote worked just fine, but it wasn’t what they were looking for. It was much smaller! I have over 200 movies that I need to run on these themes by tomorrow. I will take any SAS ideas or tips, or even tricks for doing it by hand at this point.

I have the SAS dataset written out all the way, with each movie and each category, but nothing works. It also does not help that when I am using SAS on one of the provided devices, I get these messages in the log:

NOTE: Unable to open SASUSER.REGSTRY. WORK.REGSTRY will be opened instead.
NOTE: All registry changes will be lost at the end of the session.
WARNING: Unable to copy SASUSER registry to WORK registry. Because of this,
WARNING: you will not see registry customizations during this session.
NOTE: Unable to open SASUSER.PROFILE. WORK.PROFILE will be opened instead.
NOTE: All profile changes will be lost at the end of the session.
NOTE: This SAS session is using a registry in WORK. All changes will be lost at the end of this
NOTE: session.
NOTE: Unable to open SASUSER.PROFILE. WORK.PROFILE will be opened instead.
NOTE: All profile changes will be lost at the end of the session.

It also does not help that SAS does not save on any of the devices I have so I can no longer turn off either of my computers.
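One thing that can be checked outside SAS is whether each data line carries as many values as the INPUT statement names variables. In the first attempt above, the INPUT statement lists 16 theme variables after MovieTitle, while each data line has 15 values after the title. The Python sketch below only counts fields; it is a diagnostic aid under the assumption that titles contain no spaces (as in the lines shown), not a fix for the DATA step, and the name `field_count_mismatches` is hypothetical:

```python
from typing import List, Tuple

# Variable names copied from the first INPUT statement in the post.
INPUT_VARIABLES = (
    "Category MovieTitle Mental_Change Sickness_Disease Relationships_Timeline "
    "Family_Based_Relationships Health_Wellness Intimacy Employment Money Maslow "
    "Government Computers Multicultural Animals Diversity Children Miscellaneous"
).split()

# Two of the data lines from the post.
DATALINES = [
    '1 "Movie_A" 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0',
    '8 "Movie_G" 1 0 0 1 0 1 0 1 0 0 1 0 0 0 1',
]


def field_count_mismatches(lines: List[str], expected: int) -> List[Tuple[int, int]]:
    """Return (line number, field count) for each line whose count differs."""
    mismatches = []
    for number, line in enumerate(lines, start=1):
        count = len(line.split())  # whitespace-delimited, like SAS list input
        if count != expected:
            mismatches.append((number, count))
    return mismatches


mismatches = field_count_mismatches(DATALINES, len(INPUT_VARIABLES))
```

When a line has fewer values than variables, SAS list input by default continues reading on the next data line, which can scramble the observations. Separately, an informat written as `$50.` without the colon modifier reads a fixed 50 columns rather than stopping at the next blank, so `MovieTitle` may be swallowing the numeric values; `:$50.` is the list-input form. Both points are worth confirming against the SAS log for this program.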

submitted by /u/Rajah_1994

Data Set For A Fitness Related Project

Hello

I have to do a project for a data science class at uni and wanted to investigate whether there is a correlation between going to the gym (or doing any kind of sport) and leading a happy (or satisfied) life. I am new to data science and I’m having a hard time finding fitting data sets.

Can anybody help me find a data set which covers both topics? Maybe also something that compares different types of sport with how happy people feel.

I would really appreciate it if somebody could help me out!

submitted by /u/WaiBouNj

Dataset For Programming Mistakes From All Experience Levels

I am building a project and I want to fine-tune an LLM to incorporate it as a ChatBot.

The ChatBot will deliver feedback to students who submit programming solutions for exercises they are solving. I want to train the ChatBot on a specific way to give feedback like not giving the correct answer explicitly and not answering questions unrelated to the domain, and also being able to give hints when a student asks for it.

I couldn’t find a dataset close to what I need. Obviously I will need to clean any dataset that I find to match my needs perfectly.

If you know of any dataset that might help me with this, or any way that I can automate the generation of a mock dataset, please let me know. ChatGPT has limitations and I wasn’t able to make it generate the number of examples I need.

submitted by /u/iTsObserv

Looking For A Holistic Picture Of Government Funds To Higher Educational Institutions

Writing a paper on the financing of higher educational institutions in the US. I’m able to find good data on grants federal and state governments issue, but not data which is inclusive of loans disbursed to students which are used to pay for tuition. Really I’m looking to evaluate just how much American higher ed relies on the government for operating revenue.

Any suggestions? I know this ask isn’t very specific but I’m still in the research and surveying portion of this paper so really any good sources on federal student loans/university income streams would be helpful.

submitted by /u/throwaway1819181972

How Should I Handle The Fact That This Dataset Is Super Unbalanced?

I want to use the Bot-IoT dataset to train a model to predict data exfiltration attacks, but I am having some issues.

I have four versions of this dataset:

The full dataset, with around 46 million rows, of which only 126 are categorized as data exfiltration; of those 126, only 8 are not classified as attack.

The 5% sample of the full dataset, with around 3.6 million rows, of which only 6 are categorized as data exfiltration; none of those 6 is classified as non-attack.

A partial dataset with around 1 million rows and only the 10 best features, but with no rows categorized as data exfiltration.

A subset of the full dataset containing only the rows categorized as data exfiltration. The major problems with this subset are its size, since it has only 126 rows, and its imbalance, since only 8 rows are classified as non-attack.

Which of these datasets should I use, and how should I preprocess the data?
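To put a number on the imbalance, one common option is to weight each class inversely to its frequency during training. A minimal sketch, using the counts quoted above for the exfiltration-only subset (126 rows, 8 of them non-attack); the label names "attack" and "normal" and the function name `balanced_class_weights` are assumptions for illustration:

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def balanced_class_weights(labels: Iterable[Hashable]) -> Dict[Hashable, float]:
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must not be empty")
    total = sum(counts.values())
    return {label: total / (len(counts) * count) for label, count in counts.items()}


# Counts taken from the post: 126 exfiltration rows, 8 of them non-attack.
labels = ["attack"] * 118 + ["normal"] * 8
weights = balanced_class_weights(labels)
```

This is the same formula scikit-learn documents for its "balanced" class-weight heuristic. Weighting alone does not create information that 8 rows do not contain, so it is a starting point rather than an answer to which version to use.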

submitted by /u/fabiopires10

Auto-theft Numbers (I Am Looking For Historical Data Sets For Auto-Theft In Ontario)

Hello,

I am working on a school project and need help finding data on the number of auto thefts by city. I have been trying police websites, but they usually offer only the last 12 months or a single aggregated number. The only place I could find historical data was for the City of Toronto (https://data.torontopolice.on.ca/pages/auto-theft). Does anybody know where or how I can get more data for other cities in Ontario (ones with a population of a couple thousand)? Any help is greatly appreciated.

Cheers

submitted by /u/One_Assumption6321