Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Looking For Survey With +100 Questions

Hey guys,

I’m looking for a finished survey with over 100 questions. It doesn’t have to have a lot of participants, but the more, the better of course. It’s for my thesis in mathematics. There is a new theory we are trying to use in practice. So I don’t care what field it is in or how old it is. Any hint or Dataset would be appreciated.

Thanks

submitted by /u/juggerjaxen

Looking For Australian Stock Dataset

Hello,

I am looking for an Australian stock market dataset covering all companies, for a client project. They gave me a link to the Yahoo Finance website, as they need company stock data from there. At first I thought of scraping, but the site may change and I need dynamic data. Is there any API that provides stock data for all Australian companies?

submitted by /u/Turbulent_Setting_59

I Need A List Of Offensive Words And Slurs

So, the thing is, I want a little bit of code that checks what the user is inputting as the name of their player character. If it matches an offensive word, the game will throw a secret easter egg commenting funny things and then basically saying, you can’t do that bro.
I have the code set up and working. But it’s so hard to manually input everything I can think of.
I just need a list of those words. I found a list on the internet, but the sad thing is… well… according to the list, ‘arab’ is an offensive word. So is ‘black’ or ‘whites’.
I just need a good list, with solid words, that will NOT cause any controversies.
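One way to live with an over-broad downloaded list is to keep it as-is and subtract a small allowlist of entries that should not trigger the easter egg. The following is only a minimal sketch of that idea in Python: the names `BLOCKLIST`, `ALLOWLIST`, `normalize` and `is_blocked` are hypothetical, and the blocklist entries are placeholders, not a real list.

```python
import re

# Placeholder entries only (assumption); a real list would be loaded from a file.
BLOCKLIST = {"badword", "anotherbadword", "arab", "black", "whites"}
# Entries from the downloaded list that should NOT trigger the easter egg.
ALLOWLIST = {"arab", "black", "whites"}


def normalize(name: str) -> str:
    """Lowercase the name and drop everything except the letters a-z."""
    return re.sub(r"[^a-z]", "", name.casefold())


def is_blocked(name: str) -> bool:
    """Return True when the normalized name is a blocked, non-allowlisted word."""
    return normalize(name) in (BLOCKLIST - ALLOWLIST)


if __name__ == "__main__":
    for candidate in ("BadWord!", "Black", "Alice"):
        print(candidate, is_blocked(candidate))
```

This only matches the whole name against the list; catching offensive words embedded inside longer names would need a different matching strategy and tends to produce more false positives.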

submitted by /u/INGENAREL

Trying To Create A SAS Dataset And Not Making Any Progress

This is not homework; this is for my job. I am working on creating a SAS database for research. I am out of options and almost out of time. I am losing my mind trying to make this data set work. I am editing this because I am hoping that I am creating a good data set, but right now I can’t run any of the analysis that my employer needs by Tuesday. Last week he told me that it was not up to par, and I had to rewrite the dataset. I did, but now of course it does not work. I could lose my job. Does anyone have any ideas for something like this?

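data ListA;
    /* DSD with a space delimiter strips the quotes around each title;
       TRUNCOVER keeps INPUT from reading into the next line when a line runs short. */
    infile datalines dsd dlm=' ' truncover;
    /* The colon modifier makes $50. stop at the delimiter instead of always
       reading 50 columns, which would swallow the theme flags. */
    input Category MovieTitle :$50. Mental_Change Sickness_Disease Relationships_Timeline Family_Based_Relationships Health_Wellness Intimacy Employment Money Maslow Government Computers Multicultural Animals Diversity Children Miscellaneous;
    /* Each line below has 15 flags for the 16 theme variables, so the last
       variable (Miscellaneous) is read as missing. */
    datalines;
1 "Movie_A" 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0
2 "Movie_B" 0 0 0 1 1 0 1 1 0 0 0 0 0 0 0
3 "Movie_C" 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0
4 "Movie_D" 0 1 1 0 0 1 0 1 0 0 0 0 0 0 0
5 "Movie_E" 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
7 "Movie_F" 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0
8 "Movie_G" 1 0 0 1 0 1 0 1 0 0 1 0 0 0 1
9 "Movie_F" 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0
;
run;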

I have also tried running this:

data ListA;
    infile datalines dsd dlm=' ' truncover;
    /* In an INPUT statement, "name=" requests named input; it does not number
       the variables, so the "=1" ... "=15" suffixes are dropped here. The
       statement also needs its closing semicolon. */
    input Category MovieTitle :$50. Mental_Change Sickness_Disease Relationships_Timeline Family_Based_Relationships Health_Wellness Intimacy Employment Money Maslow Government Computers Multicultural Animals Diversity Children Miscellaneous;
    datalines;
1 "Movie_A" 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0
2 "Movie_B" 0 0 0 1 1 0 1 1 0 0 0 0 0 0 0
3 "Movie_C" 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0
4 "Movie_D" 0 1 1 0 0 1 0 1 0 0 0 0 0 0 0
5 "Movie_E" 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
7 "Movie_F" 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0
8 "Movie_G" 1 0 0 1 0 1 0 1 0 0 1 0 0 0 1
9 "Movie_F" 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0
;
run;

I have no idea why I can’t write this dataset and run the statistical tests on it. I am debating doing them by hand. Does anyone have any ideas? This is very important for my career. I spend all of my time watching videos on SAS and have spent hundreds on books. I have tried writing this data set ten times. The last dataset I wrote worked just fine, but it wasn’t what they were looking for. It was much smaller! I have over 200 movies that I need to run on these themes by tomorrow. I will take any SAS ideas or tips, or even tricks for doing it by hand at this point.

I have the SAS dataset written out all the way, with each movie and each category, but nothing works. It also does not help that I get these messages when I am using SAS on one of the provided devices:

NOTE: Unable to open SASUSER.REGSTRY. WORK.REGSTRY will be opened instead.

NOTE: All registry changes will be lost at the end of the session.

WARNING: Unable to copy SASUSER registry to WORK registry. Because of this,

WARNING: you will not see registry customizations during this session.

NOTE: Unable to open SASUSER.PROFILE. WORK.PROFILE will be opened instead.

NOTE: All profile changes will be lost at the end of the session.

NOTE: This SAS session is using a registry in WORK. All changes will be lost at the end of this

NOTE: session.

NOTE: Unable to open SASUSER.PROFILE. WORK.PROFILE will be opened instead.

NOTE: All profile changes will be lost at the end of the session.

It also does not help that SAS does not save on any of the devices I have so I can no longer turn off either of my computers.

submitted by /u/Rajah_1994

Data Set For A Fitness Related Project

Hello

I have to do a project for a data science class at uni and wanted to investigate whether there is a correlation between going to the gym (or doing any kind of sport) and leading a happy (or satisfied) life. I am new to data science and I’m having a hard time finding fitting data sets.

Can anybody help me find a data set which covers both topics? Maybe also something that compares different types of sport with how happy people feel.

I would really appreciate it if somebody could help me out!

submitted by /u/WaiBouNj

Dataset For Programming Mistakes From All Experience Levels

I am building a project and I want to fine-tune an LLM to incorporate it as a ChatBot.

The ChatBot will deliver feedback to students who submit programming solutions for exercises they are solving. I want to train the ChatBot to give feedback in a specific way: not giving the correct answer explicitly, not answering questions unrelated to the domain, and giving hints when a student asks for them.

I couldn’t find a dataset close to what I need. Obviously I will need to clean any dataset that I find to match my needs perfectly.

If you know of any dataset that might help me with this, or any way that I can automate the generation of a mock dataset, please let me know. ChatGPT has limitations and I wasn’t able to make it generate the number of examples I need.

submitted by /u/iTsObserv

Looking For A Holistic Picture Of Government Funds To Higher Educational Institutions

Writing a paper on the financing of higher educational institutions in the US. I’m able to find good data on grants federal and state governments issue, but not data which is inclusive of loans disbursed to students which are used to pay for tuition. Really I’m looking to evaluate just how much American higher ed relies on the government for operating revenue.

Any suggestions? I know this ask isn’t very specific but I’m still in the research and surveying portion of this paper so really any good sources on federal student loans/university income streams would be helpful.

submitted by /u/throwaway1819181972

How Should I Handle The Fact That This Dataset Is Super Unbalanced?

I want to use the Bot-IoT dataset to train a model that predicts data exfiltration attacks, but I am having some issues.

I have four versions of this dataset:

Full dataset with around 46 million rows, but only 126 rows are categorized as data exfiltration, and of those 126 only 8 are not classified as attack.
5% of the full dataset with around 3.6 million rows, but only 6 rows are categorized as data exfiltration, and none of those 6 is classified as non-attack.
Partial dataset with around 1 million rows and only the 10 best features, but no row is categorized as data exfiltration.
A subset of the full dataset with only the rows categorized as data exfiltration. The major problem with this subset is its size, since it only has 126 rows, and the fact that it is unbalanced, since only 8 rows are classified as non-attack.

Which of these datasets should I use, and how should I treat the data?
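One common starting point for an imbalance like this is to weight each class inversely to its frequency, so that the rare class counts for more in the training loss. The sketch below uses the heuristic `n_samples / (n_classes * class_count)` and plugs in the counts quoted for the 126-row subset (118 attack, 8 non-attack). The function name and label strings are hypothetical, and whether weighting is the right treatment for this dataset is an assumption, not something the post settles.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Counts taken from the post: 126 exfiltration rows, 8 of them non-attack.
labels = ["attack"] * 118 + ["non_attack"] * 8
weights = balanced_class_weights(labels)

if __name__ == "__main__":
    for cls, weight in weights.items():
        print(cls, round(weight, 4))
```

Many training libraries accept a per-class weight mapping of this kind. With only 8 minority rows, any evaluation would still rest on very few examples, whatever weighting is used.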

submitted by /u/fabiopires10

Auto-theft Numbers (I Am Looking For Historical Data Sets For Auto-Theft In Ontario)

Hello,

I am working on a school project and I need help with finding data on the number of auto thefts by city. I am trying to use police websites, but they usually only have the last 12 months available or an aggregated number. The only place I could find historical data was the City of Toronto (https://data.torontopolice.on.ca/pages/auto-theft). Does anybody know where or how I can get more data for other cities in Ontario (ones with a population of a couple thousand)? Any help is greatly appreciated.

Cheers

submitted by /u/One_Assumption6321

Population Growth Datasets For Math Modelling

Hello there. I am doing a school assignment where I will try modelling two types of populations, one with logistic functions, the other with the Lotka-Volterra equations. Trying to find datasets has proven to be unexpectedly difficult, so I thought I might ask for help here.

I am thus looking for two types of datasets:

One is the sort which can be modelled using logistic functions, that is, a population that increases exponentially at first but is limited by some sort of carrying capacity. For example, bacteria population growth when left alone in a confined area.
The other is a predator-prey dataset to use the Lotka-Volterra equations on. For example, how the populations of deer and wolves have changed together in the same area over time.
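For reference, the two models named above can be sketched in a few lines of Python. This assumes the standard textbook forms, dP/dt = r·P·(1 − P/K) for logistic growth and dx/dt = αx − βxy, dy/dt = δxy − γy for Lotka-Volterra. The function names and every parameter value are illustrative assumptions, not fitted to any dataset.

```python
import math


def logistic(t, p0, r, k):
    """Closed-form solution of dP/dt = r*P*(1 - P/K) with P(0) = p0."""
    return k / (1 + (k - p0) / p0 * math.exp(-r * t))


def lotka_volterra(prey0, pred0, alpha, beta, delta, gamma, dt, steps):
    """Integrate the predator-prey equations with fixed-step RK4.

    dx/dt = alpha*x - beta*x*y   (prey)
    dy/dt = delta*x*y - gamma*y  (predator)
    Returns a list of (prey, predator) pairs, one per time step.
    """
    def deriv(x, y):
        return alpha * x - beta * x * y, delta * x * y - gamma * y

    x, y = prey0, pred0
    path = [(x, y)]
    for _ in range(steps):
        k1x, k1y = deriv(x, y)
        k2x, k2y = deriv(x + dt / 2 * k1x, y + dt / 2 * k1y)
        k3x, k3y = deriv(x + dt / 2 * k2x, y + dt / 2 * k2y)
        k4x, k4y = deriv(x + dt * k3x, y + dt * k3y)
        x += dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        y += dt / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        path.append((x, y))
    return path


if __name__ == "__main__":
    # Illustrative parameters only.
    print(logistic(10.0, p0=10.0, r=0.5, k=1000.0))
    path = lotka_volterra(10.0, 5.0, alpha=1.0, beta=0.1,
                          delta=0.075, gamma=1.5, dt=0.01, steps=1500)
    print(path[-1])
```

Fitting either model to real observations would then mean choosing the parameters that minimise the difference between these curves and the measured populations.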

Any help would be greatly appreciated.

submitted by /u/AblazeOwl26

[self-promotion] US Federal Disaster Declarations And Flood Policies/claims Dataset From FEMA And NFIP

Our team at Cybersyn just released new data from the Federal Emergency Management Agency (FEMA) and the National Flood Insurance Program (NFIP) to Weather and Environmental Essentials on Snowflake Marketplace.

Added US federal disaster declarations from the Federal Emergency Management Agency (FEMA), including disaster name, type, date, public assistance funding, impacted geographies, and work orders issued by FEMA to other government agencies for emergency response support.
Added flood policies and claims from the National Flood Insurance Program (NFIP), an insurance program managed by FEMA that provides flood insurance to property owners, renters and businesses. Includes features of the covered property, flood event, cost of damage, insurance payouts, building and contents insurance coverage, and policy deductibles, rates and duration.

submitted by /u/aiatco2