Are there any up-to-date datasets where I can get all posts and comments, with upvotes, from a Reddit sub?
I saw pushshift but it doesn’t seem to be up to date.
submitted by /u/Pan000
Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?
New data, 2012-2023. Based on about 10 million active US Twitter accounts per year.
Peer-reviewed, open-access article describing the previous version: https://doi.org/10.1371/journal.pone.0260185
Blog: https://jasonjones.ninja/ipseology-central/blog/introducing-jjjitv2.html
submitted by /u/jasonjonesresearch
I accidentally ran a separate one-sample t-test on each of my before and after variables, but I realise I was meant to run a paired-sample t-test.
Is it possible to get the paired-sample result from the two one-sample t-tests I already calculated?
submitted by /u/PinkTequilaaa
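On the question above: a paired t-test is the same computation as a one-sample t-test applied to the per-subject differences (after minus before). The mean difference can be recovered from the two separate means, but the standard deviation of the differences cannot, because it depends on how each subject's two values move together. So the test generally has to be rerun on the raw pairs. A minimal sketch using only Python's standard library follows; the `before` and `after` values are made-up illustration data, not from the post.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for a paired t-test.

    This is a one-sample t-test of the differences against zero.
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1


# Hypothetical measurements for five subjects (illustration only).
before = [10.0, 12.0, 9.0, 11.0, 13.0]
after = [12.0, 13.0, 11.0, 14.0, 13.0]

t_stat, dof = paired_t(before, after)
print(f"t = {t_stat:.3f}, df = {dof}")
```

The p-value then comes from a t distribution with `dof` degrees of freedom, which most statistics packages will look up directly.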
Could anyone help me find a dataset of daily tourism levels in Budapest in the period January-March 2015?
If too specific, tourism levels in Hungary as a whole works too.
submitted by /u/noraa_g
Please comment a link to the dataset if you have one! (Any help would be appreciated.)
submitted by /u/IMPuzzled2
I have been studying a paper and I noticed that they were using video from a dataset called I2R. I tried searching for this dataset but wasn’t able to find it. Does it have a different name or is this dataset not available publicly?
Specifically, the paper mentioned the WaterSurface dataset, Campus, Waving trees, fountain, curtain and switch light datasets.
I am looking for these datasets to apply a background/foreground separation algorithm.
submitted by /u/Curious_Analyst986
For my thesis, I need salmon fillet images that are fresh, stale, and rotten for training my machine learning model. Do any of you know where I can find images for these?
submitted by /u/TheGreatJuan12
The Daily Treasury Statement (DTS) dataset contains a series of tables showing the daily cash and debt operations of the U.S. Treasury. The data includes operating cash balance, deposits and withdrawals of cash, public debt transactions, federal tax deposits, income tax refunds issued (by check and electronic funds transfer (EFT)), short-term cash investments, and issues and redemptions of securities. All figures are rounded to the nearest million.
Explore the data online: https://app.gigasheet.com/spreadsheet/U-S–Treasury-Daily-Cash-Debt–Oct-2005–Apr-2023-/820a1527_c8f0_4ae6_a8a6_b841d327c093
submitted by /u/n1nja5h03s
Hello, I am working on a project for graduate school on Reddit as a social network from 2013 to 2023. I am using a previous database of 2,500 subreddits and the top 1,000 posts from each from 2013, and I am recollecting it for 2023. For each post I have the uploader, the post score, a list of all commenters, and each commenter’s collective score in that post.
Each node will be a subreddit and the ties will be based on the commenters they have in common. How should I measure this?
1. Each tie is unidirectional and weighted based on the number of commenters who have ever left comments on both of those subreddits.
2. Each tie is unidirectional and weighted based on the total score of all comments in which the commenter has posted in either subreddit.
^ This one sounds more substantial but raises a few concerns such as what if Sub A is a huge subreddit and Sub B is a relatively small subreddit? In Sub A the same commenter has say 2K upvotes but in Sub B they have 300 upvotes, which is more than anyone else on that sub.
submitted by /u/admaciaszek
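The shared-commenter weighting described above can be sketched as a simple set intersection per subreddit pair. The Jaccard index returned alongside the raw count is not from the post; it is one possible normalisation, offered here as an assumption, for the concern about a huge subreddit paired with a small one. The subreddit names and user names are hypothetical.

```python
from itertools import combinations


def shared_commenter_ties(commenters_by_sub):
    """Map each subreddit pair to (shared commenter count, Jaccard index).

    Pairs with no commenters in common get no tie at all.
    """
    ties = {}
    for a, b in combinations(sorted(commenters_by_sub), 2):
        users_a = commenters_by_sub[a]
        users_b = commenters_by_sub[b]
        shared = users_a & users_b
        if not shared:
            continue
        union = users_a | users_b
        # Raw count favours large subreddits; Jaccard scales by combined size.
        ties[(a, b)] = (len(shared), len(shared) / len(union))
    return ties


# Hypothetical commenter sets (illustration only).
commenters_by_sub = {
    "SubA": {"u1", "u2", "u3", "u4", "u5", "u6"},
    "SubB": {"u1", "u2"},
    "SubC": {"u6", "u7"},
}

ties = shared_commenter_ties(commenters_by_sub)
for pair, (count, jaccard) in ties.items():
    print(pair, count, round(jaccard, 3))
```

A score-based weighting could reuse the same loop by summing comment scores over the shared commenters instead of counting them.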
This Shopify Benchmarks data includes a cohort of Shopify store sales, website engagement, and advertising metrics at the store category and subcategory level. This eCommerce data is made up of aggregated sales and web analytics for thousands of Shopify stores globally. Additionally, the dataset includes stores’ total Google Ad spend on search ads, embedded display ads, and more from Google Ad Manager.
Sales and engagement metrics: revenue, transaction count, website sessions, website page views.
Advertising metrics: ad spending, ad clicks, ad views (impressions).
Free trial available if you have a Snowflake account.
submitted by /u/aiatco2
We’ve developed an AI voice bot app that can mimic the voices of celebrities like Joe Biden, Donald Trump, Alex Jones, Elon Musk, and Scarlett Johansson. While we’re proud of the technology, we’re also concerned about the potential ethical issues when users have conversations that might trick real people and lead to controversial outcomes.
For example, a user recently shared their experience using the app to imitate Joe Biden in a conversation with a friend. The AI-generated “Biden” endorsed a bizarre policy, like replacing the U.S. national anthem with the “Baby Shark” song. The friend was genuinely convinced they were speaking with the President, and in shock, they shared the call on social media. The story quickly gained traction, leading to heated debates and confusion among people who didn’t realize it was an AI-generated conversation.
These incidents have raised questions about the ethical implications of using such an accurate AI technology to impersonate living people, particularly when it can deceive others and potentially create controversial situations.
As AI enthusiasts, we’re eager to hear your thoughts on the ethical boundaries of AI-generated celebrity voices. We want to ensure that we’re using this technology responsibly and respecting the boundaries of both users and the individuals being impersonated.
TL;DR: Our AI voice bot app can convincingly mimic celebrity voices and has caused controversial situations by fooling real people in conversations, raising ethical concerns.
What are your opinions on the ethical limits of AI when it comes to impersonating living people and potentially creating controversial situations?
submitted by /u/malaika109
Hey guys, do you know if there is any volumetric dataset for medical report generation?
submitted by /u/Suspicious-Spend-415
I recently did a long engagement with a small data provider in the real estate space. I was surprised by how much asset managers pay for data sets.
Is there a resource or site that details the value of different data sets?
submitted by /u/bsbing
Does anyone understand census data enough to help me pull income and crime rates at the zip code level or even at the census tract level? I’m writing a paper on the relationship between crime and income and I want the data to be as granular as possible.
Alternatively, does the Census Bureau have a department to help with these kinds of requests? Thanks!
submitted by /u/Guavifo
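For the income half of the question above, one possible starting point is the Census Bureau's data API for the American Community Survey 5-year estimates, which can be queried at the census tract level. Crime rates would have to come from a separate source and be joined on geography. The sketch below only builds a query URL and makes no network call. The year, the variable code `B19013_001E` (median household income), and the state and county FIPS codes are assumptions chosen for illustration and should be checked against the API's variable list.

```python
from urllib.parse import urlencode

# ACS 5-year endpoint pattern; the year is filled in per request.
BASE_URL = "https://api.census.gov/data/{year}/acs/acs5"


def acs_tract_url(year, variables, state_fips, county_fips, api_key=None):
    """Build a query URL for all census tracts in one county."""
    params = [
        ("get", ",".join(["NAME", *variables])),
        ("for", "tract:*"),
        ("in", f"state:{state_fips}"),
        ("in", f"county:{county_fips}"),
    ]
    if api_key:
        params.append(("key", api_key))
    # Keep ':', '*' and ',' unescaped so the URL stays readable.
    return BASE_URL.format(year=year) + "?" + urlencode(params, safe=":*,")


# Example values (assumptions): 2021 estimates, state 06, county 037.
url = acs_tract_url(2021, ["B19013_001E"], "06", "037")
print(url)
```

The response is a JSON array whose first row holds the column names, so it loads easily into a table keyed by state, county, and tract.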
Greetings all! I’m working on something that requires me to look up a specific search term and track how that term has grown in popularity over time. Google Trends makes something like that very easy, but I’m wondering if there is something that I can use to look at the popularity of a search term over time in social media or in press articles in the same manner (e.g. tweets per day/week on orange juice, or number of articles published daily on telescopes). Thank you all!
submitted by /u/rocket__man_
Many states make millions of dollars selling their databases of driver’s licenses and car registrations. Some databases can be purchased for more than $100K.
My brother is doing some research for uni and got a grant, and he’s interested in purchasing data from his local DMV. He tried calling the DMV but got nowhere; he’ll try again next week. The process is definitely not transparent, and there are no clear instructions on how it works.
Does anyone have some experience purchasing data from the DMV?
submitted by /u/nobilis_rex_
Hi, I need a dataset for my econometrics project, and I’d like to analyze the effects of the 4-day work week experiment that took place in the UK last semester. It seems impossible to find any data online, only reports.
If the dataset isn’t publicly available, which would make my research impossible, do you have similar datasets to suggest, for example on reductions in the work week? Big thanks to anyone!
submitted by /u/Nikkibraga
Hi guys! Does anybody here have experience pulling data from Factset?
I have difficulties with the identifiers and with running reports:
1. I have a very long list of ISINs, too long to copy and paste one by one, so I paste all of them into the identifier search function and Factset returns a list of companies. However, the number of companies returned by Factset is much smaller than my ISIN list, so I need to find out which ones are missing. I cannot get the ISINs for the company list that Factset gave me to compare with my original ISIN list. Can we get the ISIN list from a given ticker/company name list? Or do you have any suggestions for solving this problem?
2. I need to run an ownership report for many companies. I don’t mind doing it many times (for example, dividing my company list into 100 small parts and running the report one by one), but I can’t repeat that thousands of times. Is there any way to run such a report from Factset?
Many thanks in advance!
submitted by /u/tunglth
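The bookkeeping side of the problem above (splitting a long identifier list into batches and finding which identifiers did not come back) can be sketched independently of Factset. This assumes the results can be exported with some identifier column to compare against, which the post says is the sticking point, so the `returned` list is hypothetical. The identifiers are placeholders, not real ISINs, and no Factset function is used or implied.

```python
def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def missing_identifiers(requested, returned):
    """Return requested identifiers absent from `returned`, in input order.

    Comparison ignores surrounding whitespace and letter case.
    """
    seen = {r.strip().upper() for r in returned}
    return [i for i in requested if i.strip().upper() not in seen]


# Placeholder identifiers (illustration only).
requested = ["XX0000000001", "XX0000000002", "XX0000000003",
             "XX0000000004", "XX0000000005"]
returned = ["xx0000000001", " XX0000000003", "XX0000000005"]

batches = list(chunked(requested, 2))
missing = missing_identifiers(requested, returned)
print(batches)
print(missing)
```

Each batch can be pasted or uploaded separately, and the `missing` list shows which identifiers need a second look.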
Hi everyone!
I am looking into ways of (preferably visually) analyzing the Robintrack dataset: https://robintrack.net/data-download , as unfortunately they closed down their API in 2020 and therefore no longer provide visualizations on their webpage, i.e., how many people were buying a stock at a certain time and how the value of the stock was changing during that time.
Looking at the data in the dataset, are you aware of a relatively easy way to approach this? It would be ideal to arrive at graphs with two lines, i.e., people buying the stock and the stock price, ideally with toggles/drop-down lists to play around with, like it used to be on the webpage.
Thank you for your help, it is really appreciated! 🙂
submitted by /u/obeseelk
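One way to prepare the two lines described above is to reduce the holdings data to one value per day and align it with a daily price series. The sketch assumes each per-ticker file is a CSV with `timestamp` and `users_holding` columns and timestamps like `2020-03-02 09:00:00`; that layout should be checked against the actual download. The price series is assumed to come from a separate source, and the sample values are invented.

```python
import csv
import io
from datetime import date, datetime


def daily_holders(csv_text):
    """Return {date: users_holding}, keeping the last reading of each day."""
    daily = {}
    rows = csv.DictReader(io.StringIO(csv_text))
    # Sorting by timestamp means later readings overwrite earlier ones.
    for row in sorted(rows, key=lambda r: r["timestamp"]):
        day = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S").date()
        daily[day] = int(row["users_holding"])
    return daily


def align(holders, prices):
    """Return sorted (date, users_holding, price) rows for shared dates."""
    shared_days = sorted(holders.keys() & prices.keys())
    return [(d, holders[d], prices[d]) for d in shared_days]


# Invented sample data (illustration only).
sample_csv = (
    "timestamp,users_holding\n"
    "2020-03-02 09:00:00,100\n"
    "2020-03-02 17:00:00,120\n"
    "2020-03-03 09:00:00,150\n"
)
prices = {date(2020, 3, 2): 10.5, date(2020, 3, 3): 11.0}

series = align(daily_holders(sample_csv), prices)
for day, holders, price in series:
    print(day, holders, price)
```

The aligned rows can then be passed to any plotting library as two lines on separate y-axes, with a drop-down that selects which ticker file to load.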
You can find it here: https://doi.org/10.5281/zenodo.7786485
It is associated with the work: https://www.mdpi.com/2076-3425/13/4/589
submitted by /u/kostpapar
Does anyone have the Cost Charge files for the National Inpatient Sample database for the years 2012 and 2013? Unfortunately, I can’t find my files for those years.
submitted by /u/to_go_far
I’m looking for data to build something like a “choose your next tweet” recommendation program. To do so, I need something to recommend to a user, but I somehow can’t find datasets of this type. Maybe you guys have an idea?
submitted by /u/Turbulent-Usual-352
Hey. I’m looking for a CSV data set with the number of flights per day in Canada from 2020 to 2021. If there’s none, is there a data set of the number of flights per day for Air Canada from 2020 to 2021? Thanks a lot, I need it for a project.
submitted by /u/ssldm