Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Brica Has Available A Unique Database Of 17 Years Of Information Collection On All Cyber Security Issues And Risks

By far the most comprehensive dataset available anywhere. Complete and completely independent of any cyber security company. In other words, 100% complete, not one instance omitted. It includes all product software and hardware vulnerabilities, zero-days, PoCs, etc., all reported in as much detail as possible.

Each of the millions of items is verified, categorized under one or more of 19,000 tags, correlated, and given a criticality rating by human risk analysts. Fully up to date, with between 100 and 400 new risk items inserted daily. 24x7x365x17. Not a single day missed. Information on just about every risk event, organized and readable in English, in text format.

Items with non-English text are translated. The database covers not only cyber-security risks, threats and events but also industry-specific business risks for sectors such as banking, casinos, industrial computing, energy, military, healthcare, transportation, critical infrastructure, ICS, OT, IoT, etc.

In addition: all global threats; complete information on all malware, ransomware, and APTs from nation states, all collected, sorted and categorized; new threats to human health; all illegal drugs and developments therein; and all global product recalls (issues with vehicles, medical devices, risks to families and children, and so on).

Not just one item per event, but all available information on each topic, organized by date (latest first) and correlated to one or more tags: scams, geopolitical issues, global and government intelligence departments, initiatives, and much more.

We can also make specific sections, or the entire database, available as a unique AI Machine Learning platform. The most complete and comprehensive dataset for cyber and business risks available worldwide.

For more information:

[more-info@brica.de](mailto:more-info@brica.de)
https://brica.de

submitted by /u/CologicNZ

A Dataset For Detecting Powerlines From A Drone

The title basically says it. Does anyone know of a dataset for detecting powerlines in aerial imagery? That’s basically the requirement. An additional requirement would be that the powerlines are labeled by voltage, but it’s fine without.

I’m trying to create a drone that can avoid powerlines. If and when I get that working, I want to create a drone that charges from powerlines using induction. A university team did this and I want to replicate it. I have a good amount of experience, so I think it’s doable. At least getting a drone to avoid and maneuver well around powerlines is doable, I think.

Thanks!

submitted by /u/AapoL092

Do You Know Where I Can Get Data Sets For Soccer?

I am making a database (for uni). I need data for it, and I really don’t know where to get the specific stuff that I need.

Well, its schema is this:

Tournaments

TournamentID (Primary Key), TournamentName, TournamentYear, TournamentCountry, TournamentType, TournamentFormat, TournamentPrizeMoney

Leagues

LeagueID (Primary Key), LeagueName, LeagueCountry, LeagueWebsite, LeagueSponsor, LeaguePromotion, LeagueRelegation

Teams

TeamID (Primary Key), TeamName, TeamCity, TeamCountry, TeamLogo, TeamFounded, TeamStadium, TeamCaptain, LeagueID (Foreign Key)

Players

PlayerID (Primary Key), PlayerName, PlayerAge, PlayerNationality, PlayerHeight, PlayerWeight, PlayerMarketValue, PlayerPosition, TeamID (Foreign Key)

Stadiums

StadiumID (Primary Key), StadiumName, StadiumCity, StadiumCountry, StadiumAddress, StadiumSurface, StadiumRoof, StadiumCapacity

Matches

MatchID (Primary Key), HomeTeamID (Foreign Key), AwayTeamID (Foreign Key), MatchDate, MatchTime, MatchReferee, MatchAttendance, MatchWeather, HomeTeamScore, AwayTeamScore, TournamentID (Foreign Key), StadiumID (Foreign Key)

PlayerMatches

PlayerMatchID (Primary Key), PlayerID (Foreign Key), MatchID (Foreign Key), MinutesPlayed, GoalsScored, Assists, ShotsOnTarget, ShotsOffTarget, Saves, Tackles, PassesCompleted, YellowCards, RedCards, PlayerRating, PlayerManOfTheMatch, PlayerSubstitute
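
One way to sanity-check a schema like this before hunting for data is to create it in SQLite and confirm that the keys line up. The following is only a minimal sketch, not the poster's actual database: it covers three of the tables with a few columns each, the SQL types are guesses, the sample rows are made up, and `create_schema` is a hypothetical helper name.

```python
import sqlite3

# Column names come from the post; the SQL types are assumptions.
DDL = """
CREATE TABLE Leagues (
    LeagueID      INTEGER PRIMARY KEY,
    LeagueName    TEXT NOT NULL,
    LeagueCountry TEXT
);
CREATE TABLE Teams (
    TeamID   INTEGER PRIMARY KEY,
    TeamName TEXT NOT NULL,
    TeamCity TEXT,
    LeagueID INTEGER REFERENCES Leagues(LeagueID)
);
CREATE TABLE Players (
    PlayerID       INTEGER PRIMARY KEY,
    PlayerName     TEXT NOT NULL,
    PlayerPosition TEXT,
    TeamID         INTEGER REFERENCES Teams(TeamID)
);
"""


def create_schema(conn):
    """Create the tables; SQLite only enforces foreign keys when asked to."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(DDL)


conn = sqlite3.connect(":memory:")
create_schema(conn)

# Made-up placeholder rows, inserted parent-first so the keys resolve.
conn.execute("INSERT INTO Leagues VALUES (1, 'Example League', 'Exampleland')")
conn.execute("INSERT INTO Teams VALUES (1, 'Example FC', 'Example City', 1)")
conn.execute("INSERT INTO Players VALUES (1, 'Example Player', 'Forward', 1)")

# A player pointing at a team that does not exist should be rejected.
try:
    conn.execute("INSERT INTO Players VALUES (2, 'Orphan', 'Goalkeeper', 99)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

player_count = conn.execute("SELECT COUNT(*) FROM Players").fetchone()[0]
print(fk_enforced, player_count)
```

Loading order matters with a schema like this: Leagues before Teams, Teams before Players, and so on, so whatever source ends up supplying the data has to provide stable IDs (or names that can be mapped to IDs) for the parent tables.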

submitted by /u/No_Secretary1128

What Indicators/datasets Would I Look For To Analyze Bilateral Trade?

Hello everyone, I hope this is the correct place to ask.

As part of a university project I am looking at how Dutch trade with both Japan and China has been impacted positively / negatively by the Japan-China territorial disputes. I want to just get a very general overview of how the trade has varied over time.

But for the life of me, I can’t figure out what indicators or datasets to use for something so seemingly simple. I found BACI and UNCOM, but don’t know which one would be most useful or if they are even relevant.
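
For a general overview of how trade varies over time, yearly totals per partner and direction are usually enough. The sketch below assumes BACI-style columns (`t` year, `i` exporter, `j` importer, `k` product, `v` value) and numeric country codes 528 (Netherlands), 392 (Japan) and 156 (China); check both against the documentation of the release you download. The sample rows are invented, and `dutch_trade_by_year` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

# Assumed numeric country codes; verify against the dataset's country file.
NLD, JPN, CHN = 528, 392, 156

# Invented rows in an assumed BACI-like layout, for illustration only.
SAMPLE = """t,i,j,k,v
2010,528,392,010121,100.0
2010,528,156,010121,250.0
2010,392,528,870323,400.0
2011,528,392,010121,120.0
2011,528,156,010121,200.0
2011,156,528,851712,900.0
"""


def dutch_trade_by_year(rows):
    """Sum trade value per (year, partner, direction) for Dutch flows with Japan and China."""
    totals = defaultdict(float)
    for row in rows:
        year, exporter, importer = int(row["t"]), int(row["i"]), int(row["j"])
        value = float(row["v"])
        if exporter == NLD and importer in (JPN, CHN):
            totals[(year, importer, "export")] += value
        elif importer == NLD and exporter in (JPN, CHN):
            totals[(year, exporter, "import")] += value
    return dict(totals)


totals = dutch_trade_by_year(csv.DictReader(io.StringIO(SAMPLE)))
for key in sorted(totals):
    print(key, totals[key])
```

Plotting these totals against the dates of the disputes gives the overview; anything causal would need more than this.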

Thank you very much in advance, and warm regards.

submitted by /u/Special_Bite6093

Downloading With Wget From Kaggle Difficulties

Hey guys, I am trying to download the datasets from this link https://www.kaggle.com/datasets/debashis74017/stock-market-data-nifty-50-stocks-1-min-data/data/ACC_minute_data_with_indicators.csv for a school project, but I can’t use the Download button since I need to download through the terminal onto another machine. I’m trying to use wget <link>, but it keeps downloading the HTML overview page. How can I download this properly? Any help would be appreciated!!
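
A plain `wget` most likely returns the HTML page because the actual download sits behind a Kaggle login. One commonly used alternative is the official `kaggle` command-line tool, which needs to be installed and given an API token on the target machine. The sketch below only builds the command from the URL and does not run it; `kaggle_download_command` is a hypothetical helper, and treating the last path segment as a file name for `-f` is an assumption about this URL's shape.

```python
from urllib.parse import urlparse


def kaggle_download_command(url, dest="."):
    """Turn a Kaggle dataset URL into a `kaggle datasets download` argument list."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expected path shape: datasets/<owner>/<dataset>[/data/<file>]
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a Kaggle dataset URL: {url}")
    owner, dataset = parts[1], parts[2]
    cmd = ["kaggle", "datasets", "download", "-d", f"{owner}/{dataset}", "-p", dest]
    # Assumption: a trailing /data/<file> segment names a single file to fetch.
    if len(parts) >= 5 and parts[3] == "data":
        cmd += ["-f", parts[4]]
    return cmd


url = (
    "https://www.kaggle.com/datasets/debashis74017/"
    "stock-market-data-nifty-50-stocks-1-min-data/data/"
    "ACC_minute_data_with_indicators.csv"
)
cmd = kaggle_download_command(url)
print(" ".join(cmd))
```

The printed command can be pasted into the terminal on the remote machine, or the list can be passed to `subprocess.run` once the tool and token are in place.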

submitted by /u/Aggressive_Drink_530

How To Predict From Dataset (Text Based)

Hi, for my final year project at university I am using a dataset that contains LinkedIn job postings and all related data. I’ve used Power BI for dashboards and visualisations; now I want to predict which job is in most demand by selecting the industries given in the dataset. The data is text (English), and I don’t know how to do it or which model I should use. I have learned about some ML models in my ML course, but they all deal with numbers. How can I do prediction from text? Regards
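
Two ideas tend to come up for a question like this. First, "most in demand per industry" can be read as a counting problem rather than a prediction problem. Second, text becomes usable by numeric models once it is turned into numbers, for example with a bag-of-words vector. The sketch below illustrates both under stated assumptions: the postings are invented, the field names `industry` and `title` are hypothetical, and real data would need more careful cleaning of job titles.

```python
import re
from collections import Counter, defaultdict

# Invented postings; the field names are assumptions about the dataset.
postings = [
    {"industry": "Software", "title": "Data Engineer"},
    {"industry": "Software", "title": "Data Analyst"},
    {"industry": "Software", "title": "Data Engineer"},
    {"industry": "Finance", "title": "Risk Analyst"},
    {"industry": "Finance", "title": "Data Analyst"},
    {"industry": "Finance", "title": "Risk Analyst"},
]


def tokenize(text):
    """Lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())


def top_title_by_industry(rows):
    """Return the most frequent (title, count) pair for each industry."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["industry"]][row["title"].strip().lower()] += 1
    return {industry: c.most_common(1)[0] for industry, c in counts.items()}


def bag_of_words(text, vocabulary):
    """Represent text as word counts over a fixed vocabulary."""
    words = Counter(tokenize(text))
    return [words[w] for w in vocabulary]


top = top_title_by_industry(postings)
vocab = sorted({w for p in postings for w in tokenize(p["title"])})
vector = bag_of_words("Senior Data Engineer", vocab)
print(top)
print(vocab, vector)
```

A vector like this can be fed to the numeric models from an ML course; words outside the vocabulary (here "senior") are simply dropped, which is one of the known limits of the approach.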

submitted by /u/Parking-Sun-8979

Dataset Of Images Of Brain Tumors Or Brain Damage

Hi! I am a final year student of computer engineering and I want to do a final degree project (TFG) on artificial intelligence applied to a more “medical” field: a model for recognizing and predicting brain tumors, brain damage from head injuries, and brain disorders or diseases from images. I have been searching platforms like Kaggle but can’t find datasets for this purpose. Do you know of any resource for obtaining images of this type?

submitted by /u/Chard_5151

[Synthetic] [self-promotion] Releasing High Quality Text -> SQL Dataset To Help Improve LLM Performance W/SQL Tasks

Hey all, co-founder at Gretel.ai here. We are thrilled to release a high-quality synthetic dataset aimed at helping LLMs improve performance when working with SQL data and queries. Details and links are below; we would love to hear any feedback!

Our blog: https://gretel.ai/blog/synthetic-text-to-sql-dataset
Get the dataset on Hugging Face: https://huggingface.co/datasets/gretelai/synthetic_text_to_sql

The dataset includes:
* 105,851 records partitioned into 100,000 train and 5,851 test records
* ~23M total tokens, including ~12M SQL tokens
* Coverage across 100 distinct domains/verticals
* Comprehensive array of SQL tasks: data definition, retrieval, manipulation, analytics & reporting
* Wide range of SQL complexity levels, including subqueries, single joins, multiple joins, aggregations, window functions, set operations
* Database context, including table and view create statements
* Natural language explanations of what the SQL query is doing
* Contextual tags to optimize model training

submitted by /u/meowterspace42

Best Way To Learn About Data Analytics

Hi, I’m graduating this year. I have a good grip on SQL, Python, and all computer science fundamentals, and I’ve made two projects with Power BI using readily available datasets. I wanted to get into data engineering, but I’ve heard from many people that data engineering is not a beginner’s role and that I need to start as a data analyst. If that’s correct, which certification is best for learning data analytics: Google, IBM, or Microsoft? I know the best way to learn is by making projects, but I think job interviews ask about tools and techniques in depth, which is why I’m leaning towards a certification or course. Regards

submitted by /u/Parking-Sun-8979

Dataset Of US Weather Across 15 US Cities, First Three Months Of 2024 And 2023. Max Temp And Precipitation Counts. Would Anyone Have A Best Rec?

Howdy folks,

I’m looking for a dataset covering about 15 US cities, with max temperature and precipitation measurements for the first three months of 2023 and 2024. I know I can use https://www.ncei.noaa.gov/, but it’s a pain in the rear end to go city by city, extract them all one by one, year over year, and then synthesize and transform 15 or 30 more sets altogether.

Would anyone know if this currently exists somewhere in a CSV format possibly?
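
If no ready-made CSV turns up, the merge step itself is small once the daily records for each city are in one place. The sketch below assumes each record is a `(city, date, tmax, prcp)` tuple; that layout, the units, and the sample values are all invented for illustration, and `q1_summary` is a hypothetical helper name. It reads "precipitation counts" as a total over the period, which may not be what is wanted.

```python
from datetime import date

# Invented daily records: (city, date, max temperature, precipitation).
records = [
    ("Austin", date(2023, 1, 5), 21.0, 0.0),
    ("Austin", date(2023, 2, 10), 25.5, 4.0),
    ("Austin", date(2024, 3, 1), 27.0, 1.5),
    ("Austin", date(2024, 4, 2), 30.0, 9.9),  # outside January-March, ignored
    ("Denver", date(2023, 1, 20), 8.0, 2.0),
    ("Denver", date(2024, 2, 14), 12.5, 0.5),
]


def q1_summary(rows, years=(2023, 2024)):
    """Per (city, year): highest max temperature and total precipitation for Jan-Mar."""
    summary = {}
    for city, day, tmax, prcp in rows:
        if day.year not in years or day.month > 3:
            continue
        key = (city, day.year)
        best, total = summary.get(key, (float("-inf"), 0.0))
        summary[key] = (max(best, tmax), total + prcp)
    return summary


summary = q1_summary(records)
for key in sorted(summary):
    print(key, summary[key])
```

The same loop works whether the rows come from 15 files or 30, as long as each file is read into tuples of this shape first.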

submitted by /u/WhatsTheAnswerDude

Dataset With # Of Employees For US Healthcare Facilities?

For my research I’m looking for a database that has the # of employees at each healthcare facility in the US. I’ve been using the CMS healthcare facilities dataset through HRSA, but unfortunately it doesn’t seem to have data for all facilities. Any suggestions on other databases that may be helpful?
I’m also looking for data on the number of inpatient and outpatient visits for each healthcare facility in the US, and would appreciate suggestions for that as well.

submitted by /u/Dapper_Willow731

LinkedIn Dataset – Exploring Career Paths, Educational Backgrounds (How To Obtain?)

Hello All,

As the title suggests, I am looking for a way to get data on specific career paths, and what background/years of experience individuals had to get them there.

Data I will need:

* All individuals in the US who held positions at target firms (see below for list) in the last 10 years
* All companies (past & present)
* All positions held + length of time
* Educational background and dates

Target is individuals who currently hold or in the past held Associate, Engagement Manager, Associate Partner, or above positions at the MBB firms:

* McKinsey
* Boston Consulting Group
* Bain & Co

Purpose: Decide where to get my MBA (online) in order to maximize my chance of entering these firms within a given timeframe.

Intended Analysis Methods: Determine the % of individuals who attended Ivy League vs. top 25 vs. other schools, and the % of individuals with MBAs. Determine the breakdown by industry background. Determine the distribution of years of experience under two conditions: entering at that level, and rising to that level from within.
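
The percentage breakdowns described above reduce to counting values of one field and dividing by the total. Here is a minimal sketch, assuming the profiles have already been obtained and cleaned into records; the records, the field names `school_tier` and `has_mba`, and the helper `percent_breakdown` are all hypothetical.

```python
from collections import Counter

# Invented profile records; the field names are assumptions.
people = [
    {"school_tier": "ivy", "has_mba": True},
    {"school_tier": "ivy", "has_mba": False},
    {"school_tier": "top25", "has_mba": True},
    {"school_tier": "other", "has_mba": True},
]


def percent_breakdown(rows, field):
    """Share of rows, in percent, for each distinct value of `field`."""
    counts = Counter(row[field] for row in rows)
    total = sum(counts.values())
    return {value: 100.0 * n / total for value, n in counts.items()}


tier_pct = percent_breakdown(people, "school_tier")
mba_pct = percent_breakdown(people, "has_mba")
print(tier_pct)
print(mba_pct)
```

The hard part of the analysis is assigning each school to a tier consistently, not the arithmetic; that mapping would have to be built by hand or taken from a published ranking.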

Also, will need to do the same thing for Tech (M7 companies, Nvidia, Tesla, Microsoft, Google, Apple, Meta, Amazon). Would also like to cross check and see how many from consulting ended up in Tech.

From what I can tell, there are a few ways I can do this:

* Write code accessing the LinkedIn API and figure out the limitations.
* Purchase software that will scrape for me through my account.
* Pay another company to scrape the data for me.
* Pay for an existing dataset.
* Find a free publicly available dataset.

Any help would be greatly appreciated.

submitted by /u/typeIIcivilization