{"id":32437,"date":"2025-01-31T08:27:04","date_gmt":"2025-01-31T07:27:04","guid":{"rendered":"https:\/\/www.graviton.at\/letterswaplibrary\/open-malsec-v0-1-open-source-cybersecurity-analysis-samples\/"},"modified":"2025-01-31T08:27:04","modified_gmt":"2025-01-31T07:27:04","slug":"open-malsec-v0-1-open-source-cybersecurity-analysis-samples","status":"publish","type":"post","link":"https:\/\/www.graviton.at\/letterswaplibrary\/open-malsec-v0-1-open-source-cybersecurity-analysis-samples\/","title":{"rendered":"Open-MalSec V0.1 \u2013 Open-Source Cybersecurity \/ Analysis Samples"},"content":{"rendered":"<p><!-- SC_OFF --><\/p>\n<div class=\"md\">\n<p>Evening! \ud83e\udee1 <\/p>\n<p>Just uploaded <strong>Open-MalSec v0.1<\/strong>, an early-stage <strong>open-source cybersecurity dataset<\/strong> focused on <strong>phishing, scams, and malware-related text samples<\/strong>. <\/p>\n<p>\ud83d\udcc2 <strong>This is the base version (v0.1)<\/strong>\u2014just a few structured sample files. Full dataset builds will come over the next few weeks. <\/p>\n<p>\ud83d\udd17 <strong>Dataset link:<\/strong> <a href=\"https:\/\/huggingface.co\/datasets\/tegridydev\/open-malsec\">huggingface.co\/datasets\/tegridydev\/open-malsec<\/a> <\/p>\n<p>\ud83d\udd0d What\u2019s in v0.1? <\/p>\n<p>  A <strong>few structured scam examples<\/strong> (text-based)<br \/> Covers <strong>DeFi, crypto, phishing, and social engineering<\/strong><br \/> <strong>Initial labelling format<\/strong> for scam classification  <\/p>\n<p>\u26a0\ufe0f <strong>This is not a full dataset yet.<\/strong> Just establishing the structure + getting feedback. 
<\/p>\n<h2>\ud83d\udcc2 Current Schema &amp; Labelling Approach<\/h2>\n<p>Each entry follows a <strong>structured JSON format<\/strong> with: <\/p>\n<p>  &#8220;instruction&#8221; \u2192 Task prompt (e.g., &#8220;Evaluate this message for scams&#8221;)<br \/> &#8220;input&#8221; \u2192 Source &amp; message details (e.g., Telegram post, Tweet)<br \/> &#8220;output&#8221; \u2192 Scam classification &amp; risk indicators  <\/p>\n<h3><strong>Sample Entry<\/strong><\/h3>\n<pre><code>{\n  &quot;instruction&quot;: &quot;Analyze this tweet about a new dog-themed crypto token. Determine scam indicators if any.&quot;,\n  &quot;input&quot;: {\n    &quot;source&quot;: &quot;Twitter&quot;,\n    &quot;handle&quot;: &quot;@DogLoverCrypto&quot;,\n    &quot;tweet_content&quot;: &quot;DOGGIEINU just launched! Invest now for instant 500% gains. Dev is ex-Binance staff. #memecrypto #moonshot&quot;\n  },\n  &quot;output&quot;: {\n    &quot;classification&quot;: &quot;malicious&quot;,\n    &quot;description&quot;: &quot;Tweet claims insider connections and extreme gains for a newly launched dog-themed token.&quot;,\n    &quot;indicators&quot;: [\n      &quot;Overblown profit claims (500% 'instant')&quot;,\n      &quot;False or unverifiable dev background&quot;,\n      &quot;Hype-based marketing with no substance&quot;,\n      &quot;No legitimate documentation or audit link&quot;\n    ]\n  }\n}<\/code><\/pre>\n<h2>\ud83d\uddc2\ufe0f Current v0.1 Sample Categories<\/h2>\n<p>Crypto Scams \u2192 Meme token pump &amp; dumps, fake DeFi projects<\/p>\n<p>Phishing \u2192 Suspicious finance\/social media messages<\/p>\n<p>Social Engineering \u2192 Manipulative messages exploiting trust<\/p>\n<h2>\ud83d\udd1c Next Steps<\/h2>\n<p>\ud83d\udd0d Planned Updates:<\/p>\n<p>Expanding dataset with more phishing &amp; malware examples<\/p>\n<p>Refining schema &amp; annotation quality<\/p>\n<p>Open to feedback, contributions, and suggestions<\/p>\n<p>If this is useful, bookmark\/follow the dataset here:<\/p>\n<p>\ud83d\udd17 <a 
href=\"https:\/\/huggingface.co\/datasets\/tegridydev\/open-malsec\">huggingface.co\/datasets\/tegridydev\/open-malsec<\/a><\/p>\n<p>More updates coming as I expand the datasets \ud83e\udee1<\/p>\n<p>\ud83d\udcac Thoughts, feedback, and ideas are always welcome! Drop a comment or DMs are open \ud83e\udd19<\/p>\n<\/div>\n<p><!-- SC_ON -->   submitted by   <a href=\"https:\/\/www.reddit.com\/user\/tegridyblues\"> \/u\/tegridyblues <\/a> <br \/> <span><a href=\"https:\/\/www.reddit.com\/r\/datasets\/comments\/1ie99w5\/openmalsec_v01_opensource_cybersecurity_analysis\/\">[link]<\/a><\/span>   <span><a href=\"https:\/\/www.reddit.com\/r\/datasets\/comments\/1ie99w5\/openmalsec_v01_opensource_cybersecurity_analysis\/\">[comments]<\/a><\/span><\/p>","protected":false},"excerpt":{"rendered":"<p>Evening! 
\ud83e\udee1 Just uploaded Open-MalSec v0.1, an early-stage open-source cybersecurity dataset focused on phishing, scams, and malware-related&#8230;<\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[85],"tags":[],"class_list":["post-32437","post","type-post","status-publish","format-standard","hentry","category-datatards","wpcat-85-id"],"_links":{"self":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts\/32437","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/comments?post=32437"}],"version-history":[{"count":0,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts\/32437\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/media?parent=32437"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/categories?post=32437"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/tags?post=32437"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}