toad.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
Mastodon server operated by David Troy, a tech pioneer and investigative journalist addressing threats to democracy. Thoughtful participation and discussion welcome.

#webscraping

Lenin alevski 🕵️💻<p>New Open-Source Tool Spotlight 🚨🚨🚨</p><p>Scrapling is redefining Python web scraping. Adaptive, stealthy, and fast, it can bypass anti-bot measures while auto-tracking changes in website structure. A standout: 4.5x faster than AutoScraper for text-based extractions. <a href="https://infosec.exchange/tags/Python" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Python</span></a> <a href="https://infosec.exchange/tags/WebScraping" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WebScraping</span></a></p><p>🔗 Project link on <a href="https://infosec.exchange/tags/GitHub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GitHub</span></a> 👉 <a href="https://github.com/D4Vinci/Scrapling" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">github.com/D4Vinci/Scrapling</span><span class="invisible"></span></a></p><p><a href="https://infosec.exchange/tags/Infosec" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Infosec</span></a> <a href="https://infosec.exchange/tags/Cybersecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Cybersecurity</span></a> <a href="https://infosec.exchange/tags/Software" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Software</span></a> <a href="https://infosec.exchange/tags/Technology" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Technology</span></a> <a href="https://infosec.exchange/tags/News" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>News</span></a> <a href="https://infosec.exchange/tags/CTF" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CTF</span></a> <a href="https://infosec.exchange/tags/Cybersecuritycareer" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Cybersecuritycareer</span></a> <a 
href="https://infosec.exchange/tags/hacking" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>hacking</span></a> <a href="https://infosec.exchange/tags/redteam" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>redteam</span></a> <a href="https://infosec.exchange/tags/blueteam" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>blueteam</span></a> <a href="https://infosec.exchange/tags/purpleteam" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>purpleteam</span></a> <a href="https://infosec.exchange/tags/tips" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>tips</span></a> <a href="https://infosec.exchange/tags/opensource" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>opensource</span></a> <a href="https://infosec.exchange/tags/cloudsecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>cloudsecurity</span></a></p><p>— ✨<br>🔐 P.S. Found this helpful? Tap Follow for more cybersecurity tips and insights! I share weekly content for professionals and people who want to get into cyber. Happy hacking 💻🏴‍☠️</p>
Nicolas MOUART<p>"Including children’s images in datasets has raised ethical concerns, particularly regarding privacy, consent, data protection, and accountability. These datasets, often built by scraping publicly available images from the Internet, can expose children to risks such as exploitation, profiling, and tracking. "</p><p><a href="https://arxiv.org/html/2504.14446" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">arxiv.org/html/2504.14446</span><span class="invisible"></span></a></p><p><a href="https://mastodon.social/tags/childprotection" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>childprotection</span></a> <a href="https://mastodon.social/tags/technology" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>technology</span></a> <a href="https://mastodon.social/tags/socialmedia" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>socialmedia</span></a> <a href="https://mastodon.social/tags/deepfake" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>deepfake</span></a> <a href="https://mastodon.social/tags/childabuse" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>childabuse</span></a> <a href="https://mastodon.social/tags/dataset" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataset</span></a> <a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://mastodon.social/tags/webscraping" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>webscraping</span></a></p>
Nicolas MOUART<p>Q: Based on his ideas, would Adolf Hitler be for or against GDPR and right to erasure nowadays if he still lived?</p><p>A: It's reasonable to infer that Hitler would not support a regulation like <a href="https://mastodon.social/tags/GDPR" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GDPR</span></a> which emphasizes individual rights such as <a href="https://mastodon.social/tags/privacy" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>privacy</span></a> protection, data accessibility or erasure; and instead might favor more centralized control over information dissemination for propaganda purposes. </p><p><a href="https://mastodon.social/tags/webscraping" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>webscraping</span></a> <a href="https://mastodon.social/tags/technology" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>technology</span></a> <a href="https://mastodon.social/tags/EU" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>EU</span></a> <a href="https://mastodon.social/tags/history" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>history</span></a> <a href="https://mastodon.social/tags/historyrepeating" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>historyrepeating</span></a> <a href="https://mastodon.social/tags/transparency" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>transparency</span></a> <a href="https://mastodon.social/tags/regulation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>regulation</span></a> <a href="https://mastodon.social/tags/humanrights" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>humanrights</span></a></p>
ResearchBuzz: Firehose<p>ZDNet: Cloudflare just changed the internet, and it’s bad news for the AI giants. “The major internet Content Delivery Network (CDN), Cloudflare, has declared war on AI companies. Starting July 1, Cloudflare now blocks by default AI web crawlers accessing content from your websites without permission or compensation.”</p><p><a href="https://rbfirehose.com/2025/07/02/zdnet-cloudflare-just-changed-the-internet-and-its-bad-news-for-the-ai-giants/" class="" rel="nofollow noopener" target="_blank">https://rbfirehose.com/2025/07/02/zdnet-cloudflare-just-changed-the-internet-and-its-bad-news-for-the-ai-giants/</a></p>
Miguel Afonso Caetano<p>"The report, titled “Are AI Bots Knocking Cultural Heritage Offline?” was written by Weinberg of the GLAM-E Lab, a joint initiative between the Centre for Science, Culture and the Law at the University of Exeter and the Engelberg Center on Innovation Law &amp; Policy at NYU Law, which works with smaller cultural institutions and community organizations to build open access capacity and expertise. GLAM is an acronym for galleries, libraries, archives, and museums. The report is based on a survey of 43 institutions with open online resources and collections in Europe, North America, and Oceania. Respondents also shared data and analytics, and some followed up with individual interviews. The data is anonymized so institutions could share information more freely, and to prevent AI bot operators from undermining their countermeasures. </p><p>Of the 43 respondents, 39 said they had experienced a recent increase in traffic. Twenty-seven of those 39 attributed the increase in traffic to AI training data bots, with an additional seven saying the AI bots could be contributing to the increase. </p><p>“Multiple respondents compared the behavior of the swarming bots to more traditional online behavior such as Distributed Denial of Service (DDoS) attacks designed to maliciously drive unsustainable levels of traffic to a server, effectively taking it offline,” the report said. “Like a DDoS incident, the swarms quickly overwhelm the collections, knocking servers offline and forcing administrators to scramble to implement countermeasures. 
As one respondent noted, ‘If they wanted us dead, we’d be dead.’”"</p><p><a href="https://www.404media.co/ai-scraping-bots-are-breaking-open-libraries-archives-and-museums/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">404media.co/ai-scraping-bots-a</span><span class="invisible">re-breaking-open-libraries-archives-and-museums/</span></a></p><p><a href="https://tldr.nettime.org/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://tldr.nettime.org/tags/GenerativeAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GenerativeAI</span></a> <a href="https://tldr.nettime.org/tags/CulturalHeritage" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CulturalHeritage</span></a> <a href="https://tldr.nettime.org/tags/AIBots" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AIBots</span></a> <a href="https://tldr.nettime.org/tags/WebScraping" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WebScraping</span></a> <a href="https://tldr.nettime.org/tags/CyberSecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CyberSecurity</span></a> <a href="https://tldr.nettime.org/tags/DDoS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DDoS</span></a></p>
ResearchBuzz: Firehose<p>TechCrunch: Mastodon updates its terms to prohibit AI model training. “Social networks are bolstering their terms of service against scrapers and bots that crawl the website to train AI models. Days after Elon Musk-owned X updated its terms to explicitly prohibit AI model training, decentralized social network Mastodon today updated its own rules to bar any kind of model training, as well.”</p><p><a href="https://rbfirehose.com/2025/06/18/techcrunch-mastodon-updates-its-terms-to-prohibit-ai-model-training/" class="" rel="nofollow noopener" target="_blank">https://rbfirehose.com/2025/06/18/techcrunch-mastodon-updates-its-terms-to-prohibit-ai-model-training/</a></p>
Harald Klinke<p>Are AI bots overwhelming digital collections?<br>A new GLAM-E Lab report shows how scrapers for AI training datasets are putting real strain on the infrastructures of galleries, libraries, archives, and museums. Technical bottlenecks, ethical dilemmas, and escalating costs—open culture is under pressure.<br>Read the full analysis:<br><a href="https://www.glamelab.org/products/are-ai-bots-knocking-cultural-heritage-offline/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">glamelab.org/products/are-ai-b</span><span class="invisible">ots-knocking-cultural-heritage-offline/</span></a><br><a href="https://det.social/tags/DigitalHeritage" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DigitalHeritage</span></a> <a href="https://det.social/tags/GLAM" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GLAM</span></a> <a href="https://det.social/tags/WebScraping" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WebScraping</span></a> <a href="https://det.social/tags/OpenAccess" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenAccess</span></a> <a href="https://det.social/tags/CulturalData" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CulturalData</span></a> <a href="https://det.social/tags/MuseTech" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>MuseTech</span></a> <a href="https://det.social/tags/DigitalHumanities" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DigitalHumanities</span></a> <a href="https://det.social/tags/GLAMlab" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GLAMlab</span></a></p>
ResearchBuzz: Firehose<p>404 Media: AI Scraping Bots Are Breaking Open Libraries, Archives, and Museums. “AI bots that scrape the internet for training data are hammering the servers of libraries, archives, museums, and galleries, and are in some cases knocking their collections offline, according to a new survey published today.” As you might imagine this drives me absolutely WILD.</p><p><a href="https://rbfirehose.com/2025/06/17/404-media-ai-scraping-bots-are-breaking-open-libraries-archives-and-museums/" class="" rel="nofollow noopener" target="_blank">https://rbfirehose.com/2025/06/17/404-media-ai-scraping-bots-are-breaking-open-libraries-archives-and-museums/</a></p>
Miguel Afonso Caetano<p>"To reiterate, whatever one's opinion of these particular AI tools, scraping itself is not the problem. Automated access is a fundamental technique of archivists, computer scientists, and everyday users that we hope is here to stay—as long as it can be done non-destructively. However, we realize that not all implementers will follow our suggestions for bots above, and that our mitigations are both technically advanced and incomplete.</p><p>Because we see so many bots operating for the same purpose at the same time, it seems there's an opportunity here to provide these automated data consumers with tailored data providers, removing the need for every AI company to scrape every website, seemingly, every day.</p><p>And on the operators' end, we hope to see more web-hosting and framework technology that is built with an awareness of these issues from day one, perhaps building in responses like just-in-time static content generation or dedicated endpoints for crawlers."</p><p><a href="https://www.eff.org/deeplinks/2025/06/keeping-web-under-weight-ai-crawlers" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">eff.org/deeplinks/2025/06/keep</span><span class="invisible">ing-web-under-weight-ai-crawlers</span></a></p><p><a href="https://tldr.nettime.org/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://tldr.nettime.org/tags/GenerativeAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GenerativeAI</span></a> <a href="https://tldr.nettime.org/tags/WebCrawlers" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WebCrawlers</span></a> <a href="https://tldr.nettime.org/tags/BigTech" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>BigTech</span></a> <a href="https://tldr.nettime.org/tags/WebScraping" class="mention hashtag" rel="nofollow noopener" 
target="_blank">#<span>WebScraping</span></a> <a href="https://tldr.nettime.org/tags/OpenWeb" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenWeb</span></a></p>
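As a rough illustration of what "non-destructive" automated access could look like on the bot side, here is a minimal sketch that consults a site's robots.txt rules before fetching and waits for any requested crawl delay. This is an assumption about one reasonable practice, not the EFF's own list of suggestions, which the quoted excerpt does not reproduce. The robots.txt content, the `ExampleBot` user agent, the example.org URLs, and the helper names `build_policy` and `plan_fetch` are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_policy(robots_txt):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def plan_fetch(parser, user_agent, url, default_delay=1.0):
    """Return the number of seconds to wait before fetching url,
    or None if the rules disallow the fetch for this user agent."""
    if not parser.can_fetch(user_agent, url):
        return None
    delay = parser.crawl_delay(user_agent)
    # Fall back to a conservative default when the site states no delay.
    return float(delay) if delay is not None else default_delay


if __name__ == "__main__":
    policy = build_policy(ROBOTS_TXT)
    for path in ("/collection/1", "/private/x"):
        print(path, plan_fetch(policy, "ExampleBot", "https://example.org" + path))
```

A real crawler would also need to cache responses and back off on errors; the sketch covers only the permission and pacing checks.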
MDZG (Markdown Zen Garden)<p>🔍 / <a href="https://mastodon.uno/tags/software" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>software</span></a> / <a href="https://mastodon.uno/tags/automation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>automation</span></a> </p><p><a href="https://mastodon.uno/tags/UiVision" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>UiVision</span></a> Open-Source RPA Software with Computer Vision, OCR, Anthropic Computer Use/LLM. Selenium IDE import/export. </p><p>🐱🔗 <a href="https://laravista.altervista.org/CatLink/links/299" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">laravista.altervista.org/CatLi</span><span class="invisible">nk/links/299</span></a></p><p><a href="https://mastodon.uno/tags/catlink" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>catlink</span></a> <a href="https://mastodon.uno/tags/SoftwareAutomation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SoftwareAutomation</span></a> <a href="https://mastodon.uno/tags/RPA" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>RPA</span></a> <a href="https://mastodon.uno/tags/OCR" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OCR</span></a> <a href="https://mastodon.uno/tags/LLM" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLM</span></a> <a href="https://mastodon.uno/tags/A9T9" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>A9T9</span></a> <a href="https://mastodon.uno/tags/WebScraping" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WebScraping</span></a> <a href="https://mastodon.uno/tags/BrowserExtension" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>BrowserExtension</span></a> <a href="https://mastodon.uno/tags/imacros" class="mention hashtag" 
rel="nofollow noopener" target="_blank">#<span>imacros</span></a> <a href="https://mastodon.uno/tags/SeleniumIDE" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SeleniumIDE</span></a> <a href="https://mastodon.uno/tags/BrowserAutomation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>BrowserAutomation</span></a> <a href="https://mastodon.uno/tags/WebAutomation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WebAutomation</span></a> <a href="https://mastodon.uno/tags/DesktopAutomation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DesktopAutomation</span></a> <a href="https://mastodon.uno/tags/DataDrivenTests" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataDrivenTests</span></a> <a href="https://mastodon.uno/tags/Anthropic" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Anthropic</span></a> <a href="https://mastodon.uno/tags/AnthropicClaude" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AnthropicClaude</span></a> <a href="https://mastodon.uno/tags/ComputerUse" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ComputerUse</span></a> <a href="https://mastodon.uno/tags/tutorial" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>tutorial</span></a></p>
Loki the Cat<p>AI companies: "We're just browsing!" Also AI companies: *scrapes 26M+ pages in March alone while bypassing blockers* Publishers: "This is fine" 🔥🤖</p><p>Traffic from AI bots grew 49% but monetization remains elusive. The digital feast continues unpaid.</p><p><a href="https://news.slashdot.org/story/25/06/14/021246/increased-traffic-from-web-scraping-ai-bots-is-hard-to-monetize" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">news.slashdot.org/story/25/06/</span><span class="invisible">14/021246/increased-traffic-from-web-scraping-ai-bots-is-hard-to-monetize</span></a></p><p><a href="https://toot.community/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://toot.community/tags/WebScraping" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WebScraping</span></a> <a href="https://toot.community/tags/Publishers" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Publishers</span></a></p>
ResearchBuzz: Firehose<p>The Register: Reddit sues Anthropic for scraping content into the maw of its eternally ravenous AI. “Reddit, the popular internet discussion forum, sued Anthropic on Wednesday, alleging that the AI biz scraped content generated by its users in violation of contractual terms and technical barriers. The complaint [PDF], filed in San Francisco Superior Court on Wednesday, claims Anthropic’s use […]</p><p><a href="https://rbfirehose.com/2025/06/07/the-register-reddit-sues-anthropic-for-scraping-content-into-the-maw-of-its-eternally-ravenous-ai/" class="" rel="nofollow noopener" target="_blank">https://rbfirehose.com/2025/06/07/the-register-reddit-sues-anthropic-for-scraping-content-into-the-maw-of-its-eternally-ravenous-ai/</a></p>
ResearchBuzz: Firehose<p>The Map Room: Unauthorized Waffle House Index Disaster Maps Taken Down. “The Waffle House Index is an informal metric used to assess the severity of a storm in the U.S. South, because Waffle House restaurants don’t close unless Things Are Very Bad. But when Jack LaFond scraped Waffle House’s website to build a map tracking restaurant closures last fall, he got a cease-and-desist from […]</p><p><a href="https://rbfirehose.com/2025/05/31/the-map-room-unauthorized-waffle-house-index-disaster-maps-taken-down/" class="" rel="nofollow noopener" target="_blank">https://rbfirehose.com/2025/05/31/the-map-room-unauthorized-waffle-house-index-disaster-maps-taken-down/</a></p>
Pyrzout :vm:<p>Threat Actor Claims TikTok Breach, Puts 428 Million Records Up for Sale <a href="https://hackread.com/threat-actor-tiktok-breach-428-million-records-sale/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">hackread.com/threat-actor-tikt</span><span class="invisible">ok-breach-428-million-records-sale/</span></a> <a href="https://social.skynetcloud.site/tags/cybersecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>cybersecurity</span></a> <a href="https://social.skynetcloud.site/tags/Cybersecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Cybersecurity</span></a> <a href="https://social.skynetcloud.site/tags/CyberAttack" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CyberAttack</span></a> <a href="https://social.skynetcloud.site/tags/SocialMedia" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SocialMedia</span></a> <a href="https://social.skynetcloud.site/tags/WebScraping" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WebScraping</span></a> <a href="https://social.skynetcloud.site/tags/databreach" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>databreach</span></a> <a href="https://social.skynetcloud.site/tags/Scrapping" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Scrapping</span></a> <a href="https://social.skynetcloud.site/tags/Security" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Security</span></a> <a href="https://social.skynetcloud.site/tags/security" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>security</span></a> <a href="https://social.skynetcloud.site/tags/TikTok" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TikTok</span></a></p>
Pyrzout :vm:<p>Threat Actor Claims TikTok Breach, Puts 428 Million Records Up for Sale – Source:hackread.com <a href="https://ciso2ciso.com/threat-actor-claims-tiktok-breach-puts-428-million-records-up-for-sale-sourcehackread-com/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">ciso2ciso.com/threat-actor-cla</span><span class="invisible">ims-tiktok-breach-puts-428-million-records-up-for-sale-sourcehackread-com/</span></a> <a href="https://social.skynetcloud.site/tags/1CyberSecurityNewsPost" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>1CyberSecurityNewsPost</span></a> <a href="https://social.skynetcloud.site/tags/CyberSecurityNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CyberSecurityNews</span></a> <a href="https://social.skynetcloud.site/tags/CyberSecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CyberSecurity</span></a> <a href="https://social.skynetcloud.site/tags/cybersecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>cybersecurity</span></a> <a href="https://social.skynetcloud.site/tags/CyberAttack" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CyberAttack</span></a> <a href="https://social.skynetcloud.site/tags/SocialMedia" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SocialMedia</span></a> <a href="https://social.skynetcloud.site/tags/WebScraping" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WebScraping</span></a> <a href="https://social.skynetcloud.site/tags/DataBreach" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataBreach</span></a> <a href="https://social.skynetcloud.site/tags/Scrapping" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Scrapping</span></a> <a href="https://social.skynetcloud.site/tags/Hackread" class="mention hashtag" 
rel="nofollow noopener" target="_blank">#<span>Hackread</span></a> <a href="https://social.skynetcloud.site/tags/security" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>security</span></a> <a href="https://social.skynetcloud.site/tags/TikTok" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TikTok</span></a></p>
ResearchBuzz: Firehose<p>404 Media: Developer Builds Tool That Scrapes YouTube Comments, Uses AI to Predict Where Users Live. “If you’ve left a comment on a YouTube video, a new website claims it might be able to find every comment you’ve ever left on any video you’ve ever watched. Then an AI can build a profile of the commenter and guess where you live, what languages you speak, and what your politics might […]</p><p><a href="https://rbfirehose.com/2025/05/29/404-media-developer-builds-tool-that-scrapes-youtube-comments-uses-ai-to-predict-where-users-live/" class="" rel="nofollow noopener" target="_blank">https://rbfirehose.com/2025/05/29/404-media-developer-builds-tool-that-scrapes-youtube-comments-uses-ai-to-predict-where-users-live/</a></p>
PromptCloud<p>Tired of babysitting DIY scraping scripts that crash the moment you scale?<br>You’re not alone.</p><p>PromptCloud takes the pain out of large-scale data extraction with fully managed, reliable solutions — so you can focus on what really matters: insights.</p><p>🔗 <a href="https://shorturl.at/EApIO" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">shorturl.at/EApIO</span><span class="invisible"></span></a></p><p><a href="https://mastodon.social/tags/WebScraping" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WebScraping</span></a> <a href="https://mastodon.social/tags/OpenData" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenData</span></a> <a href="https://mastodon.social/tags/DataEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataEngineering</span></a> <a href="https://mastodon.social/tags/BigData" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>BigData</span></a> <a href="https://mastodon.social/tags/Automation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Automation</span></a> <a href="https://mastodon.social/tags/PromptCloud" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>PromptCloud</span></a> <a href="https://mastodon.social/tags/TechForGood" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TechForGood</span></a> <a href="https://mastodon.social/tags/DataOps" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataOps</span></a></p>
Carlo Zottmann<p>Need to grab specific info from a webpage regularly? 🤔 Browser Actions can help! Create a Shortcut to: Open URL ➡️ Wait for data element ➡️ Run JavaScript to extract text ➡️ Pass it back to Shortcuts!</p><p>If you need help with that, just follow the Forum link on the site!</p><p><a href="https://actions.work/browser-actions?ref=mastodon-b10" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">actions.work/browser-actions?r</span><span class="invisible">ef=mastodon-b10</span></a></p><p><a href="https://norden.social/tags/macOS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>macOS</span></a> <a href="https://norden.social/tags/Shortcuts" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Shortcuts</span></a> <a href="https://norden.social/tags/WebScraping" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WebScraping</span></a> <a href="https://norden.social/tags/DataExtraction" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataExtraction</span></a> <a href="https://norden.social/tags/BrowserAutomation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>BrowserAutomation</span></a></p>
@reiver ⊼ (Charles) :batman:<p>3/</p><p>For more on scraping (as in web-scraping) see here:<br><a href="https://mastodon.social/@reiver/114353728684249608" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">mastodon.social/@reiver/114353</span><span class="invisible">728684249608</span></a></p><p>CC: <span class="h-card" translate="no"><a href="https://mastodon.social/@404mediaco" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>404mediaco</span></a></span> </p><p><a href="https://mastodon.social/tags/Scraper" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Scraper</span></a> <a href="https://mastodon.social/tags/Scraping" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Scraping</span></a> <a href="https://mastodon.social/tags/WebScraper" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WebScraper</span></a> <a href="https://mastodon.social/tags/WebScraping" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WebScraping</span></a></p>
@reiver ⊼ (Charles) :batman:<p>2/</p><p>Scraping (as in Web Scraping) is the act of extracting data from HTML web-pages where the data is NOT machine-legible.</p><p>If the data, even in an HTML web-page, is in a machine-legible format, then it is NOT scraping.</p><p>...</p><p>And, getting data in JSON (key-value pairs) is definitely NOT scraping — as JSON's purpose is to communicate data in a machine-legible manner.</p><p>CC: <span class="h-card" translate="no"><a href="https://mastodon.social/@404mediaco" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>404mediaco</span></a></span> </p><p><a href="https://mastodon.social/tags/Scraper" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Scraper</span></a> <a href="https://mastodon.social/tags/Scraping" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Scraping</span></a> <a href="https://mastodon.social/tags/WebScraper" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WebScraper</span></a> <a href="https://mastodon.social/tags/WebScraping" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WebScraping</span></a></p>
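The distinction drawn in the post above can be sketched in a few lines of Python using only the standard library. The sample HTML page, the JSON body, and the names `PriceScraper`, `scrape_price`, and `read_price` are hypothetical and exist only for this illustration. The first function has to know how the page is laid out to recover a value (scraping, in the post's sense); the second reads the same value from a format meant for machines.

```python
import json
from html.parser import HTMLParser

# Hypothetical sample inputs, invented for this illustration.
HTML_PAGE = '<html><body><h1>Widget</h1><p class="price">Price: 9.99 USD</p></body></html>'
JSON_BODY = '{"name": "Widget", "price": 9.99, "currency": "USD"}'


class PriceScraper(HTMLParser):
    """Collects the text inside the first <p class="price"> element."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.price_text = None

    def handle_starttag(self, tag, attrs):
        if tag == "p" and ("class", "price") in attrs:
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_price = False

    def handle_data(self, data):
        if self._in_price and self.price_text is None:
            self.price_text = data


def scrape_price(html):
    """Scraping: depends on the page's presentation ("Price: 9.99 USD")."""
    parser = PriceScraper()
    parser.feed(html)
    parser.close()
    # Breaks as soon as the wording or markup of the page changes.
    return float(parser.price_text.split()[1])


def read_price(body):
    """Not scraping, per the post: the value is addressed by its key."""
    return json.loads(body)["price"]


if __name__ == "__main__":
    print(scrape_price(HTML_PAGE))
    print(read_price(JSON_BODY))
```

The scraping path silently depends on the word order inside the paragraph, which is the fragility the post's definition is pointing at.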