How I Built A Nairaland Web Scraper (1122 Views)
How I Built A Nairaland Web Scraper by DataMina: 11:20am On Oct 19, 2023 |
Hello everyone. I am a data analyst who enjoys web scraping. Aside from Twitter, Nairaland has been my go-to place to keep up with trending news, and being a lover of this platform I decided to build a web scraper with Selenium and Python. This scraper is designed to extract the thread topics and the counts of views, users, and guests from Nairaland. The scraped data can be used to develop a Power BI report that can be refreshed each time scraping is done. The published dashboard can be used to monitor the threads that generate the highest and lowest views. You can check out the project in my GitHub repository: https://github.com/StephDAnalyst/Nairaland 3 Likes 3 Shares |
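To make the "topic plus view count into a BI tool" idea concrete, here is a minimal sketch. It is not the code from the linked repository. It assumes a header line shaped like this thread's own title line, "How I Built A Nairaland Web Scraper (1122 Views)", and it writes plain CSV, which Power BI can import. The names `parse_header` and `to_csv` are hypothetical, and the real page text may differ.

```python
import csv
import io
import re

# Assumed header shape: "<topic> (<count> Views)". This is modelled on the
# title line of this thread and may not match every page.
VIEWS_RE = re.compile(r"^(?P<topic>.+?)\s*\((?P<views>[\d,]+)\s+Views\)\s*$")


def parse_header(line):
    """Return (topic, views) for a header line, or None if it does not match."""
    match = VIEWS_RE.match(line.strip())
    if match is None:
        return None
    views = int(match.group("views").replace(",", ""))
    return match.group("topic"), views


def to_csv(rows):
    """Serialise (topic, views) rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["topic", "views"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical input: header lines already collected by a scraper.
    headers = [
        "How I Built A Nairaland Web Scraper (1122 Views)",
        "A line that is not a thread header",
    ]
    rows = [row for row in map(parse_header, headers) if row is not None]
    print(to_csv(rows))
```

Saving the CSV text to a file on each run gives a BI tool a stable source to refresh from.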
Re: How I Built A Nairaland Web Scraper by Babangidapikin: 11:27am On Oct 19, 2023 |
DataMina: Good for practice. Can you scrape LinkedIn for email addresses? |
Re: How I Built A Nairaland Web Scraper by DataMina: 11:30am On Oct 19, 2023 |
I use a different trick to do it. If you need the service you can let me know |
Re: How I Built A Nairaland Web Scraper by BlackhatMentor: 11:32am On Oct 19, 2023 |
A scraper that collects emails and phone numbers along with their usernames would be better. This one you did isn't very useful |
Re: How I Built A Nairaland Web Scraper by DataMina: 11:35am On Oct 19, 2023 |
I know. I have worked on a scraper that scrapes emails from LinkedIn using Google search, after which I used the stringr library to extract the emails from the text. I get your point though... |
Re: How I Built A Nairaland Web Scraper by princely4ever: 11:37am On Oct 19, 2023 |
My own web scraper/data scraper lets you extract web assets, including HTML, CSS and JavaScript files 1 Like 1 Share |
Re: How I Built A Nairaland Web Scraper by Babangidapikin: 11:40am On Oct 19, 2023 |
DataMina: Okay. By the way, can you visualize data? |
Re: How I Built A Nairaland Web Scraper by airsaylongcome: 11:48am On Oct 19, 2023 |
BlackhatMentor: How isn't this useful? A social media management team would use the report from this scrape to target the threads and topics they should be driving content and engagement with. Some of you need to think laterally 1 Like |
Re: How I Built A Nairaland Web Scraper by airsaylongcome: 11:49am On Oct 19, 2023 |
OP, is it possible to do the "impossible"? Scrape the entire site's content and dump it into an LLM 1 Like |
Re: How I Built A Nairaland Web Scraper by BlackhatMentor: 11:52am On Oct 19, 2023 |
airsaylongcome: I don't just comment for commenting's sake. It's not useful... Nairaland already provides that data, so what's the point of creating another one? |
Re: How I Built A Nairaland Web Scraper by airsaylongcome: 11:56am On Oct 19, 2023 |
BlackhatMentor: Ehhhh...what's the point? How about the OP learning? How about having a visual Dashboard instead of Nairaland bland numbers only without any visual perception of how those numbers compare? I don't get up and make random comments. But to say the OP's work isn't useful is absolutely not correct. It is useful. I find it useful and I'm sure there are loads and loads of non-data science or data engineering folks that will find it very useful |
Re: How I Built A Nairaland Web Scraper by DataMina: 11:57am On Oct 19, 2023 |
airsaylongcome: |
Re: How I Built A Nairaland Web Scraper by DataMina: 11:59am On Oct 19, 2023 |
It is very much doable, but you know you have to contend with CAPTCHAs |
Re: How I Built A Nairaland Web Scraper by BlackhatMentor: 11:59am On Oct 19, 2023 |
airsaylongcome: It's useful for learning purposes only. I don't see any problem it offers a solution to sha. 1 Like |
Re: How I Built A Nairaland Web Scraper by DataMina: 12:00pm On Oct 19, 2023 |
Babangidapikin: Integrating the data into Power BI allows you to do that. With BI tools you can even do data refreshes 1 Like |
Re: How I Built A Nairaland Web Scraper by Cheryl463337: 12:19pm On Oct 19, 2023 |
DataMina: Why can't you use plain requests and BeautifulSoup, which I believe will be faster than Selenium? |
Re: How I Built A Nairaland Web Scraper by DataMina: 1:25pm On Oct 19, 2023 |
Cheryl463337: I was just experimenting with it, because when I tried using Octoparse (a badass no-code tool) to scrape the Nairaland website, it didn't work because the site didn't appear structured. So I decided to experiment with Selenium and it worked |
Re: How I Built A Nairaland Web Scraper by BeLookingIDIOT(m): 2:26pm On Oct 19, 2023 |
You're doing something illegal while announcing it on the very platform 1 Like |
Re: How I Built A Nairaland Web Scraper by airsaylongcome: 3:08pm On Oct 19, 2023 |
BeLookingIDIOT: Illegal is pushing it a bit. Unethical, yes. But if NL doesn’t expose APIs for devs to legally consume data, then people have no option but to scrape shege from it. 2 Likes |
Re: How I Built A Nairaland Web Scraper by airsaylongcome: 3:10pm On Oct 19, 2023 |
BlackhatMentor: Think of it as a first alpha. They would definitely refine it until they get to v1 |
Re: How I Built A Nairaland Web Scraper by Cheryl463337: 3:18pm On Oct 19, 2023 |
DataMina: Okay |
Re: How I Built A Nairaland Web Scraper by Felixitie(m): 3:44pm On Oct 19, 2023 |
DataMina: Though, it seems the page loads dynamically, which makes it hard for Bs4 alone to get the data out. Selenium can load the page and render the JavaScript, and then you can use Bs4 to parse the result and get the stuff (a combination of Selenium and Bs4). Scrapy works easily too. You can also grab all the front-page topic links first and then loop through them with Bs4 to get all the data points, to improve the speed. You have done so well. Can we work on a portfolio project together using Scrapy with Splash or Scrapy with Playwright to generate leads, then dump them into a database and schedule the runs with Airflow? |
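The "grab the topic links first, then loop through them" step could be sketched roughly as below. To stay runnable without installing anything, this sketch swaps Bs4 for Python's standard-library `html.parser`. The sample markup, the class name `TopicLinkParser` and the helper `extract_links` are all made up for illustration and do not reflect Nairaland's real page structure. Fetching is left out on purpose: the HTML string could come from a plain HTTP request or from a rendered page such as Selenium's `driver.page_source`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Made-up markup for illustration only; the real front page will differ.
SAMPLE_HTML = """
<table>
  <tr><td><a href="/101/example-topic-one"><b>Example topic one</b></a></td></tr>
  <tr><td><a href="/102/example-topic-two">Example topic two</a></td></tr>
  <tr><td><a name="anchor-without-href">skipped</a></td></tr>
</table>
"""


class TopicLinkParser(HTMLParser):
    """Collect (absolute_url, link_text) pairs for every <a href=...> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            if title:
                url = urljoin(self.base_url, self._href)
                self.links.append((url, title))
            self._href = None


def extract_links(html, base_url):
    """Parse an HTML string and return its links as (url, title) tuples."""
    parser = TopicLinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    for url, title in extract_links(SAMPLE_HTML, "https://www.nairaland.com/"):
        print(title, "->", url)
```

With the links collected up front, each topic page can be fetched and parsed in its own loop iteration, which is where the suggested speed gain over driving a full browser for every step would come from.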
Re: How I Built A Nairaland Web Scraper by Felixitie(m): 3:49pm On Oct 19, 2023 |
airsaylongcome: To scrape 'SHEGE' from it. Lol. |
Re: How I Built A Nairaland Web Scraper by BlackhatMentor: 4:08pm On Oct 19, 2023 |
airsaylongcome: If you say so lol |
Re: How I Built A Nairaland Web Scraper by DataMina: 5:41pm On Oct 19, 2023 |
Felixitie: I was trying to use Scrapy and Playwright, and it turned out that Playwright doesn't do well with Windows. I tried virtualizing with WSL2, yet it still didn't work. So I decided to stick with Selenium until I lay hands on a Mac or Linux PC. My WhatsApp is zero813six3six5six03 1 Like |
Re: How I Built A Nairaland Web Scraper by landiqa(m): 6:10pm On Oct 19, 2023 |
Can you develop one that scrapes phone numbers and validates them for WhatsApp? |
Re: How I Built A Nairaland Web Scraper by DataMina: 8:09pm On Oct 19, 2023 |
landiqa: I can scrape Jiji for phone numbers, but the validation on WhatsApp is what I don't know about 1 Like |
Re: How I Built A Nairaland Web Scraper by Paystack: 10:02pm On Oct 19, 2023 |
DataMina: I believe validating on WhatsApp shouldn't be an issue tho |
Re: How I Built A Nairaland Web Scraper by DyingFetus: 2:39am On Oct 20, 2023 |
I did something, but not web scraping, just extraction of useful threads and posts by certain monikers using requests |
Re: How I Built A Nairaland Web Scraper by DyingFetus: 2:42am On Oct 20, 2023 |
airsaylongcome: |
Re: How I Built A Nairaland Web Scraper by turmacs(f): 6:00am On Oct 20, 2023 |
BlackhatMentor: Idiot, do your own then. |
Re: How I Built A Nairaland Web Scraper by airsaylongcome: 8:01am On Oct 20, 2023 |
DyingFetus: That would be an interesting one. There are some monikers that I believe are alts for the Nairaland bbq griller. Would be interesting to scrape his main and the suspected alts for comparison of writing style and similarity |