
DataMina's Posts


Politics / Re: "Anything That Wants To Happen, Let It Happen" - Governor Fubara Says (Video) by DataMina: 5:50pm On Oct 30, 2023
mrksquare:


They do not know. So we owe it a duty to inform them.
Stop lying. If not for the Nigerian civil war that forced Opobians to hide their identity, Opobo is a completely Igbo town. EOD!!!
Programming / Re: How I Crawled Leads From Jiji by DataMina: 11:11pm On Oct 26, 2023
Felixitie:


Great job on handling the infinite scrolling on the Jiji page! Different websites come with different technicalities when it comes to data extraction. I did a project for a client recently on Airbnb and Zillow, and the Zillow one gave me a bit of trouble: the web elements did not all load unless you scrolled the page down bit by bit to the bottom, and to be able to scroll the page you had to minimise the driver window. Very interesting.
Nice one, OP.
Thanks. You are right, there is no one-size-fits-all approach to scraping. If it were that straightforward, the client wouldn't have needed your service 👌

Programming / Re: How I Crawled Leads From Jiji by DataMina: 10:43pm On Oct 26, 2023
BlackhatMentor:


Sure
It sells on Upwork and Fiverr like pure water, but being able to break through is the issue.
Programming / Re: How I Crawled Leads From Jiji by DataMina: 7:26am On Oct 26, 2023
airsaylongcome:
OP,

Word of advice: never use your personal login details for web scraping.
Yeah, thanks. I have changed that.
Programming / Re: How I Crawled Leads From Jiji by DataMina: 8:43pm On Oct 25, 2023
landiqa:


Finally. Your contact please
0813six3six5six03
Programming / Re: How I Crawled Leads From Jiji by DataMina: 3:25pm On Oct 25, 2023
GOSPELTRUTH31:


Do you offer classes online?
Nope, I don't. I am only interested in landing web scraping gigs. To learn web scraping, you can check out freeCodeCamp on YouTube.
Programming / Re: How I Crawled Leads From Jiji by DataMina: 10:36am On Oct 25, 2023
GOSPELTRUTH31:
How can I make money from this?
How it works: clients on Upwork or Fiverr often make requests to scrape specific websites or to collect leads for cold email marketing.
Programming / How I Crawled Leads From The Most Difficult Nigerian Site by DataMina: 6:58pm On Oct 24, 2023
Hello Nairalanders,

I want to share my experience in crawling leads from Jiji. As a web scraping enthusiast, I've tackled various sites, but this one proved to be difficult. This is because the phone numbers are on the product detail pages and you have to be logged in to see them. Another issue I faced was that the website is loaded with JavaScript and uses infinite scrolling.

I was able to get around these roadblocks by using Selenium to log in to the site with my personal details, after which I saved the cookies as a JSON file. Because the site uses infinite scrolling, I first scraped all the sellers' URL links to a CSV file. I then read the links back from that file so I could crawl the site with them.
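The cookie-saving and infinite-scroll steps described above can be sketched roughly as follows. This is only an illustration, not the code from the repo: the helper names (`save_cookies`, `scroll_until_stable`, `save_links`) and the `url` column name are made up for the example, and `driver` is assumed to be a Selenium WebDriver that is already logged in. Only the standard WebDriver methods `get_cookies()` and `execute_script()` are used.

```python
import csv
import json
import time


def save_cookies(driver, path):
    """Write the logged-in session's cookies to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(driver.get_cookies(), f, indent=2)


def scroll_until_stable(driver, pause=1.5, max_rounds=100):
    """Scroll to the bottom repeatedly until the page height stops growing."""
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_rounds):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the page time to load the next batch
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded, so we have reached the end
        last_height = new_height
    return last_height


def save_links(links, path):
    """Save de-duplicated links (order preserved) to a one-column CSV."""
    unique = list(dict.fromkeys(links))
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["url"])  # hypothetical column name
        writer.writerows([u] for u in unique)
    return len(unique)
```

Collecting the seller links themselves depends on the page's markup, so that part is left out here.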

Remember that my login session had been saved as cookies in a JSON file. When crawling the site again using the URLs, I applied the cookies from that file. The crawler visited each seller's detail page using the saved URL links and clicked the "Show Contact" button to extract the contact information.
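A rough sketch of this second pass, under some assumptions: `driver` is a Selenium WebDriver, the CSV has a `url` column, and `extract_contact` is a hypothetical callback where the "Show Contact" click and the text extraction would live (the actual selector depends on the site's markup, so it is not guessed here). Note that Selenium only accepts cookies for the domain currently loaded, which is why the base URL is opened first.

```python
import csv
import json


def apply_cookies(driver, cookie_path, base_url):
    """Restore a saved session by loading cookies from a JSON file."""
    driver.get(base_url)  # cookies can only be added for the current domain
    with open(cookie_path, encoding="utf-8") as f:
        cookies = json.load(f)
    for cookie in cookies:
        driver.add_cookie(cookie)
    driver.refresh()  # reload so the page picks up the logged-in session
    return len(cookies)


def read_links(csv_path):
    """Read the saved seller links back from the CSV file."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return [row["url"] for row in csv.DictReader(f) if row.get("url")]


def crawl(driver, links, extract_contact):
    """Visit each saved link and collect whatever extract_contact returns."""
    results = []
    for url in links:
        driver.get(url)
        results.append({"url": url, "contact": extract_contact(driver)})
    return results
```

Keeping the click-and-extract logic in a separate callback means the loop stays the same if the button or page layout changes.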

Retailers specializing in second-hand items like laptops can combine this approach with analytics on the scraped data to spot mouth-watering deals faster than a regular visitor.

You can check out the code for executing the project in my github repo:
https://github.com/StephDAnalyst/JijiLeadScraping

1 Like

Programming / How I Crawled Leads From Jiji by DataMina: 6:50pm On Oct 24, 2023
Hello Nairalanders,

I want to share my experience in crawling leads from Jiji. As a web scraping enthusiast, I've tackled various sites, but this one proved to be difficult. This is because the phone numbers are on the product detail pages and you have to be logged in to see them. Another issue I faced was that the website is loaded with JavaScript and uses infinite scrolling.

I was able to get around these roadblocks by using Selenium to log in to the site with my personal details, after which I saved the cookies as a JSON file. Because the site uses infinite scrolling, I first scraped all the sellers' URL links to a CSV file. I then read the links back from that file so I could crawl the site with them.

Remember that my login session had been saved as cookies in a JSON file. When crawling the site again using the URLs, I applied the cookies from that file. The crawler visited each seller's detail page using the saved URL links and clicked the "Show Contact" button to extract the contact information.

Retailers specializing in second-hand items like laptops can combine this approach with analytics on the scraped data to spot mouth-watering deals faster than a regular visitor.

You can check out the code for executing the project in my github repo: https://github.com/StephDAnalyst/JijiLeadScraping

1 Like

Programming / Re: How I Built A Nairaland Web Scraper by DataMina: 2:39pm On Oct 24, 2023
alterego17:


Bro, can you help with getting leads for affiliate adverts in a targeted niche?
Sure, it's very much doable. I have written a scraper that scrapes leads/phone numbers from Jiji. The project is on my GitHub, and I will be posting how I did it here very soon.
Programming / Re: Who Can Use Matlab And Simulink To Solve Process Control Problems? by DataMina: 12:32pm On Oct 20, 2023
What kind of simulation do you want to do?
Programming / Re: How I Built A Nairaland Web Scraper by DataMina: 8:09pm On Oct 19, 2023
landiqa:
Can you develop one that scrapes for phone numbers and validates them for WhatsApp?
I can scrape Jiji for phone numbers, but validation on WhatsApp is something I don't know about.

1 Like

Programming / Re: How I Built A Nairaland Web Scraper by DataMina: 5:41pm On Oct 19, 2023
Felixitie:


Though, it seems the page loads dynamically, which makes it hard for Bs4 to get the data out. Selenium can load the page and render the JavaScript, and then you can use Bs4 to soup it and get the stuff (a combination of Selenium and Bs4). Scrapy works easily too.

You can as well grab all the front-page topic links first and then loop through them using Bs4 to get all the data points, to improve the speed.

You have done so well.

Can we work on a portfolio project together using Scrapy with Splash or Scrapy with Playwright to generate leads, then dump it into a database, plus scheduling using Airflow?
I was trying to use Scrapy with Playwright, and it turned out that Playwright didn't work well on Windows for me. I tried virtualizing with WSL2, yet it still didn't work. So I decided to stick with Selenium until I can lay hands on a Mac or Linux PC.

My WhatsApp is zero813six3six5six03

1 Like

Programming / Re: How I Built A Nairaland Web Scraper by DataMina: 1:25pm On Oct 19, 2023
Cheryl463337:
Why can't you use normal requests and BeautifulSoup, which I believe will be faster than Selenium?
I was just experimenting with it. When I tried using Octoparse (a badass no-code tool) to scrape the Nairaland website, it didn't work because the site didn't appear structured. So I decided to experiment with Selenium, and it worked.
Programming / Re: How I Built A Nairaland Web Scraper by DataMina: 12:00pm On Oct 19, 2023
Babangidapikin:

Okay. By the way, can you visualize the data?
Integrating the data into Power BI allows you to do that. With BI tools you can even do data refresh.

1 Like

Programming / Re: How I Built A Nairaland Web Scraper by DataMina: 11:59am On Oct 19, 2023
It is very much doable, but you know you have to contend with CAPTCHAs.
Programming / Re: How I Built A Nairaland Web Scraper by DataMina: 11:57am On Oct 19, 2023
airsaylongcome:
OP,

Is it possible to do the "impossible"? Scrape the entire site's content and dump it in an LLM?
Programming / Re: How I Built A Nairaland Web Scraper by DataMina: 11:35am On Oct 19, 2023
I know. I have worked on a scraper that scrapes emails from LinkedIn using Google search, after which I used the stringr library to extract emails from the text. I get your point though...
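The email-extraction step can be illustrated with a small regex sketch. The post mentions stringr (an R package); the version below swaps in Python's `re` module instead, and the pattern is a deliberately simple assumption that will not cover every valid address. The function name `extract_emails` is made up for the example.

```python
import re

# A simple pattern: local part, "@", domain, and a TLD of at least two letters.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text):
    """Return unique, lower-cased email addresses in order of first appearance."""
    found = []
    for match in EMAIL_RE.findall(text):
        email = match.lower()
        if email not in found:
            found.append(email)
    return found
```

For example, `extract_emails("Reach jane.doe@example.com or sales@shop.ng.")` picks out both addresses and leaves off the trailing full stop.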
Programming / Re: How I Built A Nairaland Web Scraper by DataMina: 11:30am On Oct 19, 2023
I use a different trick to do it. If you need the service you can let me know
Programming / How I Built A Nairaland Web Scraper by DataMina: 11:20am On Oct 19, 2023
Hello everyone. I am a data analyst who enjoys web scraping. Nairaland has been my go-to place to keep up with trending news aside from Twitter, and being a lover of this platform, I decided to build a web scraper with Selenium and Python.

This scraper is designed to extract the thread topics and the counts of views, users, and guests from Nairaland.

The scraped data can be used to develop a Power BI report that can be refreshed each time scraping is done. The published dashboard can be used to monitor the threads that generate the highest and lowest views. You can check out the project in my GitHub repository: https://github.com/StephDAnalyst/Nairaland
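One way to make the data refreshable is to append each scrape to a single CSV with a timestamp, so a BI tool pointed at that file sees the new rows on refresh. The sketch below is only an assumption about how this could be done, not the repo's code: the function name `append_snapshot`, the column names, and the file layout are all hypothetical.

```python
import csv
import os
from datetime import datetime, timezone

# Hypothetical column layout for one scraped thread.
FIELDS = ["scraped_at", "topic", "views", "users", "guests"]


def append_snapshot(rows, path, scraped_at=None):
    """Append scraped rows to a CSV, stamping each with the scrape time."""
    stamp = scraped_at or datetime.now(timezone.utc).isoformat(timespec="seconds")
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()  # write the header only once
        for row in rows:
            writer.writerow({"scraped_at": stamp, **row})
    return len(rows)
```

Each row is expected to be a dict with `topic`, `views`, `users`, and `guests` keys; keeping the timestamp column lets the report compare views across scrapes.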

3 Likes 3 Shares
