Making A Crude Website API - Programming - Nairaland


Making A Crude Website API by lordZOUGA(m): 3:05pm On Jul 21, 2012
Say there is a site like Nairaland that has no API, and I really want to make a client for it. Since HTML is just markup, I decide to parse the HTML, pull whatever data I want out of the markup, put it into XML, and then use it for whatever evil stuff I have in mind. What restrictions am I likely to encounter?
Re: Making A Crude Website API by Chimanet(m): 3:40pm On Jul 21, 2012
Before you process Nairaland with an XML parser, you have to be sure that Nairaland is XHTML compliant, or else your parser will keep throwing exceptions.
Re: Making A Crude Website API by lordZOUGA(m): 4:06pm On Jul 21, 2012
Chimanet: Before you process Nairaland with an XML parser, you have to be sure that Nairaland is XHTML compliant, or else your parser will keep throwing exceptions.
assuming I was able to parse it successfully
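One way around the XHTML concern is to use a tolerant HTML parser instead of a strict XML one. The sketch below uses Python's standard `html.parser`, which does not require well-formed markup, and wraps the extracted text in XML. The `<div class="post">` markup and the `PostExtractor` and `posts_to_xml` names are assumptions made for illustration, not Nairaland's actual structure.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class PostExtractor(HTMLParser):
    """Collects the text inside every <div class="post"> (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.posts = []      # finished post texts
        self._depth = 0      # > 0 while inside a post <div>
        self._chunks = []    # text pieces of the post being read

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self._depth:
            self._depth += 1          # nested <div> inside a post
        elif "post" in (dict(attrs).get("class") or "").split():
            self._depth = 1           # entering a post
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0:      # leaving the post: normalise whitespace
                self.posts.append(" ".join("".join(self._chunks).split()))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def posts_to_xml(html):
    """Parse possibly malformed HTML and return the posts as an XML string."""
    parser = PostExtractor()
    parser.feed(html)
    parser.close()
    root = ET.Element("posts")
    for text in parser.posts:
        ET.SubElement(root, "post").text = text
    return ET.tostring(root, encoding="unicode")
```

Unclosed tags such as a stray `<b>` or `<br>` inside a post do not raise here; they are simply skipped while the text is collected.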
Re: Making A Crude Website API by lordZOUGA(m): 4:51pm On Jul 21, 2012
Maybe I let the XML converter software reside on a proxy server. I make requests to it, it makes the request to the site in question, strips out the useless data, wraps the rest in XML and passes it to my client. I can always use any user agent, even a mobile browser's user-agent, to reduce the size of the data to parse.
Example request for a user's profile on NL, assuming I have access to all the headers:
GET http://nairaland.com/user HTTP/1.1
Accept: */*
Host: www.nairaland.com
User-Agent: mozilla.....

Will this work?
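A minimal sketch of how the proxy side could issue such a request, using Python's standard `urllib.request`. The `build_request` and `fetch` names and the user-agent value passed in are placeholders for illustration; nothing here is specific to Nairaland.

```python
import urllib.request


def build_request(url, user_agent):
    """Build (but do not send) a GET request with an explicit User-Agent."""
    return urllib.request.Request(
        url,
        headers={"User-Agent": user_agent, "Accept": "*/*"},
        method="GET",
    )


def fetch(url, user_agent, timeout=10):
    """Send the request and return the response body as text."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        # Fall back to UTF-8 if the server does not declare a charset.
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

The returned text would then be handed to whatever parser strips the page down before wrapping it in XML.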
Re: Making A Crude Website API by Nobody: 5:19pm On Jul 21, 2012
With Nairaland undergoing continuous updates, you can bet your code will break every 2-3 days.
Expect this as long as seun breathes.
Re: Making A Crude Website API by lordZOUGA(m): 5:26pm On Jul 21, 2012
webdezzi: With Nairaland undergoing continuous updates, you can bet your code will break every 2-3 days.
Expect this as long as seun breathes.
So the only problem I'll have is if the webmaster updates the site?
Re: Making A Crude Website API by lordZOUGA(m): 5:36pm On Jul 21, 2012
webdezzi: With Nairaland undergoing continuous updates, you can bet your code will break every 2-3 days.
Expect this as long as seun breathes.
So the only problem I'll have is if the webmaster updates the site? But then he is more likely to be adding new features to his site than changing the name of the <div> that contains the posts' data.
Re: Making A Crude Website API by Chimanet(m): 5:37pm On Jul 21, 2012
It's quite an unrealizable project, my dear. If it were feasible, Nairaland would easily have brought out an API long ago. Nairaland was not designed as a RESTful web service from scratch; it's more of a usual website architecture.
Re: Making A Crude Website API by lordZOUGA(m): 5:46pm On Jul 21, 2012
Chimanet: It's quite an unrealizable project, my dear. If it were feasible, Nairaland would easily have brought out an API long ago. Nairaland was not designed as a RESTful web service from scratch; it's more of a usual website architecture.
So you understand what I mean, and now you wonder why it hasn't been done. I don't have to care about the underlying architecture, as all my client needs is to know how the data is marked up.
Re: Making A Crude Website API by worldbest(m): 7:32pm On Jul 21, 2012
What you want to do is screen scraping, and it's very possible. I recently scraped NL's homepage to confirm a user's session in one of my projects. One of the problems with this approach is the markup changing. I don't think that would happen often, though. But then, if seun finds out that you are eating up his bandwidth, he might just ban your crawlers.
Re: Making A Crude Website API by lordZOUGA(m): 10:51pm On Jul 21, 2012
worldbest: What you want to do is screen scraping, and it's very possible. I recently scraped NL's homepage to confirm a user's session in one of my projects. One of the problems with this approach is the markup changing. I don't think that would happen often, though. But then, if seun finds out that you are eating up his bandwidth, he might just ban your crawlers.
Scraping... hmm, quite a befitting name. How can the webmaster find out? For all the server knows, I'm browsing with my Mozilla. NL is just an example; I don't plan on using it on NL.
Re: Making A Crude Website API by worldbest(m): 12:52am On Jul 22, 2012
How the server finds out depends on how often your crawlers visit. Even if you use a Mozilla user-agent, the admin might suspect you if they see that an IP is visiting too fast or too often, and they may be able to check how much bandwidth your crawler's IP has used up. Some websites do this automatically: if your crawler is irresponsible enough to crawl pages without adequate delays, they may block it by redirecting to a captcha page. Nothing beats an API.
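As a rough illustration of "adequate delays", the sketch below wraps any fetch function so that consecutive requests are spaced at least `min_delay` seconds apart. The `PoliteFetcher` name and the 5-second default are assumptions chosen for the example, not a figure any site publishes.

```python
import time


class PoliteFetcher:
    """Wraps a fetch function and enforces a minimum gap between calls."""

    def __init__(self, fetch, min_delay=5.0, clock=time.monotonic, sleep=time.sleep):
        self._fetch = fetch          # callable taking a URL, returning the body
        self._min_delay = min_delay  # seconds required between two requests
        self._clock = clock          # injectable so the delay can be tested
        self._sleep = sleep
        self._last = None            # time of the previous request, if any

    def get(self, url):
        if self._last is not None:
            wait = self._min_delay - (self._clock() - self._last)
            if wait > 0:
                self._sleep(wait)    # pause until the gap has elapsed
        self._last = self._clock()
        return self._fetch(url)
```

A crawler would create one `PoliteFetcher` per site and route every page request through its `get` method.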
Re: Making A Crude Website API by lordZOUGA(m): 1:21am On Jul 22, 2012
worldbest: How the server finds out depends on how often your crawlers visit. Even if you use a Mozilla user-agent, the admin might suspect you if they see that an IP is visiting too fast or too often, and they may be able to check how much bandwidth your crawler's IP has used up. Some websites do this automatically: if your crawler is irresponsible enough to crawl pages without adequate delays, they may block it by redirecting to a captcha page. Nothing beats an API.
Okay, this makes sense. But the webmaster is aware, so I don't have to worry.
Re: Making A Crude Website API by worldbest(m): 2:08am On Jul 22, 2012
Since the webmaster is aware, that's fine. Just run a test on the page regularly to be sure the markup hasn't changed.
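Such a regular test could be as simple as checking that the fragments the scraper depends on still appear in the page source. This is only a sketch: `missing_markers` and `check_markup` are hypothetical helpers, and the marker strings in the usage note are made up.

```python
def missing_markers(html, markers):
    """Return the markers that no longer appear in the page source."""
    return [marker for marker in markers if marker not in html]


def check_markup(html, markers):
    """Raise if any expected marker is gone, so a scheduled run fails loudly."""
    missing = missing_markers(html, markers)
    if missing:
        raise RuntimeError("markup changed, missing: " + ", ".join(missing))
```

For instance, `check_markup(page_source, ['class="post"'])` would raise as soon as that attribute disappears from the page.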
Re: Making A Crude Website API by lordZOUGA(m): 6:20am On Jul 22, 2012
worldbest: Since the webmaster is aware, that's fine. Just run a test on the page regularly to be sure the markup hasn't changed.
okay bosss. Thanks
Re: Making A Crude Website API by Lisa1: 8:06am On Jul 23, 2012
Well
Re: Making A Crude Website API by ektbear: 8:15am On Jul 23, 2012
It will be pretty annoying unless the website itself provides one. You'll have to replicate lots of NL functionality, write a scraper, etc.
Re: Making A Crude Website API by lordZOUGA(m): 10:14am On Jul 23, 2012
ekt_bear: It will be pretty annoying unless the website itself provides one. You'll have to replicate lots of NL functionality, write a scraper, etc.
You mean like going through links and figuring out how the requests are constructed?
Re: Making A Crude Website API by netesy(m): 11:10pm On Jul 23, 2012
hmm learning
Re: Making A Crude Website API by ektbear: 2:11am On Jul 25, 2012
lordZOUGA:
You mean like going through links and figuring out how the requests are constructed?

What actions do you want your API to allow? If it's read-only operations, then you can do it. But write operations are more difficult.
