Making A Crude Website API by lordZOUGA(m): 3:05pm On Jul 21, 2012 |
For example, take a site like Nairaland that has no API, and I really want to make a client for it. Since HTML is just markup, I decide to parse the HTML, pull whatever data I want out of the markup, put it into XML, and then use it for any evil stuff I want to do with it. What restrictions am I likely to encounter? |
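A rough sketch of the idea in the post above, using only the Python standard library. It assumes the posts sit in `<div class="post">` elements; that class name, and the `<posts>`/`<post>` output layout, are made up for illustration and are not taken from Nairaland's real markup.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class PostExtractor(HTMLParser):
    """Collects the text of every <div class="post"> (a hypothetical class name)."""

    def __init__(self):
        super().__init__()
        self.posts = []      # finished post texts
        self._depth = 0      # > 0 while inside a post <div>
        self._chunks = []    # text pieces of the post being read

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self._depth:
            self._depth += 1          # nested <div> inside a post
        elif "post" in (dict(attrs).get("class") or "").split():
            self._depth = 1           # entering a post <div>
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0:      # leaving the post <div>
                # Collapse runs of whitespace into single spaces.
                self.posts.append(" ".join("".join(self._chunks).split()))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def html_to_xml(html):
    """Parse HTML and wrap the extracted posts in a small XML document."""
    extractor = PostExtractor()
    extractor.feed(html)
    root = ET.Element("posts")
    for text in extractor.posts:
        ET.SubElement(root, "post").text = text
    return ET.tostring(root, encoding="unicode")
```

The client then only has to understand the small XML document, not the full page. The extractor is the one part that has to know how the site marks up its data.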
Re: Making A Crude Website API by Chimanet(m): 3:40pm On Jul 21, 2012 |
Before you process Nairaland with an XML parser, you have to be sure that Nairaland is XHTML-compliant, or else your parser will keep throwing exceptions. |
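To illustrate the point, here is a small sketch comparing a strict XML parser with Python's lenient `html.parser` on markup that is fine as HTML but not well-formed XML. The sample markup and the helper names are invented for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Typical "tag soup": unclosed <p> and <br>, fine for browsers, invalid as XML.
SLOPPY_HTML = "<html><body><p>First post<br><p>Second post</body></html>"


def parses_as_xml(markup):
    """Return True only if the markup is well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TextCollector(HTMLParser):
    """Lenient HTML parser that just gathers the non-empty text nodes."""

    def __init__(self):
        super().__init__()
        self.texts = []

    def handle_data(self, data):
        if data.strip():
            self.texts.append(data.strip())


def extract_text(markup):
    """Run the lenient parser and return the collected text nodes."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return collector.texts
```

So an HTML-aware parser sidesteps the exception problem, at the cost of having to track the document structure yourself.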
Re: Making A Crude Website API by lordZOUGA(m): 4:06pm On Jul 21, 2012 |
Chimanet: Before you process Nairaland with an XML parser, you have to be sure that Nairaland is XHTML-compliant, or else your parser will keep throwing exceptions.
Assuming I was able to parse it successfully... |
Re: Making A Crude Website API by lordZOUGA(m): 4:51pm On Jul 21, 2012 |
Maybe I decide to let the XML converter software reside on a proxy server. I make requests to it; it makes the request to the site in question, strips out the useless data, wraps the rest in XML and passes it to my client. I can always use any user agent, even a mobile browser's user-agent, to reduce the size of the data to parse. Example request for a user's profile on NL, assuming I have access to all the headers:
GET http://nairaland.com/user HTTP/1.1
Accept: */*
Host: www.nairaland.com
User-Agent: mozilla.....
Will this work? |
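A minimal sketch of how the proxy side could build such a request with Python's `urllib.request`. The `MOBILE_UA` string is a made-up placeholder, not a real browser's user-agent, and the `/user` path is simply the one from the example request above.

```python
import urllib.request

# Hypothetical mobile user-agent string; a real browser's string could be used instead.
MOBILE_UA = "Mozilla/5.0 (Linux; Android 10; Mobile) ExampleClient/0.1"


def build_request(url, user_agent=MOBILE_UA):
    """Prepare a GET request that identifies itself with the given user-agent."""
    return urllib.request.Request(
        url,
        headers={"User-Agent": user_agent, "Accept": "*/*"},
        method="GET",
    )


def fetch(url, timeout=10):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

Calling `fetch("http://www.nairaland.com/user")` would perform the actual network request; the returned HTML would then go to whatever code strips it down and wraps it in XML.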
Re: Making A Crude Website API by Nobody: 5:19pm On Jul 21, 2012 |
With Nairaland undergoing continuous updates, you can bet your code will break every 2-3 days. Expect this as long as Seun breathes. |
Re: Making A Crude Website API by lordZOUGA(m): 5:26pm On Jul 21, 2012 |
webdezzi: With Nairaland undergoing continuous updates, you can bet your code will break every 2-3 days.
So the only problem I'll have is if the webmaster updates the site? |
Re: Making A Crude Website API by lordZOUGA(m): 5:36pm On Jul 21, 2012 |
webdezzi: With Nairaland undergoing continuous updates, you can bet your code will break every 2-3 days.
So the only problem I'll have is if the webmaster updates the site? But then he is more likely to add new features to his site than to change the name of the <div> that contains the posts' data. |
Re: Making A Crude Website API by Chimanet(m): 5:37pm On Jul 21, 2012 |
It's quite an unrealizable project, my dear. If it were feasible, Nairaland would easily have brought out an API long ago. Nairaland was not designed as a RESTful web service from scratch; it's more of a usual website architecture. |
Re: Making A Crude Website API by lordZOUGA(m): 5:46pm On Jul 21, 2012 |
Chimanet: It's quite an unrealizable project, my dear. If it were feasible, Nairaland would easily have brought out an API long ago. Nairaland was not designed as a RESTful web service from scratch; it's more of a usual website architecture.
So you understand what I mean, and now you wonder why it hasn't been done. I don't have to care about the underlying architecture, as all my client needs is to know how the data is marked up. |
Re: Making A Crude Website API by worldbest(m): 7:32pm On Jul 21, 2012 |
What you want to do is screen scraping, and it's very possible. I recently scraped NL's homepage to confirm a user's session in one of my projects. One of the problems with this approach is the markup changing. I don't think that would happen often, though. But then, if Seun finds out that you are eating up his bandwidth, he might just ban your crawlers. |
Re: Making A Crude Website API by lordZOUGA(m): 10:51pm On Jul 21, 2012 |
worldbest: What you want to do is screen scraping, and it's very possible. I recently scraped NL's homepage to confirm a user's session in one of my projects. One of the problems with this approach is the markup changing. I don't think that would happen often, though. But then, if Seun finds out that you are eating up his bandwidth, he might just ban your crawlers.
Scraping... Hmmm, quite a befitting name. How can the webmaster find out? For all the server knows, I am browsing with my Mozilla. NL is just an example; I don't plan on using it on NL. |
Re: Making A Crude Website API by worldbest(m): 12:52am On Jul 22, 2012 |
How the server finds out depends on how often your crawlers visit. Even if you use a Mozilla user-agent, the admin might suspect you if they see that an IP is visiting too fast or too often, and they may be able to check how much bandwidth your crawler's IP has used up. Some websites do this automatically: if your crawler is irresponsible enough to crawl pages without adequate delays, they may block it by redirecting to a captcha page. Nothing beats an API. |
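The "adequate delays" point could look something like the sketch below: a small throttle that enforces a minimum gap between requests. The class and function names are invented for the example, and the right delay is an assumption that depends on the site. The clock and sleep functions are injectable so the logic can be tested without actually waiting.

```python
import time


class Throttle:
    """Enforces a minimum delay between consecutive requests."""

    def __init__(self, min_delay, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay
        self._clock = clock
        self._sleep = sleep
        self._last = None   # time of the previous request, if any

    def wait(self):
        """Block until at least min_delay seconds have passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_delay - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def crawl(urls, fetch, throttle):
    """Fetch each URL in turn, pausing between requests."""
    pages = []
    for url in urls:
        throttle.wait()
        pages.append(fetch(url))
    return pages
```

For instance, `crawl(urls, fetch, Throttle(5.0))` would leave at least five seconds between requests, where `fetch` is whatever function downloads a page.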
Re: Making A Crude Website API by lordZOUGA(m): 1:21am On Jul 22, 2012 |
worldbest: How the server finds out depends on how often your crawlers visit. Even if you use a Mozilla user-agent, the admin might suspect you if they see that an IP is visiting too fast or too often, and they may be able to check how much bandwidth your crawler's IP has used up. Some websites do this automatically: if your crawler is irresponsible enough to crawl pages without adequate delays, they may block it by redirecting to a captcha page. Nothing beats an API.
Okay. This makes sense. But the webmaster is aware, so I don't have to worry. |
Re: Making A Crude Website API by worldbest(m): 2:08am On Jul 22, 2012 |
Since the webmaster is aware, that's fine. Just run a test on the page regularly to be sure the markup hasn't changed. |
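One possible shape for such a regular test, as a sketch: check that the CSS classes the scraper relies on still appear in the page. The class names passed in are whatever your scraper depends on; the ones in the usage note are hypothetical.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Counts how often each CSS class appears in the page."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        for name in (dict(attrs).get("class") or "").split():
            self.counts[name] = self.counts.get(name, 0) + 1


def missing_markers(html, expected_classes):
    """Return the expected class names that no longer appear in the page."""
    counter = ClassCounter()
    counter.feed(html)
    return [name for name in expected_classes if name not in counter.counts]
```

Running `missing_markers(page_html, ["post", "author"])` on a schedule and alerting whenever the result is non-empty would flag a markup change before the client starts returning garbage.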
Re: Making A Crude Website API by lordZOUGA(m): 6:20am On Jul 22, 2012 |
worldbest: Since the webmaster is aware, that's fine. Just run a test on the page regularly to be sure the markup hasn't changed.
Okay boss. Thanks. |
Re: Making A Crude Website API by Lisa1: 8:06am On Jul 23, 2012 |
Well |
Re: Making A Crude Website API by ektbear: 8:15am On Jul 23, 2012 |
It will be pretty annoying unless the website itself provides one. You'll have to replicate lots of NL functionality, write a scraper, etc. |
Re: Making A Crude Website API by lordZOUGA(m): 10:14am On Jul 23, 2012 |
ekt_bear: It will be pretty annoying unless the website itself provides one. You'll have to replicate lots of NL functionality, write a scraper, etc.
You mean like going through the links and figuring out how the requests are constructed? |
Re: Making A Crude Website API by netesy(m): 11:10pm On Jul 23, 2012 |
hmm learning |
Re: Making A Crude Website API by ektbear: 2:11am On Jul 25, 2012 |
lordZOUGA: What actions do you want your API to allow? If only read operations, then you can do it. But write operations are more difficult. |