Welcome, Guest: Register On Nairaland / LOGIN! / Trending / Recent / NewStats: 3,150,564 members, 7,809,059 topics. Date: Thursday, 25 April 2024 at 10:03 PM |
Nairaland Forum / Science/Technology / Programming / Another Programming Challenge (2199 Views)
Programming Challenge For Beginners Competition Two N20000 -SEASON 2- / Programming Challenge For Beginners N20000 / Facebook Programming Challenge Question (2) (3) (4)
Another Programming Challenge by ektbear: 12:10pm On Sep 27, 2012 |
You are given a very large text file, too big to load all of it in memory. It has N lines in it, where N is unknown to you. You are also given some number K. Write a program that prints K randomly chosen lines from the text file to the screen. Try to make your program as efficient as possible. (there is a slight bit of vagueness in what I mean by "K randomly chosen lines", but don't worry too much about it). |
Re: Another Programming Challenge by Javanian: 12:32pm On Sep 27, 2012 |
If k =3 and the file contains 100 lines...should the printed lines follow each other e.g line 50,51 and 52 or it can be any 3 random lines e.g. 3,45, 82 ?? |
Re: Another Programming Challenge by ektbear: 12:45pm On Sep 27, 2012 |
The latter. |
Re: Another Programming Challenge by Javanian: 8:13pm On Sep 27, 2012 |
|
Re: Another Programming Challenge by ektbear: 8:16pm On Sep 27, 2012 |
//read each line from the file s = in.readLine(); while(s!=null) { list.add(s); s = in.readLine(); } in.close(); This above loads the entire file into memory ( the ArrayList list()). If the file is bigger than your memory size, then this won't work. Assume the file is several terabytes or something. |
Re: Another Programming Challenge by ektbear: 11:01pm On Sep 27, 2012 |
At a high level, this is the idea: 1. Sweep through file line by line to compute N 2. Select some subset of K integers from the numbers 1 through N 3. Sweep through the file again, adding the corresponding line numbers to the file. |
Re: Another Programming Challenge by ektbear: 12:44am On Sep 28, 2012 |
too boring, I take it? |
Re: Another Programming Challenge by PrinceNN(m): 7:07am On Sep 28, 2012 |
Am On it...expect my submission the moment I arrive my destination..no internet here, just my phone... |
Re: Another Programming Challenge by ektbear: 8:01am On Sep 28, 2012 |
Another small one: 1. Write a function that determines if a number is prime or not (try to use an algorithm at least slightly better than the obvious one that just tests all factors). 2. If you plan on calling this function many times with many repeated values, how would you suggest speeding up the performance? |
Re: Another Programming Challenge by Javanian: 8:36am On Sep 28, 2012 |
^ i suggest you remove this one until some one comes up with a solution to the former...so that we dont derail the thread... For the former can i read it in parts, lets say 100kb until the file is exhausted? |
Re: Another Programming Challenge by lordZOUGA(m): 10:37am On Sep 28, 2012 |
ekt_bear: At a high level, this is the idea:sweeping through a file of unknown length line by line?? Even if the file is say up to 1terabyte?? Alot of vagueness in your question though.... Besides I wouldn't append index to the file... Why? Because there would be no way to jump to it since you cannot load all the file into memory at once. A buffer is a wise choice though... |
Re: Another Programming Challenge by Javanian: 5:30pm On Sep 28, 2012 |
|
Re: Another Programming Challenge by ektbear: 8:48pm On Sep 28, 2012 |
lordZOUGA:If there are N lines, this will take O(N) time, won't it? Plus most languages will do some sort of buffered I/O which reads big blocks of the file at a time, no? |
Re: Another Programming Challenge by Javanian: 8:56pm On Sep 28, 2012 |
@ekt_bear did i get it right?? |
Re: Another Programming Challenge by ektbear: 9:02pm On Sep 28, 2012 |
Hmm...I think you need to have a function countLines() that first computes N. Once you have this, then you generate a list of the K row numbers you'll need to grab. Then you grab them. |
Re: Another Programming Challenge by Javanian: 9:23pm On Sep 28, 2012 |
i dont think i need an extra function to compute N |
Re: Another Programming Challenge by ektbear: 9:25pm On Sep 28, 2012 |
If the file is 4 terabyes, you've just created a list of size 4 TB. Which will not fit in your computer's memory. |
Re: Another Programming Challenge by Javanian: 9:29pm On Sep 28, 2012 |
ok, then i will seat back and wait for a solution 1 Like |
Re: Another Programming Challenge by mkwayisi: 9:38pm On Sep 28, 2012 |
Javanian: ok, then i will seat back and wait for a solutionAnd there you have it... implemented in C#
|
Re: Another Programming Challenge by ektbear: 9:41pm On Sep 28, 2012 |
goto end; // The more they hate you, the more I love you! lmao! |
Re: Another Programming Challenge by Javanian: 9:41pm On Sep 28, 2012 |
while (!sr.EndOfStream) { You are doing the samething i did |
Re: Another Programming Challenge by Javanian: 9:41pm On Sep 28, 2012 |
ekt_bear: goto end; // The more they hate you, the more I love you! dont mind him |
Re: Another Programming Challenge by ektbear: 9:43pm On Sep 28, 2012 |
I think your algorithm works, mkwayisi. However, the splitting the lines into groups stuff is not something I mentioned...but can of course be corrected pretty easily. |
Re: Another Programming Challenge by ektbear: 9:44pm On Sep 28, 2012 |
Javanian:while (!sr.EndOfStream) { Right, but he just counts the number of lines to compute N. He doesn't store the line in memory like you do with your list. |
Re: Another Programming Challenge by mkwayisi: 9:48pm On Sep 28, 2012 |
ekt_bear: I think your algorithm works, mkwayisi. However, the splitting the lines into groups stuff is not something I mentioned...but can of course be corrected pretty easily.I introduced it just to make the result *more* random |
Re: Another Programming Challenge by ektbear: 9:49pm On Sep 28, 2012 |
Eh...that makes it less random, not more random. Well, it depends on how you define random I guess. |
Re: Another Programming Challenge by mkwayisi: 9:53pm On Sep 28, 2012 |
My definition of "random": Suppose you are to select 3 random numbers from set P, where P = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, R = {2, 5, 9} is *more* random logically (but certainly not academically) than R = {1, 2, 3} |
Re: Another Programming Challenge by ektbear: 9:56pm On Sep 28, 2012 |
That typically isn't how people define a random subset of size 3 from N objects. Basically, there are N choose 3 such subsets. And when people say "random", then mean that each of these subsets are equally likely. Otoh, you seem to be throwing away subsets such as R={1,2,3}. Anyway, not a big deal...easy to fix |
Re: Another Programming Challenge by PrinceNN(m): 12:58am On Sep 29, 2012 |
Done in Java http://pastebin.com/znMpn3TH Pass the file path as d only arguement Java fileClass.java c:/file.txt |
Re: Another Programming Challenge by mkwayisi: 8:42am On Sep 29, 2012 |
₱®ÌИСΞ: With your implementation, you did not eliminate the possibility of the same number being generated multiple times. So if your random numbers generated are say {1, 1, 2, 2, 3}, you'd end up displaying same lines multiple times as well. |
Re: Another Programming Challenge by PrinceNN(m): 12:38pm On Sep 29, 2012 |
mkwayisi: Hahaha lol...thnx for dat...I imagined the possibility of dat occuring in a HUGE file was near impossible so I guess my mind didn't go there...lemme fix it |
Re: Another Programming Challenge by lordZOUGA(m): 3:57pm On Sep 29, 2012 |
ekt_bear:it will take O(N)... This would be nice but your algorithm suggested doing it twice when in my opinion, once would be enough. Yea most languages has internal buffer... C++ has the internal buffer object stream_buf that is created for every I/O operation |
Your Advice For A Beginner Database Administrator? / Lack Of Startup Ideas / Please How Can I Download A Torrent File From Freetutorial.us
(Go Up)
Sections: politics (1) business autos (1) jobs (1) career education (1) romance computers phones travel sports fashion health religion celebs tv-movies music-radio literature webmasters programming techmarket Links: (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) Nairaland - Copyright © 2005 - 2024 Oluwaseun Osewa. All rights reserved. See How To Advertise. 43 |