Welcome, Guest: Register On Nairaland / LOGIN! / Trending / Recent / NewStats: 3,194,731 members, 7,955,781 topics. Date: Sunday, 22 September 2024 at 02:53 PM |
Nairaland Forum / Science/Technology / Programming / Storing K largest numbers from a stream (2571 Views)
Sorting List Of Numbers And Strings: Simple Puzzle / Microsoft Attempting World's Largest Coding Marathon / Randomising ,storing And Retrieving From Database (2) (3) (4)
Storing K largest numbers from a stream by ektbear: 3:39am On Oct 31, 2012 |
I'm going to see a long stream of numbers. Say, N of them..think N=1 billion, 1 trillion, etc. I want to store the top K largest of these numbers. Think K=10, etc. Design a function/program that given stores the K largest numbers of this stream. Try to make your program/function as efficient as possible. Hint: Essentially, start thinking about how to design a function insert(f) that given a number f, will check whether it needs to be stored in my collection of the K largest numbers and updates this collection appropriately. |
Re: Storing K largest numbers from a stream by Javanian: 8:22am On Oct 31, 2012 |
Is memory a problem? |
Re: Storing K largest numbers from a stream by ektbear: 8:24am On Oct 31, 2012 |
The N numbers could be much bigger than the capacity of your memory. So the simple algorithm one might have in mind of loading all the N numbers into memory, sorting, and picking the K largest won't work, since you don't have space to sort all N items at once. So, you sort of want to process the data sequentially. So a possible design is this: 1. Build some class/data structure that has internal storage for the K largest elements 2. build an insert method for this class that allows you to insert a potential number into the data structure. 3. Then, as you see this stream of N numbers, call your insert function on each new number you see. So in pseudocode: a = MyClass.new(K) for i=1..N a.insert(Next # in stream of numbers) end Something like that. Or more legibly/filled out a bit more, something like this: http://pastebin.com/5xRVQ6iC |
Re: Storing K largest numbers from a stream by lordZOUGA(m): 2:18pm On Oct 31, 2012 |
ekt_bear: I'm going to see a long stream of numbers. Say, N of them..think N=1 billion, 1 trillion, etc.okay, my first reply was kinda hasty. Didn't get that the exercise needed K greater numbers. I will make an array that can hold K greater numbers and another K sized array that will hold the buffer... Both arrays will always be sorted in descending order... If(greater[LAST] >= buffer[FIRST]){ //discard buffer} if(greater[FIRST] <= buffer[LAST]){ //replace greater with buffer } if all the above returns false, I will use a merge() that returns an array with the greatest of both arrays... |
Re: Storing K largest numbers from a stream by Javanian: 8:39pm On Oct 31, 2012 |
|
Re: Storing K largest numbers from a stream by ektbear: 9:04pm On Oct 31, 2012 |
Your Java implementation looks close. However, some comments: 1. In your constructor, shouldn't you pass k as an argument to it? 2. In your constructor, you initialize all the values to zero. This is fine if you know that all of the #s in the stream are bigger than zero. But if they are just arbitrary floating point numbers, then you'd need to do things a bit differently. 3. In your function insert(), since you sort numbers right away, then you only need to compare n to the bottom element of numbers, i.e., numbers[0], right? But you are quite close. |
Re: Storing K largest numbers from a stream by Javanian: 9:06pm On Oct 31, 2012 |
Like i said I'm writing from a mobile phone so i overlooked a lot of things but those should be easy to fix right? |
Re: Storing K largest numbers from a stream by ektbear: 9:08pm On Oct 31, 2012 |
1 and 3 are easy to fix. 2 is also easy to fix, but you'll either need some sort of minimum float value from Java, or (even better) to handle the case when you've only seen K numbers from the stream separately from the case when you've already seen more than K numbers. |
Re: Storing K largest numbers from a stream by ektbear: 10:25pm On Oct 31, 2012 |
Anyway, let's pretend/assume that you fixed those issues. Essentially, you are paying K log(K) time each time you call insert. If K is a small constant or something, this isn't a big deal. However, there is a way to improve this substantially by making the "cost" of each insertion log(K). Have you heard of something called a heap/priority queue? You can make the program much faster if you use a min-heap. |
Re: Storing K largest numbers from a stream by Javanian: 10:41pm On Oct 31, 2012 |
Yeah!...i know of heap and priority queues...i will attempt your suggestion when am on p.c. Although i would have prefered using something like a Binary search tree |
Re: Storing K largest numbers from a stream by ektbear: 10:49pm On Oct 31, 2012 |
Interesting, if you can implement it with a BST that'd be cool. |
Re: Storing K largest numbers from a stream by PrinceNN(m): 8:07am On Nov 01, 2012 |
*peeps in...steals @javanian's implementation for rebranding* hehe |
Re: Storing K largest numbers from a stream by lordZOUGA(m): 6:00pm On Nov 01, 2012 |
modified my post above |
Re: Storing K largest numbers from a stream by ektbear: 8:45am On Nov 04, 2012 |
Here is a Minheap based approach in Ruby: http://pastebin.com/P1uYFJP5 I'll at some point see if the data structures Javanian proposes can also be used to solve the problem. |
Re: Storing K largest numbers from a stream by Javanian: 9:50pm On Nov 04, 2012 |
with a HEAP
|
Re: Storing K largest numbers from a stream by ektbear: 10:35pm On Nov 04, 2012 |
Why not use the java collections heap rather than implementing your own? |
Re: Storing K largest numbers from a stream by ektbear: 10:36pm On Nov 04, 2012 |
Cool that you know how to implement a heap from scratch off the top of your head though |
Re: Storing K largest numbers from a stream by Javanian: 10:51pm On Nov 04, 2012 |
ekt_bear: Why not use the java collections heap rather than implementing your own? Actually, I've been looking for an opportunity to implement it , Thanks for giving me one... |
Re: Storing K largest numbers from a stream by ektbear: 8:23am On Nov 05, 2012 |
Sure. Now, let's see your binary search tree based approach. |
Re: Storing K largest numbers from a stream by ektbear: 12:10am On Nov 09, 2012 |
Javanian, where u dey |
Re: Storing K largest numbers from a stream by kadeerna: 1:21pm On Nov 09, 2012 |
Maybe I don't completely understand what you guys are doing here, but exactly why are you sorting and comparing such a large stream of numbers using less than ( < ) and greater than ( > ) comparators? Have you thought of keeping a moving track of the ex. lengths of each entry instead? |
Re: Storing K largest numbers from a stream by ektbear: 7:55am On Nov 10, 2012 |
not too sure what you mean, kadeerna |
Re: Storing K largest numbers from a stream by kadeerna: 11:26am On Nov 10, 2012 |
In keeping track of the highest K values from a list of N values, - Start at value 1 and store the length of the value converted to a string. - On every other value in the list, check the length stored from 1 above against the length of the current value (converted to a string). - If the length of the current value is greater than the value from 1, update the variable with the current longest found length and save the value to longest[0]. - Continue the same check through other values in the list - Where you find the length of the current value is equal to longest length variable, either do a greater than, less than or equality comparison. Another option might be to check the individual digits ex. longestValue = 122323213254354664323, longestValueLength = 20 current(list) = 1290344355465742345, current(list).length = 20 currentLonger = false while i < 19 or currentLonger == false currentLonger = (longestValue[0] == current(list)[0]) <-- cross check condition. longestValue = (currentLonger) ? current(list) : longestValue; Expanding this to store the last K values will simply modifying the above to use arrays or K instead. This way, you don't store at least 2 data structures of N size. |
Re: Storing K largest numbers from a stream by ektbear: 10:13pm On Nov 10, 2012 |
What is "length of value?" Like, what is the length of the value of say the number 50? |
Re: Storing K largest numbers from a stream by Javanian: 2:33am On Nov 11, 2012 |
@ekt_bear i'm sorry for the late reply there has been shortage of power in my area. O.k. I tried implementing it with a BST it worked but it was no different from when i did it with a heap. It will take O(log2n) time at worst case. Although i have to admit that suggesting a BST at first was wrong, what i had in mind was something like a Binary Heap which will return the maximum item in O(1) time at worst case. But it is more or less useless because I'm not going to be inserting all the items. So i wont be able to take advantage of this... |
Re: Storing K largest numbers from a stream by kadeerna: 11:24am On Nov 11, 2012 |
ekt_bear: What is "length of value?" Length here, is the number of characters in the value when converted to a string. ->>>> ((string)12232111).Length Length of 50 = 2 |
Re: Storing K largest numbers from a stream by Nobody: 6:06pm On Nov 11, 2012 |
Ekt, I'd like to program like you. How do I go about it? I've intermediate programming knowledge. |
Re: Storing K largest numbers from a stream by ektbear: 1:08am On Nov 12, 2012 |
Hi, My suggestion is for you to pick one nice, simple, well-designed scripting language like Python and use it as your main programming language for a year. The nice thing about Python is that it is only a level above pseudo-code...makes it very easy to express your ideas. Register for a programming competition site like spoj (http://www.spoj.pl/), work through the exercises there. You'll learn a lot from doing this. Anyway, I think this would be a good place to start. Let me know if you have any questions. |
Re: Storing K largest numbers from a stream by ektbear: 1:24am On Nov 12, 2012 |
Javanian: @ekt_bear i'm sorry for the late reply there has been shortage of power in my area. Yeah, so it looks like using a binary search tree leads to basically the same result as a heap. You pay log(k) for insertions, deletions, and searches. Java has a TreeMap (http://docs.oracle.com/javase/6/docs/api/java/util/TreeMap.html) data structure which can be used for this. |
Re: Storing K largest numbers from a stream by Nobody: 10:55am On Nov 12, 2012 |
@Ekt_bear Thanks. |
Re: Storing K largest numbers from a stream by Javanian: 5:05pm On Dec 25, 2013 |
bump |
Re: Storing K largest numbers from a stream by ektbear: 3:28am On Dec 26, 2013 |
Here is a refreshed version in Java: http://pastebin.com/X61JmSEs |
(1) (Reply)
Using 2-dimensional Array In C++ / How To Download UC Browser For Android Phone|www.wap.ucweb.com / Html & Css
(Go Up)
Sections: politics (1) business autos (1) jobs (1) career education (1) romance computers phones travel sports fashion health religion celebs tv-movies music-radio literature webmasters programming techmarket Links: (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) Nairaland - Copyright © 2005 - 2024 Oluwaseun Osewa. All rights reserved. See How To Advertise. 53 |