Get Time
Search | Watch Thread  |  My Post History  |  My Watches  |  User Settings
View: Flat (newest first)  | Threaded  | Tree
Previous Thread  |  Next Thread
<< PREV    [ 1 2 3 4 5 ]    NEXT >
Re: Meh? (response to post by jmzero) | Reply
"edit-as-you-pass-through" clever solution (which I think is horrible - negligible performance benefit over d000hg's approach, and painful to work with)

Can you elaborate on that? I don't see either why it is faster or why it is painful to work with. I have no trouble working with unique in C++ and that's how it is implemented. The reason is that it generalizes for arbitrary large alphabets. Also, I wouldn't mind maintaining the following.
// requires that s is a sorted null terminated string
// ensures that repeated letters in s are removed
// returns a pointer to the new end of the string
// ensures *result == 0
char* unique(char* s) {
  // s points to the last saved character
  // t points to the character currently examined
  char* t = s;
  do {
    if (*++t != *s) *++s = *t;
  } while (*t);
  return s;
I agree with you that the question is horrible. I would be very suspicious if an interviewer asks me to write a function that returns something and the signature has void for the return type. I would be even more suspicious in I'm working with C strings and there is an extra length parameter. I was also about to write a message asking what's wrong with d000hg's solution. (BTW, d000hg, your handle is horrible. It took me more time to spell it correctly than to write the rest of the message :) )

PS: In fact, I would prefer to use a smarter language that allows me to write at least the method contract comment in a way understood by tools. (for example Spec#, Java+JML, Eiffel)
Re: Meh? (response to post by rgrig) | Reply
I agree with you about d000hg's handle. For a long time I kept trying to spell it as something like dh00g, though I don't know why.
Re: Meh? (response to post by rgrig) | Reply
Well, I suppose the maintainability depends where you imagine this is going. The solution you show does move seamlessly to a longer alphabet (well, up to 256 characters, at least with that data type) - but if, say, you're going to change it to "ignore spaces in the input" or something then it's a little more awkward. With an abstract problem like this, I was probably too quick to jump to a conclusion on what kind of approach might be most maintainable - again the question is not really great.

And I suppose it also depends on the staff you have. To a lot of journeyman programmers, the code you wrote is pretty much gibberish. To others such as yourself (or most people here, I'd guess) that's a natural way to program. But writing code that's easily legible to a range of possible future staff is a worthwhile consideration - at least in many businesses.

As to speed, I think the second solution is going to run slightly faster than d000hg's - but only by a small factor (there's no complexity difference, unless the alphabet size dominates over the string length). But, in general, moving the pointers should be faster than using array indexes, and you don't have to build the string at the end (which, again, is only going to be significant if there's a large alphabet and a short string). To be clear - I wasn't trying to highlight the performance difference, rather I was trying to downplay it as very minor.
Re: Meh? (response to post by jmzero) | Reply
Since indexing an array is the same number of low-level instructions as dereferencing a pointer, the only possible difference I can see is that you might use more registers to do the array indexing, so downplaying it is appropriate :-)

Also, if you are in an interview where the interview seems to be asking you to do all kinds of trivial low-level optimizations, ask them if they use a good optimizing compiler :-p

I remember doing a question in an interview once, it was something to the effect of "write an algorithm to draw a horizontal line on a PDA with a 320x240 monotonic screen whose screen buffer is represented by an array of 32-bit integer bitmasks".

Well, as I started moving along on the whiteboard, I started noticing I was dividing (and modding) by 32 alot, and I started converting those statements to >>5 (and &31) as I was writing. Where the beginning of my code was clear and slow, the end of my code was convoluted and fast.

Then he surprised me by saying that a good optimizing compiler would have probably done most or all of those optimizations for me if I just left the code readable :-) We did have a good discussion about how one could determine whether those optimizations happen or not, and what to do if the more obscure code actually is faster.
Re: Meh? (response to post by Kawigi) | Reply
Since indexing an array is the same number of low-level instructions as dereferencing a pointer, the only possible difference I can see is that you might use more registers to do the array indexing, so downplaying it is appropriate :-)

Well, *(str + 5) is the same as str[5], but q=str[++i] is doing something different than q=++str and I can't imagine one or the other isn't going to be a little faster. Though I suppose the faster one might be the opposite of the one I picked initially - as I think there is a family of fast instructions for doing operations on [bp+ex] (so you could use ex for i). But I don't really remember x86 assembler at all - so I should probably be quiet before I say anything else that is almost certainly embarrassingly wrong.

I used to be part of the game/demo scene and worry about this kind of thing quite a bit - but even if I remembered that stuff, things have changed "a bit" since the 386. As you suggest - the value of trying to game this kind of thing is shrinking over time, and probably now tips towards de-optimization as often as not.

As to your PDA code, I did a Palm game a while back - and I ended up doing some inline assembler in drawing functions (and it made noticeable performance changes). Not every platform has progressed beyond comprehensibility.
Re: Meh? (response to post by jmzero) | Reply

Not every platform has progressed beyond comprehensibility.


I remember when I was learning little bits about the x86 architecture, there was a strange instruction that compilers used with great regularity, whose definition was basically "take the word from memory location $a+i*$b and put it into $c" or something like that, where $a, $b and $c were registers and i was an immediate (I'm probably modifying this a little bit). At first, that seemed like a really obscure instruction, until I realized the purpose was that i was the size of an element of an array, $b was the index and $a was the beginning of the array.
Re: Meh? (response to post by Kawigi) | Reply
Don't you think >>5 and &((1<<5)-1) are much better readable than /32 and %32 in case your manipulating bits?
Re: Meh? (response to post by dskloet) | Reply
But the operations with 32 weren't actually directly the manipulation of bits, they were used to decide which bits to manipulate in which integers (try writing the code yourself and that might be clearer). Using & and << is definitely more sensible than / and % if you're doing those operations on the bitmasks themselves.
Re: Meh? (response to post by Kawigi) | Reply
A compiler should be pretty smart to replace x % 32 by x & 31 because it needs to infer that x is always nonnegative. Incidentally, the latter corresponds to the mathematical operation while the first one doesn't.
Re: Meh? (response to post by jmzero) | Reply
There are a couple of points in favor of an in-place, rather than a counting scheme. First of all, with many languages using unicode strings by default, the larger alphabet can be a very real problem. Second, the counting method allocates extra memory and will push more memory out of the cache because of that.
Re: Meh? (response to post by rgrig) | Reply
That's the case but only if x is unsigned.
Re: Meh? (response to post by rgrig) | Reply
I seriously doubt that he will be given any negative length strings. Also his solution only touches the very first char if the length of the string is zero. Last time I checked c style strings with a length of zero certainly are represented by a single character.

Re: Meh? (response to post by tster123) | Reply
tster123 You have read one part of my message - but not another. If string is null-terminated - what's the point of length argument here ? Can you clarify ?

Regarding negative length's - just take a look on bug that things like this can cause in REAL code
Re: Meh? (response to post by TAG) | Reply
Additionally, there's a new kind of exploit that has been gaining ground lately involving integer overflow. Watching for such things in code when data came from outside the application (like from the user or from a file) is a good idea.
Re: Foiled by a Div2-250! (response to post by d000hg) | Reply
How about this Java solution:
public String Func(String str)
	String out="";
	for (char c='a'; c<='z'; c++)
		if (str.indexOf(""+c)>=0) out+=c;
	return out;
<< PREV    [ 1 2 3 4 5 ]    NEXT >