JOIN
Get Time
forums   
Search | Watch Thread  |  My Post History  |  My Watches  |  User Settings
View: Flat  | Threaded  | Tree
Previous Thread  |  Next Thread
Vector Resize Hack | Reply
Problem:

In topcoder competitions all the data is passed in standard containers. Usually it is OK but in high-performance marathons where processing speed is the main goal copying a lot of data is too slow. It also makes SSE cumbersome to use on input/output.


Solution:

Since the problem appears only in high-performance marathons, only C++ is considered here (other languages have little chance in such marathons anyway). The competitors have to apply very dirty hacks to std::vectors to cope with the problems described above.

The example below shows a hack to attach vector to custom memory buffer (thanks to iquadrat). It is very unpractical, but it works well with GCC and MSVC (release, no IDE).

template<class T> void ResizeVectorHack(std::vector<T> &v, int size) {
  T* data = (T*)_mm_malloc(size * sizeof(T), 16);  //allocate aligned buffer
  T** ptr = (T**)&v;
  v.resize(1);                                     //leads to memory leak, but who cares...
  while (*ptr != &v[0]) ptr++;                     //MSVC adds some trash to std::vector
  *(ptr++) = data;
  *(ptr++) = data + size;
  *(ptr++) = data + size;
}
 
std::vector<float> CompetitorsMainMethod(const std::vector<float> &input) {
  std::vector<float> res;
  ResizeVectorHack(res, input.size());
  memcpy(&res[0], &input[0], res.size() * sizeof(res[0]));
  return res;
}



Discussion:

Here CompetitorsMainMethod is the method which is called by the testing system. The only thing it does is duplicates the input and returns it. Note that ResizeVectorHack calls _mm_malloc only for SSE alignment. If you don't use SSE you can replace it with ordinary malloc call.

Let's look into input/output in more detail:
1. The input is declared as const reference. You can change the type of any input parameter to const reference freely although there is no place in rules that admits it. Since only reference is passed, no copy occurs here.
2. The vector buffer is allocated via low-level C routines. malloc performs no initialization - no overhead here. The three necessary pointers to the data buffers are written to the vector internal members.
3. The output vector is returned by value. Does it mean that it is copied on return? Most certainly not because of the return value optimization. The C++ standard allows the compiler to ignore the copy + destructor calls of the function return value. There are several requirements to do it: the returned object must be created on the stack of function considered and be returned in all the possible execution pathes. In this particular case profiling on GCC and MSVC shows that no copy and destruction occurs.

Also note that it is useless to declare the return value as a const reference: the copy on return will still happen in this case.
Some time ago the static initialization hack was working in marathons. The static initialization of global members was not accounted in the work time measurement. Unfortunately this bug in the tester was fixed and the hack is no longer useful.
Re: Vector Resize Hack (response to post by syg96) | Reply
If you are using gcc locally, here is this code. The memory alloc could be changed from using the default reserve to making aligned allocations, if desired.
template<typename T>
class vector_uninit : public vector<T>
{
public:
    T* set_size(typename vector<T>::size_type new_size)
    {
        this->reserve(new_size);
        this->_M_impl._M_finish = this->_M_impl._M_start + new_size;
        return this->_M_impl._M_start;
    }
};

Note that with this method, it is necessary before returning the vector_uninit to swap it into a normal vector. Otherwise it will be copied.
vector_uninit<int> v;
v.set_size(123456);
// ...
vector<int> retval;
retval.swap(v);
return retval;
Re: Vector Resize Hack (response to post by syg96) | Reply
I want to note here something that is not directly related to std::vector but more to memory allocation in general. malloc will allocate memory to use. This memory needs to come from somewhere. This can be A)from the heap of the process or B) a call to mmap (I am assuming that a environment with glibc is used). Which of the two depends on the size and the mallopt settings.

In case A) there are two additional possibilities. Either an unused block is found in the heap and reused or the heap segment must be enlarged. Now, in the case of the enlargment, and also whenever mmap is used, there will be new pages associated to the process. These pages are initially unmapped in the virtual memory table. Upon first access, the OS will associate a physical page with the virtual page and zero initialize the page. (The zero initialization is to prevent one process getting used memory of another process and thereby possibly confidential information.)

This means although you might avoid the zero-initialisation by the stl using the resize hack, you still might have the cost of zero-initializing memory by the OS. Often, this cost cannot be avoided. But when many vectors are allocated and deallocated you can make sure that the process memory will be reused by following tuning of the memory allocator:
mallopt(M_TRIM_THRESHOLD, -1)
mallopt(M_MMAP_MAX, 0)

The first call makes sure that the processes heap segement never gets smaller, the second call disables mmap. This makes sure that physical memory once allocated to the process stays so.
Re: Vector Resize Hack (response to post by syg96) | Reply
LG and GE make the most effective webstarts.com obtains the very best evaluations of Types of Washer and Dryers however don't have the allocate a $1,000-plus ventless device, Best Hitachi Washer Dryer The GE GUD27ESSJWW ranks among Washer And Dryer Guide If you're very brief on room, the 24-inch-wide read full article as well as cleans up as well as dries out well, reviewers state.
RSS