[interprocess] Performance problem with managed_shared_memory

by Moritz-9

Hi there,

I have a performance problem using managed_shared_memory and the
interprocess vector. I attached a minimal, compilable example that
demonstrates it.
I create a vector that contains a simple class and I write into this
vector. If the vector is located in shared memory, this takes much
longer than if it is located in process-local memory. The main
difference between the two cases is the allocator used, but I cannot
explain the slowdown.

If I run the attached code, I get the following results:
(running on Ubuntu 8.10, Boost version 1.40, gcc version 4.3.3)

SHMEM_TESTING: Mean: 0.024768 seconds.
else           Mean: 0.015022 seconds.

I do not understand where the difference comes from. Does anybody
have an explanation for it?


#include <boost/interprocess/containers/vector.hpp>
#include <boost/interprocess/allocators/allocator.hpp>
#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/thread/xtime.hpp>
#include <iostream>
#include <vector>

namespace ipc = boost::interprocess;

double get_timestamp()
{
    boost::xtime timestamp;
    boost::xtime_get(&timestamp, boost::TIME_UTC);
    return timestamp.sec + ((double)timestamp.nsec / 1000000000.0);
}

class Point3f
{
    public:
        double x;
        double y;
        double z;
};

#define VECTOR_ELEMENTS 500000

int main()
{
#define SHMEM_TESTING
#ifdef SHMEM_TESTING
    ipc::shared_memory_object::remove("shmem");
    ipc::managed_shared_memory managed_shm( ipc::create_only, "shmem",
        VECTOR_ELEMENTS*3*sizeof( double ) + 1024 );

    typedef ipc::managed_shared_memory::segment_manager     segment_manager_t;
    typedef ipc::allocator<void, segment_manager_t>         void_allocator;
    typedef ipc::allocator<Point3f, segment_manager_t>      Point3fAllocator;
    typedef ipc::vector<Point3f, Point3fAllocator>          Point3fVector;

    void_allocator alloc( managed_shm.get_segment_manager() );
    Point3fVector * vec = managed_shm.construct<Point3fVector>(
        ipc::unique_instance )( alloc );

    if ( !vec ) return -1;
#else
    ipc::vector<Point3f> * vec = new ipc::vector<Point3f>();
//  std::vector<Point3f> * vec = new std::vector<Point3f>();
#endif

    for ( unsigned int i = 0; i < VECTOR_ELEMENTS; ++i )
    {
        vec->push_back( Point3f() );
    }

    double sum = 0;
    unsigned int count = 0;
    for ( ; count < 20; ++count )
    {
        double t1 = get_timestamp();
        for ( unsigned int i = 0; i < vec->size(); ++i )
        {
            ( *vec )[i].x = i;
            ( *vec )[i].y = i;
            ( *vec )[i].z = i;
        }
        sum += get_timestamp() - t1;
    }

    std::cerr << std::fixed << "Mean: "
              << sum/static_cast<double>(count) << " seconds." << std::endl;

    return 0;
}
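
A note on the timing itself: get_timestamp() above reads the wall clock, which can jump if the system time is adjusted. As a hedged alternative, the sketch below measures the same kind of mean with std::chrono::steady_clock. It assumes a C++11 compiler (newer than the gcc 4.3.3 used above), and mean_seconds is a hypothetical helper name, not part of Boost or of the original example.

```cpp
#include <chrono>

// Hypothetical helper (not from the original post): runs `work` `repeats`
// times and returns the mean duration of one run in seconds.
template <typename Work>
double mean_seconds(Work work, unsigned int repeats)
{
    // steady_clock is monotonic, so intervals are unaffected by clock changes.
    typedef std::chrono::steady_clock clock;

    double sum = 0.0;
    for (unsigned int r = 0; r < repeats; ++r)
    {
        const clock::time_point t1 = clock::now();
        work();
        const std::chrono::duration<double> elapsed = clock::now() - t1;
        sum += elapsed.count();
    }
    // Avoid dividing by zero when no runs were requested.
    return repeats ? sum / static_cast<double>(repeats) : 0.0;
}
```

The write loop from main() could be passed in as a lambda, for example mean_seconds([&]{ /* write loop over *vec */ }, 20).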

_______________________________________________
Boost-users mailing list
Boost-users@...
http://lists.boost.org/mailman/listinfo.cgi/boost-users

Re: [interprocess] Performance problem with managed_shared_memory

by Ion Gaztañaga

Moritz wrote:
> Hi there,
>
> I have a performance problem using the managed_shared_memory and the
> interprocess_vector. I attached a minimalistic, compilable example that
> demonstrates this.
> I create a vector that contains a simple class and I write into this
> vector. If the vector is located in the shared_memory this takes much
> longer than if it is located in the process-local memory. The main
> difference then is the used allocator. But I can not explain it.

Process-shared containers use relative pointers, so each dereference
needs additional operations to compute the actual address in each
process. Example:

T &operator[](size_type idx)
{
    return *(start_ + idx);
}

start_ is a smart pointer, so pointer arithmetic is not trivial. The
compiler also can't apply as many optimizations as with raw pointers. So
this is expected behaviour. You can improve it a bit with:

         for ( unsigned int i = 0; i < vec->size(); ++i )
         {
             Point3f &f = ( *vec )[i];
             f.x = i;
             f.y = i;
             f.z = i;
         }

Or, even faster, by obtaining a raw pointer to the first element:

         Point3f *first = &(*vec )[0];
         for ( unsigned int i = 0; i < vec->size(); ++i )
         {
             first[i].x = i;
             first[i].y = i;
             first[i].z = i;
         }
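
To make the cost of a relative pointer concrete, here is a minimal sketch. RelPtr is a hypothetical, simplified stand-in written only for illustration. It is not Boost's offset_ptr and it omits null handling. The two fill functions are also assumptions that mirror the loops above: one recomputes the address on every access, the other converts to a raw pointer once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical relative pointer: stores the byte distance from its own
// address to the target instead of an absolute address, so the value stays
// meaningful when the memory is mapped at a different base address.
template <typename T>
class RelPtr
{
public:
    explicit RelPtr(T* target) { set(target); }

    // A copy lives at a different address, so its offset must be recomputed.
    RelPtr(const RelPtr& other) { set(other.get()); }
    RelPtr& operator=(const RelPtr& other) { set(other.get()); return *this; }

    // Every access rebuilds the absolute address as (this + offset).
    T* get() const
    {
        return reinterpret_cast<T*>(
            reinterpret_cast<std::uintptr_t>(this) + offset_);
    }

    T& operator[](std::size_t idx) const { return *(get() + idx); }

private:
    void set(T* target)
    {
        assert(target != 0); // null pointers are not handled in this sketch
        // Unsigned arithmetic wraps, so targets below `this` also work.
        offset_ = reinterpret_cast<std::uintptr_t>(target)
                - reinterpret_cast<std::uintptr_t>(this);
    }

    std::uintptr_t offset_;
};

struct Point3f { double x, y, z; };

// Address computation happens on each of the three accesses per element.
void fill_via_relptr(const RelPtr<Point3f>& start, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
    {
        start[i].x = static_cast<double>(i);
        start[i].y = static_cast<double>(i);
        start[i].z = static_cast<double>(i);
    }
}

// The relative pointer is converted once; the loop uses plain raw pointers.
void fill_via_raw(const RelPtr<Point3f>& start, std::size_t n)
{
    Point3f* first = start.get();
    for (std::size_t i = 0; i < n; ++i)
    {
        first[i].x = static_cast<double>(i);
        first[i].y = static_cast<double>(i);
        first[i].z = static_cast<double>(i);
    }
}
```

Both functions write the same values. Whether a compiler hoists the address computation out of the first loop by itself depends on the compiler and on how the real smart pointer is written, so the difference should be measured rather than assumed.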

Best,

Ion

Re: [interprocess] Performance problem with managed_shared_memory

by Moritz-9

Hi Ion,

Thank you for your tremendous support on this mailing list.

Best regards,
Moritz


Ion Gaztañaga wrote:
>
> Process shared containers use relative pointers so that each dereference
> needs additional operations to get the address on each process. Example:
[...]
> Best,
>
> Ion
