vtkFieldData::AddArray() is very painful with lots of arrays

View: New views
5 Messages — Rating Filter:   Alert me  

vtkFieldData::AddArray() is very painful with lots of arrays

by Wilson, Andrew T :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

I have an application where I have about 90,000 state arrays that I need to
save (to disk) and restore on subsequent runs.  The arrays are all of
different lengths.   It seemed to make sense to put them in the field data
of a vtkDataObject and use vtkDataObjectReader/Writer.

It turns out that it takes a ridiculously long time (25 minutes) to load a
100MB file.  This is because vtkDataReader calls vtkFieldData::AddArray() as
it loads in each array (which seems proper).  However, AddArray() takes time
linear in the number of arrays already in the field data, so loading in a
field data takes O(n^2) time in the number of arrays being loaded.  When n =
90000 this is unpleasant.

I suppose that for the purposes of my application I can condense my 90000
arrays into one ginormous one and just save that.  Padding them out to the
same length isn't practical because some are very short and some are very,
very long.  Likewise, saving them in point data is problematic because every
array is a different length.

The problem remains that vtkFieldData::AddArray() is O(n^2).  This seems...
Unfashionable.  What could we do about it?  Replace the loop in
GetAbstractArray() with a hashtable lookup?  The worst-case complexity is
still O(n) for the lookup but the average case should be much better.

Other suggestions?  This is really vexing me.

-- Andy


PS.  Here are the relevant snippets of code:


//-----------------------------------------------------------------------
int vtkFieldData::AddArray(vtkAbstractArray *array)
{
[snip]
  this->GetAbstractArray(array->GetName(), index);
[snip]
  if (index == -1)
    {
    index = this->NumberOfActiveArrays;
    this->NumberOfActiveArrays++;
    }
  this->SetArray(index, array);
  return index;
}


//-------------------------------------------------------------------
vtkAbstractArray *vtkFieldData::GetAbstractArray(const char *arrayName, int
&index)
{
[snip]
  for (i=0; i < this->GetNumberOfArrays(); i++)
    {
    name = this->GetArrayName(i);
    if ( name && !strcmp(name,arrayName) )
      {
      index = i;
      return this->GetAbstractArray(i);
      }
    }
  return NULL;
}



_______________________________________________
Powered by www.kitware.com

Visit other Kitware open-source projects at http://www.kitware.com/opensource/opensource.html

Follow this link to subscribe/unsubscribe:
http://www.vtk.org/mailman/listinfo/vtk-developers


Re: vtkFieldData::AddArray() is very painful with lots of arrays

by Bill Lorensen-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Is it possible to use an stl container?

On Fri, Nov 6, 2009 at 3:06 PM, Wilson, Andrew T <atwilso@...> wrote:

> I have an application where I have about 90,000 state arrays that I need to
> save (to disk) and restore on subsequent runs.  The arrays are all of
> different lengths.   It seemed to make sense to put them in the field data
> of a vtkDataObject and use vtkDataObjectReader/Writer.
>
> It turns out that it takes a ridiculously long time (25 minutes) to load a
> 100MB file.  This is because vtkDataReader calls vtkFieldData::AddArray() as
> it loads in each array (which seems proper).  However, AddArray() takes time
> linear in the number of arrays already in the field data, so loading in a
> field data takes O(n^2) time in the number of arrays being loaded.  When n =
> 90000 this is unpleasant.
>
> I suppose that for the purposes of my application I can condense my 90000
> arrays into one ginormous one and just save that.  Padding them out to the
> same length isn't practical because some are very short and some are very,
> very long.  Likewise, saving them in point data is problematic because every
> array is a different length.
>
> The problem remains that vtkFieldData::AddArray() is O(n^2).  This seems...
> Unfashionable.  What could we do about it?  Replace the loop in
> GetAbstractArray() with a hashtable lookup?  The worst-case complexity is
> still O(n) for the lookup but the average case should be much better.
>
> Other suggestions?  This is really vexing me.
>
> -- Andy
>
>
> PS.  Here are the relevant snippets of code:
>
>
> //-----------------------------------------------------------------------
> int vtkFieldData::AddArray(vtkAbstractArray *array)
> {
> [snip]
>  this->GetAbstractArray(array->GetName(), index);
> [snip]
>  if (index == -1)
>    {
>    index = this->NumberOfActiveArrays;
>    this->NumberOfActiveArrays++;
>    }
>  this->SetArray(index, array);
>  return index;
> }
>
>
> //-------------------------------------------------------------------
> vtkAbstractArray *vtkFieldData::GetAbstractArray(const char *arrayName, int
> &index)
> {
> [snip]
>  for (i=0; i < this->GetNumberOfArrays(); i++)
>    {
>    name = this->GetArrayName(i);
>    if ( name && !strcmp(name,arrayName) )
>      {
>      index = i;
>      return this->GetAbstractArray(i);
>      }
>    }
>  return NULL;
> }
>
>
>
> _______________________________________________
> Powered by www.kitware.com
>
> Visit other Kitware open-source projects at http://www.kitware.com/opensource/opensource.html
>
> Follow this link to subscribe/unsubscribe:
> http://www.vtk.org/mailman/listinfo/vtk-developers
>
>
_______________________________________________
Powered by www.kitware.com

Visit other Kitware open-source projects at http://www.kitware.com/opensource/opensource.html

Follow this link to subscribe/unsubscribe:
http://www.vtk.org/mailman/listinfo/vtk-developers


Re: vtkFieldData::AddArray() is very painful with lots of arrays

by Wilson, Andrew T :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On 11/6/09 2:08 PM, "Bill Lorensen" <bill.lorensen@...> wrote:
> Is it possible to use an stl container?

I'm confused -- use an STL container for what?  For the hashtable?
Certainly, as long as it's wrapped in a PIMPL implementation.  For my state
arrays?  Sure -- that's how I store them internally in my application where
fast access is paramount and then convert them to/from VTK arrays for
storage.  I'd rather not have to write my own reader/writer if I can help
it.

Am I missing the point of your suggestion?

-- Andy


_______________________________________________
Powered by www.kitware.com

Visit other Kitware open-source projects at http://www.kitware.com/opensource/opensource.html

Follow this link to subscribe/unsubscribe:
http://www.vtk.org/mailman/listinfo/vtk-developers


Re: vtkFieldData::AddArray() is very painful with lots of arrays

by pat marion-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

If you're open to different approaches, there is code in vtk edge to
serialize certain types of vtkDataArray to xml (no support for
binary).   It works for int, idType, and double arrays.  It would be
easy to add support for other array types.  Of course, I haven't given
the serializer a good stress test of 90,000 arrays, it's possible
there is some n^2 behavior hidden in it somewhere but I haven't
noticed any problems yet.  It also serializes vtkInformation objects
quite well.

https://www.kitware.com/cgi-bin/viewcvs.cgi/trunk/VTKEdge/IO/Testing/Cxx/TestSerializeInformation.cxx?rev=870&root=KWPublic&view=auto

The above link is to some example code.  It doesn't serialize
vtkDataArray but it gives you the idea.

Pat

On Fri, Nov 6, 2009 at 3:19 PM, Wilson, Andrew T <atwilso@...> wrote:

> On 11/6/09 2:08 PM, "Bill Lorensen" <bill.lorensen@...> wrote:
>> Is it possible to use an stl container?
>
> I'm confused -- use an STL container for what?  For the hashtable?
> Certainly, as long as it's wrapped in a PIMPL implementation.  For my state
> arrays?  Sure -- that's how I store them internally in my application where
> fast access is paramount and then convert them to/from VTK arrays for
> storage.  I'd rather not have to write my own reader/writer if I can help
> it.
>
> Am I missing the point of your suggestion?
>
> -- Andy
>
>
> _______________________________________________
> Powered by www.kitware.com
>
> Visit other Kitware open-source projects at http://www.kitware.com/opensource/opensource.html
>
> Follow this link to subscribe/unsubscribe:
> http://www.vtk.org/mailman/listinfo/vtk-developers
>
>
_______________________________________________
Powered by www.kitware.com

Visit other Kitware open-source projects at http://www.kitware.com/opensource/opensource.html

Follow this link to subscribe/unsubscribe:
http://www.vtk.org/mailman/listinfo/vtk-developers


Re: vtkFieldData::AddArray() is very painful with lots of arrays

by Bill Lorensen-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Sorry, my message was too brief. I meant can FieldData use an stl
container internally.

On Fri, Nov 6, 2009 at 3:19 PM, Wilson, Andrew T <atwilso@...> wrote:

> On 11/6/09 2:08 PM, "Bill Lorensen" <bill.lorensen@...> wrote:
>> Is it possible to use an stl container?
>
> I'm confused -- use an STL container for what?  For the hashtable?
> Certainly, as long as it's wrapped in a PIMPL implementation.  For my state
> arrays?  Sure -- that's how I store them internally in my application where
> fast access is paramount and then convert them to/from VTK arrays for
> storage.  I'd rather not have to write my own reader/writer if I can help
> it.
>
> Am I missing the point of your suggestion?
>
> -- Andy
>
>
>
_______________________________________________
Powered by www.kitware.com

Visit other Kitware open-source projects at http://www.kitware.com/opensource/opensource.html

Follow this link to subscribe/unsubscribe:
http://www.vtk.org/mailman/listinfo/vtk-developers