Extremely slow XML import.

by Kummel62

Yesterday, while helping Benny with some sort tests on Windows, I found that XML import there
was VERY slow under certain circumstances:

Fast XML import: about 3 min with 6020 people.
GRAMPS: 3.2.0-0.SVN13520M
Python: 2.5.1 (r251:54863, Apr 18 2007, 08:51:08...
BSDDB: 4.4.5.2
LANG: sv_SE:utf-8
OS: win32

Slow XML import: more than 50 minutes with the same XML.
GRAMPS: 3.2.0-0.SVN13520M
Python: 2.6.2 (r262:71605, Apr 14 2009, 22:40:02...
BSDDB: 4.7.3
LANG: sv_SE:utf-8
OS: win32

I also tried Gramps 3.1.3 and got an identical result.

It's obviously a difference between Python 2.5.1 and 2.6.2, but what?

/Peter


------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day
trial. Simplify your report design, integration and deployment - and focus on
what you do best, core application coding. Discover what's new with
Crystal Reports now.  http://p.sf.net/sfu/bobj-july
_______________________________________________
Gramps-devel mailing list
Gramps-devel@...
https://lists.sourceforge.net/lists/listinfo/gramps-devel

Re: Extremely slow XML import.

by Benny Malengier

2009/11/7 Peter Landgren <peter.talken@...>:

> It's obviously a difference between Python 2.5.1 and 2.6.2, but what?

Interesting. I would rather suspect that bsddb, in combination with the
Python version, is to blame.

But is it always fast on Linux? I have Python 2.5 here, and it seems fine
to me on Linux. Perhaps ask the Windows people to profile the import to
see where the time goes.

To do so, add

import Utils

and, where the import function is called as import_function, rename it
to _import_function and create a new import_function that contains:

def import_function():
    Utils.profile(_import_function)

Benny


Re: Extremely slow XML import.

by Benny Malengier

Stephen, can you forward this research to the Windows list?

So, the put call in dbshelve.py is 93 times faster on Linux than on Windows. This means Windows users get really bad performance on all batch operations (import, change tools, ...); just look at the output. For Peter, an import that takes 3 min on Linux takes more than 50 min on Windows.

My guess is that the filesystem is to blame (NTFS on Windows is not as good as what is available on Linux), but can that account for so much? Otherwise, contact the bsddb/pybsddb people to find out whether there is something Windows users can do to get bsddb working at normal speed.

Perhaps time this yourself as well.
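
One way to time this in isolation is a small write benchmark. The sketch below uses the standard library's shelve module as a stand-in for bsddb.dbshelve (an assumption; the two are not the same code), and time_puts is a made-up helper name. The default count of 6020 simply echoes the number of people in the test file; a real import writes many more records than that.

```python
import os
import shelve
import tempfile
import time


def time_puts(count=6020, payload=b"x" * 200):
    """Write `count` records to a scratch shelf; return (seconds, stored)."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "bench")
        start = time.perf_counter()
        with shelve.open(path) as db:
            for i in range(count):
                db[str(i)] = payload  # one "put" per record
            stored = len(db)
        elapsed = time.perf_counter() - start
    return elapsed, stored


if __name__ == "__main__":
    seconds, stored = time_puts()
    print("%d puts took %.3f s" % (stored, seconds))
```

Running the same script on the Windows and the Linux machine would show whether plain key/value writes already differ between them, independent of Gramps.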

Benny

2009/11/8 Peter Landgren <peter.talken@...>
Yes,
the last hint did it.
I ran the import on both my Windows box (1.7 GHz, 512 MB) and my Linux box (2.4 GHz, 1 GB)
and compared the results.
This call:

dbshelve.py:256(put)

takes the longest time on both systems.
If I compare the "tottime" values, the Linux system is about 5 times faster than the Windows one, except
for dbshelve.py:256(put), which is 93 times faster on the Linux box. I have attached the outputs from the profiling and the comparison.

I have no deeper knowledge of how to interpret the profiling output.


/Peter
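
A comparison of this kind can also be scripted. The following is a rough sketch using cProfile and pstats from the standard library; tottime_by_function and speed_ratios are hypothetical helper names, and the numbers in the example at the bottom are invented to mirror the ratios mentioned above, not the attached measurements.

```python
import cProfile
import pstats


def tottime_by_function(profiler):
    """Map 'file:line(name)' to tottime for one profiling run."""
    stats = pstats.Stats(profiler).stats
    # Each value is (primitive calls, total calls, tottime, cumtime, callers).
    return {
        "%s:%d(%s)" % (filename, line, name): values[2]
        for (filename, line, name), values in stats.items()
    }


def speed_ratios(slow, fast):
    """Return {function: slow_tottime / fast_tottime} for shared entries."""
    return {
        key: slow[key] / fast[key]
        for key in slow
        if key in fast and fast[key] > 0
    }


if __name__ == "__main__":
    # Invented tottime values in seconds, for illustration only.
    windows = {"dbshelve.py:256(put)": 930.0, "other.py:1(f)": 5.0}
    linux = {"dbshelve.py:256(put)": 10.0, "other.py:1(f)": 1.0}
    for name, ratio in sorted(speed_ratios(windows, linux).items()):
        print("%-25s %.1f" % (name, ratio))
```

An entry whose ratio is far above the rest is the one worth investigating; entries that all share a similar ratio mostly reflect the difference between the two machines.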





Re: Extremely slow XML import.

by Benny Malengier

My mistake: it cannot be the filesystem, since Windows with Python 2.5 also took 3 min, so it must be the latest bsddb.

Benny


Re: Extremely slow XML import.

by Jerome

Could it be a configuration issue?

It is "easy" to use multiple Python versions under Linux. What about under Windows? There you uninstall Python plus the related bindings (pycairo, pygtk, pygobject), then install a new Python version plus the related bindings (pycairo, pygtk, pygobject).

Is there a wrong reference in the Windows Registry,
e.g. Python 2.6 being used with 2.5 bindings?

Also, some Linux distributions load Python with the session (or at start-up). Windows does this for MS Office or OpenOffice, but not for the Python libraries. This might reduce performance the first time a Windows user starts Gramps, but it would not make such a big difference, and not on import.




Re: Extremely slow XML import.

by Jerome

> This means windows users have really bad performance on all batch operations

Do you get the same results using pythonw.exe rather than python.exe?




Re: Extremely slow XML import.

by Kummel62

On Monday 09 November 2009 10.14.39, jerome wrote:
> > This means windows users have really bad performance on all batch
> > operations
>
> same results by using pythonw.exe rather than python.exe ?
No, pythonw.exe is as slow as python.exe.

/Peter


--
Peter Landgren
Talken Hagen
671 94  BRUNSKOG
0570-530 21
070-635 4719
peter.talken@...
Skype: pgl4820.2



Re: Extremely slow XML import.

by Kummel62

> Is it not a configuration issue ?
>
> It is "easy" to use multiple python versions under Linux. What about under
> Windows OS ? Uninstall python + related bindings (pycairo, pygtk,
> pygobject), then install a new python version + related bindings (pycairo,
> pygtk, pygobject).
>
> Is there any wrong reference on Windows Registry ?
> i.e using python 2.6 with 2.5 bindings !
>
> Also, some Linux distributions are loading python with session (or start).
> Windows does this for MS Office or OpenOffice but not for python libs.
> Maybe this will decrease performance the first time the Windows user starts
> gramps but not so big difference and not on import.
If you look at the results from the profiling, you will see that of the 100 items listed,
99 are about 5 times slower (partly because of a CPU that is slower by a factor of 1.4),
with very small deviation. Only ONE item, the call to the function put in dbshelve.py, which is
part of bsddb, is 93 times slower.

I have also run with only Python 2.6.2 installed, and it is still as slow as before, so it is not a
configuration issue.
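
The ratios quoted in this thread can be put side by side with a few lines of arithmetic. This is only a back-of-the-envelope sketch; treating the CPU difference as a plain factor of 1.4 is a simplifying assumption.

```python
# Figures quoted in this thread.
cpu_factor = 1.4        # Linux box clock advantage (2.4 GHz vs 1.7 GHz)
typical_ratio = 5.0     # 99 of the 100 profiled entries
put_ratio = 93.0        # the put call in bsddb's shelve layer

# Slowdown left after discounting the faster CPU.
typical_adjusted = typical_ratio / cpu_factor   # about 3.6
put_adjusted = put_ratio / cpu_factor           # about 66

# How far the put call stands out from the other entries.
outlier_factor = put_ratio / typical_ratio      # 18.6

# Overall import slowdown: more than 50 min versus about 3 min.
import_slowdown = 50.0 / 3.0                    # about 16.7

print(typical_adjusted, put_adjusted, outlier_factor, import_slowdown)
```

The put call stands out from the other entries by a factor of the same order as the overall import slowdown, which fits the picture of that one call dominating the run time.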

/Peter


Re: Extremely slow XML import.

by Jerome

> If you have seen the results from the profiling,

I had only read this:

 >"dbshelve.py is 93 times faster in linux than windows."

and your test configuration:

"Fast XML import  about 3 min with 6020 people.
 > GRAMPS: 3.2.0-0.SVN13520M
 > Python: 2.5.1 (r251:54863, Apr 18 2007, 08:51:08...
 > BSDDB: 4.4.5.2
 > LANG: sv_SE:utf-8
 > OS: win32
 >
 > Slow XML import > 50 minutes with same XML.
 > GRAMPS: 3.2.0-0.SVN13520M
 > Python: 2.6.2 (r262:71605, Apr 14 2009, 22:40:02...
 > BSDDB: 4.7.3
 > LANG: sv_SE:utf-8
 > OS: win32"

in this thread ...


True, there is a change between Python 2.5 and 2.6:
"The bsddb.dbshelve module now uses the highest pickling protocol
available, instead of restricting itself to protocol 1. (Contributed by
W. Barnes.)"
http://docs.python.org/whatsnew/2.6.html#new-improved-and-deprecated-modules
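
Whether this change has anything to do with the slowdown is not established in this thread. A quick way to see what the protocol choice alone costs is to serialize the same object with protocol 1 and with the highest protocol and compare size and time. The sketch below is written for current Python 3; the sample record is invented and dumps_with is a hypothetical helper name.

```python
import pickle
import timeit

# Invented record, loosely shaped like a stored person (assumption).
record = ("I0001", "Doe, John", 1, [("Birth", "place", 42)] * 20)


def dumps_with(protocol):
    """Serialize the sample record with the given pickle protocol."""
    return pickle.dumps(record, protocol)


if __name__ == "__main__":
    for protocol in (1, pickle.HIGHEST_PROTOCOL):
        data = dumps_with(protocol)
        seconds = timeit.timeit(lambda: dumps_with(protocol), number=10000)
        print("protocol %d: %d bytes, %.3f s per 10000 dumps"
              % (protocol, len(data), seconds))
```

If both protocols perform about the same on the slow machine, the pickling step itself is an unlikely culprit and the time is more likely spent in the database write.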

I also see some Windows-specific issues, such as this one:
http://bugs.python.org/issue6290

Maybe you could try Python 2.6.4?
http://www.python.org/download/releases/2.6.4/NEWS.txt


Jérôme


Re: Extremely slow XML import.

by Kummel62

Some further notes:

1. I upgraded to Python 2.6.4 on my Windows box and it's still as slow as before!

2. I downloaded Mandriva 2010 Live and installed it and Gramps on a fresh partition.
I did the same XML import and now it took 1 min 40 sec, with Python 2.6.4!

/Peter

> > If you have seen the results from the profiling,
>
> I have just read that:
>  >"dbshelve.py is 93 times faster in linux than windows."
>
> and your test configurations:
>
> "Fast XML import  about 3 min with 6020 people.
>
>  > GRAMPS: 3.2.0-0.SVN13520M
>  > Python: 2.5.1 (r251:54863, Apr 18 2007, 08:51:08...
>  > BSDDB: 4.4.5.2
>  > LANG: sv_SE:utf-8
>  > OS: win32
>  >
>  > Slow XML import > 50 minutes with same XML.
>  > GRAMPS: 3.2.0-0.SVN13520M
>  > Python: 2.6.2 (r262:71605, Apr 14 2009, 22:40:02...
>  > BSDDB: 4.7.3
>  > LANG: sv_SE:utf-8
>  > OS: win32"
>
> in this post ...
>
>
> True, there is a change between Python 2.5 and 2.6:
> "The bsddb.dbshelve module now uses the highest pickling protocol
> available, instead of restricting itself to protocol 1. (Contributed by
> W. Barnes.)"
> http://docs.python.org/whatsnew/2.6.html#new-improved-and-deprecated-modules
>
> I also see some issues specific to Windows, like this one:
> http://bugs.python.org/issue6290
>
> Maybe you can try Python 2.6.4?
> http://www.python.org/download/releases/2.6.4/NEWS.txt
>
>
> Jérôme
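
To make the quoted dbshelve change concrete, here is a minimal sketch that pickles one object with protocol 1 and with the highest available protocol. It is written for current Python 3, where bsddb is no longer in the standard library, and the record below is a made-up stand-in, not a real Gramps object.

```python
import pickle


def pickled_sizes(obj):
    """Return the pickled size in bytes for protocol 1 and for the highest protocol."""
    return {
        "protocol 1": len(pickle.dumps(obj, protocol=1)),
        "highest": len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)),
    }


# Hypothetical record, only loosely shaped like a person entry.
record = ("I0001", "Doe", "John", [("birth", 1900), ("residence", 1950)])

sizes = pickled_sizes(record)
for name, size in sizes.items():
    print(name, size, "bytes")

# Both protocols must round-trip to the same object.
assert pickle.loads(pickle.dumps(record, protocol=1)) == record
assert pickle.loads(pickle.dumps(record, protocol=pickle.HIGHEST_PROTOCOL)) == record
```

Note that a protocol change applies on every platform, so on its own it would not obviously explain a slowdown seen only on Windows.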
>
> Peter Landgren wrote:
> >> Is it not a configuration issue?
> >>
> >> It is "easy" to use multiple Python versions under Linux. What about
> >> under Windows? Uninstall Python + related bindings (pycairo, pygtk,
> >> pygobject), then install a new Python version + related bindings
> >> (pycairo, pygtk, pygobject).
> >>
> >> Is there a wrong reference in the Windows Registry?
> >> E.g. using Python 2.6 with 2.5 bindings!
> >>
> >> Also, some Linux distributions load Python at session start (or at
> >> boot). Windows does this for MS Office or OpenOffice but not for the
> >> Python libs. This might decrease performance the first time a Windows
> >> user starts Gramps, but not by this much and not on import.
> >
> > If you have seen the results from the profiling, you see that of the 100
> > items listed, 99 are about 5 times slower (partly due to a CPU that is
> > slower by a factor of 1.4), with very small deviation. Only ONE item,
> > a call to the function put in dbshelve.py, which is part of bsddb,
> > is 93 times slower.
> >
> > I have run with only Python 2.6.2 installed and it's still as slow as
> > before, so it's not a configuration issue.
> >
> > /Peter
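
The comparison described here (most entries about 5 times slower, one entry 93 times slower) can be sketched as a ratio per function plus a check against the median ratio. The function names other than dbshelve.py:256(put), the tottime values, and the outlier threshold are all made up for illustration; they are NOT the numbers from the attached profiles.

```python
from statistics import median


def slowdown_ratios(slow, fast):
    """Map each function present in both profiles to slow_tottime / fast_tottime."""
    return {
        name: slow[name] / fast[name]
        for name in slow
        if name in fast and fast[name] > 0
    }


def outliers(ratios, factor=3.0):
    """Return entries whose ratio exceeds `factor` times the median ratio."""
    typical = median(ratios.values())
    return {name: ratio for name, ratio in ratios.items() if ratio > factor * typical}


# Made-up tottime values in seconds, chosen only to mimic the pattern described.
windows = {"dbshelve.py:256(put)": 930.0, "parse_a": 5.0, "parse_b": 10.0, "parse_c": 2.5}
linux = {"dbshelve.py:256(put)": 10.0, "parse_a": 1.0, "parse_b": 2.0, "parse_c": 0.5}

ratios = slowdown_ratios(windows, linux)
flagged = outliers(ratios)
print(flagged)
```

Using the median as the "typical" slowdown keeps a single extreme entry from distorting the baseline it is compared against.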
> >
> >> --- On Mon 9.11.09, Benny Malengier <benny.malengier@...> wrote:
> >>> From: Benny Malengier <benny.malengier@...>
> >>> Subject: Re: [Gramps-devel] Extremly slow XML import.
> >>> To: "Gramps Development List" <gramps-devel@...>,
> >>> "Stephen George" <steve_geo@...>
> >>> Date: Monday 9 November 2009, 9:33
> >>> My mistake, it cannot be the filesystem, as
> >>> Windows with Python 2.5 also took 3 min, so it must be the latest
> >>> bsddb.
> >>>
> >>> Benny
> >>>
> >>> 2009/11/9 Benny Malengier <benny.malengier@...>
> >>>
> >>> Stephen, can you forward this
> >>> research to the Windows list?
> >>>
> >>> So, dbshelve.py is 93 times faster in Linux than in Windows.
> >>> This means Windows users get really bad performance on all
> >>> batch operations (import, change tools, ...), just look at
> >>> the output. For Peter, an import that takes 3 min in Linux takes more
> >>> than 50 min in Windows.
> >>>
> >>> My guess is that the filesystem is to blame (Windows
> >>> NTFS is not as good as what is available on Linux), but can it
> >>> account for so much? Otherwise, contact the bsddb/pybsddb people
> >>> to find out whether there is something Windows users can do to get
> >>> bsddb working at normal speed.
> >>>
> >>> Perhaps time this yourself once too.
> >>>
> >>> Benny
> >>>
> >>> 2009/11/8 Peter Landgren <peter.talken@...>
> >>>
> >>> Yes,
> >>>
> >>> the last hint did it.
> >>>
> >>> I have run on both my Windows box (1.7 GHz, 512 MB) and my
> >>> Linux box (2.4 GHz, 1 GB) and compared the results.
> >>>
> >>> This call:
> >>>
> >>> dbshelve.py:256(put)
> >>>
> >>> takes the longest time on both systems.
> >>>
> >>> If I compare the "tottime", the Linux system is
> >>> about 5 times faster than the Windows one, except
> >>> for dbshelve.py:256(put), which is 93 times faster on
> >>> the Linux box. I have attached the outputs from the
> >>> profiling and the comparison.
> >>>
> >>> I have no deeper knowledge of how to interpret the
> >>> profiling.
> >>>
> >>>
> >>> /Peter
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> -----La pièce jointe associée suit-----
> >>>
> >>> -----------------------------------------------------------------------
> >>>-- ----- Let Crystal Reports handle the reporting - Free Crystal
> >>> Reports 2008 30-Day
> >>> trial. Simplify your report design, integration and
> >>> deployment - and focus on
> >>> what you do best, core application coding. Discover what's
> >>> new with
> >>> Crystal Reports now.  http://p.sf.net/sfu/bobj-july
> >>> -----La pièce jointe associée suit-----
> >>>
> >>> _______________________________________________
> >>> Gramps-devel mailing list
> >>> Gramps-devel@...
> >>> https://lists.sourceforge.net/lists/listinfo/gramps-devel
> >>
> >> ------------------------------------------------------------------------
> >>--- --- Let Crystal Reports handle the reporting - Free Crystal Reports
> >> 2008 30-Day trial. Simplify your report design, integration and
> >> deployment - and focus on what you do best, core application coding.
> >> Discover what's new with Crystal Reports now.
> >> http://p.sf.net/sfu/bobj-july
> >> _______________________________________________
> >> Gramps-devel mailing list
> >> Gramps-devel@...
> >> https://lists.sourceforge.net/lists/listinfo/gramps-devel

--
Peter Landgren
Talken Hagen
671 94  BRUNSKOG
0570-530 21
070-635 4719
peter.talken@...
Skype: pgl4820.2



Re: Extremly slow XML import.

by Benny Malengier

2009/11/9 Peter Landgren <peter.talken@...>:
> Some further notes:
>
> 1. I upgraded to Python 2.6.4 on my Windows box and it's still as slow as before!
>
> 2. I downloaded Mandriva 2010 Live and installed it and Gramps on a fresh partition.
> I did the same XML import and now it took 1 min 40 sec, with Python 2.6.4!

Do you have ext4 in Mandriva 2010? I hear it greatly improves writing
to disk, which I expect is one of the main bottlenecks of an import.

Is installing a Linux distribution something you do after breakfast
when you feel like it? I'm on Kubuntu 8.04 here, have some feelings!

Benny
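
One rough way to check how much forced-to-disk writes cost on a given machine and filesystem is to time small writes with and without an fsync after each one. This sketch uses only the standard library and does not touch bsddb; whether the import actually syncs after every record is an assumption that the thread does not establish, and the record size and count are arbitrary.

```python
import os
import tempfile
import time


def time_writes(count, sync):
    """Write `count` small records to a temp file; fsync after each one if `sync`."""
    payload = b"x" * 256
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            for _ in range(count):
                handle.write(payload)
                if sync:
                    handle.flush()
                    os.fsync(handle.fileno())
        return time.perf_counter() - start
    finally:
        # The file is closed by now, so it can be removed on Windows too.
        os.remove(path)


buffered = time_writes(100, sync=False)
synced = time_writes(100, sync=True)
print("buffered: %.4f s, fsync after each write: %.4f s" % (buffered, synced))
```

Running the same script on the Windows box and the Linux box would give a first impression of whether the disk layer alone could account for a large gap.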

>
> /Peter

Re: Extremly slow XML import.

by Jerome

> I did the same XML import and now it took 1min 40 sec! With
> Python 2.6.4.

It seems some optimisations are possible outside of Gramps ...
Last question: is it an ungzipped .gramps file?

I have seen a visible time difference on the first import of a session.
It looks like (I did not trace it or look at the modules involved) the file is first only read and then imported from a temporary location (where?). An uncompressed .gramps file was faster only the first time in a session (maybe once the temporary file has been generated, Python knows where it is).

This is just an impression, nothing technical; some translation (or decryption) may be needed!

Better results with newer Python versions would sound logical:
2.5.1 -> 2.6.2 -> 2.6.4
But being this slow under Windows with any Python 2.6 version is hard to check or track down without that OS ... I'll let you work on it!


Jérôme
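
For the question of whether a .gramps file is gzip-compressed or plain XML, a minimal check is to look at the first two bytes, since gzip data starts with 0x1f 0x8b. The helper names is_gzipped and open_xml are hypothetical and are not taken from the Gramps source; the demo files contain placeholder content only.

```python
import gzip
import os
import tempfile


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes 0x1f 0x8b."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"\x1f\x8b"


def open_xml(path):
    """Open a possibly gzip-compressed XML file for binary reading."""
    return gzip.open(path, "rb") if is_gzipped(path) else open(path, "rb")


# Demo on throwaway files (the XML content is a placeholder).
workdir = tempfile.mkdtemp()
plain_path = os.path.join(workdir, "plain.gramps")
packed_path = os.path.join(workdir, "packed.gramps")
with open(plain_path, "wb") as handle:
    handle.write(b"<database/>")
with gzip.open(packed_path, "wb") as handle:
    handle.write(b"<database/>")

print(is_gzipped(plain_path), is_gzipped(packed_path))
with open_xml(packed_path) as handle:
    print(handle.read())
```

Timing the import once with a compressed and once with an uncompressed copy of the same data would show whether decompression plays any real part in the difference.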

Re: Extremly slow XML import.

by Kummel62

On Tuesday 10 November 2009 08.55.01, jerome wrote:
> > I did the same XML import and now it took 1min 40 sec! With
> > Python 2.6.4.
>
> It seems some optimisations are possible outside of Gramps ...
> Last question: is it an ungzipped .gramps file?
Yes, it's an original *.gramps file.
/Peter
