[Fwd: Bug#543913: "tar tf" uses read() instead of lseek(), making it slow]

[Fwd: Bug#543913: "tar tf" uses read() instead of lseek(), making it slow]

by Bdale Garbee

A user of our Debian packages of tar points out that it might be more
efficient to use lseek() in some circumstances, particularly for
archives containing a small number of large files.

Bdale


Package: tar
Version: 1.20-1
Severity: normal

When running "tar tf" on a large tarball containing few large files, tar
becomes incredibly slow. The reason for this is that tar issues many
read()s to skip the file content, but as it doesn't need the content, it
throws it away. GNU tar could just use lseek() like libarchive and busybox
do.
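To illustrate the difference the report describes, here is a minimal sketch of listing an uncompressed tar archive by seeking past each member's data instead of reading it. It is not GNU tar's code; the function name `list_members_with_seek` is hypothetical, and the sketch assumes a plain ustar archive in a regular file (no long-name or pax extension headers, no sizes too large for the octal size field).

```python
import os

BLOCK = 512  # tar headers and data are laid out in 512-byte blocks


def list_members_with_seek(path):
    """Return member names, skipping each member's data with lseek()."""
    names = []
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            header = os.read(fd, BLOCK)
            # A short read or an all-zero block marks the end of the archive.
            if len(header) < BLOCK or header == b"\0" * BLOCK:
                break
            # The member name is the first 100 bytes, NUL-padded.
            name = header[0:100].split(b"\0", 1)[0].decode("utf-8", "replace")
            # The size is octal text in the 12 bytes at offset 124.
            size = int(header[124:136].strip(b"\0 ") or b"0", 8)
            names.append(name)
            # Member data is padded to a whole number of blocks; jump over
            # it with one lseek() instead of many read() calls.
            padded = (size + BLOCK - 1) // BLOCK * BLOCK
            os.lseek(fd, padded, os.SEEK_CUR)
    finally:
        os.close(fd)
    return names
```

With this approach the cost of listing depends on the number of members rather than on their total size, which is why archives holding a few large files benefit the most.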

Also a problem is extracting only one file from said tarball, where the
other files do not need to be read for just that one to be extracted.

The problem exists on all versions of GNU tar and was confirmed to exist
on at least all versions starting from the one shipped with Etch.

-- System Information:
Debian Release: 5.0.2
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.26-2-amd64 (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages tar depends on:
ii  libc6                         2.9-12     GNU C Library: Shared libraries

tar recommends no packages.

Versions of packages tar suggests:
ii  bzip2                         1.0.5-2    high-quality block-sorting file co
pn  ncompress                     <none>     (no description available)

-- no debconf information



Re: [Fwd: Bug#543913: "tar tf" uses read() instead of lseek(), making it slow]

by Sergey Poznyakoff-2

Bdale Garbee <bdale@...> wrote:

> A user of our Debian packages of tar points out that it might be more
> efficient to use lseek() in some circumstances, particularly for
> archives containing a small number of large files.

The --seek (-n) command line option instructs tar to use lseeks
instead of reads. Use it.

Regards,
Sergey



Re: [Fwd: Bug#543913: "tar tf" uses read() instead of lseek(), making it slow]

by Lars Stoltenow-2

On Thu, Aug 27, 2009 at 06:29:04PM +0300, Sergey Poznyakoff wrote:
> The --seek (-n) command line option instructs tar to use lseeks
> instead of reads. Use it.

Then probably GNU tar should detect automatically if a file is seekable
or not.
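One common way to detect seekability is to attempt a no-op lseek() and fall back to reading when it fails, as it does on pipes. The sketch below shows that idea only; it is not taken from GNU tar, and the names `is_seekable` and `skip_bytes` are hypothetical.

```python
import os


def is_seekable(fd):
    """Return True if lseek() works on fd (fails on pipes, sockets, FIFOs)."""
    try:
        os.lseek(fd, 0, os.SEEK_CUR)  # no-op seek: position is unchanged
    except OSError:
        return False
    return True


def skip_bytes(fd, count):
    """Advance fd by count bytes, seeking when possible, reading otherwise."""
    if is_seekable(fd):
        os.lseek(fd, count, os.SEEK_CUR)
        return
    remaining = count
    while remaining > 0:
        # Read and discard; this is the slow path the bug report describes.
        chunk = os.read(fd, min(remaining, 65536))
        if not chunk:
            break  # end of input reached before count bytes were skipped
        remaining -= len(chunk)
```

A real implementation would probe once per archive rather than once per skip, and would still need an explicit override for devices where seeking succeeds but is not meaningful.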




Re: [Fwd: Bug#543913: "tar tf" uses read() instead of lseek(), making it slow]

by Tim Kientzle

Lars Stoltenow wrote:
> On Thu, Aug 27, 2009 at 06:29:04PM +0300, Sergey Poznyakoff wrote:
>> The --seek (-n) command line option instructs tar to use lseeks
>> instead of reads. Use it.
>
> Then probably GNU tar should detect automatically if a file is seekable
> or not.

This was discussed about two years ago:

http://www.mail-archive.com/bug-tar@.../msg01602.html



Re: [Fwd: Bug#543913: "tar tf" uses read() instead of lseek(), making it slow]

by Linda Walsh-6

Tim Kientzle wrote:
> Lars Stoltenow wrote:
>> On Thu, Aug 27, 2009 at 06:29:04PM +0300, Sergey Poznyakoff wrote:
>>> The --seek (-n) command line option instructs tar to use lseeks instead
>>> of reads. Use it.
>> Then probably GNU tar should detect automatically if a file is seekable
>> or not.
> This was discussed about two years ago:
> http://www.mail-archive.com/bug-tar@.../msg01602.html
----

  Yes, it was, and the final resolution, ...

     Subject: Re: [Bug-tar] GNU tar, star and BSD tar speed comparision

     On Tue, 2007-Oct-23, 03:32, Sergey Poznyakoff wrote:
     > Tim Kientzle wrote:
     > > Sergey Poznyakoff Tue, 23 Oct 2007 02:31:28 -0700 wrote:
     > > > When reading uncompressed tar archives stored in regular
     > > > files, bsdtar uses lseek() operations to skip over the
     > > > bodies of files.
     > As a side note, the similar feature in GNU tar is enabled using
     > seek option.  Is there a reason GNU tar doesn't enable this by
     > default for all regular files?
     
     I forgot to implement it :) But I'll fix that soon.
     
     Regards, Sergey

  I'm not sure of the exact circumstances, but would the user
have encountered the problem if he had been using a version
with that fix?