Alignment question for write callback

by Goswin von Brederlow-2

Hi,

I wonder if it would be possible, from the kernel<->user protocol
side, to read write requests in such a way that the data itself ends
up being 512 byte (or page) aligned. Is there a fixed size request
followed by variable size data depending on the request type?

The reason I ask is that libaio needs the data to be aligned, and I
would rather not copy the data from the fuse buffer into an aligned
buffer for libaio. It would be better to patch libfuse to allow
aligned buffers.

MfG
        Goswin

------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day
trial. Simplify your report design, integration and deployment - and focus on
what you do best, core application coding. Discover what's new with
Crystal Reports now.  http://p.sf.net/sfu/bobj-july
_______________________________________________
fuse-devel mailing list
fuse-devel@...
https://lists.sourceforge.net/lists/listinfo/fuse-devel

Re: Alignment question for write callback

by Miklos Szeredi

On Thu, 05 Nov 2009, Goswin von Brederlow wrote:

> I wonder if it would be possible, from the kernel<->user protocol
> side, to read write requests in such a way that the data itself ends
> up being 512 byte (or page) aligned. Is there a fixed size request
> followed by variable size data depending on the request type?

It would be possible to allocate the buffer so that the data portion
of write requests gets the required alignment.  But that obviously
means that other types of requests won't be aligned (which shouldn't
be a problem AFAICS).

To do that, you basically have to copy the fuse_loop* function that
you need and change the buffer allocation code to your needs.

Not the nicest solution, but should work.
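
A minimal sketch of that allocation trick, for illustration only. The
struct and function names are hypothetical and not part of libfuse;
the caller supplies the write header length. The idea is to pad the
front of an aligned allocation so that the byte right after the
header lands on an alignment boundary.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper type, not part of libfuse. */
struct aligned_req_buf {
    void *base;      /* pointer to hand back to free() */
    char *req;       /* where the whole request should be read to */
    char *data;      /* req + hdr_len, aligned to the requested boundary */
    size_t req_len;  /* hdr_len + max_data, the size to pass to read() */
};

/*
 * Allocate a request buffer whose payload (the bytes following a header
 * of hdr_len bytes) starts on an `align` boundary.
 * `align` must be a power of two and at least sizeof(void *).
 * Returns 0 on success, -1 on allocation failure.
 */
static int aligned_req_buf_alloc(struct aligned_req_buf *b, size_t hdr_len,
                                 size_t max_data, size_t align)
{
    assert(align >= sizeof(void *) && (align & (align - 1)) == 0);

    /* Padding in front of the header so that the header ends on a boundary. */
    size_t pad = (align - hdr_len % align) % align;

    void *base = NULL;
    if (posix_memalign(&base, align, pad + hdr_len + max_data) != 0)
        return -1;

    b->base = base;
    b->req = (char *)base + pad;
    b->data = b->req + hdr_len;
    b->req_len = hdr_len + max_data;
    return 0;
}
```

In a copied fuse_loop* function one would presumably read into b.req
with length b.req_len, with hdr_len set to
sizeof(struct fuse_in_header) + sizeof(struct fuse_write_in) from
<linux/fuse.h>. Only write requests then have their payload at b.data;
other request types have different header sizes, as noted above.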

Thanks,
Miklos


Re: Alignment question for write callback

by Goswin von Brederlow-2

Miklos Szeredi <miklos@...> writes:

> On Thu, 05 Nov 2009, Goswin von Brederlow wrote:
>
>> I wonder if it would be possible, from the kernel<->user protocol
>> side, to read write requests in such a way that the data itself ends
>> up being 512 byte (or page) aligned. Is there a fixed size request
>> followed by variable size data depending on the request type?
>
> It would be possible to allocate the buffer so that the data portion
> of write requests gets the required alignment.  But that obviously
> means that other types of requests won't be aligned (which shouldn't
> be a problem AFAICS).
>
> To do that, you basically have to copy the fuse_loop* function that
> you need and change the buffer allocation code to your needs.
>
> Not the nicest solution, but should work.
>
> Thanks,
> Miklos

So the header of a write request is always the same size, but the
headers of other requests differ?

Write is probably the only common callback with large amounts of
data. It is probably also the only one where I need the data to be
aligned. In other cases (for me) the data needs to be copied anyway
and is usually small.

I was hoping that there would be one fixed-size header for all
requests followed by a variable number of bytes for data (for write,
setattr, symlink, ...). Ideally there would even be separate buffers
for the header and the payload. Can fuse requests be read in chunks
from /dev/fuse or does every read() call have to read a full request?

MfG
        Goswin




Re: Alignment question for write callback

by Miklos Szeredi

On Thu, 05 Nov 2009, Goswin von Brederlow wrote:
> So the header of a write request is always the same size but the
> header of other requests differ?

Right.

> Write is probably the only common callback with large amounts of
> data. Probably the only one I need alignment of the data as well. In
> other cases (for me) the data needs to be copied anyway and is usually
> small.
>
> I was hoping that there would be one fixed size header for all
> requests followed by a variable number of bytes for data (for write,
> setattr, symlink, ...). Ideally even have separate buffer for the
> header and the payload. Can fuse requests be read in chunks from
> /dev/fuse or does every read() call have to read a full request?

It's one read per request, currently.  That's sort of necessary if we
want to keep the multi-threaded parallel reader/writer capability.

There's a strong push towards allowing zero-copy capabilities, one way
of which would be to use splice(2).  So you'd do the following:

  splice(fuse_fd, NULL, private_pipe_fd, NULL, max_req_len, 0)

to transfer one request from the fuse device to a per-thread private
pipe.  Then you could read the header and handle the data accordingly.
The data could then be further spliced onto a filesystem, network
socket or other device in a zero-copy fashion.

The other direction could be done similarly just in the reverse:
assemble the request in a private pipe from separate buffers, then
splice that onto the fuse device in one go.

The downside is more system calls, which isn't a problem with large
read/write requests, but might slightly degrade the performance of
smaller (lookup, getattr, etc.) requests.
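
The receive path described above could be sketched roughly as follows.
This is an illustration under assumptions, not libfuse code: all three
function names and struct demo_header are hypothetical stand-ins, and
it presumes a kernel in which the source descriptor (here the fuse
device) actually supports splice(2).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* needed for the splice() declaration */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical stand-in for the fixed leading part of a request. */
struct demo_header {
    uint32_t len;     /* total request length, header included */
    uint32_t opcode;  /* request type */
};

/*
 * Step 1: move one request from src_fd into the write end of a
 * per-thread private pipe without copying it through user space.
 * Returns the number of bytes moved, or -1 on error.
 */
static ssize_t move_request(int src_fd, int pipe_wr, size_t max_req_len)
{
    assert(max_req_len > 0);
    return splice(src_fd, NULL, pipe_wr, NULL, max_req_len, 0);
}

/*
 * Step 2: copy only the small header out of the pipe. The payload
 * stays queued in the pipe. Returns 0 on success, -1 otherwise.
 */
static int read_header(int pipe_rd, struct demo_header *h)
{
    ssize_t n = read(pipe_rd, h, sizeof *h);
    return n == (ssize_t)sizeof *h ? 0 : -1;
}

/*
 * Step 3: pass the payload still sitting in the pipe on to dst_fd
 * (a file, socket, or other device). Returns bytes moved or -1.
 */
static ssize_t forward_payload(int pipe_rd, int dst_fd, size_t len)
{
    return splice(pipe_rd, NULL, dst_fd, NULL, len, 0);
}
```

That is three system calls for a write request instead of one read(),
which matches the trade-off mentioned above: negligible for large
payloads, noticeable for small requests that carry only a header.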

Thanks,
Miklos


Re: Alignment question for write callback

by Goswin von Brederlow-2

Miklos Szeredi <miklos@...> writes:

> On Thu, 05 Nov 2009, Goswin von Brederlow wrote:
>> So the header of a write request is always the same size but the
>> header of other requests differ?
>
> Right.
>
>> Write is probably the only common callback with large amounts of
>> data. Probably the only one I need alignment of the data as well. In
>> other cases (for me) the data needs to be copied anyway and is usually
>> small.
>>
>> I was hoping that there would be one fixed size header for all
>> requests followed by a variable number of bytes for data (for write,
>> setattr, symlink, ...). Ideally even have separate buffer for the
>> header and the payload. Can fuse requests be read in chunks from
>> /dev/fuse or does every read() call have to read a full request?
>
> It's one read per request, currently.  That's sort of necessary if we
> want to keep the multi-threaded parallel reader/writer capability.

I was afraid of that.

> There's a strong push towards allowing zero-copy capabilities, one way
> of which would be to use splice(2).  So you'd do the following:
>
>   splice(fuse_fd, NULL, private_pipe_fd, NULL, max_req_len, 0)
>
> to transfer one request from the fuse device to a per-thread private
> pipe.  Then you could read the header and handle the data accordingly.
> The data could then be further spliced onto a filesystem, network
> socket or other device in a zero-copy fashion.
>
> The other direction could be done similarly just in the reverse:
> assemble the request in a private pipe from separate buffers, then
> splice that onto the fuse device in one go.
>
> The downside is more system calls, which isn't a problem with large
> read/write requests, but might slightly degrade the performance of
> smaller (lookup, getattr, etc.) requests.
>
> Thanks,
> Miklos

I would very much like that. But maybe there could be a mixed mode
for single-threaded reads (or appropriately locked threaded reads):

read(fuse_fd, req, small_req_len);
if (req->large_request) {
   splice(fuse_fd, NULL, private_pipe_fd, NULL, max_data_len, 0);
}

That way lookup, getattr, release, ... only need the one syscall and
writes and setxattr with lots of data use zero-copy.

For read and getxattr there could also be

write(fuse_fd, reply, small_reply_length);
splice(user_fd, NULL, fuse_fd, NULL, data_len, 0);

That is, if splicing from a file/device/socket FD to fuse_fd can be
made to work directly.

But as that breaks the current multithreaded read/write case, it
might be difficult to implement or undesirable due to the necessary
locking.

MfG
        Goswin
