stats in binary protocol


by Toru Maesaka

Hi all,

So I've been working on implementing the stats command over the binary
protocol for the last couple of days and came across a few questions
that I would like to ask. My approach is based on the discussion from
the last hackathon. Here's a paragraph from Dustin's notes:

--
A stats command is issued with a single string parameter, and the
server returns multiple responses, each containing a key, and a string
value.  A terminating packet indicates the server has nothing more to
say.  [We didn't really talk about the details of this, but I'd
recommend terminating with a stat with a 0 length key and 0 length
value].
--
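
To make the framing in the note above concrete, here is a minimal client-side sketch of reading stat responses until the terminating packet. The thread does not spell out the packet layout, so the 24-byte header format, the 0x81 response magic, and the 0x10 opcode are assumptions, and `pack_stat` and `read_stats` are hypothetical helper names.

```python
import io
import struct

# Assumed 24-byte response header (big-endian): magic, opcode, key length,
# extras length, data type, status, total body length, opaque, CAS.
HEADER = struct.Struct(">BBHBBHIIQ")
MAGIC_RESPONSE = 0x81  # assumption: magic byte marking a response
OPCODE_STAT = 0x10     # assumption: opcode of the stats command


def pack_stat(key: bytes, value: bytes) -> bytes:
    """Build one stat response packet (only used here to fake a server)."""
    header = HEADER.pack(MAGIC_RESPONSE, OPCODE_STAT, len(key), 0, 0, 0,
                         len(key) + len(value), 0, 0)
    return header + key + value


def read_stats(stream) -> dict:
    """Collect key/value packets until one with an empty key and value."""
    stats = {}
    while True:
        raw = stream.read(HEADER.size)
        if len(raw) != HEADER.size:
            raise EOFError("stream ended before the terminating packet")
        (_magic, _opcode, key_len, extras_len, _data_type,
         status, body_len, _opaque, _cas) = HEADER.unpack(raw)
        body = stream.read(body_len)
        if len(body) != body_len:
            raise EOFError("truncated packet body")
        if status != 0:
            raise RuntimeError(f"server returned status {status}")
        key = body[extras_len:extras_len + key_len]
        value = body[extras_len + key_len:]
        if not key and not value:
            return stats  # 0-length key and 0-length value: nothing more to say
        stats[key.decode()] = value.decode()


# Fake a server sending two stats followed by the terminator.
wire = (pack_stat(b"pid", b"1234") + pack_stat(b"curr_items", b"42")
        + pack_stat(b"", b""))
print(read_stats(io.BytesIO(wire)))  # {'pid': '1234', 'curr_items': '42'}
```

The client never splits strings here: the key and value boundaries come from the lengths in the header.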

What I would like to know is:

(1) Since the server sends an arbitrary stats output back to the client
one row at a time, does this mean a given client should be able
to ask for any specific information? (e.g. a client could _only_ ask
for the pid of the server). This would make my life a little tricky
;-)

(2) What should we do if a key wasn't contained in the request? I'm
thinking that this case should be treated the same as "stats\r\n" in
the ascii protocol. Are there any objections...?

(3) What is the motivation behind sending a row at a time, when we
could serialize the entire output of an arbitrary stats request and
send it back to the client, like the current ascii approach (the output
data is fairly small, after all)? I'm asking because serializing
everything at once would be relatively easy to pull off.

It would clear my mind a lot if I could get these questions answered :-)

Cheers,
Toru

Re: stats in binary protocol

by Dormando

Toru Maesaka wrote:
> Hi all,
>
> So I've been working on implementing the stats command over the binary
> protocol for the last couple of days and came across a few questions
> that I would like to ask. My approach is based on the discussion from
> the last hackathon. Here's a paragraph from Dustin's notes:

Thanks for doing this :) Sorry I suck. I'll comment, but Dustin/etc
should be able to comment on this as well :)

> (1) Since the server sends an arbitrary stats output back to the client
> one row at a time, does this mean a given client should be able
> to ask for any specific information? (e.g. a client could _only_ ask
> for the pid of the server). This would make my life a little tricky
> ;-)

No, I believe the intent was that the key would provide sets of
information, not match specific information.

> (2) What should we do if a key wasn't contained in the request? I'm
> thinking that this case should be treated the same as "stats\r\n" in
> the ascii protocol. Are there any objections...?

Yeah. 'stats'.

> (3) What is the motivation behind sending a row at a time, when we
> could serialize the entire output of an arbitrary stats request and
> send it back to the client, like the current ascii approach (the output
> data is fairly small, after all)? I'm asking because serializing
> everything at once would be relatively easy to pull off.

The idea was to make it simpler for the clients at a higher level. What
this does is allow the clients to return arrays of key/value responses
to their callers, instead of using some format that the client author
would then need to parse.

Presently it's a one-liner in Perl to parse stats, the libmemcached API
"hates" the stats (but supports it, I think?), and designing it this way
would make it dead simple for all of the languages to return arrays or
hashmaps of the requested stats without any extra parsing code.
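
For comparison, this is roughly the parsing step an ascii-protocol client has to carry today, and which the key/value design would make unnecessary. It is only a sketch: `parse_ascii_stats` is a hypothetical helper, and it assumes the usual "STAT <name> <value>\r\n" rows terminated by "END\r\n".

```python
def parse_ascii_stats(payload: str) -> dict:
    """Turn an ascii stats reply into a dict of name -> value strings."""
    stats = {}
    for line in payload.split("\r\n"):
        if line == "END":
            break  # terminator row: nothing more to read
        if line.startswith("STAT "):
            # Split on the first two spaces only, so values may contain spaces.
            _, name, value = line.split(" ", 2)
            stats[name] = value
    return stats


reply = "STAT pid 1234\r\nSTAT version 1.2.5\r\nEND\r\n"
print(parse_ascii_stats(reply))  # {'pid': '1234', 'version': '1.2.5'}
```

With one key and one value per response packet, a client can fill the same dict directly from the packet fields, with no row format to agree on.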

Of course, anyone's free to comment here :)

-Dormando

Re: stats in binary protocol

by Toru Maesaka

Oops, forgot to reply to all.

Thanks for the reply, Dormando, and excuse me for sending the same email twice :)

> No, I believe the intent was that the key would provide sets of
> information, not match specific information.

Fantastic :-) I was worried about this for a moment.

> Yeah. 'stats'.

Awesome!

> The idea was to make it simpler for the clients at a higher level. What
> this does is allow the clients to return arrays of key/value responses
> to their callers, instead of using some format that the client author
> would then need to parse.

This was really good to hear, as my original intention was to return,
say, 'STAT <name> <value>\r\n' a row at a time. After reading your
reply, I assume what we reaaaally want to do is include the key in the
response packet, like the getk command in the binary protocol, right?
This way clients won't need to do any string parsing.

Please do correct me if I'm wrong :(

Toru

On Fri, Jun 27, 2008 at 5:26 PM, dormando <dormando@...> wrote:

> [snip]

Re: stats in binary protocol

by Dormando

Knew I was forgetting an important detail there :)

Toru: putting the key in the response packet is what I meant, yes. So
there's no encoding, except probably formatting everything (numbers,
other strings) as strings in the value. This is, as you said, an
alternative to encoding "<stat> = value" in the value part of the
response packets.

I'll also echo Dustin's sentiments that the implementation is more up to
you; I'm just echoing what we discussed prior, and Dustin's arguments
below are more of the reason why multi-response looks to be the right way.

No strong opinions on returning an error or not on unmatched stats
commands though.

-Dormando

> Well, as the one doing the implementation, I think you probably have a
> bit more leverage at the moment to decide what's wrong.  :)
>
> The general spirit of stats argument is that it's interpreted by the
> server.  So we *could* decide that it's a good idea to have an argument
> in the form of general:pid to return only the pid if it made sense.  The
> only downside is that it'd have to be supported for a while.
>
> Part of what we discussed was having the stats processed by multiple
> parts of the server.  That is, there are stats that are core (such as
> PID, rusage, number of hits and misses, etc...), stats that are
> engine-specific (such as how many items total exist, LRU eviction stats,
> etc...) and possibly others.  This means that a stats request has to be
> passed around with a consistent API to emit values back to the client.
>
> This is another reason the ``multi-response'' thing is necessary.  An
> engine may have a *lot* of stats it wants to emit, and buffering that
> all up and formatting it for a given protocol could be difficult and
> would slow things down/use more memory.  It all just kind of works its
> way out like this.  Formatting is handled by the protocol handler, and
> content is handled by the thing that knows.
>
> It is important at this point to determine what it means to provide an
> argument the server doesn't understand (and what it means to not
> understand it) in this case.  You could infer that if nobody emitted a
> response, there was nothing interesting to say about it, or you could
> have the API actually indicate whether or not the component actually
> understood the stat argument.  Does it even matter?  We could also just
> say, ``If the server has something to say about your argument, it will.''
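
The split Dustin describes (content produced by whichever component knows it, formatting done by the protocol handler) can be sketched with a plain emit callback. Everything named here is hypothetical: the component functions, the stat names and values, and the "items" argument are made up for illustration, and a real server would write each row to the connection as it is emitted instead of collecting rows in a list.

```python
from typing import Callable, List, Tuple

# A component reports each stat by calling emit(key, value).
Emit = Callable[[str, str], None]


def core_stats(arg: str, emit: Emit) -> None:
    """Core stats; this sketch only answers the empty (general) argument."""
    if arg == "":
        emit("pid", "1234")
        emit("get_hits", "10")


def engine_stats(arg: str, emit: Emit) -> None:
    """Engine-specific stats; answers the general and 'items' arguments."""
    if arg in ("", "items"):
        emit("curr_items", "42")


COMPONENTS = [core_stats, engine_stats]


def handle_stats(arg: str, emit: Emit) -> None:
    """Pass the request to every component with the same emit API."""
    for component in COMPONENTS:
        component(arg, emit)


def ascii_stats(arg: str) -> str:
    """Ascii protocol handler: formats rows as text, then END."""
    rows: List[str] = []
    handle_stats(arg, lambda k, v: rows.append(f"STAT {k} {v}\r\n"))
    rows.append("END\r\n")
    return "".join(rows)


def binary_stats(arg: str) -> List[Tuple[str, str]]:
    """Binary-style handler: one (key, value) pair per response packet,
    ending with an empty key and empty value."""
    pairs: List[Tuple[str, str]] = []
    handle_stats(arg, lambda k, v: pairs.append((k, v)))
    pairs.append(("", ""))
    return pairs


print(binary_stats("items"))  # [('curr_items', '42'), ('', '')]
print(binary_stats("bogus"))  # [('', '')]
```

In this sketch an argument nobody understands simply produces the terminator alone, which is the "if the server has something to say about your argument, it will" option; reporting an error instead would require the components to say whether they recognized the argument.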


Re: stats in binary protocol

by Toru Maesaka

Dustin, Dormando,

Thanks for the feedback; you guys got rid of my concerns :)

> Toru: putting the key in the response packet is what I meant, yes. So
> there's no encoding, except probably formatting everything (numbers, other
> strings) as strings in the value. This is, as you said, an alternative to
> encoding "<stat> = value" in the value part of the response packets.

Sounds good!

>> Part of what we discussed was having the stats processed by multiple parts
>> of the server.  That is, there are stats that are core (such as PID, rusage,
>> number of hits and misses, etc...), stats that are engine-specific (such as
>> how many items total exist, LRU eviction stats, etc...) and possibly others.
>> This means that a stats request has to be passed around with a consistent
>> API to emit values back to the client.

Gotcha. I will keep this in mind when I design the approach.

Okay, so let me get back to you guys when I have something decent to
show (though I assume we'll be chatting on IRC much sooner than that).

Have a good weekend,
Toru


On Sat, Jun 28, 2008 at 3:08 AM, dormando <dormando@...> wrote:

> [snip]