Caching file_get_contents output inside APC (Performance improvement).

by Basant Kukreja-3

Hi,
   APC creates an mmap segment in which it caches compiled PHP code. Besides
this, APC provides an API for caching user data in the same mmap. Many PHP
apps use file_get_contents to read static files. If APC can cache these files
in its mmap, file_get_contents performance will improve.

Based on this idea, I developed a patch that stores file_get_contents output
in APC's cache. With this patch, APC intercepts calls to file_get_contents and
replaces them with the apc_file_get_contents function. If the file is given as
a full path and the entire file is requested, APC caches the content.

To enable the caching, use the following in php.ini:
apc.cache_static_contents=1
apc.user_ttl = 30

apc.user_ttl decides how long (in seconds) an entry remains in the cache. If
apc.user_ttl is set to 0, the entry never expires from the cache.

Here is the performance data collected using Studio 12 on Solaris SPARC for
an e-commerce PHP benchmark (3500 simultaneous users):

             Excl.     Incl.     Excl.      Incl.      Excl.     Incl.
             User CPU  User CPU  Total LWP  Total LWP  Sys. CPU  Sys. CPU
             sec.      sec.      sec.       sec.       sec.      sec.      Name
No Cache :   0.831     28.660    1.141      540.018    0.030     344.311   zif_file_get_contents
With Cache : 1.481     181.537   1.821      282.758    0.140     12.449    zif_apc_file_get_contents


User + System time (inclusive) before caching = 372 seconds
User + System time (inclusive) after caching = 193 seconds
Total time reduction in file_get_contents = 48%
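
The totals above appear to be the sum of the inclusive User CPU and Sys. CPU
columns for each row; the posted 372 and 193 look like those sums with the
fraction dropped. A small sketch of that arithmetic, assuming this reading of
the table is right (the function name is made up for illustration):

```php
<?php
// Hypothetical helper: percentage reduction between two CPU-time totals.
function cpu_time_reduction(float $before, float $after): float
{
    return ($before - $after) / $before * 100.0;
}

// Inclusive User CPU + inclusive Sys. CPU, copied from the table above.
$before = 28.660 + 344.311;   // zif_file_get_contents
$after  = 181.537 + 12.449;   // zif_apc_file_get_contents

printf("before: %.3f s\n", $before);
printf("after:  %.3f s\n", $after);
printf("reduction: %.1f%%\n", cpu_time_reduction($before, $after));
```

Note that user CPU time goes up with the cache enabled; the saving comes from
the drop in system CPU time.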

Here is my blog :
http://blogs.sun.com/basant/entry/caching_file_get_contents_output
Here is the link to the patch :
http://blogs.sun.com/basant/resource/apc_file_cache_trunk_patch.txt

There are many other enhancements possible in this patch. I would appreciate
any comments.

Regards,
Basant.

--
PECL development discussion Mailing List (http://pecl.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php


Re: [APC-DEV] Caching file_get_contents output inside APC (Performance improvement).

by Pierre Joye

hi,

Caching everything obviously helps, but I wonder whether it makes sense to
do it inside an opcode cache. What performance improvement does it
bring vs.:

(pseudo code where my_cache* could be the APC user cache, memcached,
or another caching lib such as Cache_Lite)
$data = my_cache_get($key);
if ($data === false) {
  $data = file_get_contents($path);
  my_cache_add($key, $data, $ttl);
}
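
The pseudo code above could be fleshed out roughly as follows. This is only a
sketch: my_cache_get and my_cache_add are implemented here as in-process
stand-ins backed by a PHP array, and cached_file_get_contents is a
hypothetical wrapper name; a real setup would back them with the APC user
cache, memcached, or similar.

```php
<?php
// Stand-in cache store for illustration only; it lives for one request.
$GLOBALS['my_cache'] = [];

// Returns the cached string, or false on a miss or an expired entry.
function my_cache_get(string $key)
{
    $entry = $GLOBALS['my_cache'][$key] ?? null;
    if ($entry === null) {
        return false;
    }
    if ($entry['expires'] !== 0 && $entry['expires'] < time()) {
        return false;
    }
    return $entry['data'];
}

// Stores $data under $key; a TTL of 0 means "no expiry" in this sketch.
function my_cache_add(string $key, string $data, int $ttl): void
{
    $GLOBALS['my_cache'][$key] = [
        'data'    => $data,
        'expires' => $ttl > 0 ? time() + $ttl : 0,
    ];
}

// Hypothetical wrapper: read through the cache, fall back to the file.
function cached_file_get_contents(string $path, int $ttl = 30)
{
    $key  = 'file:' . $path;
    $data = my_cache_get($key);
    if ($data === false) {
        $data = file_get_contents($path);
        if ($data !== false) {
            my_cache_add($key, $data, $ttl);
        }
    }
    return $data;
}
```

Callers would replace file_get_contents($path) with
cached_file_get_contents($path); until the TTL runs out, changes to the file
on disk are not seen.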

Cheers,
--
Pierre
On Mon, Oct 26, 2009 at 9:36 PM, Basant Kukreja
<basant.kukreja@...> wrote:

> [...]

--
Pierre

http://blog.thepimp.net | http://www.libgd.org



Re: [APC-DEV] Caching file_get_contents output inside APC (Performance improvement).

by Basant Kukreja-3

Hi,
    It will probably perform close to the version I attached, but my point is
that the application doesn't need to be modified; it can use this feature
through a configuration change alone.

Further options I can think of to extend the caching feature:
apc.cache_max_file_size    # Maximum size of a file to be cached
apc.max_cache_use_bytes    # Maximum amount of cache space to be occupied

Regards,
Basant.




Re: [APC-DEV] Caching file_get_contents output inside APC (Performance improvement).

by shire


Hi Basant,

Basant Kukreja wrote:

> On Mon, Oct 26, 2009 at 2:14 PM, Pierre Joye <pierre.php@...> wrote:
>>
>> (pseudo code where my_cache* could be either APC user cache, memcached
>> or other caching lib like Cache_Lite)
>> $data = my_cache_get($key);
>> if ($data === false) {
>>   $data = file_get_contents($path);
>>   my_cache_add($key, $data, $ttl);
>> }
>>
>
> It will probably perform close to the version I attached, but my point is
> that the application doesn't need to be modified; it can use this feature
> through a configuration change alone.
>
> Further options I can think of to extend the caching feature:
> apc.cache_max_file_size    # Maximum size of a file to be cached
> apc.max_cache_use_bytes    # Maximum amount of cache space to be occupied
>


Thanks for the patch/suggested feature but I tend to agree with Pierre's suggestion for handling this in user space.  This is an extremely simple wrapper function to enable caching these values, and as your additional configuration items suggest, placing this functionality in APC creates some unnecessary complications and less flexibility for the PHP developer.

The intention of the APC user cache is to allow extending it in PHP user space for caching any number of items like this, using the very method Pierre suggested.  Heading in the direction you're suggesting would mean that APC should be responsible for caching all sorts of stream and file based activities that PHP and even other extensions could perform without knowing enough about the data to do it in an intelligent way.

Unless you can provide reasons why the APC extension in particular is in a situation to handle this better than anything else, I don't see a compelling reason to start adding this type of functionality.

-shire



Re: [APC-DEV] Caching file_get_contents output inside APC (Performance improvement).

by Basant Kukreja-3

Hi,
   I did this work to improve SPECweb PHP performance. I can't, of course,
change the PHP scripts to use APC-specific functions in order to improve
SPECweb PHP performance.

Here are a few advantages:
* As with SPECweb PHP, I would expect a great many PHP applications to use
  file_get_contents to load static content. We can't expect these
  applications to be modified to improve their performance.
* Moreover, modifying these applications to use APC functions makes them
  APC-specific: they can't be run or tested without APC. (An application can
  test for APC availability dynamically, but that is too much to ask of
  applications.)
* I downloaded three applications (MediaWiki, Joomla and WordPress) and found
  that all three use file_get_contents. Even though these applications could
  use the mechanism described by Pierre, they don't. Unless APC is bundled
  with PHP, it is very unlikely that these applications will make
  APC-specific calls. Adding this feature to APC can benefit users who run
  these kinds of applications.
* The cache can be turned on or off by configuration.
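
For reference, the dynamic availability test mentioned in the list above
might look roughly like the sketch below. apc_fetch and apc_store are the APC
user-cache functions; the wrapper name and the key prefix are hypothetical.
When the APC extension is not loaded, the function falls back to a plain
read.

```php
<?php
// Hypothetical wrapper: use the APC user cache only if it is available.
function fgc_with_optional_apc(string $path, int $ttl = 30)
{
    // Without the APC extension these functions are undefined, so every
    // APC branch below is skipped.
    $haveApc = function_exists('apc_fetch') && function_exists('apc_store');
    $key = 'fgc:' . $path; // key prefix chosen for this sketch

    if ($haveApc) {
        $hit  = false;
        $data = apc_fetch($key, $hit);
        if ($hit) {
            return $data;
        }
    }

    // Cache miss, or no APC: read the file directly.
    $data = file_get_contents($path);
    if ($haveApc && $data !== false) {
        apc_store($key, $data, $ttl);
    }
    return $data;
}
```

Every call site still has to be changed to use the wrapper, which is the
cost this message is arguing against.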


Whether or not this feature goes into APC depends on how important we think
file caching is. If we think that file caching is an important aspect and that
many customers use it, then integrating and maintaining it inside APC makes
sense to me.

Inspiration behind this feature :
---------------------------------
    A file cache is an integral part of Sun Web Server (which originated from
Netscape Enterprise Server). When a JSP application includes a static file,
Sun Web Server serves the file from its file cache. I want to do the same
thing for PHP applications. When a PHP script loads a file, it should give the
web server an opportunity to serve the content from its file cache, but
unfortunately PHP doesn't implement any such feature. Initially I tried to
hook the feature into PHP itself and was able to do so, but hooking into APC
was much easier and much cleaner because APC already provides APIs for storing
content. I could also write a separate extension, but then I would have to
re-implement the mmap APIs that APC already has.

Regards,
Basant.

On Wed, Oct 28, 2009 at 12:56 AM, shire <shire@...> wrote:

> [...]
