Re: [gnu.org #429351] Access statistics for Savannah

Re: [gnu.org #429351] Access statistics for Savannah

by Alex Fernandez

On Tue, Jun 23, 2009 at 9:57 PM, Sylvain Beucler<beuc@...> wrote:
>> From my part I don't have any problems, now or ever. Let us ask
>> Sylvain first who was quite busy with the disk loss.
>
> No problem Alex, please go ahead.

Fine then. The objective of our little project is to create access
statistics for project developers. Since project web pages are stored
on the www.nongnu.org web server, developers cannot access them now.
(There is another leg which consists of getting file download
statistics so they can also see how many people have downloaded a
particular version, but that is a different matter.)

As I understand it, Savannah (sv.nongnu.org) triggers an update to the
web pages by using a curl command. The gnu.org web server then reads
the corresponding web directories from sv.nongnu.org. What we would
need is for the web server to copy back the corresponding logs into
sv.nongnu.org, strictly for the www.nongnu.org domain (so there are no
conflicts with other gnu.org pages). Later on we might tackle GNU
projects in Savannah -- I understand they are served from
http://www.gnu.org/software/ so the log can probably be isolated too.

Then the log files will be processed by awstats and made available to
project developers if they want to see them or even publish them. This
last part can be achieved by restricting the pages analyzed by awstats
to a certain project. (I don't know yet how to publish the stats so
that only project developers can see them; perhaps they can even be
mailed to developers if desired.)
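
As a rough illustration of the per-project restriction, here is a minimal
Python sketch that keeps only the log lines for one project before they are
handed to a statistics tool. It assumes the usual Apache common/combined log
layout and that a project's pages live under /<project>/ on www.nongnu.org;
the names REQUEST_RE and lines_for_project are made up for this example and
are not part of awstats or Savannah.

```python
import re

# Request part of a common/combined log line, e.g. "GET /path HTTP/1.1".
# Assumption: the first quoted field on the line is the request.
REQUEST_RE = re.compile(r'"(?:[A-Z]+) (\S+)[^"]*"')


def lines_for_project(lines, project):
    """Yield only the log lines whose requested path lies under /<project>/."""
    prefix = "/" + project + "/"
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # malformed or non-request line: skip it
        path = match.group(1)
        # Accept both "/project" (no trailing slash) and "/project/...".
        if path == "/" + project or path.startswith(prefix):
            yield line
```

The filtered lines could then be written to a per-project file and analyzed
separately, so each developer only ever sees their own project's traffic.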

I have created the directory
  /var/log/apache2/www.nongnu.org
in the cvs vserver so that we can do a test run. At the moment the
information will not be accessible to developers.

To summarize in a few questions:
  - Does the www.nongnu.org web server have ssh access to
sv.nongnu.org? Can it copy files there, in the specified directory?
  - Are the log files for www.nongnu.org kept separate from the
www.gnu.org domain so they can be copied as is? If not, can they be
separated, or at least grepped before copying?
  - How often are the web server logs rotated?
  - Can the copying task be automated?

I am cc'ing Savannah-hackers since they were interested, I hope
everyone is fine with it.

Alex.



[gnu.org #429351] Access statistics for Savannah

by Ward Vandewege via RT

> [alejandrofer@... - Tue Jun 23 18:56:25 2009]:
> Fine then. The objective of our little project is to create access
> statistics for project developers. Since project web pages are stored
> on the www.nongnu.org web server, developers cannot access them now.
> (There is another leg which consists of getting file download
> statistics so they can also see how many people have downloaded a
> particular version, but that is a different matter.)
>
> As I understand it, Savannah (sv.nongnu.org) triggers an update to the
> web pages by using a curl command. The gnu.org web server then reads
> the corresponding web directories from sv.nongnu.org. What we would
> need is for the web server to copy back the corresponding logs into
> sv.nongnu.org, strictly for the www.nongnu.org domain (so there are no
> conflicts with other gnu.org pages). Later on we might tackle GNU
> projects in Savannah -- I understand they are served from
> http://www.gnu.org/software/ so the log can probably be isolated too.
>
> Then the log files will be processed by awstats and made available to
> project developers if they want to see them or even publish them. This
> last part can be achieved by restricting the pages analyzed by awstats
> to a certain project. (I don't know yet how to publish the stats so
> that only project developers can see them; perhaps they can even be
> mailed to developers if desired.)
>
> I have created the directory
>   /var/log/apache2/www.nongnu.org
> in the cvs vserver so that we can do a test run. At the moment the
> information will not be accessible to developers.
>
> To summarize in a few questions:
>   - Does the www.nongnu.org web server have ssh access to
> sv.nongnu.org? Can it copy files there, in the specified directory?

It can be made to.

>   - Are the log files for www.nongnu.org kept separate from the
> www.gnu.org domain so they can be copied as is? If not, can they be
> separated, or at least grepped before copying?

They are separate.

>   - How often are the web server logs rotated?

Daily.

>   - Can the copying task be automated?

Sure.

I can set up such a cron job if you want.

Thanks,
Ward.

--
Ward Vandewege <ward@...>
Free Software Foundation - Senior System Administrator




Re: [gnu.org #429351] Access statistics for Savannah

by Alex Fernandez

On Wed, Jun 24, 2009 at 11:16 PM, Ward Vandewege via RT<sysadmin@...> wrote:
>>   - Can the copying task be automated?
>
> Sure.
>
> I can set up such a cron job if you want.

Fine. Could you copy a couple of log files from www.nongnu.org to
sv.nongnu.org, for testing purposes? Directory is
  /vservers/vcs-noshell/var/log/apache2/www.nongnu.org

Thanks,

Alex.



[gnu.org #429351] Access statistics for Savannah

by Ward Vandewege via RT

> [alejandrofer@... - Sat Jun 27 08:22:15 2009]:
>
> On Wed, Jun 24, 2009 at 11:16 PM, Ward Vandewege via
> RT<sysadmin@...> wrote:
> >>   - Can the copying task be automated?
> >
> > Sure.
> >
> > I can set up such a cron job if you want.
>
> Fine. Could you copy a couple of log files from www.nongnu.org to
> sv.nongnu.org, for testing purposes? Directory is
>   /vservers/vcs-noshell/var/log/apache2/www.nongnu.org

Sorry - this ticket got buried and then overlooked :/

I've now copied all our old logfiles for nongnu.org to

  /vservers/vcs-noshell/var/log/apache2/www.nongnu.org

I have installed a cron job that will copy the previous day's files every
day, after log rotation. I've created a new user on savannah for this
purpose, with name nongnulogcopy.

You will see there are 4 files for every day:

  nongnu-access.log.1
  nongnu-error.log.1
  nongnu-projects.log.1
  nongnu-projects-error.log.1

The first 2 catch all access to (www.)nongnu.org/. The latter 2 files
catch all access to *.nongnu.org.

I think those nongnu-projects files will probably not be very useful
because they don't log which hostname the access is from. If you want, I
can add a field to those files for that purpose. Let us know.

OK?

Thanks,
Ward.

--
Ward Vandewege <ward@...>
Free Software Foundation - Senior System Administrator




Re: [gnu.org #429351] Access statistics for Savannah

by Alex Fernandez

Hi Ward,

On Mon, Sep 28, 2009 at 6:41 PM, Ward Vandewege via RT <sysadmin@...> wrote:
> Sorry - this ticket got buried and then overlooked :/

I was thinking of pinging you again -- thanks for being faster.

> I have installed a cron job that will copy the previous day's files every
> day, after log rotation. I've created a new user on savannah for this
> purpose, with name nongnulogcopy.

Most useful.

> You will see there are 4 files for every day:
>
>  nongnu-access.log.1
>  nongnu-error.log.1
>  nongnu-projects.log.1
>  nongnu-projects-error.log.1
>
> The first 2 catch all access to (www.)nongnu.org/. The latter 2 files
> catch all access to *.nongnu.org.

Due to my (immense) ignorance I didn't even know that projects can be
reached from project.nongnu.org!

> I think those nongnu-projects files will probably not be very useful
> because they don't log which hostname the access is from. If you want, I
> can add a field to those files for that purpose. Let us know.

I think it would be nice. If you are sure about some format that
awstats likes just do it; otherwise let me look into it.

It would seem that some projects prefer
  http://www.nongnu.org/project
and others
  http://project.nongnu.org/
with a majority in the first group. In my case I used the first option
due to ignorance (the second is much cleaner); perhaps it can be
extrapolated to others.
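
Since both URL forms point at the same project, any per-project statistics
would need to fold them together. A minimal sketch of that mapping in Python
follows; the function name project_from_url is invented for illustration, and
the sketch assumes that every non-www subdomain of nongnu.org is a project,
which is not true for service hosts such as sv.nongnu.org.

```python
from urllib.parse import urlsplit


def project_from_url(url):
    """Return the project name for either URL form, or None if there is none."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in ("www.nongnu.org", "nongnu.org"):
        # Form 1: http://www.nongnu.org/project/...
        segments = [s for s in parts.path.split("/") if s]
        return segments[0] if segments else None
    if host.endswith(".nongnu.org"):
        # Form 2: http://project.nongnu.org/...
        # Caveat: service hosts (e.g. sv.nongnu.org) would also land here.
        return host[: -len(".nongnu.org")]
    return None
```

With this, project_from_url("http://www.nongnu.org/project") and
project_from_url("http://project.nongnu.org/") both give "project".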

> OK?

Great. Now as soon as I have some time, let's start awstating those logs!

Alex.



[gnu.org #429351] Access statistics for Savannah

by Ward Vandewege via RT

Hi Alex,

> [alejandrofer@... - Mon Sep 28 19:46:22 2009]:
> Due to my (immense) ignorance I didn't even know that projects can be
> reached from project.nongnu.org!

I don't think that form is used very much, for whatever reason. Perhaps
it is undocumented...

> > I think those nongnu-projects files will probably not be very useful
> > because they don't log which hostname the access is from. If you
> > want, I can add a field to those files for that purpose. Let us know.
>
> I think it would be nice. If you are sure about some format that
> awstats likes just do it; otherwise let me look into it.
 
I modified the log format to:

  CustomLog /var/log/apache2/nongnu-projects.log "%v %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\""

It shouldn't be hard to make awstats do the right thing with that by
telling it which fields are which.
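
To make the field order concrete, here is a small Python sketch that splits
one line of that format into named fields. It is only an illustration of
which field is which, not an awstats configuration; the names QUOTED, LINE_RE
and parse_line are invented here, and the sketch assumes that %u never
contains spaces.

```python
import re

# Body of a quoted field; Apache writes embedded quotes as \" so allow escapes.
QUOTED = r'(?:[^"\\]|\\.)*'

# One named group per field of:
#   %v %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"
LINE_RE = re.compile(
    r'(?P<vhost>\S+) (?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>' + QUOTED + r')" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>' + QUOTED + r')" "(?P<agent>' + QUOTED + r')"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None
```

The new leading field ends up under the "vhost" key, which is what lets the
hits in nongnu-projects.log be attributed to a particular project hostname.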

> Great. Now as soon as I have some time, let's start awstating those
> logs!

Sounds good. Note that I made a change to the cron script that runs on
nadesico: it will now dump all log files into the same directory but
with names like the following, so as not to overwrite older files:

   nongnu-projects-error.log.20090929

etc.
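
For illustration only, the renaming step could look something like the Python
sketch below. This is not the actual cron script on nadesico; the function
name dated_name is hypothetical, and which day the suffix refers to (the day
of the copy or the day the entries were logged) is left to the caller.

```python
import datetime


def dated_name(rotated_name, day):
    """Map e.g. 'nongnu-access.log.1' to 'nongnu-access.log.YYYYMMDD'."""
    base, sep, suffix = rotated_name.rpartition(".")
    # Only names ending in a numeric rotation suffix (".1", ".2", ...) qualify.
    if not sep or not suffix.isdigit():
        raise ValueError("not a rotated log name: " + rotated_name)
    return base + "." + day.strftime("%Y%m%d")
```

Because the suffix is year, month, day in that order, a plain alphabetical
listing of the directory also lists each log in chronological order.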

Thanks,
Ward.

--
Ward Vandewege <ward@...>
Free Software Foundation - Senior System Administrator




Re: [gnu.org #429351] Access statistics for Savannah

by Alex Fernandez

Hi Ward,

On Wed, Sep 30, 2009 at 11:23 PM, Ward Vandewege via RT
<sysadmin@...> wrote:
>> [alejandrofer@... - Mon Sep 28 19:46:22 2009]:
>> Due to my (immense) ignorance I didn't even know that projects can be
>> reached from project.nongnu.org!
>
> I don't think that form is used very much, for whatever reason. Perhaps
> it is undocumented...

The homepage link on Savannah certainly points to
www.nongnu.org/project. I don't know what is better... with
project.nongnu.org you save 4 characters.

> I modified the log format to:
>
>  CustomLog /var/log/apache2/nongnu-projects.log "%v %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\""
>
> It shouldn't be hard to make awstats do the right thing with that by
> telling it which fields are which.

Thanks again.

> Sounds good. Note that I made a change to the cron script that runs on
> nadesico: it will now dump all log files into the same directory but
> with names like the following, so as not to overwrite older files:
>
>   nongnu-projects-error.log.20090929

Much better. Thank bog for ISO date format!

Alex.


