Is there a Drupal "Web Crawler"?


Is there a Drupal "Web Crawler"?

by brendan, fresh-off.com

Hello,

I have a client who wants to know if there are any Drupal modules that search the web for content related to him and his company and can then return the results (full articles or links to the content) to his Drupal website. For example, search the web for instances where "john doe" + "XYZ Company" both appear in the same piece of content.

Creating the crawler is way beyond my technical ability, so I'm hoping there are some good open source options (preferably a Drupal module) for this functionality. Wikipedia has a list of open source web crawlers, but since I'm unfamiliar with the subject, I'm unsure whether they can be integrated with Drupal, or whether any open source web crawlers are even meant to be integrated with a CMS.

A little more info about the use case: he and his company operate in the education field and are constantly being featured in articles (interviews, write-ups, etc.) across the web. In addition, and most importantly, he and his company produce several papers/articles that are featured in articles and education-related blogs across the internet as well. He finds searching manually for this content impractical and would love to have it automatically aggregated and sent to his Drupal site.

Any thoughts, ideas, or pointers in the right direction would be appreciated!

----

brendan, fresh-off.com
Creative Direction & Consultation: Web | Print | Brand
http://fresh-off.com
hello@...
206.328.1067

_______________________________________________
consulting mailing list
consulting@...
http://lists.drupal.org/mailman/listinfo/consulting

Re: Is there a Drupal "Web Crawler"?

by My Mailing List Account

How's about this?  


I'm not familiar with these at all, but you may want to consider getting the output of one of those crawlers to display inside of Drupal.

Cheers




Re: Is there a Drupal "Web Crawler"?

by Laura Scott-2

You're limited by the quality of the search. Much better to create a  
Google search or use some other service where finding is their  
business. Then use aggregator or FeedAPI or some such solution to pull  
in the feed.
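
A minimal sketch of what happens once such a feed exists, assuming the search service returns plain RSS 2.0. The sample feed, its URLs, and the helper names `parse_items` and `mentions_all` are made up for illustration; this is not how aggregator or FeedAPI work internally, only the general idea of reading feed items and keeping those where both search phrases appear together.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed standing in for whatever the search service returns.
SAMPLE_RSS = """
<rss version="2.0">
  <channel>
    <title>Example search feed</title>
    <item>
      <title>Interview with John Doe</title>
      <link>http://example.com/interview</link>
      <description>John Doe of XYZ Company talks about education.</description>
    </item>
    <item>
      <title>Another John Doe</title>
      <link>http://example.com/other</link>
      <description>A different John Doe, unrelated to the client.</description>
    </item>
  </channel>
</rss>
"""


def parse_items(rss_text):
    """Return one dict (title, link, description) per RSS <item>."""
    root = ET.fromstring(rss_text.strip())
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


def mentions_all(item, phrases):
    """True if every phrase appears, ignoring case, in the title or description."""
    text = (item["title"] + " " + item["description"]).lower()
    return all(phrase.lower() in text for phrase in phrases)


items = parse_items(SAMPLE_RSS)
matches = [i for i in items if mentions_all(i, ["john doe", "XYZ Company"])]
for match in matches:
    print(match["title"], match["link"])
```

The filter only sees the title and summary that the feed carries, so it is no better than the search behind the feed, which is the point about search quality above.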

Laura


Re: Is there a Drupal "Web Crawler"?

by David Hazel

+1 to Laura's comment.

I've written a crawler in PHP (non-Drupal project) and there are lots of considerations to think of, many of which don't crop up until you're adding "features" to your crawler to make your results meaningful and more "consumable".




--
Email is not a secure form of communication!

Drupal Consultant
http://www.hazelconsulting.com/
253.686.0296
dave@...
skype: hazelconsulting
gtalk:kananii
http://www.facebook.com/davidhazel
ICQ: 366587185


Re: Is there a Drupal "Web Crawler"?

by Khalid Baheyeldin-2


On Tue, Oct 13, 2009 at 9:09 PM, brendan, fresh-off.com <hello@...> wrote:

> Hello,
>
> I have a client that wants to know if there are any Drupal modules that search the web for content related to him and his company, and can then return the results (full articles or links to the content) to his drupal website.  For example, search the web for instances where "john doe" + "XYZ Company" both appear in the same piece of content.

How about a Google Alert for the terms you are interested in?

--
Khalid M. Baheyeldin
2bits.com, Inc.
http://2bits.com
Drupal optimization, development, customization and consulting.
Simplicity is prerequisite for reliability. --  Edsger W.Dijkstra
Simplicity is the ultimate sophistication. --   Leonardo da Vinci


Re: Is there a Drupal "Web Crawler"?

by My Mailing List Account

Yes!  And have it delivered to the Drupal site as an RSS feed? 
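
If the alert does arrive as a feed, whatever pulls it in has to avoid re-posting items it has already seen each time it polls. A rough sketch of that idea, assuming each item carries a stable link; the function name `new_items` and the sample data are hypothetical, and a Drupal feed module would presumably handle this for you:

```python
def new_items(fetched, seen_links):
    """Return items whose link has not been seen yet, and remember them."""
    fresh = []
    for item in fetched:
        link = item["link"]
        if link and link not in seen_links:
            seen_links.add(link)
            fresh.append(item)
    return fresh


# Links already turned into posts on the site (kept in memory for this sketch).
seen = set()

# Two polls of the same hypothetical feed; the second repeats item B.
first_poll = [
    {"title": "A", "link": "http://example.com/a"},
    {"title": "B", "link": "http://example.com/b"},
]
second_poll = [
    {"title": "B", "link": "http://example.com/b"},
    {"title": "C", "link": "http://example.com/c"},
]

first_new = new_items(first_poll, seen)
second_new = new_items(second_poll, seen)

print([i["title"] for i in first_new])
print([i["title"] for i in second_new])
```

In a real setup the set of seen links would need to be stored somewhere persistent, such as the site's database, rather than in memory.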




Re: Is there a Drupal "Web Crawler"?

by Tony Zielinski

I was just writing to say the same thing. Although I am not aware of a
feed for Google Web searches, there are feeds for Google News,
Blogs and other searches. I wonder if your client's idea was inspired
by the "trackback" feature used on some blogs. The Trackback module
will automatically post the URL of a referring webpage, which would
eliminate the menial task of searching for external sites linking to
"xyzcompany.com" with the intention of posting them to your site.



Re: Is there a Drupal "Web Crawler"?

by Ian Ward-3

Check out http://drupal.org/project/managingnews  (disclaimer: I work at Development Seed, which is the company that built this.)


Re: Is there a Drupal "Web Crawler"?

by Matt Chapman-10

I haven't tried it, but I think Phase2's Tattler was created for this
purpose:

http://tattlerapp.com/

-Matt


