Should I propose this class to pear?


by Tech Support-3

Hello!

I wrote a very simple class to convert HTML into plain text. It consists
of one class file and one XSL template. The template transforms the HTML
into a usable plain-text format:

- tags that cannot be used in plain text, like <style>, <script>, <head>
  and <object>, are dropped;
- some other tags, like <h1> and <b>, get a meaningful plain-text style;
- <br> becomes a newline;
- <p> becomes 2 newlines + tab + text + 2 more newlines;
- <ol> items become numbered items (with actual numbers);
- links are extracted, so that <a href...>title</a> is converted like this:
  Link title [Link: http://..... ]
- <img> tags are converted to links to the image's src attribute.

The XSL template also calls some PHP functions, enabled through
registerPHPFunctions() on the XSLTProcessor class.
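For readers who have not used this approach, here is a minimal sketch of the technique described above. It is not the class discussed in this thread: the stylesheet, the htmlToText() function name, the upper-casing of <h1> and the "[Link: ...]" output chosen for <img> are all assumptions made for illustration. It needs PHP's dom and xsl extensions.

```php
<?php
// Hypothetical stylesheet, written for this example only.
$xsl = <<<'XSL'
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:php="http://php.net/xsl"
    exclude-result-prefixes="php">
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Drop elements that have no plain-text meaning. -->
  <xsl:template match="style | script | head | object"/>

  <!-- Assumed heading style: upper-case the text via a PHP function. -->
  <xsl:template match="h1">
    <xsl:text>&#10;</xsl:text>
    <xsl:value-of select="php:functionString('strtoupper', .)"/>
    <xsl:text>&#10;&#10;</xsl:text>
  </xsl:template>

  <xsl:template match="br"><xsl:text>&#10;</xsl:text></xsl:template>

  <!-- 2 newlines + tab + text + 2 newlines -->
  <xsl:template match="p">
    <xsl:text>&#10;&#10;&#9;</xsl:text>
    <xsl:apply-templates/>
    <xsl:text>&#10;&#10;</xsl:text>
  </xsl:template>

  <!-- Numbered list items with actual numbers. -->
  <xsl:template match="ol/li">
    <xsl:number format="1. "/>
    <xsl:apply-templates/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Link title [Link: url ] -->
  <xsl:template match="a[@href]">
    <xsl:apply-templates/>
    <xsl:text> [Link: </xsl:text>
    <xsl:value-of select="@href"/>
    <xsl:text> ]</xsl:text>
  </xsl:template>

  <!-- Assumed output format for images. -->
  <xsl:template match="img[@src]">
    <xsl:text>[Link: </xsl:text>
    <xsl:value-of select="@src"/>
    <xsl:text> ]</xsl:text>
  </xsl:template>
</xsl:stylesheet>
XSL;

// Hypothetical helper: parse HTML leniently, then apply the stylesheet.
function htmlToText($html, $xsl)
{
    $previous = libxml_use_internal_errors(true); // tolerate sloppy HTML
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $style = new DOMDocument();
    $style->loadXML($xsl);

    $proc = new XSLTProcessor();
    // Allow only the PHP function the template actually calls.
    $proc->registerPHPFunctions('strtoupper');
    $proc->importStylesheet($style);

    return (string) $proc->transformToXml($doc);
}

$html = '<html><head><title>t</title><style>p{color:red}</style></head>'
      . '<body><h1>Hello</h1>'
      . '<p>See <a href="http://example.com/">the site</a></p>'
      . '<ol><li>one</li><li>two</li></ol></body></html>';

echo htmlToText($html, $xsl);
```

Passing a function name to registerPHPFunctions() restricts the template to that function; calling it with no argument would expose every PHP function to the stylesheet, which is riskier if the template is not fully trusted.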


Anyway, is anyone interested in making this into a PEAR class?


--
PEAR Development Mailing List (http://pear.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php


Re: Should I propose this class to pear?

by tfk

On Tue, Oct 6, 2009 at 2:48 PM, Dmitri <message@...> wrote:

> Hello!
>
> I wrote a very simple class to convert html into plaintext format.
> It's a very simple class that consists of one file and xsl template. It uses
> XSL template for transforming html into
> a usable plaintext format, dropping tags that cannot possibly be used in
> plaintext like <style>, <script>, <head>, <object>
> and applying some meaningful styles to some other tags like <h1>, <b>, <br>
> into newline, <p> into 2 new lines + tab + text + 2 more new lines
> <ol> items converted into numbered items (with actual numbers), the links
> are extracted so that the <a href...>title</a> tags are converted like this:
> Link title [Link: http://..... ]
>
> <img> tags are converted to links to img src tag.

Maybe you could take over and extend XML_XSLT_Wrapper?
http://pear.php.net/package/XML_XSLT_Wrapper

Till



Re: Should I propose this class to pear?

by Tech Support-3

I was not even aware of that PEAR class, but now that I've looked at it, it
looks too complicated. I don't know anything about all these external
command-line XSLT processors (I've heard of Sablotron); I have only worked
with PHP's XSLTProcessor class and with browser-based XSL transformations.
Also, XML_XSLT_Wrapper is not a PHP 5 class, so it would have to be
rewritten anyway.

My class for HTML2Text uses an XSL template, DOMDocument and XSLTProcessor
in a very simple way, so the whole package is just one class file under 200
lines long, including comments, and one XSL template. There is no need to
complicate it with another wrapper.

If anyone is interested, I can upload it somewhere and post a link.





till wrote:

> Maybe you could take over and extend XML_XSLT_Wrapper?
> http://pear.php.net/package/XML_XSLT_Wrapper
>
> Till




Re: Should I propose this class to pear?

by tfk

Just a suggestion, you don't have to. :-)

Probably wouldn't hurt to upload the code somewhere and add an example
or two so people see what it's built for.


Till

On Tue, Oct 6, 2009 at 10:26 AM, Dmitri <message@...> wrote:

> I was not even aware of that PEAR class, but now that I've looked at it, it
> looks too complicated. I don't know anything about all these external
> command-line XSLT processors (I've heard of Sablotron), I only worked with
> PHP's XSLTProcessor class and with browser-based XSL transformations. Also
> that XML_XSLT_Wrapper is not a PHP 5 class, so it will have to be rewritten
> anyway.
>
> My class for HTML2Text uses XSL template, DOMDocument and XSLTProcessor in a
> very simple way - so the whole package is just one class file under 200
> lines long including the comments, and one xsl template, no need to
> complicate it with another wrapper.
>
> If anyone is interested, I can upload it somewhere and post a link.



Re: Should I propose this class to pear?

by Tech Support-3

OK, here is the link; you can download it and try it out. It's the very
first release, tested with only a few HTML files as input.

http://webmaster.lampcms.com/p86486-HTML_to_Text_with_XSLT/


till wrote:
> Just a suggestion, you don't have to. :-)
>
> Probably wouldn't hurt to upload the code somewhere and add an example
> or two so people see what it's built for.

