an extension of the mizar language for dealing with article metadata

View: New views
20 Messages — Rating Filter:   Alert me  
< Prev | 1 - 2 - 3 | Next >

an extension of the mizar language for dealing with article metadata

by Jesse Alama :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

As it stands now, the title of an article, its author(s), and the name
of sections aren't actually a formal part of a mizar text: we
customarily include this information as comments

  :: Introduction to XYZ Theory
  ::
  :: by Joe the MIZAR Writer

  environ

  ...

  begin :: Preliminaries

  ...

  begin :: Main Lemma

  ...

  begin :: Main Theorem

  ...

I propose to put this information into the body of the article, to move
it from comments to non-comments.  Perhaps we could introduce new
keywords like `title', `author', and `section'.  Then, articles could
look like

 title{Introduction to XYZ Theory}
 author{Joe the MIZAR Writer}

 environ

   <stuff>

 begin{Preliminaries}

 definition ...

 theorem ...

 theorem ...

 begin{Main Lemma}

 definition ...

 begin{Main Theorem}

 theorem ...

This would make the sections and article title part of the text; as it
stands, these features are not represented in the mizar grammar (because
customarily they are tucked away within comments).

The primary motivation for this idea is to help improve Josef Urban's
semantic MML presentation.  Currently he has to do some hacking to
extract metadata like article titles and authors; extracting section
information (currently unsupported) would be similarly complicated.  The
problem is that the article metadata isn't represented in the XML
representation of an article; it has to be inferred from the original
mizar source.

Additionally, this new syntax might help to simplify processing of FM
articles: with this approach one wouldn't need the BibTeX fields
SECTION1, SECTION2, etc., because these could be inferred from the
article.

Thoughts?

Best,

Jesse

--
Jesse Alama (alama@...)

Re: an extension of the mizar language for dealing with article metadata

by Piotr Rudnicki :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi,

I suggest, following Jesse's suggestion that the entire .bib files for
Mizar articles be incorporated into the .miz files.

Cheers, PR

On Wed, Nov 19, 2008 at 01:12:34PM -0800, Jesse Alama wrote:

> As it stands now, the title of an article, its author(s), and the name
> of sections aren't actually a formal part of a mizar text: we
> customarily include this information as comments
>
>   :: Introduction to XYZ Theory
>   ::
>   :: by Joe the MIZAR Writer
>
>   environ
>
>   ...
>
>   begin :: Preliminaries
>
>   ...
>
>   begin :: Main Lemma
>
>   ...
>
>   begin :: Main Theorem
>
>   ...
>
> I propose to put this information into the body of the article, to move
> it from comments to non-comments.  Perhaps we could introduce new
> keywords like `title', `author', and `section'.  Then, articles could
> look like
>
>  title{Introduction to XYZ Theory}
>  author{Joe the MIZAR Writer}
>
>  environ
>
>    <stuff>
>
>  begin{Preliminaries}
>
>  definition ...
>
>  theorem ...
>
>  theorem ...
>
>  begin{Main Lemma}
>
>  definition ...
>
>  begin{Main Theorem}
>
>  theorem ...
>
> This would make the sections and article title part of the text; as it
> stands, these features are not represented in the mizar grammar (because
> customarily they are tucked away within comments).
>
> The primary motivation for this idea is to help improve Josef Urban's
> semantic MML presentation.  Currently he has to do some hacking to
> extract metadata like article titles and authors; extracting section
> information (currently unsupported) would be similarly complicated.  The
> problem is that the article metadata isn't represented in the XML
> representation of an article; it has to be inferred from the original
> mizar source.
>
> Additionally, this new syntax might help to simplify processing of FM
> articles: with this approach one wouldn't need the BibTeX fields
> SECTION1, SECTION2, etc., because these could be inferred from the
> article.
>
> Thoughts?
>
> Best,
>
> Jesse
>
> --
> Jesse Alama (alama@...)

--
Piotr Rudnicki                                http://web.cs.ualberta.ca/~piotr

Re: an extension of the mizar language for dealing with article metadata

by trybulec :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Dear Jesse,

In Algol 60 a comment was a part of the language. The syntax was
     comment   TEXT ;

where TEXT was an arbitrary string of characters ( the semicolon itself
was not allowed)
We treat now comments as something that actually is not a part of the
language. I would rather
have "pure Mizar" with only these informations that are necessary for
the verifier.
We have now some features that should not be in pure Mizar and probably
should be replaced by pragmas:
they may be classified in the following categories:
i. temporary pragmas:   '@proof'  - to suspend verification
ii. library pragmas:   'canceled' - to keep numeration correct
iii. editorial pragmas:  repeated 'begin' - to divide article into section.
iv. maybe some others

We tried - on our seminar - to find common solution, but the categories
are so different that probably
there is none.

Recently, after reading your letter, we discussed the problem again with
Czeslaw and it seems that for the metadata the proper solution is to
have an XML-envelope of the Mizar article. Maybe something like
      <DOCUMENT>
         "the place for metadata"
      <ARTICLE>
          "the Mizar article"
      </ARTICLE>
      </DOCUMENT>
(mind you, I know almost nothing on XML, so it might quite naive).

Then from such XML document we may extract the Mizar article and process
it in the usual way.
"the Mizar article" is probably not the exact citation: we have '<' as a
symbol in Mizar. Still extracting the Mizar article should not be too
complicated. Maybe we can be more precise and have the form

      <DOCUMENT>
         "the place for metadata"
      <ARTICLE>
      <ENVIRONMENT DECLARATION>
          "directives"
      </ENVIRONMENT DECLARATION>
       <TEXT PROPER>
          "the text proper"
      </TEXT PROPER>
      </ARTICLE>
      </DOCUMENT>

Then, 'environ' and 'begin' (if necessary) may be added while extracting
the Mizar article.  The directives and the text proper are written in
two different languages, anyway.

      <DOCUMENT>
         "the place for metadata"
      <ARTICLE>
      <ENVIRONMENT DECLARATION>
          "directives"
      </ENVIRONMENT DECLARATION>
       <TEXT PROPER>
        <SECTION>
          "the text proper"
        </SECTION>
          ...
        <SECTION>
          "the text proper"
        </SECTION>
      </TEXT PROPER>
      </ARTICLE>
      </DOCUMENT>

Then we may put in the <SECTION> the title and maybe even the summary of
the section. And the grammar of Mizar will be a little bit simpler: no
repetition of 'begin'.

Regards,
Andrzej

Jesse Alama wrote:

>As it stands now, the title of an article, its author(s), and the name
>of sections aren't actually a formal part of a mizar text: we
>customarily include this information as comments
>
>  :: Introduction to XYZ Theory
>  ::
>  :: by Joe the MIZAR Writer
>
>  environ
>
>  ...
>
>  begin :: Preliminaries
>
>  ...
>
>  begin :: Main Lemma
>
>  ...
>
>  begin :: Main Theorem
>
>  ...
>
>I propose to put this information into the body of the article, to move
>it from comments to non-comments.  Perhaps we could introduce new
>keywords like `title', `author', and `section'.  Then, articles could
>look like
>
> title{Introduction to XYZ Theory}
> author{Joe the MIZAR Writer}
>
> environ
>
>   <stuff>
>
> begin{Preliminaries}
>
> definition ...
>
> theorem ...
>
> theorem ...
>
> begin{Main Lemma}
>
> definition ...
>
> begin{Main Theorem}
>
> theorem ...
>
>This would make the sections and article title part of the text; as it
>stands, these features are not represented in the mizar grammar (because
>customarily they are tucked away within comments).
>
>The primary motivation for this idea is to help improve Josef Urban's
>semantic MML presentation.  Currently he has to do some hacking to
>extract metadata like article titles and authors; extracting section
>information (currently unsupported) would be similarly complicated.  The
>problem is that the article metadata isn't represented in the XML
>representation of an article; it has to be inferred from the original
>mizar source.
>
>Additionally, this new syntax might help to simplify processing of FM
>articles: with this approach one wouldn't need the BibTeX fields
>SECTION1, SECTION2, etc., because these could be inferred from the
>article.
>
>Thoughts?
>
>Best,
>
>Jesse
>
>  
>


Re: an extension of the mizar language for dealing with article metadata

by trybulec :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi:

Piotr Rudnicki wrote:

>Hi,
>
>I suggest, following Jesse's suggestion that the entire .bib files for
>Mizar articles be incorporated into the .miz files.
>
>  
>
I am not against, I am not pro.  Could me tell me what would be the
gain, if we did it?

Regards,
Andrzej

Re: an extension of the mizar language for dealing with article metadata

by Josef Urban :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Hi,

I think the new directives should not use XML syntax, simply because it is
ugly, verbose, and unnecessary. I don't think XML should be used for
direct human authoring and reading, just as assembly language should not.
The new directives should be normal Mizar keywords like "author", "title",
etc.

Josef

On Tue, 25 Nov 2008, trybulec wrote:

> Dear Jesse,
>
> In Algol 60 a comment was a part of the language. The syntax was
>    comment   TEXT ;
>
> where TEXT was an arbitrary string of characters ( the semicolon itself
> was not allowed)
> We treat now comments as something that actually is not a part of the
> language. I would rather
> have "pure Mizar" with only these informations that are necessary for the
> verifier.
> We have now some features that should not be in pure Mizar and probably
> should be replaced by pragmas:
> they may be classified in the following categories:
> i. temporary pragmas:   '@proof'  - to suspend verification
> ii. library pragmas:   'canceled' - to keep numeration correct
> iii. editorial pragmas:  repeated 'begin' - to divide article into
> section.
> iv. maybe some others
>
> We tried - on our seminar - to find common solution, but the categories
> are so different that probably
> there is none.
>
> Recently, after reading your letter, we discussed the problem again with
> Czeslaw and it seems that for the metadata the proper solution is to have
> an XML-envelope of the Mizar article. Maybe something like
>     <DOCUMENT>
>        "the place for metadata"
>     <ARTICLE>
>         "the Mizar article"
>     </ARTICLE>
>     </DOCUMENT>
> (mind you, I know almost nothing on XML, so it might quite naive).
>
> Then from such XML document we may extract the Mizar article and process
> it in the usual way.
> "the Mizar article" is probably not the exact citation: we have '<' as a
> symbol in Mizar. Still extracting the Mizar article should not be too
> complicated. Maybe we can be more precise and have the form
>
>     <DOCUMENT>
>        "the place for metadata"
>     <ARTICLE>
>     <ENVIRONMENT DECLARATION>
>         "directives"
>     </ENVIRONMENT DECLARATION>
>      <TEXT PROPER>
>         "the text proper"
>     </TEXT PROPER>
>     </ARTICLE>
>     </DOCUMENT>
>
> Then, 'environ' and 'begin' (if necessary) may be added while extracting
> the Mizar article.  The directives and the text proper are written in two
> different languages, anyway.
>
>     <DOCUMENT>
>        "the place for metadata"
>     <ARTICLE>
>     <ENVIRONMENT DECLARATION>
>         "directives"
>     </ENVIRONMENT DECLARATION>
>      <TEXT PROPER>
>       <SECTION>
>         "the text proper"
>       </SECTION>
>         ...
>       <SECTION>
>         "the text proper"
>       </SECTION>
>     </TEXT PROPER>
>     </ARTICLE>
>     </DOCUMENT>
>
> Then we may put in the <SECTION> the title and maybe even the summary of
> the section. And the grammar of Mizar will be a little bit simpler: no
> repetition of 'begin'.
>
> Regards,
> Andrzej
>
> Jesse Alama wrote:
>
>> As it stands now, the title of an article, its author(s), and the name
>> of sections aren't actually a formal part of a mizar text: we
>> customarily include this information as comments
>>
>>  :: Introduction to XYZ Theory
>>  ::
>>  :: by Joe the MIZAR Writer
>>
>>  environ
>>
>>  ...
>>
>>  begin :: Preliminaries
>>
>>  ...
>>
>>  begin :: Main Lemma
>>
>>  ...
>>
>>  begin :: Main Theorem
>>
>>  ...
>>
>> I propose to put this information into the body of the article, to move
>> it from comments to non-comments.  Perhaps we could introduce new
>> keywords like `title', `author', and `section'.  Then, articles could
>> look like
>>
>> title{Introduction to XYZ Theory}
>> author{Joe the MIZAR Writer}
>>
>> environ
>>
>>   <stuff>
>>
>> begin{Preliminaries}
>>
>> definition ...
>>
>> theorem ...
>>
>> theorem ...
>>
>> begin{Main Lemma}
>>
>> definition ...
>>
>> begin{Main Theorem}
>>
>> theorem ...
>>
>> This would make the sections and article title part of the text; as it
>> stands, these features are not represented in the mizar grammar (because
>> customarily they are tucked away within comments).
>>
>> The primary motivation for this idea is to help improve Josef Urban's
>> semantic MML presentation.  Currently he has to do some hacking to
>> extract metadata like article titles and authors; extracting section
>> information (currently unsupported) would be similarly complicated.  The
>> problem is that the article metadata isn't represented in the XML
>> representation of an article; it has to be inferred from the original
>> mizar source.
>>
>> Additionally, this new syntax might help to simplify processing of FM
>> articles: with this approach one wouldn't need the BibTeX fields
>> SECTION1, SECTION2, etc., because these could be inferred from the
>> article.
>>
>> Thoughts?
>>
>> Best,
>>
>> Jesse
>>
>>
>
>

Re: an extension of the mizar language for dealing with article metadata

by Josef Urban :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message



On Wed, 26 Nov 2008, trybulec wrote:

>> I suggest, following Jesse's suggestion that the entire .bib files for
>> Mizar articles be incorporated into the .miz files.
>>
>>
> I am not against, I am not pro.  Could me tell me what would be the gain,
> if we did it?

This is written in Jesse's e-mail: the metadata could be processed by
standard Mizar-based tools that are currently used for creating the HTML
representation, are (going to be) used for collecting data for MML Query,
and possibly also other things like FM and MPTP. It is not practical to
keep these data separately and process them by separate tools, especially
when they are quite a natural part of the articles.

Josef

Re: an extension of the mizar language for dealing with article metadata

by trybulec :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Josef Urban wrote:

>
> Hi,
>
> I think the new directives should not use XML syntax, simply because
> it is ugly, verbose, and unnecessary. I don't think XML should be used
> for direct human authoring and reading, just as assembly language
> should not. The new directives should be normal Mizar keywords like
> "author", "title", etc.
>
Hi:

It is not exactly to the point but we talk about 'reserved words' not
'keywords'. A keyword is a word that characterize
the semantics of the article like 'compact',  'homology group' and so
on. So, in Mizar - the symbols. Not all of them.
'Function' is hardly a keyword - too often used. The tradition of the
misuse of the word goes back to Algol 60, the Authors of the declaration
did not know what 'keyword' means.

Some of my colleagues tried to convince me that we should use XML
because of ready tools like browsers that present the text in human
readable form. It is true?

The Mizar system, at least for me, is first a Math Assistant. When I
work on an article (it happens) I am not interested in the authors' name
and I have yet to invent the title. I would like to keep the Mizar
syntax, in this role, as simple as possible.
I do not see why the pragmas should be part of the language. The only
their 'operational semantics' is to ignore them.

We discuss only the library pragmas. What about temporary pragmas, used
to suspend the processing. We need
more of them. Piotr Rudnicki, if I recall, argued for '@now' or '@by' to
suspend the checker . '@proof' does not help,
if we have large Diffuse Statement. In old Mizar (on Cyber 72} we have
pragma 'virtual  EOF' , useful because sometimes ANALYZER take more time
that the CHECKER.
So what about them, again new 'reserved' words?

What about library pragmas? Again we need more, not only 'canceled',
missing for schemes anyway. E.g. 'obsolete' for theorems or schemes that
are later proved in a stronger form, and original ones should not be
used. For instance
 FUNCT_1:sch 2
that it is still used, even if
CLASSES1:sch 1
should be used instead.

So we need, for different purposes various systems of pragmas that may
be developed separately and do not think they should be mixed with Mizar.

Regards,
Andrzej





Re: an extension of the mizar language for dealing with article metadata

by trybulec :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Josef Urban wrote:

>
> This is written in Jesse's e-mail: the metadata could be processed by
> standard Mizar-based tools that are currently used for creating the
> HTML representation, are (going to be) used for collecting data for
> MML Query, and possibly also other things like FM and MPTP. It is not
> practical to keep these data separately and process them by separate
> tools, especially when they are quite a natural part of the articles.
>
'natural' ?

Well, if you mean by an 'article' a mathematical paper you're right, if
it is thought as a script for the verifier, you're not.

Regards,
Andrzej

Re: an extension of the mizar language for dealing with article metadata

by Freek Wiedijk :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Josef:

>It is not practical to keep these data separately and
>process them by separate tools, especially when they are
>quite a natural part of the articles.

Why not?  The vocabulary files are exactly like that too.

Of course _I_ always want the vocabulary files to be part
of the articles.  It is a pain that they have their own
name space, and that you need two files instead of one.

However, "not practical" seems too strong.

Freek

Re: an extension of the mizar language for dealing with article metadata

by Josef Urban :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


XML is easy to process by machines, and it is for this reason used as the
Mizar internal layer and input for other systems. Mizar parser produces
the XML from the Mizar human-like text, and I think it should stay this
way. That is: human-like keywords for humans, processed into
machine-understandable XML by Mizar parser.

The metadata info should certainly be optional during verification and
only required (if at all) when the article is submitted. I don't think
that broadening this discussion to other "pragmas" is going to help to fix
the problem Jesse asked about in a reasonable time. Inability to present
the metadata in HTML is a bug that should be fixed quickly, discussing new
Mizar features is useful but should be done separately and not in this
context .

Josef


On Fri, 28 Nov 2008, trybulec wrote:

> Josef Urban wrote:
>
>>
>> Hi,
>>
>> I think the new directives should not use XML syntax, simply because it
>> is ugly, verbose, and unnecessary. I don't think XML should be used for
>> direct human authoring and reading, just as assembly language should
>> not. The new directives should be normal Mizar keywords like "author",
>> "title", etc.
>>
> Hi:
>
> It is not exactly to the point but we talk about 'reserved words' not
> 'keywords'. A keyword is a word that characterize
> the semantics of the article like 'compact',  'homology group' and so on.
> So, in Mizar - the symbols. Not all of them.
> 'Function' is hardly a keyword - too often used. The tradition of the
> misuse of the word goes back to Algol 60, the Authors of the declaration
> did not know what 'keyword' means.
>
> Some of my colleagues tried to convince me that we should use XML because
> of ready tools like browsers that present the text in human readable form.
> It is true?
>
> The Mizar system, at least for me, is first a Math Assistant. When I work
> on an article (it happens) I am not interested in the authors' name and I
> have yet to invent the title. I would like to keep the Mizar syntax, in
> this role, as simple as possible.
> I do not see why the pragmas should be part of the language. The only
> their 'operational semantics' is to ignore them.
>
> We discuss only the library pragmas. What about temporary pragmas, used to
> suspend the processing. We need
> more of them. Piotr Rudnicki, if I recall, argued for '@now' or '@by' to
> suspend the checker . '@proof' does not help,
> if we have large Diffuse Statement. In old Mizar (on Cyber 72} we have
> pragma 'virtual  EOF' , useful because sometimes ANALYZER take more time
> that the CHECKER.
> So what about them, again new 'reserved' words?
>
> What about library pragmas? Again we need more, not only 'canceled',
> missing for schemes anyway. E.g. 'obsolete' for theorems or schemes that
> are later proved in a stronger form, and original ones should not be used.
> For instance
> FUNCT_1:sch 2
> that it is still used, even if
> CLASSES1:sch 1
> should be used instead.
>
> So we need, for different purposes various systems of pragmas that may be
> developed separately and do not think they should be mixed with Mizar.
>
> Regards,
> Andrzej
>
>
>
>
>

Re: an extension of the mizar language for dealing with article metadata

by Josef Urban :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message



On Fri, 28 Nov 2008, trybulec wrote:

> Josef Urban wrote:
>
>>
>> This is written in Jesse's e-mail: the metadata could be processed by
>> standard Mizar-based tools that are currently used for creating the HTML
>> representation, are (going to be) used for collecting data for MML
>> Query, and possibly also other things like FM and MPTP. It is not
>> practical to keep these data separately and process them by separate
>> tools, especially when they are quite a natural part of the articles.
>>
> 'natural' ?
>
> Well, if you mean by an 'article' a mathematical paper you're right, if it
> is thought as a script for the verifier, you're not.

So why is the keyword "begin" allowed multiple times to introduce
sections, and why do all articles start with the standard header
containing author and title, why are comments allowed at all? If there
is no need to process and present these article parts in HTML, MML Query,
FM, MPTP, etc., I suggest to disallow these "unnatural" parts also in the
article (.miz) files. Or perhaps they are useful, and we should properly
care about them? Writing fragile Perl hacks that will hardly be usable on
Windoze is then hardly the right way: the Mizar parser should do this.

Josef

>
> Regards,
> Andrzej
>

Re: an extension of the mizar language for dealing with article metadata

by trybulec :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Josef Urban wrote:

>
> XML is easy to process by machines, and it is for this reason used as
> the Mizar internal layer and input for other systems. Mizar parser
> produces the XML from the Mizar human-like text, and I think it should
> stay this way. That is: human-like keywords for humans, processed into
> machine-understandable XML by Mizar parser.
>
> The metadata info should certainly be optional during verification and
> only required (if at all) when the article is submitted. I don't think
> that broadening this discussion to other "pragmas" is going to help to
> fix the problem Jesse asked about in a reasonable time. Inability to
> present the metadata in HTML is a bug that should be fixed quickly,
> discussing new Mizar features is useful but should be done separately
> and not in this context .


As I wrote I have no opinion on the topic. I hoped that you would
support XML :-)

On the other hand why not put the metadata into an unremovable comment,
starting e.g. with
::$

More serious problem: what do you mean by 'Inability to present the
metadata in HTML' ?

Regards.
Andrzej


Re: an extension of the mizar language for dealing with article metadata

by trybulec :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Josef Urban wrote:

>
> So why is the keyword "begin" allowed multiple times to introduce
> sections, and why do all articles start with the standard header
> containing author and title, why are comments allowed at all? If there
> is no need to process and present these article

That is my complaint, the repetition of 'begin' has nothing to do with
Mizar, and should not be allowed. I would prefer
something like:

::$ SECTION

maybe with a name, as a cross reference to the place where description
of the section is kept:
   the title, maybe a summary

BTW. In Mizar HPF (around 1993) only one kind of comments was allowed:
written by a pen on the printout.
It did not work.

Regards,
Andrzej

Re: an extension of the mizar language for dealing with article metadata

by Josef Urban :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message



On Sun, 30 Nov 2008, trybulec wrote:

> Josef Urban wrote:
>
>>
>> XML is easy to process by machines, and it is for this reason used as the
>> Mizar internal layer and input for other systems. Mizar parser produces the
>> XML from the Mizar human-like text, and I think it should stay this way.
>> That is: human-like keywords for humans, processed into
>> machine-understandable XML by Mizar parser.
>>
>> The metadata info should certainly be optional during verification and only
>> required (if at all) when the article is submitted. I don't think that
>> broadening this discussion to other "pragmas" is going to help to fix the
>> problem Jesse asked about in a reasonable time. Inability to present the
>> metadata in HTML is a bug that should be fixed quickly, discussing new
>> Mizar features is useful but should be done separately and not in this
>> context .
>
>
> As I wrote I have no opinion on the topic. I hoped that you would support XML
> :-)

See above: using XML for direct human reading/writing is in my opinion
quite often abuse of a good technology developed for machines.

> On the other hand why not put the metadata into an unremovable comment,
> starting e.g. with
> ::$

That could be a first step, if we don't want to bother with enhancing the
Mizar parser. But if there is important information in such comments
(title, author, sections, nice names of theorems, important comments,
etc.) that will be used for presentation/searching/etc in HTML, FM, MML
Query, etc., a parser for these things is needed anyway, so why not
re-use the Mizar parser? Also, these things should appear in the XML form
of the Mizar article, which is currently produced by the Mizar parser.

> More serious problem: what do you mean by 'Inability to present the metadata
> in HTML' ?

Metadata like title, author, etc. are not present in the XML
representation of an article produced by the Mizar parser (because
formally they are just ignored comments).

Josef

Re: an extension of the mizar language for dealing with article metadata

by Josef Urban :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message



On Sun, 30 Nov 2008, trybulec wrote:

> Josef Urban wrote:
>
>>
>> So why is the keyword "begin" allowed multiple times to introduce sections,
>> and why do all articles start with the standard header containing author
>> and title, why are comments allowed at all? If there is no need to process
>> and present these article
>
> That is my complaint, the repetition of 'begin' has nothing to do with Mizar,
> and should not be allowed. I would prefer
> something like:
>
> ::$ SECTION
>
> maybe with a name, as a cross reference to the place where description of the
> section is kept:
>  the title, maybe a summary
>
> BTW. In Mizar HPF (around 1993) only one kind of comments was allowed:
> written by a pen on the printout.
> It did not work.

No wonder. http://en.wikipedia.org/wiki/Literate_programming

Comments in programming languages serve as a general storage for anything
that the language does not capture yet. If there is a frequent need for
expressing certain annotations, such annotations should be eventually made
more formal and taken care of by extending the language's syntax. I think
Isabelle/Isar has done this, and the result is that it is much easier to
write publishable papers about Isar formalizations (or more precisely: the
formalizations can already contain all the TeX documentation/paper).

Josef




vocabullaries

by trybulec :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

FreeK:

Are you really serious about it?

Freek Wiedijk wrote:

>Of course _I_ always want the vocabulary files to be part
>of the articles.  It is a pain that they have their own
>name space, and that you need two files instead of one.
>
>  
>
Suppose that we want to implement program that extracts from the article
its vocabulary. Syntactically we know
where the new symbols should occur. Almost E.g.
 
func a+b ...

there are 4 possibilities

a constant: the symbol 'a+b'
a prefix symbol: 'a+'
a postfix symbol: '+b'
an infix symbol: '+'

Either we have to adopt rule that new symbol must be separated by
spaces, the it will be easy to extract it, or
to use the first identification (identification of identifiers) to find
out what is the symbol. A small AI program.
What you would choose?

Of course we should do regular parsing (and before the tokenization)
only in the place where a new symbol
may occur we suspend it.

Regards,
Andrzej

Re: an extension of the mizar language for dealing with article metadata

by Makarius :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Mon, 1 Dec 2008, Josef Urban wrote:

> Comments in programming languages serve as a general storage for
> anything that the language does not capture yet. If there is a frequent
> need for expressing certain annotations, such annotations should be
> eventually made more formal and taken care of by extending the
> language's syntax. I think Isabelle/Isar has done this, and the result
> is that it is much easier to write publishable papers about Isar
> formalizations (or more precisely: the formalizations can already
> contain all the TeX documentation/paper).

Isabelle/Isar supports basically three kinds of "comments":

  (1) Source comments  (* like this *)  which are stripped from the text
      as it is being processed, i.e. they behave like comments in most
      programming languages, or % in TeX.

  (2) Document comments  -- {* like this *}  which are considered part of
      the formal langaguage and are included in the final typesetting.

  (3) Explicit markup commands, notably

      chapter {* ... *}
      section {* ... *}
      subsection {* ... *}
      text {* ... *}

      These are full commands, just like 'theorem' or 'proof', but the do
      not affect the logical context.

The argument text {* ... *} of document comments and markup commands can
refer to logical entities from the context of the text, by using document
antiquotations.  For example:

  text {* blah blah @{term "x + y = z"} blah blah @{thm lemma1} ... *}

These antiquotations are checked according to the full logical context,
and pretty printed with the syntax of the context (although I would prefer
annotating the original input source, instead of printing internal
entities).


Since this thread was initially about meta data for theories, what we have
in Isabelle/Isar is only this:

        header {* ... *}

        theory A ...
        begin
           ...
        end

Here the 'header' provides free-form text, it is usually typeset like
'section'.  Adding extra commands for 'author' etc. would be straight
forward, but we have never really needed this yet.  In Isabelle/Isar
documents the title/author is usually part of the outermost root.tex that
assembles all the generated LaTeX sources in the end.


        Makarius

Re: vocabullaries

by Freek Wiedijk :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Dear Andrzej,

>>Of course _I_ always want the vocabulary files to be part
>>of the articles.
>
>Are you really serious about it?

Yes!  Completely serious.

I think I would like the articles to have three parts:

        environ
          ...
        vocabulary
          ...
        begin
          ...

So there wouldn't change too much, but there would be
just one file that one would need to work on, and one
"name space" that would be shared for naming both articles
and vocabularies.

For instance we would have (random example):

  :: Some Basic Properties of Sets
  ::  by Czes{\l}aw Byli\'nski
  ::
  :: Received February 1, 1989
  :: Copyright (c) 1990 Association of Mizar Users

  environ

   vocabularies BOOLE, TARSKI, INCPROJ;
   notations TARSKI, XBOOLE_0, ENUMSET1;
   constructors TARSKI, XBOOLE_0, ENUMSET1;
   registrations XBOOLE_0;
   definitions TARSKI, XBOOLE_0;
   theorems TARSKI, XBOOLE_0, ENUMSET1, XBOOLE_1;
   schemes XBOOLE_0;

  vocabulary

    Rare_c=-comparable

  begin

  reserve v,x,x1,x2,y,y1,y2,z,A,B,X,X1,X2,X3,X4,Y,Y1,Y2,Z,N,M for set;

  registration
    let x, y be set;
  [etc. etc.]

I think it would be much better this way.

Freek

Re: an extension of the mizar language for dealing with article metadata

by Adam Naumowicz :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi,

On Mon, 1 Dec 2008, Josef Urban wrote:

> On Sun, 30 Nov 2008, trybulec wrote:
>
>> Josef Urban wrote:
>>
>>>
>>> XML is easy to process by machines, and it is for this reason used as the
>>> Mizar internal layer and input for other systems. Mizar parser produces
>>> the XML from the Mizar human-like text, and I think it should stay this
>>> way. That is: human-like keywords for humans, processed into
>>> machine-understandable XML by Mizar parser.
>>>
>>> The metadata info should certainly be optional during verification and
>>> only required (if at all) when the article is submitted. I don't think
>>> that broadening this discussion to other "pragmas" is going to help to fix
>>> the problem Jesse asked about in a reasonable time. Inability to present
>>> the metadata in HTML is a bug that should be fixed quickly, discussing new
>>> Mizar features is useful but should be done separately and not in this
>>> context .
>>
>>
>> As I wrote I have no opinion on the topic. I hoped that you would support
>> XML :-)
>
> See above: using XML for direct human reading/writing is in my opinion quite
> often abuse of a good technology developed for machines.

Yesterday I spoke to Grzegorz Bancerek (who I believe is another person
'suspected' for supporting XML-style comments :-)) who tried to work out
some sort of XML metadata standard within comments in his articles like
YELLOW18, WAYBEL34 to be used by the FM presentation. But now Grzegorz is
convinced that the XML part should be produced by the parser rather than
by the users themselves, based on more user-friendly Mizar constructs.

>> On the other hand why not put the metadata into an unremovable comment,
>> starting e.g. with
>> ::$
>
> That could be a first step, if we don't want to bother with enhancing the
> Mizar parser. But if there is important information in such comments (title,
> author, sections, nice names of theorems, important comments, etc.) that will
> be used for presentation/searching/etc in HTML, FM, MML Query, etc., a parser
> for these things is needed anyway, so why not re-use the Mizar parser? Also,
> these things should appear in the XML form of the Mizar article, which is
> currently produced by the Mizar parser.
>
>> More serious problem: what do you mean by 'Inability to present the
>> metadata in HTML' ?
>
> Metadata like title, author, etc. are not present in the XML representation
> of an article produced by the Mizar parser (because formally they are just
> ignored comments).

That's another point that Grzegorz and I fully agree with. The more
information about the article we can have within its source file, the
better. Having the information split into several files like it is now
(Mizar text, BIB-file metadata, FM translation patterns, maybe sth more)
unnecessarily complicates all the presentation methods.

I believe the 'cluttering' of Mizar files as a result of putting a lot of
data into the comments is in fact a problem only for those who work a lot
on revising/improving the texts in the library, as they have to work on
the 'raw' Mizar source. 'Normal' Mizar users, however, should rather (and
I hope they do) use the presentation form for reading the articles - where
it is possible to control the level of details in proofs, but also it
would be very easy to suppress displaying various kinds of extra
information put in the comments, were they not stripped away by the
parser.

Regards,

Adam Naumowicz

======================================================================
Department of Applied Logic            fax. +48 (85) 745-7662
Institute of Computer Science          tel. +48 (85) 745-7559 (office)
University of Bialystok                e-mail: adamn@...
Sosnowa 64, 15-887 Bialystok, Poland   http://math.uwb.edu.pl/~adamn/
======================================================================


Re: an extension of the mizar language for dealing with article metadata

by Josef Urban :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Hi,

On Wed, 3 Dec 2008, Adam Naumowicz wrote:

>>>> XML is easy to process by machines, and it is for this reason used as the
>>>> Mizar internal layer and input for other systems. Mizar parser produces
>>>> the XML from the Mizar human-like text, and I think it should stay this
>>>> way. That is: human-like keywords for humans, processed into
>>>> machine-understandable XML by Mizar parser.
>>>>
>>>> The metadata info should certainly be optional during verification and
>>>> only required (if at all) when the article is submitted. I don't think
>>>> that broadening this discussion to other "pragmas" is going to help to
>>>> fix the problem Jesse asked about in a reasonable time. Inability to
>>>> present the metadata in HTML is a bug that should be fixed quickly,
>>>> discussing new Mizar features is useful but should be done separately and
>>>> not in this context .
>>>
>>>
>>> As I wrote I have no opinion on the topic. I hoped that you would support
>>> XML :-)
>>
>> See above: using XML for direct human reading/writing is in my opinion
>> quite often abuse of a good technology developed for machines.
>
> Yesterday I spoke to Grzegorz Bancerek (who I believe is another person
> 'suspected' for supporting XML-style comments :-)) who tried to work out some
> sort of XML metadata standard within comments in his articles like YELLOW18,
> WAYBEL34 to be used by the FM presentation. But now Grzegorz is convinced
> that the XML part should be produced by the parser rather than by the users
> themselves, based on more user-friendly Mizar constructs.

I think what Grzegorz did was at least some solution for improving
presentation and including metadata, and it is certainly much better than
doing nothing.

>>> On the other hand why not put the metadata into an unremovable comment,
>>> starting e.g. with
>>> ::$
>>
>> That could be a first step, if we don't want to bother with enhancing the
>> Mizar parser. But if there is important information in such comments
>> (title, author, sections, nice names of theorems, important comments, etc.)
>> that will be used for presentation/searching/etc in HTML, FM, MML Query,
>> etc., a parser for these things is needed anyway, so why not re-use the
>> Mizar parser? Also, these things should appear in the XML form of the Mizar
>> article, which is currently produced by the Mizar parser.
>>
>>> More serious problem: what do you mean by 'Inability to present the
>>> metadata in HTML' ?
>>
>> Metadata like title, author, etc. are not present in the XML representation
>> of an article produced by the Mizar parser (because formally they are just
>> ignored comments).
>
> That's another point that Grzegorz and I fully agree with. The more
> information about the article we can have within its source file, the better.
> Having the information split into several files like it is now (Mizar text,
> BIB-file metadata, FM translation patterns, maybe sth more) unnecessarily
> complicates all the presentation methods.
>
> I believe the 'cluttering' of Mizar files as a result of putting a lot of
> data into the comments is in fact a problem only for those who work a lot on
> revising/improving the texts in the library, as they have to work on the
> 'raw' Mizar source.

This is (finally!) in this discussion one good real point against too much
documentation and having whole Isar-style papers mixing TeX and formal
text: Mizar articles form a library that is sometimes refactored, and
theorems and definitions can be moved to other places and articles.
Refactoring the corresponding natural language is additional effort that
in some (very rare) cases could equal to re-writing a large part of a
published paper :-).

But I don't think this reasoning should be used to forbid interested
people to put more value into the Mizar formalizations. MML is a mess
anyway, and so far nobody said "let's stop writing articles, we cannot
maintain the library already". Some amount of chaos is natural, and we
should rather think about ways and tools for letting more people improve
MML, which is certainly some wiki-like functionality. Fixing obvious bugs
and inaccuracies in natural language is easy, and if there are people
interested in reading the articles, they will do the job (at least in
Wikipedia they very often do).

> 'Normal' Mizar users, however, should rather (and I hope
> they do) use the presentation form for reading the articles - where it is
> possible to control the level of details in proofs, but also it would be very
> easy to suppress displaying various kinds of extra information put in the
> comments, were they not stripped away by the parser.

I think that if "stripping presentation" is needed at all (and I doubt it:
nobody strips comments now before doing revisions - I think even contrary
might be closer to truth: comments often provide useful clues), having a
tool for it is easy.

Josef
< Prev | 1 - 2 - 3 | Next >