« Return to Thread: an extension of the mizar language for dealing with article metadata

Re: an extension of the mizar language for dealing with article metadata

by trybulec :: Rate this Message:

Reply to Author | View in Thread

Dear Jesse,

In Algol 60 a comment was a part of the language. The syntax was
     comment   TEXT ;

where TEXT was an arbitrary string of characters ( the semicolon itself
was not allowed)
We treat now comments as something that actually is not a part of the
language. I would rather
have "pure Mizar" with only these informations that are necessary for
the verifier.
We have now some features that should not be in pure Mizar and probably
should be replaced by pragmas:
they may be classified in the following categories:
i. temporary pragmas:   '@proof'  - to suspend verification
ii. library pragmas:   'canceled' - to keep numeration correct
iii. editorial pragmas:  repeated 'begin' - to divide article into section.
iv. maybe some others

We tried - on our seminar - to find common solution, but the categories
are so different that probably
there is none.

Recently, after reading your letter, we discussed the problem again with
Czeslaw and it seems that for the metadata the proper solution is to
have an XML-envelope of the Mizar article. Maybe something like
      <DOCUMENT>
         "the place for metadata"
      <ARTICLE>
          "the Mizar article"
      </ARTICLE>
      </DOCUMENT>
(mind you, I know almost nothing on XML, so it might quite naive).

Then from such XML document we may extract the Mizar article and process
it in the usual way.
"the Mizar article" is probably not the exact citation: we have '<' as a
symbol in Mizar. Still extracting the Mizar article should not be too
complicated. Maybe we can be more precise and have the form

      <DOCUMENT>
         "the place for metadata"
      <ARTICLE>
      <ENVIRONMENT DECLARATION>
          "directives"
      </ENVIRONMENT DECLARATION>
       <TEXT PROPER>
          "the text proper"
      </TEXT PROPER>
      </ARTICLE>
      </DOCUMENT>

Then, 'environ' and 'begin' (if necessary) may be added while extracting
the Mizar article.  The directives and the text proper are written in
two different languages, anyway.

      <DOCUMENT>
         "the place for metadata"
      <ARTICLE>
      <ENVIRONMENT DECLARATION>
          "directives"
      </ENVIRONMENT DECLARATION>
       <TEXT PROPER>
        <SECTION>
          "the text proper"
        </SECTION>
          ...
        <SECTION>
          "the text proper"
        </SECTION>
      </TEXT PROPER>
      </ARTICLE>
      </DOCUMENT>

Then we may put in the <SECTION> the title and maybe even the summary of
the section. And the grammar of Mizar will be a little bit simpler: no
repetition of 'begin'.

Regards,
Andrzej

Jesse Alama wrote:

>As it stands now, the title of an article, its author(s), and the name
>of sections aren't actually a formal part of a mizar text: we
>customarily include this information as comments
>
>  :: Introduction to XYZ Theory
>  ::
>  :: by Joe the MIZAR Writer
>
>  environ
>
>  ...
>
>  begin :: Preliminaries
>
>  ...
>
>  begin :: Main Lemma
>
>  ...
>
>  begin :: Main Theorem
>
>  ...
>
>I propose to put this information into the body of the article, to move
>it from comments to non-comments.  Perhaps we could introduce new
>keywords like `title', `author', and `section'.  Then, articles could
>look like
>
> title{Introduction to XYZ Theory}
> author{Joe the MIZAR Writer}
>
> environ
>
>   <stuff>
>
> begin{Preliminaries}
>
> definition ...
>
> theorem ...
>
> theorem ...
>
> begin{Main Lemma}
>
> definition ...
>
> begin{Main Theorem}
>
> theorem ...
>
>This would make the sections and article title part of the text; as it
>stands, these features are not represented in the mizar grammar (because
>customarily they are tucked away within comments).
>
>The primary motivation for this idea is to help improve Josef Urban's
>semantic MML presentation.  Currently he has to do some hacking to
>extract metadata like article titles and authors; extracting section
>information (currently unsupported) would be similarly complicated.  The
>problem is that the article metadata isn't represented in the XML
>representation of an article; it has to be inferred from the original
>mizar source.
>
>Additionally, this new syntax might help to simplify processing of FM
>articles: with this approach one wouldn't need the BibTeX fields
>SECTION1, SECTION2, etc., because these could be inferred from the
>article.
>
>Thoughts?
>
>Best,
>
>Jesse
>
>  
>

 « Return to Thread: an extension of the mizar language for dealing with article metadata