Paul and Michael, thank you for raising these important issues along
with your extensive commentary :-)
All - feel free to send any thoughts to this list or privately to
cwe@....
I will respond to a small piece of Paul's proposal to help get the
ball rolling. This post will concentrate on where the MITRE team is
with respect to the maturity of existing CWE definitions. Later posts
will cover Paul's suggestions from a process standpoint.
> We propose, first, that one element of a CWE be designated the prime
> definition.
Our approach for the past year has been to treat the CWE
Description_Summary element as an emerging "prime" definition. We use
the Extended_Summary to provide some additional details, informal
examples, etc.
For Description_Summary, our goal has been to create a focused
description that is as clear as possible, avoids vague terms, and
omits as much extraneous information as possible. We do this by
trying to identify the core error and to avoid conflating it with
closely related errors (e.g. chains) and attacks. Recently, we have
been avoiding identifying the typical consequence of the weakness,
which is informally covered in the Extended_Summary element (and often
detailed in the Common_Consequences element).
We also try to write Description_Summary as a single sentence, which
forces us to concentrate on the core issue as much as possible. One
emerging style that has worked for us is to describe the software's
erroneous behavior with respect to any associated resources that are
affected by that behavior. The reason *why* the behavior is erroneous
is moved to the Extended_Summary.
We also believe that there needs to be very close alignment between
the Name attribute and the Description_Summary.
Many of our description summaries have been inherited from previous
efforts in which the names or definitions were not so precise; we did
not help our own cause in earlier years when we wrote our own
descriptions, either.
Starting with Draft 9, we have made significant improvements to many
descriptions. Since Draft 9 was released, we have modified the
descriptions for 311 different entries, and we've modified the names
for 98 entries. If you read the difference reports
(http://cwe.mitre.org/data/reports.html) you can see that Name and
Description are often among the most frequently updated elements for
each new version.
There has been some tension in how to capture the Description_Summary
effectively and clearly, without being too verbose or too difficult to
understand. As Paul has mentioned, CWE has multiple audiences. In
some cases, we can write a terse Description_Summary using
vulnerability theory terms, but the casual reader might not understand
most of the words in that description.
There are also significant challenges when trying to be more precise,
in that we often realize that we're not sure what the actual CWE entry
is really about! Consider CWE-377 (currently named "Insecure
Temporary File"). Its description is: "Creating and using insecure
temporary files can leave application and system data vulnerable to
attack." The immediate question that comes to mind is - what is
"insecure" about the temporary file? It has bad permissions? It's
created in a directory with bad permissions? Its name contains
sensitive information? It's subject to symlink following? It's not
deleted after use? It's accidentally packaged up with other files
before it's sent somewhere else? This vague name and description are
actually hiding several lower-level weaknesses - yet for certain
audiences, these distinctions are not important, and adopting a more
precise definition would not help them.
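To make a few of those lower-level distinctions concrete, here is a
rough sketch in C. It is not taken from CWE: the helper name
write_scratch_file, the /tmp location, and the reliance on POSIX calls
are all assumptions made for illustration. It touches three of the
questions above: a predictable name or symlink following, bad
permissions, and not deleting the file after use.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from CWE): writes `data` to a scratch file
 * and then removes the file. Returns 0 on success and -1 on failure.
 * If `mode_out` is not NULL, it receives the permission bits observed
 * on the file. Assumes a POSIX system with a writable /tmp.
 */
int write_scratch_file(const char *data, mode_t *mode_out)
{
    assert(data != NULL);

    /* mkstemp() replaces XXXXXX with a hard-to-guess suffix and opens
     * the file with O_CREAT|O_EXCL, so an existing file or symlink at
     * that path is not followed. The risky alternative is tmpnam() or
     * mktemp() followed by a separate open. */
    char path[] = "/tmp/scratch-XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;

    int rc = -1;
    struct stat st;
    size_t len = strlen(data);

    /* Set owner-only permissions explicitly rather than relying on
     * the C library's default. */
    if (fchmod(fd, S_IRUSR | S_IWUSR) == 0 &&
        fstat(fd, &st) == 0 &&
        write(fd, data, len) == (ssize_t)len) {
        if (mode_out != NULL)
            *mode_out = st.st_mode & 0777;
        rc = 0;
    }

    /* Delete after use so the data does not linger in /tmp. */
    close(fd);
    if (unlink(path) != 0)
        rc = -1;
    return rc;
}
```

Each comment corresponds to a different way a temporary file can be
"insecure," which is why a single vague description ends up covering
several separate weaknesses.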
The labor to come up with an acceptable Description_Summary for all
CWE entries, estimating 30 minutes each, would be approximately 8
staff weeks for weakness-focused entries. Some of this labor has been
spread over time, but there is still room for improvement. And note
that in many cases, it will take more than 30 minutes to get clarity -
for example, the buffer overflow examples given by Paul.
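For anyone who wants to see how that estimate scales, the arithmetic
can be sketched as below. The function name and the 40-hour staff week
are assumptions of this sketch. The entry count of 640 used in the
usage note is not a figure from this post; it is simply what 8 staff
weeks works out to at 30 minutes per entry under that 40-hour
assumption.

```c
#include <assert.h>

/*
 * Hypothetical helper: staff weeks needed to revise `entries`
 * descriptions at `minutes_each` minutes apiece, given
 * `hours_per_week` working hours in one staff week.
 */
double staff_weeks(int entries, double minutes_each, double hours_per_week)
{
    assert(hours_per_week > 0.0);
    return entries * minutes_each / 60.0 / hours_per_week;
}
```

With these assumptions, staff_weeks(640, 30.0, 40.0) gives 8.0, and
raising the per-entry time to 45 minutes gives 12.0, which shows how
quickly the total grows once entries need more than 30 minutes.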
To take Paul's example of the current CWE-121 description:
    A stack-based buffer overflow condition is a condition where the
    buffer being overwritten is allocated on the stack (i.e., is a local
    variable or, rarely, a parameter to a function).
Under current MITRE practices, ideally we would change this
Description_Summary, perhaps in the following fashion (note: this is
an informal example only):
- the phrase regarding local variables and function parameters would
  be relocated elsewhere in CWE - probably to the Extended_Description
- the "overwrite" phrase would be avoided or otherwise clarified
- a next-generation description might be:
    The software allocates a buffer using memory that is stored on
    the stack, but it writes to an address that is outside of the
    implied boundaries of that buffer.
While such a description may still have some subtle problems, it is
closer to our current guidelines for CWE descriptions: it focuses
largely on behavior ("allocate"/"write") and resources
("buffer"/"stack"), avoids specific examples or variants
("local variable"), and attempts to more clearly identify concepts
such as "overflow" and "overwrite" whose terms have many
audience-specific interpretations.
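As an informal illustration of the behavior/resource framing, here is
a minimal C sketch. The function name copy_to_stack_buffer and the
constant BUF_LEN are hypothetical and do not come from any CWE entry.
The weakness itself appears only in a comment, since actually
performing the out-of-bounds write would be undefined behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_LEN 8

/*
 * Hypothetical illustration (not from CWE). Copies up to `len` bytes
 * of `src` into a buffer allocated on the stack and returns the number
 * of bytes actually copied.
 */
size_t copy_to_stack_buffer(const char *src, size_t len)
{
    char buf[BUF_LEN];   /* resource: a buffer whose memory is on the stack */

    assert(src != NULL);

    /* The weakness would be the unchecked write
     *     memcpy(buf, src, len);
     * which, when len > BUF_LEN, writes to addresses outside the
     * boundaries of buf. Clamping keeps every write inside them. */
    size_t n = len < sizeof buf ? len : sizeof buf;
    memcpy(buf, src, n);
    return n;
}
```

Note that the sketch happens to use a local variable, but the
description does not depend on that detail: what matters is the
behavior (the write) and the resource (the stack buffer and its
boundaries).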
Note that CWE-121 has another larger problem that we frequently
wrestle with. This CWE's abstraction has a particular perspective
focused on the type of resource (i.e. "buffer on stack") that leads to
conceptual overlap with other resource-independent CWEs (like CWE-125
"out-of-bounds read"). This part of the tree, like many other parts
of CWE, still needs additional research that more clearly identifies
the various perspectives and "layers" that come from various CWE
audiences. The CWE content team is getting better at identifying when
perspective/layering problems are preventing a clear identification of
a weakness or set of related weaknesses, but fixing these problems is
sometimes difficult.
As Paul has noted, there are many other fields that are related to
other descriptive aspects of a weakness, such as relationship notes,
background details, terminology notes, and white box definitions. All
of these have specific roles. They have not been the highest priority
to us, but we clean them up or fill them out when we run across them
(especially when they are in Other_Notes, which historically was a
general-purpose field before we extended the schema for Draft 9 and
version 1.0).
> If there is inconsistency between the prime definition and a
> description, example, or definition in another element, the prime
> definition is considered to be correct.
This is roughly the approach that we have taken. However, in some
cases, we may believe that the prime definition is excessively vague
or inconsistent with the rest of the CWE entry. We will sometimes
deprecate an entry outright if, as a whole, the entry can be
interpreted to cover multiple distinct weaknesses (assuming the entry
is not a category or higher-level abstraction). As a result of the
deprecation, we may split it into newer entries that are more clearly
written.
Also, when there is a mismatch between the Name and the Description,
we will strongly consider deprecating the entry, since it is clear to
me that many people map to CWE identifiers based only on the name
without reading the description.
A current example of this problem is CWE-217, which has a very general
name of "Failure to Protect Stored Data from Modification," which was
inherited from CLASP. But the rest of the original CLASP item - and
much of CWE-217 itself - is focused on a very low-level Java problem
(actually, if you look at the demonstrative examples, it's about a
couple of distinct problems). CWE-217 will be deprecated in the next
version, with a couple of replacements that are much clearer.
Another example of definitional confusion/vagueness is CWE-391. As
its maintenance notes currently say: "This entry needs significant
modification. It currently combines information from three different
taxonomies, but each taxonomy is talking about a slightly different
issue." However, for us to fully deal with CWE-391, we have been
forced into defining a general model for error handling, which is
incomplete because of perspective/layering problems (see CWE-754 and
CWE-755 for a start).
> A clearly defined vocabulary is needed. Likely there will be stock
> phrases, too. This vocabulary and the definitions themselves should
> be based on the work already done by many contributors to the CWE.
We expect to be updating the vulnerability theory document
(http://cwe.mitre.org/documents/vulnerability_theory/intro.html) in
the coming months, which handles some of the gaps.
In addition, we have begun maintaining a glossary of terms here:
http://cwe.mitre.org/documents/glossary/index.html
While these are not perfect, and they are often still evolving, these
reflect some of the ongoing vocabulary that we are trying to develop.
As Paul has noted, there is other work being done by other
contributors to CWE that can be leveraged or consulted. Since CWE
operates within multiple audiences, this will sometimes be a difficult
task. For example, the current ISO/IEC Project 22.24772 document
appears to use the term "encapsulation" in a different way than
McGraw/Chess/Tsipenyuk did, which was also different than how others
may define it. (Look for my "what is encapsulation?" post coming
sometime in 2009). When terms like this become overloaded, the
solution for one audience will not necessarily work for another
audience.
The main point of this post was to provide some context: what our
current approach to CWE descriptions has been, and some of the
challenges that will be faced if the community decides that it is
important to adopt a "prime definition."
Any and all feedback is welcome. Thanks again to Paul and Mike for
raising this question. I look forward to everyone's thoughts.
- Steve