http://stackoverflow.com/questions/33746/xml-attribute-vs-xml-element
At work we are being asked to create XML files to pass data to another offline application, which will then create a second XML file to pass back in order to update some of our data. During the process we have been discussing the structure of the XML file with the team responsible for the other application.

The sample I came up with is essentially something like:

    <INVENTORY>
      <ITEM serialNumber="something" location="something" barcode="something">
        <TYPE modelNumber="something" vendor="something"/>
      </ITEM>
    </INVENTORY>

The other team said that this was not industry standard and that attributes should only be used for metadata. They suggested:

    <INVENTORY>
      <ITEM>
        <SERIALNUMBER>something</SERIALNUMBER>
        <LOCATION>something</LOCATION>
        <BARCODE>something</BARCODE>
        <TYPE>
          <MODELNUMBER>something</MODELNUMBER>
          <VENDOR>something</VENDOR>
        </TYPE>
      </ITEM>
    </INVENTORY>

The reason I suggested the first is that the file created is much smaller. There will be roughly 80000 items in the file during transfer. Their suggestion in reality turns out to be three times larger than the one I suggested. I searched for the mysterious "Industry Standard" that was mentioned, but the closest I could find was a statement that XML attributes should only be used for metadata, which added that the debate is about what actually counts as metadata.

After the long-winded explanation (sorry): how do you determine what is metadata, and when designing the structure of an XML document, how should you decide when to use an attribute or an element?
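The size difference between the two layouts can be measured rather than estimated. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` and the placeholder value "something" from the samples above; the helper names `build_attribute_style` and `build_element_style` are made up for illustration. The ratio it reports depends on how long the real values are and on whether the file is indented, so it will not necessarily match the threefold figure quoted in the question.

```python
import xml.etree.ElementTree as ET

# Placeholder data taken from the question's samples.
ITEM_FIELDS = {"serialNumber": "something", "location": "something", "barcode": "something"}
TYPE_FIELDS = {"modelNumber": "something", "vendor": "something"}


def build_attribute_style(count):
    """Serialize `count` items with every field stored as an attribute."""
    root = ET.Element("INVENTORY")
    for _ in range(count):
        item = ET.SubElement(root, "ITEM", ITEM_FIELDS)
        ET.SubElement(item, "TYPE", TYPE_FIELDS)
    return ET.tostring(root)  # bytes, no indentation


def build_element_style(count):
    """Serialize `count` items with every field stored as a child element."""
    root = ET.Element("INVENTORY")
    for _ in range(count):
        item = ET.SubElement(root, "ITEM")
        for name, value in ITEM_FIELDS.items():
            ET.SubElement(item, name.upper()).text = value
        type_element = ET.SubElement(item, "TYPE")
        for name, value in TYPE_FIELDS.items():
            ET.SubElement(type_element, name.upper()).text = value
    return ET.tostring(root)


attr_size = len(build_attribute_style(1000))
elem_size = len(build_element_style(1000))
ratio = elem_size / attr_size
print(f"attributes: {attr_size} bytes, elements: {elem_size} bytes, ratio: {ratio:.2f}")
```

Running this against a realistic sample of the actual data, with the indentation the real exporter uses, gives a number both teams can argue about concretely.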
I use this rule of thumb:
So yours is close. I would have done something like this (EDIT: updated the original example based on feedback below):

    <ITEM serialNumber="something">
      <BARCODE encoding="Code39">something</BARCODE>
      <LOCATION>XYX</LOCATION>
      <TYPE modelNumber="something">
        <VENDOR>YYZ</VENDOR>
      </TYPE>
    </ITEM>
Some of the problems with attributes are:
If you use attributes as containers for data, you end up with documents that are difficult to read and maintain. Try to use elements to describe data. Use attributes only to provide information that is not relevant to the data.

Don't end up like this (this is not how XML should be used):

    <note day="12" month="11" year="2002" to="Tove" from="Jani" heading="Reminder" body="Don't forget me this weekend!">
    </note>
"XML" stands for "eXtensible Markup Language". A markup language implies that the data is text, marked up with metadata about structure or formatting. XHTML is an example of XML used the way it was intended:

    <p><span lang="es">El Jefe</span> insists that you <em class="urgent">MUST</em> complete your project by Friday.</p>

Here, the distinction between elements and attributes is clear. Text elements are displayed in the browser, and attributes are instructions about how to display them (although there are a few tags that don't work that way).

Confusion arises when XML is used not as a markup language but as a data serialization language, in which the distinction between "data" and "metadata" is more vague. So the choice between elements and attributes is more or less arbitrary, except for things that can't be represented with attributes (see feenster's answer).
XML Element vs XML Attribute

XML is all about agreement. First defer to any existing XML schemas or established conventions within your community or industry. If you are truly in a situation to define your schema from the ground up, here are some general considerations that should inform the element vs attribute decision:

    <versus>
      <element attribute="Meta content">
        Content
      </element>
      <element attribute="Flat">
        <parent>
          <child>Hierarchical</child>
        </parent>
      </element>
      <element attribute="Unordered">
        <ol>
          <li>Has</li>
          <li>order</li>
        </ol>
      </element>
      <element attribute="Must copy to reuse">
        Can reference to re-use
      </element>
      <element attribute="For software">
        For humans
      </element>
      <element attribute="Extreme use leads to micro-parsing">
        Extreme use leads to document bloat
      </element>
      <element attribute="Unique names">
        Unique or non-unique names
      </element>
      <element attribute="SAX parse: read first">
        SAX parse: read later
      </element>
      <element attribute="DTD: default value">
        DTD: no default value
      </element>
    </versus>
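The "SAX parse: read first" versus "read later" row can be seen directly with a streaming parser. This is a small sketch, assuming Python's standard `xml.sax` module; the `OrderRecorder` class and the sample document are invented for illustration. It records the order in which the parser reports things: all attributes of an element arrive with its start tag, while child element content only arrives in later events.

```python
import xml.sax


class OrderRecorder(xml.sax.ContentHandler):
    """Record SAX events in the order the parser delivers them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Every attribute of the element is available here, at the start tag.
        self.events.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        # Element text arrives later, possibly split across several calls.
        if content.strip():
            self.events.append(("text", content.strip()))

    def endElement(self, name):
        self.events.append(("end", name))


document = b'<ITEM serialNumber="A1"><BARCODE>12345</BARCODE></ITEM>'
handler = OrderRecorder()
xml.sax.parseString(document, handler)

for event in handler.events:
    print(event)
```

A streaming consumer that needs a value to decide how to handle the rest of the element (an identifier or a type tag, say) can act on it immediately if it is an attribute, but has to buffer or wait if it is a child element.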
It may depend on your usage. XML that is used to represent structured data generated from a database may work well with field values ultimately being placed as attributes. However, XML used as a message transport would often be better off using more elements. For example, let's say we had this XML as proposed in the answer:

    <INVENTORY>
      <ITEM serialNumber="something" barcode="something">
        <LOCATION>XYX</LOCATION>
        <TYPE modelNumber="something">
          <VENDOR>YYZ</VENDOR>
        </TYPE>
      </ITEM>
    </INVENTORY>

Now we want to send the ITEM element to a device to print the barcode, but there is a choice of encoding types. How do we represent the encoding type required? Suddenly we realise, somewhat belatedly, that the barcode wasn't a single atomic value but rather may be qualified with the encoding required when printed.

    <ITEM serialNumber="something">
      <barcode encoding="Code39">something</barcode>
      <LOCATION>XYX</LOCATION>
      <TYPE modelNumber="something">
        <VENDOR>YYZ</VENDOR>
      </TYPE>
    </ITEM>

The point is that unless you are building some kind of XSD or DTD along with a namespace to fix the structure in stone, you may be best served by leaving your options open. IMO XML is at its most useful when it can be flexed without breaking existing code that uses it.
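To illustrate the "flexed without breaking existing code" point, here is a rough sketch of a consumer, assuming Python's standard `xml.etree.ElementTree`; the function `read_barcode` and the sample values are hypothetical. Because the barcode is an element, a qualifier such as `encoding` can be added later as an attribute, and a reader written before the change keeps working.

```python
import xml.etree.ElementTree as ET


def read_barcode(item):
    """Return (value, encoding) for an ITEM element.

    encoding is None when the document predates the encoding attribute.
    """
    node = item.find("barcode")
    if node is None:
        return None, None
    return node.text, node.get("encoding")


# A document written before the encoding qualifier existed...
old_doc = '<ITEM serialNumber="s1"><barcode>12345</barcode></ITEM>'
# ...and one written after it was added.
new_doc = '<ITEM serialNumber="s1"><barcode encoding="Code39">12345</barcode></ITEM>'

print(read_barcode(ET.fromstring(old_doc)))
print(read_barcode(ET.fromstring(new_doc)))
```

Had the barcode been an attribute of ITEM, there would be no place to attach the encoding without either adding a second, loosely related attribute or changing the structure that existing readers depend on.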
I use the following guidelines in my schema design with regards to attributes vs. elements:
The preference for attributes is it provides the following:
I added "when technically possible" because there are times when attributes cannot be used, for example with attribute set choices: requiring (startDate and endDate) xor (startTS and endTS) is not possible with the current schema language. If XML Schema starts allowing the "all" content model to be restricted or extended, then I would probably drop it.
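When the schema language cannot express a rule such as (startDate and endDate) xor (startTS and endTS), one option is to check it in application code after parsing. The following is only a sketch under assumptions: it uses Python's standard `xml.etree.ElementTree`, the element name `period` and the function `has_valid_range` are hypothetical, and it reads "xor" strictly, rejecting any stray attribute from the other pair.

```python
import xml.etree.ElementTree as ET


def has_valid_range(element):
    """True if the element carries exactly one complete attribute pair."""
    attrs = element.attrib
    date_pair = "startDate" in attrs and "endDate" in attrs
    ts_pair = "startTS" in attrs and "endTS" in attrs
    any_date = "startDate" in attrs or "endDate" in attrs
    any_ts = "startTS" in attrs or "endTS" in attrs
    # One full pair is present and nothing from the other pair appears.
    return (date_pair and not any_ts) or (ts_pair and not any_date)


samples = [
    '<period startDate="2008-01-01" endDate="2008-12-31"/>',
    '<period startTS="1199145600" endTS="1230681600"/>',
    '<period startDate="2008-01-01" endTS="1230681600"/>',
    '<period startDate="2008-01-01"/>',
]
results = [has_valid_range(ET.fromstring(sample)) for sample in samples]
print(results)
```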
When in doubt, KISS -- why mix attributes and elements when you don't have a clear reason to use attributes? If you later decide to define an XSD, that will end up being cleaner as well. Then, if you even later decide to generate a class structure from your XSD, that will be simpler as well.
There is no universal answer to this question (I was heavily involved in the creation of the W3C spec). XML can be used for many purposes: text-like documents, data and declarative code are three of the most common. I also use it a lot as a data model. There are aspects of these applications where attributes are more common and others where child elements are more natural. There are also features of various tools that make it easier or harder to use them.

XHTML is one area where attributes have a natural use (e.g. in class='foo'). Attributes have no order, and this may make it easier for some people to develop tools. OTOH attributes are harder to type without a schema. I also find that namespaced attributes (foo:bar="zork") are often harder to manage in various toolsets. But have a look at some of the W3C languages to see the mixture that is common. SVG, XSLT, XSD and MathML are some examples of well-known languages, and all have a rich supply of attributes and elements. Some languages even allow more than one way to do it, e.g.

    <foo title="bar"/>

or

    <foo>
      <title>bar</title>
    </foo>

(Note that these are NOT equivalent syntactically and require explicit support in processing tools.)

My advice would be to have a look at common practice in the area closest to your application, and also to consider what toolsets you may wish to apply.

Finally, make sure that you differentiate namespaces from attributes. Some XML systems (e.g. Linq) represent namespaces as attributes in the API. IMO this is ugly and potentially confusing.
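The "explicit support in processing tools" remark can be made concrete. Below is a minimal sketch, assuming Python's standard `xml.etree.ElementTree`; the helper `get_title` is hypothetical, and the choice to prefer the attribute when both forms are present is an arbitrary assumption, not something any specification mandates. A parser treats the two forms as different structures, so the consumer has to look in both places itself.

```python
import xml.etree.ElementTree as ET


def get_title(foo):
    """Return the title from either the attribute or the child-element form."""
    title = foo.get("title")  # attribute form: <foo title="bar"/>
    if title is not None:
        return title
    return foo.findtext("title")  # element form; None if absent


attribute_form = ET.fromstring('<foo title="bar"/>')
element_form = ET.fromstring("<foo><title>bar</title></foo>")

print(get_title(attribute_form), get_title(element_form))
```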