Friday, 2025-07-18, 6:44 PM
Main Registration RSS
Welcome, Guest

Visit My Blog For More Information

Login form
Site menu
Login form
Statics.
Calendar
«  July 2025  »
SuMoTuWeThFrSa
  12345
6789101112
13141516171819
20212223242526
2728293031

XHTML (Extensible HyperText Markup Language) is a family of XML markup languages that mirror or extend versions of the widely used Hypertext Markup Language (HTML), the language in which web pages are written.

While HTML (prior to HTML5) was defined as an application of Standard Generalized Markup Language (SGML), a very flexible markup language framework, XHTML is an application of XML, a more restrictive subset of SGML. Because XHTML documents need to be well-formed, they can be parsed using standard XML parsers—unlike HTML, which requires a lenient HTML-specific parser.

XHTML 1.0 became a World Wide Web Consortium (W3C) Recommendation on January 26, 2000. XHTML 1.1 became a W3C Recommendation on May 31, 2001. XHTML5 is undergoing development as of September 2009, as part of the HTML5 specification.

Overview

XHTML 1.0 is "a reformulation of the three HTML 4 document types as applications of XML 1.0".[1] The World Wide Web Consortium (W3C) also continues to maintain the HTML 4.01 Recommendation, and the specifications for HTML5 and XHTML5 are being actively developed. In the current XHTML 1.0 Recommendation document, as published and revised to August 2002, the W3C commented that, "The XHTML family is the next step in the evolution of the Internet. By migrating to XHTML today, content developers can enter the XML world with all of its attendant benefits, while still remaining confident in their content's backward and future compatibility."[1]

However, in 2005, the Web Hypertext Application Technology Working Group (WHATWG) formed, independently of the W3C, to work on advancing ordinary HTML not based on XHTML. The WHATWG eventually began working on a standard that supported both XML and non-XML serializations, HTML5, in parallel to W3C standards such as XHTML 2. In 2007, the W3C's HTML working group voted to officially recognize HTML5 and work on it as the next-generated HTML standard.[2] In 2009, the W3C allowed the XHTML 2 Working Group's charter to expire, acknowledging that HTML5 would be the sole next-generation HTML standard, including both XML and non-XML serializations.[3] Of the two serializations, the W3C suggests that most authors use the HTML syntax, rather than the XHTML syntax.[4]

Motivation

XHTML was developed to make HTML more extensible and increase interoperability with other data formats.[5] HTML 4 was ostensibly an application of Standard Generalized Markup Language (SGML); however the specification for SGML was complex, and neither web browsers nor the HTML 4 Recommendation were fully conformant to it.[6] The XML standard, approved in 1998, provided a simpler data format closer in simplicity to HTML 4.[7] By shifting to an XML format, it was hoped HTML would become compatible with common XML tools;[8] servers and proxies would be able to transform content, as necessary, for constrained devices such as mobile phones.[9] By utilizing namespaces, XHTML documents could provide extensibility by including fragments from other XML-based languages such as Scalable Vector Graphics and MathML.[10] Finally, the renewed work would provide an opportunity to divide HTML into reusable components (XHTML Modularization) and clean up untidy parts of the language.[11]

Relationship to HTML

There are various differences between XHTML and HTML. The Document Object Model is a tree structure that represents the page internally in applications, and XHTML and HTML are two different ways of representing that in markup (serializations). Both are less expressive than the DOM (for example, "--" may be placed in comments in the DOM, but cannot be represented in a comment in either XHTML or HTML), and generally XHTML's XML syntax is a little more expressive than HTML (for example, arbitrary namespaces are not allowed in HTML). So, firstly one source of differences is immediate: XHTML uses an XML syntax, while HTML uses a pseudo-SGML syntax (officially SGML for HTML 4 and under, but never in practice, and standardised away from SGML in HTML5). Secondly however, because the expressible contents of the DOM in syntax are slightly different, there are some changes in actual behavior between the two models.