XML for Computer Science and MCA students

What is XML? Explain its features.

What is XML?
  • XML stands for Extensible Markup Language.
  • It is a markup language like HTML.
  • It was designed to store and transport data.
  • Unlike HTML, it has no predefined tags; authors define their own.
  • XML simplifies data sharing, data transport, platform changes and data availability.
  • It stores data as plain text.
  • It provides a software- and hardware-independent way of storing, transporting and sharing data.
Syntax:
<root>
     <child>
          <subchild>. . . .</subchild>
     </child>
</root>

Example:
<?xml version="1.0" encoding="UTF-8"?>
<employee>
     <employee-name>ABC</employee-name>
     <employee-address>Pune</employee-address>
     <employee-phone>+91**********</employee-phone>
</employee>
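
Because XML is plain text with a tree structure, any language with an XML parser can read the example back. A minimal sketch, assuming Python and its standard xml.etree.ElementTree module; the function name 'read_employee' is made up for illustration:

```python
import xml.etree.ElementTree as ET

# The employee document from the example above (phone number left masked).
EMPLOYEE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<employee>
     <employee-name>ABC</employee-name>
     <employee-address>Pune</employee-address>
     <employee-phone>+91**********</employee-phone>
</employee>"""


def read_employee(xml_text: str) -> dict:
    """Parse the document and return each child element as tag -> text."""
    # Encode to bytes so the parser honours the encoding declaration.
    root = ET.fromstring(xml_text.encode("utf-8"))
    return {child.tag: child.text for child in root}


if __name__ == "__main__":
    for tag, text in read_employee(EMPLOYEE_XML).items():
        print(tag, "=", text)
```

The tags are not known to the parser in advance; it only relies on the structural rules (one root, properly nested and closed elements).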

Features of XML:
  • It provides a software- and hardware-independent way of storing, transporting and sharing data.
  • It separates data from its presentation, for example from the HTML that displays it.
  • It increases data availability.
  • It can be used to create new Internet languages like XHTML, WSDL, WML (the markup language used with WAP) etc.
  • It is heavily used as a format for document storage and processing both online and offline.
  • It allows references to external entities, which are declared in a DTD.
  • It does not allow empty comment declarations (the SGML form <!>).
  • It is extensible because it specifies only the structural rules for tags; it does not specify the tags themselves.
What is XSL?
  • XSL stands for Extensible Stylesheet Language.
  • XSL is the W3C family of stylesheet languages for XML; it comprises XSLT, XPath and XSL-FO.
  • An XML document is linked to its stylesheet with a special declaration, the xml-stylesheet processing instruction.
  • XSL is an XML-based language for expressing stylesheets.
  • You can make context-sensitive display decisions with XSL.
  • For example, you could automatically display the document one way in a Web browser and another on a PDA.
  • XSL can also transform XML into HTML, so that older browsers can view XML documents.
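
The last point, turning XML into HTML, can be pictured with a small sketch. Python's standard library has no XSL processor, so the code below is not XSL: it does by hand what a stylesheet would describe declaratively. The 'employees' document, the 'name' and 'city' elements and the function 'to_html_table' are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document (not taken from the notes above).
SOURCE = """<employees>
  <employee><name>ABC</name><city>Pune</city></employee>
  <employee><name>XYZ</name><city>Mumbai</city></employee>
</employees>"""


def to_html_table(xml_text: str) -> str:
    """Build an HTML table with one row per <employee> element."""
    source_root = ET.fromstring(xml_text)
    table = ET.Element("table")
    for employee in source_root.findall("employee"):
        row = ET.SubElement(table, "tr")
        for field in ("name", "city"):
            cell = ET.SubElement(row, "td")
            cell.text = employee.findtext(field)
    # method="html" serialises the new tree using HTML rules.
    return ET.tostring(table, encoding="unicode", method="html")


if __name__ == "__main__":
    print(to_html_table(SOURCE))
```

A real XSL stylesheet would express the same mapping as template rules instead of loops, and a browser or XSLT processor would apply it.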
What is XSLT?
  • XSLT stands for Extensible Stylesheet Language Transformations.
  • It is a language that transforms XML documents into XHTML documents or into other XML documents.
  • It uses XPath to navigate XML documents, to identify subsets of the source document tree and to perform calculations.
  • It specifies a language definition for XML data presentation and data transformations.
  • It describes how to transform the source tree of an XML document into the result tree of a new document.
  • An XML document is linked to an XSLT stylesheet with the xml-stylesheet processing instruction, which has two attributes:
    1. type: It indicates the type of file being linked to. The value 'text/xsl' specifies XSLT.
    2. href: It indicates the location of the file. If the XSLT and XML files are saved in the same directory, the XSLT filename alone is enough.
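
The two attributes can be seen by reading the linking instruction back from a document. A minimal sketch, assuming Python's standard xml.dom.minidom module; the stylesheet name 'employee.xsl' and the function 'find_stylesheet_links' are hypothetical:

```python
import re
from xml.dom import Node, minidom

# Document with a stylesheet link placed before the root element.
DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="employee.xsl"?>
<employee>
     <employee-name>ABC</employee-name>
</employee>"""


def find_stylesheet_links(xml_bytes: bytes) -> list:
    """Return the type/href pairs of every xml-stylesheet instruction."""
    doc = minidom.parseString(xml_bytes)
    links = []
    for node in doc.childNodes:
        is_pi = node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
        if is_pi and node.target == "xml-stylesheet":
            # node.data holds the raw text: type="..." href="..."
            pairs = re.findall(r'([\w-]+)\s*=\s*"([^"]*)"', node.data)
            links.append(dict(pairs))
    return links


if __name__ == "__main__":
    print(find_stylesheet_links(DOC))
```

The parser only reports the instruction; applying the stylesheet is left to a browser or an XSLT processor.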
What is XML namespace?
  • It is a mechanism to avoid name conflicts by differentiating elements or attributes within an XML document.
  • It is used to avoid element name conflicts in an XML document.
  • It is declared using the reserved XML attribute 'xmlns'.
  • The attribute name must be 'xmlns' (for a default namespace) or start with 'xmlns:' followed by a prefix.
Syntax:

<element xmlns:name="URI">

Example:
<?xml version="1.0" encoding = "UTF-8"?>
<cont:contact xmlns:cont = "www.careerride.com">
     <cont:name>ABC</cont:name>
     <cont:company>CareerRide Info</cont:company>
     <cont:phone>+91**********</cont:phone>
</cont:contact>
  • In the above example, the namespace prefix is 'cont' and the namespace identifier is 'www.careerride.com'.
  • It specifies that element and attribute names with the 'cont' prefix belong to the 'www.careerride.com' namespace.
  • Element names are defined by the developer, so names from different sources can conflict.
  • XML namespaces provide a method to avoid such element name conflicts.
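
The conflict described above can be sketched with two elements that are both called 'name'. This is a minimal example assuming Python's standard xml.etree.ElementTree module; the second namespace 'http://example.com/product' and the 'prod' prefix are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Two <name> elements that would clash without namespaces.
DOC = """<root xmlns:cont="www.careerride.com"
      xmlns:prod="http://example.com/product">
  <cont:name>ABC</cont:name>
  <prod:name>Laptop</prod:name>
</root>"""

# Prefix-to-identifier map used for lookups; prefixes here are local choices.
NS = {"cont": "www.careerride.com", "prod": "http://example.com/product"}


def names_by_namespace(xml_text: str) -> dict:
    """Return the text of each <name>, keyed by its namespace prefix."""
    root = ET.fromstring(xml_text)
    return {
        "cont": root.find("cont:name", NS).text,
        "prod": root.find("prod:name", NS).text,
    }


if __name__ == "__main__":
    print(names_by_namespace(DOC))
    # The parser stores the full identifier, not the prefix, in each tag.
    print([child.tag for child in ET.fromstring(DOC)])
```

The prefix is only shorthand; two elements are treated as the same name only when both the namespace identifier and the local name match.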

Give difference between DTD and XML schema.

DTD:
  • It is a set of markup declarations that define a document type for an SGML-family markup language such as SGML, XML or HTML.
  • DTD stands for Document Type Definition.
  • It supports two types of data:
    1. PCDATA (Parsed Character Data): Text that the XML parser parses, so markup and entity references inside it are recognized.
    2. CDATA (Character Data): It is used for text data that should not be parsed as markup by the XML parser.
  • It defines the elements, the attributes, and the ordering and nesting of elements.
  • DTD lacks strong typing capabilities and has no way of validating content against data types.
  • It can define the order of child elements, but it can limit how often they occur only with the ?, * and + operators.
  • It does not support data types such as numbers, dates or Booleans.
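
The difference between parsed and unparsed character data can be seen with a CDATA section inside a document. A minimal sketch, assuming Python's standard xml.etree.ElementTree parser; the 'note' element and the function 'describe' are made-up examples:

```python
import xml.etree.ElementTree as ET

# Parsed character data: &lt; is expanded and <b> becomes a child element.
PARSED = "<note>2 &lt; 3 is <b>true</b></note>"

# CDATA section: everything inside is kept as literal text.
LITERAL = "<note><![CDATA[2 < 3 is <b>true</b>]]></note>"


def describe(xml_text: str) -> tuple:
    """Return the leading text of <note> and its number of child elements."""
    root = ET.fromstring(xml_text)
    return root.text, len(root)


if __name__ == "__main__":
    print(describe(PARSED))
    print(describe(LITERAL))
```

In the first document the parser finds one child element; in the second it finds none, because the angle brackets were never treated as markup.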
Schema:
  • An XML Schema (XSD) describes the structure of an XML document.
  • It supports data types such as numeric, Boolean and string.
  • It is suitable for applications that are developed in a programming language.
  • Schema supports custom data types.
  • It supports encapsulation and inheritance concepts.
  • Schemas are supported by Web services and XSLT.
  • It defines the order of child elements.

Explain DOM parser.

What is DOM parser?
  • The DOM (Document Object Model) is cross-platform and language-independent.
  • DOM is an open standard.
  • It is an official recommendation of the World Wide Web Consortium (W3C).
  • The HTML DOM API specializes and adds the functionality to relate to HTML documents and elements.
  • The HTML DOM API also addresses backward compatibility with DOM Level 0.
Advantages of DOM parser
  • It provides mechanisms for common and frequent operations on HTML documents.
  • It is a programming interface for HTML and XML documents.
  • The DOM provides a structured representation of the document.
  • It provides a representation of the document as a structured group of nodes and objects which have properties and methods.
  • DOM defines how programs can access the structure so that they can change the document structure, style and content.
  • It provides interfaces to the components of the tree that makes up a DOM document.
  • It creates a tree structure in memory from the input document and then waits for requests from the client.
  • It always serves the client application with the entire document, no matter how much of it the client actually needs.
  • The XML file is arranged in a tree fashion.
  • DOM supports random access to the data of an XML file.
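
The points above (tree in memory, random access, changing structure and content) can be sketched with Python's standard xml.dom.minidom module. The 'employees' document and the function 'update_second_employee' are assumptions made for illustration:

```python
from xml.dom import minidom

# Hypothetical input document.
DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<employees>
  <employee id="1"><name>ABC</name><city>Pune</city></employee>
  <employee id="2"><name>XYZ</name><city>Mumbai</city></employee>
</employees>"""


def update_second_employee(xml_bytes: bytes) -> tuple:
    """Read and modify the second employee through the DOM tree."""
    doc = minidom.parseString(xml_bytes)      # whole tree is built in memory
    second = doc.getElementsByTagName("employee")[1]   # random access

    # Read content: the text lives in a text node under <name>.
    name = second.getElementsByTagName("name")[0].firstChild.data

    # Change content of an existing node.
    second.getElementsByTagName("city")[0].firstChild.data = "Delhi"

    # Change structure by adding a new element with a text node.
    phone = doc.createElement("phone")
    phone.appendChild(doc.createTextNode("unknown"))
    second.appendChild(phone)

    return name, second.toxml()


if __name__ == "__main__":
    print(update_second_employee(DOC))
```

Because the entire document is held in memory, any node can be reached in any order, at the cost of memory for large files.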

Explain DTD in detail with its example.

What is DTD?
  • It is a set of markup declarations that define a document type for an SGML-family markup language such as SGML, XML or HTML.
  • DTD stands for Document Type Definition.
  • It supports two types of data:
    1. PCDATA (Parsed Character Data): Text that the XML parser parses, so markup and entity references inside it are recognized.
    2. CDATA (Character Data): It is used for text data that should not be parsed as markup by the XML parser.
  • It defines the elements, the attributes, and the ordering and nesting of elements.
  • DTD lacks strong typing capabilities and has no way of validating content against data types.
  • It can define the order of child elements, but it can limit how often they occur only with the ?, * and + operators.
  • It does not support data types such as numbers, dates or Booleans.
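
As an example, the sketch below embeds a small DTD in a document and parses it. The DTD itself (an 'employee' element with 'name' and 'address' children and a required 'id' attribute) is a made-up illustration. One assumption to note: Python's built-in parser reads the DTD but does not validate against it, so the second document is accepted even though it breaks the declared content model; a validating parser would be needed to reject it.

```python
import xml.etree.ElementTree as ET

# Internal DTD: employee must contain name, then address, and carry an id.
VALID_DOC = """<!DOCTYPE employee [
  <!ELEMENT employee (name, address)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT address (#PCDATA)>
  <!ATTLIST employee id CDATA #REQUIRED>
]>
<employee id="E1">
  <name>ABC</name>
  <address>Pune</address>
</employee>"""

# Same document without <address>: well-formed, but invalid per the DTD.
INVALID_DOC = VALID_DOC.replace("<address>Pune</address>", "")


def child_tags(xml_text: str) -> list:
    """Parse the document and list the root's child element names."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root]


if __name__ == "__main__":
    print(child_tags(VALID_DOC))
    print(child_tags(INVALID_DOC))  # parses: no validation is performed
```

The declarations show the DTD building blocks named above: ELEMENT for elements and their ordering and nesting, #PCDATA for text content, and ATTLIST with CDATA for an attribute.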

Explain SOAP.

What is an XML Web service?
  • An XML Web service is a unit of code that can be accessed independently of platforms and systems.
  • These services are web application components.
  • They are used to interchange data between different systems on different machines, for interoperability, using the HTTP protocol.
  • Requests are made and responses are returned in the form of XML, because XML is language and platform independent.
  • A Web service exposes useful functionality to web users through a standard web protocol.
  • It describes its interface in enough detail to allow a user to build a client application that talks to it.
  • These services are registered so that potential users can find them easily.
Web service components are as follows:
1. SOAP
2. WSDL
3. UDDI

What is SOAP?
  • SOAP stands for Simple Object Access Protocol.
  • It is a protocol for communication between applications.
  • It is platform and language independent.
  • It can run on any operating system.
  • It describes how to encode an HTTP header and an XML file so that two computers can communicate.
  • SOAP is an XML-based protocol that enables two components to communicate with each other.
  • It has rules to translate platform-specific data into the XML format.
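
A SOAP message is an XML document with an Envelope element that contains a Body. The sketch below builds and reads such a message with Python's standard xml.etree.ElementTree module and does not send it anywhere. It assumes the SOAP 1.1 envelope namespace; the application namespace 'http://example.com/employees', the 'GetEmployee' operation and both function names are hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope
APP_NS = "http://example.com/employees"                 # hypothetical service

# Choose readable prefixes for the serialised output.
ET.register_namespace("soap", SOAP_NS)
ET.register_namespace("emp", APP_NS)


def build_request(employee_id: str) -> bytes:
    """Wrap a hypothetical GetEmployee call in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    operation = ET.SubElement(body, f"{{{APP_NS}}}GetEmployee")
    ET.SubElement(operation, f"{{{APP_NS}}}id").text = employee_id
    return ET.tostring(envelope, encoding="utf-8")


def read_request(message: bytes) -> str:
    """Extract the employee id on the receiving side."""
    namespaces = {"soap": SOAP_NS, "emp": APP_NS}
    root = ET.fromstring(message)
    return root.findtext("soap:Body/emp:GetEmployee/emp:id",
                         namespaces=namespaces)


if __name__ == "__main__":
    request = build_request("E1")
    print(request.decode("utf-8"))
    print(read_request(request))
```

In practice the bytes would be sent as the body of an HTTP POST request, and the receiver could be written in any language that can parse XML.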
Advantages of SOAP:
  • It has its own security extension, known as WS-Security (Web Services Security).
  • It usually passes through server firewalls, because it is typically carried over HTTP.
  • It can be written in any programming language and executed on any platform.
  • Web services are based on the SOAP protocol in order to expose their functionality to disparate applications and platforms.
Disadvantages of SOAP:
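  • It consumes more bandwidth and resources, because XML messages are verbose.
  • It is slow, because every message has to be built and parsed as XML.
  • It has no mechanism of its own to discover a service; it relies on WSDL for that.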