Web Basics

Programming Languages

The most popular Web languages available

HTML

The files has the extension(s) – .html, .htm, .html4. HyperText Markup Language (HTML), invented by Tim Berners-Lee in 1989, is the undergirding framework of the Internet. Almost every Web site in existence uses the HTML language to display text, graphics, sounds, and animations. The caretaker of the specification is the World Wide Web Consortium (W3C) (http://www.w3.org). HTML language supports hyperlinking, or the ability to link from one word or area of a page to another area or page. HTML is made up of a series of “elements”, which tells the receiving browser to display certain elements on the screen.

Dynamic HTML (DHTML)

It’s file extension is .dhtml. DHTML is often considered the object version of HTML. This language extends the HTML language to allow for increased control over page elements by allowing them to be accessed and modified by a scripting language such as Javascript or VBScript. As a result, an image tag may have an “OnMouseOver” event triggered when a user places his mouse over the image tag creating an object that comes to life with animation. Briefly, DHTML allows a Web developer to rapidly develop enhanced animations and effects such as mouseover effects, animated text, and dynamic color changes. Because DHTML is based on HTML, its security implications are similar to those for HTML.

XML

It’s file extension is .xml. A derivative of SGML, eXtensible Markup Language (XML) is less restrictive than HTML’s well-defined, standardized elements. XML allows anyone to create her own elements and extend the language herself. At the heart of this element extension are the Document Type Definitions (DTDs), which are similar in function to the data definition library of a relational database. A DTD defines the beginning and ending tags of an XML file, allowing the viewer to make sense of the data.

XHTML

It’s file extension is .xhtml According to the W3C, HTML 4 is the final release of the ubiquitous language as we know it. The next version of HTML is being reformulated to include the XML language’s definition and structure. In other words, HTML and XML are being combined to form the XHTML language and give the W3C continued control over its design. The complete XHTML 1.1 specification can be found at http://www.w3.org/TR/xhtml11/. As with XML, XHTML is in its infancy, and its security hasn’t been tested to any significant degree.

Perl

It’s file extension is .pl or anything. The Practical Extraction and Report Language (Perl) is a high-level, (often considered a scripting) programming language written by Larry Wall in 1987. It is arguably the most ported scripting language to date with versions for AS/400, Windows 9x/NT/2000/XP, OS/2, Novell Netware, IBM’s MVS, Cray supercomputers, Digital’s VMS, Linux, Tandem, HP’s MPE/ix, MacOS, and all versions of Unix. The portability of the Perl language, combined with its low price (it’s free!) and robustness, has created a truly ubiquitous language for the Internet and is largely responsible for the Internet’s tremendous growth.

Perl is remarkably robust and flexible, because it can be written to accommodate server-side actions, scripted to perform functions locally on a system, or used to create entire standalone applications such as majordomo, the universal mail list manager. However, its primary use is handling the server-side scripting of Web sites. Security never has been a fundamental component of the language. As a result, many security vulnerabilities are present in Web sites that utilize the Perl language. But, if you use Perl there are ways to harden it.

PHP

It’s file extensions are .php, .php3, and potentially no extension. Although there have been a number of PHP language authors, the origins of PHP start with Rasmus Lerdorf. He originally wrote the first PHP parsing engine in 1995 as a Perl CGI program, which he called “Personal Home Page,” or simply PHP. His original purpose was to log visitors to his résumé page on the Web. He later rewrote the whole thing in C and made it much larger, enhancing the program with greater parsing capabilities and adding database connectivity.

CGI

It’s file extensions are .cgi, .pl. Common Gateway Interface (CGI) is one of the oldest and most mature standards on the Internet for passing information from a Web server to a program (such as Perl) and back to the Web browser in the proper format. Combined with languages such as Perl, CGI offered one of the first platforms for delivering server-side, dynamic content to the Web. Unlike ASP or PHP, CGI is not a language per se but rather a set of guidelines to be used for other languages. In fact, numerous languages can be used to create a CGI program, including Perl, C/C++, Java, Shell script language such as sh, csh, and ksh (Unix) and even Visual Basic (Windows).

Environmental Variables – Finally, a discussion of CGI isn’t complete without covering environment variables. They are the conduit between the form being used by the Web browser and the back-end CGI program in that they allow certain information to pass to the running CGI program. The purpose of these variables is to better interpret the environment of the running browser.

Server-Side Includes (SSI)- HTML and SHTML

It’s file extensions are .shtml, .shtm, .stm SHTML is considered a CGI file because it typically uses server-side includes (SSI) for server-side processing. SHTML is considered the grandfather of server-side processing, because it has been around since the beginning of the Web and is still in use today (but sparingly). Like most of the languages we discussed so far, SSI are included in the HTML and get picked up by the application mappings of the Web server. The commands are acted on, with the output of the programs being sent to the browser.

Client-Based Java – Client-based Java is designed to run in the processor space of the client system or end-user browser. Client-based Java code comes in two formats: applets and scripting languages.

Applets – It is indicated by the <applet> tag embedded in HTML, Java applets are downloaded and run by the client Web browser. One of the main risks with applets is that, although they are obfuscated by bytecode, the typical Java class can be downloaded separately and decompiled, allowing an attacker to view readable Java source code. This technique allows someone to search for security weaknesses in your Java code. Java applets can be written in any text file, as can most of the other languages.

JavaScript

JavaScript is a non-compiled, interpreted scripting language originally created by Netscape. The link between JavaScript and Java, however, is by name only. Java is compiled, object-oriented, and somewhat difficult to master. JavaScript is a simple, semi-object-oriented scripting language meant for rapid Web development. Although JavaScript has some basic object-oriented capabilities, it more closely resembles a scripting language, such as Perl. JavaScript is useful for simple tasks such as checking form data, adding HTML code on demand, and performing user specific computations such as date/time and browser considerations.

HTTP

Every Web browser and server must communicate over this protocol in order to exchange information. There have been three major versions of the protocol, all of which maintained the same fundamental structure. HTTP is a request/response stateless protocol that allows computers to talk to each other rather efficiently and carry on conversations lasting hours, days, and weeks at a time.

HTTP Request methods

HTTP defines methods (sometimes referred to as verbs) to indicate the desired action to be performed on the identified resource. What this resource represents, whether pre-existing data or data that is generated dynamically, depends on the implementation of the server. Often, the resource corresponds to a file or the output of an executable residing on the server. The HTTP/1.0 specification defined the GET, POST and HEAD methods and the HTTP/1.1 specification added 5 new methods: OPTIONS, PUT, DELETE, TRACE and CONNECT. By being specified in these documents their semantics are well known and can be depended upon. Any client can use any method and the server can be configured to support any combination of methods. If a method is unknown to an intermediate it will be treated as an unsafe and non-idempotent method. There is no limit to the number of methods that can be defined and this allows for future methods to be specified without breaking existing infrastructure. For example, WebDAV defined 7 new methods and RFC 5789 specified the PATCH method.

GET – Requests a representation of the specified resource. Requests using GET should only retrieve data and should have no other effect. (This is also true of some other HTTP methods.) The W3C has published guidance principles on this distinction, saying, “Web application design should be informed by the above principles, but also by the relevant limitations.”
HEAD – Asks for the response identical to the one that would correspond to a GET request, but without the response body. This is useful for retrieving meta-information written in response headers, without having to transport the entire content.
POST – Requests that the server accept the entity enclosed in the request as a new subordinate of the web resource identified by the URI. The data POSTed might be, for example, an annotation for existing resources; a message for a bulletin board, newsgroup, mailing list, or comment thread; a block of data that is the result of submitting a web form to a data-handling process; or an item to add to a database.
PUT – Requests that the enclosed entity be stored under the supplied URI. If the URI refers to an already existing resource, it is modified; if the URI does not point to an existing resource, then the server can create the resource with that URI.
DELETE – Deletes the specified resource.
TRACE – Echoes back the received request so that a client can see what (if any) changes or additions have been made by intermediate servers.
OPTIONS – Returns the HTTP methods that the server supports for the specified URL. This can be used to check the functionality of a web server by requesting ‘*’ instead of a specific resource.
CONNECT – Converts the request connection to a transparent TCP/IP tunnel, usually to facilitate SSL-encrypted communication (HTTPS) through an unencrypted HTTP proxy.

Status codes

In HTTP/1.0 and since, the first line of the HTTP response is called the status line and includes a numeric status code (such as “404”) and a textual reason phrase (such as “Not Found”). The way the user agent handles the response primarily depends on the code and secondarily on the other response header fields. Custom status codes can be used since, if the user agent encounters a code it does not recognize, it can use the first digit of the code to determine the general class of the response.

The standard reason phrases are only recommendations and can be replaced with “local equivalents” at the web developer’s discretion. If the status code indicated a problem, the user agent might display the reason phrase to the user to provide further information about the nature of the problem. The standard also allows the user agent to attempt to interpret the reason phrase, though this might be unwise since the standard explicitly specifies that status codes are machine-readable and reason phrases are human-readable. HTTP status code is primarily divided into five groups for better explanation of request and responses between client and server as named: Informational 1XX, Successful 2XX, Redirection 3XX, Client Error 4XX and Server Error 5XX.

HTTP Session

An HTTP session is a sequence of network request-response transactions. An HTTP client initiates a request by establishing a Transmission Control Protocol (TCP) connection to a particular port on a server (typically port 80, occasionally port 8080). An HTTP server listening on that port waits for a client’s request message. Upon receiving the request, the server sends back a status line, such as “HTTP/1.1 200 OK”, and a message of its own. The body of this message is typically the requested resource, although an error message or other information may also be returned.

The request response model of HTTP works as per the model below.

HTTPS (HTTP over SSL)

HTTPS is a protocol used for encrypted traffic within an HTTP stream. The entire message is encrypted when Secure Sockets Layer (SSL) is used. Many versions of SSL and its related protocols (Transport Layer Security, TLS, and RFC2246) are available, including SSLv1, SSLv2, and SSLv3. And to make things even more confusing, SSL offers a variety of choices for the encryption standard used within a particular version of SSL. For example, with SSLv3, you can choose from DES to RSA (RC2 and RC4).

HTTPS provides authentication of the website and associated web server that one is communicating with, which protects against man-in-the-middle attacks. Additionally, it provides bidirectional encryption of communications between a client and server, which protects against eavesdropping and tampering with and/or forging the contents of the communication. In practice, this provides a reasonable guarantee that one is communicating with precisely the website that one intended to communicate with (as opposed to an impostor), as well as ensuring that the contents of communications between the user and site cannot be read or forged by any third party.

Historically, HTTPS connections were primarily used for payment transactions on the World Wide Web, e-mail and for sensitive transactions in corporate information systems. In the late 2000s and early 2010s, HTTPS began to see widespread use for protecting page authenticity on all types of websites, securing accounts and keeping user communications, identity and web browsing private.

Web Basics

Get Govt. Certified Secure Assured Job Interview

Level Up Your Job Skills Now!

Get industry recognized certification – Contact us

Get Govt. Certified
Secure Assured Job Interview