Uniform Resource Locators
 


A URL (Uniform Resource Locator, previously Universal Resource Locator) is the unique address for a file that is accessible on the Internet. The abbreviation is usually pronounced by sounding out each letter but, in some quarters, is pronounced "Earl". A common way to get to a Web site is to enter the URL of its home page file in your Web browser's address line. However, any file within that Web site can also be specified with a URL. Such a file might be any Web (HTML) page other than the home page, an image file, or a program such as a Common Gateway Interface (CGI) application or Java applet. The URL contains the name of the protocol to be used to access the file resource, a domain name that identifies a specific computer on the Internet, and a pathname, a hierarchical description that specifies the location of a file on that computer.

On the Web (which uses the Hypertext Transfer Protocol, or HTTP), an example of a URL is:

 https://www.ietf.org/rfc/rfc2396.txt

which specifies the https scheme (HTTP over a secure connection, as used by Web browsers), a unique computer named www.ietf.org, and the location of a text file or page to be accessed on that computer, whose pathname is /rfc/rfc2396.txt.
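
As a minimal sketch, assuming Python and its standard urllib.parse module (neither is part of the description above), the three parts of this example URL can be separated as follows:

```python
from urllib.parse import urlsplit

# Split the example URL into its named components.
parts = urlsplit("https://www.ietf.org/rfc/rfc2396.txt")

print(parts.scheme)    # protocol name: https
print(parts.hostname)  # computer name: www.ietf.org
print(parts.path)      # pathname: /rfc/rfc2396.txt
```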

Usage

Every URL consists of some of the following: the scheme name (commonly called the protocol), followed by a colon and two slashes; then, depending on the scheme, a host name made up of an optional server name (e.g. ftp, www, smtp), a dot (.), and a domain name (alternatively, an IP address); a port number; the path of the resource to be fetched or the program to be run; then, for programs such as Common Gateway Interface (CGI) scripts, a query string; and an optional fragment identifier.

The syntax is:
scheme://domain:port/path?query_string#fragment_id
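
The same ordering can be illustrated by assembling a URL from its parts. The sketch below assumes Python's standard urllib.parse module; the helper build_url and the sample values (www.example.com, port 8080, and so on) are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlunsplit

def build_url(scheme, domain, port, path, query_string, fragment_id):
    """Join the components in the order given by the generic syntax.

    Hypothetical helper: port may be None, and query_string or
    fragment_id may be empty strings, in which case they are left out.
    """
    netloc = f"{domain}:{port}" if port is not None else domain
    return urlunsplit((scheme, netloc, path, query_string, fragment_id))

url = build_url("https", "www.example.com", 8080,
                "/docs/index.html", "lang=en", "intro")
print(url)  # https://www.example.com:8080/docs/index.html?lang=en#intro
```

Components that are omitted simply drop out together with their delimiter: build_url("http", "example.org", None, "/", "", "") yields http://example.org/.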

  • The scheme name defines the namespace, purpose, and the syntax of the remaining part of the URL. Software will try to process a URL according to its scheme and context. For example, a web browser will usually dereference the URL http://example.org:80 by performing an HTTP request to the host at example.org, using port number 80. The URL mailto:bob@example.com may start an e-mail composer with the address bob@example.com in the To field.

Other examples of scheme names include https:, gopher:, wais:, ftp:. URLs with https as a scheme (such as https://example.com/) require that requests and responses are made over a secure connection to the website. Some schemes that require authentication allow a username, and perhaps a password too, to be embedded in the URL, for example ftp://asmith@ftp.example.org. Passwords embedded in this way are not conducive to secure working, but the full possible syntax is:
scheme://username:password@domain:port/path?query_string#fragment_id
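
A short sketch of how the embedded credentials can be read back, assuming Python's standard urllib.parse module. The second URL, including its password "secret", port 2121, and path, is made up for illustration only.

```python
from urllib.parse import urlsplit

# The example from the text: a username but no password.
parts = urlsplit("ftp://asmith@ftp.example.org")
print(parts.username)  # asmith
print(parts.password)  # None
print(parts.hostname)  # ftp.example.org

# Hypothetical URL using the full syntax, with an embedded password.
full = urlsplit("ftp://asmith:secret@ftp.example.org:2121/pub/readme.txt")
print(full.username, full.password, full.port)  # asmith secret 2121
```

Because the password is visible to anyone who can read the URL, this form is best avoided outside of examples.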

  • The domain name or IP address gives the destination location for the URL. The domain google.com, or its IP address 72.14.207.99, is the address of Google's website.
  • The domain name portion of a URL is not case sensitive since DNS ignores case: https://en.example.org/ and HTTPS://EN.EXAMPLE.ORG/ both open the same page.
  • The port number is optional; if omitted, the default for the scheme is used. For example, http://vnc.example.com:5800 connects to port 5800 of vnc.example.com, which may be appropriate for a VNC remote control session. If the port number is omitted for an http: URL, the browser will connect on port 80, the default HTTP port. The default port for an https: request is 443.
  • The path is used to specify, and perhaps locate, the resource requested. It is case-sensitive, though it may be treated as case-insensitive by some servers, especially those based on Microsoft Windows. If the server is case sensitive and https://en.example.org/wiki/URL is correct, then https://en.example.org/WIKI/URL or https://en.example.org/wiki/url will display an HTTP 404 error page, unless these URLs point to valid resources themselves.
  • The query string contains data to be passed to software running on the server. It may contain name/value pairs separated by ampersands, for example ?first_name=John&last_name=Doe.
  • The fragment identifier, if present, specifies a part or a position within the overall resource or document. When used with HTTP, it usually specifies a section or location within the page, and the browser may scroll to display that part of the page.
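
The points in the list above can be sketched in a few lines, assuming Python's standard urllib.parse module. The DEFAULT_PORTS table and the effective_port helper are hypothetical names introduced for this illustration, and the "#Syntax" fragment is an invented example.

```python
from urllib.parse import urlsplit, parse_qs

# Default ports for the two schemes mentioned above (illustrative table).
DEFAULT_PORTS = {"http": 80, "https": 443}

def effective_port(url):
    """Return the explicit port if present, else the scheme default (or None)."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)

print(effective_port("http://vnc.example.com:5800"))  # 5800 (explicit port wins)
print(effective_port("http://example.org/"))          # 80
print(effective_port("https://example.com/"))         # 443

# Scheme and host are case-insensitive; the path keeps its case.
parts = urlsplit(
    "HTTPS://EN.EXAMPLE.ORG/wiki/URL?first_name=John&last_name=Doe#Syntax"
)
print(parts.scheme)    # https
print(parts.hostname)  # en.example.org
print(parts.path)      # /wiki/URL

# The query string decodes into name/value pairs; the fragment is separate.
print(parse_qs(parts.query))  # {'first_name': ['John'], 'last_name': ['Doe']}
print(parts.fragment)         # Syntax
```

Note that the fragment is handled by the client: a browser uses it to position the page and does not normally send it to the server.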