Using traceroute, e-mail tracking and web spider

Traceroute

Traceroute is a computer network diagnostic tool for displaying the route (path) and measuring transit delays of packets across an Internet Protocol (IP) network. The history of the route is recorded as the round-trip times of the packets received from each successive host (remote node) in the route (path). Each time is measured from the source to that host and back, so the mean time reported for the final hop approximates the total round-trip time to the destination. Traceroute proceeds unless all (usually three) sent packets are lost more than twice; in that case the connection is considered lost and the route cannot be evaluated. Ping, on the other hand, only computes the final round-trip times from the destination point.
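
As a small illustration of how per-hop times can be summarized, the sketch below averages the round-trip times of the probes sent to each hop and ignores lost probes. The hop addresses and timings are made-up sample data, and the function name `mean_rtt` is only an illustrative choice, not part of any traceroute implementation.

```python
from statistics import mean
from typing import Optional, Sequence


def mean_rtt(probes: Sequence[Optional[float]]) -> Optional[float]:
    """Mean round-trip time (ms) over the probes that got a reply.

    A lost probe is represented by None. Returns None if every probe was lost.
    """
    replies = [rtt for rtt in probes if rtt is not None]
    return mean(replies) if replies else None


# Hypothetical sample data: (host, RTTs in ms for three probes; None = lost)
hops = [
    ("192.0.2.1", [1.0, 2.0, 3.0]),
    ("198.51.100.7", [10.0, None, 14.0]),
    ("203.0.113.9", [None, None, None]),
]

for number, (host, probes) in enumerate(hops, start=1):
    avg = mean_rtt(probes)
    # Like traceroute, show "*" when no probe for this hop was answered.
    shown = f"{avg:.1f} ms" if avg is not None else "*"
    print(f"{number:2d}  {host:<15} {shown}")
```

A hop that answers none of its probes is shown as `*`, which mirrors how traceroute output usually marks a timeout.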

The traceroute command is available on a number of modern operating systems. On Apple Mac OS, it is available by opening “Network Utility” and selecting the “Traceroute” tab, as well as by typing the “traceroute” command in the terminal. On other Unix systems, such as FreeBSD or Linux, it is available as the traceroute(8) command in a terminal. On Microsoft Windows, it is named tracert. Windows NT-based operating systems also provide PathPing, with similar functionality. For Internet Protocol Version 6 (IPv6), the tool sometimes has the name traceroute6 or tracert6.

The traceroute utility is used to determine the path to a target computer: it displays the path a packet follows from its source to its destination. Just as with nslookup, traceroute is available on Windows (as tracert) and UNIX platforms. Traceroute owes its functionality to the IP header time-to-live (TTL) field. The TTL field is used to limit the lifetime of IP datagrams. Without a TTL, some IP datagrams might travel the Internet forever, as there would be no means of timeout. TTL functions as a decrementing counter. Each hop that a datagram passes through reduces the TTL field by one. If the TTL value reaches 0, the datagram is discarded and a “time exceeded in transit” Internet Control Message Protocol (ICMP) message is created to inform the source of the failure. Traceroute exploits this by sending probes with TTL values of 1, 2, 3 and so on, so that each successive router on the path returns a time-exceeded message and thereby reveals its address. By default, Linux traceroute is based on UDP, whereas Windows uses ICMP.
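
The TTL mechanism can be sketched without touching the network. The simulation below is only a model, not a real traceroute: the path is a hypothetical list of addresses, and the names `send_probe` and `traceroute` are illustrative. Each router decrements the TTL and, when it reaches 0, answers in place of the destination.

```python
from typing import List, Tuple


def send_probe(path: List[str], ttl: int) -> Tuple[str, bool]:
    """Simulate one probe along `path` (routers first, destination last).

    Returns (responder, reached_destination).
    """
    if not path:
        raise ValueError("path must not be empty")
    for node in path[:-1]:
        ttl -= 1  # each router decrements the TTL by one
        if ttl == 0:
            # The router discards the datagram and reports
            # "time exceeded in transit" back to the source.
            return node, False
    return path[-1], True  # the probe survived every router


def traceroute(path: List[str], max_hops: int = 30) -> List[Tuple[int, str]]:
    """Send probes with TTL = 1, 2, 3, ... and record who answers each one."""
    route = []
    for ttl in range(1, max_hops + 1):
        responder, reached = send_probe(path, ttl)
        route.append((ttl, responder))
        if reached:
            break
    return route


# Hypothetical path: two routers followed by the destination host.
for ttl, responder in traceroute(["10.0.0.1", "10.0.1.1", "192.0.2.50"]):
    print(ttl, responder)
```

A real implementation would need raw sockets (and usually elevated privileges) to set the TTL and read the ICMP replies; the simulation keeps only the counting logic.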

E-mail Tracking

Email tracking is a method for monitoring the delivery of an email to its intended recipient. Most tracking technologies use some form of digitally time-stamped record to reveal the exact time and date that an email was received or opened, as well as the IP address of the recipient.

Email tracking is useful when the sender wants to know whether the intended recipient actually received the email or clicked its links. However, due to the nature of the technology, email tracking cannot be considered an absolutely accurate indicator that a message was opened or read by the recipient; for example, a mail client that blocks remote images will not trigger an image-based tracker even when the message is read. Most email marketing software provides tracking features, sometimes in aggregate (e.g., click-through rate), and sometimes on an individual basis.

Email tracking is used by individuals, email marketers, spammers and phishers, to verify that emails are actually read by recipients, that email addresses are valid, and that the content of emails has made it past spam filters. It can sometimes reveal if emails get forwarded (but not usually to whom). When used maliciously, it can be used to collect confidential information about businesses and individuals and to create more effective phishing schemes.

The tracking mechanisms employed are typically first-party cookies and web bugs. If you are using email tracking or email marketing software, your company’s privacy policy should state that you may use tracking devices such as cookies and web beacons.
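
A web bug is typically a tiny image whose URL carries a per-recipient token; when the mail client loads the image, the server can log the time and the requesting IP address. The sketch below shows the idea under stated assumptions: the endpoint `tracker.example.com`, the query parameter `t`, and the function names are all hypothetical, and the "server" is just a function that records into a list.

```python
import secrets
from datetime import datetime, timezone
from html import escape
from urllib.parse import parse_qs, urlencode, urlsplit

TRACKER_URL = "https://tracker.example.com/open.gif"  # hypothetical endpoint


def make_pixel(recipient: str, tokens: dict) -> str:
    """Return an <img> tag with a unique token, remembering who it belongs to."""
    token = secrets.token_urlsafe(16)
    tokens[token] = recipient
    url = f"{TRACKER_URL}?{urlencode({'t': token})}"
    return f'<img src="{escape(url)}" width="1" height="1" alt="">'


def record_open(request_url: str, client_ip: str, tokens: dict, log: list) -> bool:
    """Simulate the tracking server handling a request for the pixel."""
    query = parse_qs(urlsplit(request_url).query)
    token = query.get("t", [""])[0]
    recipient = tokens.get(token)
    if recipient is None:
        return False  # unknown token: nothing to attribute the request to
    log.append({
        "recipient": recipient,
        "ip": client_ip,
        "opened_at": datetime.now(timezone.utc),  # time-stamped record
    })
    return True


tokens, log = {}, []
print(make_pixel("alice@example.com", tokens))
```

Because the record is created only when the image is fetched, a log entry shows that something requested the pixel, not that a person read the message.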

Web Spider

A Web spider or crawler is an Internet bot which systematically browses the World Wide Web, typically for the purpose of Web indexing. A Web crawler may also be called an ant, an automatic indexer, or a Web scutter. Web search engines and some other sites use Web crawling or spidering software to update their web content or indexes of other sites’ web content. Web crawlers can copy all the pages they visit for later processing by a search engine, which indexes the downloaded pages so that users can search much more efficiently.
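
The core loop of a crawler is a breadth-first walk over links. The sketch below is a minimal illustration, not a production crawler: it takes a `fetch` function as a parameter (an assumption made so that it can run against an in-memory dictionary of made-up pages instead of the live Web) and uses only the standard library.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collect the href values of all <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl; returns {url: html} for every page fetched."""
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page missing or not retrievable
            continue
        pages[url] = html  # copy the page for later indexing
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


# Hypothetical in-memory "site" standing in for real HTTP requests.
SITE = {
    "http://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.com/a": '<a href="/">home</a> <a href="/c">C</a>',
    "http://example.com/b": "no links here",
    "http://example.com/c": '<a href="http://example.com/a">A</a>',
}

for url in crawl("http://example.com/", SITE.get):
    print(url)
```

The `seen` set is what keeps the crawler from looping forever on pages that link back to each other.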

A web spider can also be used by hackers for extracting information from websites. Usually, such software programs simulate human exploration of the World Wide Web by either implementing low-level Hypertext Transfer Protocol (HTTP), or embedding a fully-fledged web browser, such as Internet Explorer or Mozilla Firefox. A web spider scans websites, collecting certain information such as email addresses. The web spider looks for patterns such as the @ symbol to locate email addresses and then copies them into a list. These addresses are then added to a database and may be used later to send unsolicited emails.
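
To see how little it takes to find addresses around an @ symbol, and therefore what a site owner may want to audit on their own pages, here is a minimal sketch. The regular expression is a deliberately simple approximation chosen for illustration, not a full address grammar, and the sample page is made up.

```python
import re

# Simplified pattern: local part, "@", then a dotted domain name.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(text):
    """Return unique addresses found in `text`, lowercased, in first-seen order."""
    found = []
    seen = set()
    for match in EMAIL_RE.findall(text):
        address = match.lower()
        if address not in seen:
            seen.add(address)
            found.append(address)
    return found


# Hypothetical page content for illustration.
page = (
    '<p>Contact <a href="mailto:Sales@example.com">sales@example.com</a> '
    "or support@example.org.</p> Price: 5 @ $3"
)
print(extract_emails(page))
```

A bare @ with no address around it, as in the price text, is not matched, and the same address written with different capitalization is reported once.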

Uses of data gathered by web spiders include online price comparison, contact scraping, weather data monitoring, website change detection, research, web mashup and web data integration.

The administrator of a website can use various measures to stop or slow a bot or spider. Some techniques are:

  • Blocking an IP address. This will also block all browsing from that address.
  • Disabling any web service API that the website’s system might expose.
  • Bots sometimes declare who they are (using user agent strings). Well-behaved bots can be asked to stay away on that basis using robots.txt, which is advisory only, and others can be refused by user agent at the server; ‘googlebot’ is an example of a declared bot. Some bots make no distinction between themselves and a human browser.
  • Bots can be blocked by monitoring for excess traffic.
  • Bots can sometimes be blocked with tools that verify that a real person is accessing the site, like a CAPTCHA. Bots are sometimes coded to explicitly break specific CAPTCHA patterns.
  • Commercial anti-bot services: Companies offer anti-bot and anti-scraping services for websites. A few web application firewalls have limited bot detection capabilities as well.
  • Locating bots with a honeypot or other method to identify the IP addresses of automated crawlers.
  • Using CSS sprites to display such data as phone numbers or email addresses, at the cost of accessibility to screen reader users.
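
Of the measures above, excess traffic monitoring is easy to sketch. The class below keeps a sliding window of request times per IP address and refuses requests once a limit is exceeded. The class name `TrafficMonitor`, the limit, and the window length are illustrative assumptions; a real deployment would normally do this in the web server, firewall or a dedicated service.

```python
from collections import defaultdict, deque


class TrafficMonitor:
    """Allow at most `max_requests` per IP within `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of allowed requests

    def allow(self, ip: str, now: float) -> bool:
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # too many recent requests: treat as a likely bot
        hits.append(now)
        return True


# Timestamps are passed in explicitly here; in real use one could
# pass time.monotonic() instead.
monitor = TrafficMonitor(max_requests=3, window_seconds=10.0)
for t in (0.0, 1.0, 2.0, 3.0, 10.0):
    print(t, monitor.allow("198.51.100.4", t))
```

Keeping the counts per address means one aggressive client does not affect others, although, as with IP blocking, all users behind the same address share one limit.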
