Tools and Techniques to Investigate the Ownership of Websites

Discover tools & techniques to investigate website ownership. Uncover hidden info behind websites, from domain WHOIS to IP analysis.

Websites are designed to show the public only what their owners want; however, there is plenty of hidden technical information behind their polished design. There are many reasons to investigate the ownership of websites. For instance, on the legitimate side, we may want to contact the webmaster for a link-building campaign, ask them to remove a link or a piece of content, or contact the website owner to place advertisements or purchase the domain name. Journalists and online investigators also need to investigate a web presence to connect their findings with information collected from other sources (e.g., social media profiles, blogs, discussion forums, and data collected from the dark web, such as the TOR network).

On the illegal side, threat actors (e.g., cyber criminals, phishers, and spammers) aim to collect as much information as possible about their targets before launching direct attacks against them. For example, by investigating website owner information, an adversary can identify other websites run by the same owner, find their private contact information, and discover the technology used to build the target website in order to establish a beachhead for further attacks.
This article will introduce various tools and techniques for investigating the different elements of a website to reveal the valuable information hidden beneath its public interface.

HISTORICAL WEB DATA

Anything published online may disappear suddenly. For example, a webpage, an image, a video, or an entire website may fade over time. The Wayback Machine is a free online service for retrieving historical data about any website (see Figure 1).
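Beyond its web interface, the Wayback Machine also exposes a small availability API that can be queried from a script. The following is a minimal sketch in Python (using the requests library; example.com is just a placeholder target) that asks for the most recent archived snapshot of a site:

    # Query the Wayback Machine availability API for the latest archived snapshot.
    import requests

    target = "example.com"  # placeholder target
    resp = requests.get("https://archive.org/wayback/available",
                        params={"url": target}, timeout=10)
    closest = resp.json().get("archived_snapshots", {}).get("closest", {})
    print(closest.get("url", "No archived snapshot found"))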

DOMAIN NAME WHOIS LOOKUP

A website's domain name can reveal important information about its owners. A domain name is unique; for instance, there is only one domain named "yahoo.com." When a person purchases a domain name, a record containing information (mainly personal and contact details) about the owner is created. Sometimes this information is made private by the owner; however, we can still try our luck and search historical domain data for earlier changes that may contain owner information from before the registration was made private. There are many services to retrieve domain name information; a scripted lookup is also sketched after the list below.

DomainTools: This service allows you to access historical WHOIS records.

Whoisology: This service offers reverse WHOIS lookups and reveals connections between domain names and their owners.

DomainBigData: In addition to finding domain name info, this service finds other domains owned by the same person.

GoDaddy: This registrar's WHOIS lookup finds information about website owners.

ICANN Lookup: This tool gives you the ability to look up the registration data for domain names. (ICANN is a non-profit organization that coordinates the registration of domain names worldwide.)
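The services above are convenient, but WHOIS is also a simple text protocol that can be queried directly over TCP port 43, as sketched below in Python (standard library only). Note the assumption that the target is a .com or .net domain: whois.verisign-grs.com is the registry server for those TLDs, and other TLDs use different WHOIS servers.

    # Minimal raw WHOIS query over TCP port 43 (the standard WHOIS protocol).
    import socket

    def whois_query(domain, server="whois.verisign-grs.com"):
        with socket.create_connection((server, 43), timeout=10) as sock:
            sock.sendall((domain + "\r\n").encode())
            response = b""
            while True:
                chunk = sock.recv(4096)
                if not chunk:
                    break
                response += chunk
        return response.decode(errors="replace")

    print(whois_query("example.com"))  # placeholder target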

WEBSITE HOSTING

Every website has files stored on a web server that exists somewhere in the world. To find out who is hosting a website, use any of the following services (a scripted alternative is sketched after the list):

  1. HostingChecker
  2. HostAdvice
  3. WebHostingHero
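These checks can also be scripted. As one illustration (this approach is an assumption on my part, not one of the services above), the RDAP protocol publishes IP ownership data; the sketch below resolves a site's IP address and asks the rdap.org bootstrap service which network, typically the hosting provider or its upstream, that address belongs to:

    # Resolve a site's IP, then query RDAP for the network that owns it.
    # rdap.org redirects to the regional registry; field names can vary by registry.
    import socket
    import requests

    target = "example.com"  # placeholder target
    ip = socket.gethostbyname(target)
    rdap = requests.get(f"https://rdap.org/ip/{ip}", timeout=10).json()
    print(ip, rdap.get("name"), rdap.get("handle"))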

REVERSE IP SEARCH

Many websites use shared hosting to reduce hosting costs; websites on the same shared host have the same IP address. After finding the target website's IP address, we can conduct a reverse IP search to see all the websites hosted on that address. Before listing reverse IP search sites, let us learn how to retrieve any website's IP address using the Windows command prompt (see Figure 2).
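If a command prompt is not at hand, the same lookup takes only a couple of lines of Python (a minimal sketch; example.com is a placeholder):

    # Resolve a hostname to its IPv4 address using the standard library.
    import socket

    print(socket.gethostbyname("example.com"))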

Analyzing other domain names that share the same IP address can lead us to the owner of the target website (e.g., many web administrators host all of the websites belonging to one person on the same IP address).
The following are some services for conducting a reverse IP search:

  1. ViewDNS.info (see Figure 3)
  2. The Bing search engine can conduct a reverse IP search; type the following search query: ip:166.62.28.88
  3. Robtex
  4. Netcraft: Displays useful information about the target website, including domain info, the technology used to build the site, web trackers, and hosting history (see Figure 4).
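To illustrate the shared-hosting idea above, the sketch below (Python standard library; the candidate domain list is purely hypothetical) resolves a set of domains and groups them by IP address, making it easy to spot which ones sit on the same server as the target:

    # Group candidate domains by resolved IPv4 address to spot shared hosting.
    import socket
    from collections import defaultdict

    candidates = ["example.com", "example.org", "example.net"]  # illustrative list
    by_ip = defaultdict(list)

    for domain in candidates:
        try:
            by_ip[socket.gethostbyname(domain)].append(domain)
        except socket.gaierror:
            pass  # domain did not resolve

    for ip, domains in by_ip.items():
        print(ip, "->", ", ".join(domains))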

ANALYZING WEBPAGES SOURCE CODE

What you see in your web browser when visiting a website is the graphical representation of the HTML and JavaScript code used to build its pages. In most web browsers, we can view a web page's source code by right-clicking anywhere on the page and selecting "View Page Source". A page's source may contain comments left by the website developer (HTML comments begin with <!-- and end with -->) that mention the developing company's name, contact information, and the plugins used to create the website (e.g., many WordPress plugins automatically add comments to page source code) (see Figure 5).
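Hunting for such comments can be automated. The sketch below (Python with the requests library and a plain regular expression, which is good enough for quick triage; the URL is a placeholder) downloads a page and prints every HTML comment in its source:

    # Download a page and print all HTML comments (<!-- ... -->) in its source.
    import re
    import requests

    html = requests.get("https://example.com", timeout=10).text
    for comment in re.findall(r"<!--(.*?)-->", html, re.DOTALL):
        print(comment.strip())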

Let us imagine this scenario: if we discover from its page source that our target website is using a specific WordPress plugin, we can go to www.cvedetails.com and check whether there are any vulnerabilities associated with that plugin (see Figure 6).

FINDING CONNECTED WEBSITES USING REVERSE GOOGLE ANALYTICS ID

Website owners use the Google Analytics service to monitor traffic to their websites. Commonly, webmasters use the same Analytics account to monitor multiple websites, so by conducting a reverse Google Analytics search, we can find websites belonging to the same owner. We can find a website's Google Analytics ID in its source code by searching for "UA-" (see Figure 7); a scripted version of this search is sketched below. There are also free services that conduct this search automatically:
  1. DNSlytics
  2. DomainIQ

Finding related websites using a Google Analytics ID is an efficient method. However, keep in mind that some web developers copy a web page's source code and paste it into their own website without deleting the original website's Google Analytics ID, which will lead to confusing results. Make sure to use multiple sources when conducting a reverse search, and check whether the related websites are relevant to the target (e.g., a gaming site or a website offering free program downloads may not be relevant to a government entity or a big enterprise even though they share the same Google Analytics ID).
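The source-code search itself is easy to script. The sketch below (Python; the regular expression targets the classic UA- identifiers discussed above, and the URL is a placeholder) pulls any Google Analytics IDs out of a page:

    # Extract Google Analytics tracking IDs (UA-XXXXXXX-X) from a page's source.
    import re
    import requests

    html = requests.get("https://example.com", timeout=10).text
    print(set(re.findall(r"UA-\d{4,10}-\d{1,4}", html)))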

ROBOTS.TXT

Webmasters use robots.txt to instruct search engines on how they should crawl pages when visiting their websites. For instance, this file allows website administrators to prevent search engine crawlers from indexing certain parts of, or files on, their websites. This practice is commonly used to stop some services, such as the Wayback Machine crawler, from archiving parts of a website's content or files.
As an investigator, analyzing robots.txt may reveal important information. Some webmasters include sensitive URLs (to files and directories) in this file to hide them from search engine crawlers. The robots.txt file is public and can be accessed by appending /robots.txt to the domain name (see Figure 8).
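Retrieving and skimming this file is trivial to script. The sketch below (Python with requests; example.com is a placeholder) fetches robots.txt and prints the Disallow entries, which are usually the most interesting lines:

    # Fetch a site's robots.txt and print the paths crawlers are asked to avoid.
    import requests

    robots = requests.get("https://example.com/robots.txt", timeout=10).text
    for line in robots.splitlines():
        if line.strip().lower().startswith("disallow"):
            print(line.strip())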

SUMMARY

To successfully collect information about any website, you need to use a plethora of tools and online services, in addition to understanding where to search for such information online. This article gives you a solid introduction from which to expand your knowledge of website investigation techniques.
Citation:

Khera, V., 2021. [online] Available at: https://www.linkedin.com/pulse/beneath-surface-tools-techniques-investigate-ownership-khera/?trackingId=c435Jn8GTde27daavtnO8Q%3D%3D [Accessed 30 June 2021].


Connect with us today at www.cloudsecasia.com to safeguard your organization against cyber threats.

We are your premier cybersecurity solution and consulting provider in the APAC region.