By Priyanjana Bengani (@acookiecrumbles) and Jon Keegan (@jonkeegan)
IRE NICAR Conference – March 4, 2022
Slides: English | Russian
The Tow Center would like to thank Dr. Svetlana Borodina and the Harriman Institute for translating this presentation into Russian.
What is this?
This checklist is a reporting tool to help journalists and researchers find out who published a website. It is meant to be used alongside offline reporting techniques.
Following this checklist doesn’t guarantee you can unmask a website owner who doesn’t want to be found, but it can help reveal crucial clues and connections that can serve as leads for further reporting.
🌟 Strong recommendation: keep a log as you work through this checklist. It can be a TextEdit document, a Google Doc, or just the Notes app. What matters is being able to retrace your steps.
Content of the site
Features and Functions
Photos, pictures and documents
If there are any social media profiles mentioned on the site, they are worth investigating.
On a Facebook page, open the Page Transparency section.
On Twitter, the account may be part of a pod or network that boosts it. It is worth checking with en.whotwi.com.
Don’t forget to check whether the site has accounts on YouTube, Instagram, Reddit, GitHub…
🗄 Have you archived the website? (You always should!)
- you can do this at archive.org or use their browser extension.
- you can grab the whole website on Terminal with
🖥 What does the website use?
- Does it use WordPress, Squarespace, anything else?
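A rough first pass at this question can be scripted. The sketch below scans a page's HTML for a few well-known markers of WordPress and Squarespace. The fingerprint list and the `guess_cms` helper are illustrative assumptions, not an exhaustive or authoritative detector; sites can hide or spoof these markers, so treat a hit as a lead to confirm with a tool such as builtwith.com.

```python
import re

# Illustrative fingerprints only (an assumption of this sketch): common
# asset paths and generator tags. Absence of a match proves nothing.
CMS_FINGERPRINTS = {
    "WordPress": [r"/wp-content/", r"/wp-includes/", r'content="WordPress'],
    "Squarespace": [r"static1\.squarespace\.com", r"squarespace-cdn\.com"],
}


def guess_cms(html: str) -> list[str]:
    """Return the names of platforms whose markers appear in the HTML."""
    found = []
    for name, patterns in CMS_FINGERPRINTS.items():
        if any(re.search(p, html, re.IGNORECASE) for p in patterns):
            found.append(name)
    return found


if __name__ == "__main__":
    sample = '<link rel="stylesheet" href="https://example.com/wp-content/themes/x/style.css">'
    print(guess_cms(sample))
```

Record any match, and the markup that triggered it, in your log so the finding can be retraced later.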
☁️ Where is it hosted?
- Is it on Google Cloud, AWS, Cloudflare, something else?
🪳 Are there any trackers present?
🛍 How is the site monetized?
- Are there affiliate links (Amazon, etc.)?
🧬 What are the different tracking IDs, and are they shared with other domains?
- Check Google Analytics, Facebook Pixel, Quantcast, NewRelic, etc.
- Use tools like BuiltWith, RiskIQ, or dnslytics to see if other domains share the same ID.
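Before turning to a lookup tool, you need the IDs themselves. As a minimal sketch, the snippet below pulls Google Analytics and Facebook Pixel IDs out of saved page source using regular expressions. The patterns and the helper names `extract_tracking_ids` and `shared_ids` are assumptions made for illustration; they cover common embed formats only and can produce false positives, so verify each ID by eye in the page source.

```python
import re

# Heuristic patterns (assumptions of this sketch) for common embed formats.
TRACKER_PATTERNS = {
    "google_analytics_ua": re.compile(r"\bUA-\d{4,10}-\d{1,4}\b"),
    "google_analytics_ga4": re.compile(r"\bG-[A-Z0-9]{6,12}\b"),
    # Captures only the numeric ID passed to fbq('init', '<id>').
    "facebook_pixel": re.compile(
        r"fbq\(\s*['\"]init['\"]\s*,\s*['\"](\d{5,20})['\"]"
    ),
}


def extract_tracking_ids(html: str) -> dict[str, list[str]]:
    """Return tracker name -> sorted unique IDs found in the HTML."""
    results = {}
    for name, pattern in TRACKER_PATTERNS.items():
        ids = set(pattern.findall(html))
        if ids:
            results[name] = sorted(ids)
    return results


def shared_ids(a: dict[str, list[str]], b: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return the IDs that two sites have in common, per tracker."""
    common = {}
    for name in a.keys() & b.keys():
        overlap = sorted(set(a[name]) & set(b[name]))
        if overlap:
            common[name] = overlap
    return common


if __name__ == "__main__":
    page = "<script>ga('create', 'UA-12345678-1', 'auto'); fbq('init', '1234567890');</script>"
    print(extract_tracking_ids(page))
```

A shared ID between two domains is a lead, not proof of common ownership: IDs are sometimes copied along with a site template.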
Are there any relevant subdomains?
📜 Are there historical WHOIS records?
⌛️ Has the site evolved over time?
- Look at archive.org to see if the domain has changed dramatically, and if so, when.
🗑 Did the previous version of the site contain more information?
- Owners may scrub information once a site has been online for a while.
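Browsing archive.org by hand works well for a single page. For a broader view of when a site changed, one option is the Wayback Machine's CDX endpoint, which lists captures of a URL. The sketch below builds a query and counts captures per year; the helper names are assumptions made for illustration, and the query parameters (`output`, `fl`, `collapse`, `limit`) should be checked against the Internet Archive's current documentation before relying on them.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(domain: str, limit: int = 1000) -> str:
    """Build a CDX query that returns JSON rows for a domain's captures."""
    params = {
        "url": domain,
        "output": "json",
        "fl": "timestamp,original,statuscode",
        "collapse": "digest",  # skip consecutive captures with identical content
        "limit": limit,
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def captures_per_year(rows: list[list[str]]) -> dict[str, int]:
    """Count captures per year; the first JSON row is the column header."""
    if not rows:
        return {}
    header, *data = rows
    ts_index = header.index("timestamp")
    # Timestamps look like YYYYMMDDhhmmss, so the first four chars are the year.
    counts = Counter(row[ts_index][:4] for row in data)
    return dict(sorted(counts.items()))


def fetch_rows(domain: str) -> list[list[str]]:
    """Fetch capture rows over the network (not exercised in the demo below)."""
    with urlopen(build_cdx_url(domain), timeout=30) as resp:
        body = resp.read().decode("utf-8")
    return json.loads(body) if body.strip() else []


if __name__ == "__main__":
    sample = [
        ["timestamp", "original", "statuscode"],
        ["20150102030405", "http://example.com/", "200"],
        ["20150607080910", "http://example.com/", "200"],
        ["20200101000000", "http://example.com/", "200"],
    ]
    print(captures_per_year(sample))
```

A sudden jump or gap in captures can point to the period worth inspecting by hand, including older versions that carried more identifying detail.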
Resources and tools
Open Source Intelligence Techniques – Michael Bazzell https://inteltechniques.com/book1.html
Verification Manual – edited by Craig Silverman https://datajournalism.com/read/handbook/verification-3
- Blacklight: The Markup’s real-time website privacy inspector.
- builtwith.com: gives you the site’s infrastructure, including IP addresses, tracking codes, technology stack, etc. Freemium model.
- DNSDB Scout: lets you run standard and “flex” searches over passive DNS data, including IP-to-domain mapping.
- dnslytics: offers a range of tools including reverse analytics and reverse DNS lookups, as well as WHOIS data. Freemium.
- RiskIQ: a “threat intelligence” tool that allows you to obtain reverse IP, reverse analytics, WHOIS, SSL, subdomains, etc.
- Whoxy: a tool that allows you to view the history of WHOIS records. Free.
- The Internet Archive browser extension.
Social media accounts
- Sensity AI: checks whether an image was generated by a GAN. Freemium.
- whotwi.com: gives an at-a-glance profile of any Twitter account. Free.
Check out this checklist on GitHub.
TOP IMAGE: Hana Joy