Written by Tim Tenckhoff – tt031 | Computer Science and Media
The mysterious dark part of the internet, hidden in the depths of the World Wide Web, is well known as a lawless space for shady online drug deals and other criminal activities. But in times of continuous tracking on the internet, personalized advertising and digital censorship by governments, the (almost) invisible part of the web also promises to bring back lost anonymity and privacy. This blog post aims to shed light on the dark corners of the deep web and primarily explains how Tor works.
2. The Deep Web
So, what exactly is the deep web? To explain this, it makes sense to cast a glance at the overall picture. The internet as most people know it forms only a minimal proportion of the overall 7.9 zettabytes (1 ZB = 1000^7 bytes = 10^21 bytes = 1,000,000,000,000,000,000,000 bytes = 1 trillion gigabytes) of data available online (Hidden Internet 2018). This huge amount of data can be separated into three parts:
As seen in the picture above, we only access the 4% that is available through search engines like Google or Bing. The remaining 96% (90% + 6%) is protected by passwords, hidden behind paywalls or can only be accessed via special tools (Hidden Internet 2018). But what separates the hidden parts into Deep Web and Dark Web by definition?
The Deep Web fundamentally refers to data that is not indexed by standard search engines such as Google or Yahoo. This includes all web pages that search engines cannot find, such as user databases, registration-required web forums, webmail pages, and pages behind paywalls. Thus, the Deep Web can, of course, contain content that is totally legal (e.g. governmental records). The Dark Web is a small subset of the Deep Web: its websites are likewise not found by common search engines, but in addition they only exist on an encrypted network that cannot be reached by regular browsers (such as Chrome, Firefox, Internet Explorer, etc.). As a consequence, this area is a well-suited scene for cybercrime. Accessing these Dark Web sites requires the Tor Browser.
…hidden crime bazaars that can only be accessed through special software that obscures one’s true location online.– Brian Krebs, Reference: (Krebs On Security 2016)
2.1 What is the Tor Browser?
The pre-alpha version of Tor was released in September 2002 (Onion Pre Alpha 2002), and the Tor Project, the organization maintaining Tor, was founded in 2006. The name Tor is an abbreviation of The Onion Router. The underlying onion routing protocol was initially developed in the mid-1990s at the U.S. Naval Research Laboratory (Anonymous Connections 1990). The protocol describes a technique for anonymous communication over a public network: each message is encapsulated in several layers of encryption, and Internet traffic is redirected through a free, worldwide overlay network. It is called onion routing because these layers of encryption resemble the layers of an onion. Developed as free and open-source software for enabling anonymous communication, the Tor Browser still follows the intended use today: protecting personal privacy and communication by preventing Internet activities from being monitored.
With the Tor Browser, almost anyone can get access to The Onion Router (Tor) network by downloading and running the software. The browser does not need to be installed on the system and can be unpacked and carried as portable software on a USB stick (Tor Browser 2019). Once started, the browser connects to the Tor network, a network of many servers called Tor nodes. While surfing, the traffic is wrapped in several layers of encryption, and each Tor node along the way removes one layer. Only at the last server in the chain of nodes, the so-called exit node, is the data stream fully decrypted and routed normally via the Internet to the target server whose address was entered in the address bar of the Tor Browser. In concrete terms, the Tor Browser first downloads a list of all available Tor servers and then defines a random route from server to server for the data traffic, which is the onion routing mentioned before. These routes consist of a total of three Tor nodes, with the last server being the Tor exit node (Tor Browser 2019).
Because the traffic runs across multiple servers of the Tor network, the traces that users usually leave while surfing with a normal Internet browser or exchanging data such as email and messenger messages become blurred. Even when the payload of normal Internet traffic is encrypted, e.g. via HTTPS, the header containing routing source, destination, size, timing etc. can simply be observed by attackers or Internet providers. Onion routing, in contrast, also obscures the IP address of Tor users and keeps the location of their computer anonymous. To continuously disguise the data route, a new route through the Tor network is chosen every ten minutes (Tor Browser 2019). The exact functionality of the underlying encryption is described later in section 3.3 (Onion Routing – How Does Tor Work?).
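The route selection described above can be sketched in a few lines of Python. This is only a minimal illustration under simplifying assumptions: the relay names are made up, relays are picked uniformly at random, and the ten-minute lifetime is taken from the text. The real Tor client applies additional rules (for example bandwidth weighting) that are not modeled here.

```python
import random
import time

# Hypothetical relay list; a real client downloads this list from the network.
RELAYS = ["relayA", "relayB", "relayC", "relayD", "relayE", "relayF"]
ROUTE_LIFETIME_SECONDS = 10 * 60  # a new route every ten minutes (per the text)


def pick_route(relays, rng=random):
    """Return three distinct relays: entry node, middle node and exit node."""
    entry, middle, exit_node = rng.sample(relays, 3)
    return {"entry": entry, "middle": middle, "exit": exit_node}


class RouteManager:
    """Keeps the current route and replaces it once it is too old."""

    def __init__(self, relays, clock=time.monotonic):
        self.relays = relays
        self.clock = clock
        self.route = pick_route(relays)
        self.created = clock()

    def current_route(self):
        # Pick a fresh route when the old one has reached its lifetime.
        if self.clock() - self.created >= ROUTE_LIFETIME_SECONDS:
            self.route = pick_route(self.relays)
            self.created = self.clock()
        return self.route


manager = RouteManager(RELAYS)
print(manager.current_route())
```

The clock is passed in as a parameter so that the rotation can be tried out without actually waiting ten minutes.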
3. The Tor-Network
For those concerned about the privacy of their digital communications in times of large-scale surveillance, the Tor network promises strong obfuscation. The following section explains which content can be found on websites hidden in the dark web, how the multi-layered encryption works in detail, and what kind of anonymity it actually offers.
Most of the content in relation to the darknet involves nefarious or illegal activity. With the provided possibility of anonymity, there are many criminals trying to take advantage of it. This results in a large volume of darknet sites revolving around drugs, darknet markets (sites for the purchase and sale of services and goods), and fraud. Some examples found within minutes using the Tor browser are listed in the following:
- Drug or other illegal substance dealers: Darknet markets (black markets) allow the anonymous purchase and sale of drugs and other illegal or controlled substances such as pharmaceuticals. Almost everything can be found here, quite simply in exchange for bitcoins.
- Hackers: Individuals or groups looking for ways to bypass and exploit security measures for their personal benefit or out of anger at a company or action (Krebs On Security 2016) communicate and collaborate with other hackers in forums, share exploits (ways of using a bug or vulnerability to gain access to software, hardware, data, etc.) and brag about attacks. Some hackers offer their services in exchange for bitcoins.
- Terrorist organizations use the network for anonymous Internet access, recruitment, information exchange and organisation (What is the darknet?).
- Counterfeiters: Offer document forgeries and currency imitations via the darknet.
- Merchants of stolen information: Credit card numbers and other personally identifiable information can be accessed and ordered for theft and fraud activities.
- Weapon dealers: Some dark markets allow the anonymous, illegal purchase and sale of weapons.
- Gamblers play or connect in the darknet to bypass their local gambling laws.
- Murderers/assassins: Despite ongoing discussions about whether these services are real, created by law enforcement or just fictitious websites, some dark websites exist that offer murder for hire.
- Providers of illegal explicit material e.g. child pornography: We will not go into detail here.
But the same anonymity also offers a bright side: freedom of expression. It offers the ability to speak freely without fear of persecution in countries where this is not a fundamental right. According to the Tor Project, hidden services allowed regime dissidents in Lebanon, Mauritania and during the Arab Spring to host blogs in countries where the exchange of those ideas would be punished (Meet Darknet 2013). Some other use cases are:
- To use it as a censorship circumvention tool, to reach otherwise blocked content (in countries without free access to information)
- Socially sensitive communication: Chat rooms and web forums where rape and abuse survivors or people with illnesses can communicate freely, without being afraid of being judged.
A further example of that is the New Yorker’s Strongbox, which allows whistleblowers to upload documents and offers a way to communicate anonymously with the magazine (Meet Darknet 2013).
3.2 Accessing the Network
The hidden sites of the dark web can be accessed via special onion domains. These addresses are not part of the normal DNS, but can be interpreted by the Tor Browser if they are sent into the network through a proxy (Interaction with Tor 2018). In order to create an onion domain, a Tor daemon first creates an RSA key pair, calculates the SHA-1 hash over the generated public RSA key, shortens it to 80 bits, and encodes the result into a 16-character base32 string (e.g. expyuzz4wqqyqhjn) (Interaction with Tor 2018). Because onion domains are directly derived from their public key, they are self-certifying. This implies that a user who knows a domain automatically knows the corresponding public key. Unfortunately, onion domains are therefore difficult to read, write, or remember. In February 2018, the Tor Project introduced the next generation of onion domains, which are now 56 characters long, use a base32 encoding of the public key, and include a checksum and a version number (Interaction with Tor 2018). The new onion services also use elliptic curve cryptography, so that the entire public key can now be embedded in the domain, whereas previous versions could only embed its hash. These changes enhanced the security of onion services, but the long and unreadable domain names impaired usability once more (Interaction with Tor 2018). It is therefore a common procedure to repeatedly generate RSA keys until the domain happens to contain the desired string (e.g. facebook). Examples of such vanity onion domains are those of Facebook (facebookcorewwwi.onion) and the New York Times (nytimes3xbfgragh.onion) (Interaction with Tor 2018). In contrast to the rest of the World Wide Web, where navigation is primarily done via search engines, the darknet often contains pages with lists of these domains for further navigation. The darknet deliberately tries to hide from the eyes of the searchable web (Meet Darknet 2013).
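The derivation of the older 16-character domains can be illustrated with Python's standard library. The following is a minimal sketch, not the Tor implementation: the function names are made up for illustration, and random bytes stand in for a properly encoded RSA public key, since generating real RSA keys would require a third-party library. Only the hash, truncate and encode steps follow the description above.

```python
import base64
import hashlib
import os


def v2_onion_address(public_key_bytes: bytes) -> str:
    """SHA-1 over the public key, keep the first 80 bits, base32-encode."""
    digest = hashlib.sha1(public_key_bytes).digest()
    truncated = digest[:10]  # 10 bytes = 80 bits
    return base64.b32encode(truncated).decode("ascii").lower() + ".onion"


def find_vanity_address(prefix: str, max_tries: int = 200_000):
    """Try random stand-in keys until the address starts with `prefix`."""
    for _ in range(max_tries):
        candidate_key = os.urandom(128)  # stand-in for a freshly generated RSA key
        address = v2_onion_address(candidate_key)
        if address.startswith(prefix):
            return candidate_key, address
    return None  # gave up after max_tries attempts


example = v2_onion_address(b"stand-in for an encoded RSA public key")
print(example, len(example.split(".")[0]))  # the name part has 16 characters

result = find_vanity_address("ab")
print(result[1] if result else "no match found")
```

Every additional character in the desired prefix multiplies the expected number of attempts by 32, which is why vanity domains usually fix only the first few characters.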
3.3 Onion Routing – How Does Tor Work?
So how exactly does the anonymizing encryption technology behind onion routing work? As said before, the Tor Browser chooses an encrypted path through the network and builds a circuit in which each onion router only knows its predecessor and its successor, but no other nodes in the circuit. For this purpose, Tor uses the Diffie-Hellman algorithm to generate keys between the user and the different onion routers in the network (How does Tor work 2018). The algorithm is one application of public-key cryptography, which makes use of a pair of mathematically linked keys:
- A public-key — public and visible to others
- A private-key — private and kept secret
In general, the public key can be used to encrypt messages, and the private key is in turn used to decrypt the encrypted content. This implies that anyone is able to encrypt content for a specific recipient, but this recipient alone can decrypt it again (How does Tor work 2018).
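For Diffie-Hellman specifically, the key pair is not used to encrypt messages directly. Instead, each side combines its own private key with the public key of the other side, and both arrive at the same shared secret. The sketch below shows this with deliberately tiny toy parameters (the prime 23 and the base 5 are chosen for illustration only and are far too small for real use); it is not the key exchange code used by Tor.

```python
import secrets

# Toy public parameters, assumed for illustration only.
P = 23  # small prime modulus
G = 5   # public base


def generate_keypair():
    """Return (private_key, public_key) for the toy parameters above."""
    private_key = secrets.randbelow(P - 2) + 1  # random value in 1..P-2
    public_key = pow(G, private_key, P)         # G^private mod P
    return private_key, public_key


def shared_secret(own_private, other_public):
    """Combine the own private key with the other side's public key."""
    return pow(other_public, own_private, P)


client_private, client_public = generate_keypair()
relay_private, relay_public = generate_keypair()

# Only the public keys are exchanged; both sides compute the same secret.
client_secret = shared_secret(client_private, relay_public)
relay_secret = shared_secret(relay_private, client_public)
print(client_secret == relay_secret)
```

An eavesdropper sees only the two public keys and cannot easily compute the shared secret when realistically large parameters are used.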
Tor uses 3 nodes by default, so 3 layers of encryption are required to encrypt a message (How does Tor work 2018). It is important to note that every single Tor packet (called a cell) is exactly 512 bytes in size. This is done so that attackers cannot guess which cells belong to larger content such as images or media (How does Tor work 2018). At every node the message reaches, one layer of encryption is removed, revealing the position of the next node in the circuit. As a result, no single node in the circuit knows both where the message originated and where its final destination is (How does Tor work 2018). A simplified visualization of this procedure can be seen in the picture below.
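The layering can also be sketched in code. The example below is a toy model under loud assumptions: the XOR keystream cipher built from SHA-256 is a stand-in chosen so that the example runs with the standard library only (it is not the cipher Tor uses), the per-node keys are simply given instead of being negotiated, and the fixed cell size is ignored. It only illustrates that each node can remove exactly one layer and learns nothing but the next hop.

```python
import hashlib
import json


def keystream(key: bytes, length: int) -> bytes:
    """Derive a pseudo-random byte stream from the key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XOR with the keystream (same operation both ways)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(message: bytes, destination: str, route) -> bytes:
    """Build the onion; `route` is [(node_name, key), ...] from entry to exit."""
    next_hops = [name for name, _ in route[1:]] + [destination]
    payload = message
    # Start with the innermost layer (exit node) and work outwards.
    for (_, key), next_hop in zip(reversed(route), reversed(next_hops)):
        layer = json.dumps({"next": next_hop, "data": payload.hex()}).encode()
        payload = xor_cipher(key, layer)
    return payload


def peel(key: bytes, onion: bytes):
    """Remove one layer; returns the next hop and the remaining payload."""
    layer = json.loads(xor_cipher(key, onion))
    return layer["next"], bytes.fromhex(layer["data"])


route = [("entry", b"key-1"), ("middle", b"key-2"), ("exit", b"key-3")]
onion = wrap(b"GET / HTTP/1.1", "example.com", route)

for name, key in route:
    next_hop, onion = peel(key, onion)
    print(f"{name} forwards to {next_hop}")
print(onion)  # the exit node ends up with the original message
```

The entry node only sees that it has to forward to the middle node, and only the exit node sees the destination and the original message.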
But how does the network allow different users to connect without knowing each other's network identity? The answer lies in onion services (formerly known as hidden services) and their so-called "rendezvous points" (Onion Service Protocol 2019). The following steps are mainly extracted and summarized from the official Tor documentation on the onion service protocol (Onion Service Protocol 2019) and describe the technical details of how this is made possible:
Step 1: Before a client is able to contact an onion service in the network, the service needs to advertise its existence. To do so, the service randomly selects relays in the network and asks them to act as introduction points by sending them its public key. The picture below shows these circuit connections in the first step as green lines. It is important to mention that these lines mark Tor circuits and not direct connections. The full three-step circuit makes it hard to associate an introduction point with the IP address of an onion server: even though the introduction point is aware of the onion server's identity (public key), it never learns the onion server's location (IP address) (Onion Service Protocol 2019).
Step 2: The service creates a so-called onion service descriptor that contains its public key and a summary of each introduction point (Onion Service Protocol 2019). This descriptor is signed with the private key of the service and then uploaded to a distributed hash table in the network. If a client requests an onion domain as described in section 3.2 (Accessing the Network), the respective descriptor is found. If e.g. "abc.onion" is requested, "abc" is a 16 or 56 character string derived from the service's public key, as seen in the picture below.
Step 3: When a client contacts an onion service, it initiates the connection by downloading the descriptor from the distributed hash table as described before. If a descriptor exists for the address abc.onion, the client receives the set of introduction points and the respective public key. This action can be seen in the picture below. At the same time, the client establishes a circuit to another randomly selected node in the network and asks it to act as a rendezvous point by submitting a one-time secret key (Onion Service Protocol 2019).
Step 4: Now the client creates a so-called introduce message (encrypted with the public key of the onion service), containing the address of the rendezvous point and the one-time secret key. This message is sent to one of the introduction points, requesting the onion service as its final target. Because this communication again runs over a Tor circuit, it is not possible to uncover the client's IP address and thus its identity.
Step 5: At this point, the onion service decrypts the introduce message, including the address of the rendezvous point and the one-time secret key. The service is then able to establish a circuit to the now revealed rendezvous point and sends the one-time secret to this node in a rendezvous message. In doing so, the service sticks to the same set of entry guards when creating new circuits (Onion Service Protocol 2019). Otherwise, an attacker could run his own relay and force the onion service to create an arbitrary number of circuits, in the hope that the corrupt relay is eventually selected as the entry node. This attack scenario, which is able to break the anonymity of onion services, was described by Øverlier and Syverson in their paper (Locating Hidden Servers 2006).
Step 6: As seen in the last picture below, the rendezvous point informs the client about the successfully established connection. Afterwards, both the client and the onion service are able to use their circuits to the rendezvous point to communicate. The (end-to-end encrypted) messages are forwarded through the rendezvous point from the client to the service and vice versa (Onion Service Protocol 2019). The initial introduction circuit is never used for the actual communication, mainly for one important reason: a relay should not be attributable to a particular onion service. The rendezvous point is therefore never aware of the identity of any onion service (Onion Service Protocol 2019). Altogether, the complete connection between client and onion service consists of six nodes: three selected by the client, the third of which is the rendezvous point, and three selected by the service.
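The six steps can be condensed into a small simulation. This is a heavily simplified sketch with hypothetical class and function names: a plain dictionary stands in for the distributed hash table, the public key is a placeholder string, and all Tor circuits, signatures and encryption are left out. It only shows who hands which piece of information to whom.

```python
import hmac
import random
import secrets

RELAYS = [f"relay{i}" for i in range(10)]  # hypothetical relay names
hash_table = {}                            # stands in for the distributed hash table


class RendezvousPoint:
    def __init__(self, name):
        self.name = name
        self.expected_secret = None
        self.connected = False

    def register(self, one_time_secret):
        # Step 3: the client leaves its one-time secret here.
        self.expected_secret = one_time_secret

    def join(self, presented_secret):
        # Step 6: link both circuits only if the secrets match.
        if self.expected_secret is not None and hmac.compare_digest(
            self.expected_secret, presented_secret
        ):
            self.connected = True
        return self.connected


class OnionService:
    def __init__(self, address):
        self.address = address
        self.public_key = f"public-key-of-{address}"  # placeholder, not a real key
        # Step 1: choose introduction points among the relays.
        self.intro_points = random.sample(RELAYS, 3)

    def publish_descriptor(self):
        # Step 2: upload the descriptor (signature omitted in this sketch).
        hash_table[self.address] = {
            "public_key": self.public_key,
            "intro_points": list(self.intro_points),
        }

    def handle_introduce(self, message):
        # Step 5: go to the rendezvous point and present the one-time secret.
        return message["rendezvous_point"].join(message["one_time_secret"])


def connect(address, service):
    # Step 3: fetch the descriptor and set up a rendezvous point.
    descriptor = hash_table.get(address)
    if descriptor is None:
        return False
    rendezvous = RendezvousPoint(random.choice(RELAYS))
    one_time_secret = secrets.token_bytes(20)
    rendezvous.register(one_time_secret)
    # Step 4: send the introduce message via one of the introduction points.
    introduce_message = {
        "via": random.choice(descriptor["intro_points"]),
        "rendezvous_point": rendezvous,
        "one_time_secret": one_time_secret,
    }
    # Steps 5 and 6: the service answers at the rendezvous point.
    return service.handle_introduce(introduce_message)


service = OnionService("abc.onion")
service.publish_descriptor()
print(connect("abc.onion", service))
```

In this sketch, the client never needs the location of the service, and the service never needs the location of the client: both only talk to the rendezvous point.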
4. Conclusion – Weaknesses
Contrary to what many people believe (How does Tor work 2018), Tor is not a completely decentralized peer-to-peer system. If it were, it would not be very useful, as the system requires a number of directory servers that continuously manage and maintain the state of the network.
Furthermore, Tor is not secured against end-to-end attacks. While it does provide protection against traffic analysis, it cannot and does not attempt to protect against monitoring of traffic at the boundaries of the Tor network (the traffic entering and exiting the network), which is a problem that cyber security experts have not been able to solve yet (How does Tor work 2018). Researchers from the University of Michigan even developed a network scanner that allows the identification of 86% of the live Tor "bridges" worldwide with a single scan (Zmap Scan 2013). Another disadvantage of Tor is its speed: because the data packets are sent through a number of randomly chosen nodes, each of which could be anywhere in the world, using Tor is very slow. Despite its weaknesses, the Tor Browser is an effective, powerful tool for the protection of the user's privacy online, but it is good to keep in mind that a Virtual Private Network (VPN) can also provide security and anonymity without the significant speed decrease of the Tor Browser (Tor or VPN 2019). If total obfuscation and anonymity are decisive regardless of performance, a combination of both is recommended.
Hidden Internet, Manu Mathur, Exploring the Hidden Internet – The Deep Web [Online]
Available at: https://whereispillmythoughts.com/exploring-hidden-internet-deep-web/
[Accessed 27 August 2019].
Search Engines, Julia Sowells, Top 10 Deep Web Search Engines of 2017 [Online]
Available at: https://hackercombat.com/the-best-10-deep-web-search-engines-of-2017/
[Accessed 24 July 2019].
Krebs On Security, Brian Krebs, Krebs on Security: Rise of Darknet Stokes Fear of The Insider [Online]
Available at: https://krebsonsecurity.com/2016/06/rise-of-darknet-stokes-fear-of-the-insider/
[Accessed 14 August 2019].
Anonymous Connections, Michael G. Reed, Paul F. Syverson, and David M. Goldschlag, Naval Research Laboratory, Anonymous Connections and Onion Routing [Online]
Available at: https://www.onion-router.net/Publications/JSAC-1998.pdf
[Accessed 18 August 2019].
Onion Pre Alpha, Roger Dingledine, pre-alpha: run an onion proxy now! [Online]
Available at: https://archives.seul.org/or/dev/Sep-2002/msg00019.html
[Accessed 18 August 2019].
Tor Browser, Heise Download, Tor Browser 8.5.4 [Online]
Available at: https://www.heise.de/download/product/tor-browser-40042
[Accessed 29 August 2019].
Interaction with Tor, Philipp Winter, Anne Edmundson, Laura M. Roberts, Agnieszka Dutkowska-Zuk, Marshini Chetty, Nick Feamster, How Do Tor Users Interact With Onion Services? [Online]
Available at: https://arxiv.org/pdf/1806.11278.pdf
[Accessed 16 August 2019].
What is the darknet?, Darkowl, What is THE DARKNET? [Online]
Available at: https://www.darkowl.com/what-is-the-darknet/
[Accessed 22 August 2019].
Meet Darknet, PCWorld: Brad Chacos, Meet Darknet, the hidden, anonymous underbelly of the searchable Web [Online]
Available at: https://www.pcworld.com/article/2046227/meet-darknet-the-hidden-anonymous-underbelly-of-the-searchable-web.html
[Accessed 23 August 2019].
Onion Service Protocol, Tor Documentation, Tor: Onion Service Protocol [Online]
Available at: https://2019.www.torproject.org/docs/onion-services
[Accessed 8 July 2019].
How does Tor work, Brandon Skerritt, How does Tor *really* work? [Online]
Available at: https://hackernoon.com/how-does-tor-really-work-c3242844e11f
[Accessed 8 July 2019].
Locating Hidden Servers, Lasse Øverlier, Paul Syverson, Locating Hidden Servers [Online]
Available at: https://www.onion-router.net/Publications/locating-hidden-servers.pdf
[Accessed 8 August 2019].
Zmap Scan, Peter Judge, Zmap’s Fast Internet Scan Tool Could Spread Zero Days In Minutes [Online]
Available at: https://www.silicon.co.uk/workspace/zmap-internet-scan-zero-day-125374
[Accessed 21 August 2019].
Tor or VPN, Bill Man, Tor or VPN – Which is Best for Security, Privacy & Anonymity? [Online]
Available at: https://blokt.com/guides/tor-vs-vpn
[Accessed 8 August 2019].