Hyperlinks: Protecting Yourself from Scammers

Networking Attack Vectors and Hyperlink Exploits: Understanding these can go a long way in preventing you from falling prey to scams. Learn to handle your applications and hyperlinks like a pro.

When it comes to protecting yourself from scammers, it’s important to understand the networking attack vectors they may use. One common way scammers try to exploit their victims is through hyperlinks. There is a low probability of being exploited by a hyperlink that executes malicious code on an intermediate service, but it’s still important to be cautious. Simply receiving a message has proven to be risky, with some exploits completely taking over the target device. I mean receiving a message literally, not in some strange technical sense; people have been hacked just by receiving text messages.

It is very important to keep your applications up-to-date with security patches and to ensure that you have continued security support. Applications here means any logic: your operating system, firmware, hardware, or any apps you have installed.

Hyperlinks can serve a variety of purposes. They can enrich text with a call to action or link (and resolve) to content on the internet. While these are simple use cases, there are many potential options and mechanisms available. Dangerous hyperlinks are those that are intended to harm; such a link can be as simple as a tailored message promoting overpriced products.

There are other, more complex attack vectors. For example, when a website is opened in a web browser, the page is given control to execute scripts and make HTTP requests. This allows a malicious website to make requests to third-party origins. While most requests to third parties cause no harm, it is the responsibility of the owner of the targeted website to implement appropriate security mechanisms to prevent cross-site request forgery.
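One such server-side mechanism can be sketched in Python. This is a minimal sketch, not a complete defence: the helper name is_cross_site_request and the TRUSTED_ORIGINS set are assumptions made for illustration. It relies on the Sec-Fetch-Site and Origin request headers that current browsers attach; older clients may send neither.

```python
# Hypothetical helper: decide whether a state-changing request came from another site.
# TRUSTED_ORIGINS is an assumption; use the origin your own pages are served from.
TRUSTED_ORIGINS = {"http://localhost:8080"}


def is_cross_site_request(headers: dict) -> bool:
    """Return True when the request headers say another origin initiated the request."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    fetch_site = lowered.get("sec-fetch-site")
    if fetch_site is not None:
        # "same-origin": sent by our own pages.
        # "none": user-initiated, such as a typed URL or a bookmark.
        # "same-site" and "cross-site" are both treated as foreign in this sketch.
        return fetch_site not in ("same-origin", "none")

    origin = lowered.get("origin")
    if origin is not None:
        return origin not in TRUSTED_ORIGINS

    # Neither header is present, so the sketch cannot tell and lets the request through.
    return False
```

A handler for a form submission or API call could refuse the request when this returns True. Treating "same-site" as foreign is a design choice: two applications on different ports of one host are separate origins even though they share a site.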

Example code showing this behaviour, written in Python, is included below. To run it, start both servers (for example, python server.py & python client.py). Then open http://localhost:8080, a web page served by server.py that sets two cookies. Next, open http://localhost:8081, the page served by client.py, which makes requests to the server from within the browser’s sandbox; those requests carry the cookies set by the server.

The first-party authority can restrict its cookies by adding attributes when it sets them, for example SameSite=Strict. A more restrictive example is as follows:

Set-Cookie: __example-name=example-value; SameSite=Strict; Secure; Path=/login

SameSite=Strict stops the browser attaching the cookie to cross-site requests, including those made from another site’s page during sandbox execution. The Secure attribute restricts the cookie to HTTPS connections, and the Path attribute restricts it to requests for a particular path of the website (/login here).
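The same header can be produced with Python’s standard http.cookies module, which the example servers already use. This is a minimal sketch: the function name build_login_cookie is an assumption, and the samesite attribute needs Python 3.8 or newer.

```python
from http import cookies


def build_login_cookie(name: str, value: str) -> str:
    """Return the value of a Set-Cookie header with restrictive attributes."""
    cookie = cookies.SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    morsel["samesite"] = "Strict"  # never attach to cross-site requests
    morsel["secure"] = True        # only send over HTTPS
    morsel["path"] = "/login"      # only send for URLs under /login
    return morsel.OutputString()


header_value = build_login_cookie("__example-name", "example-value")
print("Set-Cookie: " + header_value)
```

In a BaseHTTPRequestHandler, the returned string would be passed to send_header("Set-Cookie", header_value) before end_headers() is called.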

# client.py: the page on port 8081 that makes requests to the server on port 8080
from http.server import BaseHTTPRequestHandler, HTTPServer

hostName = "localhost"
serverPort = 8081

# The page makes two requests to http://localhost:8080:
#   - a fetch() call that explicitly includes credentials
#   - an <img> element, which sends cookies by default
# The browser sends both requests even if CORS later stops the script reading the response.
PAGE = (
    "<html><head></head><body>"
    "<script>const run = async () => {"
    "const response = await fetch('http://localhost:8080', {credentials: 'include'}); "
    "const text = await response.text(); debugger}; run()</script>"
    "<img src=\"http://localhost:8080\" />"
    "</body></html>"
)


class MyServer(BaseHTTPRequestHandler):
  def do_GET(self):
    self.send_response(200)
    self.send_header("Content-type", "text/html")
    self.end_headers()
    self.wfile.write(bytes(PAGE, "utf-8"))


if __name__ == "__main__":
  webServer = HTTPServer((hostName, serverPort), MyServer)
  print("Server started http://%s:%s" % (hostName, serverPort))
  try:
    webServer.serve_forever()
  except KeyboardInterrupt:
    pass
  webServer.server_close()
  print("Server stopped.")

# server.py: the first-party site on port 8080 that sets and receives the cookies
from http.server import BaseHTTPRequestHandler, HTTPServer
from http import cookies

hostName = "localhost"
serverPort = 8080


class MyServer(BaseHTTPRequestHandler):
  def do_GET(self):
    self.send_response(200)
    self.send_header("Content-type", "text/html")

    # Log the requested path and any cookies the browser attached to this request
    print("hello: " + self.path)
    print(self.headers.get('Cookie'))

    # Set two cookies with no SameSite, Secure, or Path restrictions.
    # Cookies are scoped by host, not port, so the browser also attaches them
    # to requests made from the page served on port 8081.
    cookie = cookies.SimpleCookie()
    cookie['a_cookie'] = "Cookie_Value"
    cookie['b_cookie'] = "Cookie_Value2"
    for morsel in cookie.values():
      self.send_header("Set-Cookie", morsel.OutputString())

    self.send_header("Access-Control-Allow-Origin", "*")
    self.end_headers()
    self.wfile.write(bytes("<html><head><title>Test Server</title></head><body></body></html>", "utf-8"))


if __name__ == "__main__":
  webServer = HTTPServer((hostName, serverPort), MyServer)
  print("Server started http://%s:%s" % (hostName, serverPort))
  try:
    webServer.serve_forever()
  except KeyboardInterrupt:
    pass
  webServer.server_close()
  print("Server stopped.")

As of July 2024, by default, third-party websites can still send fully authenticated requests to arbitrary third parties using the sandbox execution request lifecycle. Mechanisms like Cross-Origin Resource Sharing (CORS) and SameSite cookies have been added to address parts of this issue by allowing the target resource to specify whether it permits the request origin to make requests. Even so, a fully authenticated request can still be made under the default Lax SameSite cookie attribute. Adding to the complexity, the fetch API defaults to preventing credential leaks, but can include credentials if the target adheres to CORS. On the other hand, source requests (such as those from an <img> element) do make authenticated requests by default.

Update 14/Aug/2024: Google are currently leading improvements to implementations with Private Network Access Secure Contexts, the Chrome 0.0.0.0 bug, the Safari WebKit 0.0.0.0 bug, and the Firefox 0.0.0.0 bug. However, all current solutions put control of access in the hands of the application servers.
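To illustrate the server’s side of CORS for credentialed requests, here is a minimal sketch. The function name cors_headers_for and the ALLOWED_ORIGINS set are assumptions for illustration. Browsers refuse to expose a credentialed response when Access-Control-Allow-Origin is the wildcard *, so the sketch echoes back a specific, allowed origin instead. Note that these headers only control whether the calling page may read the response; for simple requests, the request itself has already been sent.

```python
# Assumption: the only origin allowed to read credentialed responses.
ALLOWED_ORIGINS = {"http://localhost:8081"}


def cors_headers_for(request_origin):
    """Return CORS response headers for a credentialed request, or {} to deny."""
    if request_origin not in ALLOWED_ORIGINS:
        # No CORS headers: the browser will not let the calling page read the response.
        return {}
    return {
        # Must name one origin; "*" is rejected when credentials are included.
        "Access-Control-Allow-Origin": request_origin,
        "Access-Control-Allow-Credentials": "true",
        # Tell caches that the response differs per requesting origin.
        "Vary": "Origin",
    }
```

A handler would read the Origin request header, call this function, and pass each returned pair to send_header.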

What are hyperlinks?

A hyperlink is an element of a user interface that is enriched by a link to a URL. There are multiple types of hyperlinks in web browsers and HTML clients. Email clients support fewer types of elements and now have some controls, but both contain the most common ones.

A dictionary definition of a hyperlink goes:

an electronic link providing direct access from one distinctively marked place in a hypertext or hypermedia document to another in the same or a different document ~ Merriam-Webster

Neither of the elements described below shows the URL that will be resolved without some interaction on the user’s part. One thing to notice about many hyperlinks is that the browser provides an interface where you can see more information.

For example, right-click an anchor (your traditional hyperlink) or an image to see custom options.

Common elements (written in HTML code) of types of hyperlinks:

The image element loads and displays an image in the client. Images may be invisible, with the purpose being to track you instead of showing a picture. Some links, like an image element, resolve the URL before any user interaction. With web browsers, the HTTP response for a web page document controls the policies that allow a page to make further cross-origin requests. This works by marking HTTP responses with headers.

<img
  class="fit-picture"
  src="/media/cc0-images/grapefruit-slice-332-332.jpg"
  alt="Grapefruit slice atop a pile of other slices" />

The anchor element is your typical call to action, which, when clicked, navigates to a URL. It may also open an application that has been registered as the default for the scheme of the URL. The URL is specified in the href attribute of the anchor tag, for example href="https://linkhere.com".

<a href="https://example.com">Website</a>

The website can also declare whether the text content within its window should be parsed and enriched with hyperlinks when a format has been detected. This means that any text that starts with the https:// prefix will automatically link to the resource that it specifies. It also works with other formats, so there is no need to prefix phone numbers and email addresses with the tel: and mailto: schemes. With format detection, any phone number or email address also gains a browser-controlled right-click interface automatically.

What is a URL?

In short, a URL identifies a resource.

In a typical URL, there are a few components to be aware of. Let’s take a look at the example link: https://marketeer.snowdon.dev.
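To see the components for yourself, Python’s standard urllib.parse module can split a URL apart. This is a small sketch; the function name describe_url and the second, longer URL are assumptions for illustration.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Break a URL into the components discussed in this section."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # how the resource is acted upon
        "host": parts.hostname,      # the domain part of the authority
        "port": parts.port,          # None when the URL does not state a port
        "path": parts.path,
        "query": parts.query,        # the arguments
        "fragment": parts.fragment,
    }


simple = describe_url("https://marketeer.snowdon.dev")
# Hypothetical URL showing every component at once.
full = describe_url("https://marketeer.snowdon.dev:443/page?x=1#top")
print(simple)
print(full)
```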

The address of a URL

The authority or full domain is an important part of the URL. It is the part that acts as the equivalent of a physical address for a postal service.

How the URL is resolved

The scheme is also important, as it’s the part of the URL that indicates how the identified resource will be acted upon. When a URL is executed, control is given to the owner of the registrable domain (the eTLD+1) to resolve the resource specified by the URL. In the case of an HTTPS request for a website, the contents are displayed in the browser.

Using URLs in applications

Not all parts of a URL are required. For example, in a web browser, the only required part is the full domain (authority); by default, the request will use the HTTPS protocol and port number 443. This turns a request for marketeer.snowdon.dev into https://marketeer.snowdon.dev:443.

The authority is also not the full resource locator that clients (the navigator in the case of a web browser) will use to scope their interaction. The full origin of a URL includes a port number that specifies an application on an authority’s application server. Many people overlook the port number because web browsers automatically connect to the default port (443) if it’s not explicitly specified. However, when determining the scope of a website’s origin, the browser must include the port number. This is necessary because a single application server can host multiple HTTP server applications. This means that https://marketeer.snowdon.dev:445 and https://marketeer.snowdon.dev:446 are separate origins. The term origin therefore covers the scheme (protocol), the domain, and the port.
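The origin comparison described above can be sketched in a few lines of Python. The names origin_of and same_origin are assumptions for illustration, and the sketch only knows the default ports for http and https.

```python
from urllib.parse import urlsplit

# Default ports a browser assumes when the URL does not state one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str) -> tuple:
    """Return the (scheme, host, port) triple that makes up an origin."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(first_url: str, second_url: str) -> bool:
    """Two URLs share an origin only if scheme, host, and port all match."""
    return origin_of(first_url) == origin_of(second_url)
```

With this, https://marketeer.snowdon.dev and https://marketeer.snowdon.dev:443 compare as the same origin, while the :445 and :446 URLs do not.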

Arguments or resources of an origin

URLs also have directories, pages, and arguments. Take a look at the URL and notice that one URL takes another as an argument. The first authority domain is vscode.dev and the argument is my CV. If you open that link and have authorization, you can open the source files of my CV in the vscode.dev editor application.
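A quick way to spot a URL hiding inside another URL is to inspect the query arguments. The sketch below is an assumption-laden illustration: the function name nested_urls and the editor.example and files.example hosts are hypothetical, and it only looks at query parameters, not at URLs embedded in the path.

```python
from urllib.parse import parse_qsl, urlsplit


def nested_urls(url: str) -> list:
    """Return (parameter, value) pairs whose value is itself an http(s) URL."""
    found = []
    # parse_qsl decodes percent-encoding, so %3A%2F%2F becomes ://
    for name, value in parse_qsl(urlsplit(url).query):
        inner = urlsplit(value)
        if inner.scheme in ("http", "https") and inner.hostname:
            found.append((name, value))
    return found


# Hypothetical link: the outer authority is editor.example, the argument points elsewhere.
example = "https://editor.example/open?source=https%3A%2F%2Ffiles.example%2Fcv.txt&theme=dark"
print(nested_urls(example))
```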

As an example of why it is important to trust the authority domain: URLs (with HTTPS) are used to request a resource, but the response may also redirect to another URL. Many operating systems and browsers offer no notification when a website has directed the browser’s window away from a first-party (first-authority) domain. This mechanic is used in many legitimate application flows, sometimes with authorization. Redirects may even be transparent to the primary user flow, redirecting back to an expected page without the user being aware.

Browser windows can also share hyperlinks in a browser-controlled interface, in contrast to opening a browser-controlled interface for a hyperlink. The navigator (browser) will allow the prompt given transient activation, that is, a user-triggered event within the origin. This means that websites can use JavaScript to invoke the share menu for any URL. It’s important to note that share prompt dialogues do not provide a completely browser-controlled interface, as the website has the ability to set some of the contents.
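Whether a redirect leaves the authority you started on can be checked by resolving the Location header against the current URL. This is a minimal sketch; the function name redirect_changes_host and the shop.example and other.example hosts are assumptions for illustration.

```python
from urllib.parse import urljoin, urlsplit


def redirect_changes_host(current_url: str, location_header: str) -> bool:
    """Return True when following the redirect would leave the current host."""
    # Location may be relative ("/thanks") or scheme-relative ("//other.example/x"),
    # so resolve it against the URL that produced the redirect first.
    target = urljoin(current_url, location_header)
    return urlsplit(target).hostname != urlsplit(current_url).hostname
```

A relative Location such as /thanks stays on the same host, while an absolute or scheme-relative one can silently move the window to another authority.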

// navigator.share must be called in response to a user action (transient activation)
const webPageUrlShareData = {
  title: 'anything',
  text: 'anything',
  url: 'https://anything.com'
};
navigator.share(webPageUrlShareData);

When a hyperlink to a website is executed, there are many attack surfaces to be aware of. First, the type of execution determines the attack surface: have you clicked a hyperlink and navigated to the website it serves via HTTPS, or has the link opened an FTP client instead? When a page loads, it can request other assets, like an image, which are loaded and embedded on the page by either a first-party or third-party authority.

This area has been problematic in the past because browser controls have to balance protections for the website, any third parties, and the user. To take an example of the image element, by default, a first party can’t read the contents of a third party’s website but can embed them. That makes sense; you want to be able to include images on a site without the site being able to modify them.

However, since HTTP responses can set cookies that are sent back on later requests, they can also be used to track users across websites. Many browsers are now implementing features to prevent tracking across the web, but since every request must still be made over TCP, which exposes your IP address, a complete out-of-the-box default implementation for the internet remains something for the future.

When loading a website, there are multiple important stages in its life cycle:

  • DNS (Domain Name System) resolution
  • Initial network request to a website server
  • Sandbox execution (HTML, ECMAScript, WebAssembly)
  • Escalations of information permission responsibilities

Still, using a browser is much safer than downloading desktop applications. Web browsers run any code within a sandbox, similar to more modern mobile operating systems.

See Apple support article: Prevent cross-site tracking in Safari on Mac

Most scamming techniques involve tricking a user into believing false information. Operating within the capabilities of the browser’s sandbox, scammers must find ways to exploit your behaviour.

Security issues where criminals can successfully leverage remote code execution exploits are rare. The browser is at least reasonably secure when the latest software updates are applied.

On an iPhone, long-touching a hyperlink may open the website in a preview. Any such window opens you up to all the attack surfaces. In fact, tools exist that can remotely control a browser window owned by a website.



Let’s look at the process of ascertaining if a link is from the purported source. First, you must have downloaded the link from a source. Check if your current context is valid. Second, you must validate any further links (calls to action).

I’m not currently aware of an acceptable way to confirm a link in many contexts on the iPhone. If you touch a hyperlink and drag, you get a preview of the link, which displays the target URL. However, this leads to the risk of making a long-press gesture and opening a preview window by accident, or tapping and opening the hyperlink.

It is worth noting that your computer software and any extensions installed on an application (like Chrome extensions or Safari extensions) may also compromise the web browser window. If you have any installed, they too may alter the contents and must be trusted.

Usually, for websites, the protocol is HTTPS or HTTP (its unsecured version). Addressing web pages requires one of these two, but browsers also know how to handle other schemes, such as mailto: (to open a mail client), so don’t be surprised if you see other protocols.

Understanding contact points

Interacting with a hyperlink is a complicated proposition. When an email is downloaded by a mail client and when you open it, both are points where code executes. Interactions happen at all points; for example, when you open an email, it must be displayed, and the client may make a remote network request to display an image. Scrolling content into view in some applications may also trigger a remote network request for the title of a web page.

Reduce information exposure

When making network requests, you may prevent the exposure of your internet protocol address by using a remote machine to proxy those requests. Common tools are available, like trusted VPN software.

Trust your contacts

When you see a hyperlink or something that contains a hyperlink, be cautious. Don’t just click one for the sake of it if you don’t trust the sender. Even websites that are legitimate from a technical perspective may not be from a business or moral one. I have heard reports of websites being used to farm human validation tests in order to bypass security measures on other websites, selling your act of proving you are human as a product to monetise. Often, these enterprises must steal to offer products good enough to attract visitors.

First, validate that the link is to a target URL that you expect and trust. Next, it’s important to validate that the target URL is owned by who you expect. The aim is to view the canonical target URL of any hyperlink and not the text sequence that has been enriched.
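The check on the target URL can be made mechanical once you have a short list of hosts you trust. The sketch below is illustrative only: the function name is_trusted_link, the TRUSTED_HOSTS set, and the evil.example host are assumptions. It compares the whole hostname exactly, because lookalike links often place a trusted name at the start of a longer hostname or in the user-info part before an @ sign.

```python
from urllib.parse import urlsplit

# Assumption: hosts you have already verified and stored.
TRUSTED_HOSTS = {"marketeer.snowdon.dev"}


def is_trusted_link(url: str, trusted_hosts=TRUSTED_HOSTS) -> bool:
    """Return True only for HTTPS URLs whose exact hostname is in the trusted set."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    # hostname is lower-cased by urlsplit and excludes any user-info and port.
    host = (parts.hostname or "").rstrip(".")
    return host in trusted_hosts
```

Exact matching is deliberately strict: it rejects subdomains you have not listed, which is safer than a substring or prefix test.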

On desktop, simply hover over a link with the mouse to see a tooltip in the bottom left, or right-click it.

Validate the information source

When validating a hyperlink or URL, the canonical target URL may only be displayed as metadata hidden behind some interaction. Be careful about which tooltip or source of information you trust, especially any used to show an eTLD.

How are sources valid or not valid?

In an email, the sender may control the text that is enriched as a hyperlink, since they can use a subset of HTML. However, they cannot control the tooltip within an email client.

The parts of a client (email, website, social media feed, messages) that sit outside of its sandbox can still be trusted regardless of the user content within.

In a web browser, for example, the URL bar and the right-click menu always remain under the control of the browser (browser-controlled interface). When interacting with some visual element within the browser’s window, like a hyperlink embedded on a web page, you may be presented with a browser-controlled interface or a website-controlled interface. Sometimes an extra click must be issued to open the browser-controlled right-click interface instead of a webpage-controlled right-click interface.

Websites, on the other hand, may control any element within their window using the full HTML standard. That’s why we must opt to use other methods to view hyperlink metadata.

Reduce the number of risks

Once a hyperlink has been displayed on a device, the risk increases if it is interacted with. More so on a touch-screen device. Simply touching any hyperlink can fully execute a website. Hovering on the desktop works great, but I personally do not like the features here in the iOS world.

  • Plan: Have a public device for reading messages from untrusted sources.
  • Reuse: Store your trusted sources and refer back to them.
  • Filter: Use mail clients with good junk detection.
  • Caution: Approach any untrusted link with caution.
  • Separate: Have separated public contact points (like mine, hello@snowdon.dev).
  • Obscure: Use hidden email addresses for non-public-facing services.
  • Limit: Use private windows to open any less trusted websites.
  • Protect: Hide your IP address using a VPN.

Remember you are human

Humans will make mistakes. No person in the world can tell me they haven’t misclicked a link.

Add further protections

Given the many potential vulnerabilities, it’s important to take steps to protect yourself from scammers. This might include being cautious about clicking on links from unknown sources, reading about issues, using anti-virus software, using only minimal attack surfaces, and preventing any automatic local requests.

How can disabling preview windows help?

On the iPhone or other mobile operating systems, you may disable preview windows to enable a method of viewing the target URL within the context of a browser-controlled user interface (UI). Leaving them on currently carries the risk of accidentally triggering the feature. With them disabled, users may simply long-touch to view the canonical target URL.

What we need is an approved list of websites; however, this has privacy downsides, storage costs, and considerable challenges in managing such a list.

How can disabling automatic remote content loading help?

When some applications see a hyperlink or an image, they will resolve the link to determine information about it. Many applications use this method to display a picture or the title and description. Any network request to a server will expose the IP address of the calling device. In some cases, attacks may be trivial.

For example, an attacker could send an email to a victim, obtain their IP address, and then start a denial-of-service attack that saturates their available network bandwidth, stopping them from accessing the internet.

By disabling the option to load remote content, you can ensure you are not loading anything you didn’t expect. Often, this feature is used with email, where messages aren’t necessarily from trusted sources.
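To get a feel for what a mail client would fetch on your behalf, you can list the remote images in an HTML message before rendering it. This is a minimal sketch using Python’s standard html.parser; the names RemoteImageFinder and remote_images and the tracker.example host are assumptions for illustration, and real messages can trigger requests in other ways (stylesheets, fonts) that this does not cover.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class RemoteImageFinder(HTMLParser):
    """Collect the src of every <img> that would trigger a network request."""

    def __init__(self):
        super().__init__()
        self.remote_sources = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        # http(s) and scheme-relative sources are remote; cid: and data: are not.
        if urlsplit(src).scheme in ("http", "https") or src.startswith("//"):
            self.remote_sources.append(src)


def remote_images(html_text: str) -> list:
    finder = RemoteImageFinder()
    finder.feed(html_text)
    finder.close()
    return finder.remote_sources


# Hypothetical message with one invisible tracking pixel and one embedded image.
message = (
    '<p>Hello</p>'
    '<img src="https://tracker.example/p.gif?id=42" width="1" height="1">'
    '<img src="cid:logo">'
)
print(remote_images(message))
```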

Safety through understanding

Every application can manage its context and how it handles any hyperlinks it receives. To explain how only some contexts are vulnerable, you must first understand how applications commonly handle links. Some applications would simply display text as a sequence of characters, which is perfectly safe. Other applications detect that some sequence of characters is a hyperlink and will enrich the visual element to indicate that it has become a call to action. Herein lies the issue, as such a link may be dangerous.
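The difference between the two behaviours can be sketched as follows. The function names render_plain and render_enriched and the URL pattern are assumptions for illustration; real applications use far more careful detection, and this pattern will, for example, swallow a trailing full stop.

```python
import html
import re

# Simplified pattern: http(s) URLs up to the next whitespace, angle bracket, or quote.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def render_plain(message: str) -> str:
    """Show the message as a sequence of characters: nothing becomes clickable."""
    return html.escape(message)


def render_enriched(message: str) -> str:
    """Detect URLs and turn each one into a clickable anchor element."""
    pieces = []
    position = 0
    for match in URL_PATTERN.finditer(message):
        # Escape the ordinary text between links so it cannot inject markup.
        pieces.append(html.escape(message[position:match.start()]))
        url = html.escape(match.group(0))
        pieces.append('<a href="{0}">{0}</a>'.format(url))
        position = match.end()
    pieces.append(html.escape(message[position:]))
    return "".join(pieces)
```

The plain renderer leaves the reader to copy the address deliberately; the enriched one turns the same text into a call to action that can be triggered with a single tap.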

How does social media stack up?

Some people consider social media websites to be more secure in certain regards. Many don’t allow user control of any logic on users’ devices. From my tests, instead of fetching the website meta information on the client’s end like other applications, popular social media networks now serve that content from their own CDN, which stops the sender of the message from being able to fingerprint the recipient’s network. I have tested this with Facebook and LinkedIn, and they are using the CDN. However, it is worth noting that not all social media applications serve content from their own servers; some serve it from the original source website instead. If you would like a review of all the popular websites, please email me.

Since a website that is receiving HTTP requests can use that information for insights like impressions, hits, and video play time estimates, it is not known which option is best, and it is very much up to the user.

How does Email stack up?

HTML email, in contrast to social media, gives the sender of an email access to a restricted version of HTML. HTML allows control over the enrichment of the content shown to the user. Often, this method has been used to show hyperlinks that purport to link to a legitimate website while actually linking to a website owned by the hacker. Another important point with email clients is that, by default, they allow the sender to invoke network calls as soon as an email has been opened.
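The mismatch between what a link says and where it goes can be detected with a small parser. This is a rough sketch, not a phishing filter: the names LinkAuditor and misleading_links and the bank.example and evil.example hosts are assumptions, and the test for "visible text that looks like an address" is a deliberately crude heuristic.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkAuditor(HTMLParser):
    """Collect (visible text, href) for every anchor element."""

    def __init__(self):
        super().__init__()
        self._href = None
        self._text = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def _host_of(text: str):
    # Visible text may omit the scheme, e.g. "bank.example".
    candidate = text if "://" in text else "https://" + text
    return urlsplit(candidate).hostname


def misleading_links(html_text: str) -> list:
    """Return links whose visible text names a different host than the href."""
    auditor = LinkAuditor()
    auditor.feed(html_text)
    auditor.close()
    flagged = []
    for text, href in auditor.links:
        looks_like_address = "." in text and " " not in text
        if looks_like_address and _host_of(text) != urlsplit(href).hostname:
            flagged.append((text, href))
    return flagged


# Hypothetical email body: the first link shows one host but points to another.
email_body = (
    '<a href="https://evil.example/login">https://bank.example</a> '
    '<a href="https://bank.example/help">Help</a> '
    '<a href="https://bank.example/">bank.example</a>'
)
print(misleading_links(email_body))
```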

How do chat applications stack up?

Many chat applications allow network requests, like social media. But they do not provide a way to control how content is enriched. However, there are many features that are complicated enough to cause issues in the future. I’m looking at you, Emoji.

For example, a user can send another user a URL, which is then parsed on the receiving user’s side. The receiving application uses information from Open Graph and HTML metadata to create a card relevant to the URL.
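How such a card might be assembled is sketched below, working on HTML that has already been downloaded. The names OpenGraphParser and link_card and the sample page are assumptions for illustration; the og: properties are read from meta elements, with the page title as a fallback.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect Open Graph meta properties and the <title> text from a page."""

    def __init__(self):
        super().__init__()
        self.og = {}
        self.page_title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta":
            prop = attributes.get("property") or ""
            content = attributes.get("content")
            if prop.startswith("og:") and content is not None:
                # Keep the first value seen for each property, e.g. og:title -> "title".
                self.og.setdefault(prop[3:], content)
        elif tag == "title":
            self._in_title = True

    def handle_data(self, data):
        if self._in_title:
            self.page_title += data

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False


def link_card(html_text: str) -> dict:
    """Build the data for a preview card, preferring Open Graph values."""
    parser = OpenGraphParser()
    parser.feed(html_text)
    parser.close()
    card = dict(parser.og)
    card.setdefault("title", parser.page_title.strip())
    return card


# Hypothetical page that the receiving application has fetched.
page = (
    "<html><head><title>Fallback title</title>"
    '<meta property="og:title" content="Example page">'
    '<meta property="og:description" content="A short summary">'
    "</head><body></body></html>"
)
print(link_card(page))
```

Everything in the card comes from the page’s owner, which is why fetching it is both a network request to their server and a chance for them to choose what the recipient sees.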

While Apple has recently implemented a guard to ensure users invoke any network call, the features to control this are still very much lacking.

How do other applications stack up?

Many applications support a concept like hyperlinks. A PDF can contain links that lead to a URL, a place in a document, or a file on the device.

Conclusion

Please leave feedback or get in contact if you would like more details, videos, or pictures.