EITR

Web Retriever: The New Hero in your Cybersecurity Arsenal

Author: Nicholas M. Hughes

In the realm of cybersecurity and technology, where the Internet is both the battlefield and the prize, a new champion has emerged from the shadows of security automation tools. Meet Web Retriever, an open-source API from Hopr and EITR that acts as a guardian of the gateway between machine workloads and the vast, chaotic realms of the Internet.

Overview

Web Retriever is an open-source API designed as a conduit between machine workloads and the Internet. It is not your typical gatekeeper. It operates more like the bouncer at the Internet's most exclusive club, carefully managing requests from machine workloads, retrieving necessary resources from the Internet, and ensuring a secure and efficient flow of information. Web Retriever is not here to play around; its purpose is to safeguard, optimize, and oversee your online interactions with the precision of a Swiss watch.

Its features, described below, are aimed at two goals: keeping traffic between machine workloads and the Internet secure, and keeping it efficient.

Web Retriever features

Let’s dive into the key features that make Web Retriever a formidable asset among your security automation tools:

1.    Rule Engine: The guardian angel

At the heart of Web Retriever lies the Rule Engine, a powerful component that scrutinizes each request and allows or denies it based on a set of predefined criteria. It’s like having a cybersecurity concierge, ensuring that only the most esteemed (read: secure and necessary) communications occur between your machine workloads and the Internet.

2.    Header manipulation: The master of disguise

Web Retriever doesn’t just stop at managing requests; it’s also a maestro at manipulating request headers. It can dynamically add essential elements, such as API tokens, ensuring that sensitive information isn’t scattered like confetti across multiple workloads. It also knows when to hold back, removing specific header information to prevent the leakage of internal or sensitive data to external sources.

3.    Built on Plugin Oriented Programming (POP): The customizable crusader

Web Retriever is not just robust – it’s flexible. Built upon the concept of Plugin Oriented Programming (POP), it allows for high extensibility and customization. This means it’s not just a one-size-fits-all solution but a customizable ally that adapts to your specific needs and environment.
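To make the first two features more concrete, here is a minimal Python sketch of the general idea: a request is checked against allow/deny rules, and its headers are rewritten before it leaves. This is an illustration only, not Web Retriever's actual code. The `Request` and `Rule` types, the `is_allowed` and `rewrite_headers` functions, the first-match-wins ordering, and the deny-by-default fallback are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


# Hypothetical types for illustration; not Web Retriever's internals.
@dataclass
class Request:
    remote: str                # address of the workload making the request
    url: str                   # resource the workload wants fetched
    headers: Dict[str, str]    # headers supplied by the workload


@dataclass
class Rule:
    rule_type: str                      # "allow" or "deny"
    matches: Callable[[Request], bool]  # predicate deciding if the rule applies


def is_allowed(request: Request, rules: List[Rule]) -> bool:
    """Return the verdict of the first matching rule.

    Assumption: if no rule matches, the request is denied.
    """
    for rule in rules:
        if rule.matches(request):
            return rule.rule_type == "allow"
    return False


def rewrite_headers(
    headers: Dict[str, str],
    inject: Dict[str, str],
    strip: Iterable[str],
) -> Dict[str, str]:
    """Drop headers named in `strip` (case-insensitive), then add `inject`."""
    blocked = {name.lower() for name in strip}
    cleaned = {k: v for k, v in headers.items() if k.lower() not in blocked}
    cleaned.update(inject)
    return cleaned


# Only requests coming from the local machine are allowed.
rules = [Rule("allow", lambda r: r.remote == "127.0.0.1")]

request = Request(
    remote="127.0.0.1",
    url="https://catfact.ninja/fact",
    headers={"X-Internal-Host": "build-7", "Accept": "application/json"},
)

if is_allowed(request, rules):
    outbound = rewrite_headers(
        request.headers,
        inject={"Authorization": "Bearer <token>"},  # placeholder token
        strip=["X-Internal-Host"],                   # keep internal names private
    )
    print(outbound)
```

Keeping the token in the rewriting step rather than in each workload is the point of the second feature: the secret lives in one place, and internal headers never reach the remote service.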

Installation

Getting started with Web Retriever is as straightforward as a superhero landing. The prerequisites are Python 3.8+ and git, and you can install it either from PyPI or from source.

●     Install from PyPI:

pip install web-retriever

●     Alternatively, install from source:

git clone git@gitlab.com:hoprco/web-retriever.git
cd web-retriever
python3 -m venv .venv
source .venv/bin/activate
pip install .

Basic Usage

Once installed, wielding the power of Web Retriever is intuitive. Configuration is supplied as a YAML file, so your rules live in one readable place and you can focus on securing your interactions without getting bogged down by complexities.

  1. Create a YAML configuration file with your rules. Here is a simple example:

web_retriever:
  rules:
    - rule_type: "allow"
      rule_string: "remote == '127.0.0.1'"

  2. Run Web Retriever with your configuration file:

$ web-retriever -c config.yaml

======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)

  3. Test the setup by making a request to the Web Retriever API. You should see the content from the remote API in the response.

$ curl -s "localhost:8080/api/v1/fetch?url=https://catfact.ninja/fact&type=json" | jq

{
  "status": "success",
  "data": [
    {
      "url": "https://catfact.ninja/fact",
      "type": "json",
      "content": {
        "fact": "A cat’s hearing is better than a dog’s. And a cat can hear high-frequency sounds up to two octaves higher than a human.",
        "length": 119
      }
    }
  ],
  "timestamp": "2023-10-18T20:26:44.648416"
}

Congratulations! You’ve just set up and tested a basic scenario with Web Retriever. Explore further by trying out different rules and configurations to fully harness the capabilities of Web Retriever.
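A machine workload would normally make this call from code rather than from curl. Below is a minimal Python sketch using only the standard library. The endpoint path, the query parameters, and the response shape are taken from the example above. The helper names (`build_fetch_url`, `extract_content`, `fetch`) are hypothetical, and the sketch assumes the server accepts a percent-encoded `url` parameter.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_fetch_url(base: str, target: str, content_type: str = "json") -> str:
    """Build the fetch URL, percent-encoding the target so it survives as a query value."""
    query = urlencode({"url": target, "type": content_type})
    return f"{base}/api/v1/fetch?{query}"


def extract_content(payload: dict) -> list:
    """Return the `content` of every item in a successful response."""
    if payload.get("status") != "success":
        raise RuntimeError(f"Web Retriever returned status {payload.get('status')!r}")
    return [item["content"] for item in payload["data"]]


def fetch(base: str, target: str, content_type: str = "json", timeout: float = 10.0) -> list:
    """Ask a running Web Retriever instance to fetch `target` and return the content."""
    with urlopen(build_fetch_url(base, target, content_type), timeout=timeout) as response:
        return extract_content(json.load(response))


# Usage, with Web Retriever running locally as in step 2:
#   facts = fetch("http://localhost:8080", "https://catfact.ninja/fact")
#   print(facts[0]["fact"])
```

Separating URL construction and response parsing from the network call keeps both easy to test without a running server.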

Conclusion

Web Retriever emerges as a leader among security automation tools and a potent ally in managing and securing the communication process between machine workloads and the Internet. It’s not just about doing the job – it’s about doing the job with a level of precision, efficiency, and adaptability that makes it stand out in the crowded cybersecurity landscape. So, the next time you’re navigating the tumultuous waters of the Internet, you might just want Web Retriever by your side.