HTTPXRay is a desktop crawler designed to reveal the full structure of websites and web applications. Built on the HTTPX client library, the crawler maps every discovered URL, asset, and reference within HTML documents and stores the results in a structured SQLite database.
Think of HTTPXRay as a structural X-ray for the web.
Instead of downloading files or mirroring sites, the spider catalogs relationships between pages, scripts, images, forms, and external resources. The result is a complete structural map that can be explored, exported, and analyzed.
HTTPXRay is ideal for researchers, analysts, developers, and OSINT investigators who need deep visibility into how websites are structured.
The crawler extracts links from common HTML structures including:
a[href], link[href], script[src], img[src], iframe[src], video[src], audio[src], source[src], form[action]
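As a rough sketch of how those structures can be harvested with BeautifulSoup (the function name and selector table here are illustrative, not HTTPXRay's internals):

```python
from bs4 import BeautifulSoup

# (tag, attribute) pairs matching the structures listed above
SELECTORS = [
    ("a", "href"), ("link", "href"), ("script", "src"), ("img", "src"),
    ("iframe", "src"), ("video", "src"), ("audio", "src"),
    ("source", "src"), ("form", "action"),
]

def extract_links(html: str) -> list[tuple[str, str]]:
    """Return (tag, url) pairs for every element carrying a non-empty target attribute."""
    soup = BeautifulSoup(html, "html.parser")
    found = []
    for tag, attr in SELECTORS:
        # attrs={attr: True} matches only elements where the attribute is present
        for el in soup.find_all(tag, attrs={attr: True}):
            value = el.get(attr, "").strip()
            if value:
                found.append((tag, value))
    return found
```

Passing a fragment such as `<a href="/about">…</a><img src="logo.png">` yields the pairs `("a", "/about")` and `("img", "logo.png")`, ready for normalization and queuing.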
Every discovered resource is cataloged including pages, images, scripts, stylesheets, downloadable files, and external dependencies.
HTTPXRay is written in Python and powered by the high-performance HTTPX networking library.
The crawler uses a queue-driven spider architecture that recursively processes discovered links while preventing duplication through normalized URL indexing.
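The loop below is a minimal sketch of that architecture, using only the standard library; the `fetch` callback and `normalize` rules are stand-ins for whatever HTTPXRay actually does, but they show how normalized indexing prevents the same page from being queued twice:

```python
from collections import deque
from urllib.parse import urldefrag, urljoin, urlsplit, urlunsplit

def normalize(url: str) -> str:
    """Canonicalize a URL so equivalent forms index to the same entry."""
    url, _ = urldefrag(url)                      # drop any #fragment
    parts = urlsplit(url)
    path = parts.path or "/"                     # treat "" and "/" as the same page
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       path, parts.query, ""))

def crawl(start_url: str, fetch, max_pages: int = 100) -> list[str]:
    """Breadth-first spider; fetch(url) returns the raw links found on that page."""
    start = normalize(start_url)
    queue = deque([start])
    seen = {start}                               # normalized URL index
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for raw in fetch(url):
            link = normalize(urljoin(url, raw))  # resolve relative links
            if link not in seen:                 # duplication check
                seen.add(link)
                queue.append(link)
    return visited
```

Because every URL passes through `normalize` before the `seen` lookup, `/page`, `/page#section`, and `HTTP://EXAMPLE.COM/page` all collapse to a single queue entry.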
Install dependencies:

pip install httpx beautifulsoup4 reportlab

Run the crawler:

python httpxray.py

Optionally, build a standalone executable:

pip install pyinstaller
pyinstaller --onefile --noconsole httpxray.py
HTTPXRay stores crawl results in a SQLite database, making it easy to analyze or process discovered resources using external tools.