Quick Howto - Extract URLs from RTF Files
Malicious RTF (Rich Text Format) documents are back in the news with the exploitation of CVE-2026-21509 by APT28. The documents BULLETEN_H.doc and Consultation_Topics_Ukraine(Final).doc mentioned in the news are RTF files despite their .doc extension (a common trick used by threat actors).
Here is a quick tip to extract URLs from RTF files. Use the following command:
rtfdump.py -j -C SAMPLE.vir | strings.py --jsoninput | re-search.py -n url -u -F officeurls
The components of this command chain are as follows:
rtfdump.py -j -C SAMPLE.vir: This parses the RTF file SAMPLE.vir and produces JSON output with the content of all the items found in the RTF document. Option -C ensures that all combinations are included in the JSON data: the item itself, the hex-decoded item (-H), and the hex-decoded and shifted item (-H -S). So for each item found inside the RTF file, 3 entries are produced in the JSON data.
strings.py --jsoninput: This takes the JSON data produced by rtfdump.py and extracts all strings.
re-search.py -n url -u -F officeurls: This extracts all URLs (-n url) found in the strings produced by strings.py, performs deduplication (-u), and filters out all URLs linked to Office document definitions (-F officeurls).
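To make the last stage of the chain more concrete, here is a minimal Python sketch of the same idea: find URLs in a list of strings, deduplicate them, and drop the ones that belong to Office document definitions. This is not how re-search.py is implemented. The regular expression URL_RE, the prefix list OFFICE_PREFIXES, and the function name extract_urls are all assumptions made for illustration.

```python
import re

# Simplified URL pattern. An assumption for illustration only,
# not the regular expression that re-search.py uses for "-n url".
URL_RE = re.compile(r"https?://[A-Za-z0-9._~:/?#\[\]@!$&'()*+,;=%-]+")

# Hypothetical prefix list standing in for the "officeurls" filter.
OFFICE_PREFIXES = (
    "http://schemas.openxmlformats.org/",
    "http://schemas.microsoft.com/",
    "http://purl.org/",
    "http://www.w3.org/",
)


def extract_urls(strings):
    """Return unique, non-Office URLs in order of first appearance."""
    seen = set()
    result = []
    for s in strings:
        for url in URL_RE.findall(s):
            # Skip URLs that only point at document format definitions.
            if url.startswith(OFFICE_PREFIXES):
                continue
            # Deduplicate while keeping the original order.
            if url not in seen:
                seen.add(url)
                result.append(url)
    return result
```

Calling extract_urls on the strings recovered from every decoded variant of an item gives one short list to review, which is the point of combining -C with deduplication: the same URL often shows up in more than one variant.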
I have found one domain (wellnesscaremed) and one private IP address (192.168…).
I found extra IOCs: a UNC path and a "malformed" URL. The URL has its hostname followed by @ssl. This does not conform to the URL standard: @ can be used to introduce credentials, but then it has to come in front of the hostname, not behind it.
In the results for the other document, notice that this time we have @80. I believe that this @ notation is used by Microsoft to provide the port number when WebDAV requests are made (via UNC).
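If the @ notation is indeed the WebDAV convention for UNC paths, the host part can be split into a name, an SSL flag, and an optional port. The sketch below assumes that the only suffixes that occur are "ssl" and a decimal port number; the function name split_webdav_host and the example.com hostnames are hypothetical, not taken from the samples.

```python
def split_webdav_host(host):
    """Split 'name@ssl', 'name@80' or 'name@ssl@443' into (name, use_ssl, port).

    Assumption: every '@' suffix is either 'ssl' (any case) or a port number.
    """
    name, *suffixes = host.split("@")
    use_ssl = False
    port = None
    for suffix in suffixes:
        if suffix.lower() == "ssl":
            use_ssl = True
        elif suffix.isdecimal():
            port = int(suffix)
        else:
            # Anything else falls outside the assumed notation.
            raise ValueError(f"unexpected suffix: {suffix!r}")
    return name, use_ssl, port
```

For example, split_webdav_host("example.com@ssl") yields the name with the SSL flag set and no explicit port, while split_webdav_host("example.com@80") yields the name with port 80 and the SSL flag cleared. Normalizing hosts this way makes it easier to compare them with the domains and IP addresses found by the URL extraction.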