From 2683149453b341ae95376527b8123b8172146cb6 Mon Sep 17 00:00:00 2001 From: tma2024-iteg <161044325+tma2024-iteg@users.noreply.github.com> Date: Fri, 1 Mar 2024 12:42:40 +0100 Subject: [PATCH] add rest of figures and explanations --- .../over_time_blocklist_de_strongips_ip.html | 208 +++++++++++++++ _includes/over_time_find_rate.html | 217 +++++++++++++++ _includes/over_time_openphish_dn.html | 208 +++++++++++++++ _includes/ptp_scores_domains.html | 235 +++++++++++++++++ _includes/ptp_scores_ip_addresses.html | 247 ++++++++++++++++++ _pages/figures.md | 29 +- _pages/pipeline.md | 31 ++- _pages/ptp.md | 61 ++++- assets/example_ifip_tma.svg | 1 + assets/schema.svg | 1 + index.md | 8 +- 11 files changed, 1227 insertions(+), 19 deletions(-) create mode 100644 _includes/over_time_blocklist_de_strongips_ip.html create mode 100644 _includes/over_time_find_rate.html create mode 100644 _includes/over_time_openphish_dn.html create mode 100644 _includes/ptp_scores_domains.html create mode 100644 _includes/ptp_scores_ip_addresses.html create mode 100644 assets/example_ifip_tma.svg create mode 100644 assets/schema.svg diff --git a/_includes/over_time_blocklist_de_strongips_ip.html b/_includes/over_time_blocklist_de_strongips_ip.html new file mode 100644 index 0000000..e292ea9 --- /dev/null +++ b/_includes/over_time_blocklist_de_strongips_ip.html @@ -0,0 +1,208 @@ +
+ +
+ +
diff --git a/_includes/over_time_find_rate.html b/_includes/over_time_find_rate.html new file mode 100644 index 0000000..cc13951 --- /dev/null +++ b/_includes/over_time_find_rate.html @@ -0,0 +1,217 @@ +
+ +
+ +
diff --git a/_includes/over_time_openphish_dn.html b/_includes/over_time_openphish_dn.html new file mode 100644 index 0000000..afaa250 --- /dev/null +++ b/_includes/over_time_openphish_dn.html @@ -0,0 +1,208 @@ +
+ +
+ +
diff --git a/_includes/ptp_scores_domains.html b/_includes/ptp_scores_domains.html new file mode 100644 index 0000000..1ca62a7 --- /dev/null +++ b/_includes/ptp_scores_domains.html @@ -0,0 +1,235 @@ +
+ +
+ +
diff --git a/_includes/ptp_scores_ip_addresses.html b/_includes/ptp_scores_ip_addresses.html new file mode 100644 index 0000000..e70e5b6 --- /dev/null +++ b/_includes/ptp_scores_ip_addresses.html @@ -0,0 +1,247 @@ +
+ +
+ +
diff --git a/_pages/figures.md b/_pages/figures.md index 3eee13a..0423767 100644 --- a/_pages/figures.md +++ b/_pages/figures.md @@ -6,16 +6,37 @@ permalink: /figures/ -**Figure 4** Nodes ordered by the fraction of edges they accumulate. The figure shows a high centralization on IP addresses (90% of the edges congregate around only 2% of the addresses). The legend also lists the total number of nodes. +### Figure 4 {% include cdf_edges.html %} +Nodes ordered by the fraction of edges they accumulate. The figure shows a high centralization on IP addresses (90% of the edges congregate around only 2% of the addresses). The legend also lists the total number of nodes. -**Figure 7** Number of nodes identified using an ITEG and the described blocklist as PTP input on each measurement date. Also, showing the percentage of identified nodes that appeared later on the respective blocklist as Appearance Rate. +### Figure 5 -**abuse.ch Feodo** +{% include ptp_scores_domains.html %} +{% include ptp_scores_ip_addresses.html %} + +Cumulative number of domains and IP addresses with a threat probability above the depicted value, found via PTP. Categorized into the Virus Total and Google Safe Browsing labels. + +### Figure 6 + +{% include over_time_find_rate.html %} + +Rate of nodes with a score above the depicted threshold that appeared later on the blocklist. Marked are the s e thresholds providing the maximal rate. + +### Figure 7 + +
abuse.ch Feodo
{% include over_time_abuse_ch_feodo_ip.html %} +
Blocklist.de Strongips
+{% include over_time_blocklist_de_strongips_ip.html %} + +
Openphish
+{% include over_time_openphish_dn.html %} + + +Number of nodes identified using an ITEG and the described blocklist as PTP input on each measurement date. The Appearance Rate shows the percentage of identified nodes that later appeared on the respective blocklist. -to be continued.. diff --git a/_pages/pipeline.md b/_pages/pipeline.md index 9d6cb9f..8fe65ec 100644 --- a/_pages/pipeline.md +++ b/_pages/pipeline.md @@ -16,7 +16,7 @@ cd graph-pipeline git lfs pull {% endhighlight %} -Git-LFS is only used for the example graph (please decompress the data before usage). +Git-LFS is only used for the example data. For simplicity, we excluded the full IPv4 address space scan (performed with [ZMap](https://zmap.io/)) and the scan for an open port 443 on IPv6 Addresses with [zmapv6](https://github.com/topics/zmapv6) in this example. In the following we explain the steps necessary to reproduce the `example-data/` from the repository. @@ -27,10 +27,10 @@ However, we skipt the latter in this example and, instead, use the IP addresses For this example, we used the [Tranco](https://tranco-list.eu/) top 1M domains, the [SSLBL](https://sslbl.abuse.ch/), and the [Feodo Tracker](https://feodotracker.abuse.ch/): {% highlight bash %} -curl https://tranco-list.eu/download/W9K29/1000000 | cut -d , -f 2 > example-data/tranco-input.txt -curl https://sslbl.abuse.ch/blacklist/sslblacklist.csv > example-data/blocklists/abuse_ch_sslbl.csv +curl https://tranco-list.eu/download/G6Z3K/1000000 | cut -d , -f 2 > example-data/tranco-input.txt +curl https://sslbl.abuse.ch/blacklist/sslblacklist.csv > example-data/blocklists/abuse_ch_sslbl.sha1.csv curl https://sslbl.abuse.ch/blacklist/sslipblacklist.csv > example-data/blocklists/abuse_ch_sslbl.ip.csv -curl https://feodotracker.abuse.ch/downloads/ipblocklist.csv > example-data/abuse_ch_feodo.ip.csv +curl https://feodotracker.abuse.ch/downloads/ipblocklist.csv > example-data/blocklists/abuse_ch_feodo.ip.csv {% endhighlight %} ## 2. 
DNS Resolutions @@ -53,13 +53,14 @@ This creates two files each for the IPv4 and IPv6 DNS resolutions. ## 3. The TLS Scan Our TLS measruements were performed with the [TUM goscanner](https://github.com/tumi8/goscanner.git). +Please download and build the scanner. The scanner needs a single input file: -{% highlight bash %} +```bash cat <(grep -F "#" -v example-data/blocklists/abuse_ch_sslbl.ip.csv | cut -d , --output-delimiter ":" -f 2,3) \ <(grep -F "#" -v example-data/blocklists/abuse_ch_feodo.ip.csv | csvtool -u : col 2,3 - | tail -n +2) \ example-data/dns/*.ipdomain | shuf > example-data/goscanner-input.csv -{% endhighlight %} +``` Then, we can start with the actual TLS measurements: @@ -83,13 +84,14 @@ Note the necessary cache directory that can be deleted after the run. ./create_graph.sh {% endhighlight %} -## 5. Explore and Analyze the ITEG +### 5. Explore and Analyze the ITEG Our parsing pipeline produces multiple files for each type of edge and node that can be analyzed with other tools. -For example, the CSV output under `example-data/ITEG` can be directly imported into Neo4J. +For example, the CSV output under `example-data/tls_graph` can be directly imported into Neo4J. +Note that the files must be decompressed first (e.g., with `zstd --rm -d *.zst`). {% highlight bash %} -./import_neo4j.sh example-data/ITEG +./import_neo4j.sh example-data/tls_graph {% endhighlight %} @@ -97,7 +99,16 @@ We experienced that Neo4J provides a convenient Interface to manually explore th The output of the Neo4J schema function: -![Schema of the Example-Graph](/assets/example_schema.svg){:style="display:block; margin-left:auto; margin-right:auto"} +![Schema of the Example-Graph]({{site.baseurl}}/assets/schema.svg){:style="display:block; margin-left:auto; margin-right:auto"} Note that there were IP Addresses embedded as Altertnative Name in some certificates. These were always self-signed certificates. However, we did not consider such cases in the paper. 
+ +#### The TMA website + +tma.ifip.org was not on the Tranco top list, but we manually added it for demonstration. +If you are running Neo4J, you can see for yourself, or have a look at the following excerpt: + +![Example from the Graph]({{site.baseurl}}/assets/example_ifip_tma.svg){:style="display:block; margin-left:auto; margin-right:auto"} + +You can see that tma.ifip.org is quite isolated: it has its own IP address and certificate. However, the parent domain ifip.org has a certificate that reveals the alias ifip.or.at. diff --git a/_pages/ptp.md b/_pages/ptp.md index c274485..aa226e2 100644 --- a/_pages/ptp.md +++ b/_pages/ptp.md @@ -24,7 +24,66 @@ Git-LFS is only used for the example data. The PTP algorithm can be run with docker and spark in the provided container. {% highlight bash %} -./ptp.sh +./run_ptp.sh {% endhighlight %} +### Results + +The following tables show the IP addresses and domains with a score of 100% found with the blocked SSLBL certificates as input. +We checked each entry with Virus Total and appended the aggregated class (according to the paper). +Interestingly, even when scanning just the Tranco Top 1 Million websites, we found several domains and IP addresses with a high threat score that Virus Total also identifies as potentially malicious. 
+ +| Domain | VT class | +|---------------------|------------| +| uni.me | malicious | +| assortedrent.best | malicious | +| igoseating.com | malicious | +| cinemacity.live | malicious | +| avstop.com | harmless | +| monnalisa.com | harmless | +| manyhit.com | harmless | +| ccdcn.cn | harmless | +| ilkconstruction.com | harmless | +| eglobaldomains.com | harmless | +| eflowsys.com | harmless | +| imbroadbandmpl.com | harmless | +| 7-live.com | harmless | +| itlalaguna.edu.mx | harmless | +| surtitodo.com.co | harmless | +| ikoop.com.my | harmless | +| ucflower.tw | harmless | +| lamolina.edu.pe | harmless | +| neunet.com.ar | harmless | +| unipol.edu.bo | undetected | +| kcmservice.com | undetected | + + +| IP Address | VT class | +|-----------------|------------| +| 104.243.46.129 | malicious | +| 45.145.55.81 | malicious | +| 216.218.135.114 | malicious | +| 185.16.39.253 | malicious | +| 82.222.185.244 | malicious | +| 80.79.7.197 | malicious | +| 103.145.57.203 | malicious | +| 20.26.126.28 | malicious | +| 203.188.15.2 | harmless | +| 203.174.41.164 | harmless | +| 104.243.37.63 | harmless | +| 40.90.180.148 | harmless | +| 59.92.232.2 | harmless | +| 45.81.115.161 | undetected | +| 121.4.202.96 | undetected | +| 190.14.231.210 | suspicious | +| 45.231.83.134 | undetected | +| 78.46.205.169 | undetected | +| 200.59.236.49 | undetected | +| 187.190.56.90 | undetected | +| 125.229.114.79 | undetected | +| 114.32.146.202 | undetected | +| 200.105.167.174 | undetected | +| 80.211.143.18 | undetected | +| 202.57.128.136 | undetected | +| 103.149.103.38 | undetected | diff --git a/assets/example_ifip_tma.svg b/assets/example_ifip_tma.svg new file mode 100644 index 0000000..3eca943 --- /dev/null +++ b/assets/example_ifip_tma.svg @@ -0,0 +1 @@ +Neo4j Graph VisualizationCreated using Neo4j 
(http://www.neo4j.com/)RESOLVESSUBDOMAIN_OFPARENT_DOMAINCONTAINSRETURNSDEPLOYED_ONRESOLVESSUBDOMAIN_OFPARENT_DOMAINCONTAINSRETURNSRESOLVESCONTAINSDEPLOYED_ONRESOLVESRESOLVESCONTAINSPARENT_DOMAINCONTAINSSUBDOMAIN_OF tma.ifip.org 213.145.22… CN=tma.ifip… ifip.org 3.69.149.181 www.ifip.org CN=www.ifi… ifip.or.at www.ifip.or.at \ No newline at end of file diff --git a/assets/schema.svg b/assets/schema.svg new file mode 100644 index 0000000..a6399da --- /dev/null +++ b/assets/schema.svg @@ -0,0 +1 @@ +Neo4j Graph VisualizationCreated using Neo4j (http://www.neo4j.com/)PARENT_…SUBDOM…RETURNSDEPLOYED_ONRESOLVESCONTAINSCONTAINSREDIRECTS IP Domain Certificate \ No newline at end of file diff --git a/index.md b/index.md index 86d825b..76be287 100644 --- a/index.md +++ b/index.md @@ -9,7 +9,7 @@ Additional material for the TMA Submission *Propagating Threat Scores With a TLS To supplement our paper, we provide the following additional contributions: -- A graph parsing pipeline that can be used to construct an ITEG -- Example ITEG created from the Tranco top list as input and two blocklists -- The message-passing based PTP implementation used in the paper, and the computed scores on the example ITEgraph -- The Figures from the paper as interactive plots. +- A [graph parsing pipeline]({{ site.baseurl }}{% link _pages/pipeline.md %}) that can be used to construct an ITEG +- Example ITEG created from the Tranco top list and two blocklists +- The [message-passing based PTP implementation]({{ site.baseurl }}{% link _pages/ptp.md %}) used in the paper, and the computed scores on the example ITEG +- The [Figures]({{ site.baseurl }}{% link _pages/figures.md %}) from the paper as interactive plots.