YARA (Yet Another Ridiculous/Recursive Acronym) is an open-source tool (and rule language) designed for creating and sharing pattern matching rules. One of the most popular uses for YARA rules is to identify and classify files or data based on specific patterns or characteristics, particularly in malware research and digital forensics.
The Yara room in TryHackMe covers the basics of what YARA is, how to use Yara rules to match strings in some given files, how to use tools like LOKI to scan files with YARA rules in bulk, and how to create your own rules with yarGen.
I didn’t have much hands-on experience working with Yara rules, so going through this room was a great learning experience. I have always thought that the best way to apply what you learn is to share it with others, so I’ll share the steps needed to complete each question here.
Task 1 – Introduction
This task was an introduction to the room and included no questions.
Task 2 – What is Yara?
This answer can be found in one of the first lines of the room:
“The pattern matching swiss knife for malware researchers (and everyone else)” (Virustotal., 2020)
As the quote suggests, Yara can identify information based on both binary and textual patterns, such as hexadecimal values and strings contained within a file.
This answer can be understood from the definition of Yara Rules and patterns it can use:
Strings are a fundamental component of programming languages. Applications use strings to store data such as text.
Task 3 – Deploy
This task included a guide on how to connect to a preconfigured machine which contains some example files we will be analysing using Yara and Loki. You can connect to the machine through the in-browser Connect Machine button, or by using SSH when connected to the TryHackMe VPN.
This task didn’t include any questions, so we’ll move on.
Task 4 – Introduction to Yara Rules
In this room, we will create some Yara rules and start using them on some small example files. To complete this room, we will need to follow some steps:
1. Make a file named “somefile”:
cmnatic@thm:~$ touch somefile
2. Create a new file and name it “myfirstrule.yar”:
cmnatic@thm:~$ touch myfirstrule.yar
3. Open “myfirstrule.yar” using a text editor such as nano and input the following snippet:
rule examplerule {
    condition: true
}
Remember that you can save changes in nano by pressing Ctrl + O, and exit the editor by pressing Ctrl + X.
In this snippet, examplerule is the name of the rule. Inside the rule there is a single section, condition, which is set to true. Because the condition is always true, the rule matches any file, directory or PID that Yara is able to open and scan. In practice this works as an existence check: if the target exists, Yara prints examplerule.
We can now test whether this rule works by running it against the file “somefile” that we created before. If “somefile” exists, Yara will print “examplerule” because the condition has been met:
cmnatic@thm:~$ yara myfirstrule.yar somefile
examplerule somefile
We can also test what the behaviour would be if we use this rule against a file that does not exist, like “someothertextfile”:
cmnatic@thm:~$ yara myfirstrule.yar someothertextfile
error scanning someothertextfile: could not open file
This room had no questions beyond the practical exercise above.
Task 5 – Expanding on Yara Rules
This task expands on the previous task, and goes through the sections and conditions Yara rules can include. You can read about all the possible conditions in the Yara documentation. The room also demonstrates some examples of these conditions.
The meta section of a Yara rule is reserved for descriptive information by the author of the rule. It is used to include a short description, a summary of what the rule checks for, the date when the rule was created, etc. Any value included in this section will not influence the rule itself.
We can use the strings section to search for specific text or hexadecimal strings in files or programs. For example, if we wanted to search a directory for all files containing “Hello World!”, we would create the following rule:
rule helloworld_checker{
    strings:
        $hello_world = "Hello World!"
}
In the rule above, we use the keyword strings to define the string we want to search for. In this scenario, the string “Hello World!” is stored within the variable $hello_world. We can now define a condition, requiring the variable $hello_world to exist for the condition to be true:
rule helloworld_checker{
    strings:
        $hello_world = "Hello World!"
    condition:
        $hello_world
}
The rule above would be true for any file that has the string “Hello World!”. However, strings are case sensitive, meaning that this condition would not be met if the string was “hello world” or “HELLO WORLD”. If we wanted to search for any of these options, we would increase the scope of our rule by including multiple strings, and using the condition any of them:
rule helloworld_checker{
    strings:
        $hello_world = "Hello World!"
        $hello_world_lowercase = "hello world"
        $hello_world_uppercase = "HELLO WORLD"
    condition:
        any of them
}
Now, any file with any of these strings (Hello World! // hello world // HELLO WORLD) will trigger the rule.
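To make the case-sensitivity point concrete, here is a rough Python approximation of what the any of them condition checks. It only illustrates the logic and is not how YARA is implemented; the function name any_of_them is made up for this example.

```python
# Toy approximation of the rule's `any of them` condition.
# The byte strings mirror the three strings defined in the rule above.
PATTERNS = [b"Hello World!", b"hello world", b"HELLO WORLD"]


def any_of_them(data: bytes) -> bool:
    """Return True if at least one listed pattern occurs in data (case-sensitive)."""
    return any(pattern in data for pattern in PATTERNS)


if __name__ == "__main__":
    print(any_of_them(b"say hello world"))  # the lowercase variant is listed
    print(any_of_them(b"Hello world"))      # this mixed-case variant is not listed
```

A mixed-case variant such as “Hello world” still slips through, because only the three exact spellings are listed.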
We have already used conditions like true or any of them. Similar to regular programming, there are some operators available as part of the rule conditions, such as:
- <= Less than or equal to
- >= More than or equal to
- != Not equal to
One of the examples the room provides is the following:
rule helloworld_checker{
    strings:
        $hello_world = "Hello World!"
    condition:
        #hello_world <= 10
}
In this rule, #hello_world is the number of times the string “Hello World!” appears in the file. The condition #hello_world <= 10 means the rule only triggers if the string appears 10 times or fewer.
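The direction of the comparison is easy to misread, so here is a small Python sketch of the same check. It is an approximation for illustration only (count_condition is a hypothetical helper, not part of YARA), and it shows that #hello_world <= 10 holds when the string occurs ten times or fewer, including zero times.

```python
def count_condition(data: bytes, needle: bytes = b"Hello World!", limit: int = 10) -> bool:
    """Approximate `#hello_world <= 10`: True when needle occurs `limit` times or fewer."""
    # bytes.count counts non-overlapping occurrences, which is enough here
    # because "Hello World!" cannot overlap with itself.
    return data.count(needle) <= limit


if __name__ == "__main__":
    print(count_condition(b"Hello World! " * 10))  # exactly 10 occurrences
    print(count_condition(b"Hello World! " * 11))  # 11 occurrences
```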
Combining keywords
Yara rules also allow for keywords such as
- and
- not
- or
We can use these keywords to combine multiple conditions, like the example below:
rule helloworld_checker{
    strings:
        $hello_world = "Hello World!"
    condition:
        $hello_world and filesize < 10KB
}
This rule checks both that the string “Hello World!” is present and that the file containing this string is under 10KB in size. The rule will only trigger if both conditions are met. The room also provides the following cheatsheet, Anatomy of a Yara rule, created by Security Researcher Thomas Roccia (@fr0gger_):

This room also did not have any practical questions besides the exercises and examples included in the task itself.
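As a footnote to the last example in this task, the combined condition $hello_world and filesize < 10KB can be approximated in Python as follows. This is only a sketch of the logic: the helper name is invented, and it treats the file contents as a byte string already loaded in memory.

```python
KB = 1024  # YARA's KB suffix multiplies the number by 1024


def string_and_small_file(data: bytes) -> bool:
    """Approximate `$hello_world and filesize < 10KB` for in-memory file contents."""
    has_string = b"Hello World!" in data
    is_small = len(data) < 10 * KB
    # `and` means both parts must hold for the rule to trigger.
    return has_string and is_small
```

A file that contains the string but is 10KB or larger would not trigger the rule, and neither would a small file without the string.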
Task 6 – Yara modules
This task briefly describes frameworks such as Cuckoo Sandbox and Python’s PE module, both of which can help generate Yara rules based on their scan and sandbox features. The task doesn’t go into practical examples, as this is beyond the scope of this room. Further details on the structure of Python’s PE module and its use are covered in a separate TryHackMe room, MAL: Malware Introductory.
This task does not include any other questions.
Task 7 – Other tools and Yara
This task will introduce us to other tools that include Yara rules to leverage their threat hunting capabilities.
LOKI is a free open-source IOC (Indicator of Compromise) scanner created and written by Florian Roth. LOKI detection is based on 4 main methods:
- File name IOC check
- Yara Rule check
- Hash Check
- C2 (Command and control server) back connect check
THOR Lite is also developed by Florian Roth. It is a newer multi-platform IOC and YARA scanner. While the THOR service is designed for corporate customers and professionals, THOR Lite is free and includes most of the core features.
The third tool, FENRIR, was also created by Florian Roth. FENRIR is a lightweight bash script, which can be run on any system capable of running bash.
Created by the EFF (Electronic Frontier Foundation), YAYA (Yet Another Yara Automation) is a new open-source tool created to help researchers manage multiple YARA rule repositories.
This task does not include any follow-up questions.
Task 8 – Using LOKI and its Yara rule set
In this task, we’ll take what we learnt in the previous tasks and put it into practice. We will be using the preconfigured machine provided by TryHackMe (which already includes LOKI), and we will use the predefined Yara rules to scan some files inside that machine.
First, we’ll need to navigate to the Loki directory. Loki is located inside ~/tools:
cmnatic@thm-yara:~/tools$ ls
Loki yarGen
We can open the Loki directory by running cd Loki, and then run python loki.py -h to see what options are available:

We can find the different signature bases, including the directory containing all preconfigured yara rules by navigating to /tools/Loki/signature-base/yara

In this machine, we are also given a folder called suspicious-files, which contains 2 subdirectories (file1 and file2). We can navigate to the file1 directory, where we see there is a file called ind3x.php:

We should be able to run Loki against this file by calling loki.py with the following command:
python ../../tools/Loki/loki.py -p .
Loki output:

This task gives us a specific scenario and multiple questions:
You are the security analyst for a mid-size law firm. A co-worker discovered suspicious files on a web server within your organization. These files were discovered while performing updates to the corporate website. The files have been copied to your machine for analysis. The files are located in the suspicious-files directory. Use Loki to answer the questions below:
Question 1
We can obtain this information from one of the last lines of the scan:

Question 2
We can obtain this name from the MATCH: value in the scan.

Question 3
Web Shell
This is defined in the “DESCRIPTION:” field.

Question 4
We can find this value in the “MATCHES:” field of the scan.

Question 5
b374k 2.2
These values are found in the “FIRST_BYTES:” field.

Question 6
This is the question that I found a bit trickier, since we don’t get much context on where to look for these strings.
As the question title mentions, we should find the Yara file that flagged file 1 (ind3x.php). In one of the [INFO] outputs of Loki, we see the YARA rules processed are hosted in /home/cmnatic/tools/Loki/signature-base/yara

If we access that directory and search for webshell rules (by using ls | grep webshell), we’ll find 5 YARA rule files mentioning webshell:

At this point, and following the mythology theme of YARA rules and tools, I decided to start with thor-webshells.yar. Using nano again to open the file, and then pressing F6 to search in nano, I searched for the known Yara rule from question 2, webshell_metaslsoft. We can see this rule includes only one string, $s7
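Instead of opening each candidate file by hand, the same lookup can be scripted. The following is a minimal sketch, assuming the rule files are plain-text .yar files in a single directory; find_rule_files is a hypothetical helper written for this walkthrough and is not part of Loki.

```python
from pathlib import Path
from typing import List


def find_rule_files(rules_dir: str, rule_name: str) -> List[str]:
    """Return the names of .yar files in rules_dir whose text declares rule_name."""
    # Simple substring match: it would also hit longer rule names
    # that merely start with rule_name, which is fine for a quick lookup.
    needle = f"rule {rule_name}"
    hits = []
    for path in sorted(Path(rules_dir).glob("*.yar")):
        # Rule files are plain text; ignore undecodable bytes just in case.
        text = path.read_text(encoding="utf-8", errors="ignore")
        if needle in text:
            hits.append(path.name)
    return hits


# Example call, using the directory from the Loki [INFO] output:
# find_rule_files("/home/cmnatic/tools/Loki/signature-base/yara", "webshell_metaslsoft")
```

Once the file is known, the rule’s strings can be read in nano as described above.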

Question 7
We can scan file 2 using the same syntax we used to scan file 1. Navigating to the file2 directory, we see there’s a single file called 1ndex.php

We can now launch Loki again by using the following command:
python ../../tools/Loki/loki.py -p .
The results look clean, and it looks like Loki didn’t find anything suspicious or malicious:

Question 8
b374k 3.2.3
Using head, I listed the first few lines of 1ndex.php, which included the name and version of this shell, as well as its author:
head 1ndex.php

Task 9 – Creating Yara rules with yarGen
From the previous task, we can infer we have a file that Loki did not detect as malicious/suspicious. Task 9 will be a hands-on take on how to create a new Yara rule to detect this specific shell in our server.
As the suspicious file 2 (1ndex.php) has over 3500 lines of code, it would be hard and time-consuming to sift through them trying to find a string we could use for this specific shell. Instead of this manual process, we can use yarGen to automatically generate Yara rules. This tool creates Yara rules from strings found in malware, while removing all strings that also appear in non-malware files (drastically reducing false positives in our scans).
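The core idea (keep the strings found in the sample, drop the ones also seen in legitimate software) can be illustrated with a toy Python sketch. This is not yarGen’s actual implementation, which relies on large goodware databases and string scoring; the function names and the tiny goodware set used here are assumptions made purely for illustration.

```python
import re
from typing import List, Set


def extract_strings(data: bytes, min_len: int = 6) -> Set[bytes]:
    """Collect runs of printable ASCII characters at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return set(pattern.findall(data))


def candidate_strings(sample: bytes, goodware: Set[bytes], min_len: int = 6) -> List[bytes]:
    """Return strings from the sample that do not appear in the goodware set."""
    # Set difference removes anything also known from legitimate files.
    return sorted(extract_strings(sample, min_len) - goodware)


if __name__ == "__main__":
    sample = b"\x00<?php eval($_POST['x']);\x00Copyright Example Corp\x00"
    goodware = {b"Copyright Example Corp"}  # hypothetical goodware string
    print(candidate_strings(sample, goodware))
```

Only the strings that survive the filtering step would be candidates for the generated rule.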
The goal of this task is to use yarGen to generate a Yara rule for file 2 (1ndex.php). We can do this by navigating to the /tools/yarGen directory and running the following command:
python3 yarGen.py -m /home/cmnatic/suspicious-files/file2 --excludegood -o /home/cmnatic/suspicious-files/file2.yar
Quick breakdown of the parameters listed above:
- -m is the path to the files you want to generate rules for
- --excludegood forces yarGen to exclude all goodware strings (these are strings found in legitimate software and can increase false positives)
- -o is the location and name of the file you want to output the Yara rule to
With this, yarGen should start generating this rule for us:

And if we followed the right steps, we should see yarGen generating 1 SIMPLE rule for us at the end of its output:
[=] Generated 1 SIMPLE rules.
[=] All rules written to /home/cmnatic/suspicious-files/file2.yar
[+] yarGen run finished
Question 1
yara file2.yar file2/1ndex.php

Question 2
Based on the output we obtained in question 1 (_home_cmnatic_suspicious_files_file2_1ndex file2/1ndex.php), we can see there was a match for our new Yara rule.
Question 3
cp file2.yar /home/cmnatic/tools/Loki/signature-base/yara/
Our new Yara rule is file2.yar, which is located inside the suspicious-files directory. I used cp to copy this file into /Loki/signature-base/yara/
Question 4
Since we are already in the file2 directory, we can launch Loki by providing the right path to the Loki tool:
python /home/cmnatic/tools/Loki/loki.py -p .
In the output, we can see a suspicious object was detected:

Question 5
We can find this value in the MATCHES: field of the scan output:

Question 6
Similar to Question 6 in the previous task (Task 8), we can use nano to open our new rule (file2.yar):
nano /home/cmnatic/tools/Loki/signature-base/yara/file2.yar
Since this is a shorter rule, we don’t need to use the F6 search function in nano. From the main nano view, we can see there are 20 strings:

Question 7
We can see this right under the 20 strings from the previous question, by also opening file2.yar with nano

Task 10 – Valhalla
Task 10 introduces us to Valhalla, an online Yara feed where we can query specific keywords, ATT&CK techniques, SHA256 hashes, rule names and much more. We will be using Valhalla to search for our suspicious files and obtain additional intel.
Question 1
We can find the SHA256 hash in the Loki output from when we first scanned file1:

Searching for this value (5479f8cd1375364770df36e5a18262480a8f9d311e8eedb2c2390ecb233852ad) in Valhalla, we can see there are 7 results, one of which mentions this tool is part of a Chinese APT group toolset.
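As a side note, the hash does not have to be copied from the Loki output: it can also be recomputed from the file itself. Below is a minimal sketch using Python’s standard hashlib module; the helper name sha256_of_file is my own, and the path in the usage comment assumes the directory layout described earlier in this walkthrough.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex-encoded SHA-256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Example call (path assumed from the earlier tasks):
# sha256_of_file("/home/cmnatic/suspicious-files/file1/ind3x.php")
```

The resulting hex string can be pasted into the Valhalla or VirusTotal search boxes in the same way as the value taken from Loki.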

Question 2
Using the SHA256 hash from file2 (53fe44b4753874f079a936325d1fdc9b1691956a29c3aaf8643cdbd49f5984bf) in Valhalla, we see there are 4 results. Assuming the “first Yara rule to detect” question means the first rule created (or the rule with the oldest date), the answer would be Webshell_b374k_rule1:

Question 3
THOR APT Scanner
We can access VirusTotal directly from our Valhalla search:

In the VirusTotal search results page, we can see all matches are from the THOR APT Scanner:

Question 4
We can use VirusTotal’s own search to look up the SHA256 hash of file2 (53fe44b4753874f079a936325d1fdc9b1691956a29c3aaf8643cdbd49f5984bf):

In the search results page, we can see there are many vendors who did not detect this as a malicious hash:

Question 5
This question threw me off a little bit, as VirusTotal listed multiple file extensions besides .php and .exe (the expected answer). Other extensions listed are .sys, .php5 and .html.
We can see the list of known names and extensions inside VirusTotal by clicking on the Details tab:

Question 6
To find the answer to this question, we will have to pivot from VirusTotal to GitHub, as we will need to find as much info as we can about the code, requirements and libraries used by this shell. Lucky for us, Valhalla does a great job by also linking our search to the known GitHub page for the matching rule hash:

In the GitHub page for the b374k shell, we can see that one of the main requirements mentions this shell uses zepto.js v1.1.2:

Question 7
If we remember Question 7 from Task 8, our first Loki scan of file2 flagged it as benign, which suggests that no rule in Loki’s default Yara set matched it. However, now that we have some more context about the name of the rule (Webshell_b374k_rule1), we can also search the /Loki/signature-base/yara rule collection for this specific rule:
ls /home/cmnatic/tools/Loki/signature-base/yara/ | grep "Webshell_b374k_rule1"
This search would show no results:

Task 11 – Conclusion
This concludes our walkthrough of the YARA room. During this room, we explored Yara, how to use this tool, and how to manually create basic Yara rules. We also learned how to use open-source tools like Loki to scan files with existing rules and yarGen to automatically generate new ones (like we did for file 2). These are incredibly useful resources that a blue teamer would use when investigating a malicious or suspicious file.
Many open-source and enterprise solutions rely on Yara rules to enrich their detection, and this room was a great introduction to Yara rules, how they work and how to get familiar with them.