Yara writeup – TryHackMe

YARA (Yet Another Ridiculous/Recursive Acronym) is an open-source tool and rule language designed for creating and sharing pattern-matching rules. One of the most popular uses for YARA rules is identifying and classifying files or data based on specific patterns or characteristics, particularly in malware research and digital forensics.

The Yara room in TryHackMe covers the basics of what YARA is, how to use Yara rules to match strings in some given files, how to use tools like LOKI to manage and compare YARA rules in bulk, and how to create your own rules with yarGen.

I didn’t have much hands-on experience working with Yara rules, so going through this room was a great learning experience. I have always thought that the best way to reinforce what you learn is to share it with others, so I’ll share the steps needed to answer each question here.

Task 1 – Introduction

This task was an introduction to the room and included no questions.

Task 2 – What is Yara?

What is the name of the base-16 numbering system that Yara can detect?
hexadecimal

This answer can be found in one of the first lines of the room:

“The pattern matching swiss knife for malware researchers (and everyone else)” (Virustotal., 2020)

With such a fitting quote, Yara can identify information based on both binary and textual patterns, such as hexadecimal and strings contained within a file.
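
To make the "binary and textual patterns" idea a bit more concrete, here is a minimal Python sketch of my own (it is not part of the room). It shows that the same bytes can be written either as text or as hexadecimal. The sample content and the names sample, text_pattern, hex_pattern and contains are assumptions made up for this illustration.

```python
# Illustration only: the sample bytes below are invented for this example.
sample = b"Welcome! Enter your Name: "

# A textual pattern, written the way a human would read it.
text_pattern = b"Enter your Name"

# The same pattern written in hexadecimal (base-16), two digits per byte.
hex_pattern = bytes.fromhex("456e74657220796f7572204e616d65")


def contains(data: bytes, pattern: bytes) -> bool:
    """Return True if the pattern occurs anywhere in the data."""
    return pattern in data


print(text_pattern.hex())
print(contains(sample, text_pattern), contains(sample, hex_pattern))
```

Both checks succeed because the text and the hexadecimal notation describe exactly the same bytes.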

Would the text “Enter your Name” be a string in an application? (Yay/Nay)
Yay

This answer can be understood from the definition of Yara Rules and patterns it can use:

Strings are a fundamental component of programming languages. Applications use strings to store data such as text.

Task 3 – Deploy

This task included a guide on how to connect to a preconfigured machine which contains some example files we will be analysing using Yara and Loki. You can connect to the machine through the in-browser Connect Machine button, or by using SSH when connected to the TryHackMe VPN.

This task didn’t include any questions, so we’ll move on.

Task 4 – Introduction to Yara Rules

In this room, we will create some Yara rules and start using them on some small example files. To complete this room, we will need to follow some steps:

1. Make a file named “somefile”:

Creating a file named somefile
cmnatic@thm:~$ touch somefile

2. Create a new file and name it “myfirstrule.yar”:

Creating a file named myfirstrule.yar
cmnatic@thm:~$ touch myfirstrule.yar

3. Open “myfirstrule.yar” using a text editor such as nano and input the following snippet:

rule examplerule {
        condition: true
}

Remember that you can save changes in nano by pressing Ctrl + O, and exit the nano editor by pressing Ctrl + X.

In this snippet, examplerule is the name of the rule. The rule contains a single section, condition, which is set to true. Because the condition is always true, the rule matches any file, directory or PID we point Yara at, as long as that target exists and can be read. If it does, Yara prints the rule name, examplerule, next to the target.

We can now test whether this rule works by running it against the file “somefile” that we created before. If “somefile” exists, Yara will print “examplerule” because the condition has been met:

Checking the behavior of examplerule
cmnatic@thm:~$ yara myfirstrule.yar somefile
examplerule somefile

We can also test what the behaviour would be if we use this rule against a file that does not exist, like “someothertextfile”:

Checking the behaviour of an unmet condition
cmnatic@thm:~$ yara myfirstrule.yar someothertextfile
error scanning someothertextfile: could not open file
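
The two outcomes above can be mimicked with a few lines of Python. This is only a sketch of the behaviour, assuming a rule whose condition is simply true; it is not how Yara is implemented, and the function name scan_with_always_true_rule is made up for the example. It only handles single files, not directories or PIDs.

```python
import os
import tempfile


def scan_with_always_true_rule(rule_name: str, target: str) -> str:
    """Mimic a rule whose condition is simply `true`.

    Nothing inside the file is inspected: if the target can be opened,
    the rule matches; otherwise the scan itself fails.
    """
    try:
        with open(target, "rb"):
            pass
    except OSError:
        return f"error scanning {target}: could not open file"
    return f"{rule_name} {target}"


# Recreate the exercise in a throwaway directory.
with tempfile.TemporaryDirectory() as workdir:
    existing = os.path.join(workdir, "somefile")
    open(existing, "wb").close()  # same effect as `touch somefile`
    missing = os.path.join(workdir, "someothertextfile")
    hit = scan_with_always_true_rule("examplerule", existing)
    miss = scan_with_always_true_rule("examplerule", missing)

print(hit)
print(miss)
```

The point of the sketch is that the "match" depends only on whether the target can be opened, never on its contents.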

This task had no questions beyond the practical exercise above.

Task 5 – Expanding on Yara Rules

This task expands on the previous one and goes through the different sections and conditions a Yara rule can include. You can read about all the possible conditions here. The room also demonstrates some examples:

Meta

This section of a Yara rule is reserved for descriptive information by the author of the rule, and is used to include a short description, to summarise what the rule checks for, the date when the rule was created, etc. Any value included in this section will not influence the rule itself.

Strings

We can use strings to search for specific text or hexadecimal strings in files or programs. For example, if we wanted to search a directory for all files containing “Hello World!”, we would create the following rule:

rule helloworld_checker{
	strings:
		$hello_world = "Hello World!"
}

In the rule above, the strings keyword opens the section where we define the string we want to search for. In this scenario, the string “Hello World!” is stored in the variable $hello_world. The rule is not complete yet, because every Yara rule needs a condition. We can now add one that requires $hello_world to be found for the rule to match:

rule helloworld_checker{
	strings:
		$hello_world = "Hello World!"

	condition:
		$hello_world
}

The rule above would be true for any file that contains the string “Hello World!”. However, strings are case sensitive, meaning that this condition would not be met if the file only contained “hello world” or “HELLO WORLD”. If we wanted to search for any of these options, we would widen the scope of our rule by including multiple strings and using the condition any of them:

rule helloworld_checker{
	strings:
		$hello_world = "Hello World!"
		$hello_world_lowercase = "hello world"
		$hello_world_uppercase = "HELLO WORLD"

	condition:
		any of them
}

Now, any file with any of these strings (Hello World! // hello world // HELLO WORLD) will trigger the rule.
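
As a rough illustration of what any of them does with case-sensitive strings, here is a small Python sketch. It is my own approximation, not Yara code, and the function name any_of_them and the sample inputs are assumptions for the example.

```python
from typing import Dict, List

# The three strings from the rule above, keyed by their Yara identifiers.
strings: Dict[str, bytes] = {
    "$hello_world": b"Hello World!",
    "$hello_world_lowercase": b"hello world",
    "$hello_world_uppercase": b"HELLO WORLD",
}


def any_of_them(data: bytes, patterns: Dict[str, bytes]) -> List[str]:
    """Return the identifiers of every pattern found in the data.

    Matching is case sensitive, like plain Yara text strings.
    A non-empty result means the `any of them` condition is met.
    """
    return [name for name, value in patterns.items() if value in data]


print(any_of_them(b"say hello world", strings))
print(any_of_them(b"Hello world", strings))
```

The second input uses a capitalisation that none of the three strings cover, so nothing matches. For cases like that, Yara offers a nocase string modifier that makes a text string case insensitive.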

Conditions

We have already used conditions like true or any of them. Similar to regular programming, there are some operators as part of the rule conditions, such as:

  • <= Less than or equal to
  • >= More than or equal to
  • != Not equal to

One of the examples the room provides is the following:

rule helloworld_checker{
	strings:
		$hello_world = "Hello World!"

	condition:
		#hello_world <= 10
}

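In this rule, #hello_world is the number of times the string “Hello World!” occurs in the file, so the condition #hello_world <= 10 means the rule only triggers if the string appears 10 times or fewer (a file with no occurrences at all also satisfies it). To require at least 10 occurrences, the condition would be #hello_world >= 10.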
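
Since the direction of the comparison is easy to mix up, here is a short Python sketch of how a count-based condition like #hello_world <= 10 evaluates. It is only an approximation of the semantics, and the helper names count_occurrences and rule_matches are made up for this example.

```python
def count_occurrences(data: bytes, needle: bytes) -> int:
    """Count how many times needle occurs in data (the role of #hello_world)."""
    count = 0
    start = data.find(needle)
    while start != -1:
        count += 1
        start = data.find(needle, start + 1)
    return count


def rule_matches(data: bytes) -> bool:
    """Sketch equivalent of the condition `#hello_world <= 10`."""
    return count_occurrences(data, b"Hello World!") <= 10


print(rule_matches(b"Hello World!" * 10))  # exactly 10 occurrences
print(rule_matches(b"Hello World!" * 11))  # one too many
print(rule_matches(b"no greeting here"))   # zero occurrences
```

Ten occurrences and zero occurrences both satisfy the condition, while eleven do not.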

Combining keywords

Yara rules also allow for keywords such as

  • and
  • not
  • or

We can use these keywords to combine multiple conditions, like the example below:

rule helloworld_checker{
	strings:
		$hello_world = "Hello World!"

	condition:
		$hello_world and filesize < 10KB
}

This rule checks both that the string “Hello World!” is present and that the file containing it is smaller than 10KB. The rule will only trigger if both conditions are true.

The room also provides a cheatsheet, Anatomy of a Yara rule, created by security researcher Thomas Roccia (@fr0gger_).


This room also did not have any practical questions besides the exercises and examples included in the task itself.
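
To round off this task, here is a minimal Python sketch of the combined condition from the last rule ($hello_world and filesize < 10KB). It is my own approximation rather than anything from the room, and the function name helloworld_checker is simply borrowed from the rule name. In Yara, the KB suffix multiplies the number by 1024.

```python
KB = 1024  # Yara's KB suffix multiplies the constant by 1024


def helloworld_checker(data: bytes) -> bool:
    """Sketch of `$hello_world and filesize < 10KB`.

    Both parts must hold: the string is present AND the whole file
    is smaller than 10 * 1024 bytes.
    """
    return b"Hello World!" in data and len(data) < 10 * KB


small_with_string = b"Hello World!"
large_with_string = b"Hello World!" + b"A" * (10 * KB)
small_without_string = b"A" * 100

print(helloworld_checker(small_with_string))
print(helloworld_checker(large_with_string))
print(helloworld_checker(small_without_string))
```

Only the first input satisfies both halves of the condition, which is what the and keyword requires.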

Task 6 – Yara modules

This task briefly describes frameworks such as Cuckoo Sandbox and Python’s PE module, both of which can be used to generate Yara rules based on what they find when scanning or sandboxing a sample. The task doesn’t go into practical examples, as that is beyond the scope of this room. Further details on the structure of Python’s PE module and its use are covered in a separate TryHackMe room, MAL: Malware Introductory.


This task does not include any other questions.

Task 7 – Other tools and Yara

This task will introduce us to other tools that include Yara rules to leverage their threat hunting capabilities.

LOKI

LOKI is a free, open-source IOC (Indicator of Compromise) scanner created and written by Florian Roth. LOKI detection is based on 4 main methods:

  1. File name IOC check
  2. Yara Rule check
  3. Hash Check
  4. C2 (Command and control server) back connect check
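
To give a feel for two of these checks (the file name IOC check and the hash check), here is a toy Python sketch. It is not Loki's code, and every indicator in it is hypothetical: I invented FILENAME_IOCS and HASH_IOCS purely for the example.

```python
import hashlib
import re
from typing import List

# Hypothetical indicators, invented for this sketch
# (not taken from Loki's signature base).
FILENAME_IOCS = [re.compile(r"ind3x\.php$")]
HASH_IOCS = {hashlib.sha256(b"known bad sample").hexdigest()}


def check_file(path: str, content: bytes) -> List[str]:
    """Return the reasons (if any) why a file looks suspicious."""
    reasons = []
    # File name IOC check: does the path match a known-bad name pattern?
    if any(pattern.search(path) for pattern in FILENAME_IOCS):
        reasons.append("filename IOC")
    # Hash check: does the content hash appear in the known-bad list?
    if hashlib.sha256(content).hexdigest() in HASH_IOCS:
        reasons.append("hash IOC")
    return reasons


print(check_file("uploads/ind3x.php", b"harmless"))
print(check_file("notes.txt", b"known bad sample"))
print(check_file("notes.txt", b"harmless"))
```

A real scanner combines these with Yara rule matches and network checks, but the idea of comparing file attributes against lists of known indicators is the same.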

THOR

THOR Lite is also developed by Florian Roth. It is a newer multi-platform IOC and YARA scanner. While the full THOR service is designed for corporate customers and professionals, THOR Lite is free and includes most of the core features.

FENRIR

The third tool is also created by Florian Roth. FENRIR is a lightweight bash script, which can be run on any system capable of running bash.

YAYA

Created by the EFF (Electronic Frontier Foundation), YAYA (Yet Another Yara Automaton) is a new open-source tool that helps researchers manage multiple YARA rule repositories.


This task does not include any follow-up questions.

Task 8 – Using LOKI and its Yara rule set

In this task, we’ll take what we learnt in the previous tasks and put it into practice. We will be using the preconfigured machine provided by TryHackMe (which already includes LOKI), and we will use the predefined Yara rules to scan some files inside that machine.

First, we’ll need to navigate to the Loki directory. Loki is located inside tools.

Listing the tools directory
cmnatic@thm-yara:~/tools$ ls
Loki  yarGen

We can open the Loki directory by running cd Loki, and then run python loki.py -h to see what options are available:

We can find the different signature bases, including the directory containing all preconfigured yara rules by navigating to /tools/Loki/signature-base/yara:

On this machine, we are also given a folder called suspicious-files, which contains 2 directories (file1 and file2). We can navigate to the file1 directory, where we see there is a file called ind3x.php:

We should be able to run Loki against this file by calling loki.py with the following command:

Using Loki to scan suspicious file
python ../../tools/Loki/loki.py -p .

Loki output:


This task gives us a specific scenario and multiple questions:

Scenario

You are the security analyst for a mid-size law firm. A co-worker discovered suspicious files on a web server within your organization. These files were discovered while performing updates to the corporate website. The files have been copied to your machine for analysis. The files are located in the suspicious-files directory. Use Loki to answer the questions below:

Question 1
Scan file 1. Does Loki detect this file as suspicious/malicious or benign?
Suspicious

We can obtain this information from one of the last lines of the scan:

Question 2
What Yara rule did it match on?
webshell_metaslsoft

We can obtain this name from the MATCH: value in the scan.

Question 3
What does Loki classify this file as?
Web Shell

This is defined in the “DESCRIPTION:” field.

Question 4
Based on the output, what string within the Yara rule did it match on?
Str1

We can find this value in the “MATCHES:” field of the scan.

Question 5
What is the name and version of this hack tool?
b374k 2.2

These values are found in the “FIRST_BYTES:” field.

Question 6
Inspect the actual Yara file that flagged file 1. Within this rule, how many strings are there to flag this file?
1

This is the question that I found a bit trickier, since we don’t get much context on where to look for these strings.

As the question mentions, we should find the Yara file that flagged file 1 (ind3x.php). In one of the [INFO] outputs of Loki, we see that the YARA rules it processes are stored in /home/cmnatic/tools/Loki/signature-base/yara:

If we access that directory and search for webshell rules (by using ls | grep webshell), we’ll find 5 YARA rule files mentioning webshell:

At this point, and following the mythology theme of YARA rules and tools, I decided to start with thor-webshells.yar. Using nano again to open the file, and then pressing F6 to search in nano, I searched for the known Yara rule in question 2, webshell_metaslsoft. We can see this rule includes only one string, $s7:

Question 7
Scan file 2. Does Loki detect this file as suspicious/malicious or benign?
Benign

We can scan file 2 using the same syntax we used to scan file 1. Navigating to the file2 directory, we see there’s a single file called 1ndex.php:

We can now launch Loki again by using the following command:

Using Loki to scan file 2 (1ndex.php)
python ../../tools/Loki/loki.py -p .

The results look clean, and it looks like Loki didn’t find anything suspicious or malicious:

Question 8
Inspect file 2. What is the name and version of this web shell?
b374k 3.2.3

Using head, I listed the first few lines of 1ndex.php, which included the name and version of this shell, as well as its author:

Using head to list the first few lines of 1ndex.php
head 1ndex.php

Task 9 – Creating Yara rules with yarGen

From the previous task, we can infer we have a file that Loki did not detect as malicious/suspicious. Task 9 will be a hands-on take on how to create a new Yara rule to detect this specific shell in our server.

As the suspicious file 2 (1ndex.php) has over 3500 lines of code, it would be hard and time consuming to sift through them looking for a string we could use to identify this specific shell. Instead of this manual process, we can use yarGen to automatically generate Yara rules. This tool creates Yara rules from strings found in malware, while removing all strings that also appear in non-malware files (drastically reducing false positives in our scans).
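
The core idea described above (keep the strings found in the sample, drop the ones that also show up in goodware) can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not yarGen's actual implementation: the function names, the minimum string length and the sample bytes are all invented for the example.

```python
import re
from typing import List, Set


def extract_strings(data: bytes, min_len: int = 6) -> Set[bytes]:
    """Extract runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return set(pattern.findall(data))


def candidate_strings(sample: bytes, goodware: Set[bytes], min_len: int = 6) -> List[bytes]:
    """Keep only the sample strings that do NOT also appear in goodware."""
    return sorted(extract_strings(sample, min_len) - goodware)


# Made-up sample: two long printable runs and one that is too short to keep.
sample = b"\x00\x01Copyright (c) Example Corp\x00b374k_shell_marker\x02ok\x00"
# Made-up goodware database containing one string seen in legitimate software.
goodware = {b"Copyright (c) Example Corp"}

print(candidate_strings(sample, goodware))
```

Only the string that is unique to the sample survives, which is the kind of string worth putting into a rule.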

The goal of this task is to use yarGen to generate a Yara rule for file 2 (1ndex.php). We can do this by navigating to the /tools/yarGen directory and running the following command:

Using yarGen to generate a Yara rule for file 2 (1ndex.php)
python3 yarGen.py -m /home/cmnatic/suspicious-files/file2 --excludegood -o /home/cmnatic/suspicious-files/file2.yar 

Quick breakdown of the parameters listed above:

  • -m is the path to the files you want to generate rules for
  • --excludegood forces yarGen to exclude all goodware strings (strings that are also found in legitimate software and could cause false positives)
  • -o location & name you want to output the Yara rule

With this, yarGen should start generating this rule for us:

And if we followed the right steps, we should see yarGen generating 1 SIMPLE rule for us at the end of its output:

           [=] Generated 1 SIMPLE rules.
           [=] All rules written to /home/cmnatic/suspicious-files/file2.yar
           [+] yarGen run finished

Question 1
From within the root of the suspicious files directory, what command would you run to test Yara and your Yara rule against file 2?
yara file2.yar file2/1ndex.php

Question 2
Did Yara rule flag file 2? (Yay/Nay)
Yay

Based on the output we obtained in question 1 (_home_cmnatic_suspicious_files_file2_1ndex file2/1ndex.php), we can see there was a match for our new Yara rule.

Question 3
Copy the Yara rule you created into the Loki signatures directory.
cp file2.yar /home/cmnatic/tools/Loki/signature-base/yara/

Our new Yara rule is file2.yar, which is stored inside the suspicious-files directory. I used cp to copy this file into /Loki/signature-base/yara/.

Question 4
Test the Yara rule with Loki, does it flag file 2? (Yay/Nay)
Yay

Since we are already in the file2 directory, we can launch Loki by providing the right path to the Loki tool:

Launching Loki from the file2 directory
python /home/cmnatic/tools/Loki/loki.py -p .

In the output, we can see a suspicious object was detected:

Question 5
What is the name of the variable for the string that it matched on?
Zepto

We can find this value in the MATCHES: output

Question 6
Inspect the Yara rule, how many strings were generated?
20

Similar to Question 6 in the previous task (Task 8), we can use nano to open our new rule (file2.yar):

Opening file2.yar from our current directory
nano /home/cmnatic/tools/Loki/signature-base/yara/file2.yar

Since this is a shorter rule, we don’t need to use the F6 search function in nano. From the main nano view, we can see there are 20 strings:

Question 7
One of the conditions to match on the Yara rule specifies file size. The file has to be less than what amount?
700KB

We can see this right under the 20 strings from the previous question, by also opening file2.yar with nano:


Task 10 – Valhalla

Task 10 introduces us to Valhalla, an online Yara feed where we can search by keyword, ATT&CK technique, SHA256 hash, rule name and much more. We will be using Valhalla to search for our suspicious files and obtain additional intel.
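
The questions below rely on the SHA256 hashes that Loki printed during its scans. If that output is no longer at hand, a hash can also be computed locally. Here is a minimal Python sketch using the standard hashlib module; the function name sha256_of_file is my own, and the demo hashes a throwaway file rather than the real samples.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a throwaway file; on the room's machine the path would be
# one of the files under suspicious-files instead.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
demo_digest = sha256_of_file(tmp.name)
os.remove(tmp.name)

print(demo_digest)
```

The resulting 64-character hex string is what gets pasted into the Valhalla or VirusTotal search box.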

Question 1
Enter the SHA256 hash of file 1 into Valhalla. Is this file attributed to an APT group? (Yay/Nay)
Yay

We can find the SHA256 hash in the Loki output from when we first scanned file1:

Searching for this value (5479f8cd1375364770df36e5a18262480a8f9d311e8eedb2c2390ecb233852ad) in Valhalla, we can see there are 7 results, one of which mentions this tool is part of a Chinese APT group toolset:

Question 2
Do the same for file 2. What is the name of the first Yara rule to detect file 2?
Webshell_b374k_rule1

Using the SHA256 hash from file2 (53fe44b4753874f079a936325d1fdc9b1691956a29c3aaf8643cdbd49f5984bf) in Valhalla, we see there are 4 results. Assuming the “first Yara rule to detect” means the first rule created (the rule with the oldest date), the answer would be Webshell_b374k_rule1:

Question 3
Examine the information for file 2 from Virus Total (VT). The Yara Signature Match is from what scanner?
THOR APT Scanner

We can access VirusTotal directly from our Valhalla search:

In the VirusTotal search results page, we can see all matches are from the THOR APT Scanner:

Question 4
Enter the SHA256 hash of file 2 into Virus Total. Did every AV detect this as malicious? (Yay/Nay)
Nay

We can use VirusTotal’s own search for file2 SHA256 signature (53fe44b4753874f079a936325d1fdc9b1691956a29c3aaf8643cdbd49f5984bf):

In the search results page, we can see there are many vendors who did not detect this as a malicious hash:

Question 5
Besides .PHP, what other extension is recorded for this file?
EXE

This question threw me off a little bit, as VirusTotal lists multiple file extensions besides .php and .exe (the expected answer). Other extensions listed are .sys, .php5 and .html.

We can see the list of known names and extensions inside VirusTotal by clicking on the Details tab:

Question 6
What JavaScript library is used by file 2?
Zepto

To find the answer to this question, we will have to pivot from VirusTotal to GitHub, as we will need to find as much info as we can about the code, requirements and libraries used by this shell. Lucky for us, Valhalla does a great job by also linking our search to the known GitHub page for the matching rule hash:

In the GitHub page for the b374k shell, we can see that one of the main requirements mentions this shell uses zepto.js v1.1.2:

Question 7
Is this Yara rule in the default Yara file Loki uses to detect these type of hack tools? (Yay/Nay)
Nay

If we remember Question 7 from Task 8, our first Loki scan flagged file2 as benign, meaning no rule in Loki’s default Yara rule set matched it. However, now that we have more context about the name of the rule (Webshell_b374k_rule1), we can also search the /Loki/signature-base/yara rule collection for this specific rule:

Searching for Webshell_b374k_rule1 in Loki’s Yara rule repository
grep -r "Webshell_b374k_rule1" /home/cmnatic/tools/Loki/signature-base/yara/

This search would show no results:


Task 11 – Conclusion

This concludes our walkthrough of the YARA room. During this room, we explored what Yara is, how to use it, and how to manually create basic Yara rules. We also learned how to use open-source tools like Loki to scan files with existing rules and yarGen to automatically generate new ones (like we did for file 2). These are incredibly useful resources for a blue teamer investigating a malicious or suspicious file.

Many open-source and enterprise solutions rely on Yara rules to enrich their detection, and this room was a great introduction to Yara rules, how they work and how to get familiar with them.
