The queen bee controls her very own DECEIVE honeypot from within.
Introduction
Earlier this month, Splunk introduced DECEIVE (DECeption with Evaluative Integrated Validation Engine), a new tool for building honeypots. David Bianco created DECEIVE with the Splunk SURGe team, where he conducts research in incident detection and response, threat hunting, and cyber threat intelligence (CTI). This program demonstrates the potential of AI in developing cybersecurity tools and solutions and is the first to use AI to generate honeypots. DECEIVE is not yet ready for production environments, but it shows a lot of promise. You can use it as inspiration for building your own honeypots, and you can run it in your homelab and educational environments. It could also be a great tool for Collegiate Cyber Defense Competition (CCDC) teams to deploy during the competition to catch red team activity. There are a lot of applications for the DECEIVE program.
Background
The DECEIVE program leverages deception technology to enhance threat intelligence using honeypots. Honeypots are decoy systems designed to attract and deceive attackers, making attackers think that they made it into a company's system. This allows security teams to monitor malicious activity, gather intelligence and improve defenses without risking real assets.
By deploying these traps, DECEIVE creates an environment where adversaries unknowingly reveal their tactics, techniques and procedures (TTPs), providing valuable insight into emerging threats. This proactive approach fits seamlessly within the broader scope of deception technology, which focuses on misleading attackers and slowing their progress while defenders gain the upper hand.
“Using honeypots not only strengthens cybersecurity postures but also enhances threat hunting capabilities, making it a powerful tool for organizations looking to stay ahead of adversaries.”
Want to start Threat Hunting? Start creating honeypots!
Setting up Splunk DECEIVE
Requirements
The DECEIVE program runs on a Unix system, so you’ll need a Unix OS, or Windows with the Windows Subsystem for Linux (WSL), to install it on. If you’re setting up a virtual machine (VM), I recommend giving it enough disk space and RAM for the program to run. For my system, I used Pop!_OS with 50 GB of hard disk space, 8 processors and 16 GB of memory. If you don’t have enough disk space, you’ll run into an error when you install the requirements, and if you don’t have enough memory, the program will hang and won’t be able to do the AI generation that you ask of it.
If you’re going to use an OpenAI model as the LLM backend, you’ll need to set up an API key with OpenAI. Each time you run the program it costs ~$0.03, so loading the account with $5.00 (the minimum you can add) should last you a long time for this project, and you can add more later if you need it. If you decide to go with a different backend Large Language Model (LLM), you’ll need to update the SSH/config.ini file, acquire the appropriate API key and fund your account.
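To put the budget in perspective, here is a quick back-of-the-envelope calculation using the two figures above. It assumes the ~$0.03 per run is a rough average rather than a fixed price, so treat the result as an estimate only.

```python
# Rough budget estimate from the figures quoted in this post.
# Assumption: ~$0.03 per run is an approximate average, not a fixed price.
COST_PER_RUN_CENTS = 3      # ~$0.03 per run
INITIAL_CREDIT_CENTS = 500  # $5.00, the minimum top-up


def runs_covered(credit_cents: int, cost_per_run_cents: int) -> int:
    """Return how many whole runs the credit pays for."""
    if cost_per_run_cents <= 0:
        raise ValueError("cost per run must be positive")
    # Integer cents avoid floating-point rounding surprises.
    return credit_cents // cost_per_run_cents


print(runs_covered(INITIAL_CREDIT_CENTS, COST_PER_RUN_CENTS))  # 166
```

So the $5.00 minimum covers roughly 166 runs at that rate.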
Installation Steps
Once you have the requirements taken care of, you’re ready to set up your DECEIVE honeypot.
In your Unix-based virtual machine, clone the DECEIVE repository from GitHub.
Follow the setup instructions in the README and the documentation in the SSH/config.ini.TEMPLATE file to create SSH keys, configure users and passwords, or change the backend LLM (any OpenAI, Google, or AWS Bedrock model will work).
Set any environment variables your LLM backend requires (e.g., OPENAI_API_KEY for the default GPT-4o backend).
Modify the SSH/prompt.txt file to tell it what kind of system you'd like to emulate.
Run it in a lab environment to see how it simulates interactions and generates detailed session summaries.
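Before starting the honeypot, it can help to confirm that the pieces mentioned in the steps above are in place. The following is a minimal sketch of my own, not part of DECEIVE: the `preflight` helper is hypothetical, and it assumes the default OpenAI backend and that you run it from the root of the cloned repository.

```python
import os
from pathlib import Path
from typing import Mapping


def preflight(repo_root: Path, env: Mapping[str, str] = os.environ) -> list[str]:
    """Return a list of setup problems; an empty list means the checks passed.

    Hypothetical helper: only checks the files and the environment
    variable named in the steps above, nothing else.
    """
    problems = []
    for rel in ("SSH/config.ini", "SSH/prompt.txt"):
        if not (repo_root / rel).is_file():
            problems.append(f"missing file: {rel}")
    # Assumption: the default OpenAI backend is in use.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in preflight(Path(".")):
        print(problem)
```

If you switch to a different backend, swap the environment variable name for whatever that provider requires.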
By default, the system will listen on port 8022/TCP for incoming SSH connections. On a UNIX or Linux system, you can log in with a command like the following:
ssh guest@localhost -p 8022
Note that the config file specifies that the guest account has an empty password, so you won't be prompted to enter one. Set one in the config file if you like.
(Credit: “How to get started,” Splunk)
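If the ssh command fails, a quick way to tell whether the honeypot is listening at all is to connect to the port and read the SSH identification string that servers send first. This is a small sketch using only the Python standard library. The `read_ssh_banner` name is my own, and it assumes the default port 8022 on localhost.

```python
import socket
from typing import Optional


def read_ssh_banner(host: str = "localhost", port: int = 8022,
                    timeout: float = 3.0) -> Optional[str]:
    """Return the server's SSH identification line, or None if unavailable."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            data = conn.recv(256)
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return None
    # Keep only the first line; binary key-exchange data may follow it.
    first_line = data.split(b"\n", 1)[0].strip()
    banner = first_line.decode("ascii", errors="replace")
    return banner if banner.startswith("SSH-") else None
```

Calling `read_ssh_banner()` returns a string beginning with `SSH-` when something that speaks SSH is answering on the port, and `None` otherwise.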
I’ve made a detailed instructional YouTube video that you can follow along with. It shows how to install and set up the program, what the AI creates for the honeypot, and the resulting log files.
Crafting effective prompts
When crafting a prompt, be specific: the more detail you give about what you want, the better the results. With the prompts here describing users, providing more information leads to more personalized results and generates additional usable data in the honeypot. You can include expected data, users, folders and more.
Keep in mind that the company may be large, with multiple departments and users in different roles. These users are also humans with different personalities and interests. Feel free to experiment with different prompts and see what you like.
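One way to keep a detailed prompt manageable is to assemble it from structured data. The sketch below is purely illustrative: the `Persona` class, the `build_prompt` function and the prompt wording are my own inventions, not anything DECEIVE requires. The output is plain text that you could paste into SSH/prompt.txt and then refine by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Persona:
    """A hypothetical honeypot user with a role and personal interests."""
    name: str
    role: str
    interests: list[str] = field(default_factory=list)


def build_prompt(company: str, system: str, personas: list[Persona]) -> str:
    """Assemble a detailed, human-readable prompt from structured details."""
    lines = [
        f"You are emulating {system} at {company}.",
        "The following users have home directories on this system:",
    ]
    for p in personas:
        interests = ", ".join(p.interests) or "no listed interests"
        lines.append(f"- {p.name}, {p.role}; interests: {interests}")
    lines.append(
        "Populate each home directory with files that fit its owner's "
        "role and interests."
    )
    return "\n".join(lines)


if __name__ == "__main__":
    people = [
        Persona("dana", "accountant", ["spreadsheets", "cycling"]),
        Persona("lee", "game developer", ["Rust", "video editing"]),
    ]
    print(build_prompt("Example Corp", "a Linux file server", people))
```

Adding a department or a new user then becomes a one-line change, which makes it easier to experiment with different prompts.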
Harvard provides a list of suggestions for writing prompts for generative AI.
Here is an example of the results of the honeypot with a prompt that included a lot of details.
After logging in to the honeypot, we find a variety of content in the Documents folder, with folders for Accounting, Game Development, Rust Projects, Templates and Videos that reflect the personas described in the prompt. Reviewing the meeting_notes.txt file, we can see information about project updates, action items and open discussions. Each time you log in, there will be different information.
Example of honeypot files and folders.
Logging 🪵
Logging is a critical component of honeypots, as it provides detailed records of attacker activity, helping security teams analyze threats and improve defenses. Since honeypots are designed to attract malicious actors, comprehensive logging ensures that every interaction, such as login attempts and executed commands, is captured for analysis. This information is invaluable for understanding the attackers' tactics, techniques and procedures (TTPs), which allows security teams to refine their detection and response strategies.
The logs generated by DECEIVE record a lot of detail about what an actor does after connecting to the honeypot. Each entry has a task_name field containing a string that starts with ‘session-{random numbers}’, which serves as a unique identifier shared by all entries from a single SSH session. The message field tells you the type of entry: SSH connection received or closed, authentication success, user input, LLM response or, finally, a Session summary. The session summary contains an AI analysis of the activities performed, along with a judgement of whether the activity in the honeypot was "BENIGN", "SUSPICIOUS", or "MALICIOUS". In this screenshot, the exploratory activities were judged SUSPICIOUS.
Ssh_log.log example highlighting the AI generated session summary.
Timestamps are always in UTC.
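To get a quick overview of a log file, you can group entries by task_name and pull out each session's verdict. The sketch below makes several assumptions that you should check against your own ssh_log.log: that each line is a standalone JSON object, that summary entries have a message of exactly "Session summary", and that the verdict lives in a field called `judgement`. The `summarize_sessions` function is my own helper, not part of DECEIVE.

```python
import json
from collections import defaultdict
from typing import Iterable


def summarize_sessions(lines: Iterable[str]) -> dict[str, dict]:
    """Group log entries by task_name and record each session's judgement.

    Assumptions: one JSON object per line; the summary entry's verdict
    is stored under a "judgement" key.
    """
    sessions = defaultdict(lambda: {"events": 0, "judgement": None})
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not valid JSON
        if not isinstance(entry, dict):
            continue
        task = entry.get("task_name")
        if not task:
            continue
        session = sessions[task]
        session["events"] += 1
        if entry.get("message") == "Session summary":
            session["judgement"] = entry.get("judgement")
    return dict(sessions)
```

To try it, open the log file and pass the file object straight in, for example `with open("ssh_log.log") as f: print(summarize_sessions(f))`, adjusting the path to wherever your log is written. Sessions judged SUSPICIOUS or MALICIOUS are the ones worth reading in full.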
Conclusion
The DECEIVE program is a cool new tool that uses AI to help you generate honeypots. Although it is not yet ready for production, it shows the potential of using AI to build out your honeypots and make life easier. I encourage you, dear reader, to give it a try yourself, and if you do, leave a comment and let us know what you think.