New hostnames / usernames checked for by malware #227

Open
recvfrom opened this issue Feb 12, 2021 · 4 comments

@recvfrom (Contributor) commented Feb 12, 2021:

New hostnames / usernames that we could add to the known_usernames and known_hostnames checks:

Hostnames checked for by OSTap [1]

VBOX7-PC
JANUSZ-PC
ABBY-PC
DESKTOP-HRW10
AMAZING-LINGON
SANDBOX-O365

Usernames checked for by OSTap [1]

Aimy
fred
Brad

[1] https://twitter.com/GossiTheDog/status/1357019072534355970 (or see https://gist.github.com/kirk-sayre-work/82cdc8f8faba929259bacb8ecea22162)

From ObliqueRAT [2], blocklisted keywords for username and computer name:

15pb
7man2
stella
f4kh9od
willcarter
biluta
ehwalker
hong lee /* Already covered */
joe cage
jonathan
kindsight
malware
peter miller
petermiller
phil
rapit
r0b0t
cuckoo
vm-pc
analyze
roslyn
vince
test
sample
mcafee
vmscan
mallab
abby
elvis
wilbert
joe smith
hanspeter
johnson
placehole
tequila
paggy sue
klone
oliver
stevens
ieuser
virlab
beginer
beginner
markos
semims
gregory
tom-pc
will carter
angelica
eric johns
john ca
lebron james
rats-pc
robot
serena
sofynia
straz
bea-ch

[2] https://blog.talosintelligence.com/2021/02/obliquerat-new-campaign.html
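
For illustration, a minimal sketch of what such a check could look like (assuming the Win32 GetUserNameA/GetComputerNameA APIs; this is only an illustration, not al-khaser's actual known_usernames/known_hostnames implementation):

```cpp
// Illustrative sketch only -- not al-khaser's actual implementation.
// Compares the current username and computer name (case-insensitively)
// against a caller-supplied blocklist such as the names above.
#include <windows.h>
#include <lmcons.h>      // UNLEN
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

static std::string lower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

bool matches_known_sandbox_name(const std::vector<std::string>& blocklist)
{
    char user[UNLEN + 1] = {};
    DWORD user_len = sizeof(user);
    char host[MAX_COMPUTERNAME_LENGTH + 1] = {};
    DWORD host_len = sizeof(host);

    if (!GetUserNameA(user, &user_len) || !GetComputerNameA(host, &host_len))
        return false;   // treat lookup failure as "no match"

    const std::string u = lower(user);
    const std::string h = lower(host);

    for (const auto& entry : blocklist) {
        const std::string e = lower(entry);
        if (u == e || h == e)
            return true;
    }
    return false;
}
```

Exact, case-insensitive matching is used in this sketch; substring matching would widen coverage but also the false-positive surface discussed below.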

@gsuberland (Collaborator):

These are far too likely to cause false positives for my liking. Checking if "sandbox" is in the computer name is reasonable, though.
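
A minimal sketch of that substring check (again assuming GetComputerNameA; purely illustrative):

```cpp
// Illustrative sketch: flag computer names containing "sandbox" (case-insensitive),
// e.g. the SANDBOX-O365 hostname listed above.
#include <windows.h>
#include <algorithm>
#include <cctype>
#include <string>

bool computer_name_contains_sandbox()
{
    char host[MAX_COMPUTERNAME_LENGTH + 1] = {};
    DWORD len = sizeof(host);
    if (!GetComputerNameA(host, &len))
        return false;

    std::string h(host);
    std::transform(h.begin(), h.end(), h.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return h.find("sandbox") != std::string::npos;
}
```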

@recvfrom (Contributor, Author) commented Mar 8, 2021:

Assuming most people use al-khaser to assess ways in which their malware analysis environment might be susceptible to detection, it seems like it'd be useful for them to be made aware if their environment is using one of these generic user names or host names. It's likely that most or all of these hostnames/usernames are ones that the malware author observed and determined to likely be sandbox-related, and assuming that's correct, it would be especially useful for those sandbox maintainers to be alerted to this if they happened to run al-khaser.

For example, ABBY-PC is likely related to this finding from "Empirical Study to Fingerprint Public Malware Analysis Services" (https://robotica.unileon.es/vmo/pubs/cisis2017.pdf):

[screenshot from the paper omitted]

@gsuberland (Collaborator):

I think I'd be more inclined to agree if we had strong empirical data on the matter. I'm unconvinced that the research paper offers any real value in that regard:

  • It's 4 years old, so if "abby" was a popular username for certain sandboxes then there's every likelihood that the publication of this fact has caused sandboxes to change their usernames.
  • They say they sent 1500 samples to the services, but don't show what the submission distribution was. Did they send more samples to the 26 services that started producing results? More to the 44 that didn't respond? Equal numbers to all 70?
  • They discarded some responses since they had "no value" for fingerprinting, but they haven't shown their criteria. They also don't say how many of the results were discarded, or whether their count of 7680 responses is before or after discarding.
  • There's a part where they say they "normalised responses to have an even response distribution across PMAP". The exact meaning of this is unclear, even from context. Perhaps they mean that they population-skewed the distribution statistics by the reciprocal of the response count. However, this requires careful analysis to ensure that single-response outliers don't end up being excessively weighted, and they have not provided details on their process.
  • They make considerations for whether or not the service is a "metaservice" (i.e. a service that calls out to other services) but don't make any attempt to deduplicate numbers for known crossover between service responses based on published information. They make an arbitrary judgement that any service with a ratio over 5 is probably a metaservice, but then do nothing concrete with that information - they just colour the graph bar for the ratios. I'd complain that their heuristic treats any service that runs the sample on a bunch of slightly different Windows machines for comparison as a metaservice, even though it clearly isn't, but that's not really a problem since they don't actually do anything with that other than speculate.
  • Further to the previous issue, they make no attempt to identify different services that provide the exact same results because they are just different names for the same engine.
  • There's no analysis of variance or homogeneity of replies based on repeated submissions to the same service, nor is there any similar analysis for multiple responses coming back from the same service.
  • The graph of response ratios is confusing. They sent 1500 samples and got 7680 responses. The sum of the bar values should either be 7680 if they're showing exact response counts, 1500 if they've normalised it down to proportion of samples sent, 5.12 if they normalised to the amplification factor (7680/1500), or some round number like 100 if they're using percent. The sum of the bar values does not match up with any of these numbers. This further calls the "ratio of 5" heuristic for possible metaservices into question.
  • Displaying the distribution data as a pie chart hides any population effects. Ideally they'd have included a table for each metric showing the response distribution from each service, or at the very least a stacked bar graph with one bar per service and response counts on the vertical axis.
  • Critically, it appears impossible that they have sufficiently decorrelated the artifact distributions vs. the population of service responses, since any analysis of duplicate data is missing. In the case of artifact amplification due to common backends, or excessively weighted outliers, a particular artifact may appear to have greater significance than it does in reality.
  • No raw data, no detailed methods, no analysis scripts, no probe sample source code.

TL;DR - the paper is massively flawed.

I agree on your point about al-khaser's use-case for assessing malware analysis environments, but I'm not sure that we want to include a particularly weak and false-positive heavy check just because one RAT happened to use it once, and especially not when it appears to have been based on one single artifact distribution from a not-very-good paper.

We also have to consider temporality in this situation - anything that becomes widely known to be a sandbox username is likely to be changed within a matter of months, leading to a constant expansion of the "known" sandbox names until the defenders end up randomising usernames or the heuristic becomes so bloated that its false positive rate outweighs its true positive rate.

All that said, I'm absolutely not against a more generic set of "soft heuristics" that might potentially indicate an analysis environment. From a UX perspective it'd have to be clearly labelled as something that is only weakly correlated and liable to cause false positives.
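
A hypothetical sketch of how such a weakly-correlated result could be reported separately from definitive detections (the names and output format here are illustrative, not an existing al-khaser interface):

```cpp
// Hypothetical sketch of separating weak "soft heuristic" indicators from
// definitive detections in the output; not an existing al-khaser interface.
#include <cstdio>

enum class Strength { Definitive, Soft };

void report(const char* description, bool triggered, Strength strength)
{
    if (!triggered)
        return;
    if (strength == Strength::Soft)
        std::printf("[SOFT]     %s (weak indicator, prone to false positives)\n", description);
    else
        std::printf("[DETECTED] %s\n", description);
}

// Example: report("Username matches a published sandbox name list", matched, Strength::Soft);
```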

@recvfrom (Contributor, Author) commented Mar 8, 2021:

Great points regarding the paper. This OSTap behavior is relatively new AFAIK, so it seems plausible to me that a sandbox still uses 'abby' as the username. Assuming that's true, and given that this username's association with sandboxes has been public for 4 years, it may also be an indication that whoever manages that sandbox is unlikely to run al-khaser there to learn of the deficiency even if the functionality were implemented. x_x

Regarding the arms race, I go back and forth on it... Adding this type of information in a centralized place like al-khaser does mean that attackers have to do less work to find ways of detecting sandboxes, so more actors might start leveraging these techniques. On the other hand, it is difficult for defenders to track all the one-off techniques that individual actors are using, and aggregating information that has already been published makes it easier for defenders to identify deficiencies in their environments. Regardless, I agree that it has the potential to speed up the "churn" of sandbox usernames. I think defenders have a decent move, though - use a username like 'Administrator', or randomly cycle through a list of extremely common usernames, so that malware would become much less reliable if it chose not to infect machines based on that heuristic.

I like the "soft heuristics" idea - that's probably a good way to balance any concerns.
