regex - Remove Duplicate Yara Rules with PowerShell Regular Expressions




Yara rules are used to look for malware by applying regular expressions to files to find specific patterns in the binary. I maintain a lot of Yara rules in one text file. When I get new rules, I paste them at the end of the text file. I'm trying to write a PowerShell 2.0 script to parse through the Yara rules and identify and remove any duplicate entries.

Here's the format of a Yara rule:

rule [name] { [contents] }

Here is an example rule:

    rule crowdstrike_csit_14004_02 : loader backdoor bouncer
    {
        meta:
            description = "deep panda compiled asp.net <http://asp.net> webshell"
            last_modified = "2014-04-25"
            version = "1.0"
            study = "csit-14004"
            in_the_wild = true
            copyright = "crowdstrike, inc"
            actor = "deep panda"
        strings:
            $cookie = "zwiz\x00" wide
            $cp = "es-dn" wide
            $enum_fs1 = "file system: {0}" wide
            $enum_fs2 = "available: {0} bytes" wide
            $enum_fs3 = "total space: {0} bytes" wide
            $enum_fs4 = "total size: {0} bytes" wide
        condition:
            ($cookie and $cp) or all of ($enum*)
    }

What I want to do is remove duplicates based on the rule name. However, I'd like to have the option of choosing which rule to remove if two rules have the same name but different contents.

To accomplish this, I was planning on creating an associative array with the name of the rule as the key and the content of the rule as the value. I want to parse the rules with regular expressions and add them to the associative array as I go. If a rule (key) is already in the array, I either skip the rule (if the contents are the same) or display both rules and choose which one to keep (if the contents are not the same).

After going through all the rules, the associative array is written out to the file and the duplicates are eliminated.

Update: it works now. Here is the script:

    # Display proper usage and exit if no file is given
    if ($args.Length -ne 1) {
        Write-Host "`nUsage: .\yara-dedupe.ps1 [full-path-to-yara-rules]"
        exit
    }

    # Display info and warning
    Write-Host "`nNote: use the full path of the rule file`n"
    $y = Read-Host "This script is experimental and will modify $args. Backing up the file is recommended. If you still want to continue, enter (y)"

    # Exit if y is not entered
    if ($y -ne 'y') { exit }

    # File path passed on the command line
    $filepath = $args[0]

    # Read in the entire file as one string for multi-line matching
    $file = [IO.File]::ReadAllText($filepath)

    # Regular expression to separate the rule name from the rule content.
    # The trailing \r means each closing brace must be followed by a carriage
    # return, so the file is expected to use CRLF line endings.
    $pattern = "(?smi)rule(.*?)\{(.*?condition:.*?)\}\r"

    # Matching rules parsed according to the regular expression
    $parsedrules = $file | Select-String $pattern -AllMatches

    # Hash table (associative array) to store the rules
    $rules = @{}

    # Add each non-duplicated rule to the hash table
    $parsedrules.Matches | foreach {
        # Extract the rule name
        $rule = $_.Groups[1].Value.Trim()
        # Extract the rule content
        $content = $_.Groups[2].Value.Trim()

        # Check if the rule is already in the hash table
        if ($rules.ContainsKey($rule)) {
            Write-Host "Rule exists: $rule"
            # If it is, check if the content is identical and skip the duplicate if so
            if ($rules.$rule -eq $content) {
                Write-Host "Skipping duplicate..."
            }
            # If not, choose which one to keep
            else {
                # Display the current rule content
                Write-Host "`nCurrent rule content[1]:" $rules.$rule
                # Display the new rule content
                Write-Host "`nNew rule content[2]: $content`n`n"
                # Ask the user which rule content to keep
                $choice = Read-Host 'Enter 1 to keep the existing rule content, 2 to overwrite the rule content with the new rule content'
                # If the choice is 1, continue to the next rule
                if ($choice -eq "1") {
                    Write-Host "`nKeeping original content`n"
                }
                # Otherwise overwrite the existing rule content with the new rule content
                else {
                    $rules.Set_Item($rule, $content)
                    Write-Host "`nRule updated!`n"
                }
            }
        # Add the rule if it is not in the hash table
        } else {
            $rules.$rule = $content
            Write-Host "Rule added: $rule"
        }
    }

    # Erase the current file
    Clear-Content $filepath

    # Output the hash table to the file
    $rules.GetEnumerator() | ForEach-Object {
        Add-Content $filepath "rule $($_.Key) {`n $($_.Value) `n}"
    }

    Write-Host "De-duplication complete. New rules located at $filepath"

You can do this; just use multiline and dot-all modes. It's tough because {} can appear inside a rule, but if you stick to a set of delimiter constraints, it should work OK.

Capture group 1 is the rule name and group 2 is the rule body; the pattern also trims surrounding whitespace.

    # (?s)^rule\s+([^{}]+?)\s*\{\s*(.+?)\s*\}$

    (?s)                 # Dot-all modifier (put '(?sm)' here) if the engine supports it;
                         # otherwise, set them in the flags option of the regex object.
    ^                    # Open delimiter = BOL + 'rule' + name + '{'
    rule \s+
    ( [^{}]+? )          # (1), rule name
    \s*
    \{                   # '{'
    \s*
    ( .+? )              # (2), rule, ungreedy
    \s*
    \}                   # '}'
    $                    # Close delimiter = '}' + EOL

Output:

    ** Grp 1 - ( pos 5 , len 51 )
    crowdstrike_csit_14004_02 : loader backdoor bouncer

    ** Grp 2 - ( pos 61 , len 552 )
    meta:
        description = "deep panda compiled asp.net <http://asp.net> webshell"
        last_modified = "2014-04-25"
        version = "1.0"
        study = "csit-14004"
        in_the_wild = true
        copyright = "crowdstrike, inc"
        actor = "deep panda"
    strings:
        $cookie = "zwiz\x00" wide
        $cp = "es-dn" wide
        $enum_fs1 = "file system: {0}" wide
        $enum_fs2 = "available: {0} bytes" wide
        $enum_fs3 = "total space: {0} bytes" wide
        $enum_fs4 = "total size: {0} bytes" wide
    condition:
        ($cookie and $cp) or all of ($enum*)
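The answer gives the pattern but not how to run it from PowerShell, so here is a minimal sketch of one way it might be applied. The sample rule text and the variable names (`$lines`, `$text`, `$found`, `$rules`) are hypothetical and not taken from the original post. One assumption is worth calling out: with the multiline flag, .NET's `$` matches only before a line feed, so the sketch adds `\r?` before `$` to tolerate CRLF line endings. That is a small change to the answer's pattern, not part of it.

```powershell
# Hypothetical sample input (not from the original post): the same rule pasted twice.
$lines = @(
    'rule demo_rule : loader',
    '{',
    '    strings:',
    '        $fs = "file system: {0}" wide',
    '    condition:',
    '        $fs',
    '}',
    'rule demo_rule : loader',
    '{',
    '    strings:',
    '        $fs = "file system: {0}" wide',
    '    condition:',
    '        $fs',
    '}'
)
# Join with CRLF to mimic a rule file saved on Windows.
$text = $lines -join "`r`n"

# The answer's pattern with inline (?sm) flags. \r? is an added assumption so
# that the closing brace still matches when lines end in CRLF.
$pattern = '(?sm)^rule\s+([^{}]+?)\s*\{\s*(.+?)\s*\}\r?$'

# Collect every rule in the text.
$found = [regex]::Matches($text, $pattern)

# Keep the first body seen for each rule name (group 1 = name, group 2 = body).
$rules = @{}
foreach ($m in $found) {
    $name = $m.Groups[1].Value
    $body = $m.Groups[2].Value
    if (-not $rules.ContainsKey($name)) {
        $rules[$name] = $body
    }
}
```

Because the close delimiter is a `}` at the end of a line, a rule body that itself has a line ending in `}` (for example a hex string) would probably be cut short by this pattern, so it is worth checking the match count against the number of rules you expect.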

