OSINT Data Sources: Trust but Verify

Thanks to @seamustuohy and @ginsberg5150 for editorial contributions

For new readers, welcome, and please take a moment to read a brief message From the Author.  This article’s primary audience is analysts; however, if you are in leadership and seek to optimize or maximize the analysis your threat intelligence program is producing, you may find this walkthrough valuable.


In the recent blog Outlining a Threat Intel Program, steps 4 and 5 discussed identifying information needs/requirements and resources.  This blog expands on the latter, identifying the resources.  Previously we discussed that when identifying resources, you start with what you have inside your organization, and that sometimes the information is in a source to which you do not have access or is sold by a vendor.  There is another, obvious category not previously mentioned: open source, freely available (usually on the Internet).  Free is good. Free is affordable. Free can be dangerous.  Before you decide to trust/use a free source, you need to adequately vet that source.  Below I will talk about a recent project, a free source that was considered, how it was vetted, and the takeaways, which are: 1) ensure accuracy and completeness of OSINT sources before they are considered reliable or relevant; 2) know the limitations of your OSINT-sourced data; and 3) thoroughly understand any filtering and calculations that occur before source data is provided to you.


Raw data on GitHub here; file name: 20170719_OSINT_Data_Sources_Trust_but_Verify_Dyn_Original_Data.7z

Over the years I have worked on a few projects with ties to Ukraine.  During a conversation with colleagues, we came up with a question we wanted to answer:  Is there any early warning indicator of a cyb3r attack on a power grid that might be found in network outages?  We reasoned that even though businesses, data centers, and major communications companies have very robust UPS/generators available, an extended power outage would surely tap them and thereby cripple communications, which could have a ripple effect during an armed conflict, biological epidemic, civil unrest, etc.  So, I decided to see what data was out there with respect to network outages and then collect data related to power outages.  I settled on my first source, Dyn’s Internet Outage bulletins from here (filtered on Ukraine), published by an organization that, according to its site, “helps companies monitor, control, and optimize online infrastructure for an exceptional end-user experience. Through … unrivaled, objective intelligence into Internet conditions… “

The effort to script this versus the time it would take to copy/paste the few pages of data I needed to retrieve led me to opt to do this old school and copy the data straight off the web pages.  If this is something that I find I will do more than two or three times, then I will consider asking for API access and automating this task. However, for the time being, this was a one-off-I-am-curious-and-bored investigation & manual labor settled it.  I scraped the approximately 480 entries available at the time, going back to the first entries in March 2012 (there are more now of course).

I used a mix of regex-fu & normal copy/paste to morph the data into semicolon delimited lines of data.  I chose semicolons because there were already commas, spaces, and dashes in the data that I was not sure I wanted to touch and since I had less than 2 million rows of data and I love MS Office, I opted to use semicolons so that I could manipulate it in MS Excel more easily.

Below is an example of a typical entry:

6 networks were restored in the Ukraine starting at 16:03 UTC on April 11. This represents less than 1% of the routed networks in the country.

100% of the networks in this event reached the Internet through: UARNet (AS3255).

Let the metamorphosis begin:

replace -> networks were restored in the Ukraine starting at ->

6;16:03 UTC on April 11. This represents less than 1% of the routed networks in the country.

100% of the networks in this event reached the Internet through: UARNet (AS3255).

delete ->  on

6;16:03 UTC April 11. This represents less than 1% of the routed networks in the country.

100% of the networks in this event reached the Internet through: UARNet (AS3255).

replace -> . This represents -> YYYY; [I saved each year’s data as separate text files to make this search/replace easier later so that I can do one year per file]

6;16:03 UTC April 11 YYYY; less than 1% of the routed networks in the country.

100% of the networks in this event reached the Internet through: UARNet (AS3255).

replace -> less than -> < (the less-than symbol)

6;16:03 UTC April 11 YYYY; < 1% of the routed networks in the country.

100% of the networks in this event reached the Internet through: UARNet (AS3255).

replace ->  of the routed networks in the country. (including the newline characters) -> ;

6;16:03 UTC April 11 YYYY; < 1% ; 100% of the networks in this event reached the Internet through: UARNet (AS3255).

replace ->  of the networks in this event reached the Internet through:  -> ;

6;16:03 UTC April 11 YYYY; < 1% ; 100;UARNet (AS3255).

replace -> (AS -> ;AS

6;16:03 UTC April 11 YYYY; < 1% ; 100;UARNet;3255).

delete -> ).

6;16:03 UTC April 11 YYYY; < 1% ; 100;UARNet;3255


At this point, each of the individual original files contains mostly lines that look like above with the added caveat that I replaced YYYY with the appropriate year for each file.
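For what it’s worth, the whole chain of replacements could be scripted.  Below is a minimal Python sketch of the steps above, assuming entries follow the “restored” wording of the example; the other wordings discussed next would need their own patterns, and the function name is my own:

```python
def parse_entry(entry: str, year: str) -> str:
    """Apply the search/replace steps above to one Dyn bulletin entry,
    yielding a semicolon-delimited line:
    count;time;percent of routed networks;percent reached;AS name;ASN
    """
    line = entry.replace(" networks were restored in the Ukraine starting at ", ";")
    line = line.replace(" on ", " ")                       # drop the word 'on'
    line = line.replace(". This represents", f" {year};")  # stamp in the year
    line = line.replace("less than ", "< ")
    line = line.replace(" of the routed networks in the country.\n", ";")
    line = line.replace("% of the networks in this event reached the Internet through: ", ";")
    line = line.replace(" (AS", ";")
    return line.replace(").", "")

entry = ("6 networks were restored in the Ukraine starting at 16:03 UTC on April 11. "
         "This represents less than 1% of the routed networks in the country.\n"
         "100% of the networks in this event reached the Internet through: UARNet (AS3255).")
print(parse_entry(entry, "2012"))  # 6;16:03 UTC April 11 2012; < 1%;100;UARNet;3255
```

As noted earlier, for a one-off of ~480 entries the manual approach was faster; this only becomes worth it if the task repeats.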

And like any good analyst I began to scroooooooolllllllllll through over 400 lines hoping my eyes would not catch anything that wasn’t uniform.  Of course, I wasn’t that lucky, and instead I found entries that were ALMOST worded the same, but just different enough to make me wonder if this “reporting” was automated or if it was human.

  • 59 networks experienced an outage…
  • 30 networks experienced a momentary outage…
  • 6 networks were restored…

Do you see it?  I now had three kinds of entries: “restored,” which implies that there was an outage, plus “outage” and “momentary outage.”  I performed a few more rounds of semicolon replacement so that I could keep moving and noted the issues.


I noted that I had inconsistencies in the entries, but I wasn’t exactly ready to write off the source.  After all, they are a large, global, reputable firm and I believed that there could be some reasonable explanation, and continued to prepare the data for analysis.

The statement “This represents less than 1%” isn’t an exact number, so I reduced it to “;<1%” to get it into my spreadsheet for now. I wanted to perform some analysis on the trends related to the impact of these outages and planned to use the ‘percentage of networks affected’ value.  To do this, I was going to need a number, and converting them seemed like it should be relatively easy.  I considered that new networks are probably added regularly, although routable networks not very often, and decided I should try to estimate the number of networks that existed just prior to any reported outage/restoration based on a known number.  So, I turned to the statements that were more finite.  Unfortunately, to my surprise, there were some (more) serious discrepancies.  For example, on 5/8/2012 there were two entries, one for 14:18 UTC and the other for 11:39 UTC.  The first stated that 97 networks were affected, which represented 1% of the routed networks in the country (not less than 1%, exactly 1%), and math says 97/.01 = 9,700 networks. The second entry stated that 345 networks were affected, which was 5% of the routed networks, and math says that 345/.05 = 6,900 networks.  How could this be?  In less than three hours, did someone stand up 2,800 routable networks in Ukraine?  I don’t think so.  I performed the same calculations on other finite numbers for very close entries, most of them on the same day, and found the same problem in the data.  This is no small discrepancy, and I can only assume that the percentages being reflected on the web page have undergone some serious rounding or truncation.  I therefore decided to remove this column from the data set and not perform any analysis related to it.  Instead I kept the value that, for the time being, I trusted: the exact number of networks affected.
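The sanity check behind that conclusion is simple division, using the numbers from the two 5/8/2012 entries:

```python
# If the reported percentages were exact, both entries from 5/8/2012 should
# imply roughly the same count of routed networks in the country.
implied_total_1 = round(97 / 0.01)   # 97 networks reported as exactly 1%
implied_total_2 = round(345 / 0.05)  # 345 networks reported as 5%
print(implied_total_1, implied_total_2)  # 9700 6900 -- a 2,800-network gap
```

Two measurements a few hours apart should not disagree by nearly 30% on the size of the country’s routed-network population, which is why the percentage column had to go.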

In the data set, there were 10 entries that did not list a percentage of networks able to reach the Internet either during the outage or after the restoration (the range of networks affected was from 7 to 105).  These 10 entries also lacked data for the affected ASN(s), so to avoid skewing the analysis, they were removed from the data set.  There was one entry of 23 affected networks on 2/17/2014 where the affected ASN was not provided but all other information was available, and a value of UNK (unknown) was entered for it.

Additional name standardization was completed where the AS number (ASN) was the same but the AS names varied slightly, such as LLC FTICOM vs. LLC “FTICOM”.
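A hypothetical helper for that cleanup might look like the following; the rule set here is only what the FTICOM example needs, and real data would likely require more rules:

```python
import re

def normalize_as_name(name: str) -> str:
    """Collapse cosmetic variants of the same AS name so that
    'LLC "FTICOM"' and 'LLC FTICOM' map to one canonical form."""
    name = name.replace('"', "").replace("\u201c", "").replace("\u201d", "")
    return re.sub(r"\s+", " ", name).strip()  # squeeze repeated whitespace

print(normalize_as_name('LLC "FTICOM"'))  # LLC FTICOM
```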

The final data set retained 97.8947% of the original entries, with data from March 22, 2012 16:55 UTC to July 5, 2017 21:40 UTC.
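That retained percentage and the 10 removed entries are mutually consistent; working backward (a quick check of my own, assuming the 10 removals are the only exclusions):

```python
# The fraction removed implies the original entry count:
# removed / original = 1 - retained_fraction
removed = 10
retained_fraction = 0.978947
original_size = removed / (1 - retained_fraction)
print(round(original_size))  # ~475, in line with the "approximately 480" scraped
```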

Let’s review the list of concerns/modifications with the data set:

  1.  inconsistent reporting language
  2.  lack of corresponding outage and restoration entries
  3.  inconsistent calculations for the percentage of routable networks affected
  4.  incomplete entries in the data set
  5.  AS names varied while the AS number remained the same


First, I started with the low hanging fruit.

  • Smallest number of networks affected: 6
  • Largest number of networks for one ASN affected: 651 (happened on 5/18/2013)
  • Average number of networks affected: 74.93534483
  • How many times were more than one ASN affected? 253, which is slightly more than half of the total events.
  • Were there any outages of the same size for the same provider that were recurring? YES
    • ASN 31148, Freenet (O3), had three identical outages of 332 networks 10/2/2014, 1/8/2015, 11/25/2015
    • ASN 31148, Freenet (O3), had four identical outages of 331 networks 8/11/2014, 10/6/2014, 11/5/2015, & 8/3/2016

At a glance, the low hanging fruit seems to match what was expected, but I am still not sold that this data set is all that it is cracked up to be, so I decided to check for expected knowns.

Ukraine’s power grid was hit on Dec 23, 2015, with power outage windows of 1-6 hours; however, the only network outages reflected in the data for Dec 2015 are on 1, 2, 8, 27, 30 & 31 December. The next known major outage was more recent, just before midnight on 17 December 2016, and again, there is not a single network outage recorded in the data set.  This puzzled me, so I began to look for other network outages that one would reasonably expect to occur.  In India in 2012, there were outages on 30 & 31 July; according to The Guardian, “Power cuts plunge 20 of India’s 28 states into darkness as energy suppliers fail to meet growing demand,” and if you follow the article, one of many, you get reports that power was out up to 14 hours.  [I encourage readers to check sources, so you can find 41 citations on this wiki page.]  Then I went back to the data source, filtered on India, and magically there were no reported network outages that paralleled the power outages.  Then I decided to start spot checking USA outages listed here, because this list has the following criteria:

  1.     The outage must not be planned by the service provider.
  2.     The outage must affect at least 1,000 people and last at least one hour.
  3.     There must be at least 1,000,000 person-hours of disruption.

And sadly, I chose three different 2017 power outage events, none of which seemed to be reflected in the data set of network outages.  It was time to decide if the data set I wanted to use was going to meet my needs.  After reviewing what I had learned from normalizing and analyzing the data, I summarized the concerns:

  1.  inconsistent reporting language
  2.  lack of corresponding outage and restoration entries
  3.  inconsistent calculations for the percentage of routable networks affected
  4.  incomplete entries in the data set
  5.  AS names varied while the AS number remained the same
  6. known events aren’t found in the data set

All things considered, I opted not to use this OSINT data source for my analysis as there were too many discrepancies, especially the lack of “known” events being reflected.


Because this was a sudo personal endeavor that only paralleled some of the work I have previously done, I took a very loose approach to the analysis, and I did not verify that the data set I was considering met my needs regarding completeness & accuracy.  I did, however, assume that because the data was being made public by a large, reputable, global company, they themselves were exerting some level of quality control in the reporting and posting of the data.  This assumption, of course, is flawed.  Currently, I am unable to come up with an explanation for so many data points being absent from the data set.  It is possible that this major global Internet backbone company simply has no sensors in the affected 71% of India, or that they are only in the 29% that wasn’t affected.  Perhaps all their sensors in Ukraine were down on the exact days of the power outages there.  It is also very possible that there were enough UPSs and generators in place in Ukraine and India that, despite power going out, the networks never went down.  At this point, I’ve decided to search for answers elsewhere and am not spending time trying to figure out where the data is, as this was just a personal project.


Data sources need to be vetted for accuracy and completeness before they are considered reliable or relevant.  When considering a source, check it for known events that should be reflected.  If you do not find these events and you wish to use the source, ensure you understand WHY the events are not being reflected.  Knowing the limitations of your OSINT-sourced data is critical, and thoroughly understanding any filtering and calculations that occur before it is provided to you is just as vital to performing successful analysis.  This kind of verification and validation should also be repeated periodically if a source remains in use for OSINT collection.  Just because a data source is not appropriate for the kind of analysis you wish to do for one project does not mean it is invalid for other projects.  All OSINT sources have some merit, even if it is only that they are an example of what you DO NOT want.




People Search Sites – Erase Me Please

The good folks over at Divine Intel (Twitter @divineintel) asked to borrow a little space on my blog as they are still getting their website set up. They’ve recently tweeted 21 URLs where you can go to submit requests to have your information removed from the people search sites, and in some cases phone numbers too. The tweets are tagged with #eraseme and #privacy to make them easy to find as well.

They work with over 60 sites to help remove information, and the list below is not exhaustive, but as far as they know it is a complete list of those sites that allow web-based submission of records removal requests.

Some sites only accept requests by mail, such as PeopleLookUp, US Search, & Zaba Search, which can all be reached at the same address. Although one site tells you that you can only fax in your request, another says you can only mail it, and one allows you to fax and/or mail it… ironically, they all share overlapping fax or mailing information. So one letter to this address requesting removal from all three sites should help take care of this one. Be sure to check out their OptOut requirements; since you have to send supporting documentation, you’ll need this form too. Yes, that form is hosted on the Intelius domain.

Privacy Officer / Records Removals
P.O. Box 4145
Bellevue, WA 98009-4145

Here is the consolidated list of URLs that they tweeted out separately earlier this evening.

Be sure to follow their Twitter account for helpful nuggets on deleting your personal information and reducing your digital footprint.

Strategic Threat Intelligence in the Digital Realm

Thank you @Ngree_H0bit and @TXVB for your editorials on this blog.

Imagine if someone walked up to your job and fired an automatic weapon at the building or detonated a bomb in the lobby. Then, when the police showed up, the conversation went like this:

LEO: “Did anyone die or get shot?”
Company: No
LEO: “Is there any damage to the facility that can’t be repaired?”
Company: No
LEO: “Ok, that’s all we wanted to know, our work is done here. You can go back to what you were doing.”
Company: “Wait?! Don’t you care who did this?”
LEO: “No, you’re safe now, the threat is contained”
Company: “Aren’t you going to try to figure out WHY they did this?”
LEO: “No, that’s not important, you’re not in danger anymore.”
Company: “How do you know that?”
LEO: “The threat has been contained, the attacker is gone/dead”
Company: “But what if there are more attackers?”
LEO: “Well, you better install some bullet proof glass, wear a Kevlar vest everyday, and hope for the best.”
(2 weeks later)
Company: “All employees are required to buy a Kevlar vest…”
Company: (to the property manager) “We need an upgrade to the building next year if we’re going to renew our lease….”

We would be absolutely oozing with disgust and screaming at the top of our lungs about how incompetent and dismissive the police were at protecting us if this happened. Yet, in the InfoSec world we do it all the time. Let’s change the conversation slightly:

SCENARIO: Major digital attack against a company

Management: “Was anyone’s data lost?”
Strategic Threat Hunter: “We’re not sure, but it doesn’t look like it.”
Management: “Is there any damage to the computers that can’t be repaired?”
Strategic Threat Hunter: “No.”
Management: “Ok, that’s all we wanted to know, your work is done here. You can go back to what you were doing.”
Strategic Threat Hunter: “Wait?! Don’t you care who did this?”
Management: “No, we’re safe now, the threat is gone.”
Strategic Threat Hunter: “Aren’t you going to try to figure out WHY they did this?”
Management: “No, that’s not important, we’re not in danger anymore.”
Strategic Threat Hunter: “How do you know that?”
Management: “The threat has been contained, the attacker is gone/malware is blocked.”
Strategic Threat Hunter: “But what if there are more attackers in the group?”
Management: “We should improve security, buy/install a new security widget, and hope for the best. Oh and no, you can’t have any more resources to do this.”

Management comes running when something catastrophic happens, yet all they care about is a damage report, the immediate impact. Even when Incident Response teams respond to a major breach, little, if any, time is spent after the event trying to understand why they were targeted or asking any of the questions above. Now don’t get me wrong, I’m not saying NOBODY EVER asks; I’m just saying that more often than not, nobody cares or is asking. Few companies put ANY investment in strategic intelligence efforts that can identify threats. Instead, they sit and wait for the FBI to call to tell them they are about to have (or already have) a problem.

It is this gap that concerns me the most, and it is what the remainder of this blog post will seek to touch on. I dare not say address, as that is a goal I doubt I can achieve fully in one blog post.


The inability to gather strategic intelligence and conduct “target development” in the digital space, at lower echelons of the military or in the civilian sector, is troubling. Nonetheless, it is critical for us to anticipate the adversary and to defend against them, and, frankly, to act offensively or pre-empt their actions.

One of the things we do in the military to prepare is train, train, train – and we train like we fight. If the enemy’s landscape is a desert – train in a desert; if it’s a jungle – train in the tropics; a winter wasteland – train in the arctic, etc. Training like you fight isn’t limited to the environment either; it includes using the tools and weapons available to you in scenarios you may find yourself in. If your enemy might deploy chemical weapons, you might have to wear full chemical protective gear and fire your weapon to save your life. So you put on all that chemical gear, go to the range, and fire your weapon. You train in the environments, scenarios, and gear you may face; you train to all of it. You train to the point that it is a natural reflex, muscle memory, so you don’t even have to think about it. I can’t tell you how many times I responded to “Gas! Gas! Gas!” and ran full speed ahead, weapon in hand, and dove into a fighting position – “just training.”

Then there are the intelligence teams; what intelligence are they gathering to support that ground troop? Ask them to tell you how they leverage Cyber Command to gather strategic intelligence for the warfighter, and I’ll show you a politician doing the Cotton Eye Joe at warp speed. They have no idea, the politicians that is. They’ll tell you that’s the NSA’s job, and they’ll still have no idea. I’m not going to go down a list, but there are other agencies such as DIA, DNI & DHS, to name a few, that also have cyber operations and who, to some degree, all suffer the same gap discussed here. What of the civilian companies that support critical infrastructure or even city, county, and state governments? What about USCERT/DHS & ISACs? After all, isn’t this kind of support *THEIR* job? Ask them and they’ll tell you strategic intelligence is about targeted threats and APTs. *cough-ulzhit* No, I’m not making that up; a C-level executive from county government and US government officials have actually told me that. They have no idea either.

One of my favorite analogies that explains tactical, operational and strategic leadership came from a Stephen Covey presentation on the levels of leadership, first line managers, middle management, and senior leadership. However, it translates well here as first line managers are tactical, middle managers are at the operational level, and senior leadership are at the strategic level. Tactical intelligence tells you how to eliminate the threats in the jungle where you are working. Operational intelligence tells you where you should be in the jungle, and what kinds of threats are in each area of the jungle. Strategic intelligence is when someone yells “we’re in the wrong jungle!” (his presentation was on his book 7 Habits of Highly Effective People)

In the civilian world, our digital intelligence is heavily tactical, it is overwhelmingly focused on how malware executed or the fact that there is an 0-day in a piece of software. Tactical intelligence is important, it has a place, it serves a purpose, but it is focused on winning a battle, not a war. So how do we do this in a digital realm? How do you train to fight there? What does strategic intelligence to support a digital war look like? What does a tactical aggressive vs. strategic covert attack from the enemy look like in a digital war? What does it take to defend against it? What does “guard duty” look like when you’re defending 1’s and 0’s? Surely it isn’t pacing back and forth with an M16 in front of concertina wire if you’re a soldier. It isn’t going to be a roving watch like the border patrol. If you’re a civilian, is it simply sitting in a SOC staring at a dashboard for 12-hours looking for alerts/waiting for alarms? So just what do passive and active digital reconnaissance look like and how are they executed?

Strategic intelligence in support of a physical or digital fight isn’t always in your logs, your dashboards, or anything else digital. Target development, predicting what your enemy would do and what you might need to do to win a fight, will almost certainly involve technology; however, more often than not, it is going to focus on gaining a greater understanding of your enemy as a person, a human being with objectives, who needs resources and has motivations, habits, skills, and weaknesses. It will be less concerned with how the malware executed than with the knowledge required to design the malware to execute in the manner it did. Strategic intelligence would be more focused on derived metadata about the attacker that goes toward profiling skill/expertise/training/origins, etc. Examples of questions to ask: Does the distribution or content indicate a country of origin? Did the execution require specific knowledge about the affected target’s design that indicates insider knowledge? If yes, maybe your attacker is a former or current employee. If no, did it require knowledge of proprietary information? Let’s assume it did, and everyone is trusted/vetted; are you looking at a possible breach or data loss that hasn’t been detected? Again, we are less concerned with the tactical intelligence surrounding being protected and more concerned with strategic intelligence and understanding the person behind the attack/malware.


Next we’re going to get a 30,000-foot view of what strategic intelligence is with respect to the digital world, because understanding what it is sets the foundation for me to explain, in a future blog post, the kind of person(s) needed on your team and why they are critical to winning the war, not just the battles, that we face as a country and as commercial companies.

Typically, InfoSec people hate the word “cyber”; we consider it as profane as most people would consider the F-word. Because we’re going to be discussing intelligence gathering and analysis in this post, I’d rather say DIGINT, a collective term for digital intelligence, instead of CYINT. DIGINT is not its own intelligence domain; rather, it is a component of all the others. If I were to draw a diagram of the intelligence silos, DIGINT would run horizontally across all of them. Blasphemy, you say? Let me ask you this: can you name a part of your life not affected by technology, something digital? Even a stroll in the park without an iPod or cellphone isn’t sacred, as the cell phone and iPod rely on tens of thousands of lines of code and have multiple RF transmitters. Streetlights are powered by electricity, on a grid managed and monitored by technology, programmed to come on at a specific time or use solar power and light detection. Your walk on a beach with no cellphone and no smart watch – I bet you drove a car to get there that had electronic fuel injection, GPS, or a digital radio. Anyway, you get my point…

DIGINT is best defined as the intelligence gathered from digital sources, much like HUMINT is gathered from humans and SIGINT is gathered from “anything that goes through the air,” etc. DIGINT can be found in an open source, in which case it would be digital intelligence from an OSINT source (a book, magazine, the news, the Internet, etc.). In the case of signals, SIGINT, it could be logs or transmission captures. If the source is human, the behavioral data captured in the apps they use and how they use them, the GPS history in their phone, and their social media posts are all digital intelligence sources that can be leveraged for strategic intelligence gathering missions that support and enrich tactical intelligence operations.

So what exactly is Strategic Threat Intelligence, and how is DIGINT factored in? Let us first understand what Tactical Threat Intelligence (TTI) is in the digital world, as most of us will be able to relate to this much more easily. Tactical Threat Intelligence in the digital world is very similar to that in the tangible world. It is sometimes referred to as intelligence developed from, and in support of, incident response, and is easily likened to fighting fires, playing whack-a-mole, smack-a-RAT, bash-a-bot, etc.; you may have even heard the term Indicators of Compromise (IoC). It is the kind of intelligence that supports addressing an immediate threat, one that is right in front of you, either presently attacking/affecting your assets or running rampant in the wild and possibly on your network’s doorstep at any moment. These kinds of threats include malware (viruses, Trojans, RATs, ransomware), DDoS tools/networks, spam, etc. TTI is “current” information that allows you to take an action to prevent or address these impending threats. It is easily recognizable to anyone who’s defended against an attack or been part of a penetration testing team on the offensive.

To understand what Strategic Threat Intelligence (STI) is and how it translates to the digital space, we also need to understand its characteristics. The easiest way to do this is by reviewing what we know about tactical intelligence, thereby identifying what strategic intelligence is NOT. Below are some examples of TTI vs. STI that commercial companies might need, along with the characteristics of each.

Timely != Current

TTI is “current;” that means it is dealing with the here and now, immediate threat. For those of you who have been to a gun range you might call it “the 50-meter target.” STI, on the other hand, is TIMELY, not necessarily current. This means it is actionable and relevant to the timeline of achieving an objective. Timely does not arbitrarily translate into long range. For instance, you might find that a client is opening a new office or manufacturing plant, or perhaps an agreement of some sort is going to be signed in 3-6 months. Timely in this sense would mean identifying digital threats to one of these targets in a timeframe that allowed identification, detections and/or protections to be developed relating to the event. The artifacts of this research would be considered strategic.

A timely piece of STI in one of those scenarios is any significant local cultural, religious, educational, or competitor activity scheduled to occur in the same location around the same time. Also, identifying relatives of key corporate staff or engineers that hold proprietary information and may be targeted for a phishing or social engineering attack could be helpful. Taking that a step further, strategic DIGINT could determine if there is evidence of online activity related to events that can be used to mask a pending attack, for example a distributed denial of service (DDoS). An often-overlooked form of STI is historical activity. In this case, that means answering questions about what “digital challenges” or “cyber threats” [I feel gross just saying that] the client (or your organization) has faced in this region in the past, or in regions with similar economic/cultural composition. None of these would necessarily help you defend or protect against an immediate attack, but they could all be used to prepare (train) for a future attack, identify risks, and identify information & information sources that could be leveraged to give a company the upper hand against a digital attack.

Deep Analysis != Long Range

STI, much like TTI, involves analysis where you collect data, vet the source and content, assign a value to it, interpret it, and convert it into intelligence. A common misconception is that STI is long-range because it requires deep analysis, deep analysis takes a very long time, and thus it is reserved for long-range projects. This is simply not true. Sometimes a raw piece of data itself, given a relevant situation, can be immediately actionable. The term “deep” is relative to the mission/objective. Deep could mean finding out who really owns/runs a company, especially considering that what is on paper often doesn’t reflect real-world dynamics. This deep analysis could take a couple of hours or it could take a couple of weeks.

Another example: you might learn that a company from a global power (US, Russia, UK, Germany, etc.) is planning a joint venture to build critical infrastructure in another country, and this project could have a huge economic impact on the cities involved and the country it is in. If you provide services such as travel, communications, HR, accounting, etc. to this region or to any of the parties involved, or they do business with your customer(s), this might be considered a piece of strategic intelligence. Why? Because this information could help you identify where or what types of threats might emerge to attack the communications, electronic resources, and infrastructure of the parties involved in the deal, thereby also making you a potential target. Just search for “data breach” and you can create a list of your own of companies that were compromised when an attacker pivoted from a subcontractor’s or partner’s network. While learning of this business venture is considered raw data, it has immediate value impacting a strategic objective and can result in an action being taken, such as focusing the next round of data gathering in a new direction or changing what’s being searched for in logs/telemetry data. The list of responses to this kind of intelligence will vary depending on your organization, the service(s) you provide, and your own objectives, among many other things.

Indicator of Attack != Indicator of Compromise

The acronym IOC (or IoC) is something every TTI analyst or researcher is likely familiar with. An Indicator of Compromise (IOC) is developed from analysis of an event that has already occurred, or of malware that has already been discovered. It is a piece of metadata that helps identify a threat hiding in other places where it has not yet been discovered. STI, by contrast, seeks to identify threats on the horizon: an indication of a future attack, better called an Indicator of Attack (IoA). An IoA simply identifies the fact that a threat is developing and an attack is probable.

Let’s consider a physical fight first, with some progressively obvious indicators of a brewing attack. To start, you observe a country suddenly shipping large quantities of equipment, supplies, and troops to an area that is declared a training facility meant to support only a small number of soldiers for a brief time. That might be an indicator that something is developing. Then you observe these activities occurring outside of any scheduled military training, which might further support a theory that something is about to happen. Finally, you notice missiles loaded, armed, and pointed at your location. That is a pretty good indication that an attack is coming. On a smaller scale, if you notice a person snapping pictures, it might be reconnaissance, or he/she could just be a tourist. If you notice the same person at the same place on multiple days, maybe even at approximately the same time, snapping pictures, that is a little more suspicious, and it could be argued that it more likely indicates reconnaissance activity, something that usually happens before an attack.

So how do we identify the suspicious person from a DIGINT perspective? A very simple example of an IoA in the digital realm is port scans on your firewall from an IP address that has never scanned you before. A less obvious IoA would be an IP from a strange subnet that pings, scans, or attempts a connection to just a few ports every 12 hours. Maybe this activity occurs only on Sundays or during hours when nobody is working, and it has been going on for the last six months. Another way you could develop an IoA is from a human intelligence source in a digital space. In the old days you’d be eavesdropping on conversations at a coffee shop, whereas today it could be something learned from hanging out in a chat room or forum. If you found an archive of the forum or chat logs, it could be argued that this is DIGINT. The tactics and techniques of the old days, such as in-person eavesdropping and reconnaissance, aren’t forgotten or antiquated; this is why the paranoid InfoSec person of today won’t talk about a pending attack or sensitive topics online. Either way (online or in person), you might learn of someone discussing the fact that your client is going to have a really bad day once “their friend” is finished with X activity/development/recon, etc. All of these could be considered indications of an attack that will play out in the digital space. Of course, like any other form of threat intelligence, it needs to be reviewed, assessed, and put into context with other pieces of intelligence from other sources to develop a true threat intelligence report.
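The “every 12 hours” scanner described above can be surfaced programmatically. Below is a minimal, hypothetical sketch (the IPs, thresholds, and log format are all invented for illustration, not taken from any real tool) that flags source IPs whose connection attempts recur at a near-constant interval:

```python
from collections import defaultdict
from statistics import mean, pstdev

def find_periodic_scanners(events, min_hits=5, max_jitter_hours=1.0):
    """Flag source IPs whose probes recur at a near-fixed interval,
    a possible low-and-slow IoA. `events` is a list of
    (timestamp_in_hours, src_ip) tuples parsed from firewall logs."""
    by_src = defaultdict(list)
    for ts, src in events:
        by_src[src].append(ts)
    suspects = {}
    for src, times in by_src.items():
        times.sort()
        if len(times) < min_hits:
            continue
        gaps = [b - a for a, b in zip(times, times[1:])]
        # Near-constant gaps (low jitter) suggest automation, not a tourist.
        if pstdev(gaps) <= max_jitter_hours:
            suspects[src] = round(mean(gaps), 1)
    return suspects  # {src_ip: average interval in hours}

# Hypothetical log: one scanner hitting every ~12h, one noisy random IP.
events = [(h, "203.0.113.9") for h in (0, 12.1, 24.0, 36.2, 48.1, 60.0)]
events += [(h, "198.51.100.7") for h in (1, 3, 30, 31, 55)]
print(find_periodic_scanners(events))
```

In a real deployment you would feed this from your firewall or netflow telemetry and tune the jitter threshold; the point is that regularity itself is the indicator.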


Strategic intelligence is essential for long-range success in any war, whether you are fighting it with boots on the ground or in a digital space. It requires investments of time and money, and it requires leadership to insist on deeper understanding. It means we need to spend time thinking like the enemy, doing target development, and figuring out where the next strike could happen, so that we can look through relevant indicators and develop DIGINT related to that target with a new analytic perspective. We should be going back over the history of attacks we’ve endured at our companies as if they were cold cases that never got solved, and we should be looking at them with a new objective: profiling the adversary through his attack. Look at your “crime scene” and ask, what kind of person did this and why?

I encourage you to start pushing your leadership to ask these higher-level questions; insist that you stop simply being victims building yet another, higher wall for the enemy to scale. Start doing some reconnaissance of your own and look for adversaries in their planning stages so you can foil their plot. Catch the bad guys during their recon of your assets and start figuring out what might be on the horizon, so that if you do have to defend, at least you’ll know what you’re up against and when it’s coming. I leave you with this final thought: if you keep doing what you’ve always done, you’ll keep getting what you’ve always got.

Stay tuned for a future blog post on what skill sets to look for in potential strategic intelligence team members.

Beyond Whack-a-Mole “Intel”

In recent days I had some conversations with folks regarding the common INFOSEC understanding of threat intelligence and what it really is, and we kept coming back to a marketing buzz phrase: “actionable intel”. My concern is that the definition of “action” seems to be getting diluted these days; at worst it has morphed into “write a signature to prevent X” or “create some hot new technology that uses artificial intelligence to anticipate ABC and block the attack”. Also, everyone wants to be first to blog about the latest threat to hit the landscape. Researchers spend hours trudging through dashboards, PCAPs, and log files, and retro-hunting with YARA rules, looking for that needle in a mountain of needles sitting inside their grandmother’s sewing bench, hoping they don’t prick themselves wasting time on “unrelated” data or false positives. We’re inundated and consumed with tactical execution. Why? Money, and possibly a case of nearsightedness.

Businesses are consumed by the need to show immediate value (nearsighted), and value is usually measured in the number of bad things blocked. Thus the tactical war against malicious actors saturates every aspect of our information security programs: our hiring for INFOSEC roles, the reports we produce, the metrics we pull our hair out trying to develop, and most of all BUDGET, where we spend our money. We are at constant war; just ask any incident responder, forensic examiner, malware reverse engineer, threat researcher, or SOC analyst. It is an all-out 24/7 war against bad guys, and what do you need to win a war besides soldiers, beans, and bullets? Strategy.

Strategic operations are nothing new to any military organization. Nor is strategy new to any successful CEO trying to position his company to gain a competitive advantage in market share, but strategic planning and execution seem as foreign to an INFOSEC threat intelligence team as a fully nude woman standing in the flesh in front of a virgin. The concepts of profiling, understanding, and anticipating your enemy, so that you can not only win battles but win the war, are something I find few people grasp. Make no mistake, I am not saying that the tactical activities mentioned above are without merit; they are 100% critical and vital to protecting assets both tangible and intangible, and even lives. What I am saying is that organizations that have reached a maturity level where they squash malware and phishing attacks with near-surgical precision should be looking to take things to the next level.

I tweeted recently something to the effect that the words “new malware” have a literal Pavlovian effect on threat researchers. Everyone gets excited about the shiny new malware; we all want to rip it apart, see how it works, hopefully find flaws in it, blog about it, and HOPEFULLY share indicators of compromise (IOCs) with the whole world to make the Internet a safer place. (Side rant: if you blog about threats and don’t share IOCs and actionable intel, IMHO you are a douche nozzle being used for an enema.) Back to the topic at hand… We want to tell everyone how the malware did its backflip, blindfolded, across hot coals and broken glass, shat a peanut that turned into a malware tree, that bloomed ransomware buds, whose pollen poisoned the threat landscape, and that’s how we got money to grow on trees. Okay, not exactly, but close enough. But then what? Then we all go back to looking for the next shiny piece of malware, cuz we can never have too many in our collection, right? Well, this all falls into tactical operations, a very instrumental element of protecting and defending our orgs and current customers, not to mention attracting new ones. The race is to be the one that finds it first, blogs first, and makes current and potential customers feel safer; basically, whack the mole the fastest and most accurately. Heaven forbid another organization blogs about some new major threat and you didn’t; your org is destined to get a tsunami of “are we protected” inquiries. And of course, that’s what the business is worried about: happy customers who feel safe, because that’s what pays the bills. So I ask again, but then what?

In all of this, after the hours spent finding it, ripping it apart, and figuring out which IP or domain it came from so you can write a signature, blacklist it, and block it, what have you learned about your enemy? Better yet, what have you converted from an observation into codified knowledge that can be used later, that is not an IOC? What do you know about their objectives, short and long term? What do you know about their resource needs, infrastructure, or motivations (are they political or financial)? Trying to teach strategic threat research in one blog post is insane, so I’ll try to give an example via an imaginary conversation.

Do you know or understand why *THAT* malware was used against *THAT* organization? NO

What about that domain, have you run down the registrant to see what other domains he/she owns and if there’s any other malware associated with them? YES

Oh really! Well do you know if it’s the same kind of malware? It’s not

It’s not? Well, bad actors are kind of like serial killers: they usually have a modus operandi (method of operation), aka M.O., a habit that they rarely deviate from. So why did your actor change his M.O.? I don’t know

OK, go figure out whether something caused your actor to change his M.O., or whether this indicates multiple actors sharing the same registration information.

Is it on a dedicated/shared IP? SHARED, on an ISP that only owns 200 IPs and only hosts 100 domains, and they’re behind bulletproof hosting

Do you have enough information based on victims to build a potential target profile so we can figure out where/who they might attack next? NO

What vertical was that attack against? Transportation

What org? a trucking company

What geographic region? Timbuktu

Are there any key political figures headed to that region? sporting events coming up? tourist or entertainment events planned in the near future? Yes

Really, hmm what other resources are needed to support (X from previous question)? Catering? Power? Decorating? Air Travel? Yes

Now, the scenario above is completely made up, and there is an entire line of questions that could follow. In fact, changing the answer to any one question changes the next round of questions. Nonetheless, I think you get my point. And if you really do get my point, then you’ll understand why a massive “threat intelligence feed” from a company is practically useless. You’re better off ingesting a black/whitelist from some trusted source with the understanding that you may have false positives, but you’d rather be secure and inconvenienced. It is time the INFOSEC community took threat intelligence to a new level and started looking past the shiny new malware to actually try to understand attackers.

It kind of reminds me of the sci-fi movie I watched this weekend (I won’t name it because I don’t want to get sued). Basically, our planet had been attacked in the past and we defeated the enemy. The humans then studied the technology the aliens left behind and used it to advance the human race and unite the world. Years later, another alien shows up; without hesitation we blast it out of the sky. Then a bigger alien shows up and threatens the planet again. However, a group of scientists takes the time to study and understand the first alien that showed up those years later. They discover the motivation behind that alien, learn from their observations, and, if they apply the knowledge correctly, can ultimately defeat the massive alien force that now threatens them.

The key here is that they took time to study. Let me type that out a little more slowly: “T H E Y  T O O K  **T I M E**  T O  S T U D Y”. Of course it was after they whacked the mole, but they did do the deeper investigation. This is where we all need to be headed. After we’ve honed our skills at quickly finding and annihilating the immediate threat, let’s start adding a new function to our INFOSEC portfolios: teams that do strategic analysis, profile the enemy, and develop threat intelligence that allows us to take proactive measures to prevent attacks, or at the very least identify behaviors that indicate a larger (measured by impact, not volume) threat on the battlefield.


BTW, business people: please pick your faces up off the floor. I know, I just said we need to invest time and money into something that has long-term payouts and not immediate ones. Let me know if you need me to cover the co-pay for your hospital visit.

As always, thanks for reading and supporting.

How’d They Know $PrivateDetails ?


Today a friend and colleague of mine shared that he got a really, really good Gmail login phish purporting to come from his homeowners association president. Immediately my brain spun up, because this is my friend, and I asked some critical questions.

1) How did the phisher know who the HOA President was?
2) How did they get that individual’s email?
3) How did they know my friend was in that specific HOA?
4) How did they know my friend’s personal email?

Of course the list of questions could go on and on, but the plot thickens: the email was sent to his Gmail address attempting to get his Gmail creds, yet he does not use his Gmail account to converse with the HOA president, nor does he remember EVER using his Gmail to contact him.

And there we have it, a spearphish executed on a non-work resource.


Now, my friend is in information security, so naturally he avoided the compromise, discarded the email, and is going to take the necessary follow-up steps. But do you know what those steps are?

So YAY, he’s not a victim, but what’s next? Discard the email, of course; however, the train doesn’t stop there, and it shouldn’t. This kind of incident, although it involved a private email address that was (most likely) accessed from a home computer, still needs to be reported to your information security department, **making sure the information makes it all the way to your SOC and threat intelligence teams**. After all, this was a spearphish, not a generic blast-all phish hoping for a random victim. CONGRATS, YOU ARE A WALKING PIECE OF INTELLIGENCE! Finally, a “reminder to be vigilant” with details on the spearphish should go out to key leaders. Why? Well, let’s play the what-if game.


What if….my friend’s wife had gotten the email on a shared family computer, fallen victim to it, and a keylogger (or other malware) had been installed? Someone went to great lengths to find out all this information about my friend; they obviously don’t mind investing time into a target.

What if….another non-security-savvy key leader, who reuses work passwords at home (cuz that never happens), had gotten a similar email at a personal address and fallen victim to it? Getting the “be vigilant” reminder might have him pressing the ZOMG button and in turn reporting that he or she got something like that as well.

What if….the email appeared to be from your children’s school, regarding grades/bad behavior/parent event/free ice cream etc. and it was sent to the child?

What if….the email appeared to be from your child/spouse’s $hobby group (that is publicly plastered all over social media)?

Remember, someone went to the length of figuring out where my friend lived, the name of the HOA, who its president was, the president’s email address, and my friend’s email address. That is a lot of effort just to get Gmail credentials, which likely indicates they’re after something bigger.


Now, something I don’t see or hear a lot of companies doing is holding security awareness training for spouses and family members. People laugh, but I remember when the Iraq war broke out and family members were plastering social media with pictures of everyone gathered in a gym preparing to leave, saying “Gonna miss my husband so much! God bless the $military unit” or “My husband is finally coming home! They should be landing at $YYMMDD:HH:mm:ss.” We would sit back and say, “Dear spouse, please stop helping the bad guys determine our troop strength and travel plans!” There hasn’t been a #facepalm meme invented yet that could accurately depict the military commanders’ reaction.

So for the non-military folks out there, here’s an idea. Put together a “fun day”, invite the spouses and children, and teach them about phishing (and its various specific forms: phone, spear, social media, etc.) and OPSEC! Sure, you might be the security person in the home: your personal firewall is tightly locked down, your wifi is wrapped up nice with strong passwords, MAC filtering, SSID broadcast shut off, blah blah blah. But what about your spouse’s and children’s phones and their social media activities? Security is a BEHAVIOR, or a STATE OF MIND if you will, not just technology. Educating the family is just as important as educating the employee.


If you are in security, you are a target. While my friend holds a significant role at his employer, the risk would be no less if he were a “lowly” systems administrator (I say that with sarcasm, cuz pffft, it’s only domain admin creds at stake). Your family is a target too; take the time to educate them and grow a security-minded culture at home as well as at work. If you find yourself spearphished using personal, non-work information, you need to ask yourself “How’d they know that?” and possibly have a conversation with the entity that was impersonated, or review how much private data you are sharing.

Report spearphishing whether it happens at work or at home, to you or your family, as this is precious intelligence that any intelligence team needs. Finally, DO NOT do work on a non-work computer, especially one you share with the family. Be vigilant, remind your co-workers regularly to be vigilant, and share what you know with others.

A reader reached out with a question that made me realize I needed to clarify something above. While I did note that you should delete the spearphish email, the implied task is that you capture it (full header and all contents) BEFORE deleting it, as the email, along with the report of the spearphishing attempt, needs to be provided to the threat intelligence team. There can be valuable information in the email header that you do not want to destroy. You can accomplish this by attaching the email to a new email and sending it to the proper team/individual, or by saving it down and attaching it to an incident report.

Large Foot Prints and Loud Noises

While milling around in some spam on another research project, I started noticing something strange: many seemingly unrelated domains appeared in the Reply-To address of the same spam campaign. I began digging into the domains for multiple campaigns, and I am currently monitoring the behavior and working on mapping the associations of the bad guys. Granted, there’s nothing glorious about discovering spammers or shutting them down, but uncovering a large enterprise of what appears to be individuals working together intrigues me.

Anyway, while working on this, I found someone who sticks out like a sore thumb. Why? Because they are “loud” and have a huge footprint. This individual registered over 4,000 domains in 11 days. I’m sure he/she has reasons, but none that I’m currently interested in hearing. Not surprisingly, the actor is also associated with over 100K other active domains. I’m not sure what you plan to do about it, but in the meantime, I’m blocking these.

For the record, at this moment I have only tied this actor to spam (in my resources) and to malicious sites as indicated by other research sources; I have not personally tied them to specific malware. If I do, I will update this blog with those details, provided it doesn’t compromise any other OPSEC.

As a general rule, I’d recommend blocking non-standard gTLDs and allowing your users to request exceptions.
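As a sketch of that recommendation (the TLD list and exception domain below are purely illustrative, not a vetted policy; tune both to your own environment):

```python
# Illustrative gTLD deny list with a per-domain exception mechanism.
BLOCKED_TLDS = {"download", "top", "gdn", "bid", "win"}  # example picks only
EXCEPTIONS = {"example.download"}  # domains users requested and you approved

def is_blocked(domain: str) -> bool:
    """Return True if the domain's TLD is on the deny list and the
    domain itself has not been granted an exception."""
    domain = domain.lower().rstrip(".")
    if domain in EXCEPTIONS:
        return False
    return domain.rsplit(".", 1)[-1] in BLOCKED_TLDS

print(is_blocked("evil.download"), is_blocked("example.download"))
```

In practice you would enforce this at your DNS resolver or proxy rather than in application code; the point is the default-deny posture with a human-approved exception path.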

Here’s a link to the blacklist. Thanks, T-byrd, for hosting it.

Check back to that link for future updates.

That’s all for now, if this was helpful to you please let me know.


**UPDATE** 2016-03-29 (0446-UTC)

Additional investigation shows the registrant is a Chinese reseller. I’ve personally linked many of the domains to spam, and others are blacklisted by DomainTools and other resources. The seller’s page reveals that they sell domains for 2.9–3.5 Yuan; taking the average, that is less than 50 cents (3.27 Yuan = 0.50 USD), and pricing is also per month in many cases.

So let’s do a little math (assuming all his domains are “rented” for the next 12 months at the average rate):

100,135 domains
× $0.50/month
× 12 months

= $600,810/year gross

Now, I’m not sure what their cost is, but let’s assume they bought a .download domain from ALPNAMES for the advertised $0.60 for one year. They just rented it out for $0.50 × 12 = $6.00; subtract the $0.60 cost and the profit is $5.40.

ROI = 5.40 / 0.60 = 9, i.e., a 900% return on a 60-cent investment in one domain.
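The back-of-envelope math above can be checked in a few lines (all figures come straight from this post; the exchange rate and ALPNAMES price are as stated at the time):

```python
# Verify the spam-domain economics from the post.
domains = 100_135
rent_per_month_usd = 0.50          # ~3.27 Yuan at the stated exchange rate

# Gross revenue if every domain stays rented for 12 months.
gross_per_year = domains * rent_per_month_usd * 12   # $600,810

# Per-domain economics for a $0.60/year .download registration.
cost = 0.60
revenue = rent_per_month_usd * 12  # $6.00 of rent collected
profit = revenue - cost            # $5.40
roi_pct = profit / cost * 100      # 900%

print(gross_per_year, round(profit, 2), round(roi_pct))
```

Even if only a fraction of the domains stay rented, the margin per domain is so fat that the reseller barely notices takedowns.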

So we have a low cost of entry to do [bad] business (for both the folks buying and those renting the domains), links to multiple spam campaigns, some with phishing elements, and links to other confirmed spam campaigns. I don’t care what they are re/selling them for; at that price, nothing good is going to come of it, IMHO. So the list has been made available WITHOUT WARRANTY; you may do with it what you wish.