Preventing Exposure of Sensitive Information
GPO isn't alone. Any agency that publishes or shares information with the public is at risk of posting information that shouldn't have been shared. Desktop peer-to-peer file-sharing applications can leak data, simple configuration changes can expose information, and ordinary user error is always a factor. Most of us already suffer from information overload, and it's impractical to think we can stay on top of every configuration change that may lead to a problem.
Steps must be taken to prevent inadvertent data exposures, and one of the most effective ways to know what you may be exposing is to sit down and take a look at your agency's public footprint. That is easier said than done, particularly when it's not uncommon to find agency sites posting tens of thousands of pages. So how can you go about making sure you aren't spilling highly sensitive information?
There are two kinds of organizations: those that have inadvertently exposed sensitive information and those that will.
One approach is to conduct a targeted, user-driven analysis to identify high-value information assets that are publicly accessible. Look for areas where an outsider might find "juicy" tidbits of information. Start with search engines - either internal or external - and search for terms that would clearly indicate a problem: Social Security numbers, taxpayer identifiers and dates of birth are good targets these days. Develop a few worst-case scenarios based on the sensitivities of data managed by your organization, and use these scenarios to frame your analysis and testing.
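As a rough illustration of the search-driven approach, the sketch below builds a list of queries that restrict results to one site using the `site:` operator supported by the major public search engines. The domain and the term list are placeholders chosen for this example, not recommendations; substitute terms drawn from your own worst-case scenarios.

```python
# Hypothetical term list; replace with terms that reflect your agency's data.
SENSITIVE_TERMS = [
    "social security number",
    "taxpayer identification number",
    "date of birth",
]

def build_queries(domain, terms=SENSITIVE_TERMS):
    """Return one site-restricted, exact-phrase query per sensitive term."""
    return [f'site:{domain} "{term}"' for term in terms]

if __name__ == "__main__":
    # "example.gov" is a placeholder domain for illustration only.
    for query in build_queries("example.gov"):
        print(query)
```

Each query can be pasted into a search engine by hand, which keeps a person in the loop to judge whether a hit is a real exposure or a harmless mention.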
Another approach uses automated methods to scan for sensitive information. Commercial products can accomplish this, but if they're not in the budget, there are some very sound, free, open-source solutions that can scan large websites or search engines for patterns of interest. Automated scans can be configured to extract information from documents in bulk, looking for classification markings, sensitive keywords, or non-public records that are deemed sensitive or confidential.
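To make the idea of pattern scanning concrete, here is a minimal sketch that checks already-extracted document text against a few regular expressions. It is not any particular product's method: the pattern set, the function names and the input format (a dictionary mapping a document identifier to its text) are all assumptions made for this example, and the simple patterns shown will produce false positives that a person needs to review.

```python
import re

# Illustrative patterns only; tune these to the data your organization handles.
PATTERNS = {
    # Nine digits in the common 3-2-4 hyphenated layout.
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # A few common markings; case-sensitive to reduce noise from ordinary prose.
    "classification_marking": re.compile(
        r"\b(?:TOP SECRET|SECRET|CONFIDENTIAL|FOR OFFICIAL USE ONLY)\b"
    ),
    # Keywords that suggest a date of birth is nearby.
    "dob_keyword": re.compile(r"\b(?:date of birth|DOB)\b", re.IGNORECASE),
}

def scan_text(text):
    """Return a list of (label, matched_text) pairs found in the text."""
    findings = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((label, match.group(0)))
    return findings

def scan_documents(documents):
    """Scan a dict of {document_id: text}; return only documents with hits."""
    report = {}
    for doc_id, text in documents.items():
        hits = scan_text(text)
        if hits:
            report[doc_id] = hits
    return report
```

In practice the text would come from a crawler or a document-extraction step run over the public site; this sketch leaves that part out and covers only the matching.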
Taking this outside-in perspective can help identify problems before they escalate into disasters. Even if you don't identify any immediate concerns, you will gain a perspective on your public footprint that will come in handy if you are called to respond to a major incident involving the loss or exposure of sensitive information.
Eric M. Fiterman is a former FBI special agent and founder of Methodvue, a consultancy that provides cybersecurity and computer forensics services to the federal government and private businesses.