
Using keyword searches to locate evidence

Keywords are a great way to locate data related to your investigation, but they come with some noteworthy downsides. In my last blog article, I talked a little about what makes a good keyword (uniqueness) and a few of the points to keep in mind when instructing your data analyst or forensic examiner about what to search for.

In this post, I’d like to expand on those themes a little and then look at some of the other points which are important when faced with a large quantity of data and tight deadlines.

We already know that uniqueness is a great thing to have in a potential keyword. One of the key artefacts that I always home in on when selecting keywords is any misspelling of a common word that points directly to the subject of the investigation. We all have words that we commonly misspell; for me, one of them is ‘necessary’. I always get the wrong number of ‘s’s and ‘c’s, so when looking for documents that I have authored, my misspelled version of ‘necessary’ would be a great word to throw into the mix.

If you have access to a collection of material written by the subject, then a few minutes of reading may highlight particular phrases or misspellings unique to that individual that will help focus your searches and reduce the number of false positives you have to review.

I spoke last time about longer phrases. When searching for legal contracts or original copies of existing emails, try selecting a six- or seven-word phrase from the document, making sure to copy exactly any punctuation, misspellings, abbreviations and so on. I’m often provided with printed copies of legal contracts and asked to trawl through a dataset to locate the original electronic copy. By selecting a phrase unique to that contract, I can really improve my chances of getting successful hits on the correct document and reduce the number of false positives.
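As a rough illustration of the exact-phrase approach, here is a minimal Python sketch. It assumes the documents have already been extracted to plain text and loaded into a dictionary; the function names, file names and sample text are all hypothetical, and a real forensic tool would do this across an indexed dataset. Matching is case-sensitive and keeps punctuation and misspellings intact. The one liberty taken is collapsing whitespace, on the assumption that a phrase may be wrapped across lines in the electronic copy.

```python
import re


def collapse_whitespace(text: str) -> str:
    """Replace every run of spaces, tabs and newlines with a single space."""
    return re.sub(r"\s+", " ", text).strip()


def find_exact_phrase(documents: dict[str, str], phrase: str) -> list[str]:
    """Return the names of documents that contain the phrase verbatim.

    Case, punctuation and misspellings must match exactly; only whitespace
    is normalised (an assumption made here to cope with line wrapping).
    """
    needle = collapse_whitespace(phrase)
    return [
        name
        for name, text in documents.items()
        if needle in collapse_whitespace(text)
    ]


# Hypothetical sample data: the first contract carries a distinctive typo.
documents = {
    "contract_v1.txt": "The Supplier shall, without undue delay,\nnotify the Purchaser in writting.",
    "contract_v2.txt": "The Supplier shall, without undue delay, notify the Purchaser in writing.",
    "memo.txt": "Please notify the purchaser when the goods ship.",
}

# The phrase is copied exactly from the printed copy, typo and full stop included.
hits = find_exact_phrase(documents, "notify the Purchaser in writting.")
print(hits)
```

Because the misspelling and punctuation are preserved, only the document containing that exact wording is returned; the corrected version and the unrelated memo are not.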

Avoid standard template text which is common to all of the contracts unless, of course, you want to locate all similar documents, perhaps to uncover evidence of other related activities that may have gone unnoticed.

Another important aspect to consider, which ties in nicely with uniqueness, is alternative spellings. This is especially important when dealing with people’s names. I most often encounter this when the names come from languages that are not written in the Latin alphabet. Arabic names are a good example: there are many different English transliterations of the name Muhammad.
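One way to handle known alternative spellings is to fold them into a single search pattern. The Python sketch below is only an illustration: the function name is hypothetical, and the list of spellings is a small sample rather than a complete set of transliterations. It builds one case-insensitive regular expression that matches any of the supplied variants as whole words.

```python
import re


def build_variant_pattern(variants: list[str]) -> re.Pattern:
    """Compile one case-insensitive pattern matching any variant as a whole word."""
    # Longest first, so a longer spelling is tried before a shorter one.
    ordered = sorted(set(variants), key=len, reverse=True)
    alternation = "|".join(re.escape(v) for v in ordered)
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)


# Illustrative, non-exhaustive list of spellings for one name.
name_variants = ["Muhammad", "Mohammed", "Mohamed", "Mohammad", "Muhammed"]
pattern = build_variant_pattern(name_variants)

# Hypothetical sample text.
text = "Payment approved by Mohamed A. and countersigned by MUHAMMAD B.; cc Mo Hammond."
matches = pattern.findall(text)
print(matches)
```

The word boundaries keep the pattern from firing on unrelated words that merely share a few letters, which helps hold down false positives.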

So your keyword strategy needs to cover both the precise and the fuzzy – in other words, the exact text copied from an authoritative source, like a contract or email, and other words or phrases for which the spelling, arrangement and order may be unclear or unknown.
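For the fuzzy side, where the spelling is not known in advance, a similarity score can stand in for an exact match. The sketch below uses Python's standard `difflib` module; the function name, the 0.8 threshold and the sample sentence are assumptions chosen for illustration, and the threshold in particular would need tuning against real data to balance missed hits against false positives.

```python
import re
from difflib import SequenceMatcher


def fuzzy_hits(text: str, keyword: str, threshold: float = 0.8) -> list[str]:
    """Return words in text whose similarity to keyword meets the threshold.

    Similarity is difflib's SequenceMatcher ratio (0.0 to 1.0), compared
    case-insensitively. The default threshold is an arbitrary starting point.
    """
    target = keyword.lower()
    words = re.findall(r"[A-Za-z']+", text)
    return [
        word
        for word in words
        if SequenceMatcher(None, target, word.lower()).ratio() >= threshold
    ]


# Hypothetical sample text containing two misspellings of 'necessary'.
sample = (
    "It is neccesary to sign and, where necesary, to countersign; "
    "this is not a secretary matter."
)
found = fuzzy_hits(sample, "necessary")
print(found)
```

A fuzzy search like this complements the exact-phrase search rather than replacing it: it casts a wider net, so expect more review work per hit.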

In my next article, I want to look at ways you can reduce the amount of data you have to search and techniques for optimising your searches by combining many searches into one – this is a great time saver, especially when your boss is pressing you for answers.

By John Douglas, Technical Director, First Response
