ILTA White Papers

Risky Business

Issue link: http://read.uberflip.com/i/45522



Technology helps meet the challenge. High-speed, highly optimized platforms are available that can search across all data repositories, find what is required according to policy and move this data into an archive. Using such technology, corporate data can be managed as follows:

• Index the data. Data must be indexed before it can be searched, and indexing large volumes of data requires a system that is scalable and efficient. Understand the speed of indexing: a system that indexes 1TB of data per day is very slow compared to one that indexes 1TB per hour. Also look for an indexing system that does not require a large amount of storage for the index itself. If you index 1TB of data and the index requires 500GB of storage (index is 50 percent of data size), you will have trouble scaling. An indexing platform that needs only 50GB of storage (index is 5 percent of data size) is more reasonable and allows indexing of larger volumes of data.

• Index legacy data. Don't forget about the data hidden away on backup tapes. This is a major liability and needs to be included in any sound information governance strategy. To process and index the data, utilize direct indexing technology that will scan the tapes and generate a searchable index. You don't want to restore all the tape content in order to discover what you need; direct indexing avoids the need for total restoration.

• Take existing policy, and turn it into a query. Queries are applied to cull the data down to the relevant subset. Once you understand which policies are enforced, queries are easy to define. If you are looking to archive "John Doe's" mailbox from June 2008 to July 2010, this is a simple query. If you are looking for all intellectual property containing the keyword "secret recipe" and contained in PDFs and MS Word documents, this is a more complex query.
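
The two example policies above (John Doe's mailbox for a date range, and "secret recipe" in PDFs and MS Word documents) can be sketched as queries over an index. This is a minimal illustration only, not any vendor's query language: the `Item` record, its field names, the helper functions and the sample data are all hypothetical, and the date range is assumed to run inclusively from June 1, 2008 to July 31, 2010.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Item:
    """One indexed item. Every field name here is a hypothetical assumption."""
    custodian: str
    kind: str        # "email" or "file"
    extension: str   # "msg", "pdf", "docx", ...
    created: date
    text: str


# A query is simply a predicate over an indexed item.
Query = Callable[[Item], bool]


def mailbox_query(custodian: str, start: date, end: date) -> Query:
    """Simple policy: one custodian's email within an inclusive date range."""
    def matches(item: Item) -> bool:
        return (
            item.kind == "email"
            and item.custodian == custodian
            and start <= item.created <= end
        )
    return matches


def keyword_query(phrase: str, extensions: Iterable[str]) -> Query:
    """More complex policy: a phrase (case-insensitive) in selected file types."""
    allowed = {ext.lower() for ext in extensions}
    needle = phrase.lower()

    def matches(item: Item) -> bool:
        return item.extension.lower() in allowed and needle in item.text.lower()
    return matches


def run_policies(items: Iterable[Item], policies: Dict[str, Query]) -> Dict[str, List[Item]]:
    """Apply every named policy to the items and collect the matches per policy."""
    materialized = list(items)
    return {name: [i for i in materialized if query(i)] for name, query in policies.items()}


# Policy names and bounds below are illustrative assumptions.
POLICIES: Dict[str, Query] = {
    "john-doe-mailbox": mailbox_query("John Doe", date(2008, 6, 1), date(2010, 7, 31)),
    "secret-recipe-ip": keyword_query("secret recipe", ["pdf", "doc", "docx"]),
}

# Made-up sample items, for demonstration only.
SAMPLE = [
    Item("John Doe", "email", "msg", date(2009, 3, 14), "Quarterly numbers attached."),
    Item("John Doe", "email", "msg", date(2011, 1, 5), "Sent after the policy window."),
    Item("Jane Roe", "file", "pdf", date(2010, 2, 1), "The Secret Recipe, version 3."),
    Item("Jane Roe", "file", "txt", date(2010, 2, 1), "secret recipe notes"),
]

if __name__ == "__main__":
    for name, hits in run_policies(SAMPLE, POLICIES).items():
        print(name, len(hits))
```

Keeping each policy as a named predicate means the same set of definitions can be rerun on a schedule as new data arrives, which is the role the next step assigns to the policy engine.
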
• Set up the queries to run against the data on a scheduled basis. This becomes your policy engine: it runs against data as it is created, and content matching the queries is extracted into the archive.

• Automate the process. Automated deduplication and deNISTing will eliminate all the useless information. Automated culling tools that remove spam email and content outside the date range will eliminate additional content. Use automation to your advantage to save time and money.

• Unify platforms. Utilize a single discovery solution that supports online and legacy content so that you can take advantage of deduplication and culling across all platforms. This will save time and resources, allowing you to take one pass across the data in order to find the relevant content.

• Make the process defensible. You will have to defend the process at some point in time. Make sure it relies on standard processes and procedures. Use a single solution that does not require translation of the data, a solution that has been utilized in the courts and has stood up to scrutiny.

IT PLUS LEGAL: PROTECT THE DATA, PROTECT THE ENTERPRISE FROM HARM

The corporate practice of hoarding data must come to an end. This can only happen with a more integrated
