How AI and machine learning can help solve IT’s data management problem – TechRepublic


According to Cisco, global internet traffic surpassed one zettabyte, or one billion terabytes, in 2016. That number is huge, but it doesn't begin to approach the total data that companies are storing.

Even more concerning is the possibility that, at most companies, data "under management" is a misnomer.

Key areas of data management challenge are:

IT departments struggle in these areas for the following reasons:

The question now is: can machine learning, artificial intelligence (AI) and analytics provide assistance in the area of data management, especially for the large amount of unstructured data?


Here is where machine learning, AI and analytics can help:

Sorting through dark data

Every corporate system, and every business department, has troves of data that have accumulated but that people know nothing about. Machine learning, AI and analytics, combined with algorithms that address how to sort and handle the different types of emails, documents, images, etc., stored on servers, can go to work on this unplumbed data and pre-sort it for you. A knowledgeable human can then review the data classification scheme that the automation recommends, tweak it, and put it into effect. Part of the process could also address data retention, with the analytics producing a set of recommendations on which data could potentially be purged from files.
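As a rough illustration of the pre-sorting step, the sketch below groups file names into coarse categories that a person could then review. It is a simple rule-based stand-in rather than a machine learning model: it relies on Python's standard mimetypes module, and the function name presort and the "unknown" bucket are assumptions made for this example.

```python
import mimetypes
from collections import defaultdict


def presort(filenames):
    """Group file names into coarse categories for later human review.

    Hypothetical helper: a rule-based stand-in for a learned classifier.
    """
    groups = defaultdict(list)
    for name in filenames:
        # guess_type returns a (type, encoding) pair; type is None if unknown.
        mime, _encoding = mimetypes.guess_type(name)
        # Use the top-level MIME type ("image", "text", ...) as the category;
        # anything unrecognized is set aside for manual classification.
        category = mime.split("/")[0] if mime else "unknown"
        groups[category].append(name)
    return dict(groups)


if __name__ == "__main__":
    print(presort(["logo.png", "minutes.txt", "LICENSE"]))
```

A real deployment would presumably look at file contents and usage patterns as well, but the review step stays the same: the automation proposes the groups, and a person accepts or adjusts them.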

Deciding what to throw away

Machine learning, analytics, and AI can objectively identify data that is seldom or never used and recommend that you throw it away, but they don't have the same discernment abilities that employees do. For instance, these processes can pick out pieces of data or records that haven't been accessed for more than five years, indicating that the data could be obsolete. This saves an employee the time of hunting down potentially obsolete data, because all they need to do now is determine whether there is any reason to keep it.
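The five-year rule of thumb mentioned here can be sketched in a few lines. This is a minimal example under stated assumptions: the last-access timestamps are already available as a dictionary, and the names purge_candidates and FIVE_YEARS are hypothetical. The output is only a list of candidates; the decision to delete stays with a person.

```python
from datetime import datetime, timedelta

# Assumed cutoff taken from the "more than five years" example in the text.
FIVE_YEARS = timedelta(days=5 * 365)


def purge_candidates(last_access, now, max_age=FIVE_YEARS):
    """Return names whose last access is older than max_age, oldest first.

    last_access maps a record or file name to its last-access datetime.
    """
    stale = [(ts, name) for name, ts in last_access.items() if now - ts > max_age]
    return [name for _ts, name in sorted(stale)]


if __name__ == "__main__":
    records = {
        "q1_2009.csv": datetime(2009, 4, 1),
        "budget.xlsx": datetime(2016, 11, 1),
        "old_memo.doc": datetime(2005, 1, 1),
    }
    # An employee would review this list before anything is purged.
    print(purge_candidates(records, now=datetime(2017, 6, 1)))
```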

Aggregating data

When analytics developers determine the kinds of data they need to aggregate for queries, they often produce a repository for the application, and then pull in various types of data from different sources to make up an analytics data pool. To do this, they must develop integration methods to access the different sources from which they pull data. Machine learning can make this still very manual process more efficient by automatically developing "mappings" between data sources and the application's data repository. This cuts down integration and aggregation times.
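To make the idea of automatically developed "mappings" concrete, here is a small sketch that proposes a mapping from source field names to repository field names by plain string similarity, using difflib from the Python standard library. It is a simplified stand-in for a learned mapping, and the function name suggest_mappings, the sample field names, and the 0.6 cutoff are all assumptions for illustration.

```python
from difflib import get_close_matches


def _normalize(name):
    """Lowercase a field name and drop underscores before comparing."""
    return name.lower().replace("_", "")


def suggest_mappings(source_fields, target_fields, cutoff=0.6):
    """Propose a source-to-target field mapping based on name similarity.

    Fields with no sufficiently similar target map to None, which flags
    them for a developer to resolve by hand.
    """
    normalized_targets = {_normalize(t): t for t in target_fields}
    mapping = {}
    for field in source_fields:
        match = get_close_matches(
            _normalize(field), list(normalized_targets), n=1, cutoff=cutoff
        )
        mapping[field] = normalized_targets[match[0]] if match else None
    return mapping


if __name__ == "__main__":
    source = ["CustName", "cust_email", "zip"]
    target = ["customer_name", "customer_email", "postal_code"]
    for src, dst in suggest_mappings(source, target).items():
        print(src, "->", dst)
```

Name similarity alone misses synonyms (a source field called zip has little in common with postal_code as a string), which is one reason the suggested mappings still need a human check.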

Organizing data storage for best access

Over the past five years, data storage vendors have made significant inroads into automating storage management, thanks to the development of lower cost solid state storage. These technology advances have enabled IT departments to deploy "smart" storage engines that use machine learning to see which types of data are used most often, and which are seldom or never used. The automation can then place data in fast or slow storage, based on the business rules inserted into machine algorithms. This saves storage managers from having to address storage optimization manually.
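A business rule of this kind might look like the sketch below, which assigns each data set to fast or slow storage from its recent access count. The fixed threshold stands in for whatever a real storage engine learns from usage; the function name assign_tiers, the tier labels, and the default threshold of 100 are assumptions for this example.

```python
def assign_tiers(access_counts, hot_threshold=100):
    """Map each data set to 'fast' or 'slow' storage by recent access count.

    hot_threshold is the assumed business rule: data sets accessed at
    least this often in the observation window go to fast storage.
    """
    return {
        name: "fast" if count >= hot_threshold else "slow"
        for name, count in access_counts.items()
    }


if __name__ == "__main__":
    print(assign_tiers({"orders": 5000, "archive_2010": 2}))
```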

Data management is a major IT challenge that is not close to resolution in most organizations, and it is going to get worse as the data continues to stream in.

CIOs, data architects, and storage managers need to highlight the issue to C-level executives, but data management projects are not easy "sells."

Nevertheless, by pointing out the value of faster times to market for analytics and potential person power and storage cost reductions for data management, IT managers at least have viable entry points into C-level discussions about how to increase strategic agility and reduce cost of operations at the same time.
