
Artificial Intelligence (AI) is possibly the most widely discussed technology trend today. Views on its applications and benefits are strong and divided: some fear that it will usher in an apocalyptic era of machines controlling human beings, while others believe that AI will be extremely beneficial for humanity. In the near term, AI and machine learning are becoming critical to solving important business and social problems. In many cases, machine intelligence is used as a supplement to human judgment, thereby “augmenting” human intelligence.
It is commonly believed that “mathematically complex” algorithms are the biggest obstacle to leveraging the power of machine intelligence. However, a detailed discussion with any practitioner will reveal that this is rarely the case. Integrating data across multiple sources and resolving data quality issues are often the most significant challenges. The same holds even when the goal is only to derive simple rules or insights from data.
Because of these data integration and data quality challenges, most data-driven initiatives within organizations take a very long time to demonstrate tangible results. This leads to budget escalation and frustration among senior leadership.
Traditionally, organizations adopted a linear approach to data integration, in which detailed Extract, Transform and Load (ETL) processes were created to obtain data from various systems and to establish relationships between hundreds of data tables belonging to different source systems. These data warehousing initiatives took at least 18 to 24 months to complete. In addition, it was often difficult to incorporate unstructured data such as images, text files and streaming data from sensors.