Artificial intelligence is becoming more ubiquitous and necessary these days. From fraud prevention and real-time anomaly detection to customer churn prediction, enterprise customers are finding new applications of machine learning (ML) every day. What lies under the hood of ML? How does this technology make predictions, and which secret ingredient makes the AI magic work?
In the data science community, the focus is typically on algorithm selection and model training, and indeed those are important, but the most critical piece in the AI/ML workflow is not how we select or tune algorithms but what we input to AI/ML, i.e., feature engineering.
Feature engineering is the holy grail of data science and the most critical step in determining the quality of AI/ML outcomes. Irrespective of the algorithm used, feature engineering drives model performance, governs the ability of machine learning to generate meaningful insights, and ultimately determines whether it solves business problems.
Feature engineering is the process of applying domain knowledge to extract analytical representations from raw data, making it ready for machine learning. It is the first step in developing a machine learning model for prediction.
Feature engineering involves the application of business knowledge, mathematics, and statistics to transform data into a format that can be directly consumed by machine learning models. It starts from many tables spread across disparate databases that are then joined, aggregated, and combined into a single flat table using statistical transformations and/or relational operations.
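As a minimal sketch of that join-and-aggregate step, the following uses two small in-memory tables. The table names, columns, and values are hypothetical and chosen only for illustration; real feature tables are typically built with SQL or a dataframe library over far more tables.

```python
from collections import defaultdict

# Hypothetical raw tables, represented as lists of dicts.
customers = [
    {"customer_id": 1, "region": "east"},
    {"customer_id": 2, "region": "west"},
]
transactions = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 35.0},
    {"customer_id": 2, "amount": 10.0},
]


def build_flat_table(customers, transactions):
    """Join transactions to customers and aggregate to one row per customer."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    # Aggregate the many-rows-per-customer table by its join key.
    for tx in transactions:
        totals[tx["customer_id"]] += tx["amount"]
        counts[tx["customer_id"]] += 1

    flat = []
    for c in customers:
        cid = c["customer_id"]
        n = counts[cid]
        flat.append({
            "customer_id": cid,
            "region": c["region"],
            "tx_count": n,
            "tx_total": totals[cid],
            # Customers with no transactions get a mean of 0.0 (an assumption).
            "tx_mean": totals[cid] / n if n else 0.0,
        })
    return flat


flat_table = build_flat_table(customers, transactions)
```

Each row of `flat_table` now describes one customer with numeric columns derived from the related table, which is the single-flat-table shape the paragraph above describes.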
For example, predicting which customers are likely to churn in a given quarter means identifying the customers with the highest probability of no longer doing business with the company. How do you go about making such a prediction? We predict churn by looking at its underlying causes. The process is based on analyzing customer behavior and then creating hypotheses. For example, customer A contacted customer support five times in the last month, implying that customer A has complaints and is likely to churn. In another scenario, customer A's product usage might have dropped by 30% over the previous two months, again implying that customer A has a high probability of churning. Looking at historical behavior, extracting hypothesis patterns, and testing those hypotheses is the process of feature engineering.
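The two hypotheses above can be turned into numeric features roughly as follows. This is only a sketch: the function names, the 30-day window, and the sample dates and usage figures are assumptions made for illustration.

```python
from datetime import date, timedelta


def support_contacts_last_30_days(contact_dates, as_of):
    """Count support contacts in the 30 days up to and including as_of."""
    window_start = as_of - timedelta(days=30)
    return sum(1 for d in contact_dates if window_start < d <= as_of)


def usage_change(previous_usage, recent_usage):
    """Relative change in usage; -0.30 means a 30% drop."""
    if previous_usage == 0:
        return 0.0  # No baseline to compare against (an assumption).
    return (recent_usage - previous_usage) / previous_usage


# Hypothetical history for customer A.
as_of = date(2020, 3, 31)
contacts = [
    date(2020, 1, 5),   # outside the 30-day window
    date(2020, 3, 2),
    date(2020, 3, 9),
    date(2020, 3, 15),
    date(2020, 3, 20),
    date(2020, 3, 28),
]

features = {
    "support_contacts_30d": support_contacts_last_30_days(contacts, as_of),
    "usage_change": usage_change(previous_usage=200.0, recent_usage=140.0),
}
```

Here `features` holds five recent support contacts and a usage change of about -0.30 for customer A. Whether either value actually predicts churn is what the hypothesis-testing step has to establish.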
Feature engineering is about extracting the business hypothesis from historical data. A business problem that involves predictions such as customer churn is a classification problem.
There are several ML algorithms that you can use, such as classical logistic regression, decision trees, support vector machines, boosting, and neural networks. Although all these algorithms require a single flat matrix as input, raw business data is stored in disparate tables (e.g., transactional, temporal, geo-locational) with complex relationships.
We may first join two tables and then perform temporal aggregation on the joined table to extract temporal patterns in user behavior. Practical feature engineering (FE) is far more complicated than simple transformation exercises such as one-hot encoding (transforming categorical values into binary indicators so that ML algorithms can use them). To implement FE, we write hundreds or even thousands of SQL-like queries and perform a great deal of data manipulation, as well as a multitude of statistical transformations.
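For reference, the simple end of that spectrum, one-hot encoding, can be sketched in a few lines. The `plan` column and its values are hypothetical, and the helper below is an illustrative implementation rather than any particular library's API.

```python
def one_hot_encode(rows, column):
    """Replace a categorical column with one binary indicator per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every column except the categorical one being encoded.
        new_row = {k: v for k, v in row.items() if k != column}
        for cat in categories:
            new_row[f"{column}={cat}"] = 1 if row[column] == cat else 0
        encoded.append(new_row)
    return encoded


# Hypothetical customer rows with a categorical "plan" column.
rows = [
    {"customer_id": 1, "plan": "basic"},
    {"customer_id": 2, "plan": "premium"},
    {"customer_id": 3, "plan": "basic"},
]
encoded = one_hot_encode(rows, "plan")
```

Each row in `encoded` now carries `plan=basic` and `plan=premium` indicator columns in place of the original text value. The join and temporal-aggregation work described above is where most of the practical effort goes.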
In the machine learning context, if we know the historical pattern, we can create a hypothesis. Based on the hypothesis, we can predict a likely outcome, such as which customers are likely to churn in a given time period. FE is all about finding the optimal combination of hypotheses.
Feature engineering is critical because if we provide the wrong hypotheses as input, ML cannot make accurate predictions. The quality of the hypotheses provided is vital to the success of an ML model, and the quality of the features is critically important for both accuracy and interpretability.
Feature engineering is the most iterative, time-consuming, and resource-intensive step in the AI/ML workflow, and it involves interdisciplinary expertise. It requires technical knowledge but, more importantly, domain knowledge.
The data science team builds features by working with domain experts, testing hypotheses, building and evaluating ML models, and repeating the process until the results are acceptable to the business. Because in-depth domain knowledge is required to generate high-quality features, feature engineering is widely considered a black art of experts that cannot be automated, even though a team often spends 80% of its effort on developing a high-quality feature table from raw business data.
Feature engineering automation has vast potential to change the traditional data science process. It lowers skill barriers well beyond what ML automation alone achieves, eliminates hundreds or even thousands of manually crafted SQL queries, and speeds up data science projects even when full domain knowledge is not available. It also augments our data insights and surfaces unknown unknowns through the ability to explore millions of feature hypotheses in just hours.
Recently, ML automation (a.k.a. AutoML) has received a great deal of attention. AutoML tackles two of the critical challenges that organizations struggle with: the sheer length of AI and ML projects, which usually take months to complete, and the severe shortage of qualified talent available to handle them.
While current AutoML products have undoubtedly made significant inroads in accelerating the AI and machine learning process, they fail to address the most significant step, the process to prepare the input of machine learning from raw business data, in other words, feature engineering.
To create a genuine shift in how modern organizations leverage AI and machine learning, the full cycle of data science development must involve automation. If the problems at the heart of data science automation are due to lack of data scientists, poor understanding of ML from business users, and difficulties in migrating to production environments, then these are the challenges that AutoML must also resolve.
AutoML 2.0, which automates data and feature engineering, is emerging to combine FE automation and ML automation into a single pipeline and one-stop shop. With AutoML 2.0, the full cycle from raw data through data and feature engineering to ML model development takes days, not months, and a team can deliver 10x more projects.
Feature engineering helps reveal the hidden patterns in the data and powers predictive analytics based on machine learning. Algorithms need high-quality input data containing relevant business hypotheses and historical patterns, and feature engineering provides this data. However, it is the most human-dependent and time-consuming part of the AI/ML workflow.
AutoML 2.0, which streamlines feature engineering automation and ML automation, is a new technology breakthrough to accelerate and simplify AI/ML for enterprises. It enables more people, such as BI engineers or data engineers, to execute AI/ML projects and makes enterprise AI/ML more scalable and agile.
About the author: Ryohei Fujimaki, Ph.D., is the founder and CEO of dotData. Prior to founding dotData, he was the youngest research fellow ever in NEC Corporation's 119-year history, a title awarded to only six individuals among 1,000+ researchers. During his tenure at NEC, Ryohei was heavily involved in developing many cutting-edge data science solutions with NEC's global business clients, and was instrumental in the successful delivery of several high-profile analytical solutions that are now widely used in industry. Ryohei received his Ph.D. degree from the University of Tokyo in the field of machine learning and artificial intelligence.
Read the original:
What is Feature Engineering and Why Does It Need To Be Automated? - Datanami