The ABCs of Data Science Algorithms

Data science algorithms are never a one-size-fits-all solution. Do you know what makes sense for your business?

Today, big and small companies around the world are racing to adopt the latest tools in artificial intelligence and machine learning. While data is often positioned as the blanket cure for every business malady, those who work in the field understand all too well that data science algorithms are never a one-size-fits-all solution.

Image: nobeastsofierce - stock.adobe.com

As the field rapidly evolves, a growing number of advanced algorithms is available for businesses to deploy in their day-to-day operations. From tools based on deep neural networks and clustering algorithms to time-series analysis, these solutions can resolve a wide range of business problems. However, amid this mass of options, the biggest challenge for an organization may be as simple as sourcing the right data and asking the right questions.

Adaptability: The importance of long-term thinking

Before shopping for a cutting-edge data science algorithm, the first step, as with any enterprise purchase, is to define the problem. Leaders should consult representatives across the company’s business units for insight into recurring questions and areas where increased efficiency is needed.

It’s not enough to solve for today’s problems, however. A comprehensive understanding of the company’s future goals in the context of a broader digital transformation strategy is crucial to maximizing the investment of money and labor required to deploy a new enterprise AI solution.

For this reason, leadership must understand how a data science platform, including its data repositories and data processing pipelines, will be called upon in one, five or 10 years’ time. While the accelerating evolution of data science methods will inevitably affect the digital transformation strategy, considering the big picture from day one will help ensure the efficient deployment of AI. Embracing open industry standards for data science model representation, like the Predictive Model Markup Language (PMML) or the Open Neural Network Exchange (ONNX), will support long-term interoperability and help avoid single-vendor lock-in.
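To make the interoperability point concrete, here is a minimal Python sketch of what a vendor-neutral model representation buys you. It writes a small linear regression model as a simplified PMML-style XML document and then scores a record by reading only that document back. The function names, the feature names ("spend", "visits") and the coefficients are hypothetical, and the output is pared down for illustration and has not been validated against the official PMML schema; a real project would more likely rely on an established exporter.

```python
import xml.etree.ElementTree as ET

# Namespace used by PMML 4.4 documents.
PMML_NS = "http://www.dmg.org/PMML-4_4"


def _tag(name):
    """Return a namespace-qualified tag name for ElementTree."""
    return f"{{{PMML_NS}}}{name}"


def linear_model_to_pmml(intercept, coefficients, target="y"):
    """Serialize a linear regression as a simplified PMML-style document.

    Hypothetical helper for illustration; not a complete PMML exporter.
    """
    ET.register_namespace("", PMML_NS)
    root = ET.Element(_tag("PMML"), version="4.4")
    ET.SubElement(root, _tag("Header"), description="illustrative linear model")

    # Declare every input field and the target field.
    dictionary = ET.SubElement(root, _tag("DataDictionary"))
    for name in [*coefficients, target]:
        ET.SubElement(dictionary, _tag("DataField"), name=name,
                      optype="continuous", dataType="double")

    model = ET.SubElement(root, _tag("RegressionModel"),
                          functionName="regression")
    schema = ET.SubElement(model, _tag("MiningSchema"))
    for name in coefficients:
        ET.SubElement(schema, _tag("MiningField"), name=name)
    ET.SubElement(schema, _tag("MiningField"), name=target, usageType="target")

    # The model itself: one intercept plus one coefficient per input field.
    table = ET.SubElement(model, _tag("RegressionTable"),
                          intercept=repr(float(intercept)))
    for name, coefficient in coefficients.items():
        ET.SubElement(table, _tag("NumericPredictor"), name=name,
                      coefficient=repr(float(coefficient)))
    return ET.tostring(root, encoding="unicode")


def score_from_pmml(pmml_text, row):
    """Score one record using only the information stored in the document."""
    ns = {"p": PMML_NS}
    table = ET.fromstring(pmml_text).find(".//p:RegressionTable", ns)
    total = float(table.get("intercept"))
    for predictor in table.findall("p:NumericPredictor", ns):
        total += float(predictor.get("coefficient")) * row[predictor.get("name")]
    return total


# Example: the scorer needs the document, not the tool that trained the model.
document = linear_model_to_pmml(1.5, {"spend": 2.0, "visits": 0.5})
prediction = score_from_pmml(document, {"spend": 10.0, "visits": 4.0})
```

Because the scoring function depends only on the serialized document, the system that trains the model and the system that serves it can come from different vendors or be replaced independently over the life of the platform.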

Big data needs a flexible infrastructure

An organization’s raw data is the cornerstone of any data science strategy. Companies that have previously invested in big data often benefit from a more flexible cloud or hybrid IT infrastructure that is ready to deliver on the promise of predictive models for better decision making. Big data is the invaluable foundation of a truly data-driven enterprise. To deploy AI solutions, companies should consider building a data lake (a centralized repository that allows a business to store structured and unstructured data on a large scale) before embarking on a digital transformation roadmap.

To understand the fundamental importance of a solid infrastructure, let’s compare data to oil. In this scenario, data science serves as the refinery that turns raw data into valuable information for business. Other technologies — business intelligence dashboards and reporting tools — benefit from big data, but data science is the key to unleashing its true value. AI and machine learning algorithms reveal correlations and dependencies in business processes that would otherwise remain hidden in the organization’s collection of raw data. Ultimately, this actionable insight is like refined oil: It is the fuel that drives innovation, optimizing resources to make the business more efficient and profitable.

Consult with domain experts

As new methods and tools become popular, data science is witnessing increased specialization. As such, it is necessary for a business to consult a host of experts before updating its data science strategy. If a company doesn’t have the appropriate in-house talent, it’s best to work with a trusted partner on the first couple of projects. These experts can offer unique insight into the available options and anticipate problems with how these tools will be applied in the business.

Data-driven decision making

To ensure maximum ROI and a smooth transition to a new data-driven strategy, it’s crucial that the executive team buy into and support the entire process, especially when a new program is introduced to the organization. The truth is that there will be challenges and not all projects will find long-term success, so executives must offer the data science team room for experimentation and enable them to find the right algorithms and practices.

Evaluation is evolution

After the painstaking work of identifying an appropriate algorithm to deploy in the business and rolling out the new solution, it’s vital to observe and continuously evaluate the entire process. It’s likely that time-to-market took precedence over perfection when developing and deploying the program, so it is critical to remain true to an agile/iterative process and allow for changes post-deployment. By designing a process that allows use cases to be easily shared internally, all teams can be part of the learning process and ultimately accelerate adoption.

Fundamentally, when a team of data scientists sets out to transform and evolve business practices

Michael Zeller serves as the secretary and treasurer for the Association for Computing Machinery (ACM) SIGKDD, the organizing body of the annual KDD conference, the premier interdisciplinary conference bringing together academic researchers and industry practitioners from the fields of data science, data mining, knowledge discovery, large-scale data analytics and big data. KDD 2020 will take place virtually August 23-27. Zeller is also currently Head of AI Strategy Solutions at Temasek.