Introduction

Machine learning lets you tackle challenging problems that affect people around the world. Machine Learning (ML) and Artificial Intelligence (AI) are rapidly emerging technologies with the potential to change our world at a speed humankind has never experienced before. ML and AI are not the same, although the technologies developed for ML do help research and development in AI. From an engineering perspective ML can be given a fairly strict definition, whereas trying to define AI quickly raises philosophical discussions about what intelligence is. This publication focuses on machine learning. But be aware that the terms machine learning and artificial intelligence are intertwined, and that many so-called AI applications are in fact driven by machine learning technology.

You should be aware of the commercial buzz and fads surrounding AI and ML. Machine learning, deep learning and the many tools built around them are not ‘a universal solvent’ for all current problems. There is no perfect or magic machine learning tool or method yet that can solve all your complex problems. Machine learning is just a tool for solving certain types of problems. In the future, machine learning tools may be applied to a broader landscape of problems. But do not try to solve all your problems with one (new) technology or toolset.

Artificial Intelligence and Machine Learning are once again at the forefront of global discourse, garnering increased attention from practitioners, industry leaders, policymakers, and the general public.

But despite the hype and the money invested in machine learning technology over the last five years, one big question remains: can machine learning technology already help us solve, in time, hard and complex problems like climate change, health and welfare for all humans, and other urgent issues? This publication gives you a reality check. You will learn what is easily possible with new machine learning technologies and tools, what the current potential is, and what remains wishful thinking for the future.

Innovation needs openness. This holds all the more for machine learning technologies: without real openness, new developments and innovations in machine learning will be impossible. As a practitioner in your business domain, with your unique expertise, you can start making a difference. This publication gives you a starting point for applying machine learning technology to your unique use cases.

What is covered in this book?

Nowadays many people are talking about the transformative power of machine learning and how it will revolutionize the economy, but what does that mean for your business and how do you get started? Where do you get solid, independent advice on learning and applying machine learning? Can you improve or disrupt your business using FOSS machine learning tools that are widely available? This book gives you an introduction to get started with applying machine learning.

Machine learning concepts are mostly taught by academics, for academics. That’s why most learning material is dry and maths heavy. The theory behind machine learning is great, but it also requires a very deep understanding of statistics and math. There is a large gap between theory and practice. Practice counts, because in a practical business context you want to determine whether you can solve your problems with machine learning tools. Or, at a minimum, do a short and cost-efficient run to determine whether a project has potential and further investment makes sense.

This publication is created for applying machine learning in practice, to real-world use cases. This is where the rubber meets the road, so the core focus is on the ‘How’ questions. Key concepts are outlined in architecture vocabulary, and a conceptual and logical machine learning reference architecture is given to empower you to use machine learning technology in a simple and efficient way.

To apply machine learning in real business use cases, other skills are required besides some feeling for statistics and math. For example, you need some knowledge of all the typical IT things that are still needed before you can make use of the new paradigm that machine learning brings.

The field of machine learning is making rapid progress. Do you know what kinds of applications for direct business use are already possible today? Are you aware of how low the entry barriers currently are for taking direct advantage of machine learning? Is your knowledge of the free and open source solutions available in the machine learning ecosystem up to date? How do you classify safety, security and privacy risks when using machine learning? These and other relevant questions about using machine learning in a business context are the foundation of this book.

Within the machine learning domain new toolsets, applications and companies are being created on a daily basis. So it is difficult to get a grip on which ML applications are viable and which are hype, fads or simply a hoax, especially when the terms ML and AI are intertwined. This book guides you to tangible, working open source machine learning software. The FOSS machine learning software presented in this publication is widely used for real business use cases, possibly ones that closely resemble your own. And because a lot of the ML software and tools needed are open (FOSS) software, the available solutions and tools can be studied and improved.

Given that machine learning tools and techniques are already an increasing part of our everyday lives, it is crucial for professionals in the IT industry to gain more knowledge of machine learning, to start asking critical questions and to try some simple experiments. What will you be doing with machine learning tools and applications in the coming three years? Are you really aware of the evolving safety and privacy concerns that come with this technology? Do you really understand and control how it works?

This book is all about taking advantage of the new FOSS machine learning technologies for your business. The major machine learning concepts are explained, but the main emphasis of this book is on giving insight into the various possibilities available within the open source machine learning ecosystem, so that you can start applying machine learning in your business today, without unclear dependencies on a vendor or unknown strings attached.

This book gives an overview of the important FOSS machine learning frameworks and FOSS machine learning support tools that you can use for prototyping or for real production systems.

This book will not explain or dive into the statistics and deep mathematical algorithms behind machine learning. The algebra functions that form the foundation of machine learning algorithms and software libraries are explained only where needed for practical use and experiments. If you are interested in the mathematical foundations on which machine learning is built, you can find good open starting material in the reference section of this book.

This book aims to cover the high-level machine learning concepts and gives you the information you need to get started with machine learning for your business use case.

So this book concentrates on the machine learning aspects where software, business and technology touch each other.

Domains touching

(* When we write Open Source Software or OSS in this report, we explicitly mean FOSS as defined by the Free Software Foundation, FSF.org.)

Who should read this book?

This book is created for everyone who wants to learn and get started with machine learning without being forced into a specific solution up front. Creating machine learning applications is possible using FOSS building blocks only, and on premises. So you do not need to use cloud infrastructure or commercial software packages, which can be expensive. If you like IT architecture and simple concepts, and want to be empowered to play with machine learning and create your own solution, then this publication is for you.

This book is primarily written with software developers, system administrators, security architects, privacy controllers, IT managers, directors, business owners, system engineers, quality managers, IT architects and other curious people interested in open technologies in mind.

This book outlines crucial concepts, but will not go into too much mathematical or technical detail. However, after reading this book you will have a more complete and realistic overview of the possibilities for applying machine learning (ML) to your use cases.

Why another book on Machine Learning?

There are many books, courses and tutorials that you can use to learn what machine learning is. However, most of these books and courses are focused on hands-on learning and require you to program. Many other books explain concepts without a clear focus on how tools can be used for real business use cases. And a good publication that is truly open and covers the broad landscape needed for Free and Open Machine Learning was simply not available.

Despite the enormous buzz and attention surrounding machine learning, it has proven hard to apply machine learning to real, profitable use cases. Applying machine learning starts with a broad overview of the concepts, the architecture, the constraints, and insight into the technology components and the pitfalls they bring.

Is Machine Learning complex?

When visiting presentations from commercial vendors you might get the impression that machine learning is simple. The hard work is already done and all you have to do is get your credit card and make use of the incredible machine learning cloud offering. This machine learning as a service (MaaS) will take your company to the next level, and the advice of the sales consultant is clear: using their MaaS service is so simple that entering your credit card number is probably the hardest part. Maybe it will take a minute, maybe more. But in the end you will find out that solving problems using machine learning is not that simple after all. The great offerings of the many large and small vendors selling MaaS from a fantastic cloud platform will not solve your business problem in a simple way. As with all new technologies, and especially IT technology, the advantages are over-promised and getting a return on your investment is not simple. You will be confronted with complex terminology; a machine learning black box from your vendor that is, of course, great at billing; data collection and data cleaning problems you had never heard of; and security, privacy and even safety issues. And if you think it cannot get worse: legal and ethical issues will also slow your project down.

By using an open approach (tools, methods, datasets) for machine learning, a lot of risks can be mitigated. For example, it is easier to control spending in the important ramp-up phase of your project. If production and scalability require it, you can always move computation to a cloud platform at a later stage.

Tremendous advances have been made over the past few years in making machine learning more accessible. This book outlines some great OSS applications that are ready to be used, even if you really hate difficult mathematical formulas. Multiple developments are in progress that now really make it possible to drop in your data and let a complex ML algorithm do the hard work.

But don’t be fooled. Solving some types of problems using machine learning tools remains a relatively ‘hard’ problem. Equipped with the right knowledge, tools and resources, it is possible to get great results. But solving soft business problems with machine learning requires far more than a good computer scientist alone. Using ML for soft problems requires a variety of disciplines and a lot of creativity, experimentation and tenacity.

Organization of this book

The topics explored in this book include: Chapter ‘tbd ’ outlines why openness and OSS are so important for machine learning. Chapter ‘tbd ’ dives into the basic concepts and terms that come with machine learning.

Todo

This part will be created when all key parts for the 2020 version are committed.

Errata, updates and support

We have made serious efforts to create a first readable version of this book. However, if you notice typos or spelling and grammar errors, please notify us so we can improve this book. Since the world of machine learning is rapidly evolving, some parts of this book will need updates in order to present the latest machine learning solution building blocks. That’s why there is also an online version of this book that will incorporate the latest updates.

If you would like to help make this book better: Please CONTRIBUTE! See [section HELP]