Machine Learning is a subfield of Computer Science that covers algorithms and techniques through which systems "learn" autonomously from the tasks they perform. In this sense, the computer improves its performance on a given task as it gains experience with that task. These algorithms train a model on sample inputs so that it makes predictions or decisions guided by data rather than by explicitly programmed instructions. While Artificial Intelligence distinguishes two types of reasoning (inductive, which extracts rules and patterns from large datasets, and deductive), Machine Learning is concerned only with the inductive kind.
Supervised Learning is the term used whenever the program is "trained" on a predefined set of labeled data. Based on that training, the program can make accurate decisions when it receives new data. Example: you can take a data set of tweets that humans have marked as positive, negative, or neutral and use it to train a sentiment analysis classifier.
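The tweet example can be sketched in a few lines. This is only an illustration under stated assumptions: the six labeled sentences are invented, and the classifier is a word-count Naive Bayes with add-one smoothing, chosen here for brevity rather than because the text prescribes it.

```python
import math
from collections import Counter, defaultdict

# Hypothetical hand-labeled training set (invented for illustration).
TRAIN = [
    ("i love this phone", "positive"),
    ("what a great day", "positive"),
    ("i hate this traffic", "negative"),
    ("this is a terrible movie", "negative"),
    ("the meeting is at noon", "neutral"),
    ("the train leaves at nine", "neutral"),
]

def train(examples):
    """Count words per label and examples per label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab

def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # log prior + sum of smoothed log likelihoods of each word
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAIN)
print(predict(model, "i love this great day"))
print(predict(model, "i hate this terrible traffic"))
```

The labels are the only supervision: the program never sees a rule such as "love means positive", it only counts which words co-occur with which label.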
Unsupervised Learning is the term used when a program automatically finds patterns and relationships in a data set. Example: analyzing a set of e-mails and automatically grouping them by topic, without the program having any previous knowledge about the data.
Reinforcement Learning is concerned with how an agent should act in an environment so as to maximize some notion of reward over time. Reinforcement Learning algorithms try to find a policy that maps the states of the world to the actions that the agent should take in those states. Reinforcement Learning differs from Supervised Learning in that correct input/output pairs are never presented, nor are sub-optimal actions explicitly corrected.
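As a rough illustration of learning a policy from reward alone, the sketch below runs tabular Q-learning on a hypothetical five-cell corridor. The environment, the reward of 1 at the rightmost cell, and all hyperparameters are assumptions made for this example; Q-learning is one of several possible algorithms.

```python
import random

N_STATES = 5           # corridor cells 0..4; cell 4 is the goal (assumed setup)
ACTIONS = (-1, +1)     # index 0 = step left, index 1 = step right
GOAL = N_STATES - 1

def step(state, action):
    """Move within the corridor; reward 1.0 only when the goal is reached."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action index]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            # Epsilon-greedy choice, breaking ties at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward = step(state, ACTIONS[a])
            # Move the estimate toward reward + discounted best future value.
            target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q = q_learning()
policy = ["left" if left > right else "right" for left, right in q[:GOAL]]
print(policy)
```

Note that the agent is never told that "right" is the correct action in any cell; the policy emerges from the reward signal alone.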
Classification algorithms are a sub-category of Supervised Learning. Classification is the process of taking some kind of input and assigning it a label. Classification systems are generally used when predictions are of a discrete nature, e.g. a simple "yes or no". Example: taking an image of a person and classifying it as male or female.
Regression is another subcategory of Supervised Learning, used when the value being predicted is not a "yes or no" but follows a continuous spectrum. Regression systems could be used, for example, to answer the questions "How much?" or "How many are there?".
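A minimal sketch of a "How much?" prediction, assuming a single input feature and a straight-line relationship. The area and price figures are invented and deliberately noise-free; the fit uses the ordinary least-squares formulas for one variable.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: floor area in square meters and price in thousands.
AREAS = [50, 60, 80, 100]
PRICES = [150, 180, 240, 300]

slope, intercept = fit_line(AREAS, PRICES)

def predict_price(area):
    """Continuous prediction for an area not seen during fitting."""
    return slope * area + intercept

print(predict_price(70))
```

Unlike a classifier, the output is a number on a continuous scale rather than one of a fixed set of labels.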
Clustering is an Unsupervised Learning method that consists of assigning a set of observations to subsets (so-called clusters) so that observations within the same cluster are similar according to some pre-designated criterion or criteria, whereas observations in different clusters are dissimilar. Different Clustering techniques make different assumptions about the structure of the data, often defined by some similarity metric and evaluated, for example, by internal compactness (similarity between members of the same cluster) and separation between different clusters. Other methods are based on density estimates and connectivity graphs.
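One compactness-based technique is k-means, sketched below under simple assumptions: two-dimensional points invented for the example, squared Euclidean distance as the similarity criterion, and starting centroids supplied by the caller rather than chosen at random.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))

def kmeans(points, centroids, iterations=10):
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [centroid(c) if c else old
                     for c, old in zip(clusters, centroids)]
    return centroids, clusters

# Hypothetical observations forming two visible groups.
POINTS = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(POINTS, centroids=[POINTS[0], POINTS[3]])
print(clusters)
```

No labels are supplied anywhere; the grouping follows only from the distances between observations.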
Recommendation systems are methods based on Machine Learning that predict the rating users would give to each item and show them the items they would (probably) rate highly. Companies like Amazon, Netflix and Google are known for their intensive use of Recommendation systems, from which they gain great competitive advantage.
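One simple way to predict a missing rating is user-based collaborative filtering: weight other users' ratings of the item by how similar their tastes are to the target user's. The sketch below assumes a tiny invented ratings table and cosine similarity over co-rated items; production systems use far richer models.

```python
import math

# Hypothetical ratings on a 1-5 scale (users and items are invented).
RATINGS = {
    "ana":   {"item_a": 5, "item_b": 4, "item_c": 1},
    "bruno": {"item_a": 4, "item_b": 5, "item_c": 2, "item_d": 5},
    "carla": {"item_a": 1, "item_b": 2, "item_c": 5, "item_d": 1},
}

def cosine(u, v):
    """Cosine similarity computed over the items both users rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)

def predict_rating(user, item, ratings):
    """Similarity-weighted average of other users' ratings for the item."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        weight = cosine(ratings[user], their_ratings)
        weighted_sum += weight * their_ratings[item]
        weight_total += weight
    return weighted_sum / weight_total if weight_total else None

estimate = predict_rating("ana", "item_d", RATINGS)
print(round(estimate, 2))
```

Because "ana" rates items much like "bruno" and unlike "carla", the estimate lands closer to bruno's rating of the unseen item.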
A Decision Tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences. A decision tree is also a way to visually represent an algorithm.
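To show how a tree doubles as an algorithm, the sketch below encodes a small hand-written tree as nested dictionaries and walks it from the root to a leaf. The questions and outcomes are hypothetical, and the tree is written by hand rather than learned from data.

```python
# Internal nodes ask a question; leaves are plain strings with the outcome.
TREE = {
    "question": "outlook",
    "branches": {
        "sunny": {
            "question": "humidity",
            "branches": {"high": "stay in", "normal": "play"},
        },
        "overcast": "play",
        "rainy": {
            "question": "windy",
            "branches": {"yes": "stay in", "no": "play"},
        },
    },
}

def decide(node, facts):
    """Follow the branch matching each answer until a leaf is reached."""
    while isinstance(node, dict):
        answer = facts[node["question"]]
        node = node["branches"][answer]
    return node

print(decide(TREE, {"outlook": "sunny", "humidity": "high"}))
print(decide(TREE, {"outlook": "overcast"}))
```

Each path from the root to a leaf reads as one if/then rule, which is what makes the structure easy to inspect.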
Support Vector Machines
Support Vector Machines (SVMs) are a set of supervised Machine Learning algorithms used for classification and regression. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm constructs a model that predicts whether a new example falls into one category or the other.
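A minimal sketch of the two-category case, assuming a linear SVM trained by stochastic sub-gradient descent on the regularized hinge loss. The training points, learning rate, and regularization strength are invented for illustration; library implementations typically use more sophisticated solvers and kernels.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.01, epochs=200):
    """Minimize lam/2 * |w|^2 + hinge loss with per-example updates."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (w[0] * x[0] + w[1] * x[1] + b)
            if margin < 1:
                # Inside the margin or misclassified: hinge term is active.
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Safely classified: only the regularizer shrinks w.
                w = [wi - lr * lam * wi for wi in w]
    return w, b

def classify(w, b, x):
    """Return +1 or -1 depending on the side of the separating line."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else -1

# Hypothetical training examples, each labeled with one of two categories.
POINTS = [(-2, -1), (-1, -2), (-2, -2), (2, 1), (1, 2), (2, 2)]
LABELS = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(POINTS, LABELS)
print(classify(w, b, (3, 3)), classify(w, b, (-3, -3)))
```

The `margin < 1` test is what distinguishes this from a plain perceptron: examples that are classified correctly but too close to the boundary still push it away.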
In probability and statistics, a Generative Model is a model for generating observable data values, typically given some hidden parameters. Generative models are used in Machine Learning either to model data directly or as an intermediate step in forming a conditional probability density function. In other words, we can model P(x, y) in order to make predictions (by converting it to P(y | x) with Bayes' rule), as well as to generate likely pairs (x, y), which is widely used in unsupervised learning. Examples of Generative Models include Naive Bayes, Latent Dirichlet Allocation and Gaussian Mixture Models.
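Both uses of a joint distribution can be shown on a toy table. The sketch assumes a hypothetical discrete P(x, y) given directly as numbers; a real generative model would estimate these probabilities from data.

```python
import random

# Hypothetical joint distribution P(x, y); the four values sum to 1.
JOINT = {
    ("sunny", "play"): 0.40,
    ("sunny", "stay in"): 0.10,
    ("rainy", "play"): 0.10,
    ("rainy", "stay in"): 0.40,
}

def conditional(joint, x):
    """P(y | x) = P(x, y) / P(x), where P(x) sums P(x, y') over all y'."""
    marginal = sum(p for (xi, _), p in joint.items() if xi == x)
    return {y: p / marginal for (xi, y), p in joint.items() if xi == x}

def sample_pairs(joint, n, seed=0):
    """Draw n pairs (x, y) in proportion to their joint probability."""
    rng = random.Random(seed)
    pairs = list(joint)
    return rng.choices(pairs, weights=[joint[p] for p in pairs], k=n)

posterior = conditional(JOINT, "sunny")
print(posterior)
print(sample_pairs(JOINT, 3))
```

The same table answers a prediction question (which y is likely given x) and a generation question (what plausible pairs look like).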
Discriminative models, or Conditional models, are a class of models used in Machine Learning to model the dependence of a variable y on a variable x. Because these models estimate the conditional probability P(y | x) directly, they are often used in supervised learning. Examples include logistic regression, SVMs, and Neural Networks.
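Logistic regression makes the idea concrete: it fits P(y = 1 | x) directly and never models how x itself is distributed. The sketch below assumes one input feature, four invented training points, and plain batch gradient descent on the log loss.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, lr=0.1, steps=1000):
    """Fit P(y = 1 | x) = sigmoid(w * x + b) by gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Gradient of the log loss: (predicted probability - label) * input.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys))
        grad_b = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys))
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical one-dimensional data: negative inputs are class 0, positive class 1.
XS = [-2.0, -1.0, 1.0, 2.0]
YS = [0, 0, 1, 1]

w, b = train_logistic(XS, YS)
p_high = sigmoid(w * 1.5 + b)     # P(y = 1 | x = 1.5)
p_low = sigmoid(w * -1.5 + b)     # P(y = 1 | x = -1.5)
print(p_high > 0.5, p_low < 0.5)
```

In contrast with a generative model, nothing here could be used to sample new values of x.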
A Genetic Algorithm is a search heuristic that mimics the process of natural selection and uses methods such as mutation and recombination to generate new genotypes in the hope of finding good solutions to a given problem. In Machine Learning, genetic algorithms found some use in the 1980s and 1990s. Conversely, Machine Learning has been used to improve the performance of genetic and evolutionary algorithms.
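The selection, recombination, and mutation loop can be sketched on a toy problem. Everything specific here is an assumption for illustration: genotypes are bit lists, fitness is the number of 1 bits, the fitter half of the population breeds, and the single best individual is copied unchanged into the next generation.

```python
import random

def fitness(genome):
    """Toy objective: the more 1 bits, the better."""
    return sum(genome)

def evolve(length=20, pop_size=30, generations=60, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    history = []   # best fitness seen in each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[:pop_size // 2]      # selection: fitter half
        children = [population[0][:]]             # keep the best unchanged
        while len(children) < pop_size:
            mom, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, length)        # one-point recombination
            child = mom[:cut] + dad[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history

best, history = evolve()
print(fitness(best), history[0])
```

Because the best individual always survives, the best fitness can never decrease from one generation to the next, although reaching the optimum is not guaranteed.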