Machine Learning basically consists of three phases:
Figure 3 - Machine Learning Workflows
Features are the set of data that define the instances we will use as training data. Later, we give the features of a new instance to the trained Machine Learning model, and the model returns the label that defines that instance.
Feature Representation Examples:
Figure 4 - Example of feature representation
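As a rough illustration of the idea, one common way to represent features in Python is a two-dimensional NumPy array with one row per instance and one column per feature, plus a separate array holding one label per instance. The feature names and values below are invented for this sketch and are not taken from Figure 4.

```python
import numpy as np

# Hypothetical toy data (an assumption, not from the article):
# each row is one instance, each column is one feature.
feature_names = ["height_cm", "weight_kg"]
X = np.array([
    [150.0, 50.0],
    [160.0, 58.0],
    [180.0, 85.0],
    [190.0, 95.0],
])

# One label per training instance, in the same order as the rows of X.
y = np.array(["small", "small", "large", "large"])

# A new instance to be predicted uses the same feature columns,
# but has no label yet.
x_new = np.array([[175.0, 80.0]])

print(X.shape)      # (4, 2): 4 instances, 2 features
print(y.shape)      # (4,): 4 labels
print(x_new.shape)  # (1, 2): 1 instance, 2 features
```

Keeping the features in `X` and the labels in `y` mirrors the split between what the model sees and what it is asked to predict.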
Python is the language we use to implement Machine Learning models. The main Python libraries used for Machine Learning are the following:
scikit-learn is an open source library that unifies the main Machine Learning algorithms and functions under a single framework. This greatly simplifies every stage of creating, evaluating and optimizing predictive models. Links to the library documentation:
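The stages of creation, evaluation and optimization can be sketched with scikit-learn roughly as follows. This is a minimal example under assumptions of our own: the Iris dataset bundled with scikit-learn, a k-nearest-neighbors classifier, and a small grid of candidate values for `n_neighbors`. None of these choices come from the article.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: the Iris dataset stands in for real training data.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Creation: fit a model on the training features and labels.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

# Evaluation: accuracy on instances the model has not seen.
test_accuracy = model.score(X_test, y_test)

# Optimization: cross-validated search over a hyperparameter grid.
search = GridSearchCV(
    KNeighborsClassifier(), {"n_neighbors": [1, 3, 5, 7]}, cv=5
)
search.fit(X_train, y_train)
best_k = search.best_params_["n_neighbors"]

# Prediction: obtain the label for a new instance from its features.
new_instance = [[5.1, 3.5, 1.4, 0.2]]
predicted_label = model.predict(new_instance)

print(test_accuracy, best_k, predicted_label)
```

Every estimator in scikit-learn follows the same `fit` / `predict` / `score` pattern, so swapping in a different algorithm usually changes only the line that constructs the model.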
SciPy provides a variety of useful scientific computing tools. These include statistical distributions, function optimization, linear algebra, and a range of specialized mathematical functions. scikit-learn also supports SciPy's sparse matrices, a way of storing large tables that consist mostly of zeros. Links to the library documentation:
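A short sketch of these tools, using toy values that are assumptions of ours: a statistical distribution, a one-dimensional optimization, and a mostly-zero table stored as a sparse matrix and passed directly to a scikit-learn estimator (logistic regression is used here only as an example of an estimator that accepts sparse input).

```python
import numpy as np
from scipy import optimize, sparse, stats
from sklearn.linear_model import LogisticRegression

# Statistical distributions: CDF of the standard normal at 0.
p = stats.norm.cdf(0.0)  # 0.5

# Function optimization: minimize (x - 2)^2, whose minimum is at x = 2.
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)

# Sparse matrices: store only the non-zero entries of a mostly-zero table.
dense = np.array([
    [0.0, 0.0, 3.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0],
])
X_sparse = sparse.csr_matrix(dense)
print(X_sparse.nnz)  # 3 stored values instead of 16

# The sparse matrix can be used as training data without densifying it.
y = np.array([1, 0, 0, 1])  # hypothetical labels
clf = LogisticRegression().fit(X_sparse, y)
predictions = clf.predict(X_sparse)
```

For a table this small the sparse format saves nothing, but for data such as bag-of-words text features, where almost every entry is zero, it can reduce memory use considerably.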
NumPy provides the fundamental data structures used by scikit-learn, particularly multidimensional arrays. In general, data passed to scikit-learn will be in the form of a NumPy array. Links to the library documentation:
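The following minimal sketch shows a two-dimensional NumPy array and how it flows through scikit-learn. The values are made up, and `StandardScaler` is chosen only as a simple example of a scikit-learn component that takes an array in and returns an array.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# A 2-D array: 3 instances (rows) by 2 features (columns).
X = np.array([
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
])
print(X.ndim, X.shape, X.dtype)  # 2 (3, 2) float64

# Operations can be applied along an axis; axis=0 works per column.
col_means = X.mean(axis=0)

# scikit-learn accepts the array as input and returns a NumPy array.
X_scaled = StandardScaler().fit_transform(X)
print(type(X_scaled).__name__)  # ndarray
```

After scaling, each column has mean 0 and standard deviation 1, which is a common preprocessing step before training a model.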
pandas provides key data structures such as the DataFrame. It also supports reading and writing data in different formats. Links to the library documentation:
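As an illustration, the sketch below builds a small DataFrame, writes it to CSV and JSON, reads it back, and extracts NumPy arrays that could be passed to scikit-learn. The column names and values are hypothetical, and in-memory text buffers are used in place of real files so the example is self-contained.

```python
import io

import pandas as pd

# Hypothetical toy table: two feature columns and one label column.
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, 180.0],
    "weight_kg": [50.0, 58.0, 85.0],
    "label": ["small", "small", "large"],
})

# Export to CSV text and import it again.
csv_text = df.to_csv(index=False)
df_from_csv = pd.read_csv(io.StringIO(csv_text))

# The same round trip through JSON.
json_text = df.to_json(orient="records")
df_from_json = pd.read_json(io.StringIO(json_text))

# Split into a feature matrix and a label vector as NumPy arrays.
X = df[["height_cm", "weight_kg"]].to_numpy()
y = df["label"].to_numpy()

print(df_from_csv.head())
```

With real data, `pd.read_csv` and `DataFrame.to_csv` take a file path instead of a text buffer.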
Finally, the following libraries are used for graphical data representation: