Hybridizing Feature Techniques using Incremental Machine Learning
Project status: Published/In Market
Intel Technologies
Intel Integrated Graphics
Overview / Usage
Online learning is a method of machine learning in which data becomes available sequentially and is used to update the best predictor for future data at each step. Batch learning techniques, by contrast, generate the best predictor by learning on the entire training dataset at once.
- Online learning is used when it is computationally infeasible to train on the entire dataset at once, which calls for algorithms that can process data too large to fit in the computer's memory.
- It is also used when the algorithm must adapt dynamically to new patterns in the data, or when the data itself is generated as a function of time, e.g. stock price prediction.
This model is generally used to implement incremental machine learning with hybrid feature engineering techniques.
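The contrast between online and batch learning can be sketched in a few lines of Python. The example below is a minimal illustration, not the project's code: the `OnlineLogisticRegression` class, the `stream` generator, and the synthetic two-feature data are all hypothetical names and assumptions introduced here. It uses only the standard library and updates the model one example at a time, scoring each example before training on it.

```python
import math
import random


class OnlineLogisticRegression:
    """Hypothetical binary classifier updated one example at a time with SGD."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def partial_fit(self, x, y):
        # Gradient of the log-loss for a single example is (p - y) * x.
        error = self.predict_proba(x) - y
        self.weights = [
            w - self.learning_rate * error * xi
            for w, xi in zip(self.weights, x)
        ]
        self.bias -= self.learning_rate * error


def stream(n_samples, seed=0):
    """Yield synthetic (features, label) pairs one at a time (assumed data)."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        y = 1 if x[0] + x[1] > 0 else 0
        yield x, y


model = OnlineLogisticRegression(n_features=2)
correct = 0
total = 0
for x, y in stream(2000):
    # Test-then-train: score the example first, then learn from it.
    prediction = 1 if model.predict_proba(x) >= 0.5 else 0
    correct += int(prediction == y)
    total += 1
    model.partial_fit(x, y)

accuracy = correct / total
print(f"running accuracy over the stream: {accuracy:.3f}")
```

Because the model only ever holds one example in memory, the same loop works for data that is too large to load at once or that arrives over time.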
Methodology / Approach
- We scraped a news website for articles in different news categories using the Beautiful Soup package in Python.
- We performed text preprocessing (stop word removal, type casting, stemming, etc.) and extracted features using a TF-IDF bag-of-words representation.
- The resulting feature representation was passed to different feature engineering techniques for dimensionality reduction: Chi-Square, Mutual Information Gain, and PCA.
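The feature steps above can be sketched with scikit-learn as follows. This is a hedged illustration under stated assumptions rather than the repository's implementation: the tiny corpus, its labels, and the values of `k` and `n_components` are made up for the example, scikit-learn's built-in English stop-word list stands in for the project's preprocessing, and stemming is omitted.

```python
from functools import partial

from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2, mutual_info_classif

# Hypothetical stand-in for scraped news articles (0 = sports, 1 = business).
docs = [
    "The striker scored twice as the team won the league match",
    "The coach praised the goalkeeper after the football final",
    "Fans celebrated the championship victory at the stadium",
    "The tennis player won the tournament in straight sets",
    "Shares fell as the bank reported lower quarterly profit",
    "The company announced a merger to expand its market share",
    "Investors watched the stock market after the rate decision",
    "The startup raised funding from several venture investors",
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# TF-IDF bag of words with lowercasing and stop-word removal.
vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
X = vectorizer.fit_transform(docs)  # sparse matrix, one row per document
X_dense = X.toarray()               # small enough here to densify

k = 5  # assumed number of features to keep

# Chi-square feature selection (requires non-negative features, as TF-IDF is).
X_chi2 = SelectKBest(chi2, k=k).fit_transform(X, labels)

# Mutual information feature selection on the continuous TF-IDF values.
mi_score = partial(mutual_info_classif, discrete_features=False, random_state=0)
X_mi = SelectKBest(mi_score, k=k).fit_transform(X_dense, labels)

# PCA projects the documents onto a few principal components.
X_pca = PCA(n_components=2).fit_transform(X_dense)

print(X.shape, X_chi2.shape, X_mi.shape, X_pca.shape)
```

Chi-square and mutual information keep a subset of the original TF-IDF columns, while PCA builds new combined features, which is one reason to compare them side by side.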
Technologies Used
Python3
Pandas
NumPy
Matplotlib
scikit-learn (sklearn)
Repository
https://github.com/princesegzy01/incremental-machine-learning-techniques-
