How to apply Principal Component Analysis to a House Price Regression

My last several posts have concerned how to reduce the features of a dataset in the hope of removing redundant or nonessential information, reducing noise and improving the accuracy of predictions. A recent post of mine on feature selection can be found here:- https://medium.com/mlearning-ai/how-to-select-features-using-selectkbest-in-python-c5a5239969f0

One way to reduce the features of a dataset that is not, strictly speaking, feature selection is principal component analysis, or PCA. PCA is a linear dimensionality reduction technique that uses the Singular Value Decomposition (SVD) of the data to project it onto a lower dimensional space. In linear algebra, SVD is a factorisation of a real or complex matrix. It generalises the eigendecomposition of a square normal matrix with an orthonormal eigenbasis to any m × n matrix.
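To make the link between PCA and SVD concrete, here is a minimal sketch in plain numpy. It is not the code from my notebook: the helper name `pca_via_svd` and the random demo data are assumptions made purely for illustration.

```python
import numpy as np


def pca_via_svd(X, n_components):
    """Project X (n_samples x n_features) onto its top principal components.

    Hypothetical helper written for this illustration only.
    """
    # Centre each feature on zero; PCA works on the centred data.
    X_centred = X - X.mean(axis=0)
    # Thin SVD: X_centred = U @ diag(S) @ Vt, singular values sorted descending.
    U, S, Vt = np.linalg.svd(X_centred, full_matrices=False)
    # The rows of Vt are the principal directions; keep the first few.
    components = Vt[:n_components]
    # Coordinates of every sample in the reduced space.
    projected = X_centred @ components.T
    # Variance captured along each kept direction.
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    return projected, components, explained_variance[:n_components]


# Made-up demo data: 100 samples, 5 features, one of them nearly redundant.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 5))
X_demo[:, 1] = 2.0 * X_demo[:, 0] + 0.1 * rng.normal(size=100)

projected, components, explained_variance = pca_via_svd(X_demo, n_components=2)
print(projected.shape)
print(explained_variance)
```

The kept directions are orthonormal, and the variance of each projected column matches the corresponding entry of `explained_variance`, which is what "components that explain a maximum amount of the variance" means in practice.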

PCA is used to decompose a multivariate dataset into a set of successive orthogonal components that explain a maximum amount of the variance. In sklearn's implementation, the input data is centred but not scaled for each feature before applying the SVD.
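Because the data is centred but not scaled, features measured in large units can dominate the components. The sketch below shows this on made-up data; the three toy features (`area`, `rooms`, `quality`) and their distributions are assumptions for illustration and are not taken from the Ames dataset.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n = 200

# Hypothetical, independent features on very different scales.
area = rng.normal(1500.0, 400.0, size=n)   # large numeric range
rooms = rng.normal(6.0, 1.5, size=n)       # small numeric range
quality = rng.normal(5.0, 2.0, size=n)     # small numeric range
X_toy = np.column_stack([area, rooms, quality])

# PCA on the raw data: sklearn centres it (see pca_raw.mean_) but does not scale it.
pca_raw = PCA(n_components=3).fit(X_toy)
ratio_raw = pca_raw.explained_variance_ratio_

# PCA after standardising every feature to unit variance.
X_scaled = StandardScaler().fit_transform(X_toy)
pca_scaled = PCA(n_components=3).fit(X_scaled)
ratio_scaled = pca_scaled.explained_variance_ratio_

print("raw:   ", ratio_raw)
print("scaled:", ratio_scaled)
```

On the raw data the first component is almost entirely the large-scale feature, whereas after standardising the variance is spread far more evenly. Whether to scale first is a judgment call that depends on whether the original units are meaningful for the problem.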

In this post I will illustrate how PCA can be used to reduce the dimensionality of a dataset with 79 features, the Ames House Price dataset. This dataset can be found on the Kaggle website under the competitions section. I have made copious submissions to Kaggle in an endeavour to improve my score, so I decided in this instance to give PCA a go to see whether dimension reduction would help in this area.
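As a rough idea of what PCA in front of a regression can look like, here is a minimal sketch using an sklearn `Pipeline`. It does not use the Ames data: the 79 numeric columns are synthetic, generated from 10 hidden factors, and the choice of `LinearRegression` and of keeping 95% of the variance are assumptions for illustration. The real dataset also contains categorical columns and missing values, which would need encoding and imputing first.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a wide dataset: 79 correlated features
# driven by 10 hidden factors plus a little noise.
rng = np.random.default_rng(0)
n_samples, n_features, n_latent = 500, 79, 10
latent = rng.normal(size=(n_samples, n_latent))
mixing = rng.normal(size=(n_latent, n_features))
X = latent @ mixing + 0.1 * rng.normal(size=(n_samples, n_features))
y = latent @ rng.normal(size=n_latent) + 0.1 * rng.normal(size=n_samples)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = Pipeline([
    ("scale", StandardScaler()),
    # A float n_components keeps just enough components to explain
    # that fraction of the variance; this requires the full SVD solver.
    ("pca", PCA(n_components=0.95, svd_solver="full")),
    ("regress", LinearRegression()),
])
model.fit(X_train, y_train)

n_kept = model.named_steps["pca"].n_components_
r2_test = model.score(X_test, y_test)
print(f"components kept: {n_kept} of {n_features}")
print(f"R^2 on held-out data: {r2_test:.3f}")
```

Putting the scaler and PCA inside the pipeline means both are fitted on the training rows only, so nothing from the held-out rows leaks into the projection.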

I have written this program in Kaggle’s free online Jupyter Notebook. One good thing about this notebook is that it is stored with the Ames House Price competition and is easily retrievable.

The problem statement for the program I created is in the screenshot below:-

Once the program was created, I imported the libraries that I would need, which were numpy, pandas, sklearn, matplotlib and seaborn. It is important to keep…
