Photo by Volodymyr Hryshchenko on Unsplash

The curse of dimensionality, as we data scientists like to call it: you have too many features for a segmentation project and you don’t know which ones to pick. To avoid losing information your model needs to form good clusters, you put them all in, and so you are cursed with a dimensionality problem.

For a more concise definition: the dimensionality problem occurs when machine learning algorithms cannot train efficiently on data because of the sheer size of the feature space.
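
One way to see why a large feature space hurts clustering is to watch what happens to distances as dimensions are added. The sketch below is only an illustration under my own assumptions (uniform random points in a unit hypercube, Euclidean distance, and a hypothetical helper named `relative_contrast`); it is not taken from any particular library or from the method discussed later.

```python
import math
import random


def relative_contrast(n_points, n_dims, seed=0):
    """Return (farthest - nearest) / nearest distance from one random
    query point to n_points uniform random points in the unit hypercube.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(n_dims)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        distances.append(math.dist(query, point))
    nearest, farthest = min(distances), max(distances)
    return (farthest - nearest) / nearest


if __name__ == "__main__":
    # Print the contrast for a growing number of features.
    for dims in (2, 10, 100, 1000):
        print(dims, round(relative_contrast(200, dims), 3))
```

As the number of dimensions grows, the contrast shrinks: the nearest and farthest points end up almost equally far away, which leaves a distance-based clustering algorithm with very little to work with.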

What if there was a better way to pick only the valuable information needed and drop…



When I started my journey as a Data Scientist, Spark was one of the few industry buzzwords I couldn’t wrap my head around at first. My questions were:

  1. What is Spark?
  2. Why use Spark?
  3. At what stage of my Data Science journey do I need to use Spark?

My first few months of working with data involved Jupyter notebooks and pandas DataFrames. I wasn’t delving much into big data or what it means to process big data on a local machine. But then, big data is relative; there is no clear definition of it.

Some people might define…


NETWORK ANALYSIS

Complex Network Analysis (CNA) studies how to recognise, describe, visualise and analyse complex networks. A prominent way of analysing networks in Python is the NetworkX library, which provides tools for constructing and drawing complex networks.

The explosion of CNA research and applications is due to two factors: one is the availability of cheap and powerful computers, which enables researchers and scientists with advanced training in mathematics, physics and social sciences to perform top-notch research; the other is the ever-increasing complexity of social, behavioural, biological, financial and technological aspects…


Convolutional Neural Networks (CNNs) have consistently proved their prowess in image recognition, detection and retrieval, but what can we say about their ability in video classification? Research performed in 2014 by a group at Stanford led by Andrej Karpathy, who is currently the director of AI at Tesla, identified several challenges to applying CNNs at this level. One is the lack of benchmarks that match the variety and magnitude of existing image datasets, because videos are significantly more difficult to collect, annotate and store. …


GRID TRANSMISSION LINES

Grid frequency has always been a challenge to manage in many developing countries, especially in Africa, my home continent. Ever since I embarked on the data science journey, having started as an Electrical Engineer, one question has been plaguing my thoughts: “Can grid frequency prediction make the electricity system of Nigeria (my home country) a lot better than it is today?”

The Federal Republic of Nigeria, located on the western bend of the African continent, is the world’s ninth-largest exporter of oil. The country gained independence in 1960 and has since become a nation with a $375.8 billion…


Neural Network

Neural networks do a lot for data science, and for machine learning in particular. Although various algorithms are used to train a model, I am going to explain the rationale behind Backpropagation. To understand Backpropagation, it is paramount to know what a Multi-layer Neural Network (MNN) is: a type of feed-forward neural network.

Artificial neural networks can mimic the learning ability of their biological counterparts and are used as an approach to machine learning. Machine learning involves adaptive mechanisms that enable computers to learn from experience and/or by example. …

Yusufaolodo

Data Scientist and Machine Learning Engineer in the Retail Industry
